Re: Kernel crash when running bcc's test_xlate


Yonghong Song
 

bcc seems fine with the latest 4.15-rc7 on x86_64; over multiple runs
I cannot reproduce the issue:

[yhs@localhost python]$ ../../build/tests/wrapper.sh py_xlate1_c
namespace ./test_xlate1.py test_xlate1.c
Actual changes:
tx-checksumming: off
tx-checksum-ip-generic: off
tx-checksum-sctp: off
tcp-segmentation-offload: off
tx-tcp-segmentation: off [requested on]
tx-tcp-ecn-segmentation: off [requested on]
tx-tcp-mangleid-segmentation: off [requested on]
tx-tcp6-segmentation: off [requested on]
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.041 ms

--- 192.168.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.041/0.041/0.041/0.000 ms
.
----------------------------------------------------------------------
Ran 1 test in 0.104s

OK
[yhs@localhost python]$ uname -a
Linux localhost.localdomain 4.15.0-rc7+ #2 SMP Fri Jan 12 22:29:57 PST
2018 x86_64 x86_64 x86_64 GNU/Linux
[yhs@localhost python]$

As for how to translate the various pyroute2 ip.addr and ip.tc calls to
plain "ip ..." or "tc ..." commands: the Python implementation is in
/lib/python2.7/site-packages/pyroute2 and works through the netlink interface.

I guess in most cases you can figure it out easily, e.g.,
ip.addr("del", index=ifindex, address="172.16.1.2", mask=24)
=>
ip addr del 172.16.1.2/24 dev <dev_name>

ip.tc("add", "ingress", ifindex, "ffff:")
=>
tc qdisc add dev <dev_name> handle ffff: ingress
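
As a rough sketch only, the two mappings above can be written as small
string formatters. The helper names (addr_to_cli, ingress_qdisc_to_cli,
dev_name_from_index) and the device name "veth0" are made up for
illustration; they are not part of pyroute2 or bcc. The pyroute2 calls take
an interface index, while ip/tc take a device name, so the index has to be
resolved first (Python 3's socket.if_indextoname does that).

```python
import socket


def dev_name_from_index(ifindex):
    # Resolve an interface index to its name (Python 3, Linux).
    return socket.if_indextoname(ifindex)


def addr_to_cli(action, dev_name, address, mask):
    # Hypothetical helper: mirrors ip.addr(action, index=..., address=..., mask=...)
    return "ip addr {} {}/{} dev {}".format(action, address, mask, dev_name)


def ingress_qdisc_to_cli(action, dev_name, handle):
    # Hypothetical helper: mirrors ip.tc(action, "ingress", ifindex, handle)
    return "tc qdisc {} dev {} handle {} ingress".format(action, dev_name, handle)


# "veth0" is a placeholder device name.
print(addr_to_cli("del", "veth0", "172.16.1.2", 24))
print(ingress_qdisc_to_cli("add", "veth0", "ffff:"))
```

This only covers the two cases shown here; other ip.tc calls need their own
mapping.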


If you cannot simply map a call to ip/tc commands, maybe you can figure out
the rest by looking at the pyroute2 implementation mentioned above.
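
One detail that helps with the mapping: tc handles are 32-bit values with
the major number in the upper 16 bits and the minor number in the lower 16
bits, both written in hex, so "ffff:" is 0xffff0000. Below is a minimal
sketch of the conversion; the function names are made up for illustration.
Whether pyroute2 treats the target=0x10002 argument in the quoted
add-filter call as such a handle (which would read as 1:2) is an assumption
you should check against the pyroute2 source.

```python
def tc_handle_to_str(handle):
    # Split a 32-bit tc handle into "major:minor" (hex, minor omitted if 0).
    major = (handle >> 16) & 0xFFFF
    minor = handle & 0xFFFF
    return "{:x}:{}".format(major, "{:x}".format(minor) if minor else "")


def tc_handle_from_str(text):
    # Parse "major:minor" (hex) back into the 32-bit value.
    major, _, minor = text.partition(":")
    return (int(major, 16) << 16) | (int(minor, 16) if minor else 0)


print(tc_handle_to_str(0x10002))
print(hex(tc_handle_from_str("ffff:")))
```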

On Wed, Jan 10, 2018 at 10:05 PM, Sandipan Das via iovisor-dev
<iovisor-dev@...> wrote:
Hi,

I was trying to run the bcc tests on a ppc64le VM with Fedora 26 and
v4.15-rc7 kernel and 'test_xlate' was causing a kernel panic. The test
crashes on all of the v4.15-rcX kernels that I built but runs fine on
v4.14.11. To build the kernels, I used the same config as F26's
v4.14.11 distro kernel with default choices in case any new options
were added.

From my initial analysis, the crash occurs after the following statement
(line 35 of tests/python/test_xlate1.py) is executed.

ip.tc("add-filter", "u32", ifindex, ":1", parent="ffff:", action=[action],
protocol=protocols.ETH_P_ALL, classid=1, target=0x10002, keys=['0x0/0x0+0'])

Any ideas about why this is happening? Also, it would be really helpful
if someone can translate the Pyroute2 calls in the test script to the
corresponding tc commands.

Here is the kernel crash log:

[ 710.746123] IPv6: ADDRCONF(NETDEV_UP): py_xlate1_c.out: link is not ready
[ 710.746457] IPv6: ADDRCONF(NETDEV_CHANGE): py_xlate1_c.out: link becomes ready
[ 710.746662] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 711.141263] Unable to handle kernel paging request for data at address 0x3fecb0000
[ 711.145240] Faulting instruction address: 0xc0000000009f6f14
[ 711.145613] Oops: Kernel access of bad area, sig: 11 [#1]
[ 711.145898] LE SMP NR_CPUS=1024 NUMA pSeries
[ 711.146191] Modules linked in: act_bpf cls_u32 sch_sfq sch_ingress veth kvm_pr kvm ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc pseries_rng vmx_crypto crct10dif_vpmsum 9pnet_virtio 9pnet virtio_balloon xfs libcrc32c virtio_net virtio_blk virtio_pci ibmvscsi virtio_ring scsi_transport_srp crc32c_vpmsum virtio
[ 711.149415] CPU: 27 PID: 1857 Comm: ping Not tainted 4.15.0-rc7 #4
[ 711.149768] NIP: c0000000009f6f14 LR: c0000000009fbaf8 CTR: c000000000027020
[ 711.150195] REGS: c0000003fff07a00 TRAP: 0300 Not tainted (4.15.0-rc7)
[ 711.150550] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28002882 XER: 20000000
[ 711.150982] CFAR: 00007fff9f240520 DAR: 00000003fecb0000 DSISR: 40000000 SOFTE: 1
[ 711.150982] GPR00: c0000000009fbaf8 c0000003fff07c80 c0000000014a2d00 000000000000001c
[ 711.150982] GPR04: 0000000000000001 0000000000000000 0000000000000636 00000003fecb0000
[ 711.150982] GPR08: 00000003fecb0000 0000000000000000 c0000003f2558a80 d000000007c80fd0
[ 711.150982] GPR12: 0000000000002200 c00000000fd71b80 0000000000000000 0000000108d32594
[ 711.150982] GPR16: 0000000000008906 0000000000000000 0000000000000001 0000000000000001
[ 711.150982] GPR20: c0000003e4f42ba0 0000000000000000 0000000000000005 0000000000000000
[ 711.150982] GPR24: 0000000001080020 c000000001036fa8 000000000000dd86 000000000000a888
[ 711.150982] GPR28: c00000000305a000 0000000000000001 0000000000000000 c0000003e4a40200
[ 711.154688] NIP [c0000000009f6f14] __netif_receive_skb_core+0x734/0xe70
[ 711.155041] LR [c0000000009fbaf8] process_backlog+0xc8/0x1e0
[ 711.155392] Call Trace:
[ 711.155536] [c0000003fff07c80] [c0000000009f6d98] __netif_receive_skb_core+0x5b8/0xe70 (unreliable)
[ 711.156029] [c0000003fff07d40] [c0000000009fbaf8] process_backlog+0xc8/0x1e0
[ 711.156453] [c0000003fff07db0] [c000000000a00efc] net_rx_action+0x1ec/0x4c0
[ 711.156814] [c0000003fff07eb0] [c000000000be2b08] __do_softirq+0x158/0x3e4
[ 711.157168] [c0000003fff07f90] [c00000000002deec] call_do_softirq+0x14/0x24
[ 711.157521] [c0000003e49f3870] [c000000000018f0c] do_softirq_own_stack+0x5c/0xa0
[ 711.157952] [c0000003e49f38b0] [c00000000011d5b8] do_softirq.part.3+0x88/0xb0
[ 711.158376] [c0000003e49f38e0] [c00000000011d6b8] __local_bh_enable_ip+0xd8/0xe0
[ 711.158801] [c0000003e49f3910] [c000000000a7c7c0] ip_finish_output2+0x1c0/0x4e0
[ 711.159226] [c0000003e49f39b0] [c000000000a7f96c] ip_output+0xcc/0x150
[ 711.159579] [c0000003e49f3a30] [c000000000a7edf4] ip_local_out+0x74/0xa0
[ 711.159933] [c0000003e49f3a70] [c000000000a8063c] ip_send_skb+0x3c/0xc0
[ 711.160288] [c0000003e49f3aa0] [c000000000aba51c] raw_sendmsg+0x8bc/0xae0
[ 711.160640] [c0000003e49f3c70] [c000000000acf70c] inet_sendmsg+0x6c/0x130
[ 711.160994] [c0000003e49f3cb0] [c0000000009caf7c] sock_sendmsg+0x6c/0xa0
[ 711.161348] [c0000003e49f3ce0] [c0000000009ce284] SyS_sendto+0xd4/0x190
[ 711.161702] [c0000003e49f3e30] [c00000000000b8e0] system_call+0x58/0x6c
[ 711.162054] Instruction dump:
[ 711.162269] e95f00d0 65084000 907f0028 911f0090 7d4a4a14 a10a0004 2f880000 e90d0030
[ 711.162694] e9340008 7ce94214 419e0008 a08a0006 <7d49402a> 38c00000 38a10020 7d4a1a14
[ 711.163121] ---[ end trace 24e5137dc6336cd9 ]---
[ 711.166060]
[ 711.739873] Unable to handle kernel paging request for data at address 0x3febf0000
[ 711.741563] Faulting instruction address: 0xc0000000009f6f14
[ 711.741843] Oops: Kernel access of bad area, sig: 11 [#2]
[ 711.741894] LE SMP NR_CPUS=1024 NUMA pSeries
[ 711.741945] Modules linked in: act_bpf cls_u32 sch_sfq sch_ingress veth kvm_pr kvm ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc pseries_rng vmx_crypto crct10dif_vpmsum 9pnet_virtio 9pnet virtio_balloon xfs libcrc32c virtio_net virtio_blk virtio_pci ibmvscsi virtio_ring scsi_transport_srp crc32c_vpmsum virtio
[ 711.742462] CPU: 24 PID: 233 Comm: kworker/24:1 Tainted: G D 4.15.0-rc7 #4
[ 711.742537] Workqueue: ipv6_addrconf addrconf_dad_work
[ 711.742588] NIP: c0000000009f6f14 LR: c0000000009fbaf8 CTR: c0000000009fba30
[ 711.742660] REGS: c0000003fff1fa00 TRAP: 0300 Tainted: G D (4.15.0-rc7)
[ 711.742730] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28002822 XER: 00000000
[ 711.742829] CFAR: c0000000000b40f4 DAR: 00000003febf0000 DSISR: 40000000 SOFTE: 1
[ 711.742829] GPR00: c0000000009fbaf8 c0000003fff1fc80 c0000000014a2d00 0000000000000048
[ 711.742829] GPR04: 0000000000000001 0000000000000000 00000000772a0000 00000003febf0000
[ 711.742829] GPR08: 00000003febf0000 0000000000000000 c0000003f31f0680 d000000007c80fd0
[ 711.742829] GPR12: 0000000000008800 c00000000fd6fc00 c0000000001454a8 c0000003f60f3240
[ 711.742829] GPR16: 0000000000000000 0000000000000000 0000000000000001 0000000000000001
[ 711.742829] GPR20: c0000003e4f42ba0 0000000000000000 0000000000000005 0000000000000000
[ 711.742829] GPR24: 0000000001080020 c000000001036fa8 000000000000dd86 000000000000a888
[ 711.742829] GPR28: c00000000305a000 0000000000000001 0000000000000000 c0000000030a3d00
[ 711.743455] NIP [c0000000009f6f14] __netif_receive_skb_core+0x734/0xe70
[ 711.743516] LR [c0000000009fbaf8] process_backlog+0xc8/0x1e0
[ 711.743575] Call Trace:
[ 711.743602] [c0000003fff1fc80] [c0000000009f6d98] __netif_receive_skb_core+0x5b8/0xe70 (unreliable)
[ 711.743687] [c0000003fff1fd40] [c0000000009fbaf8] process_backlog+0xc8/0x1e0
[ 711.743760] [c0000003fff1fdb0] [c000000000a00efc] net_rx_action+0x1ec/0x4c0
[ 711.743822] [c0000003fff1feb0] [c000000000be2b08] __do_softirq+0x158/0x3e4
[ 711.743884] [c0000003fff1ff90] [c00000000002deec] call_do_softirq+0x14/0x24
[ 711.743946] [c0000003e7c478f0] [c000000000018f0c] do_softirq_own_stack+0x5c/0xa0
[ 711.744019] [c0000003e7c47930] [c00000000011d5b8] do_softirq.part.3+0x88/0xb0
[ 711.744091] [c0000003e7c47960] [c00000000011d6b8] __local_bh_enable_ip+0xd8/0xe0
[ 711.744165] [c0000003e7c47990] [c000000000b2c908] ip6_finish_output2+0x208/0x6c0
[ 711.744238] [c0000003e7c47a20] [c000000000b30fbc] ip6_output+0x7c/0x190
[ 711.744300] [c0000003e7c47a90] [c000000000b55814] ndisc_send_skb+0x264/0x460
[ 711.744373] [c0000003e7c47b60] [c000000000b576ec] ndisc_send_ns+0x18c/0x2a0
[ 711.744435] [c0000003e7c47bd0] [c000000000b3dc7c] addrconf_dad_work+0x59c/0x6a0
[ 711.744508] [c0000003e7c47c80] [c00000000013bda8] process_one_work+0x248/0x540
[ 711.744580] [c0000003e7c47d20] [c00000000013c138] worker_thread+0x98/0x5f0
[ 711.744642] [c0000003e7c47dc0] [c000000000145648] kthread+0x1a8/0x1b0
[ 711.744704] [c0000003e7c47e30] [c00000000000bc60] ret_from_kernel_thread+0x5c/0x7c
[ 711.744775] Instruction dump:
[ 711.744813] e95f00d0 65084000 907f0028 911f0090 7d4a4a14 a10a0004 2f880000 e90d0030
[ 711.744889] e9340008 7ce94214 419e0008 a08a0006 <7d49402a> 38c00000 38a10020 7d4a1a14
[ 711.744966] ---[ end trace 24e5137dc6336cda ]---
[ 711.749691]
[ 712.166228] Kernel panic - not syncing: Fatal exception in interrupt

--
With Regards,
Sandipan

