This group is locked. No changes can be made to the group while it is locked.
Date
1 - 5 of 5
Hashmap OOPS
Dan Siemon
Anyone seen this before? 4.17.6
I've seen this on a few boxes now. (If there is a better place to post this (LKML?) please let me know.) Oct 08 21:21:02 kernel: BUG: unable to handle kernel paging request at ffffc48f1f42d500 Oct 08 21:21:02 kernel: PGD 0 P4D 0 Oct 08 21:21:02 kernel: Oops: 0000 [#1] SMP PTI Oct 08 21:21:02 kernel: Modules linked in: sch_htb bpwd_drv(OE) lanner_drv(OE) cls_bpf sch_ingress ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nf> Oct 08 21:21:02 kernel: crc32c_intel igb i2c_algo_bit dca Oct 08 21:21:02 kernel: CPU: 5 PID: 27247 Comm: preseem-netdev- Tainted: G OE 4.17.6-200.fc28.x86_64 #1 Oct 08 21:21:02 kernel: Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/15/2017 Oct 08 21:21:02 kernel: RIP: 0010:memcpy_erms+0x6/0x10 Oct 08 21:21:02 kernel: RSP: 0018:ffffbc8ec1f37e40 EFLAGS: 00010286 Oct 08 21:21:02 kernel: RAX: ffff9eca25ae0ab0 RBX: ffff9eca25ae0ab0 RCX: 0000000000000010 Oct 08 21:21:02 kernel: RDX: 0000000000000010 RSI: ffffc48f1f42d500 RDI: ffff9eca25ae0ab0 Oct 08 21:21:02 kernel: RBP: 0000000000000010 R08: 0000000000400000 R09: 0000000000000001 Oct 08 21:21:02 kernel: R10: 000000002e523fcb R11: 0000000000000000 R12: ffff9ec9eb450600 Oct 08 21:21:02 kernel: R13: 0000000000141ca6 R14: ffff9eca464a3700 R15: ffff9eca25ae0ed0 Oct 08 21:21:02 kernel: FS: 00007f3262ffd700(0000) GS:ffff9eca9ed40000(0000) knlGS:0000000000000000 Oct 08 21:21:02 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Oct 08 21:21:02 kernel: CR2: ffffc48f1f42d500 CR3: 0000000422b82003 CR4: 00000000003606e0 Oct 08 21:21:02 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Oct 08 21:21:02 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Oct 08 21:21:02 kernel: Call Trace: Oct 08 21:21:02 kernel: htab_map_get_next_key+0x237/0x280 Oct 08 21:21:02 kernel: map_get_next_key+0xed/0x1f0 Oct 08 21:21:02 kernel: __x64_sys_bpf+0x16b/0x260 Oct 08 21:21:02 kernel: do_syscall_64+0x5b/0x160 Oct 08 21:21:02 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Oct 08 21:21:02 kernel: RIP: 0033:0x47fa54 Oct 08 21:21:02 kernel: RSP: 002b:000000c428409278 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 Oct 08 21:21:02 kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000047fa54 Oct 08 21:21:02 kernel: RDX: 0000000000000020 RSI: 000000c428409300 RDI: 0000000000000004 Oct 08 21:21:02 kernel: RBP: 000000c4284093a8 R08: 0000000000000000 R09: 0000000000000000 Oct 08 21:21:02 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000001e00 Oct 08 21:21:02 kernel: R13: 0000000000000000 R14: 000000000045d560 R15: 0000000000000000 Oct 08 21:21:02 kernel: Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> Oct 08 21:21:02 kernel: RIP: memcpy_erms+0x6/0x10 RSP: ffffbc8ec1f37e40 Oct 08 21:21:02 kernel: CR2: ffffc48f1f42d500 Oct 08 21:21:02 kernel: ---[ end trace ebebe46b9c6091a1 ]--- -- Reboot -- |
Daniel Borkmann
Hi Dan,
On 10/17/2018 03:08 AM, Dan Siemon wrote: Anyone seen this before? 4.17.6Thanks a lot for the report! Do you happen to have a reproducer? Are you running into this with concurrent i) updates, ii) deletes, iii) both or iv) while destroying the whole htab? If i-iii) is it out of BPF datapath or from syscall path? Is your hashtab preallocated? Any more elaborate info would help. Thanks, Daniel |
Dan Siemon
On Wed, 2018-10-17 at 14:08 +0200, Daniel Borkmann wrote:
Hi Dan,Sadly I don't have a reproducer other than I've seen it happen on several distinct boxes and deployments. We use several hash maps and I don't know how to tell from the OOPs which one caused this but the two largest (100-600k entries) work like: - eBPF program creates entries as it sees flows on the network and counts metrics (bytes, packets, etc) into each entry and updates a timestamp on each update - User space iterates over the map every 10s and pulls out the metrics for further processing. If the timestamp on the entry is over a threshold, the user space process deletes the entry (kernel side never deletes) bpftool map output: 22: hash flags 0x0 key 12B value 8B max_entries 1213 memlock 122880B 23: hash flags 0x0 key 16B value 256B max_entries 2102717 memlock 739983360B 24: hash flags 0x0 key 12B value 2B max_entries 2102717 memlock 218505216B 25: hash flags 0x0 key 20B value 48B max_entries 2102717 memlock 319438848B |
Daniel Borkmann
On 10/17/2018 03:48 PM, Dan Siemon wrote:
On Wed, 2018-10-17 at 14:08 +0200, Daniel Borkmann wrote:Does that happen via map update helper from kernel side in your prog?Hi Dan,Sadly I don't have a reproducer other than I've seen it happen on - User space iterates over the map every 10s and pulls out the metricsAnd the map iterate + delete, are there multiple threads walking it at the same time or just single one (just making sure ...)? bpftool map output:Ok, you are using map prealloc here. How often does the BUG trigger in general on your machines? Was there an older kernel where you haven't been able to trigger it? key 12B value 8B max_entries 1213 memlock 122880B |
Dan Siemon
On Wed, 2018-10-17 at 16:18 +0200, Daniel Borkmann wrote:
Yes, when a new entry is added there is a map_update_elem() call.Sadly I don't have a reproducer other than I've seen it happen onDoes that happen via map update helper from kernel side in your prog? However when an entry is updated, we update the struct returned by map_lookup_elem directly without another call to map_update_elem(). We had another thread a while back on whether or not map_update_elem() needs to be called every time and I'm still not really clear on the requirements for safe concurrency in this area. There is one Goroutine which does this work which should map to one- User space iterates over the map every 10s and pulls out theAnd the map iterate + delete, are there multiple threads walking it thread (we use https://github.com/aterlo/puregobpf). We've seen these for some time going back several kernel versions. I'mbpftool map output:Ok, you are using map prealloc here. How often does the BUG trigger pretty sure I've also seen it on 4.18 but don't have an example handy. I can't say definitively how often per box but we probably see one of these every week or so across many boxes. I initially thought it was related to bad hardware but now doubt that. |