Re: Notification when an eBPF map is modified


Yonghong Song
 

On Mon, Aug 6, 2018 at 2:52 PM, Raffaele Sommese <raffysommy@...> wrote:
Okay, htab_map_update_elem is indeed called, but you cannot trace it.
The following kernel code in kernel/bpf/syscall.c explains the reason:

        /* must increment bpf_prog_active to avoid kprobe+bpf triggering from
         * inside bpf map update or delete otherwise deadlocks are possible
         */
        preempt_disable();
        __this_cpu_inc(bpf_prog_active);
        if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
            map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
                err = bpf_percpu_hash_update(map, key, value, attr->flags);
        } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
                err = bpf_percpu_array_update(map, key, value, attr->flags);
        } else if (IS_FD_ARRAY(map)) {
                rcu_read_lock();
                err = bpf_fd_array_map_update_elem(map, f.file, key, value,
                                                   attr->flags);
                rcu_read_unlock();
        } else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
                rcu_read_lock();
                err = bpf_fd_htab_map_update_elem(map, f.file, key, value,
                                                  attr->flags);
                rcu_read_unlock();
        } else {
                rcu_read_lock();
                err = map->ops->map_update_elem(map, key, value, attr->flags);
                rcu_read_unlock();
        }
        __this_cpu_dec(bpf_prog_active);
        preempt_enable();

While bpf_prog_active is nonzero on this CPU, the bpf program attached to a
kprobe on htab_map_update_elem is prevented from running.
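
The effect of the counter can be illustrated with a toy model in Python. This is only a sketch of the logic described above, not kernel code: the Cpu class and kprobe_fire helper are hypothetical names, and it assumes the kprobe handler skips the attached program whenever the per-CPU counter is already nonzero, and that a probe can be placed in map_update_elem before the guarded region.

```python
# Toy model only: mimics the counter logic, not the real kernel code paths.

class Cpu:
    """Stands in for one CPU's private bpf_prog_active counter."""

    def __init__(self):
        self.bpf_prog_active = 0
        self.probe_hits = []  # functions whose attached program actually ran


def kprobe_fire(cpu, func_name):
    """Run the attached program unless another one is active on this CPU."""
    cpu.bpf_prog_active += 1
    try:
        if cpu.bpf_prog_active != 1:
            return False  # counter was already nonzero: program is skipped
        cpu.probe_hits.append(func_name)
        return True
    finally:
        cpu.bpf_prog_active -= 1


def htab_map_update_elem(cpu):
    # A kprobe placed on this function fires inside the guarded region.
    kprobe_fire(cpu, "htab_map_update_elem")


def map_update_elem(cpu):
    # A probe placed before the guarded region sees a zero counter.
    kprobe_fire(cpu, "map_update_elem")
    cpu.bpf_prog_active += 1          # models __this_cpu_inc(bpf_prog_active)
    try:
        htab_map_update_elem(cpu)     # models map->ops->map_update_elem(...)
    finally:
        cpu.bpf_prog_active -= 1      # models __this_cpu_dec(bpf_prog_active)


cpu = Cpu()
map_update_elem(cpu)
print(cpu.probe_hits)
```

In this model only the probe outside the guarded region records a hit, which matches the behaviour reported in the thread.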

How can we solve this problem then? One possible solution is as follows:
. Disassemble vmlinux to find a proper place in function "map_update_elem"
  where you can get the "map" (struct bpf_map *map) in a register, e.g.,
  the insn offset inside map_update_elem is OFFSET. This OFFSET
  should be outside the above preempt/__this_cpu_{inc/dec} region.
. Improve trace.py to trace function+offset. A possible format could be
  trace.py 'map_update_elem+OFFSET ...'
  The attach_kprobe API should already support the function_name + offset format.
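
A minimal sketch of the second step, assuming a bcc version whose Python attach_kprobe accepts an event_off argument. The parse_probe_spec and attach_at_offset helpers are hypothetical names, not part of trace.py, and the offset still has to come from your own disassembly.

```python
import re


def parse_probe_spec(spec):
    """Split 'func+OFFSET' into (func, offset); the offset defaults to 0."""
    m = re.fullmatch(r"([A-Za-z_.][\w.]*)(?:\+(0[xX][0-9a-fA-F]+|\d+))?", spec)
    if m is None:
        raise ValueError(f"bad probe spec: {spec!r}")
    name, off = m.group(1), m.group(2)
    if off is None:
        return name, 0
    base = 16 if off.lower().startswith("0x") else 10
    return name, int(off, base)


def attach_at_offset(spec, bpf_text, fn_name):
    """Attach fn_name from bpf_text at 'func+OFFSET' (needs bcc and root)."""
    from bcc import BPF  # imported lazily so the parser works without bcc

    func, off = parse_probe_spec(spec)
    b = BPF(text=bpf_text)
    # Assumption: this bcc version supports the event_off keyword.
    b.attach_kprobe(event=func, event_off=off, fn_name=fn_name)
    return b
```

For example, parse_probe_spec("map_update_elem+0x40") returns ("map_update_elem", 64); the value 0x40 is purely illustrative. Which register holds "map" at the chosen offset depends on the kernel build, so the probe body has to be written against that particular disassembly.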
I think this approach could be very tricky and platform dependent, so I have
looked for another solution. If I attach my bpf program to bpf_map_new_fd
with a kprobe and a kretprobe, I can recover the mapping between (map fd,
pid) and the id or the name of the map, and save it. I have tested it
and it seems to work.
Then I can trace the map_update_elem syscall and read the data (I'm
interested only in the key) from userspace.
I attach the code here; it may be helpful for other people who want to
address this problem.
https://gist.github.com/raffysommy/45cf0544f34eb0e5fbf533f4d9a3b955
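
Separately from the gist, the userspace bookkeeping this approach needs can be sketched roughly as follows. This is not the code from the gist: the MapRegistry class, its method names and the event arguments are all hypothetical, and a real tracer would fill them from the kprobe, kretprobe and syscall events.

```python
class MapRegistry:
    """Remembers which map each (pid, fd) pair refers to."""

    def __init__(self):
        self._pending = {}  # pid -> map name seen at bpf_map_new_fd entry
        self._by_fd = {}    # (pid, fd) -> map name

    def on_new_fd_entry(self, pid, map_name):
        # kprobe side: the map is known here, but the fd is not yet.
        # A real tracer would probably key this by thread id instead.
        self._pending[pid] = map_name

    def on_new_fd_return(self, pid, fd):
        # kretprobe side: the return value is the new fd (negative on error).
        name = self._pending.pop(pid, None)
        if name is not None and fd >= 0:
            self._by_fd[(pid, fd)] = name

    def on_update(self, pid, fd, key):
        # Syscall side: resolve the fd passed by userspace to a map name.
        return self._by_fd.get((pid, fd)), key


reg = MapRegistry()
reg.on_new_fd_entry(pid=100, map_name="my_map")
reg.on_new_fd_return(pid=100, fd=5)
print(reg.on_update(pid=100, fd=5, key=b"\x01"))
```

An update on an fd that was never observed resolves to None for the map name, so maps created before the tracer started go unidentified in this sketch.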
Thank you again for the support and for your time.
Yes, this approach should work too. I am thinking whether we could do
it with one invocation of trace.py...

Raffaele


--
________________________________
Raffaele Sommese
Mail:raffysommy@...
About me:https://about.me/r4ffy
Gpg Key:http://www.r4ffy.info/Openpgp.asc
GPG key ID: 0x830b1428cf91db2a on http://pgp.mit.edu:11371/
