Francesco Picciariello <picciariello.francesco@...>
Hello all, Is there a way to receive an asynchronous notification each time an eBPF map is modified?
I used bpf_obj_pin() to pin a specific map to the filesystem, and I set up a Linux inotify watch on the pinned object.
The inotify API provides a mechanism for monitoring filesystem events, and the goal is to notice when a pinned eBPF map is modified, but inotify does not seem to work properly with the BPF filesystem: it detects the creation and the deletion of the pinned object, but not its modification.
Regards.
|
On Fri, Feb 16, 2018 at 7:59 AM, Francesco Picciariello via iovisor-dev <iovisor-dev@...> wrote:
> Hello all, Is there a way to receive asynchronous notification each time an eBPF map is modified?

When the map is modified in the kernel, you can send something into a ring buffer, and userspace can poll on this ring buffer to get the notification.

> I used bpf_obj_pin() in order to save a specific map on filesystem, and I called linux inotify() on the pinned object.
> The inotify API provides a mechanism for monitoring filesystem events, and the goal is to notice when a pinned eBPF map is modified, but it seems inotify() does not work properly with bpf filesystem. In fact it's able to detect the creation and the deletion, but not the modification of the pinned object.

inotify relies on vfs_read/vfs_write for read/write events. Here the BPF map reads and writes happen inside the BPF program, or through the bpf syscall, so the inotify mechanism won't work.

> Regards.
_______________________________________________ iovisor-dev mailing list iovisor-dev@... https://lists.iovisor.org/mailman/listinfo/iovisor-dev
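The ring-buffer approach described above could look roughly like the following BCC sketch, in which the program that owns the map reports each update through a perf buffer. The names `counters`, `events`, `on_event`, and the choice of the `clone` syscall as the trigger are assumptions made only for illustration; nothing here comes from the original poster's program.

```python
import ctypes

# BPF side, in BCC's dialect of C. The program that owns the map reports
# every update itself. "counters", "events" and "on_event" are hypothetical
# names chosen for this sketch.
BPF_TEXT = r"""
struct map_event {
    u64 key;
    u64 value;
};

BPF_HASH(counters, u64, u64);
BPF_PERF_OUTPUT(events);

int on_event(struct pt_regs *ctx)
{
    struct map_event ev = {};
    u64 *old;

    ev.key = bpf_get_current_pid_tgid() >> 32;
    old = counters.lookup(&ev.key);
    ev.value = old ? *old + 1 : 1;
    counters.update(&ev.key, &ev.value);

    /* Tell userspace what changed. */
    events.perf_submit(ctx, &ev, sizeof(ev));
    return 0;
}
"""


class MapEvent(ctypes.Structure):
    # Mirrors struct map_event in BPF_TEXT.
    _fields_ = [("key", ctypes.c_uint64), ("value", ctypes.c_uint64)]


def format_event(ev):
    """Render one notification as text."""
    return "key=%d value=%d" % (ev.key, ev.value)


def main():
    # Imported here so the rest of the file loads without BCC installed.
    from bcc import BPF

    b = BPF(text=BPF_TEXT)
    b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="on_event")

    def handle(cpu, data, size):
        ev = ctypes.cast(data, ctypes.POINTER(MapEvent)).contents
        print(format_event(ev))

    b["events"].open_perf_buffer(handle)
    while True:
        b.perf_buffer_poll()  # waits for events and runs handle()
```

Calling `main()` needs root privileges and a working BCC installation. The design point is that the notification is produced by the dataplane program itself, so it only covers updates that this program makes.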
|
On 17/02/2018 08:02, Y Song via iovisor-dev wrote:
> On Fri, Feb 16, 2018 at 7:59 AM, Francesco Picciariello via iovisor-dev <iovisor-dev@...> wrote:
>> Hello all, Is there a way to receive asynchronous notification each time an eBPF map is modified?
>
> When map is modified in kernel, you can send something into a ring buffer and userspace can poll on this ring buffer to get notification.

When you say "you", do you mean "the dataplane program"? In other words, must the dataplane cooperate and send a notification to userspace when the map has been modified?

Thanks,
fulvio
|
On Fri, Feb 16, 2018 at 11:19 PM, Fulvio Risso <fulvio.risso@...> wrote:
> When you say "you" means "the dataplane program"?

Yes.

> In other words, the dataplane must be collaborative and send a notification to userspace when the map has been modified?

Yes. And not just a notification: it may be the actual data, part of the actual data, or aggregated data if modifications are too frequent.

> Thanks,
> fulvio
|
On 17/02/2018 08:41, Y Song wrote:
> When map is modified in kernel, you can send something into a ring buffer and userspace can poll on this ring buffer to get notification.

Got it. We were looking for a mechanism that is transparent to the eBPF program, though. A possible rationale is to keep a hot-standby copy of the program (including its state) in some other location, but I don't want my dataplane to be aware of that.

Thanks,
fulvio
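To make the hot-standby idea concrete, here is a minimal sketch of the consumer side: whatever mechanism delivers the change notifications, the standby only has to replay them onto its own copy of the state. The `(op, key, value)` event shape and the use of a plain dict as the replica are assumptions for illustration; the thread does not define an event format.

```python
def apply_event(replica, op, key, value=None):
    """Apply one change notification to the standby copy (a plain dict)."""
    if op == "update":
        replica[key] = value
    elif op == "delete":
        replica.pop(key, None)  # deleting a missing key is harmless
    else:
        raise ValueError("unknown op: %r" % (op,))
    return replica


# Replay a hypothetical stream of notifications onto an empty standby.
standby = {}
stream = [
    ("update", b"\x00\x00\x00\x00", b"\x07\x00\x00\x00"),
    ("update", b"\x01\x00\x00\x00", b"\x09\x00\x00\x00"),
    ("delete", b"\x01\x00\x00\x00", None),
]
for event in stream:
    apply_event(standby, *event)
```

Keys and values are kept as raw bytes because that is how map contents are exposed to userspace; decoding them is left to whoever knows the map layout.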
|
On Sat, Feb 17, 2018 at 16:56 Fulvio Risso via iovisor-dev < iovisor-dev@...> wrote:
> Got it.
> We were looking for a mechanism transparent to the eBPF program, though.
> A possible rational is to have an hot-standby copy of the program
> (including the state) in some other location, but I don't want my
> dataplane to be aware of that.
> Thanks,
> fulvio

You could also use another BPF program or ftrace to trace the bpf_map_update_elem tracepoint. But in that case you get all update calls and would need to filter out the ones you are interested in on your own :)

Teng
|

Jesper Dangaard Brouer
On Sat, 17 Feb 2018 13:49:22 +0000 Teng Qin via iovisor-dev <iovisor-dev@...> wrote:
> > We were looking for a mechanism transparent to the eBPF program, though.
> > A possible rational is to have an hot-standby copy of the program
> > (including the state) in some other location, but I don't want my
> > dataplane to be aware of that.
> > Thanks,
> >
> > fulvio
>
> You could also (use another BPF program or ftrace) to trace the
> bpf_map_update_elem Tracepoint. But in that case you get all update calls
> and would need to filter for the one you are interested on your own:)

That is a good idea.

Try it out via perf-record to see if it contains what you need:

 $ perf record -e bpf:bpf_map_update_elem -a

 $ perf script
 xdp_redirect_ma 2273 [011] 261187.968223: bpf:bpf_map_update_elem: map type= ufd=4 key=[00 00 00 00] val=[07 00 00 00]

Looking at the above output and tracepoint kernel code, we should
extend that with a map_id to easily identify/filter what map you are
interested in.

See patch below signature (not even compile tested).

Example for attaching to tracepoints see:
 samples/bpf/xdp_monitor_*.c

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

tracepoint: add map id to bpf tracepoints

From: Jesper Dangaard Brouer <brouer@...>

---
 include/trace/events/bpf.h | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/trace/events/bpf.h b/include/trace/events/bpf.h
index 150185647e6b..e6479ba45261 100644
--- a/include/trace/events/bpf.h
+++ b/include/trace/events/bpf.h
@@ -140,7 +140,7 @@ TRACE_EVENT(bpf_map_create,
         __entry->flags = map->map_flags;
         __entry->ufd = ufd;
     ),
-
+// TODO also add map_id here
     TP_printk("map type=%s ufd=%d key=%u val=%u max=%u flags=%x",
           __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
           __entry->ufd, __entry->size_key, __entry->size_value,
@@ -199,15 +199,18 @@ DECLARE_EVENT_CLASS(bpf_obj_map,
         __field(u32, type)
         __field(int, ufd)
         __string(path, pname->name)
+        __field(u32, map_id)
     ),
 
     TP_fast_assign(
         __assign_str(path, pname->name);
         __entry->type = map->map_type;
         __entry->ufd = ufd;
+        __entry->map_id = map->id;
     ),
 
-    TP_printk("map type=%s ufd=%d path=%s",
+    TP_printk("map id=%u type=%s ufd=%d path=%s",
+          __entry->map_id,
           __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
           __entry->ufd, __get_str(path))
 );
@@ -244,6 +247,7 @@ DECLARE_EVENT_CLASS(bpf_map_keyval,
         __dynamic_array(u8, val, map->value_size)
         __field(bool, val_trunc)
         __field(int, ufd)
+        __field(u32, map_id)
     ),
 
     TP_fast_assign(
@@ -255,9 +259,11 @@ DECLARE_EVENT_CLASS(bpf_map_keyval,
         __entry->val_len = min(map->value_size, 16U);
         __entry->val_trunc = map->value_size != __entry->val_len;
         __entry->ufd = ufd;
+        __entry->map_id = map->id;
     ),
 
-    TP_printk("map type=%s ufd=%d key=[%s%s] val=[%s%s]",
+    TP_printk("map id=%d type=%s ufd=%d key=[%s%s] val=[%s%s]",
+          __entry->map_id,
           __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
           __entry->ufd,
           __print_hex(__get_dynamic_array(key), __entry->key_len),
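As a rough illustration of the "filter on your own" step, the `perf script` line quoted in this message can be parsed in userspace. The sketch below is only an assumption about how one might do it: the regular expression is written against the single sample line shown in the thread, and the optional `id=` field is accepted only in case a kernel carries something like the proposed (untested) patch. The helper name `parse_update` is hypothetical.

```python
import re

# Matches the bpf_map_update_elem line printed by `perf script`.
# The "id=" part is optional: stock output in the thread does not have it.
LINE_RE = re.compile(
    r"bpf:bpf_map_update_elem: map (?:id=(?P<id>\d+) )?type=(?P<type>\S*) "
    r"ufd=(?P<ufd>-?\d+) key=\[(?P<key>[^\]]*)\] val=\[(?P<val>[^\]]*)\]"
)


def _hex_field(field):
    """Turn '07 00 00 00' into bytes, ignoring anything that is not a hex pair."""
    return bytes(int(pair, 16) for pair in re.findall(r"\b[0-9a-fA-F]{2}\b", field))


def parse_update(line):
    """Return a dict describing one update event, or None if the line differs."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "map_id": int(m.group("id")) if m.group("id") else None,
        "ufd": int(m.group("ufd")),
        "key": _hex_field(m.group("key")),
        "val": _hex_field(m.group("val")),
    }


sample = ("xdp_redirect_ma 2273 [011] 261187.968223: bpf:bpf_map_update_elem: "
          "map type= ufd=4 key=[00 00 00 00] val=[07 00 00 00]")
event = parse_update(sample)
```

Note that `ufd` is a file descriptor in the process that issued the update, so filtering on it only makes sense together with the process name or pid from the same line.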
|
That's cool :-) Francesco will have something to do for the next few days :-)
fulvio
|
Hello everybody,
I was looking for a similar mechanism. I need to trace an event on map update/delete. I have tried the tracepoint, but from it I can recover only the file descriptor of the map, and I also need the map id (or the map name). Is there some other way to trace this event and recover this data? I would prefer to avoid modifying the kernel code.
Thank you,
Best regards
Raffaele
--
Raffaele Sommese
Mail: raffysommy@...
About me: https://about.me/r4ffy
Gpg Key: http://www.r4ffy.info/Openpgp.asc
GPG key ID: 0x830b1428cf91db2a on http://pgp.mit.edu:11371/
|
On Wed, Aug 1, 2018 at 2:36 AM, Raffaele Sommese <raffysommy@...> wrote:
> Hello everybody,
> I was looking for a similar mechanism,
> I need to trace an event on map update/delete, I have tried with
> tracepoint but I can recover only the file descriptor of map and I
> need the map id too (or the map name).
> Is there some other solution to trace this event and recover this data?

The bpf tracepoints have been removed from recent Linux kernels, so you need to use a kprobe to trace update/delete.

The first argument of a typical map_update_elem and map_delete_elem is 'struct bpf_map *map'; you can get the name and id from there:

struct bpf_map {
        /* The first two cachelines with read-mostly members of which some
         * are also accessed in fast-path (e.g. ops, max_entries).
         */
        const struct bpf_map_ops *ops ____cacheline_aligned;
        struct bpf_map *inner_map_meta;
#ifdef CONFIG_SECURITY
        void *security;
#endif
        enum bpf_map_type map_type;
        u32 key_size;
        u32 value_size;
        u32 max_entries;
        u32 map_flags;
        u32 pages;
        u32 id;
        int numa_node;
        u32 btf_key_type_id;
        u32 btf_value_type_id;
        struct btf *btf;
        bool unpriv_array;
        /* 55 bytes hole */

        /* The 3rd and 4th cacheline with misc members to avoid false sharing
         * particularly with refcounting.
         */
        struct user_struct *user ____cacheline_aligned;
        atomic_t refcnt;
        atomic_t usercnt;
        struct work_struct work;
        char name[BPF_OBJ_NAME_LEN];
};

> I prefer to avoid to modify the kernel code.
> Thank You,
> Best Regards
> Raffaele
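A kprobe along the lines suggested above might be sketched with BCC as follows. This is an assumption-laden illustration, not a tested recipe: it assumes the map of interest is a hash map, that its update callback `htab_map_update_elem` appears in /proc/kallsyms on the running kernel, and that `struct bpf_map` has the `id` and `name` members shown in this message. Other map types have their own callbacks. The names `trace_update`, `events`, and `UpdateEvent` are hypothetical, and the array length 16 is my assumption for the value of `BPF_OBJ_NAME_LEN`.

```python
import ctypes

# BPF side (BCC dialect of C). The probed function's first argument is
# assumed to be 'struct bpf_map *map', as described in the thread.
BPF_TEXT = r"""
#include <linux/bpf.h>

struct update_event {
    u32 id;
    char name[BPF_OBJ_NAME_LEN];
};

BPF_PERF_OUTPUT(events);

int trace_update(struct pt_regs *ctx, struct bpf_map *map)
{
    struct update_event ev = {};

    /* Copy the two fields out of kernel memory. */
    bpf_probe_read(&ev.id, sizeof(ev.id), &map->id);
    bpf_probe_read(&ev.name, sizeof(ev.name), map->name);
    events.perf_submit(ctx, &ev, sizeof(ev));
    return 0;
}
"""


class UpdateEvent(ctypes.Structure):
    # Mirrors struct update_event; 16 is assumed to equal BPF_OBJ_NAME_LEN.
    _fields_ = [("id", ctypes.c_uint32), ("name", ctypes.c_char * 16)]


def describe(ev):
    """Render one event; the name is NUL-terminated bytes from the kernel."""
    return "map id=%d name=%s" % (ev.id, ev.name.decode("ascii", "replace"))


def main():
    from bcc import BPF  # needs root and BCC installed

    b = BPF(text=BPF_TEXT)
    # Assumed symbol: the hash-map update callback.
    b.attach_kprobe(event="htab_map_update_elem", fn_name="trace_update")

    def handle(cpu, data, size):
        ev = ctypes.cast(data, ctypes.POINTER(UpdateEvent)).contents
        print(describe(ev))

    b["events"].open_perf_buffer(handle)
    while True:
        b.perf_buffer_poll()
```

Because the probe fires for every hash map on the system, userspace (or the BPF program itself) still has to compare the id or name against the map it cares about.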
|
Hello,
I have tried to use a kprobe, but attaching it to that function fails with this error:
raise Exception("Failed to attach BPF to kprobe")
I attach with b.attach_kprobe(event="map_update_elem", fn_name="hello"), and the BPF function is int hello(struct pt_regs *ctx, struct bpf_map *map). (For now I am basically using the code of the hello_perf_output.py example.)
Is this the right way? Or can I attach my eBPF program only to a syscall?
Thank you,
Raffaele
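One way to narrow down an attach failure like this, offered as a general debugging step rather than a diagnosis of this particular case, is to check which symbols the running kernel actually exposes: a kprobe can only be placed on a function that is listed in /proc/kallsyms. The helper below is a hypothetical sketch that scans kallsyms-formatted text for function symbols containing a given substring.

```python
def kprobe_candidates(kallsyms_text, needle):
    """Return function symbols from kallsyms-formatted text that contain needle."""
    found = []
    for line in kallsyms_text.splitlines():
        parts = line.split()
        # Each line is "address type name [module]"; 't'/'T' mark code symbols.
        if len(parts) >= 3 and parts[1] in ("t", "T") and needle in parts[2]:
            found.append(parts[2])
    return found


def main():
    # Reading real addresses may need root, but the names are what matter here.
    with open("/proc/kallsyms") as f:
        for name in kprobe_candidates(f.read(), "map_update_elem"):
            print(name)
```

If the exact name passed to `attach_kprobe` is not in the output of `main()`, trying one of the listed names (for example a per-map-type callback) is a reasonable next experiment.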
|
On Thu, Aug 2, 2018 at 11:36 AM, Raffaele Sommese <raffysommy@...> wrote:
> Hello,
> I have tried to use kprobe but it fails when I try to attach a kprobe
> on that function with this error: raise Exception("Failed to attach
> BPF to kprobe")
> I use b.attach_kprobe(event="map_update_elem", fn_name="hello") for
> the attaching, and int hello(struct pt_regs *ctx,struct bpf_map *map)
> as bpf function.
> (I use basically the code of hello_perf_output.py example right now).
> Is this the right way? Or I can attach my ebpf program only to syscall?
> Thank You,
> Raffaele
Il giorno mer 1 ago 2018 alle ore 17:08 Y Song < ys114321@...> ha scritto:
>
> On Wed, Aug 1, 2018 at 2:36 AM, Raffaele Sommese < raffysommy@...> wrote:
> > Hello everybody,
> > I was looking for a similar mechanism,
> > I need to trace an event on map update/delete, I have tried with
> > tracepoint but I can recover only the file descriptor of map and I
> > need the map id too (or the map name).
> > Is there some other solution to trace this event and recover this data?
>
> bpf tracepoints have been removed from recent linux so the you need to
> use kprobe to trace update/delete.
>
> typical map_update_elem and map_delete_elem first argument is
> 'struct bpf_map *map', you can get name and id from there:
>
> struct bpf_map {
> /* The first two cachelines with read-mostly members of which some
> * are also accessed in fast-path (e.g. ops, max_entries).
> */
> const struct bpf_map_ops *ops ____cacheline_aligned;
> struct bpf_map *inner_map_meta;
> #ifdef CONFIG_SECURITY
> void *security;
> #endif
> enum bpf_map_type map_type;
> u32 key_size;
> u32 value_size;
> u32 max_entries;
> u32 map_flags;
> u32 pages;
> u32 id;
> int numa_node;
> u32 btf_key_type_id;
> u32 btf_value_type_id;
> struct btf *btf;
> bool unpriv_array;
> /* 55 bytes hole */
>
> /* The 3rd and 4th cacheline with misc members to avoid false sharing
> * particularly with refcounting.
> */
> struct user_struct *user ____cacheline_aligned;
> atomic_t refcnt;
> atomic_t usercnt;
> struct work_struct work;
> char name[BPF_OBJ_NAME_LEN];
> };
>
>
> > I prefer to avoid to modify the kernel code.
> > Thank You,
> > Best Regards
> > Raffaele
> > Il giorno sab 17 feb 2018 alle ore 18:41 Jesper Dangaard Brouer via
> > iovisor-dev < iovisor-dev@...> ha scritto:
> >>
> >>
> >>
> >> On Sat, 17 Feb 2018 13:49:22 +0000 Teng Qin via iovisor-dev < iovisor-dev@...> wrote:
> >>
> >> > > We were looking for a mechanism transparent to the eBPF program, though.
> >> > > A possible rational is to have an hot-standby copy of the program
> >> > > (including the state) in some other location, but I don't want my
> >> > > dataplane to be aware of that.
> >> > > Thanks,
> >> > >
> >> > > fulvio
> >> >
> >> >
> >> > You could also (use another BPF program or ftrace) to trace the
> >> > bpf_map_update_elem Tracepoint. But in that case you get all update calls
> >> > and would need to filter for the one you are interested on your own:)
> >>
> >> That is a good idea.
> >>
> >> Try it out via perf-record to see if it contains what you need:
> >>
> >> $ perf record -e bpf:bpf_map_update_elem -a
> >>
> >> $ perf script
> >> xdp_redirect_ma 2273 [011] 261187.968223: bpf:bpf_map_update_elem: map type= ufd=4 key=[00 00 00 00] val=[07 00 00 00]
> >>
> >>
> >> Looking at the above output and tracepoint kernel code, we should
> >> extend that with a map_id to easily identify/filter what map you are
> >> interested in.
> >>
> >> See patch below signature (not even compile tested).
> >>
> >> Example for attaching to tracepoints see:
> >> samples/bpf/xdp_monitor_*.c
> >>
> >> --
> >> Best regards,
> >> Jesper Dangaard Brouer
> >> MSc.CS, Principal Kernel Engineer at Red Hat
> >> LinkedIn: http://www.linkedin.com/in/brouer
> >>
> >> tracepoint: add map id to bpf tracepoints
> >>
> >> From: Jesper Dangaard Brouer < brouer@...>
> >>
> >>
> >> ---
> >> include/trace/events/bpf.h | 12 +++++++++---
> >> 1 file changed, 9 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/include/trace/events/bpf.h b/include/trace/events/bpf.h
> >> index 150185647e6b..e6479ba45261 100644
> >> --- a/include/trace/events/bpf.h
> >> +++ b/include/trace/events/bpf.h
> >> @@ -140,7 +140,7 @@ TRACE_EVENT(bpf_map_create,
> >> __entry->flags = map->map_flags;
> >> __entry->ufd = ufd;
> >> ),
> >> -
> >> +// TODO also add map_id here
> >> TP_printk("map type=%s ufd=%d key=%u val=%u max=%u flags=%x",
> >> __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
> >> __entry->ufd, __entry->size_key, __entry->size_value,
> >> @@ -199,15 +199,18 @@ DECLARE_EVENT_CLASS(bpf_obj_map,
> >> __field(u32, type)
> >> __field(int, ufd)
> >> __string(path, pname->name)
> >> + __field(u32, map_id)
> >> ),
> >>
> >> TP_fast_assign(
> >> __assign_str(path, pname->name);
> >> __entry->type = map->map_type;
> >> __entry->ufd = ufd;
> >> + __entry->map_id = map->id;
> >> ),
> >>
> >> - TP_printk("map type=%s ufd=%d path=%s",
> >> + TP_printk("map id=%u type=%s ufd=%d path=%s",
> >> + __entry->map_id,
> >> __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
> >> __entry->ufd, __get_str(path))
> >> );
> >> @@ -244,6 +247,7 @@ DECLARE_EVENT_CLASS(bpf_map_keyval,
> >> __dynamic_array(u8, val, map->value_size)
> >> __field(bool, val_trunc)
> >> __field(int, ufd)
> >> + __field(u32, map_id)
> >> ),
> >>
> >> TP_fast_assign(
> >> @@ -255,9 +259,11 @@ DECLARE_EVENT_CLASS(bpf_map_keyval,
> >> __entry->val_len = min(map->value_size, 16U);
> >> __entry->val_trunc = map->value_size != __entry->val_len;
> >> __entry->ufd = ufd;
> >> + __entry->map_id = map->id;
> >> ),
> >>
> >> - TP_printk("map type=%s ufd=%d key=[%s%s] val=[%s%s]",
> >> + TP_printk("map id=%d type=%s ufd=%d key=[%s%s] val=[%s%s]",
> >> + __entry->map_id,
> >> __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
> >> __entry->ufd,
> >> __print_hex(__get_dynamic_array(key), __entry->key_len),
--
________________________________
Raffaele Sommese
Mail:raffysommy@...
About me: https://about.me/r4ffy
Gpg Key: http://www.r4ffy.info/Openpgp.asc
GPG key ID: 0x830b1428cf91db2a on http://pgp.mit.edu:11371/
|
I think that I need kprobes: map_update_elem and map_delete_elem are defined in kernel space.
Thank you,
Raffaele

On Thu, 2 Aug 2018 at 17:39, Sunny Klair <sunny@...> wrote:
> You likely want uprobes if your function is defined in userspace, not kprobes (which are for functions defined in kernel space).
> relevant link: http://www.brendangregg.com/blog/2015-06-28/linux-ftrace-uprobe.html
> - Sunny
>
> On Thu, Aug 2, 2018 at 11:36 AM, Raffaele Sommese <raffysommy@...> wrote:
> > Hello, I have tried to use a kprobe, but attaching it to that function fails with this error:
> >     raise Exception("Failed to attach BPF to kprobe")
> > I use b.attach_kprobe(event="map_update_elem", fn_name="hello") for the attaching, and int hello(struct pt_regs *ctx, struct bpf_map *map) as the BPF function. (I am basically using the code of the hello_perf_output.py example right now.) Is this the right way? Or can I attach my eBPF program only to a syscall?
> > Thank you,
> > Raffaele
> >
> > On Wed, 1 Aug 2018 at 17:08, Y Song <ys114321@...> wrote:
> > > On Wed, Aug 1, 2018 at 2:36 AM, Raffaele Sommese <raffysommy@...> wrote:
> > > > Hello everybody, I was looking for a similar mechanism. I need to trace an event on map update/delete. I have tried with a tracepoint, but I can recover only the file descriptor of the map and I need the map id too (or the map name). Is there some other solution to trace this event and recover this data?
> > > bpf tracepoints have been removed from recent Linux, so you need to use kprobe to trace update/delete.
> > > The first argument of a typical map_update_elem and map_delete_elem is 'struct bpf_map *map'; you can get the name and id from there:
> > >
> > > struct bpf_map {
> > >     /* The first two cachelines with read-mostly members of which some
> > >      * are also accessed in fast-path (e.g. ops, max_entries).
> > >      */
> > >     const struct bpf_map_ops *ops ____cacheline_aligned;
> > >     struct bpf_map *inner_map_meta;
> > > #ifdef CONFIG_SECURITY
> > >     void *security;
> > > #endif
> > >     enum bpf_map_type map_type;
> > >     u32 key_size;
> > >     u32 value_size;
> > >     u32 max_entries;
> > >     u32 map_flags;
> > >     u32 pages;
> > >     u32 id;
> > >     int numa_node;
> > >     u32 btf_key_type_id;
> > >     u32 btf_value_type_id;
> > >     struct btf *btf;
> > >     bool unpriv_array;
> > >     /* 55 bytes hole */
> > >
> > >     /* The 3rd and 4th cacheline with misc members to avoid false sharing
> > >      * particularly with refcounting.
> > >      */
> > >     struct user_struct *user ____cacheline_aligned;
> > >     atomic_t refcnt;
> > >     atomic_t usercnt;
> > >     struct work_struct work;
> > >     char name[BPF_OBJ_NAME_LEN];
> > > };
|
> bpf tracepoints have been removed from recent Linux, so you need to use kprobe to trace update/delete.
> The first argument of a typical map_update_elem and map_delete_elem is 'struct bpf_map *map'; you can get the name and id from there:

Hello again :)
It seems that there are two functions that can be traced inside the kernel: one is map_update_elem, which serves the syscall, and the other is the BPF helper. I have successfully attached my eBPF code to the first one, but it does not take a struct bpf_map *map parameter (it takes a union bpf_attr). If I attach my program to bpf_map_update_elem (which I think is the function name of the BPF helper), I don't receive any events. I'm using the latest versions of bcc and of the kernel. I also tried the kprobe program from the kernel's perf suite, with the same results. I was looking for this helper:

    BPF_CALL_4(bpf_map_update_elem, struct bpf_map *, map, void *, key, void *, value, u64, flags)

Thank you again for the support,
Raffaele
|
On Mon, Aug 6, 2018 at 10:17 AM, Raffaele Sommese <raffysommy@...> wrote:
> If I attach my program to bpf_map_update_elem (which I think is the function name of the BPF helper), I don't receive any events. I'm using the latest versions of bcc and of the kernel. I also tried the kprobe program from the kernel's perf suite, with the same results.

Please directly use the map lookup function for the specific map. For example, for hashmap, the verifier is smart enough to change the byte code to call the underlying hashmap map lookup function.
===
/* inline bpf_map_lookup_elem() call.
 * Instead of:
 * bpf_prog
 *   bpf_map_lookup_elem
 *     map->ops->map_lookup_elem
 *       htab_map_lookup_elem
 *         __htab_map_lookup_elem
 * do:
 * bpf_prog
 *   __htab_map_lookup_elem
 */
static u32 htab_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
{
    struct bpf_insn *insn = insn_buf;
    const int ret = BPF_REG_0;

    BUILD_BUG_ON(!__same_type(&__htab_map_lookup_elem,
                 (void *(*)(struct bpf_map *map, void *key))NULL));
    *insn++ = BPF_EMIT_CALL(BPF_CAST_CALL(__htab_map_lookup_elem));
    *insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 1);
    *insn++ = BPF_ALU64_IMM(BPF_ADD, ret,
                            offsetof(struct htab_elem, key) +
                            round_up(map->key_size, 8));
    return insn - insn_buf;
}
===
Please do check whether map_gen_lookup is implemented or not for the map type you are interested in. For example, for the latest bpf-next, map_gen_lookup is implemented for hashtable:

const struct bpf_map_ops htab_map_ops = {
    .map_alloc_check = htab_map_alloc_check,
    .map_alloc = htab_map_alloc,
    .map_free = htab_map_free,
    .map_get_next_key = htab_map_get_next_key,
    .map_lookup_elem = htab_map_lookup_elem,
    .map_update_elem = htab_map_update_elem,
    .map_delete_elem = htab_map_delete_elem,
    .map_gen_lookup = htab_map_gen_lookup,
};

So for a hash map, the function you should attach to is __htab_map_lookup_elem.
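The suggestion above can be sketched with bcc roughly as follows. This is only an illustration under several assumptions: the names trace_lookup, event_t, decode_name and run are invented for the sketch, it assumes __htab_map_lookup_elem is visible as a kprobe target on the running kernel (it is a static function and may be inlined), and it assumes a bcc version that provides the events.event() helper. It reads the id and name fields from the struct bpf_map quoted earlier in the thread.

```python
"""Sketch: report map id and name on hash-map lookups via a bcc kprobe."""

# C program compiled by bcc at load time. trace_lookup and event_t are
# hypothetical names chosen for this sketch.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>
#include <linux/bpf.h>

struct event_t {
    u32 pid;
    u32 map_id;
    char name[BPF_OBJ_NAME_LEN];
};
BPF_PERF_OUTPUT(events);

int trace_lookup(struct pt_regs *ctx, struct bpf_map *map)
{
    struct event_t ev = {};

    ev.pid = bpf_get_current_pid_tgid() >> 32;
    ev.map_id = map->id;
    bpf_probe_read(&ev.name, sizeof(ev.name), map->name);
    events.perf_submit(ctx, &ev, sizeof(ev));
    return 0;
}
"""


def decode_name(raw):
    """Turn a NUL-padded map name from the kernel into a Python str."""
    return bytes(raw).split(b"\x00", 1)[0].decode("ascii", "replace")


def run():
    """Attach the probe and print events; needs root and bcc installed."""
    from bcc import BPF  # imported lazily so the helpers above work without bcc

    b = BPF(text=BPF_PROGRAM)
    b.attach_kprobe(event="__htab_map_lookup_elem", fn_name="trace_lookup")

    def on_event(cpu, data, size):
        ev = b["events"].event(data)
        print("pid=%d map_id=%d name=%s"
              % (ev.pid, ev.map_id, decode_name(ev.name)))

    b["events"].open_perf_buffer(on_event)
    while True:
        b.perf_buffer_poll()
```

Calling run() as root would print one line per lookup on any hash map, so the output still has to be filtered by map_id or name in userspace.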
|
On Mon, 6 Aug 2018 at 19:40, Y Song <ys114321@...> wrote:
> Please directly use the map lookup function for the specific map. For example, for hashmap, the verifier is smart enough to change the byte code to call the underlying hashmap map lookup function.

Thank you. For now I will only try to implement a solution for hash maps. I have noticed a strange behavior: for lookups I receive the event when the map is looked up, but for updates I don't receive anything. I have checked the kernel and map_gen_lookup is there. The strange thing is that if I use the kprobe tool I can see the event on htab_map_update_elem. Here is my test code (I have tried it with lookup and it works):
https://gist.github.com/raffysommy/1dabe5bf9487d974f3acd1f7a32ed01c
https://gist.github.com/raffysommy/587f61c14d3e157f86da1aadd07442b1
Thanks again,
Raffaele
|
On Mon, Aug 6, 2018 at 11:53 AM, Raffaele Sommese <raffysommy@...> wrote:
> For lookups I receive the event when the map is looked up, but for updates I don't receive anything. I have checked the kernel and map_gen_lookup is there. The strange thing is that if I use the kprobe tool I can see the event on htab_map_update_elem.

Okay, htab_map_update_elem is indeed called, but you cannot trace it with a BPF program. The following kernel code in kernel/bpf/syscall.c explains the reason:

    /* must increment bpf_prog_active to avoid kprobe+bpf triggering from
     * inside bpf map update or delete otherwise deadlocks are possible
     */
    preempt_disable();
    __this_cpu_inc(bpf_prog_active);
    if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
        map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
            err = bpf_percpu_hash_update(map, key, value, attr->flags);
    } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
            err = bpf_percpu_array_update(map, key, value, attr->flags);
    } else if (IS_FD_ARRAY(map)) {
            rcu_read_lock();
            err = bpf_fd_array_map_update_elem(map, f.file, key, value,
                                               attr->flags);
            rcu_read_unlock();
    } else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
            rcu_read_lock();
            err = bpf_fd_htab_map_update_elem(map, f.file, key, value,
                                              attr->flags);
            rcu_read_unlock();
    } else {
            rcu_read_lock();
            err = map->ops->map_update_elem(map, key, value, attr->flags);
            rcu_read_unlock();
    }
    __this_cpu_dec(bpf_prog_active);
    preempt_enable();

Because bpf_prog_active is incremented, a BPF program attached through a later kprobe on htab_map_update_elem will not run.

How can we solve this problem then? One possible solution is as follows:
  . disassemble vmlinux to find a proper place in function "map_update_elem" where you can get the "map" (struct bpf_map *map) in a register, e.g., the insn offset inside map_update_elem is OFFSET, and this OFFSET should be outside the above preempt/__this_cpu_{inc/dec} region.
  . improve trace.py to trace function+offset. The possible format could be
        trace.py 'map_update_elem+OFFSET ...'
The attach_kprobe API should already support function_name + offset format.
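To make the effect of the guard concrete, here is a toy model in plain Python. It is not kernel code: Cpu, kprobe_hit and the two update functions are hypothetical stand-ins. It assumes that the kprobe path itself bumps the same per-CPU counter and runs the attached BPF program only when the result is exactly 1, which is how I understand the kernel's tracing entry point to behave.

```python
"""Toy model of the bpf_prog_active guard; a simulation, not kernel code."""


class Cpu:
    """Holds the per-CPU counter that the kernel calls bpf_prog_active."""

    def __init__(self):
        self.bpf_prog_active = 0
        self.events = []  # what the attached tracing program managed to record


def kprobe_hit(cpu, func_name):
    """Model a kprobe with a BPF program attached firing on func_name.

    The program is skipped when other BPF activity is already in progress
    on this CPU, i.e. when the counter was non-zero on entry.
    """
    cpu.bpf_prog_active += 1
    try:
        if cpu.bpf_prog_active != 1:
            return False  # nested: the BPF program does not run
        cpu.events.append(func_name)
        return True
    finally:
        cpu.bpf_prog_active -= 1


def htab_map_update_elem(cpu, table, key, value):
    """Model of the per-map update; the probe fires but its program is skipped."""
    kprobe_hit(cpu, "htab_map_update_elem")
    table[key] = value


def syscall_map_update_elem(cpu, table, key, value):
    """Model of the syscall path quoted above from kernel/bpf/syscall.c."""
    kprobe_hit(cpu, "map_update_elem")  # probe at function entry, before the guard
    cpu.bpf_prog_active += 1            # __this_cpu_inc(bpf_prog_active)
    try:
        htab_map_update_elem(cpu, table, key, value)
    finally:
        cpu.bpf_prog_active -= 1        # __this_cpu_dec(bpf_prog_active)


cpu = Cpu()
table = {}
syscall_map_update_elem(cpu, table, b"\x00\x00\x00\x00", b"\x07\x00\x00\x00")
print(cpu.events)  # ['map_update_elem']
print(table)
```

In this model the update itself still happens, and only the probe at the entry of map_update_elem is recorded, which matches what was observed in the thread: the program attached to the syscall function fires, the one on htab_map_update_elem stays silent.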
|
> How can we solve this problem then? One possible solution is as follows:
>   . disassemble vmlinux to find a proper place in function "map_update_elem" where you can get the "map" (struct bpf_map *map) in a register, e.g., the insn offset inside map_update_elem is OFFSET, and this OFFSET should be outside the above preempt/__this_cpu_{inc/dec} region.
>   . improve trace.py to trace function+offset. The possible format could be trace.py 'map_update_elem+OFFSET ...'
> The attach_kprobe API should already support function_name + offset format.

I think that this approach can be very tricky and platform dependent; I have found another solution. If I attach my BPF program to bpf_map_new_fd with a kprobe and a kretprobe, I can recover the mapping between (map fd, pid) and the id or the name of the map (and save it). I have tested it and it seems to work. Then I can trace the map_update_elem syscall and read the data (I'm interested only in the key) from userspace. I attach the code here; it may be helpful to other people who want to address this problem.
https://gist.github.com/raffysommy/45cf0544f34eb0e5fbf533f4d9a3b955
Thank you again for the support and for your time.
Raffaele
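The bookkeeping behind this approach can be sketched in plain Python. This is not the code from the gist: MapRegistry and its method names are hypothetical, and the three callbacks stand for events that the kprobe on bpf_map_new_fd, its kretprobe, and the probe on map_update_elem would deliver to userspace. The sketch assumes the entry and return of bpf_map_new_fd are paired by thread id, and that file descriptors are looked up per process.

```python
"""Userspace bookkeeping sketch: correlate (pid, fd) with map id and name."""


class MapRegistry:
    def __init__(self):
        self._pending = {}  # tid -> (map_id, name), seen at bpf_map_new_fd entry
        self._by_fd = {}    # (pid, fd) -> (map_id, name)

    def on_new_fd_entry(self, tid, map_id, name):
        """kprobe on bpf_map_new_fd: the map is known, the fd is not yet."""
        self._pending[tid] = (map_id, name)

    def on_new_fd_return(self, pid, tid, fd):
        """kretprobe on bpf_map_new_fd: the return value is the new fd."""
        ident = self._pending.pop(tid, None)
        if ident is not None and fd >= 0:
            self._by_fd[(pid, fd)] = ident

    def on_update(self, pid, ufd, key):
        """Probe on map_update_elem: resolve the fd taken from union bpf_attr."""
        ident = self._by_fd.get((pid, ufd))
        if ident is None:
            return None  # fd was created before tracing started, or inherited
        map_id, name = ident
        return {"map_id": map_id, "name": name, "key": key.hex()}


# Made-up sample values: map id 7 and the name "tx_port" are hypothetical.
reg = MapRegistry()
reg.on_new_fd_entry(tid=2273, map_id=7, name="tx_port")
reg.on_new_fd_return(pid=2273, tid=2273, fd=4)
print(reg.on_update(pid=2273, ufd=4, key=bytes(4)))
```

A table like this goes stale when a descriptor is closed and its number reused, or when a descriptor is inherited or passed to another process, so a real tool would probably need to handle those cases as well.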
|
On Mon, Aug 6, 2018 at 2:52 PM, Raffaele Sommese <raffysommy@...> wrote:
> If I attach my BPF program to bpf_map_new_fd with a kprobe and a kretprobe, I can recover the mapping between (map fd, pid) and the id or the name of the map (and save it). I have tested it and it seems to work. Then I can trace the map_update_elem syscall and read the data (I'm interested only in the key) from userspace.

Yes, this approach should work too. I am thinking about whether we could do it with one invocation of trace.py...
|