Notification when an eBPF map is modified


Francesco Picciariello <picciariello.francesco@...>
 

Hello all,
Is there a way to receive an asynchronous notification each time an eBPF map is modified?

I used bpf_obj_pin() to pin a specific map to the filesystem, and I set up a Linux inotify watch on the pinned object.

The inotify API provides a mechanism for monitoring filesystem events, and the goal is to notice when a pinned eBPF map is modified. However, inotify does not seem to work properly with the BPF filesystem: it detects the creation and the deletion of the pinned object, but not its modification.

Regards.


Yonghong Song
 

On Fri, Feb 16, 2018 at 7:59 AM, Francesco Picciariello via
iovisor-dev <iovisor-dev@...> wrote:
> Hello all,
> Is there a way to receive asynchronous notification each time an eBPF map is
> modified?

When a map is modified in the kernel, you can send something into a ring buffer,
and userspace can poll on this ring buffer to get the notification.
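
A minimal sketch of this approach with bcc's Python front end, under stated assumptions: the map name `counters`, the event layout `update_evt`, the handler `on_event`, and the choice of the `sync` syscall as attach point are all hypothetical and stand in for a real dataplane program. The BPF side updates the map and then pushes a copy of the new entry into a perf ring buffer; userspace polls that buffer. Loading the program needs root and an installed bcc, so the live part runs only when `--live` is passed.

```python
import ctypes
import sys

# BPF C source in bcc dialect. Every name below (counters, events,
# update_evt, on_event) is hypothetical and only illustrates the pattern.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

struct update_evt {
    u32 key;
    u64 value;
};

BPF_HASH(counters, u32, u64);
BPF_PERF_OUTPUT(events);

int on_event(struct pt_regs *ctx) {
    u32 key = bpf_get_current_pid_tgid() >> 32;
    u64 *cur = counters.lookup(&key);
    u64 next = cur ? *cur + 1 : 1;
    struct update_evt evt = {};

    counters.update(&key, &next);                /* the map modification */

    evt.key = key;
    evt.value = next;
    events.perf_submit(ctx, &evt, sizeof(evt));  /* notify userspace */
    return 0;
}
"""


class UpdateEvt(ctypes.Structure):
    """Userspace mirror of struct update_evt (same field order and padding)."""
    _fields_ = [("key", ctypes.c_uint32), ("value", ctypes.c_uint64)]


def decode_event(raw):
    """Turn the raw bytes of one ring-buffer record into (key, value)."""
    evt = UpdateEvt.from_buffer_copy(raw)
    return evt.key, evt.value


def main():
    from bcc import BPF  # imported lazily: needs bcc and root privileges

    b = BPF(text=BPF_PROGRAM)
    b.attach_kprobe(event=b.get_syscall_fnname("sync"), fn_name="on_event")

    def handle(cpu, data, size):
        raw = ctypes.string_at(data, ctypes.sizeof(UpdateEvt))
        print("map updated: key=%d value=%d" % decode_event(raw))

    b["events"].open_perf_buffer(handle)
    while True:
        b.perf_buffer_poll()  # blocks until the kernel side submits events


if __name__ == "__main__" and "--live" in sys.argv:
    main()
```

The event carries the key and the new value, so the consumer does not need to read the map again after each notification.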


> I used bpf_obj_pin() in order to save a specific map on filesystem, and I
> called linux inotify() on the pinned object.
>
> The inotify API provides a mechanism for monitoring filesystem events, and
> the goal is to notice when a pinned eBPF map is modified, but it seems
> inotify() does not work properly with bpf filesystem. In fact it's able to
> detect the creation and the deletion, but not the modification of the pinned
> object.

inotify relies on vfs_read/vfs_write to report read/write operations.
Here, BPF map reads and writes happen inside a BPF program or through the
bpf() syscall, so they never go through those paths, and hence the inotify
mechanism won't work.

> Regards.

_______________________________________________
iovisor-dev mailing list
iovisor-dev@...
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Fulvio Risso
 

On 17/02/2018 08:02, Y Song via iovisor-dev wrote:
> On Fri, Feb 16, 2018 at 7:59 AM, Francesco Picciariello via
> iovisor-dev <iovisor-dev@...> wrote:
>> Hello all,
>> Is there a way to receive asynchronous notification each time an eBPF map is
>> modified?
>
> When map is modified in kernel, you can send something into a ring buffer
> and userspace can poll on this ring buffer to get notification.

When you say "you", do you mean "the dataplane program"?
In other words, must the dataplane cooperate and send a notification to userspace when the map has been modified?

Thanks,

fulvio





Yonghong Song
 

On Fri, Feb 16, 2018 at 11:19 PM, Fulvio Risso <fulvio.risso@...> wrote:



> When you say "you" means "the dataplane program"?

Yes.

> In other words, the dataplane must be collaborative and send a notification
> to userspace when the map has been modified?

Yes. And not just a notification: it may be the actual data, part of the
actual data, or aggregated data if modifications are too frequent.




Fulvio Risso
 

On 17/02/2018 08:41, Y Song wrote:
> When map is modified in kernel, you can send something into a ring buffer
> and userspace can poll on this ring buffer to get notification.

Got it.
We were looking for a mechanism transparent to the eBPF program, though.
A possible rationale is to keep a hot-standby copy of the program (including its state) in some other location, but I don't want my dataplane to be aware of that.

Thanks,

fulvio






Teng Qin
 


On Sat, Feb 17, 2018 at 16:56 Fulvio Risso via iovisor-dev <iovisor-dev@...> wrote:


> Got it.
> We were looking for a mechanism transparent to the eBPF program, though.
> A possible rational is to have an hot-standby copy of the program
> (including the state) in some other location, but I don't want my
> dataplane to be aware of that.
> Thanks,
>
>         fulvio

You could also trace the bpf_map_update_elem tracepoint (with another BPF program or ftrace). In that case, though, you get all update calls and would need to filter for the ones you are interested in on your own :)

Teng







Jesper Dangaard Brouer
 

On Sat, 17 Feb 2018 13:49:22 +0000 Teng Qin via iovisor-dev <iovisor-dev@...> wrote:

>> We were looking for a mechanism transparent to the eBPF program, though.
>> A possible rational is to have an hot-standby copy of the program
>> (including the state) in some other location, but I don't want my
>> dataplane to be aware of that.
>> Thanks,
>>
>> fulvio
>
> You could also (use another BPF program or ftrace) to trace the
> bpf_map_update_elem Tracepoint. But in that case you get all update calls
> and would need to filter for the one you are interested on your own:)

That is a good idea.

Try it out via perf-record to see if it contains what you need:

$ perf record -e bpf:bpf_map_update_elem -a

$ perf script
xdp_redirect_ma 2273 [011] 261187.968223: bpf:bpf_map_update_elem: map type= ufd=4 key=[00 00 00 00] val=[07 00 00 00]
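
A small sketch of the filtering step, assuming the `perf script` line format shown above. The helper names (`parse_update`, `updates_for_fd`) are hypothetical. Since the tracepoint output carries only `ufd`, the filter works on that field; note that a file descriptor number is only meaningful within the process that issued the update.

```python
import re

# Matches the tail of a bpf:bpf_map_update_elem line as printed by perf script.
UPDATE_RE = re.compile(
    r"bpf:bpf_map_update_elem: map type=(?P<type>\S*) ufd=(?P<ufd>\d+) "
    r"key=\[(?P<key>[^\]]*)\] val=\[(?P<val>[^\]]*)\]"
)


def _hex_bytes(text):
    """Convert '07 00 00 00' to bytes, skipping tokens that are not hex pairs."""
    return bytes(int(tok, 16) for tok in text.split()
                 if re.fullmatch(r"[0-9a-fA-F]{2}", tok))


def parse_update(line):
    """Return a dict describing one update event, or None for other lines."""
    m = UPDATE_RE.search(line)
    if m is None:
        return None
    return {
        "type": m.group("type"),
        "ufd": int(m.group("ufd")),
        "key": _hex_bytes(m.group("key")),
        "val": _hex_bytes(m.group("val")),
    }


def updates_for_fd(lines, ufd):
    """Yield only the events whose map file descriptor equals ufd."""
    for line in lines:
        evt = parse_update(line)
        if evt is not None and evt["ufd"] == ufd:
            yield evt


if __name__ == "__main__":
    sample = ("xdp_redirect_ma 2273 [011] 261187.968223: "
              "bpf:bpf_map_update_elem: map type= ufd=4 "
              "key=[00 00 00 00] val=[07 00 00 00]")
    for evt in updates_for_fd([sample], ufd=4):
        print(evt["ufd"], evt["key"].hex(),
              int.from_bytes(evt["val"], "little"))
```

In practice the lines would come from piping `perf script` into the script and iterating over `sys.stdin`.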


Looking at the above output and tracepoint kernel code, we should
extend that with a map_id to easily identify/filter what map you are
interested in.

See patch below signature (not even compile tested).

For an example of attaching to tracepoints, see:
samples/bpf/xdp_monitor_*.c

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer

tracepoint: add map id to bpf tracepoints

From: Jesper Dangaard Brouer <brouer@...>


---
 include/trace/events/bpf.h |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/trace/events/bpf.h b/include/trace/events/bpf.h
index 150185647e6b..e6479ba45261 100644
--- a/include/trace/events/bpf.h
+++ b/include/trace/events/bpf.h
@@ -140,7 +140,7 @@ TRACE_EVENT(bpf_map_create,
                __entry->flags       = map->map_flags;
                __entry->ufd         = ufd;
        ),
-
+// TODO also add map_id here
        TP_printk("map type=%s ufd=%d key=%u val=%u max=%u flags=%x",
                  __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
                  __entry->ufd, __entry->size_key, __entry->size_value,
@@ -199,15 +199,18 @@ DECLARE_EVENT_CLASS(bpf_obj_map,
                __field(u32, type)
                __field(int, ufd)
                __string(path, pname->name)
+               __field(u32, map_id)
        ),

        TP_fast_assign(
                __assign_str(path, pname->name);
                __entry->type = map->map_type;
                __entry->ufd  = ufd;
+               __entry->map_id = map->id;
        ),

-       TP_printk("map type=%s ufd=%d path=%s",
+       TP_printk("map id=%u type=%s ufd=%d path=%s",
+                 __entry->map_id,
                  __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
                  __entry->ufd, __get_str(path))
 );
@@ -244,6 +247,7 @@ DECLARE_EVENT_CLASS(bpf_map_keyval,
                __dynamic_array(u8, val, map->value_size)
                __field(bool, val_trunc)
                __field(int, ufd)
+               __field(u32, map_id)
        ),

        TP_fast_assign(
@@ -255,9 +259,11 @@ DECLARE_EVENT_CLASS(bpf_map_keyval,
                __entry->val_len = min(map->value_size, 16U);
                __entry->val_trunc = map->value_size != __entry->val_len;
                __entry->ufd = ufd;
+               __entry->map_id = map->id;
        ),

-       TP_printk("map type=%s ufd=%d key=[%s%s] val=[%s%s]",
+       TP_printk("map id=%d type=%s ufd=%d key=[%s%s] val=[%s%s]",
+                 __entry->map_id,
                  __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
                  __entry->ufd,
                  __print_hex(__get_dynamic_array(key), __entry->key_len),


Fulvio Risso
 

That's cool :-)
Francesco will have something to do for the next few days :-)

fulvio



Raffaele Sommese
 

Hello everybody,
I was looking for a similar mechanism.
I need to trace map update/delete events. I have tried the tracepoint,
but I can recover only the file descriptor of the map, and I
need the map id (or the map name) too.
Is there another way to trace these events and recover this data?
I would prefer to avoid modifying the kernel code.
Thank You,
Best Regards
Raffaele

--
________________________________
Raffaele Sommese
Mail:raffysommy@...
About me:https://about.me/r4ffy
Gpg Key:http://www.r4ffy.info/Openpgp.asc
GPG key ID: 0x830b1428cf91db2a on http://pgp.mit.edu:11371/


Yonghong Song
 

On Wed, Aug 1, 2018 at 2:36 AM, Raffaele Sommese <raffysommy@...> wrote:
> Hello everybody,
> I was looking for a similar mechanism,
> I need to trace an event on map update/delete, I have tried with
> tracepoint but I can recover only the file descriptor of map and I
> need the map id too (or the map name).
> Is there some other solution to trace this event and recover this data?

The bpf tracepoints have been removed from recent Linux kernels, so you need
to use a kprobe to trace update/delete.

The first argument of a typical map_update_elem or map_delete_elem
implementation is 'struct bpf_map *map'; you can get the name and id from there:

struct bpf_map {
        /* The first two cachelines with read-mostly members of which some
         * are also accessed in fast-path (e.g. ops, max_entries).
         */
        const struct bpf_map_ops *ops ____cacheline_aligned;
        struct bpf_map *inner_map_meta;
#ifdef CONFIG_SECURITY
        void *security;
#endif
        enum bpf_map_type map_type;
        u32 key_size;
        u32 value_size;
        u32 max_entries;
        u32 map_flags;
        u32 pages;
        u32 id;
        int numa_node;
        u32 btf_key_type_id;
        u32 btf_value_type_id;
        struct btf *btf;
        bool unpriv_array;
        /* 55 bytes hole */

        /* The 3rd and 4th cacheline with misc members to avoid false sharing
         * particularly with refcounting.
         */
        struct user_struct *user ____cacheline_aligned;
        atomic_t refcnt;
        atomic_t usercnt;
        struct work_struct work;
        char name[BPF_OBJ_NAME_LEN];
};
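
A hedged sketch of what such a kprobe could look like with bcc. Several things here are assumptions rather than statements from this thread: the probed symbols `htab_map_update_elem` and `array_map_update_elem` are taken as examples of per-map-type update implementations and may be missing or not probe-able on a given kernel; `<linux/bpf.h>` is assumed to expose `struct bpf_map` to the bcc compiler; and `map_evt`, `trace_update`, and `decode_event` are hypothetical names. The live part needs root and bcc, so it runs only with `--live`.

```python
import ctypes
import sys

BPF_OBJ_NAME_LEN = 16  # value of the kernel's BPF_OBJ_NAME_LEN constant

# BPF C source in bcc dialect; map_evt and trace_update are hypothetical names.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>
#include <linux/bpf.h>   /* assumed to provide struct bpf_map */

struct map_evt {
    u32 id;
    char name[BPF_OBJ_NAME_LEN];
};

BPF_PERF_OUTPUT(events);

/* bcc binds the parameters after ctx to the probed function's arguments,
 * so 'map' is the first argument of the kernel function. */
int trace_update(struct pt_regs *ctx, struct bpf_map *map) {
    struct map_evt evt = {};

    evt.id = map->id;
    bpf_probe_read(&evt.name, sizeof(evt.name), map->name);
    events.perf_submit(ctx, &evt, sizeof(evt));
    return 0;
}
"""


class MapEvt(ctypes.Structure):
    """Userspace mirror of struct map_evt."""
    _fields_ = [("id", ctypes.c_uint32),
                ("name", ctypes.c_char * BPF_OBJ_NAME_LEN)]


def decode_event(raw):
    """Turn the raw bytes of one record into (map id, map name)."""
    evt = MapEvt.from_buffer_copy(raw)
    return evt.id, evt.name.decode("ascii", "replace")


def main(symbols=("htab_map_update_elem", "array_map_update_elem")):
    from bcc import BPF  # imported lazily: needs bcc and root privileges

    b = BPF(text=BPF_PROGRAM)
    for sym in symbols:
        try:
            b.attach_kprobe(event=sym, fn_name="trace_update")
        except Exception as exc:  # symbol missing or not probe-able
            print("skipping %s: %s" % (sym, exc))

    def handle(cpu, data, size):
        raw = ctypes.string_at(data, ctypes.sizeof(MapEvt))
        print("update on map id=%d name=%s" % decode_event(raw))

    b["events"].open_perf_buffer(handle)
    while True:
        b.perf_buffer_poll()


if __name__ == "__main__" and "--live" in sys.argv:
    main()
```

Filtering on `id` inside the BPF program, before `perf_submit`, would keep unrelated maps from generating events at all.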





Raffaele Sommese
 

Hello,
I have tried to use a kprobe, but attaching it to that function fails
with this error: raise Exception("Failed to attach BPF to kprobe")
I attach with b.attach_kprobe(event="map_update_elem", fn_name="hello"),
and the BPF function is int hello(struct pt_regs *ctx, struct bpf_map *map).
(Right now I am basically using the code of the hello_perf_output.py example.)
Is this the right way? Or can I attach my eBPF program only to a syscall?
Thank You,
Raffaele
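
One possible cause, offered as an assumption and not confirmed in this thread, is that the running kernel has no probe-able symbol named exactly `map_update_elem`. A quick way to check is to look for matching function symbols in `/proc/kallsyms`. The sketch below parses that format; the helper name `kernel_symbols` and the sample addresses are made up for illustration.

```python
import re


def kernel_symbols(pattern, kallsyms_text):
    """Return names of text-section symbols in kallsyms output matching pattern."""
    rx = re.compile(pattern)
    found = []
    for line in kallsyms_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        kind, name = parts[1], parts[2]
        # 't' and 'T' mark local and global functions in the text section.
        if kind in ("t", "T") and rx.search(name):
            found.append(name)
    return found


# Hypothetical excerpt in /proc/kallsyms format (addresses are invented).
SAMPLE = """\
ffffffff811c2a10 t htab_map_update_elem
ffffffff811c5f40 t array_map_update_elem
ffffffff811b9e20 T bpf_map_update_elem
"""

if __name__ == "__main__":
    print(kernel_symbols(r"map_update_elem$", SAMPLE))
```

On a real system, pass `open("/proc/kallsyms").read()` instead of `SAMPLE`. If the exact name given to `attach_kprobe` is not in the result, attaching to it is expected to fail, and one of the listed per-map-type functions may be a usable target instead.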



Sunny Klair
 

You likely want uprobes if your function is defined in userspace, not kprobes (which are for functions defined in kernel space).

relevant link: http://www.brendangregg.com/blog/2015-06-28/linux-ftrace-uprobe.html

- Sunny

> >>         ),
> >>
> >> -       TP_printk("map type=%s ufd=%d path=%s",
> >> +       TP_printk("map id=%u type=%s ufd=%d path=%s",
> >> +                 __entry->map_id,
> >>                   __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
> >>                   __entry->ufd, __get_str(path))
> >>  );
> >> @@ -244,6 +247,7 @@ DECLARE_EVENT_CLASS(bpf_map_keyval,
> >>                 __dynamic_array(u8, val, map->value_size)
> >>                 __field(bool, val_trunc)
> >>                 __field(int, ufd)
> >> +               __field(u32, map_id)
> >>         ),
> >>
> >>         TP_fast_assign(
> >> @@ -255,9 +259,11 @@ DECLARE_EVENT_CLASS(bpf_map_keyval,
> >>                 __entry->val_len   = min(map->value_size, 16U);
> >>                 __entry->val_trunc = map->value_size != __entry->val_len;
> >>                 __entry->ufd       = ufd;
> >> +               __entry->map_id    = map->id;
> >>         ),
> >>
> >> -       TP_printk("map type=%s ufd=%d key=[%s%s] val=[%s%s]",
> >> +       TP_printk("map id=%u type=%s ufd=%d key=[%s%s] val=[%s%s]",
> >> +                 __entry->map_id,
> >>                   __print_symbolic(__entry->type, __MAP_TYPE_SYM_TAB),
> >>                   __entry->ufd,
> >>                   __print_hex(__get_dynamic_array(key), __entry->key_len),
> >> _______________________________________________
> >> iovisor-dev mailing list
> >> iovisor-dev@...
> >> https://lists.iovisor.org/mailman/listinfo/iovisor-dev
> >
> >
> >
> > --
> > ________________________________
> > Raffaele Sommese
> > Mail:raffysommy@...
> > About me:https://about.me/r4ffy
> > Gpg Key:http://www.r4ffy.info/Openpgp.asc
> > GPG key ID: 0x830b1428cf91db2a on http://pgp.mit.edu:11371/
> >
> >
> >



--
________________________________
Raffaele Sommese
Mail:raffysommy@...
About me:https://about.me/r4ffy
Gpg Key:http://www.r4ffy.info/Openpgp.asc
GPG key ID: 0x830b1428cf91db2a on http://pgp.mit.edu:11371/





Raffaele Sommese
 

I think I do need a kprobe: map_update_elem and map_delete_elem are
defined in kernel space.
Thank You,
Raffaele
On Thu, 2 Aug 2018 at 17:39, Sunny Klair
<sunny@...> wrote:

You likely want uprobes if your function is defined in userspace, not kprobes (which are for functions defined in kernel space).

relevant link: http://www.brendangregg.com/blog/2015-06-28/linux-ftrace-uprobe.html

- Sunny

On Thu, Aug 2, 2018 at 11:36 AM, Raffaele Sommese <raffysommy@...> wrote:

Hello,
I have tried to use a kprobe, but attaching it to that function fails
with this error: raise Exception("Failed to attach BPF to kprobe")
I attach with b.attach_kprobe(event="map_update_elem", fn_name="hello"),
and my BPF function is int hello(struct pt_regs *ctx, struct bpf_map *map).
(For now I am basically using the code of the hello_perf_output.py example.)
Is this the right way? Or can I attach my eBPF program only to syscalls?
Thank You,
Raffaele
On Wed, 1 Aug 2018 at 17:08, Y Song <ys114321@...> wrote:

On Wed, Aug 1, 2018 at 2:36 AM, Raffaele Sommese <raffysommy@...> wrote:
> Hello everybody,
> I was looking for a similar mechanism.
> I need to trace an event on map update/delete. I have tried with a
> tracepoint, but I can recover only the file descriptor of the map, and I
> need the map id too (or the map name).
> Is there some other solution to trace this event and recover this data?

bpf tracepoints have been removed from recent Linux, so you need to
use a kprobe to trace update/delete.

The first argument of a typical map_update_elem or map_delete_elem is
'struct bpf_map *map'; you can get the name and id from there:
struct bpf_map {
        /* The first two cachelines with read-mostly members of which some
         * are also accessed in fast-path (e.g. ops, max_entries).
         */
        const struct bpf_map_ops *ops ____cacheline_aligned;
        struct bpf_map *inner_map_meta;
#ifdef CONFIG_SECURITY
        void *security;
#endif
        enum bpf_map_type map_type;
        u32 key_size;
        u32 value_size;
        u32 max_entries;
        u32 map_flags;
        u32 pages;
        u32 id;
        int numa_node;
        u32 btf_key_type_id;
        u32 btf_value_type_id;
        struct btf *btf;
        bool unpriv_array;
        /* 55 bytes hole */

        /* The 3rd and 4th cacheline with misc members to avoid false sharing
         * particularly with refcounting.
         */
        struct user_struct *user ____cacheline_aligned;
        atomic_t refcnt;
        atomic_t usercnt;
        struct work_struct work;
        char name[BPF_OBJ_NAME_LEN];
};




Raffaele Sommese
 

> bpf tracepoints have been removed from recent Linux, so you need to
> use a kprobe to trace update/delete.
>
> The first argument of a typical map_update_elem or map_delete_elem is
> 'struct bpf_map *map'; you can get the name and id from there:

Hello again :)
It seems there are two functions that can be traced inside the
kernel: one is map_update_elem, which implements the syscall, and the
other is the BPF helper.
I have successfully attached my eBPF code to the first one, but it does
not take a struct bpf_map *map parameter (it takes a union bpf_attr).
If I attach my program to bpf_map_update_elem (which I think is the
function name of the BPF helper), I don't receive any events.
I'm using the latest versions of bcc and of the kernel.
I also tried the kprobe program from the perf kernel suite, with the same result.
I was looking for this helper: BPF_CALL_4(bpf_map_update_elem, struct
bpf_map *, map, void *, key, void *, value, u64, flags)
Thank you again for the support,
Raffaele


Yonghong Song
 

On Mon, Aug 6, 2018 at 10:17 AM, Raffaele Sommese <raffysommy@...> wrote:
> Hello again :)
> It seems there are two functions that can be traced inside the
> kernel: one is map_update_elem, which implements the syscall, and the
> other is the BPF helper.
> I have successfully attached my eBPF code to the first one, but it does
> not take a struct bpf_map *map parameter (it takes a union bpf_attr).
> If I attach my program to bpf_map_update_elem (which I think is the
> function name of the BPF helper), I don't receive any events.
> I'm using the latest versions of bcc and of the kernel.
> I also tried the kprobe program from the perf kernel suite, with the same result.
> I was looking for this helper: BPF_CALL_4(bpf_map_update_elem, struct
> bpf_map *, map, void *, key, void *, value, u64, flags)

Please attach directly to the lookup function of the specific map type.
For a hash map, for example, the verifier is smart enough to
rewrite the bytecode so that it calls the underlying hash map lookup function.

===
/* inline bpf_map_lookup_elem() call.
 * Instead of:
 * bpf_prog
 *   bpf_map_lookup_elem
 *     map->ops->map_lookup_elem
 *       htab_map_lookup_elem
 *         __htab_map_lookup_elem
 * do:
 * bpf_prog
 *   __htab_map_lookup_elem
 */
static u32 htab_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
{
        struct bpf_insn *insn = insn_buf;
        const int ret = BPF_REG_0;

        BUILD_BUG_ON(!__same_type(&__htab_map_lookup_elem,
                     (void *(*)(struct bpf_map *map, void *key))NULL));
        *insn++ = BPF_EMIT_CALL(BPF_CAST_CALL(__htab_map_lookup_elem));
        *insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 1);
        *insn++ = BPF_ALU64_IMM(BPF_ADD, ret,
                                offsetof(struct htab_elem, key) +
                                round_up(map->key_size, 8));
        return insn - insn_buf;
}
===

Please check whether map_gen_lookup is implemented
for the map type you are interested in. For example, in the latest
bpf-next, map_gen_lookup is implemented for the hash table:

const struct bpf_map_ops htab_map_ops = {
        .map_alloc_check = htab_map_alloc_check,
        .map_alloc = htab_map_alloc,
        .map_free = htab_map_free,
        .map_get_next_key = htab_map_get_next_key,
        .map_lookup_elem = htab_map_lookup_elem,
        .map_update_elem = htab_map_update_elem,
        .map_delete_elem = htab_map_delete_elem,
        .map_gen_lookup = htab_map_gen_lookup,
};

So for a hash map, the function you should attach to
is __htab_map_lookup_elem.
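
As a rough illustration of attaching to that function, a bcc script might look like the sketch below. It is untested here and rests on several assumptions: bcc is installed and the script runs as root, __htab_map_lookup_elem is available as a kprobe target on the running kernel (a static function may be inlined away), and the kernel headers expose struct bpf_map with an id field. The names PROGRAM, trace_lookup and attach are made up for the example.

```python
# Sketch only: PROGRAM, trace_lookup and attach() are illustrative names.
PROGRAM = r"""
#include <uapi/linux/ptrace.h>
#include <linux/bpf.h>

/* Fires on every call of the probed function; `map` is its first argument. */
int trace_lookup(struct pt_regs *ctx, struct bpf_map *map)
{
    u32 id = 0;

    /* Copy the id out of kernel memory before using it. */
    bpf_probe_read(&id, sizeof(id), &map->id);
    bpf_trace_printk("lookup on map id=%u\n", id);
    return 0;
}
"""


def attach(func_name="__htab_map_lookup_elem"):
    """Load PROGRAM and attach it to func_name; needs bcc and root."""
    from bcc import BPF  # imported lazily so the module loads without bcc

    b = BPF(text=PROGRAM)
    b.attach_kprobe(event=func_name, fn_name="trace_lookup")
    return b
```

Calling attach().trace_print() would then stream one line per lookup until interrupted; filtering on a particular map id would be done inside trace_lookup.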




Raffaele Sommese
 

On Mon, 6 Aug 2018 at 19:40, Y Song <ys114321@...> wrote:

> Please attach directly to the lookup function of the specific map type.
> For a hash map, for example, the verifier is smart enough to
> rewrite the bytecode so that it calls the underlying hash map lookup function.

Thank you. For now I will only try to implement a solution for hash maps.
I have noticed a strange behavior: for lookups I receive the event
when the map is looked up, but for updates I don't receive
anything.
I have checked the kernel and map_gen_lookup is there.
The strange thing is that if I use the kprobe tool I can see the event on
htab_map_update_elem.
Here is my test code (I have tried it with lookup and it works):
https://gist.github.com/raffysommy/1dabe5bf9487d974f3acd1f7a32ed01c
https://gist.github.com/raffysommy/587f61c14d3e157f86da1aadd07442b1
Thanks again,
Raffaele



Yonghong Song
 

On Mon, Aug 6, 2018 at 11:53 AM, Raffaele Sommese <raffysommy@...> wrote:
> Thank you. For now I will only try to implement a solution for hash maps.
> I have noticed a strange behavior: for lookups I receive the event
> when the map is looked up, but for updates I don't receive
> anything.
> I have checked the kernel and map_gen_lookup is there.
> The strange thing is that if I use the kprobe tool I can see the event on
> htab_map_update_elem.
> Here is my test code (I have tried it with lookup and it works):
> https://gist.github.com/raffysommy/1dabe5bf9487d974f3acd1f7a32ed01c
> https://gist.github.com/raffysommy/587f61c14d3e157f86da1aadd07442b1

Okay, htab_map_update_elem is indeed called, but a BPF program attached
to it through a kprobe will not run.
The following kernel code in kernel/bpf/syscall.c explains why:

        /* must increment bpf_prog_active to avoid kprobe+bpf triggering from
         * inside bpf map update or delete otherwise deadlocks are possible
         */
        preempt_disable();
        __this_cpu_inc(bpf_prog_active);
        if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
            map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
                err = bpf_percpu_hash_update(map, key, value, attr->flags);
        } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
                err = bpf_percpu_array_update(map, key, value, attr->flags);
        } else if (IS_FD_ARRAY(map)) {
                rcu_read_lock();
                err = bpf_fd_array_map_update_elem(map, f.file, key, value,
                                                   attr->flags);
                rcu_read_unlock();
        } else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
                rcu_read_lock();
                err = bpf_fd_htab_map_update_elem(map, f.file, key, value,
                                                  attr->flags);
                rcu_read_unlock();
        } else {
                rcu_read_lock();
                err = map->ops->map_update_elem(map, key, value, attr->flags);
                rcu_read_unlock();
        }
        __this_cpu_dec(bpf_prog_active);
        preempt_enable();

While bpf_prog_active is raised, a BPF program attached to a kprobe on
htab_map_update_elem is not triggered.
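
The effect of that counter can be shown with a toy model. This is plain Python, not kernel code, and it assumes that the kprobe side bumps the same per-CPU counter and skips the BPF program whenever the counter was already raised, which is what the comment in the excerpt implies. The class and method names are invented for the illustration.

```python
class Cpu:
    """Toy stand-in for one CPU's bpf_prog_active counter (illustrative)."""

    def __init__(self):
        self.bpf_prog_active = 0
        self.ran = []  # probe sites whose BPF program actually ran

    def kprobe_bpf_hit(self, site):
        # A kprobe-attached BPF program runs only if it is the sole
        # holder of the counter on this CPU; otherwise it is skipped.
        self.bpf_prog_active += 1
        try:
            if self.bpf_prog_active != 1:
                return False
            self.ran.append(site)
            return True
        finally:
            self.bpf_prog_active -= 1

    def syscall_map_update(self):
        # Entry of map_update_elem: the counter is still zero here.
        self.kprobe_bpf_hit("map_update_elem")
        # The guarded region from the excerpt above.
        self.bpf_prog_active += 1
        try:
            self.kprobe_bpf_hit("htab_map_update_elem")
        finally:
            self.bpf_prog_active -= 1


cpu = Cpu()
cpu.syscall_map_update()
print(cpu.ran)  # only the probe outside the guarded region is recorded
```

In this model the probe at the entry of map_update_elem runs, while the one on htab_map_update_elem is skipped, matching what was observed in the thread.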

How can we solve this problem then? One possible solution is as follows:
  . Disassemble vmlinux to find a suitable place in the function "map_update_elem"
    where "map" (struct bpf_map *map) is available in a register, e.g.,
    at instruction offset OFFSET inside map_update_elem. This OFFSET
    must be outside the preempt/__this_cpu_{inc/dec} region above.
  . Improve trace.py to trace function+offset. A possible format could be
       trace.py 'map_update_elem+OFFSET ...'
The attach_kprobe API should already support the function_name + offset format.
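
Separately from trace.py, the kernel's own kprobe_events interface in tracefs also takes a SYMBOL+offset location together with register fetch arguments. A small helper that builds such a definition line might look like the sketch below. The helper name, the group and event names, the offset 288 and the register r12 are all placeholders; the real offset and register have to be read off the disassembly of the running kernel.

```python
def kprobe_event_line(symbol, offset, register, event, group="bpfmap"):
    """Build a kprobe_events definition for SYMBOL+offset (sketch).

    `register` is the CPU register expected to hold `struct bpf_map *`
    at that offset; it must be read off the disassembly.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return f"p:{group}/{event} {symbol}+{offset} map=%{register}"


# Placeholder values only: 288 and r12 are not taken from a real kernel.
print(kprobe_event_line("map_update_elem", 288, "r12", "update"))
```

The resulting line would be appended to the kprobe_events file in the tracing directory (commonly /sys/kernel/debug/tracing), after which the event can be enabled like any other trace event.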



Raffaele Sommese
 


> How can we solve this problem then? One possible solution is as follows:
>   . Disassemble vmlinux to find a suitable place in the function "map_update_elem"
>     where "map" (struct bpf_map *map) is available in a register, e.g.,
>     at instruction offset OFFSET inside map_update_elem. This OFFSET
>     must be outside the preempt/__this_cpu_{inc/dec} region above.
>   . Improve trace.py to trace function+offset. A possible format could be
>        trace.py 'map_update_elem+OFFSET ...'
> The attach_kprobe API should already support the function_name + offset format.

I think this approach can be very tricky and platform-dependent, so I have
found another solution. If I attach my BPF program to bpf_map_new_fd
with a kprobe and a kretprobe, I can recover the mapping between the
(map fd, pid) pair and the id or name of the map, and save it. I have
tested it and it seems to work.
Then I can trace the map_update_elem syscall and read the data (I'm
interested only in the key) from userspace.
I attach the code here; it may be helpful to other people who want to
address this problem.
https://gist.github.com/raffysommy/45cf0544f34eb0e5fbf533f4d9a3b955
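
The bookkeeping behind this approach can be sketched without any BPF at all. The class below is not the code from the gist; it only models the idea, assuming that the kprobe on bpf_map_new_fd sees the map (and hence its id and name), that the kretprobe sees the returned fd, and that the two can be matched by thread id. All names and values are hypothetical.

```python
class MapFdTracker:
    """Model of correlating (pid, fd) with a map id and name (illustrative)."""

    def __init__(self):
        self._pending = {}  # tid -> (map_id, name), set at function entry
        self._by_fd = {}    # (pid, fd) -> (map_id, name), set at return

    def on_new_fd_entry(self, tid, map_id, name):
        # kprobe side: remember which map this thread is creating an fd for
        self._pending[tid] = (map_id, name)

    def on_new_fd_return(self, pid, tid, fd):
        # kretprobe side: the return value is the new fd, or negative on error
        info = self._pending.pop(tid, None)
        if info is not None and fd >= 0:
            self._by_fd[(pid, fd)] = info

    def on_update(self, pid, fd, key):
        # syscall side: only the fd is known, so resolve it through the table
        info = self._by_fd.get((pid, fd))
        if info is None:
            return None
        map_id, name = info
        return {"map_id": map_id, "name": name, "key": key}


tracker = MapFdTracker()
tracker.on_new_fd_entry(tid=4242, map_id=7, name="flows")
tracker.on_new_fd_return(pid=4242, tid=4242, fd=5)
event = tracker.on_update(pid=4242, fd=5, key=b"\x00\x00\x00\x00")
print(event)
```

A table keyed this way goes stale when an fd is closed, duplicated, or passed to another process, so a fuller version would have to handle those cases.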
Thank you again for the support and for your time.
Raffaele




Yonghong Song
 

On Mon, Aug 6, 2018 at 2:52 PM, Raffaele Sommese <raffysommy@...> wrote:
> I think this approach can be very tricky and platform-dependent, so I have
> found another solution. If I attach my BPF program to bpf_map_new_fd
> with a kprobe and a kretprobe, I can recover the mapping between the
> (map fd, pid) pair and the id or name of the map, and save it. I have
> tested it and it seems to work.
> Then I can trace the map_update_elem syscall and read the data (I'm
> interested only in the key) from userspace.
> I attach the code here; it may be helpful to other people who want to
> address this problem.
> https://gist.github.com/raffysommy/45cf0544f34eb0e5fbf533f4d9a3b955
> Thank you again for the support and for your time.

Yes, this approach should work too. I am wondering whether we could do
it with a single invocation of trace.py...
