question about per_cpu maps


Pablo Alvarez
 

Dear eBPF community

If I understand correctly, per_cpu maps are meant to avoid threading
issues, and for example the need to call BPF_XADD to keep counters safe.
There is something I am unclear about:

- Is it possible for a BPF program to be preempted by another BPF
program on the same CPU?

If this is the case, it seems that the following scenario could arise:

BPF program 1 on cpu 0 reads a counter and is preempted before
incrementing it
BPF program 2 on cpu 0 reads the same counter, increments it, and finishes
BPF program 1 on cpu 0 increments the counter

At which point both programs would have read the same value for the
counter, with possible problems ensuing.
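
The interleaving described above can be replayed deterministically in ordinary user-space C. This is only a sketch of the hypothetical scenario, not BPF code: `counter` stands in for a single map value, and `replay_lost_update` and the `seen_by_*` locals are names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one counter value stored in a map. */
static uint64_t counter;

/* Replays the three steps of the scenario on one CPU and returns
 * the final counter value. Two increments are attempted. */
static uint64_t replay_lost_update(void)
{
    counter = 0;

    uint64_t seen_by_prog1 = counter; /* program 1 reads, then is "preempted" */
    uint64_t seen_by_prog2 = counter; /* program 2 reads the same value */
    counter = seen_by_prog2 + 1;      /* program 2 increments and finishes */
    counter = seen_by_prog1 + 1;      /* program 1 resumes and writes stale value + 1 */

    /* Both programs observed the same starting value. */
    assert(seen_by_prog1 == seen_by_prog2);
    return counter;
}
```

If such an interleaving were possible, two increments would leave the counter at 1 rather than 2, which is the lost update the question is about.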


Is this a valid scenario? Am I missing something about how the per_cpu
maps are intended to be used?

Thanks

Pablo Alvarez


Mauricio Vasquez
 

On 11/28/18 1:50 PM, Pablo Alvarez via Lists.Iovisor.Org wrote:
Dear eBPF community

If I understand correctly, per_cpu maps are meant to avoid threading
issues, and for example the need to call BPF_XADD to keep counters safe.
There is something I am unclear about:

- Is it possible for a BPF program to be preempted by another BPF
program on the same CPU?
eBPF programs are never preempted by the kernel, so counters can be kept in per_cpu maps without using synchronization primitives on them.

The absence of preemption is also guaranteed across a chain of tail calls, so per_cpu maps can also be used to store "global variables".
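
A rough user-space model of the per-CPU counter idea, assuming (as stated above) that a program running on a given CPU is not interrupted by another program on that same CPU. This is plain C, not BPF code: `NUM_CPUS`, `percpu_counter`, `count_event` and `read_total` are hypothetical names. In a real program the slots would live in a map such as `BPF_MAP_TYPE_PERCPU_ARRAY`, and user space would sum the per-CPU values it reads back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CPUS 4 /* hypothetical CPU count for this sketch */

/* One private slot per CPU, modelling a per-CPU map value. */
static uint64_t percpu_counter[NUM_CPUS];

/* Models a program running on `cpu`: a plain read-modify-write with
 * no atomic instruction, relying on nothing else touching this slot
 * while the program runs. */
static void count_event(size_t cpu)
{
    assert(cpu < NUM_CPUS);
    percpu_counter[cpu] += 1;
}

/* Models the reader side: the total is the sum over all CPU slots. */
static uint64_t read_total(void)
{
    uint64_t total = 0;
    for (size_t cpu = 0; cpu < NUM_CPUS; cpu++)
        total += percpu_counter[cpu];
    return total;
}
```

For example, calling `count_event(0)` twice and `count_event(3)` once leaves `read_total()` at 3, with each CPU having written only its own slot.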




Pablo Alvarez
 

Good to know, thanks!

There is no mention of the non-preemption in the bpf or tc-bpf man pages
or the bcc tutorials. Is it possible to change that? I would be happy to
add a note about this if pointed in the right direction.

For example, this paragraph in the bpf man page could be edited (additions
in ALL CAPS):

"eBPF programs can be attached to different events. These events can
be the arrival of network packets, tracing events, classification
events by network queueing disciplines (for eBPF programs attached
to a tc(8) classifier), and other types that may be added in the
future. A new event triggers execution of the eBPF program, which
may store information about the event in eBPF maps. Beyond storing
data, eBPF programs may call a fixed set of in-kernel helper
functions. EBPF PROGRAMS ARE NEVER PREEMPTED BY THE KERNEL. THIS
INCLUDES TAIL CALLS TO OTHER EBPF PROGRAMS."

Best regards

Pablo Alvarez





Yonghong Song
 

On Thu, Nov 29, 2018 at 5:48 AM Pablo Alvarez via Lists.Iovisor.Org
<palvarez=akamai.com@...> wrote:

Good to know, thanks!

There is no mention of the non-preemption in the bpf or tc-bpf man pages
or the bcc tutorials. Is it possible to change that? I would be happy to
add a note about this if pointed in the right direction.
Yes. The bpf man page severely lags behind the current state, and yes, you can
help make the change. The linux-man mailing list is probably the proper place to
send patches (https://www.spinics.net/lists/linux-man/).

