CPU Concurrency Issues


brad.lewis@...
 

Hi all, 

I'm trying to verify that there are no concurrency issues with an approach in which I'm using cpu_id as the key to a HASH_MAP. My understanding is that bcc disables preemption, but the details are fuzzy and I haven't been able to find anything in the code. Can anyone shed some light on this?

 

Thank you, 

Brad Lewis 

 

 


Yonghong Song
 

On Mon, Aug 19, 2019 at 1:54 PM <brad.lewis@...> wrote:
> I'm trying to verify that there are no concurrency issues with an approach in which I'm using cpu_id as the key to a HASH_MAP. My understanding is that bcc disables preemption, but the details are fuzzy and I haven't been able to find anything in the code. Can anyone shed some light on this?

Preemption is a kernel thing; bcc does not disable it. You need to
check the kernel configuration option CONFIG_PREEMPT on your host.






Matt Ahrens
 

On Mon, Aug 19, 2019 at 4:30 PM Yonghong Song <ys114321@...> wrote:
> Preemption is a kernel thing; bcc does not disable it. You need to
> check the kernel configuration option CONFIG_PREEMPT on your host.

When running bpf code from a kprobe / kretprobe, does anything ensure that cpu_id doesn't change while the bpf program is running (e.g. due to preemption)? Does anything ensure that no other bpf code runs on this CPU while this kprobe's program is running (e.g. due to an interrupt firing and hitting a different kprobe)? If either of those things can happen, it seems difficult to atomically increment an entry in a HASH_MAP (even when using the cpu_id as the key).

--matt
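
The second concern can be made concrete with a small user-space model. This is plain Python, not bpf or kernel code, and every name in it (`increment`, `lost_update_demo`, the `interrupt` callback) is invented for illustration. It models only the hypothetical case where a second handler is allowed to run on the same CPU between the lookup and the update of a map entry keyed by cpu_id.

```python
def increment(counts, cpu_id, interrupt=None):
    """Non-atomic increment: lookup, optional interruption, then update."""
    value = counts.get(cpu_id, 0)      # 1. look up the current value
    if interrupt is not None:
        interrupt()                    # 2. another handler runs on this CPU
    counts[cpu_id] = value + 1         # 3. write back a now-stale value


def lost_update_demo():
    """Two increments are attempted on CPU 0, but one of them is lost."""
    counts = {}                        # stands in for a HASH_MAP keyed by cpu_id

    def nested_handler():
        # Hypothetical second probe firing on the same CPU mid-update.
        increment(counts, 0)

    increment(counts, 0, interrupt=nested_handler)
    return counts[0]
```

In this model `lost_update_demo()` returns 1 even though two increments ran, which is the kind of lost update the question is worried about.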


Yonghong Song
 

On Tue, Aug 20, 2019 at 10:38 AM Matthew Ahrens <mahrens@...> wrote:
> When running bpf code from a kprobe / kretprobe, does anything ensure that cpu_id doesn't change while the bpf program is running (e.g. due to preemption)? Does anything ensure that no other bpf code runs on this CPU while this kprobe's program is running (e.g. due to an interrupt firing and hitting a different kprobe)? If either of those things can happen, it seems difficult to atomically increment an entry in a HASH_MAP (even when using the cpu_id as the key).

A bpf program run is wrapped in a preempt_disable()/preempt_enable()
region. There is also a per-CPU counter that, for tracing programs,
prevents a second bpf program from running on a CPU where a bpf program
has been interrupted. Therefore, for tracing programs, the situations
you describe above won't happen.

The only thing that can happen is when a networking program and a
tracing program share the same map: e.g., a networking bpf program
can be interrupted by an NMI, and a bpf tracing program (perf_event
type) may then run.
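
The per-CPU counter guard described here can be modelled in user space. What follows is a rough Python sketch under stated assumptions, not the kernel implementation: `Cpu`, `run_tracing_prog`, `prog_active`, and `dropped` are hypothetical names, and the `dropped` field exists only so the sketch can report what happened.

```python
class Cpu:
    """Toy model of one CPU with a 'program active' nesting counter."""

    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.prog_active = 0   # models the per-CPU counter from the reply
        self.dropped = 0       # bookkeeping for this sketch only


def run_tracing_prog(cpu, prog):
    """Run prog on cpu unless a program is already active on that CPU.

    Returns True if prog ran and False if the event was dropped.
    """
    cpu.prog_active += 1              # in the kernel, preemption is off here
    try:
        if cpu.prog_active != 1:      # nested entry: a program is mid-run
            cpu.dropped += 1
            return False
        prog(cpu)
        return True
    finally:
        cpu.prog_active -= 1


def demo():
    counts = {}                       # stands in for a HASH_MAP keyed by cpu_id
    cpu = Cpu(0)
    nested_results = []

    def inner(c):
        counts[c.cpu_id] = counts.get(c.cpu_id, 0) + 100

    def outer(c):
        value = counts.get(c.cpu_id, 0)
        # An "interrupt" hits another probe on the same CPU mid-update.
        nested_results.append(run_tracing_prog(c, inner))
        counts[c.cpu_id] = value + 1

    ran = run_tracing_prog(cpu, outer)
    return ran, nested_results, counts, cpu.dropped
```

In this model the nested program is skipped, so the outer program's read-modify-write on its cpu_id entry completes without interference, at the cost of one dropped event.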






Matt Ahrens
 



On Tue, Aug 20, 2019 at 10:47 AM Yonghong Song <ys114321@...> wrote:
> A bpf program run is wrapped in a preempt_disable()/preempt_enable()
> region. There is also a per-CPU counter that, for tracing programs,
> prevents a second bpf program from running on a CPU where a bpf program
> has been interrupted. Therefore, for tracing programs, the situations
> you describe above won't happen.
>
> The only thing that can happen is when a networking program and a
> tracing program share the same map: e.g., a networking bpf program
> can be interrupted by an NMI, and a bpf tracing program (perf_event
> type) may then run.


Great, thanks Yonghong!  And I get that this is part of the infrastructure that calls into the bpf code (e.g. perf_events), not bcc.

> There is also a per-CPU counter that, for tracing programs, prevents a second bpf program from running on a CPU where a bpf program has been interrupted.

I think that means that if an interrupt fires while the bpf program is running, the interrupt handler will run, but if the interrupt causes another tracing event to fire, the associated bpf program will not run (i.e. the event will be ignored / dropped). Is that right?

--matt
 






Yonghong Song
 

On Tue, Aug 20, 2019 at 11:02 AM Matthew Ahrens <mahrens@...> wrote:
> I think that means that if an interrupt fires while the bpf program is running, the interrupt handler will run, but if the interrupt causes another tracing event to fire, the associated bpf program will not run (i.e. the event will be ignored / dropped). Is that right?

Yes. See
https://github.com/torvalds/linux/blob/master/kernel/trace/bpf_trace.c#L88-L97




Matt Ahrens
 



On Tue, Aug 20, 2019 at 11:23 AM Y Song <ys114321@...> wrote:
> Yes. See
> https://github.com/torvalds/linux/blob/master/kernel/trace/bpf_trace.c#L88-L97


Thanks for the pointer to the code!

--matt


brad.lewis@...
 

Thanks Matt and Yonghong.  Info on the preempt_disable()/preempt_enable() region is exactly what I was looking for.


Yonghong Song
 

On Tue, Aug 20, 2019 at 12:05 PM Arnaldo Carvalho de Melo
<arnaldo.melo@...> wrote:
> On Tue, Aug 20, 2019 at 11:22:50AM -0700, Yonghong Song wrote:
>> Yes. See
>> https://github.com/torvalds/linux/blob/master/kernel/trace/bpf_trace.c#L88-L97
>
> Yonghong, I thought there would be some counter to at least let users
> know that a drop happened. Has this ever surfaced? Or is there some way
> to know about those drops that I'm missing?

You did not miss anything. Currently, there are no counters for drops
caused by an NMI or by a bpf program already running on that CPU.

There is an effort by Daniel Xu to expose the nhit/nmisses counters
from the k/uprobe trace infra. Even when the kprobe itself is not a
miss, the bpf program may not fire, for the reasons above.
https://lore.kernel.org/bpf/20190820214819.16154-1-dxu@dxuuu.xyz/T/#t

debugfs has k/uprobe_profile, which reports nhit/nmisses from the k/uprobe trace infra.

We could add a counter to trace_event_call->event to count hits and
misses. Hits can also be counted by the bpf program itself. Misses
should be rare, and most bpf programs, e.g. those in bcc, are designed
to tolerate an occasional probe miss, which should not have much effect
on the final aggregated results.

How strongly do you feel that such a bpf prog hit/miss counter for
tracing programs is needed?
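
The remark that hits can be counted by the bpf program itself suggests one way to estimate skipped runs from user space. The Python sketch below is only an illustration: `missed_prog_runs` is a hypothetical helper, and it assumes that the probe-level hit count (for example a value read from k/uprobe_profile) also includes hits for which the attached program was skipped, and that the program keeps one hit counter per CPU.

```python
def missed_prog_runs(probe_nhit, percpu_prog_hits):
    """Estimate how many probe hits did not lead to a program run.

    probe_nhit: hit count reported for the probe itself (assumed input).
    percpu_prog_hits: one counter per CPU, incremented by the program.
    """
    prog_hits = sum(percpu_prog_hits)
    # The two counters are sampled at different moments, so a small
    # negative difference is possible; clamp it to zero.
    return max(probe_nhit - prog_hits, 0)
```

For example, `missed_prog_runs(1000, [400, 350, 247])` gives 3. Because the counters are read at different times, the result is an estimate and not an exact drop count.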

