kretprobe not consistently being triggered for sys_futex calls?

Ashley <ashley@...>

Hi all,

I have a problem which has been plaguing me for a few days, and was hoping for some insight or information that might help get to the bottom of it. I think it might be a bug in the kernel instrumentation code or kernel/futex.c, but it could equally just be me misunderstanding something.

I've registered a kprobe and a kretprobe on sys_futex on kernel 4.15.0-43 (Ubuntu 18.04 LTS), and I'm finding quite unusual results where I rarely seem to get the kretprobe triggering for sys_futex.

This seems to be consistent across different types of futex operations - for example, sometimes I trace a futex wake operation which never triggers the retprobe, and similarly for other operations, including wake.

I haven't yet found any combinations of parameters to sys_futex that causes this reliably. I've read some of kernel/futex.c and the functions it calls into to see if perhaps it is doing some crazy things to jump back into user space which would avoid triggering the return probe, but I didn't see anything that looked particularly suspicious (disclaimer: I'm nowhere near an expert on the topic so it's likely I'd miss something like this)

If I run this alongside a small shell script along the lines of (excuse the fish syntax!):
while true
echo "a" >> file

I get the following output after a minute or so:

fish-16555 [001] .... 237577.357940: 0x00000001: Total futex in: 363
fish-16555 [001] .N.. 237577.357946: 0x00000001: Total futex out: 176
fish-16555 [001] .N.. 237577.357947: 0x00000001: Total write in: 258608
fish-16555 [001] .N.. 237577.357947: 0x00000001: Total write out: 258607

I suspect the difference of one between the write in/out is due to either instrumentation starting while another thread was already inside a write system call, or perhaps a process was killed inside the system call (if that's possible?).

I am quite sure there is an issue with futex that can't be explained by any of these possibilities, though: I'm able to observe specific, continuing processes which have managed to execute a futex call and then get back to user space without triggering the retprobe, and even trigger multiple futex calls in consequence without triggering a single retprobe (I'm not doing anything crazy in my user code -- just spawning some std C++ threads, and I see the same issues tracing MySQL Server).

On the other hand, I've also seen the same program do the same set of actions on a different run, but manage to trigger every retprobe (or some subset, at least) that I expect them to.

I've written a StackOverflow question here: but I anticipate that the question is too niche to get much visibility, so I was hoping it'd be okay to broadcast the question here as well. If we end up coming to an answer here then I'm very happy to accept your answer there.

Any input at all is appreciated :)

Thanks in advance!

PS: Apologies if I'm using the mailing list incorrectly; please feel free to prod me about etiquette - I've never used one before

PPS: Incase it helps, here's an abridged log of my initial program which traces specific threads' futex operations, which was my first exposure to this issue. FUTEX << signifies a kprobe call, >> FUTEX a kretprobe call, and {WAKE XXXX} are annotations I added myself based on some extra output I removed to abridge this output:

main-6311 .... 238378.639978: 0x00000001: FUTEX <<
main-6312 .... 238378.640005: 0x00000001: FUTEX <<
main-6310 .... 238379.141966: 0x00000001: FUTEX{WAKE 6311} <<
main-6310 .... 238379.141986: 0x00000001: FUTEX <<
main-6311 d... 238379.142007: 0x00000001: >> FUTEX
main-6311 .... 238379.642507: 0x00000001: FUTEX{WAKE 6312} <<
main-6311 d... 238379.642525: 0x00000001: >> FUTEX
main-6311 .... 238379.642563: 0x00000001: FUTEX{WAKE 6310} <<
main-6311 dN.. 238379.642573: 0x00000001: >> FUTEX
main-6310 .... 238379.642581: 0x00000001: FUTEX <<
main-6312 .... 238380.143266: 0x00000001: FUTEX <<
main-6312 .... 238380.143290: 0x00000001: FUTEX{WAKE 6310} <<
main-6310 d... 238380.143305: 0x00000001: >> FUTEX

The notable thing is that we see threads like main-6310 issuing multiple futex calls without any retprobe triggering between them (lines 3-4, 10).

Join { to automatically receive all group messages.