Re: Extracting data from tracepoints (and anything else)

Andrii Nakryiko

On Thu, Apr 16, 2020 at 8:42 AM <mayfieldtristan@...> wrote:

I've waited to reply, not wanting to clog the mailing list, but I thought it would be beneficial to follow up on the same topic with kprobes in addition to tracepoints. The main issue I had with tracepoints was not understanding the 8-byte alignment in the arguments. Once that was sorted, getting information was actually really simple.

At this point I've moved to kprobes, kretprobes, and raw tracepoints. From what I understand, if not using CO-RE or vmlinux.h, to access data from kprobes or kretprobes you must access the cpu registers in which those values live?
You are not really accessing CPU registers, but you access their
values before the program was interrupted. Those values are stored in
pt_regs struct. It's a technicality in this case, but you can't access
CPU registers directly in BPF.

BTW, raw_tracepoints are completely different, but you should be able
to find examples in selftests for those.

For example, if I'm porting Brendan Gregg's bpftrace tool "elfsnoop" to libbpf, I'd want to trace "load_elf_binary()." load_elf_binary() only has one argument: "struct linux_binprm *bprm." So if I want to read that struct, I'd have to access the register with that argument. I think in bpf_tracing.h that macro would be PT_REGS_PARAM1(x). I don't have the greatest understanding of asm and cpu registers, but I believe that would be the %rdi register?
Yes, the rdi register, which is accessed from pt_regs using PT_REGS_PARM1().

With that in mind, here's my code and build.

#include <linux/bpf.h>
#include "bpf_helpers.h"
#include "bpf_tracing.h"
#include <linux/ptrace.h>
#include <linux/types.h>

int trace_entry(struct pt_regs *ctx) {
    char msg[] = "hello world\n"; // for verification that the bpf program is running at all
    bpf_trace_printk(msg, sizeof(msg));

    // first argument of load_elf_binary(), read from the saved registers
    struct linux_binprm *arg = (struct linux_binprm *) PT_REGS_PARM1(ctx);

    return 0;
}

char _license[] SEC("license") = "GPL";

//// And the build command.
// Target arch and kernel are defined to get the correct macros
// in bpf_tracing.h
$ clang -O3 -Wall -target bpf \
-D__KERNEL__ -c \
elfsnoop.bpf.c \
-I/home/vagrant/libbpf/src/ \
-o elfsnoop.bpf.o

Unfortunately, as Andrii mentioned previously in this topic, I think there are different definitions of pt_regs and my /usr/include/linux/ptrace.h does not have the correct one, as evidenced by the error I get when trying to build.

elfsnoop.bpf.c:89:54: error: no member named 'di' in 'struct pt_regs'
struct linux_binprm *arg = (struct linux_binprm *) PT_REGS_PARM1(ctx);
/home/vagrant/libbpf/src/bpf_tracing.h:54:32: note: expanded from macro 'PT_REGS_PARM1'
#define PT_REGS_PARM1(x) ((x)->di)

Is this the correct way to access data in kprobes? Most of the information I've found explicitly talking about accessing kprobe data is pretty old (2012-2015). selftests/bpf/ seems to not have examples of accessing kprobe data, and, from my understanding, libbpf-tools is CO-RE dependent, which I'm trying to avoid for now just because most default kernels aren't BTF enabled yet (I will definitely be voicing my opinion to distros that this should change since the average user likely isn't keen on recompiling and installing a kernel). I also looked at the brief C Appendix of "BPF Performance Tools" and "Linux Observability with BPF" to try and understand, but I still haven't been able to extract data from the kprobes or raw tracepoints yet.
I think the final question that may (or may not) solve this issue is which pt_regs should be used?
So <linux/ptrace.h> in your case is taken from UAPI headers, not
kernel internal headers. The two use different field names. Drop the
-D__KERNEL__ part and it should work.
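
To illustrate why the define matters, here is a minimal user-space sketch of the field-name selection, assuming x86-64. The struct and macro names (mock_uapi_pt_regs, MOCK_KERNEL_HEADERS, MOCK_PT_REGS_PARM1, mock_first_arg) are made up for this illustration and are not part of libbpf; only the idea that the macro picks "di" or "rdi" depending on a define comes from bpf_tracing.h.

```c
#include <stdint.h>

/* Hypothetical stand-in for the x86-64 UAPI struct pt_regs:
 * the UAPI header names its fields with an "r" prefix. */
struct mock_uapi_pt_regs {
    uint64_t rdi; /* first function argument */
    uint64_t rax; /* return value */
};

/* Same idea as bpf_tracing.h: the field name depends on a define.
 * With MOCK_KERNEL_HEADERS set, the macro would expand to ->di,
 * which the UAPI-style struct above does not have, giving the
 * "no member named 'di'" error from the build output. */
#ifdef MOCK_KERNEL_HEADERS
#define MOCK_PT_REGS_PARM1(x) ((x)->di)
#else
#define MOCK_PT_REGS_PARM1(x) ((x)->rdi)
#endif

/* Read the first argument out of the saved register values. */
static uint64_t mock_first_arg(const struct mock_uapi_pt_regs *regs)
{
    return MOCK_PT_REGS_PARM1(regs);
}
```

Compiling this sketch with -DMOCK_KERNEL_HEADERS should fail in the same way as the build above, because the macro and the struct definition no longer agree on the field name.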

Also, assuming this is the correct way, is this generalizable to raw tracepoints and kretprobes as well?
kretprobes can only safely access the return value, which you get
with PT_REGS_RC(ctx). Input arguments are clobbered by the time the
kretprobe fires, so PT_REGS_PARM1(ctx) would return you something,
but most probably it won't be the correct value of the first input
argument.
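
One common way around the clobbered arguments is to save them when the kprobe fires and look them up again in the kretprobe. The following is a user-space sketch of that pattern only, not BPF code: the fixed-size array stands in for a BPF hash map keyed by thread id, and every name in it (mock_pt_regs, mock_on_entry, mock_on_return, mock_event, MOCK_SLOTS) is hypothetical.

```c
#include <stdint.h>

#define MOCK_SLOTS 64

/* Hypothetical stand-in for the saved register values. */
struct mock_pt_regs {
    uint64_t rdi; /* first argument at entry; arbitrary at return */
    uint64_t rax; /* return value at return */
};

/* Stand-in for a BPF hash map keyed by thread id. */
static uint64_t saved_arg[MOCK_SLOTS];

struct mock_event {
    uint64_t first_arg;
    int64_t ret;
};

/* Entry handler: the argument register is still valid here, so save it. */
static void mock_on_entry(uint32_t tid, const struct mock_pt_regs *regs)
{
    saved_arg[tid % MOCK_SLOTS] = regs->rdi;
}

/* Return handler: trust only the return value register and take
 * the argument from what the entry handler saved. */
static struct mock_event mock_on_return(uint32_t tid,
                                        const struct mock_pt_regs *regs)
{
    struct mock_event ev;

    ev.first_arg = saved_arg[tid % MOCK_SLOTS];
    ev.ret = (int64_t)regs->rax;
    return ev;
}
```

In a real BPF program the entry handler would correspond to the kprobe, the return handler to the kretprobe, and the read of rax to PT_REGS_RC(ctx).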

raw_tracepoints are similar to fentry/fexit in that each input
argument is 8 bytes long. See progs/test_vmlinux.c in selftests/bpf for
an example of getting a syscall number on sys_enter. BPF_PROG is
a useful macro for such use cases.
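
As a rough picture of what "each input argument is 8 bytes long" means, the context of a raw tracepoint can be thought of as an array of 64-bit slots, one per tracepoint argument. The sketch below is plain user-space C and the function name mock_sys_enter_id is hypothetical; it assumes the sys_enter tracepoint passes (struct pt_regs *regs, long id), so the syscall number sits in the second slot.

```c
#include <stdint.h>

/* The raw tracepoint context modeled as an array of 8-byte slots.
 * For sys_enter: slot 0 holds the pt_regs pointer, slot 1 the
 * syscall number. */
static long mock_sys_enter_id(const uint64_t *ctx)
{
    return (long)ctx[1];
}
```

BPF_PROG does this unpacking for you: it casts each 8-byte slot to the type of the corresponding parameter you declare, so the program body can use named, typed arguments.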

After I have these things figured out with some working examples, I think I will publish a github repo with a tutorial as discussed with Andrii in a few messages above.
Appreciate any feedback and help.
