Re: Extracting data from tracepoints (and anything else)


Andrii Nakryiko
 

On Mon, Mar 23, 2020 at 9:38 AM <mayfieldtristan@...> wrote:

I've been exploring the libbpf library for different versions of the Linux kernel, and trying to rewrite some of the BCC tools. I would like to do more work with CO-RE eventually, but I'm trying to understand the entire model of how BPF programs work and how data flows between the kernel, the VM, and userspace. I just started using perf buffers instead of bpf_trace_printk and came across an issue that has me scratching my head. In the below code, I'm not able to access the const char * arg in the tracepoint sys_enter_openat (kernel 4.15). For some reason the verifier rejects this code. I think it's valid C (although I'm a little bit rusty still) and I think I followed the correct flow where data must be copied from the kernel to the VM before being able to use.

If anyone has insight to share, I'd much appreciate it. Conversely, if anyone can point me in the direction of how to debug BPF programs that would be extremely helpful too. Should I just dig into learning the basics of BPF asm?

Highlights of the code:

struct bpf_map_def SEC("maps") events = {
.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
.key_size = sizeof(int),
.value_size = sizeof(u32),
.max_entries = MAX_CPUS,
};
nit: this is the legacy syntax for defining BPF maps; please see [0]
for some newer examples

[0] https://github.com/iovisor/bcc/tree/master/libbpf-tools
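
For comparison, here is a minimal sketch of the same perf event array in the newer BTF-defined style that libbpf-tools uses. It assumes a libbpf recent enough to provide the __uint() macro and the ".maps" section, and a build with clang's BPF target. max_entries is left out on the assumption that libbpf sizes a perf event array to the number of CPUs when the field is zero; set it explicitly if your libbpf does not do that.

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* BTF-defined equivalent of the legacy "struct bpf_map_def" above.
 * The map name "events" is kept from the original question. */
struct {
	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
	__uint(key_size, sizeof(int));
	__uint(value_size, sizeof(__u32));
} events SEC(".maps");
```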


struct sys_enter_openat_args {
u16 common_type;
u8 common_flags;
u8 common_preempt_count;
int common_pid;
int __syscall_nr;
int dfd;
char *filename;
int flags;
__mode_t mode;
};

SEC("tracepoint/syscalls/sys_enter_openat")
int bpf_prog(struct sys_enter_openat_args *ctx) {
struct data_t data;
struct sys_enter_openat_args *args;

int res = bpf_probe_read(args, sizeof(ctx), ctx);
you don't need to bpf_probe_read() ctx here, you can just access its
members directly.

if(!res) {
data.file = "couldn't get file";
} else {
data.file = args->filename;
But here if you want to read filename contents itself, you'll need to
use bpf_probe_read_str().

Having the data_t definition would also be helpful.
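
Putting both suggestions together, a handler could look roughly like the sketch below. Several things here are assumptions rather than anything from the original post: the data_t layout (its definition was not shown), the names openat_enter_ctx and trace_openat, and the context layout, which is written so that each syscall argument sits in an 8-byte slot as the tracepoint format file reports on 64-bit kernels. Please check /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/format on the target kernel before relying on those offsets.

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Hypothetical event layout; the real data_t was not posted. */
struct data_t {
	__u32 pid;
	char program_name[16];
	char file[256];
};

/* Assumed context layout for a 64-bit kernel; verify against the
 * tracepoint's format file. */
struct openat_enter_ctx {
	__u64 common;		/* common_type, flags, preempt_count, pid */
	long __syscall_nr;	/* 4-byte field plus padding */
	long dfd;
	const char *filename;
	long flags;
	long mode;
};

struct {
	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
	__uint(key_size, sizeof(int));
	__uint(value_size, sizeof(__u32));
} events SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_openat")
int trace_openat(struct openat_enter_ctx *ctx)
{
	struct data_t data;

	/* Zero the whole event so no uninitialized stack bytes are sent. */
	__builtin_memset(&data, 0, sizeof(data));

	/* Upper 32 bits of pid_tgid hold the user-visible process ID. */
	data.pid = bpf_get_current_pid_tgid() >> 32;
	bpf_get_current_comm(data.program_name, sizeof(data.program_name));

	/* ctx members are read directly; only the string that the
	 * filename pointer refers to has to be copied in. */
	bpf_probe_read_str(data.file, sizeof(data.file), ctx->filename);

	/* BPF_F_CURRENT_CPU selects the buffer of the CPU we are running on. */
	bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU,
			      &data, sizeof(data));
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

This needs clang's BPF target and the libbpf headers to build. The event struct is about 276 bytes, which stays under the 512-byte BPF stack limit.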

}

Error Message:

bpf_load_program() err=13
0: (bf) r6 = r1
1: (b7) r2 = 8
2: (bf) r3 = r6
3: (85) call bpf_probe_read#4
R1 type=ctx expected=fp
this error from the verifier is quite misleading, but what the verifier
is complaining about here is that you are reading an uninitialized
pointer (args) and passing it as the first parameter to
bpf_probe_read(). But see above: you don't need to bpf_probe_read()
anything, and even if you wanted to, it would have to be done very
differently:

struct sys_enter_openat_args args; /* notice no pointer here */
bpf_probe_read(&args, sizeof(args), ctx); /* taking address of args,
taking size of args, not its pointer */
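
The pointer-versus-struct point can also be checked in ordinary user-space C. The sketch below is only an illustration: memcpy() stands in for bpf_probe_read(), plain C types stand in for u16/u8/__mode_t, and the three helper function names are made up for this example. It shows that sizeof(ctx) is the size of a pointer, which is consistent with the "r2 = 8" in the verifier log above, while sizeof(args) covers the whole struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Struct as declared in the question, with plain C types. */
struct sys_enter_openat_args {
	unsigned short common_type;
	unsigned char common_flags;
	unsigned char common_preempt_count;
	int common_pid;
	int __syscall_nr;
	int dfd;
	char *filename;
	int flags;
	unsigned int mode;
};

/* What the original call passed as the size: the size of a pointer. */
static size_t size_of_ctx_pointer(void)
{
	struct sys_enter_openat_args *ctx = NULL;

	return sizeof(ctx);
}

/* What the corrected call passes: the size of the struct itself. */
static size_t size_of_args_struct(void)
{
	struct sys_enter_openat_args args;

	return sizeof(args);
}

/* Corrected copy pattern: real storage on the stack, its address as
 * the destination, and its full size as the length. */
static int copy_args_dfd(const struct sys_enter_openat_args *ctx)
{
	struct sys_enter_openat_args args;	/* notice no pointer here */

	assert(ctx != NULL);
	memcpy(&args, ctx, sizeof(args));	/* stand-in for bpf_probe_read() */
	return args.dfd;
}
```

Calling size_of_ctx_pointer() and size_of_args_struct() from a small main() and printing both values makes the difference visible on whatever machine you compile on.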

The kernel didn't load the BPF program

data.pid = bpf_get_current_pid_tgid(); // use fn from libbpf.h to get pid_tgid
bpf_get_current_comm(data.program_name, sizeof(data.program_name)); // puts current comm into char array

bpf_perf_event_output(ctx, &events, 0, &data, sizeof(data));

return 0;
}

If more code would be helpful, I'm happy to share.

I recognize that libbpf and CO-RE in later kernels provide an easier API for dealing with char * (bpf_probe_read_str() I believe) but I'm trying to understand what needs to be done to target different kernels and not just the most cutting edge.

As a second question, how much should I learn about perf(1) and its overlap with BPF?

Finally, for long-term monitoring solutions and passing readable data, do most programs rely on pinning maps to the vfs instead of using perf buffers or passing directly to a userspace process?
It's a mix. If your data should/can be pre-aggregated in the kernel,
using a map might benefit you in that you will be sending much less
data to user-space. But if you want to send every piece of information,
then perf_buffer is faster and more convenient than having user-space
query BPF maps all the time.
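
As a rough sketch of the pre-aggregation case, the program below keeps a per-process count of openat() calls in a hash map, so user-space only has to read the totals occasionally instead of receiving one event per call. The map name open_counts, the function name count_openat, and the max_entries value are all made up for illustration.

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Hypothetical map: process ID -> number of openat() calls seen. */
struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 10240);
	__type(key, __u32);
	__type(value, __u64);
} open_counts SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_openat")
int count_openat(void *ctx)
{
	__u32 pid = bpf_get_current_pid_tgid() >> 32;
	__u64 one = 1, *count;

	count = bpf_map_lookup_elem(&open_counts, &pid);
	if (count)
		/* Existing entry: bump it atomically in place. */
		__sync_fetch_and_add(count, 1);
	else
		/* First call from this process: create the entry. */
		bpf_map_update_elem(&open_counts, &pid, &one, BPF_NOEXIST);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

User-space can then walk the map at whatever interval it likes, which is the trade-off described above: less data crossing the boundary, at the cost of losing per-event detail.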


Thanks for the patience and goodwill with a new systems dev. I've enjoyed my interactions with the BPF community.
You're welcome. Check out libbpf-tools in the BCC repo; it should give
you some examples to work from.


Tristan
