Re: Tracepoint/Kprobe for tracking inbound connections
Yonghong Song
On Tue, Sep 29, 2020 at 4:14 AM Kanthi P <Pavuluri.kanthi@...> wrote:
Maybe you can use sk_local_storage? You can attach a piece of information to the socket during TCP_SYN_RECV, check that data later during TCP_ESTABLISHED, and delete it from the socket once you no longer need it, all within the BPF program.
|
Tracepoint/Kprobe for tracking inbound connections
Kanthi P <Pavuluri.kanthi@...>
Hi, I am looking to track inbound connections on a system using tracepoints/kprobes. I was checking "trace_inet_sock_set_state", with which we can track the state changes during connection establishment and closure. It seems straightforward to track total connections, but since we only want inbound ones, one way would be to look up the IP addresses/ports the node listens on and, while tracking the state changes, check whether the local address/port matches one of them to decide whether it is an inbound connection. This looks a bit roundabout to me, so I thought of reaching out for suggestions on how to do it more simply. Another way is to store the socket address when the TCP_SYN_RECV to TCP_ESTABLISHED state change happens, and during closure check whether the closing socket is one of the stored ones, so we know it is an inbound connection. But this would make the map grow too large, as we have about 50k concurrent connections. Can you suggest a better way to do this? Thanks, Kanthi |
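The second approach in the question rests on the observation that a passively opened (inbound) socket reaches TCP_ESTABLISHED from TCP_SYN_RECV, whereas an actively opened (outbound) one arrives from TCP_SYN_SENT. A minimal sketch of that classification in plain C follows. It assumes the handler receives the old and new state of each transition, as the inet_sock_set_state tracepoint reports them. The function names are hypothetical, the state numbers are copied from the kernel's include/net/tcp_states.h, and rare cases such as simultaneous open are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* TCP state numbers, copied from the kernel's include/net/tcp_states.h. */
enum tcp_state {
    TCP_ESTABLISHED = 1,
    TCP_SYN_SENT    = 2,
    TCP_SYN_RECV    = 3,
};

/* Hypothetical helper: true when a transition marks a newly established
 * inbound (passively opened) connection. */
bool is_inbound_established(int oldstate, int newstate)
{
    return oldstate == TCP_SYN_RECV && newstate == TCP_ESTABLISHED;
}

/* Hypothetical helper: true when a transition marks a newly established
 * outbound (actively opened) connection. */
bool is_outbound_established(int oldstate, int newstate)
{
    return oldstate == TCP_SYN_SENT && newstate == TCP_ESTABLISHED;
}

/* Usage: classify the two transitions that end in TCP_ESTABLISHED. */
void classify_demo(void)
{
    assert(is_inbound_established(TCP_SYN_RECV, TCP_ESTABLISHED));
    assert(!is_inbound_established(TCP_SYN_SENT, TCP_ESTABLISHED));
    assert(is_outbound_established(TCP_SYN_SENT, TCP_ESTABLISHED));
}
```

Because the decision needs only the two states carried by a single event, counting inbound connections at establishment time requires no per-connection map entry. Per-socket state is only needed if a later closure must be matched back to the same socket.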
Re: Load BPF program at boot-time?
Yonghong Song
On Sun, Sep 6, 2020 at 7:55 AM Shung-Hsi Yu <yu@...> wrote:
It is possible. See the patch below: https://lore.kernel.org/bpf/20200819042759.51280-1-alexei.starovoitov@gmail.com/ I tried to load a BPF program and pin it in bpffs. The system could be extended to load a BPF program, and even attach it once the other subsystem is ready. But this needs kernel work.
> What I'm trying to achieve is to trace every single call to a certain [...]
A BPF program seems a good choice here since it can store arbitrary data in its maps and, based on the tracing state, it can stop tracing. There are still some potential issues: ideally you would not recompile the kernel, only change and recompile the BPF programs, and rebooting should then just work, which is not available today. I guess this can probably be improved. If you are interested, please take a look at the above patch and perhaps improve the kernel to cover your use case.
|
Load BPF program at boot-time?
Shung-Hsi Yu
Hi,
Is it possible to load a BPF program at boot time? What I'm trying to achieve is to trace every single call to a certain function since the kernel starts, without missing anything. More specifically, I'm trying to debug iommu_alloc failures by looking at the stack trace to find out which subsystem/driver allocated too many IOMMU slots on a ppc64le system, which I do not have direct access to. I've considered writing a systemd unit file that loads a BPF program before the sysinit target[1], but I'm not sure if that's early enough. An alternative seems to be to use boot-time tracing with ftrace[2] instead (which I ended up doing), but it requires recompiling the kernel in order to add tracepoints that retrieve the function call arguments, and there isn't an easy way to stop tracing to prevent the tracing buffer from overflowing (I ended up writing a systemd unit file that sets an ftrace event trigger that turns off tracing). Maybe there is a better way to do something like this? Much thanks, Shung-Hsi Yu [1]: https://www.freedesktop.org/software/systemd/man/bootup.html [2]: https://www.kernel.org/doc/html/latest/trace/boottime-trace.html |
Re: Reading Pinned maps in eBPF Programs
Andrii Nakryiko
On Mon, Aug 31, 2020 at 12:03 PM Ian <icampbe14@...> wrote:
It's expected right now. BTF started out as purely debug information, but got elevated into pretty much a mandatory thing for modern BPF applications. We've talked about making .BTF emitted without -g, but that hasn't happened in Clang yet (there are some technical difficulties). |
Re: Reading Pinned maps in eBPF Programs
Ian
Interestingly enough, adding just -g in my Makefile built the BPF programs and allowed the BTF section to be found and properly loaded. My BPF program was loaded and is running properly with my desired functionality. I am confused, though, as to why the -g flag fixed this problem. According to the clang man page:
-g Generate debug information.
Is BTF information considered debug information? Is that in general or just in this case? Is this unexpected behavior? Perhaps a bug in Clang's non -g compiled binaries with BPF? It would seem to me that the BTF information should not be purged from a non -g binary. I am interested to hear your thoughts on this, Andrii! Again, thank you so much for your help. There is no way I would have figured that out on my own. Ian |
Re: Reading Pinned maps in eBPF Programs
Andrii Nakryiko
On Sun, Aug 30, 2020 at 4:35 PM Ian <icampbe14@...> wrote:
[...] Ok, this is a very different issue than the kernel missing BTF. libbpf is complaining that your opensnoop.bpf.o itself is missing BTF. And right, BTF is required to parse map definitions properly, but it doesn't depend on having kernel support for BTF at all. Make sure you use recent enough Clang (v10+) and you build your opensnoop.bpf.o with -target bpf **and** -g flag to generate debug info (including .BTF ELF section). |
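To see why map definitions in the .maps section cannot be parsed without BTF, it helps to look at how the declaration macros work: they store each setting in a field's type rather than in its value, so the numbers exist only as type information. The sketch below reproduces that encoding in plain C. The two macros follow the definitions in libbpf's bpf_helpers.h (with __typeof__ in place of typeof). The DEMO_ constants, the ENCODED_UINT helper, demo_map, and the 4096 entry count are hypothetical stand-ins so that the example builds without libbpf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the macros in libbpf's bpf_helpers.h. */
#define __uint(name, val) int (*name)[val]
#define __type(name, val) __typeof__(val) *name

/* Hypothetical stand-ins mirroring BPF_MAP_TYPE_HASH and LIBBPF_PIN_BY_NAME. */
#define DEMO_MAP_TYPE_HASH 1
#define DEMO_PIN_BY_NAME   1

/* Recover an integer that __uint() encoded as an array length. */
#define ENCODED_UINT(field) (sizeof(*(field)) / sizeof(int))

/* Every member is only a pointer; the settings live in the pointee types. */
struct {
    __uint(type, DEMO_MAP_TYPE_HASH);
    __uint(max_entries, 4096);          /* illustrative value */
    __type(key, uint32_t);
    __type(value, uint64_t);
    __uint(pinning, DEMO_PIN_BY_NAME);
} demo_map;

size_t demo_map_max_entries(void) { return ENCODED_UINT(demo_map.max_entries); }
size_t demo_map_key_size(void)    { return sizeof(*demo_map.key); }
size_t demo_map_value_size(void)  { return sizeof(*demo_map.value); }

/* Usage: the values can be recovered from the types alone. */
void encoding_demo(void)
{
    assert(demo_map_max_entries() == 4096);
    assert(demo_map_key_size() == 4);
    assert(demo_map_value_size() == 8);
}
```

In the compiled object these members are all null pointers, so a loader has to recover the array lengths and pointee types from the object's type information, which for a BPF object means the .BTF section.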
Re: Reading Pinned maps in eBPF Programs
Ian
Hello,
Here are the libbpf logs at all levels for the opensnoop program when using the pinned option for a map. This was tested on Linux kernel v5.4 with libbpf 0.0.9, 0.1.0, and the current version. The logs were the same in every case, so I have posted only a single copy here. Let me know what you think and what the next steps might be! I appreciate the help and am having a good time trying to piece this together.
libbpf: loading bpf-library/bpf_objs/opensnoop.bpf.o
libbpf: section(1) .strtab, size 289, link 0, flags 0, type=3
libbpf: skip section(1) .strtab
libbpf: section(2) .text, size 0, link 0, flags 6, type=1
libbpf: skip section(2) .text
libbpf: section(3) tracepoint/syscalls/sys_enter_openat, size 1632, link 0, flags 6, type=1
libbpf: found program tracepoint/syscalls/sys_enter_openat
libbpf: section(4) .reltracepoint/syscalls/sys_enter_openat, size 32, link 15, flags 0, type=9
libbpf: section(5) tracepoint/syscalls/sys_enter_open, size 1368, link 0, flags 6, type=1
libbpf: found program tracepoint/syscalls/sys_enter_open
libbpf: section(6) .reltracepoint/syscalls/sys_enter_open, size 32, link 15, flags 0, type=9
libbpf: section(7) .data, size 4, link 0, flags 3, type=1
libbpf: section(8) maps, size 20, link 0, flags 3, type=1
libbpf: section(9) .rodata.str1.1, size 9, link 0, flags 32, type=1
libbpf: skip section(9) .rodata.str1.1
libbpf: section(10) version, size 4, link 0, flags 3, type=1
libbpf: kernel version of bpf-library/bpf_objs/opensnoop.bpf.o is 50422
libbpf: section(11) license, size 4, link 0, flags 3, type=1
libbpf: license of bpf-library/bpf_objs/opensnoop.bpf.o is GPL
libbpf: section(12) .maps, size 40, link 0, flags 3, type=1
libbpf: section(13) .eh_frame, size 80, link 0, flags 2, type=1
libbpf: skip section(13) .eh_frame
libbpf: section(14) .rel.eh_frame, size 32, link 15, flags 0, type=9
libbpf: skip relo .rel.eh_frame(14) for section(13)
libbpf: section(15) .symtab, size 408, link 1, flags 0, type=2
libbpf: BTF is required, but is missing or corrupted.
|
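The "kernel version" line in the log above is easier to read once decoded. Assuming libbpf prints the contents of the object's version section in hexadecimal (an assumption, since the log itself does not say so), 50422 is the packed LINUX_VERSION_CODE, laid out as in the kernel's KERNEL_VERSION(a, b, c) macro. The struct and function names below are hypothetical.

```c
#include <assert.h>

struct kernel_version {
    unsigned major;
    unsigned minor;
    unsigned patch;
};

/* Unpack a code built as KERNEL_VERSION(a, b, c) = (a << 16) + (b << 8) + c. */
struct kernel_version decode_version_code(unsigned code)
{
    struct kernel_version v = {
        .major = code >> 16,
        .minor = (code >> 8) & 0xff,
        .patch = code & 0xff,
    };
    return v;
}

/* Usage: decode the logged value, read as hexadecimal (assumption). */
void version_demo(void)
{
    struct kernel_version v = decode_version_code(0x50422);
    assert(v.major == 5 && v.minor == 4 && v.patch == 34);
}
```

Read that way, the object was built against 5.4.34 headers, which is consistent with the v5.4 kernel mentioned in the message.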
Re: Reading Pinned maps in eBPF Programs
Andrii Nakryiko
On Thu, Aug 27, 2020 at 6:55 AM Ian <icampbe14@...> wrote:
Check example [0] for how to set a custom logging callback and print all libbpf logs (including those at DEBUG level of verbosity).
[0] https://github.com/iovisor/bcc/blob/master/libbpf-tools/runqslower.c#L136
You are welcome!
|
Re: Reading Pinned maps in eBPF Programs
Ian
Hey Andrii,
By the way, thank you so much for all your help! Ian |
Re: Reading Pinned maps in eBPF Programs
Andrii Nakryiko
On Wed, Aug 26, 2020 at 6:54 AM Tristan Mayfield <mayfieldtristan@...> wrote:
Which version of libbpf are you seeing this on? We've had bugs in libbpf where we'd attempt to load kernel BTF unnecessarily, but I believe we've fixed all those issues. Can you please double-check with latest released libbpf and see if that's still happening? If it is, could you provide a repro and full libbpf debug logs for me to investigate? Thanks! |
Re: Reading Pinned maps in eBPF Programs
Andrii Nakryiko
On Sun, Aug 23, 2020 at 12:36 PM Ian <icampbe14@...> wrote:
[...] I don't see anything needing kernel BTF in there, so if libbpf still fails on not being able to load kernel BTF, that might be a bug in libbpf. Can you please double-check this with the latest released (or just plain latest) libbpf and if that's still happening, please provide debug-level logs from libbpf. Thank you! |
Re: Reading Pinned maps in eBPF Programs
Tristan Mayfield
I wanted to chime in and mention that I've seen the BTF error before when trying to declare maps the way shown in https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/test_pinning.c. I haven't run through a debugger yet to verify that's the issue, but I have verified on the opensnoop code Ian posted. |
Re: Reading Pinned maps in eBPF Programs
Ian
Hello! Sorry for the wait, I just started back at uni and things are a little bit crazy around here! Anyway, this is the source code for my version of opensnoop, which is what I have been testing with. It does not contain the changes for map reading. My goal is to have this opensnoop program open/read a map with one element after it gets the PID, and compare the two. It is also worth noting that I am tracking both open and openat within the same file.

#include <linux/bpf.h> // BPF asm file that ships with the OS
#include "bpf_helpers.h" // bpf_helper functions
#include <linux/version.h>

// For navigating the task struct
#include <linux/sched.h>
#include <linux/nsproxy.h>
#include <linux/pid_namespace.h>
#include <linux/ns_common.h>

#define MAX_CPUS 4

/**
 * Struct to pass data to the perf buffer
 */
#pragma pack(1)
struct opensnoop_data_t {
    u32 pid;
    u32 tgid;
    char program_name[16]; // max comm length is 16
    char file[255];
    u32 namespace;
    u64 time;
};

struct sys_enter_openat_args {
    long long pad;
    long __syscall_nr;
    long dfd;
    const char *filename;
    long flags;
    long mode;
};

struct sys_enter_open_args {
    long long pad;
    long __syscall_nr;
    const char *filename;
    long flags;
    long mode;
};

/**
 * Using the magic macro SEC this struct declares
 * and creates a new bpf map of a type PERF that we
 * can use to pass data to userspace
 */
struct bpf_map_def SEC("maps") opensnoop_events = {
    .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
    .key_size = sizeof(int),
    .value_size = sizeof(u32),
    .max_entries = MAX_CPUS,
};

SEC("tracepoint/syscalls/sys_enter_openat")
int bpf_prog(struct sys_enter_openat_args *ctx)
{
    struct opensnoop_data_t data = {};

    data.pid = bpf_get_current_pid_tgid() >> 32; // upper 32 bits: tgid, i.e. the PID as seen from user space
    data.tgid = bpf_get_current_pid_tgid(); // lower 32 bits: kernel pid, i.e. the thread ID
    data.time = bpf_ktime_get_ns();

    bpf_get_current_comm(&data.program_name, sizeof(data.program_name)); // puts current comm into char array

    int err = bpf_probe_read_str(data.file, sizeof(data.file), ctx->filename);
    if (err < 0) { // bpf_probe_read_str() returns a negative value on failure
        char msg[] = "Err: %d\n";
        bpf_trace_printk(msg, sizeof(msg), err);
    }

    struct task_struct *task = (struct task_struct *)bpf_get_current_task(); // sched.h

    struct nsproxy *nsprox = 0; // nsproxy.h
    struct pid_namespace *pidns = 0; // pid_namespace.h
    struct ns_common *nsc = 0; // ns_common.h
    struct ns_common n = {};

    data.namespace = ({
        typeof(unsigned int) _val;
        __builtin_memset(&_val, 0, sizeof(_val)); // set bytes to 0
        bpf_probe_read(&_val, sizeof(_val), &({
            typeof(struct pid_namespace *) _val;
            __builtin_memset(&_val, 0, sizeof(_val));
            bpf_probe_read(&_val, sizeof(_val), &({
                typeof(struct nsproxy *) _val;
                __builtin_memset(&_val, 0, sizeof(_val));
                bpf_probe_read(&_val, sizeof(_val), &task->nsproxy);
                _val;
            })->pid_ns_for_children);
            _val;
        })->ns.inum);
        _val;
    });

#ifdef DEBUG
    char debug_msg[] = "Tracepoint on syscalls/sys_enter_openat was called for process %d\n";
    bpf_trace_printk(debug_msg, sizeof(debug_msg), data.pid);
#endif

    bpf_perf_event_output(ctx, &opensnoop_events, BPF_F_CURRENT_CPU /*run on current cpu*/, &data, sizeof(data));

    return 0;
}

SEC("tracepoint/syscalls/sys_enter_open")
int sys_enter_open_prog(struct sys_enter_open_args *ctx)
{
    struct opensnoop_data_t data = {};

    data.pid = bpf_get_current_pid_tgid() >> 32; // upper 32 bits: tgid, i.e. the PID as seen from user space
    data.tgid = bpf_get_current_pid_tgid(); // lower 32 bits: kernel pid, i.e. the thread ID
    data.time = bpf_ktime_get_ns();

    bpf_get_current_comm(&data.program_name, sizeof(data.program_name)); // puts current comm into char array

    int err = bpf_probe_read_str(data.file, sizeof(data.file), ctx->filename);
    if (err < 0) { // bpf_probe_read_str() returns a negative value on failure
        char msg[] = "Err: %d\n";
        bpf_trace_printk(msg, sizeof(msg), err);
    }

#ifdef DEBUG
    char debug_msg[] = "Tracepoint on syscalls/sys_enter_open was called for process %d\n";
    bpf_trace_printk(debug_msg, sizeof(debug_msg), data.pid);
#endif

    bpf_perf_event_output(ctx, &opensnoop_events, BPF_F_CURRENT_CPU /*run on current cpu*/, &data, sizeof(data));

    return 0;
}

u32 _version SEC("version") = LINUX_VERSION_CODE;
char _license[] SEC("license") = "GPL"; // necessary to use types of kernel ABI's |
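The two assignments from bpf_get_current_pid_tgid() in the program above are easy to mix up, because the kernel and user space use the word "pid" differently. The helper returns a single 64-bit value with the tgid (what user space calls the process ID) in the upper 32 bits and the kernel pid (the thread ID) in the lower 32 bits. A small plain C sketch of the split, with hypothetical function names and made-up IDs:

```c
#include <assert.h>
#include <stdint.h>

/* Upper 32 bits: thread group ID, i.e. the process ID seen from user space. */
uint32_t tgid_of(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: kernel pid, i.e. the ID of the individual thread. */
uint32_t tid_of(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}

/* Usage: thread 1007 of process 1000, packed the way the helper returns it. */
void split_demo(void)
{
    uint64_t pid_tgid = ((uint64_t)1000 << 32) | 1007;

    assert(tgid_of(pid_tgid) == 1000);
    assert(tid_of(pid_tgid) == 1007);
}
```

For the main thread of a process both halves are equal, which is why swapping them often goes unnoticed in single-threaded tests.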
Re: Reading Pinned maps in eBPF Programs
Andrii Nakryiko
On Thu, Aug 20, 2020 at 5:35 AM Ian <icampbe14@...> wrote:
Your BPF code must be relying on CO-RE. I can check if you can show me your BPF source code. The pinning and map definition itself doesn't rely on CO-RE and thus doesn't need kernel BTF.
|
Re: Reading Pinned maps in eBPF Programs
Ian
Interestingly enough, I am using clang version 10.0.0! Even with that, creating a structure from the examples like so:
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u32);
    __uint(pinning, LIBBPF_PIN_BY_NAME);
} pid_map SEC(".maps");
I still get:
libbpf: BTF is required, but is missing or corrupted.
Here is my clang version output:
vagrant@vagrant:/vagrant$ clang -v
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
I will continue looking into new clang versions to see if mine is slightly out of date! |
Re: Reading Pinned maps in eBPF Programs
Andrii Nakryiko
On Wed, Aug 19, 2020 at 3:40 PM Ian <icampbe14@...> wrote:
It doesn't require kernel BTF for that. Only BPF program's BTF generated by Clang. So you'll need something like Clang 10 (or maybe Clang 9 will do as well), but no requirements for kernel BTF. |
Re: Reading Pinned maps in eBPF Programs
Ian
> Libbpf supports declarative pinning of maps, that's how you easily get [...]
These examples are exactly what I am looking for, but it appears that they either require BTF activated in the kernel or require a 5.8 kernel. Unfortunately I am targeting the new Ubuntu 20.04 system with "out-of-the-box" configurations, so I am saddled with kernel v5.4 and BTF not active. Why does libbpf's declarative map pinning require BTF? Does the metadata within BTF support the ability to correctly find and open the map? |
Re: Reading Pinned maps in eBPF Programs
Andrii Nakryiko
On Mon, Aug 17, 2020 at 6:36 AM Ian <icampbe14@...> wrote:
Libbpf supports declarative pinning of maps, that's how you easily get "map re-use" from BPF side. See [0] for example. But there is also bpf_map__pin() and bpf_map__reuse_fd() API on user-space side to set everything up, if you need to do it more flexibly.
[0] https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/test_pinning.c |
Re: Reading Pinned maps in eBPF Programs
Ian
> You can use bpf_obj_get() API to get a reference to the pinned map.
It was my understanding that bpf_obj_get was intended to be used as a user-space API. I am looking to "open" or obtain a reference to a map in the actual eBPF program that is loaded into kernel space. My eBPF programs do include linux/bpf.h but not the uapi bpf.h. Can/should you use it in the actual BPF program? Or is there a different way to achieve this? I have seen a function called bpf_obj_get_user in linux/bpf.h but I cannot find any documentation on it. It also just returns an unsupported error in my kernel's source code.
static inline int bpf_obj_get_user(const char __user *pathname, int flags)
{
    return -EOPNOTSUPP;
}
> BPF_ANNOTATE_KV_PAIR is old way to provide map key/value types, mostly [...]
Ahh interesting! |
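For reference, the user-space bpf_obj_get() mentioned above is a thin wrapper around the bpf(2) system call with the BPF_OBJ_GET command, which is why it cannot be called from inside a BPF program. The sketch below issues that call directly using only the kernel's UAPI header. The helper name open_pinned_object and the paths are hypothetical, and the code assumes a Linux system with <linux/bpf.h> installed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for syscall() */
#endif
#include <assert.h>
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: returns a file descriptor for the object pinned at
 * `path`, or -1 with errno set on failure. */
int open_pinned_object(const char *path)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));              /* unused fields must be zero */
    attr.pathname = (uint64_t)(uintptr_t)path;   /* pointer passed as a 64-bit value */
    return (int)syscall(__NR_bpf, BPF_OBJ_GET, &attr, sizeof(attr));
}

/* Usage: a path that does not exist cannot yield a descriptor. */
void missing_path_demo(void)
{
    assert(open_pinned_object("/sys/fs/bpf/no-such-dir/no-such-map") < 0);
}
```

With a map actually pinned at, say, /sys/fs/bpf/pid_map (an example path only), the returned descriptor could then be used from user space for lookups and updates. The BPF program itself refers to the map through its declaration instead.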