On Wed, Feb 24, 2021 at 11:28 AM Tristan Mayfield
Thank you to both Andrii and Toke! It's been extremely helpful to read your responses. Having conversations like these really helps me when I go into the source code and try to understand the overall intent of it. I'm going to try and summarize the conversation to confirm my understanding.
bpf_probe_read() will read any valid kernel memory (nothing new here). If the memory is already directly readable by the program (e.g. tracepoint args), then __builtin_memcpy can be used instead, and it can produce attach-time errors if structs are read incorrectly (for a reason I don't think we clarified).
CO-RE can guarantee valid memory reads because of the nature of being able to check offsets and relocations at load time rather than attach time or just returning garbage data with no errors.
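To make the load-time check concrete, here is a toy model in plain userspace C. It is not how libbpf implements relocations; the type tables, the two struct layouts, and the function names are all hypothetical stand-ins for kernel BTF and CO-RE field relocations. The idea it sketches: look a field up by name in the target's type description, refuse to proceed if it is missing, and only then read with the resolved offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for BTF: a type is a named list of (field name, offset). */
struct field_info {
    const char *name;
    size_t offset;
};

struct type_info {
    const char *name;
    const struct field_info *fields;
    size_t nfields;
};

/* Two hypothetical layouts of the "same" struct on different kernels. */
struct task_v1 { int pid; char comm[16]; };
struct task_v2 { long state; int pid; char comm[16]; };

static const struct field_info task_v1_fields[] = {
    { "pid",  offsetof(struct task_v1, pid)  },
    { "comm", offsetof(struct task_v1, comm) },
};
static const struct field_info task_v2_fields[] = {
    { "state", offsetof(struct task_v2, state) },
    { "pid",   offsetof(struct task_v2, pid)   },
    { "comm",  offsetof(struct task_v2, comm)  },
};

static const struct type_info task_v1_type = { "task", task_v1_fields, 2 };
static const struct type_info task_v2_type = { "task", task_v2_fields, 3 };

/* "Load-time" step: find the field by name in the target's type info.
 * Returns 0 and stores the offset, or -1 so the caller can refuse to load. */
static int resolve_field_offset(const struct type_info *target,
                                const char *field, size_t *offset)
{
    assert(target && field && offset);
    for (size_t i = 0; i < target->nfields; i++) {
        if (strcmp(target->fields[i].name, field) == 0) {
            *offset = target->fields[i].offset;
            return 0;
        }
    }
    return -1;
}

/* "Run-time" step: read an int using the offset resolved above. */
static int read_int_at(const void *base, size_t offset)
{
    int value;
    memcpy(&value, (const char *)base + offset, sizeof(value));
    return value;
}
```

Resolving "pid" against task_v1_type and task_v2_type yields different offsets, yet read_int_at() returns the right value in both cases. Asking for "state" on task_v1_type fails up front instead of silently reading garbage, which is the property described above.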
To build CO-RE programs you need a vmlinux file (not to be confused with the header, vmlinux.h), which is normally found at /sys/kernel/btf/vmlinux on systems whose kernels were built with pahole and CONFIG_DEBUG_INFO_BTF=y. Having the vmlinux.h file is helpful because it replaces kernel headers and makes building a bit nicer, but it isn't necessary. Once compiled, CO-RE programs should be able to run on any system that has a vmlinux file in one of the locations listed here: https://github.com/libbpf/libbpf/blob/master/src/btf.c#L4583.
vmlinux usually refers to the kernel image binary. /sys/kernel/btf/vmlinux
is not that; it's only the BTF data. So CO-RE needs kernel BTF, not
necessarily the vmlinux kernel image. Just a clarification. But the vmlinux
image (ELF file) itself has a .BTF section, which has the same data
exposed in /sys/kernel/btf/vmlinux, so libbpf will try to fetch that
data if /sys/kernel/btf/vmlinux is not present. That is necessary for
some older kernel versions, as well as if you "embed" BTF information
manually with `pahole -J`.
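The fallback order described above can be sketched in plain userspace C. This is only an illustration, not libbpf's code: the struct and function names are hypothetical, and the path list is a subset reproduced from memory, so check the btf.c link above for the authoritative list.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* One candidate location. If suffix is NULL, prefix is already a complete
 * path; otherwise the path is prefix + kernel release + suffix. */
struct btf_candidate {
    const char *prefix;
    const char *suffix;
};

/* Probe order: raw kernel BTF first, then vmlinux ELF images that may
 * carry a .BTF section (assumed subset of the real list). */
static const struct btf_candidate btf_candidates[] = {
    { "/sys/kernel/btf/vmlinux",      NULL             },
    { "/boot/vmlinux-",               ""               },
    { "/lib/modules/",                "/build/vmlinux" },
    { "/usr/lib/debug/boot/vmlinux-", ""               },
};

/* Build the path for one candidate. Returns 0, or -1 if buf is too small. */
static int format_candidate(const struct btf_candidate *c, const char *release,
                            char *buf, size_t len)
{
    int n;

    assert(c && release && buf && len > 0);
    if (c->suffix == NULL)
        n = snprintf(buf, len, "%s", c->prefix);
    else
        n = snprintf(buf, len, "%s%s%s", c->prefix, release, c->suffix);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Store the first readable candidate in buf. Returns 0 on success,
 * -1 if none of the candidates is readable. */
static int find_kernel_btf(const char *release, char *buf, size_t len)
{
    size_t count = sizeof(btf_candidates) / sizeof(btf_candidates[0]);

    for (size_t i = 0; i < count; i++) {
        if (format_candidate(&btf_candidates[i], release, buf, len) != 0)
            continue;
        if (access(buf, R_OK) == 0)
            return 0;
    }
    return -1;
}
```

A caller would pass the release field filled in by uname(2) as the release argument. This only locates a file; it does not verify that a vmlinux image actually contains a .BTF section.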
For earlier kernels, it's possible to generate a vmlinux file (and this is one of the spots I'm a bit murky on) with pahole -J, but I'm not sure what you are supposed to target when running that? Just the compiled kernel binary? Something else?
Yes, `pahole -J <path-to-kernel-image-vmlinux-binary>`. Pahole is able
to produce BTF from the DWARF type information contained in your vmlinux
kernel image (if you compile it with DWARF, of course). That's what
happens in newer kernels when you specify CONFIG_DEBUG_INFO_BTF=y
(plus some extra linker steps to make that section "loadable":
available to the kernel itself at runtime, but that's not necessary for
BTF is just a type format that can describe C data types. Almost like a meta-language? I've personally not looked at the source for BTF yet, but it seems versatile enough to be useful for CO-RE, where it describes internal kernel data structures, and for a variety of other things (like map declarations), and it will likely be increasingly relied on in future iterations of BPF, both CO-RE and otherwise. BTF support mainly comes from the compiler (I believe clang 10+ works, based on my experience; I'm primarily using clang 10 right now) and from libbpf, not necessarily the kernel (except for CO-RE with the vmlinux).
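As an illustration of the map-declaration point, here is a host-compilable sketch of a BTF-style map definition. The two macros are local copies that I believe match the ones in libbpf's bpf_helpers.h; the enum value, the map name pid_counts, and the MAP_UINT_ATTR helper are assumptions made for this example. A real BPF program would include the libbpf headers and place the variable in the ".maps" section with SEC(".maps").

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the declaration helpers (assumed to match libbpf's
 * bpf_helpers.h). Integer attributes are encoded as an array length,
 * key/value types as pointer target types, so the whole map definition
 * is ordinary C type information that BTF can describe. */
#define __uint(name, val) int (*name)[val]
#define __type(name, val) __typeof__(val) *name

/* Assumption: mirrors the kernel's enum value for a hash map. */
enum { BPF_MAP_TYPE_HASH = 1 };

/* Hypothetical map: counts keyed by a 32-bit id. */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, unsigned int);
    __type(value, unsigned long long);
} pid_counts;

/* Recover an integer attribute from its type alone. sizeof does not
 * evaluate its operand, so the pointer is never dereferenced. */
#define MAP_UINT_ATTR(map, attr) (sizeof(*(map).attr) / sizeof(int))

/* The attribute is recoverable at compile time, without running anything. */
static_assert(MAP_UINT_ATTR(pid_counts, max_entries) == 1024,
              "max_entries is encoded in the member's type");
```

Nothing in the struct holds a value at run time. A loader only has to read the member types from BTF to learn the map type, size, and key/value layout.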
Again, appreciate the responses. I've been building with BPF/libbpf about a year now and still feel like I've only scratched the surface. Reading source code is great, but sometimes it just really helps to get high-level ideas as well!
I think you got everything right. BTW, feel free to check my more
recent blog post (), it might help a bit more.