Re: __builtin_memcpy behavior


Andrii Nakryiko
 

On Tue, Feb 23, 2021 at 1:12 PM Toke Høiland-Jørgensen <toke@...> wrote:

"Tristan Mayfield" <mayfieldtristan@...> writes:

Toke, thanks for the quick response!

Yes, I was checking the bpf_probe_read return values, and was reading
the number of bytes expected, so nothing wrong there!
Right, in that case that's probably just because the struct in question
is next to some other valid memory (not sure where tracepoints keep
their data, but if it's on the stack, for instance, you'll have no
problem reading past it).

Now that you mention CO-RE, it does actually make sense that these
sorts of errors could be shifted to load time rather than attach time
(is that the right phrase?). I've fiddled with CO-RE a bit but I haven't
adopted it for a few reasons (which I could certainly be mistaken
about).
I'm by no means the leading authority on CO-RE, but I can take a shot at
answering; hopefully someone will chime in to correct me if I'm wrong :)

I don't have control over kernel versions or compilation flags for the
kernel on the systems I'm targeting and I've had significant
difficulty trying to compile CO-RE programs (e.g. from the BCC repo's
libbpf-tools) on Linux <5.4 because I've had a hard time getting the
vmlinux. I can't remember if I used bpftool though (this was about a
year ago that I last played with CO-RE), so perhaps I'll give it
another shot.
Yeah, getting all your ducks in a row when compiling can be a bit of an
issue. However, I don't think you need anything special from the kernel
at compile-time if you just compile your own programs with a vmlinux.h
file you generated on a kernel that has been compiled with BTF.
As far as CO-RE BPF program compilation goes, there shouldn't be much
difference between the latest kernel and some older one. In the case of
libbpf-tools, some of the tools might be using features that are
supported by newer kernels only, but that's a bit different.

BTW, vmlinux.h is a pure convenience, so that you don't have to use
system headers or define your own types with
__attribute__((preserve_access_index)). vmlinux.h is not a
requirement. For libbpf-tools, though, it's pre-packaged to make life
easier (and now we have per-architecture vmlinux.h to facilitate
building libbpf-tools for various target arches).
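
A minimal sketch of the "define your own types" route, assuming you only need two members of the kernel's struct task_struct. The CORE_RELOCATABLE macro and the task_pid() helper are hypothetical names used only for this illustration, not something from libbpf or libbpf-tools. On a compiler that does not support the attribute, the macro expands to nothing and this is just an ordinary struct.

```c
/* Use the attribute only where the compiler supports it (e.g. Clang
 * targeting BPF); elsewhere fall back to a plain struct so this compiles. */
#if defined(__has_attribute)
#  if __has_attribute(preserve_access_index)
#    define CORE_RELOCATABLE __attribute__((preserve_access_index))
#  endif
#endif
#ifndef CORE_RELOCATABLE
#  define CORE_RELOCATABLE
#endif

/* Partial definition: only the members this program touches. With CO-RE,
 * accesses are recorded by field name and the offsets are adjusted at load
 * time against the running kernel's BTF, so this layout need not match the
 * real one. */
struct task_struct {
	int pid;
	char comm[16];
} CORE_RELOCATABLE;

/* Hypothetical helper. In a real BPF program the read would typically go
 * through BPF_CORE_READ() or a probe-read helper, depending on program type. */
int task_pid(const struct task_struct *t)
{
	return t->pid;
}
```

With vmlinux.h you get the same attribute applied to every kernel type at once, which is why it is a convenience rather than a requirement.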

A new enough Clang is a requirement, though. Clang 11+ is preferred, but
I believe Clang 10 should have enough features for a lot of CO-RE
functionality.


I've also been very unclear, and have gotten many different answers
regarding the target systems and whether they need to be custom
compiled with BTF enabled for CO-RE programs to run on them, or if you
can put a CO-RE program onto a generic kernel build and it "just
works?" From your answer, the answer seems to be that
/sys/kernel/btf/vmlinux needs to be on the target system, so it must
have that BTF_ENABLE flag set?
Well, you'll need the BTF information of the running kernel. It doesn't
*have* to come from /sys/kernel/btf/vmlinux, libbpf will look for it in
a few other locations as well:

https://github.com/libbpf/libbpf/blob/master/src/btf.c#L4583
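
As a rough illustration of that lookup, here is a sketch in plain C that tries a few candidate paths in order. The three locations are a subset recalled from libbpf's btf.c and may differ between versions, so treat the linked source as authoritative. The names btf_location, btf_candidate_path() and find_kernel_btf() are hypothetical and exist only for this example; libbpf also parses the file it finds, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>
#include <unistd.h>

/* A candidate is prefix + (optionally) the kernel release + suffix. */
struct btf_location {
	const char *prefix;
	const char *suffix;
	int uses_release;
};

/* Subset of candidate locations (an assumption, see the note above). */
static const struct btf_location btf_locations[] = {
	{ "/sys/kernel/btf/vmlinux", "", 0 },
	{ "/boot/vmlinux-", "", 1 },
	{ "/lib/modules/", "/build/vmlinux", 1 },
};

/* Build the path for one candidate; returns 0 on success, -1 if truncated. */
int btf_candidate_path(const struct btf_location *loc, const char *release,
		       char *out, size_t len)
{
	int n;

	assert(loc && release && out && len > 0);
	n = snprintf(out, len, "%s%s%s", loc->prefix,
		     loc->uses_release ? release : "", loc->suffix);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Return 0 and fill `out` with the first readable candidate, else -1. */
int find_kernel_btf(char *out, size_t len)
{
	struct utsname uts;
	size_t i;

	if (uname(&uts) != 0)
		return -1;
	for (i = 0; i < sizeof(btf_locations) / sizeof(btf_locations[0]); i++) {
		if (btf_candidate_path(&btf_locations[i], uts.release, out, len))
			continue;
		if (access(out, R_OK) == 0)
			return 0;
	}
	return -1;
}
```

Calling find_kernel_btf() with a buffer of a few hundred bytes is enough to check quickly whether a target machine has kernel BTF in one of these places.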
Right. For older kernels that don't yet support
/sys/kernel/btf/vmlinux, it's possible to add .BTF data with pahole -J
after the kernel is built. It's also possible to provide just BTF data
separately using bpf_object_open_opts, if it's more convenient.
Certainly an advanced use case, but doable.

But, of course, having kernels built with BTF and exposing it from
/sys/kernel/btf/vmlinux is hands down the most convenient way, and that
is increasingly an option on popular Linux distros. See
[0] for a list (I think ALT Linux is going to have BTF built-in as
well).

[0] https://github.com/libbpf/libbpf#bpf-co-re-compile-once--run-everywhere


Distros have gotten pretty good about enabling BTF in their kernel
builds, though, so it's getting increasingly feasible to rely on it. It
should certainly be available on RHEL8 (and thus CentOS 8).

If that's set, do you also need a vmlinux.h file as well? A coworker
was recently messing with CO-RE and seemed to think that deploying a
CO-RE program required shipping the vmlinux.h file and I think he
mentioned that file was about 1GB, which is certainly a no-go for
our position.
No, you don't need to ship the vmlinux.h file. That's just a regular
header file with an unusually large number of definitions in it, and it
is only used at compile time. It can be useful to include a copy of it in your
source code repository, though, as mentioned above. That's what BCC
does, for instance:
https://github.com/iovisor/bcc/tree/master/libbpf-tools/x86

And no, it's not 1GB in size. Maybe that size was from before BTF
de-duplication got implemented? The one linked above is 2.7M.
Maybe if you build allyesconfig it can come closer to 1GB :) But as
Toke said, it's used during compilation only. After that you get a BPF
object file (that .o file), which contains all the necessary
relocation information internally and is very small. Then there is the BPF
skeleton, which can be used to avoid distributing those separate .o files
(and provides a bunch of other convenience features, of course), but
it's not a requirement either.


In addition to that, I've been unclear on the role of BTF in BPF
generally. When I began tinkering with BPF I was under the impression
that BTF was *only* something used for CO-RE programs (something I
actually might've gotten from the article referenced and written by
Andrii), but I've periodically seen errors arise that cite BTF reasons
for erroring.
BTF started out as "just" compact debug info for your BPF programs,
but it quickly grew into much more and is used for many BPF-related
features. CO-RE is one big area, but there are kernel BPF features
that rely on in-kernel BTF heavily as well.

One common cause for this has been when loading 'tc' programs with
iproute2, because the iproute2 loader doesn't understand BTF and will
complain about it. That is usually harmless, but I agree it's
quite annoying. Fortunately, iproute2 has recently gained support for
using libbpf for its BPF loading, so hopefully that particular error
should go away before too long.

Unfortunately I haven't saved any of these errors and
can't remember the causes specifically, but something like the
"updated" maps declarations, i.e.

struct {
__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
__uint(key_size, sizeof(u32));
__uint(value_size, sizeof(u32));
} events SEC(".maps");

I've learned does use BTF?
Yes, the new-style map definitions use BTF. While BTF is ostensibly a
type format (i.e., something that describes C data types), Andrii
figured out that it is also possible to use it as a general purpose
key/value store. You do this by being a bit clever about how you
represent your data, which is what the __uint() macro in the above is
doing (it's encoding the integer value as the size of an array, which
becomes part of the type and thus embedded in the BTF). When loading,
libbpf will parse this data back out of the BTF data and use it when
creating the map. So you'll need BTF support in your compiler and in
libbpf to use this style of map definitions.
Right. Clang 10+ should be enough (but I'm too lazy to check), which
coincides with CO-RE requirements.
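
To make the array-size trick concrete, here is a small sketch in plain C. The __uint() definition has the same shape as the one in libbpf's bpf_helpers.h; UINT_OF(), the local enum and the u32 typedef are stand-ins added so the example compiles without kernel headers. Note that libbpf recovers the value by walking the BTF (pointer, then array, then element count), not with sizeof as done here, so this only demonstrates that the number really is carried by the type.

```c
#include <stdio.h>

/* Same shape as __uint() in libbpf's bpf_helpers.h: the value is stored as
 * the length of the array type that the member points to. */
#define __uint(name, val) int (*name)[val]

/* Hypothetical inverse: recover the value from the member's type alone.
 * sizeof does not evaluate its operand, so nothing is dereferenced. */
#define UINT_OF(member) (sizeof(*(member)) / sizeof(int))

/* Stand-ins for kernel definitions; in the UAPI enum bpf_map_type,
 * BPF_MAP_TYPE_PERF_EVENT_ARRAY has the value 4. */
enum { BPF_MAP_TYPE_PERF_EVENT_ARRAY = 4 };
typedef unsigned int u32;

/* The map definition from the question, minus SEC(".maps"). */
struct {
	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
	__uint(key_size, sizeof(u32));
	__uint(value_size, sizeof(u32));
} events;

/* Print the key/value pairs that were encoded purely in the member types. */
void print_events_def(void)
{
	printf("type=%zu key_size=%zu value_size=%zu\n",
	       UINT_OF(events.type), UINT_OF(events.key_size),
	       UINT_OF(events.value_size));
}
```

Because the numbers live in the type information, a compiler that emits BTF for the struct preserves them, and the loader never has to read the struct's contents.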


Am I misunderstanding what BTF is and the role it plays in BPF? Or
maybe has libbpf development moved so far toward CO-RE that non-CO-RE
development gets similar or the same error messages that just aren't
as clear for it?
Hmm, no, CO-RE is the specific feature that does relocations of struct
fields based on member names. This relies on BTF, but it's not the only
thing BTF is used for.

CO-RE is more than just field offset relocations, btw: you can detect
type and field existence, get type size, use relocatable enums
(internal kernel enums can get renumbered, so this feature lets you
accommodate that), etc.

The map definition is another, as you discovered,
and there are some program types that cannot work without BTF
information at all. Also, things like bpftool being able to print out
the struct layout of map values is using BTF. So you're certainly right
that the BPF ecosystem in general is moving towards using BTF in more
and more places. And I guess you're also right that this leads to some
cryptic error messages sometimes... :)
Thanks for your reply, Toke. I don't think I added much value here :)

-Toke




