
Re: BCC and passing packet from XDP to user-mode app #bcc

Yonghong Song
 

On Thu, Mar 18, 2021 at 4:49 AM Federico Parola
<federico.parola@...> wrote:

Hi,
the virtual function you are looking for is perf_submit_skb():

https://github.com/iovisor/bcc/blob/c8de00e1746e242cdcd68b4673a083bb467cd35e/src/cc/export/helpers.h#L193

Strangely it is not documented in the reference guide.
Thanks, Federico and others. Maybe one of you can add it to the
reference_guide.md? We
do have events.perf_submit there. Thanks!


Best regards,
Federico Parola

On 18/03/21 10:29, v.a.bonert@... wrote:
Hi!
Is it possible to pass full ethernet packet from XDP to user-mode app
using BCC?
I wrote C code like this:
BPF_PERF_OUTPUT(captured_data);
capture(struct xdp_md *ctx)
{
events.perf_submit(ctx, ...);
}
But there is no flags argument in perf_submit function (but
bpf_perf_event_output has such argument).
Without BCC I can write such code to pass full packet to user-mode:
struct packet_info
{
uint32_t packet_len;
uint32_t iface_id;
};
struct bpf_map_def SEC("maps") captured_data =
{
.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
.key_size = sizeof(u32),
.value_size = sizeof(u32),
.max_entries = MAX_CPUS
};
SEC("xdp")
int capture_kern(struct xdp_md *ctx)
{
u32 len = ctx->data_end - ctx->data;
u64 flags = BPF_F_CURRENT_CPU;
flags |= (u64)len << 32;
struct packet_info info = {len, ctx->ingress_ifindex};
bpf_perf_event_output(ctx, &captured_data, flags, &info, sizeof(info));
return XDP_PASS;
}
How can I do the same when using BCC?




Re: BCC and passing packet from XDP to user-mode app #bcc

Federico Parola
 

Hi,
the virtual function you are looking for is perf_submit_skb():

https://github.com/iovisor/bcc/blob/c8de00e1746e242cdcd68b4673a083bb467cd35e/src/cc/export/helpers.h#L193

Strangely it is not documented in the reference guide.
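
For context, here is a rough sketch of what the call amounts to. It assumes, from my reading of the linked helpers.h (so treat it as unverified), that the helper is invoked as captured_data.perf_submit_skb(ctx, packet_len, &info, sizeof(info)). The packet length then travels in the upper 32 bits of the flags word, the same way the non-BCC code quoted below builds it by hand. The constants are local copies of the values in linux/bpf.h, and the sketch_* names are invented for illustration. This is plain user-space C, not BPF code.

```c
#include <stdint.h>

/* Local copies of the flag values from linux/bpf.h, renamed to avoid clashes. */
#define SKETCH_F_INDEX_MASK  0xffffffffULL
#define SKETCH_F_CURRENT_CPU SKETCH_F_INDEX_MASK
#define SKETCH_F_CTXLEN_MASK (0xfffffULL << 32)

/* Hypothetical helper: build the flags word the same way the quoted
 * non-BCC program does (current CPU in the low half, length on top). */
static uint64_t sketch_perf_flags(uint32_t packet_len)
{
    uint64_t flags = SKETCH_F_CURRENT_CPU;

    flags |= (uint64_t)packet_len << 32;
    return flags;
}

/* Hypothetical helper: recover the packet length from a flags word. */
static uint32_t sketch_flags_to_len(uint64_t flags)
{
    return (uint32_t)((flags & SKETCH_F_CTXLEN_MASK) >> 32);
}
```

For a 1500-byte packet, sketch_perf_flags(1500) yields 0x000005DCFFFFFFFF, and sketch_flags_to_len() turns that back into 1500.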

Best regards,
Federico Parola

On 18/03/21 10:29, v.a.bonert@... wrote:
Hi!
Is it possible to pass full ethernet packet from XDP to user-mode app using BCC?
I wrote C code like this:
BPF_PERF_OUTPUT(captured_data);
capture(struct xdp_md *ctx)
{
    events.perf_submit(ctx, ...);
}
But there is no flags argument in perf_submit function (but bpf_perf_event_output has such argument).
Without BCC I can write such code to pass full packet to user-mode:
struct packet_info
{
    uint32_t packet_len;
    uint32_t iface_id;
};
struct bpf_map_def SEC("maps") captured_data =
{
    .type        = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
    .key_size    = sizeof(u32),
    .value_size  = sizeof(u32),
    .max_entries = MAX_CPUS
};
SEC("xdp")
int capture_kern(struct xdp_md *ctx)
{
    u32 len = ctx->data_end - ctx->data;
    u64 flags = BPF_F_CURRENT_CPU;
    flags |= (u64)len << 32;
    struct packet_info info = {len, ctx->ingress_ifindex};
    bpf_perf_event_output(ctx, &captured_data, flags, &info, sizeof(info));
    return XDP_PASS;
}
How can I do the same when using BCC?


BCC and passing packet from XDP to user-mode app #bcc

v.a.bonert@...
 
Edited

Hi!
 
Is it possible to pass full ethernet packet from XDP to user-mode app using BCC?
I wrote C code like this:
 
BPF_PERF_OUTPUT(captured_data);
int capture(struct xdp_md *ctx)
{
    captured_data.perf_submit(ctx, ...);
    return XDP_PASS;
}
 
But there is no flags argument in perf_submit function (but bpf_perf_event_output has such argument).
 
Without BCC I can write such code to pass full packet to user-mode:
 
struct packet_info
{
    uint32_t packet_len;
    uint32_t iface_id;
};
 
struct bpf_map_def SEC("maps") captured_data =
{
    .type        = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
    .key_size    = sizeof(u32),
    .value_size  = sizeof(u32),
    .max_entries = MAX_CPUS
};
 
SEC("xdp")
int capture_kern(struct xdp_md *ctx)
{
    u32 len = ctx->data_end - ctx->data;
    u64 flags = BPF_F_CURRENT_CPU;
    flags |= (u64)len << 32;
    struct packet_info info = {len, ctx->ingress_ifindex};
    bpf_perf_event_output(ctx, &captured_data, flags, &info, sizeof(info));
 
    return XDP_PASS;
}
 
How can I do the same when using BCC?


Re: Which file should I include for KERNEL_VERSION macro ?

Andrii Nakryiko
 

On Wed, Mar 17, 2021 at 5:10 AM <chenhengqi@...> wrote:

I've read this blog post

https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html

And want to apply this technique to my program:

extern u32 LINUX_KERNEL_VERSION __kconfig;
extern u32 CONFIG_HZ __kconfig;

u64 utime_ns;

if (LINUX_KERNEL_VERSION >= KERNEL_VERSION(4, 11, 0))
    utime_ns = BPF_CORE_READ(task, utime);
else
    /* convert jiffies to nanoseconds */
    utime_ns = BPF_CORE_READ(task, utime) * (1000000000UL / CONFIG_HZ);

It will soon be part of bpf_helpers.h, but meanwhile just copy/paste
it into your code. See
https://patchwork.kernel.org/project/netdevbpf/patch/20210317200510.1354627-2-andrii@kernel.org/
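
As a stopgap, a minimal sketch of what could be pasted in. It assumes the classic linux/version.h encoding of the macro; the definition in the linked patch may differ in detail. utime_to_ns() is a hypothetical user-space stand-in for the BPF snippet, with the kernel version and HZ passed as ordinary parameters instead of __kconfig externs, so the logic can be tried outside of BPF.

```c
#include <stdint.h>

/* Assumed definition, following the classic linux/version.h encoding. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif

/* Hypothetical stand-in for the snippet above: picks the conversion
 * based on the kernel version, as the BPF code does at run time. */
static uint64_t utime_to_ns(uint32_t kernel_version, uint32_t hz, uint64_t utime)
{
    if (kernel_version >= KERNEL_VERSION(4, 11, 0))
        return utime;                       /* already in nanoseconds */

    return utime * (1000000000UL / hz);     /* convert jiffies to nanoseconds */
}
```

With HZ=250, three jiffies on a 4.10 kernel become 12,000,000 ns, while on 4.11 or later the value is returned unchanged.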


Which file should I include for KERNEL_VERSION macro ?

chenhengqi@...
 

I've read this blog post

https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html

And want to apply this technique to my program:

extern u32 LINUX_KERNEL_VERSION __kconfig;
extern u32 CONFIG_HZ __kconfig;

u64 utime_ns;

if (LINUX_KERNEL_VERSION >= KERNEL_VERSION(4, 11, 0))
    utime_ns = BPF_CORE_READ(task, utime);
else
    /* convert jiffies to nanoseconds */
    utime_ns = BPF_CORE_READ(task, utime) * (1000000000UL / CONFIG_HZ);


Re: Which is oldest linux kernel version that can support BTF? #bcc

Toke Høiland-Jørgensen
 

"Daniel Xu" <dxu@...> writes:

On Sun, Feb 28, 2021, at 12:07 PM, bg.salunke09@... wrote:
Hi,

I'm looking into BTF and its use case. Based on the document I
understood to run BPF programs across different kernel versions, it
needs to build with libbpf which depends on the BTF information.
Now to enable/to have BTF information on any Kernel, the kernel needs
to be re-build with "" flag.

I can see the BTF support in Linux introduced from kernel version
5.1.0
(https://www.kernel.org/doc/html/v5.1/bpf/btf.html?highlight=btf)
however I can still see the BTF information(/sys/kernel/btf/vmlinux) on
my 4.18.0-193.28.1.el8_2.x86_64 kernel.
What distro are you using? Your distro probably backported BTF
support.
Yeah, that's a RHEL version number (RHEL8.2 in this case, as seen by the
"el8_2" bit). Which means that as far as features are concerned, the
4.18 version number is basically a complete fiction at this point. For
BPF we basically backport everything, IIRC we made it up to upstream
kernel 5.4 for RHEL8.2...

-Toke


Re: Which is oldest linux kernel version that can support BTF? #bcc

Daniel Xu
 

On Sun, Feb 28, 2021, at 12:07 PM, bg.salunke09@... wrote:
Hi,

I'm looking into BTF and its use case. Based on the document I
understood to run BPF programs across different kernel versions, it
needs to build with libbpf which depends on the BTF information.
Now to enable/to have BTF information on any Kernel, the kernel needs
to be re-build with "" flag.

I can see the BTF support in Linux introduced from kernel version
5.1.0
(https://www.kernel.org/doc/html/v5.1/bpf/btf.html?highlight=btf)
however I can still see the BTF information(/sys/kernel/btf/vmlinux) on
my 4.18.0-193.28.1.el8_2.x86_64 kernel.
What distro are you using? Your distro probably backported BTF support.

Daniel


Questions about runqlen

Abel Wu
 

Hi, when I looked into the runqlen script yesterday, I found that,
sadly, I misunderstood the "queue length" all the time not only the
"length" part but also the "queue" part.

Queue
=====
Only CFS runqueues are taken into account. This makes sense when
main workloads are all under CFS scheduler, which is common in
cloud scenario. But what I don't quite follow is that the selected
queue is task->se.cfs_rq which is from a task view, rather than the
top level cfs_rq from a cpu view. I suppose the task view is not
enough to draw the whole picture of saturation?

Length
======
Within this scope length means the number of schedulable entities,
that is cfs_rq->nr_running. From time sharing point of view, it is
OK because it represents how many units involved in scheduling of
this cfs_rq. But what about from execution point of view in which
the number of tasks (cfs_rq->h_nr_running) will be used?

And besides the above, without the shares information of each entity,
how could runqlen help us optimizing the performance? Maybe we should
always focus on occupancy rather than length?

It would be very much appreciated if someone can shed some light.

Thanks & Best regards,
Abel


Re: Which is oldest linux kernel version that can support BTF? #bcc

bg.salunke09@...
 

On Tue, Mar 2, 2021 at 08:22 PM, Andrii Nakryiko wrote:
On Tue, Mar 2, 2021 at 4:42 PM <bg.salunke09@...> wrote:
Thanks Andrii, for detailed answer.
Yes you are right, I'm looking for CO-RE. Basically I'm trying to build the eBPF program which can run on any linux kernel version using libbpf

What I understood from your blog https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html (Thanks for in depth blog post, appreciate it), to work libbpf based program
the BTF information should be available on the running host. Is my understanding correct?
Yes, correct.
Thank you for confirming! 
Btw, Is there any document to generate BTF information for a linux kernel? Or Is there a way to generate BTF info for running kernel i.e. at runtime and not at compile time? Thanks!
Yes, you can, if you have vmlinux image with DWARF information in it.
You can use pahole tool like this to add .BTF section to vmlinux
image:

pahole -J <path-to-vmlinux-image>

You most probably would want to make a local copy of vmlinux image, of
course. After that you can pass the path to that vmlinux with embedded
.BTF to libbpf to use for CO-RE relocations. See [0] for recent
discussion of the exact same topic. See also patch [1] that was aiming
to make this scenario better in libbpf (unfortunately it hasn't landed
yet, but it is pretty close to being done, so shouldn't be a problem
for you to pick up, if necessary).

This is certainly not the most straightforward and easiest path, but
if you want to get CO-RE working with older kernel for which you don't
have much control, it is definitely a possible way (as long as you
have DWARF, which is used to produce BTF for vmlinux).

[0] https://lore.kernel.org/bpf/CAEf4BzbJZLjNoiK8_VfeVg_Vrg=9iYFv+po-38SMe=UzwDKJ=Q@.../
[1] https://lore.kernel.org/bpf/B8801F77-37E8-4EF8-8994-D366D48169A3@.../

Got it. I'm following the discussion thread and patch. Thank you so much for your time.


Re: Which is oldest linux kernel version that can support BTF? #bcc

Andrii Nakryiko
 

On Tue, Mar 2, 2021 at 4:42 PM <bg.salunke09@...> wrote:

Thanks Andrii, for detailed answer.
Yes you are right, I'm looking for CO-RE. Basically I'm trying to build the eBPF program which can run on any linux kernel version using libbpf

What I understood from your blog https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html (Thanks for in depth blog post, appreciate it), to work libbpf based program
the BTF information should be available on the running host. Is my understanding correct?
Yes, correct.


Btw, Is there any document to generate BTF information for a linux kernel? Or Is there a way to generate BTF info for running kernel i.e. at runtime and not at compile time? Thanks!
Yes, you can, if you have vmlinux image with DWARF information in it.
You can use pahole tool like this to add .BTF section to vmlinux
image:

pahole -J <path-to-vmlinux-image>

You most probably would want to make a local copy of vmlinux image, of
course. After that you can pass the path to that vmlinux with embedded
.BTF to libbpf to use for CO-RE relocations. See [0] for recent
discussion of the exact same topic. See also patch [1] that was aiming
to make this scenario better in libbpf (unfortunately it hasn't landed
yet, but it is pretty close to being done, so shouldn't be a problem
for you to pick up, if necessary).

This is certainly not the most straightforward and easiest path, but
if you want to get CO-RE working with older kernel for which you don't
have much control, it is definitely a possible way (as long as you
have DWARF, which is used to produce BTF for vmlinux).

[0] https://lore.kernel.org/bpf/CAEf4BzbJZLjNoiK8_VfeVg_Vrg=9iYFv+po-38SMe=UzwDKJ=Q@mail.gmail.com/
[1] https://lore.kernel.org/bpf/B8801F77-37E8-4EF8-8994-D366D48169A3@araalinetworks.com/



Re: Which is oldest linux kernel version that can support BTF? #bcc

bg.salunke09@...
 

Thanks Andrii, for detailed answer.  
Yes you are right, I'm looking for CO-RE. Basically I'm trying to build the eBPF program which can run on any linux kernel version using libbpf

What I understood from your blog https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html (Thanks for in depth blog post, appreciate it), to work libbpf based program 
the BTF information should be available on the running host. Is my understanding correct?

Btw, Is there any document to generate BTF information for a linux kernel?  Or Is there a way to generate BTF info for running kernel i.e. at runtime and not at compile time? Thanks! 


Re: Which is oldest linux kernel version that can support BTF? #bcc

Toke Høiland-Jørgensen
 

"Andrii Nakryiko" <andrii.nakryiko@...> writes:

On Sun, Feb 28, 2021 at 12:37 PM <bg.salunke09@...> wrote:

[Edited Message Follows]

Hi,

I'm looking into BTF and its use case. Based on the document I understood to run BPF programs across different kernel versions, it needs to build with libbpf which depends on the BTF information.
Now to enable/to have BTF information on any Kernel, the kernel needs to be re-build with "" flag.

I can see the BTF support in Linux introduced from kernel version 5.1.0 (https://www.kernel.org/doc/html/v5.1/bpf/btf.html?highlight=btf)
however I can still see the BTF information(/sys/kernel/btf/vmlinux) on my 4.18.0-193.28.1.el8_2.x86_64 kernel.

I'm a little confused here about how an old kernel can generate BTF info if the support was added recently.

Can I get information about oldest linux kernel version that can support BTF?
/sys/kernel/btf/vmlinux appeared in 5.4 kernel (upstream version). If
you see it on 4.18, that means someone backported the changes.
Yeah, that looks like a RHEL/CentOS kernel version number, which means
the 4.18 bit is mostly fiction at this point (at least as far as BPF is
concerned). IIRC we backported up to upstream kernel 5.4 for RHEL 8.2,
which seems to be what you're running (from the el8_2 bit of the
version), and I guess that fits with the availability of
/sys/kernel/btf/vmlinux

-Toke


Re: Which is oldest linux kernel version that can support BTF? #bcc

Andrii Nakryiko
 

On Sun, Feb 28, 2021 at 12:37 PM <bg.salunke09@...> wrote:

[Edited Message Follows]

Hi,

I'm looking into BTF and its use case. Based on the document I understood to run BPF programs across different kernel versions, it needs to build with libbpf which depends on the BTF information.
Now to enable/to have BTF information on any Kernel, the kernel needs to be re-build with "" flag.

I can see the BTF support in Linux introduced from kernel version 5.1.0 (https://www.kernel.org/doc/html/v5.1/bpf/btf.html?highlight=btf)
however I can still see the BTF information(/sys/kernel/btf/vmlinux) on my 4.18.0-193.28.1.el8_2.x86_64 kernel.

I'm a little confused here about how an old kernel can generate BTF info if the support was added recently.

Can I get information about oldest linux kernel version that can support BTF?
/sys/kernel/btf/vmlinux appeared in 5.4 kernel (upstream version). If
you see it on 4.18, that means someone backported the changes. But for
BPF CO-RE (which I assume is what you are referring to) to work,
kernel itself doesn't need to "support BTF", it just needs to have
.BTF data built-in inside its vmlinux binary image, and that image
needs to be in one of the supported locations (see [0]). Starting from
5.2 kernel CONFIG_DEBUG_INFO_BTF=y is supported, which adds .BTF section
as part of the kernel build process.

But one could technically add .BTF by using pahole tool (part of
dwarves package) even before that, as long as vmlinux image contains
DWARF information.

So in short, the easiest way is to get the latest kernel you can. But
with enough persistence and effort you can get kernel BTF embedded for
pretty much any kernel version.
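
A quick way to see which situation a given host is in is to probe for the sysfs file discussed in this thread. The sketch below checks only that one path. libbpf itself tries several more locations (see [0]), which are deliberately not reproduced here, and the sketch_* names are invented for illustration.

```c
#include <unistd.h>

/* Returns 1 if `path` exists and is readable by the caller, 0 otherwise. */
static int sketch_btf_readable(const char *path)
{
    return access(path, R_OK) == 0;
}

/* Probes the sysfs location mentioned in this thread. It is present on
 * upstream 5.4+ kernels built with BTF, or where that was backported. */
static int sketch_has_sysfs_btf(void)
{
    return sketch_btf_readable("/sys/kernel/btf/vmlinux");
}
```

If sketch_has_sysfs_btf() returns 0, the kernel may still carry a .BTF section in a vmlinux image elsewhere on disk, so a negative result here is not conclusive.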


[0] https://github.com/libbpf/libbpf/blob/master/src/btf.c#L4589-L4598




Re: Which is oldest linux kernel version that can support BTF? #bcc

Alison Chaiken
 

bg.salunke09@... asked:
Can I get information about oldest linux kernel version that can support BTF?
The basic support appears to have been added by

commit e83b9f55448afce3fe1abcd1d10db9584f8042a6
Author: Andrii Nakryiko <andriin@...>
Date: Tue Apr 2 09:49:50 2019 -0700
kbuild: add ability to generate BTF type info for vmlinux

The inquiry "git branch --contains e83b9f55448a" will tell you which
of your branches contains this commit.

Hope this helps,
Alison Chaiken
Aurora Innovation


Which is oldest linux kernel version that can support BTF? #bcc

bg.salunke09@...
 
Edited

Hi, 

I'm looking into BTF and its use case. Based on the document I understood to run BPF programs across different kernel versions, it needs to build with libbpf which depends on the BTF information. 
Now to enable/to have BTF information on any Kernel, the kernel needs to be re-build with "" flag. 

I can see the BTF support in Linux introduced from kernel version 5.1.0  (https://www.kernel.org/doc/html/v5.1/bpf/btf.html?highlight=btf)
however I can still see the BTF information(/sys/kernel/btf/vmlinux) on my 4.18.0-193.28.1.el8_2.x86_64 kernel.

I'm a little confused here about how an old kernel can generate BTF info if the support was added recently.

Can I get information about oldest linux kernel version that can support BTF?



Re: BCC Support for BPF Subprograms with Tail Calls (Kernel 5.10 Feature)

Yonghong Song
 

On Wed, Feb 24, 2021 at 12:24 PM <jwkova@...> wrote:

Hello,

I was wondering if BCC implements the new BPF feature (as of kernel 5.10) to allow BPF programs to utilize both BPF tail calls and BPF subprograms. This behavior is described near the end of this section of the BPF reference guide. I am interested in this functionality to extend a BPF program in order to reach the limit of 8KB of stack space.
You can use bpf tail calls today. You can look at
bcc/tests/cc/test_prog_table.cc for an example. bcc does not support
subprogram yet. In the future we do plan to be more libbpf compatible
so we can use those features.
BTW, the stack limit is 512 bytes not 8KB.


Thanks,

Jake


Re: __builtin_memcpy behavior

Toke Høiland-Jørgensen
 

"Tristan Mayfield" <mayfieldtristan@...> writes:

Thank you to both Andrii and Toke! It's been extremely helpful to read
your responses. Having conversations like these really helps me when I
go into the source code and try to understand the overall intent of
it. I'm going to try and summarize the conversation to confirm my
understanding.

bpf_probe_read() will read any valid kernel memory (nothing new here).
If the memory is already available to be read in the program (e.g. in
tracepoint args), then __builtin_memcpy can be used and will
potentially throw attach-time errors if reading structs incorrectly
(for some reason I don't think we clarified).
OK, I'll try to explain this one:

Think of __builtin_memcpy() as a macro: it just compiles down to regular
program instructions copying the memory (i.e., these two are roughly
equivalent, modulo any optimisations the compiler might make):

x = y;
__builtin_memcpy(&x, &y, sizeof(x));
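
To make that concrete, here is a small user-space sketch with an invented struct (sketch_args is not a real kernel type). Both functions end up with the same field values in dst; which instructions the compiler emits for each is up to the optimiser.

```c
#include <stdint.h>

/* Invented example type, standing in for something like tracepoint args. */
struct sketch_args {
    uint32_t pid;
    uint32_t cpu;
    uint64_t ts;
};

/* Copy using plain struct assignment. */
static struct sketch_args copy_by_assignment(const struct sketch_args *src)
{
    struct sketch_args dst;

    dst = *src;
    return dst;
}

/* Copy using the compiler builtin (available in GCC and clang). */
static struct sketch_args copy_by_builtin(const struct sketch_args *src)
{
    struct sketch_args dst;

    __builtin_memcpy(&dst, src, sizeof(dst));
    return dst;
}
```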

The verifier will check the resulting memory access instructions, to
make sure you're not reading or writing out of bounds for whatever
variable you're reading from / writing to. E.g., if you're reading from
a context pointer, the verifier will know the size of the context object
and make sure you only dereference up to the memory address ctx +
sizeof(*ctx).

CO-RE can guarantee valid memory reads because of the nature of being
able to check offsets and relocations at load time rather than attach
time or just returning garbage data with no errors.
Yes, that's basically what it boils down to. It works like this: What
CO-RE does (for structs) is add some more information to the compiled
binary so that you can reference struct members by name instead of
memory offset. So, normally if you write:

x = y->z;

that will compile to a load from 'y + offsetof(typeof(y), z)', with the
offset being computed at compile time. When you add the
preserve_access_index attribute, clang will record a relocation that
says you wanted the member named 'z' (and its type) by way of the BTF
information. libbpf will read that at load time, and compute a new
'offsetof(typeof(y), z)' for the struct member as it exists in the
running kernel, so that if the layout has changed, you'll still get the
right offset. The load instruction in the byte code is then rewritten
with this new offset.

This means that by the time the bytecode is loaded into the kernel, it
has already been rewritten, so the kernel bounds check is still the same
- it'll just check that the memory you read is inside the size of the
structure; but because the offsets have been fixed up, the end result is
that you won't get out-of-bound errors - i.e., you might say that passing the
bounds check is an implicit effect of the CO-RE rewriting.
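
A toy user-space illustration of the rewriting, under loudly stated assumptions: both struct layouts are invented, and sketch_relocate() is only a stand-in for the by-name lookup that libbpf performs against BTF at load time. It is not how libbpf is implemented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Invented layout the program was compiled against. */
struct sketch_task_build {
    uint32_t pid;
    uint64_t utime;
};

/* Invented layout of the "running kernel": members were added before utime. */
struct sketch_task_running {
    uint32_t pid;
    uint32_t flags;
    uint64_t start;
    uint64_t utime;
};

/* Models a load instruction: read 8 bytes at base + off. */
static uint64_t sketch_load_u64(const void *base, size_t off)
{
    uint64_t v;

    memcpy(&v, (const char *)base + off, sizeof(v));
    return v;
}

/* Stand-in for the load-time fix-up: resolve a member name to its offset
 * in the running layout. Returns (size_t)-1 for an unknown name. */
static size_t sketch_relocate(const char *member)
{
    if (strcmp(member, "pid") == 0)
        return offsetof(struct sketch_task_running, pid);
    if (strcmp(member, "utime") == 0)
        return offsetof(struct sketch_task_running, utime);
    return (size_t)-1;
}
```

Loading with offsetof(struct sketch_task_build, utime) against a sketch_task_running object reads the wrong member, while loading with sketch_relocate("utime") reads the intended one. That is the difference the relocation makes.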

-Toke


BCC Support for BPF Subprograms with Tail Calls (Kernel 5.10 Feature)

jwkova@...
 

Hello,

I was wondering if BCC implements the new BPF feature (as of kernel 5.10) to allow BPF programs to utilize both BPF tail calls and BPF subprograms. This behavior is described near the end of this section of the BPF reference guide. I am interested in this functionality to extend a BPF program in order to reach the limit of 8KB of stack space.

Thanks,

Jake


Re: __builtin_memcpy behavior

Andrii Nakryiko
 

-- Andrii

On Wed, Feb 24, 2021 at 11:28 AM Tristan Mayfield
<mayfieldtristan@...> wrote:

Thank you to both Andrii and Toke! It's been extremely helpful to read your responses. Having conversations like these really helps me when I go into the source code and try to understand the overall intent of it. I'm going to try and summarize the conversation to confirm my understanding.

bpf_probe_read() will read any valid kernel memory (nothing new here). If the memory is already available to be read in the program (e.g. in tracepoint args), then __builtin_memcpy can be used and will potentially throw attach-time errors if reading structs incorrectly (for some reason I don't think we clarified).

CO-RE can guarantee valid memory reads because of the nature of being able to check offsets and relocations at load time rather than attach time or just returning garbage data with no errors.

To build CO-RE programs you need a vmlinux file (not to be confused with the header, vmlinux.h) which is normally found at /sys/kernel/btf/vmlinux on systems that have been compiled with pahole and CONFIG_DEBUG_INFO_BTF=y. Having the vmlinux.h file is helpful because it replaces kernel headers and makes building a bit nicer, but isn't necessary. Once compiled, CO-RE programs should be able to run on any system that has a vmlinux file in one of the locations listed here: https://github.com/libbpf/libbpf/blob/master/src/btf.c#L4583.
vmlinux usually refers to kernel image binary. /sys/kernel/btf/vmlinux
is not that, it's only the BTF data. So CO-RE needs kernel BTF, not
necessarily vmlinux kernel image. Just a clarification. But vmlinux
image (ELF file) itself has .BTF section, which has the same data
exposed in /sys/kernel/btf/vmlinux, so libbpf will try to fetch that
data, if /sys/kernel/btf/vmlinux is not present. That is necessary for
some older kernel versions, as well if you "embed" BTF information
manually with `pahole -J`.


For earlier kernels, it's possible to generate a vmlinux file (and this is one of the spots I'm a bit murky on) with pahole -J, but I'm not sure what you are supposed to target when running that? Just the compiled kernel binary? Something else?
Yes, `pahole -J <path-to-kernel-image-vmlinux-binary>`. Pahole is able
to produce BTF from DWARF type information, contained in your vmlinux
kernel image (if you compile it with DWARF, of course). That's what is
happening in newer kernels when you specify CONFIG_DEBUG_INFO_BTF=y
(plus some extra linker steps to make that section "loadable":
available to kernel itself in runtime, but that's not necessary for
CO-RE itself).


BTF is just a type format that can describe C data-types. Almost like a meta-language? I've personally not looked at the source for BTF yet, but it seems to be versatile enough that it's useful for CO-RE for describing internal data structures from the kernel, but it's also useful for a variety of other things (like map declarations) and will likely be increasingly relied on in future iterations of BPF, both CO-RE and otherwise. BTF support mainly comes from the compiler (which I do believe clang 10+ works, just from my experience. I'm primarily using clang 10 right now) and libbpf supporting it, not necessarily the kernel (except for CO-RE with the vmlinux).

Again, appreciate the responses. I've been building with BPF/libbpf about a year now and still feel like I've only scratched the surface. Reading source code is great, but sometimes it just really helps to get high-level ideas as well!
I think you got everything right. BTW, feel free to check my more
recent blog post ([0]), it might help a bit more.

[0] https://nakryiko.com/posts/libbpf-bootstrap/


-Tristan


Re: __builtin_memcpy behavior

Tristan Mayfield
 

Thank you to both Andrii and Toke! It's been extremely helpful to read your responses. Having conversations like these really helps me when I go into the source code and try to understand the overall intent of it. I'm going to try and summarize the conversation to confirm my understanding.

bpf_probe_read() will read any valid kernel memory (nothing new here). If the memory is already available to be read in the program (e.g. in tracepoint args), then __builtin_memcpy can be used and will potentially throw attach-time errors if reading structs incorrectly (for some reason I don't think we clarified).

CO-RE can guarantee valid memory reads because of the nature of being able to check offsets and relocations at load time rather than attach time or just returning garbage data with no errors.

To build CO-RE programs you need a vmlinux file (not to be confused with the header, vmlinux.h) which is normally found at /sys/kernel/btf/vmlinux on systems that have been compiled with pahole and CONFIG_DEBUG_INFO_BTF=y. Having the vmlinux.h file is helpful because it replaces kernel headers and makes building a bit nicer, but isn't necessary. Once compiled, CO-RE programs should be able to run on any system  that has a vmlinux file in one of the locations listed here: https://github.com/libbpf/libbpf/blob/master/src/btf.c#L4583.

For earlier kernels, it's possible to generate a vmlinux file (and this is one of the spots I'm a bit murky on) with pahole -J, but I'm not sure what you are supposed to target when running that? Just the compiled kernel binary? Something else?

BTF is just a type format that can describe C data-types. Almost like a meta-language? I've personally not looked at the source for BTF yet, but it seems to be versatile enough that it's useful for CO-RE for describing internal data structures from the kernel, but it's also useful for a variety of other things (like map declarations) and will likely be increasingly relied on in future iterations of BPF, both CO-RE and otherwise. BTF support mainly comes from the compiler (which I do believe clang 10+ works, just from my experience. I'm primarily using clang 10 right now) and libbpf supporting it, not necessarily the kernel (except for CO-RE with the vmlinux).

Again, appreciate the responses. I've been building with BPF/libbpf about a year now and still feel like I've only scratched the surface. Reading source code is great, but sometimes it just really helps to get high-level ideas as well!

-Tristan
