
Re: Question about inet_set_socket_state trace point

Raga lahari
 

Hello,

Correcting a typo in the code snippet:

<code>

TRACEPOINT_PROBE(sock, inet_sock_set_state) {
    /* val is assumed to point to the counter value in the BPF map (lookup not shown) */
    if (args->newstate == TCP_ESTABLISHED)
        __sync_fetch_and_add(val, 1);

    if (args->oldstate == TCP_ESTABLISHED)
        __sync_fetch_and_add(val, -1);

    return 0;
}



Thanks & Regards,
Ragalahari


On Wed, Oct 14, 2020 at 10:35 AM Raga lahari <ragalahari.potti@...> wrote:

Hi everyone,


I am using the inet_sock_set_state tracepoint to count the currently established connections.

The counter value in a BPF map is incremented when the new state is TCP_ESTABLISHED and decremented when the old state is TCP_ESTABLISHED.


However, the map count diverges from what netstat shows. When we start the probe everything looks fine, but after leaving it running for 2-3 days we see a difference, and this difference keeps growing over time.

Can someone please tell me if I am missing something?


<code>

TRACEPOINT_PROBE(sock, inet_sock_set_state) {
    if (args->newstate >= TCP_ESTABLISHED)
        __sync_fetch_and_add(val, 1);

    if (args->newstate >= TCP_ESTABLISHED)
        __sync_fetch_and_add(val, -1);
}


netstat -tanp  | grep -i "EST" | wc -l

Thanks,
Ragalahari


Re: Question about inet_set_socket_state trace point

Tristan Mayfield
 

Hi Ragalahari,

In your code you don't seem to check the old state before decrementing. It looks like you add 1 and then immediately subtract 1 under the same condition. That might be your problem. You never stated what the difference between the map count and netstat is, so I can't be sure.

Tristan
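
The difference between the two snippets in this thread can be checked outside the kernel. The following is a minimal user-space C sketch, not BPF code: the transition struct, the sample trace, and both function names are made up for illustration, and only the state numbers are taken from the kernel's TCP state enum. It replays a list of state changes through the conditions of the first post and through the corrected ones.

```c
#include <stddef.h>

/* State numbers as in the kernel's TCP state enum (only the ones used here). */
enum {
    TCP_ESTABLISHED = 1,
    TCP_SYN_SENT = 2,
    TCP_SYN_RECV = 3,
    TCP_FIN_WAIT1 = 4,
    TCP_CLOSE = 7,
};

/* Hypothetical stand-in for the oldstate/newstate fields of the tracepoint. */
struct transition {
    int oldstate;
    int newstate;
};

/* Conditions from the first post: both tests use newstate with >=. */
static long count_original(const struct transition *t, size_t n)
{
    long val = 0;
    for (size_t i = 0; i < n; i++) {
        if (t[i].newstate >= TCP_ESTABLISHED)
            val += 1;
        if (t[i].newstate >= TCP_ESTABLISHED)
            val -= 1;
    }
    return val;
}

/* Conditions from the corrected snippet: newstate to count up, oldstate to count down. */
static long count_corrected(const struct transition *t, size_t n)
{
    long val = 0;
    for (size_t i = 0; i < n; i++) {
        if (t[i].newstate == TCP_ESTABLISHED)
            val += 1;
        if (t[i].oldstate == TCP_ESTABLISHED)
            val -= 1;
    }
    return val;
}

/* Illustrative trace: one outbound connection opens and starts closing,
 * one inbound connection opens and stays open. */
static const struct transition sample[] = {
    { TCP_CLOSE, TCP_SYN_SENT },
    { TCP_SYN_SENT, TCP_ESTABLISHED },
    { TCP_SYN_RECV, TCP_ESTABLISHED },
    { TCP_ESTABLISHED, TCP_FIN_WAIT1 },
};
```

With this trace, count_original(sample, 4) stays at 0 because every increment is cancelled at once, while count_corrected(sample, 4) ends at 1, the one connection that is still established.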


Question about inet_set_socket_state trace point

Raga lahari
 

Hi everyone,


I am using the inet_sock_set_state tracepoint to count the currently established connections.

The counter value in a BPF map is incremented when the new state is TCP_ESTABLISHED and decremented when the old state is TCP_ESTABLISHED.


However, the map count diverges from what netstat shows. When we start the probe everything looks fine, but after leaving it running for 2-3 days we see a difference, and this difference keeps growing over time.

Can someone please tell me if I am missing something?


<code>

TRACEPOINT_PROBE(sock, inet_sock_set_state) {
    if (args->newstate >= TCP_ESTABLISHED)
        __sync_fetch_and_add(val, 1);

    if (args->newstate >= TCP_ESTABLISHED)
        __sync_fetch_and_add(val, -1);
}


netstat -tanp  | grep -i "EST" | wc -l

Thanks,
Ragalahari


Re: [vagrant] accept PR to bring iovisor/vagrant to ubuntu 20.04 (from ubuntu 14.04)

Brenden Blanco
 

Sure I can accept a PR.

On Fri, Oct 9, 2020 at 5:59 AM <github@...> wrote:

I had to create a test environment (based on Vagrant) over the last couple of days, and I did this with Ubuntu 20.04 as the base image.

Is the repository https://github.com/iovisor/vagrant still active?
If so, I would create a PR to update it.


[vagrant] accept PR to bring iovisor/vagrant to ubuntu 20.04 (from ubuntu 14.04)

github@...
 

I had to create a test environment (based on Vagrant) over the last couple of days, and I did this with Ubuntu 20.04 as the base image.

Is the repository https://github.com/iovisor/vagrant still active?
If so, I would create a PR to update it.


Re: Tracepoint/Kprobe for tracking inbound connections

Forrest Chen
 

You can attach a kprobe to 'tcp_conn_request' for inbound connections.

--
forrest0579@...


Re: Tracepoint/Kprobe for tracking inbound connections

Yonghong Song
 

On Tue, Sep 29, 2020 at 4:14 AM Kanthi P <Pavuluri.kanthi@...> wrote:

Hi,

I am looking to track inbound connections on a system using tracepoints/kprobes.

I was looking at "trace_inet_sock_set_state", which lets us track state changes during connection establishment and closure. Tracking total connections seems straightforward, but we only want inbound ones. One way would be to find the IP addresses/ports the node listens on and, while tracking state changes, check whether the local address/port matches one of them to decide whether a connection is inbound. This seems a bit roundabout to me, so I am asking for suggestions on a simpler approach.

Another way is to store the socket address when the TCP_SYN_RECV to TCP_ESTABLISHED state change happens and, on closure, check whether the closing socket is one of the stored ones, so we know it is an inbound connection. But this would make the map grow too large, as we have about 50k concurrent connections.

Can you suggest a better way to do this?

Maybe you can use sk_local_storage? You can attach a piece of
information to the socket during TCP_SYN_RECV, check that data later
during TCP_ESTABLISHED, and delete it from the socket once you no
longer need it, all in the bpf program.


Thanks,
Kanthi
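
The bookkeeping suggested in this reply (tag the socket at TCP_SYN_RECV, check the tag at TCP_ESTABLISHED, delete it afterwards) can be modelled in plain user-space C. This is only a sketch of the idea, not a BPF program: sock_model, inbound_counter, and on_state_change are hypothetical names, and the boolean field merely stands in for whatever data would be attached through socket-local storage.

```c
#include <stdbool.h>

/* State numbers as in the kernel's TCP state enum (only the ones used here). */
enum {
    TCP_ESTABLISHED = 1,
    TCP_SYN_SENT = 2,
    TCP_SYN_RECV = 3,
    TCP_FIN_WAIT1 = 4,
    TCP_CLOSE = 7,
};

/* Hypothetical stand-in for a socket carrying one piece of attached data. */
struct sock_model {
    bool inbound_tag;
};

/* Running count of established inbound connections. */
struct inbound_counter {
    long established;
};

/* Called once per state change of a given socket. */
static void on_state_change(struct inbound_counter *c, struct sock_model *sk,
                            int oldstate, int newstate)
{
    /* Attach the tag when the socket enters TCP_SYN_RECV (passive open). */
    if (newstate == TCP_SYN_RECV)
        sk->inbound_tag = true;

    /* Check the tag on establishment: only tagged sockets count as inbound. */
    if (newstate == TCP_ESTABLISHED && sk->inbound_tag)
        c->established++;

    /* On leaving TCP_ESTABLISHED, uncount the socket and delete its tag. */
    if (oldstate == TCP_ESTABLISHED && sk->inbound_tag) {
        c->established--;
        sk->inbound_tag = false;
    }
}
```

Because the tag lives with the socket, no separate map keyed by socket address is needed, which is the point of the suggestion for a system with many concurrent connections.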


Tracepoint/Kprobe for tracking inbound connections

Kanthi P
 

Hi,

I am looking to track inbound connections on a system using tracepoints/kprobes.

I was looking at "trace_inet_sock_set_state", which lets us track state changes during connection establishment and closure. Tracking total connections seems straightforward, but we only want inbound ones. One way would be to find the IP addresses/ports the node listens on and, while tracking state changes, check whether the local address/port matches one of them to decide whether a connection is inbound. This seems a bit roundabout to me, so I am asking for suggestions on a simpler approach.

Another way is to store the socket address when the TCP_SYN_RECV to TCP_ESTABLISHED state change happens and, on closure, check whether the closing socket is one of the stored ones, so we know it is an inbound connection. But this would make the map grow too large, as we have about 50k concurrent connections.

Can you suggest a better way to do this?

Thanks,
Kanthi


Re: Load BPF program at boot-time?

Yonghong Song
 

On Sun, Sep 6, 2020 at 7:55 AM Shung-Hsi Yu <yu@...> wrote:

Hi,

Is it possible to load a BPF program at boot time?

It is possible. See the patch below:
https://lore.kernel.org/bpf/20200819042759.51280-1-alexei.starovoitov@gmail.com/

I tried to load a BPF program and pin it in bpffs system. The system could
be extended to load bpf program, even attach it if other subsystem is ready.
But this needs kernel work.

What I'm trying to achieve is to trace every single call to a certain
function since the kernel starts, without missing anything.

More specifically, I'm trying to debug iommu_alloc failures by looking
at the stacktrace to find out which subsystem/driver allocated too
many IOMMU slots on a ppc64le system, which I do not have direct
access to.

I've considered writing a systemd unit file that loads a BPF program
before the sysinit target[1], but I'm not sure if that's early enough.
An alternative seems to be to use boot-time tracing with ftrace[2]
instead (which I ended up doing), but it requires recompiling the kernel
in order to add tracepoints to retrieve the function call arguments,
and there isn't an easy way to stop tracing to prevent the tracing
buffer from overflowing (I ended up writing a systemd unit file that
sets an ftrace event trigger that turns off tracing).

A bpf program seems a good choice here since it can store arbitrary
data in its maps and, based on the tracing state, it can stop tracing.

There are still some potential issues: ideally you would not recompile
the kernel at all, only change and recompile the bpf programs, and
rebooting would just work, but that is not available today. I guess this
can probably be improved. If you are interested, please take a look
at the above patch; you may be able to improve the kernel to cover your
use case.


Maybe there is a better way to do something like this?


Much thanks,
Shung-Hsi Yu

[1]: https://www.freedesktop.org/software/systemd/man/bootup.html
[2]: https://www.kernel.org/doc/html/latest/trace/boottime-trace.html



Load BPF program at boot-time?

Shung-Hsi Yu
 

Hi,

Is it possible to load a BPF program at boot time?
What I'm trying to achieve is to trace every single call to a certain
function since the kernel starts, without missing anything.

More specifically, I'm trying to debug iommu_alloc failures by looking
at the stacktrace to find out which subsystem/driver allocated too
many IOMMU slots on a ppc64le system, which I do not have direct
access to.

I've considered writing a systemd unit file that loads a BPF program
before the sysinit target[1], but I'm not sure if that's early enough.
An alternative seems to be to use boot-time tracing with ftrace[2]
instead (which I ended up doing), but it requires recompiling the kernel
in order to add tracepoints to retrieve the function call arguments,
and there isn't an easy way to stop tracing to prevent the tracing
buffer from overflowing (I ended up writing a systemd unit file that
sets an ftrace event trigger that turns off tracing).

Maybe there is a better way to do something like this?


Much thanks,
Shung-Hsi Yu

[1]: https://www.freedesktop.org/software/systemd/man/bootup.html
[2]: https://www.kernel.org/doc/html/latest/trace/boottime-trace.html


Re: Reading Pinned maps in eBPF Programs

Andrii Nakryiko
 

On Mon, Aug 31, 2020 at 12:03 PM Ian <icampbe14@...> wrote:

Interestingly enough, adding just -g in my Makefile built the BPF programs and allowed the BTF section to be found and properly loaded. My BPF program was loaded and is running properly with my desired functionality. I am confused, though, as to why the -g flag fixed this problem. According to the clang man page:

-g Generate debug information.

Is BTF information considered debug information? Is that in general or only in this case? Is this unexpected behavior? Perhaps a bug in how Clang builds BPF binaries without -g? It would seem to me that the BTF information should not be purged from a non -g binary. I am interested to hear your thoughts on this, Andrii!

It's expected right now. BTF started out as purely debug information,
but got elevated into pretty much a mandatory thing for modern BPF
applications. We've talked about making .BTF emitted without -g, but
that hasn't happened in Clang yet (there are some technical
difficulties).

Again, thank you so much for your help. There is no way I would have figured that out on my own.

Ian


Re: Reading Pinned maps in eBPF Programs

Ian
 

Interestingly enough, adding just -g in my Makefile built the BPF programs and allowed the BTF section to be found and properly loaded. My BPF program was loaded and is running properly with my desired functionality. I am confused, though, as to why the -g flag fixed this problem. According to the clang man page:
-g Generate debug information.
Is BTF information considered debug information? Is that in general or only in this case? Is this unexpected behavior? Perhaps a bug in how Clang builds BPF binaries without -g? It would seem to me that the BTF information should not be purged from a non -g binary. I am interested to hear your thoughts on this, Andrii!

Again, thank you so much for your help. There is no way I would have figured that out on my own. 

Ian


Re: Reading Pinned maps in eBPF Programs

Andrii Nakryiko
 

On Sun, Aug 30, 2020 at 4:35 PM Ian <icampbe14@...> wrote:

Hello,

Here are the libbpf logs at all levels for the opensnoop program when using the pinning option for a map. This was tested on Linux kernel v5.4 with libbpf 0.0.9, 0.1.0, and the current version. The logs were the same in every case, so I have posted only a single copy here. Let me know what you think and what the next steps might be! I appreciate the help and am having a good time trying to piece this together.
[...]

libbpf: section(14) .rel.eh_frame, size 32, link 15, flags 0, type=9
libbpf: skip relo .rel.eh_frame(14) for section(13)
libbpf: section(15) .symtab, size 408, link 1, flags 0, type=2
libbpf: BTF is required, but is missing or corrupted.

Ok, this is a very different issue than the kernel missing BTF. libbpf
is complaining that your opensnoop.bpf.o itself is missing BTF. And
right, BTF is required to parse map definitions properly, but it
doesn't depend on having kernel support for BTF at all. Make sure you
use a recent enough Clang (v10+) and you build your opensnoop.bpf.o with
-target bpf **and** the -g flag to generate debug info (including the
.BTF ELF section).


Ian


Re: Reading Pinned maps in eBPF Programs

Ian
 

Hello, 

Here are the libbpf logs at all levels for the opensnoop program when using the pinning option for a map. This was tested on Linux kernel v5.4 with libbpf 0.0.9, 0.1.0, and the current version. The logs were the same in every case, so I have posted only a single copy here. Let me know what you think and what the next steps might be! I appreciate the help and am having a good time trying to piece this together.

libbpf: loading bpf-library/bpf_objs/opensnoop.bpf.o
libbpf: section(1) .strtab, size 289, link 0, flags 0, type=3
libbpf: skip section(1) .strtab
libbpf: section(2) .text, size 0, link 0, flags 6, type=1
libbpf: skip section(2) .text
libbpf: section(3) tracepoint/syscalls/sys_enter_openat, size 1632, link 0, flags 6, type=1
libbpf: found program tracepoint/syscalls/sys_enter_openat
libbpf: section(4) .reltracepoint/syscalls/sys_enter_openat, size 32, link 15, flags 0, type=9
libbpf: section(5) tracepoint/syscalls/sys_enter_open, size 1368, link 0, flags 6, type=1
libbpf: found program tracepoint/syscalls/sys_enter_open
libbpf: section(6) .reltracepoint/syscalls/sys_enter_open, size 32, link 15, flags 0, type=9
libbpf: section(7) .data, size 4, link 0, flags 3, type=1
libbpf: section(8) maps, size 20, link 0, flags 3, type=1
libbpf: section(9) .rodata.str1.1, size 9, link 0, flags 32, type=1
libbpf: skip section(9) .rodata.str1.1
libbpf: section(10) version, size 4, link 0, flags 3, type=1
libbpf: kernel version of bpf-library/bpf_objs/opensnoop.bpf.o is 50422
libbpf: section(11) license, size 4, link 0, flags 3, type=1
libbpf: license of bpf-library/bpf_objs/opensnoop.bpf.o is GPL
libbpf: section(12) .maps, size 40, link 0, flags 3, type=1
libbpf: section(13) .eh_frame, size 80, link 0, flags 2, type=1
libbpf: skip section(13) .eh_frame
libbpf: section(14) .rel.eh_frame, size 32, link 15, flags 0, type=9
libbpf: skip relo .rel.eh_frame(14) for section(13)
libbpf: section(15) .symtab, size 408, link 1, flags 0, type=2
libbpf: BTF is required, but is missing or corrupted.
 
Ian
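
The log above reports "kernel version of ... is 50422". Assuming libbpf prints this value in hexadecimal (the log itself does not say so), it can be decoded with the same packing the kernel's KERNEL_VERSION(a, b, c) macro uses, (a << 16) + (b << 8) + c. A small user-space C sketch; the struct and function names are made up for this example.

```c
#include <stdint.h>

/* Hypothetical holder for the three version components. */
struct kernel_version {
    unsigned int major;
    unsigned int minor;
    unsigned int patch;
};

/* Inverse of KERNEL_VERSION(a, b, c) = (a << 16) + (b << 8) + c. */
static struct kernel_version decode_kernel_version(uint32_t code)
{
    struct kernel_version v;
    v.major = code >> 16;
    v.minor = (code >> 8) & 0xff;
    v.patch = code & 0xff;
    return v;
}
```

Under that assumption, decode_kernel_version(0x50422) gives 5.4.34, which is consistent with the v5.4 kernel mentioned in the message.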


Re: Reading Pinned maps in eBPF Programs

Andrii Nakryiko
 

On Thu, Aug 27, 2020 at 6:55 AM Ian <icampbe14@...> wrote:

Hey Andrii,

I tried using the same BPF program with declarative pinning of maps with libbpf v0.0.9, v0.1.0, and the current master branch at commit 7bc52e6. All of these produced the same error about BTF being required. I will update this post with the libbpf debug messages once I figure out how to set those up/find them! Is there anything else you might need from me?

Check example [0] for how to set a custom logging callback and print all
libbpf logs (including those at DEBUG level of verbosity).

[0] https://github.com/iovisor/bcc/blob/master/libbpf-tools/runqslower.c#L136



By the way, thank you so much for all your help!

You are welcome!


Ian


Re: Reading Pinned maps in eBPF Programs

Ian
 

Hey Andrii, 

I tried using the same BPF program with declarative pinning of maps with libbpf v0.0.9, v0.1.0, and the current master branch at commit 7bc52e6. All of these produced the same error about BTF being required. I will update this post with the libbpf debug messages once I figure out how to set those up/find them! Is there anything else you might need from me?

 

By the way, thank you so much for all your help! 

Ian


Re: Reading Pinned maps in eBPF Programs

Andrii Nakryiko
 

On Wed, Aug 26, 2020 at 6:54 AM Tristan Mayfield
<mayfieldtristan@...> wrote:

I wanted to chime in and mention that I've seen the BTF error before when trying to declare maps the way shown in https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/test_pinning.c.

I have tested kernels 4.15 and 5.4 (vanilla Ubuntu 18.04 and 20.04 respectively) and both have the same issue. Looking through libbpf, it looks like the call would be coming from:

bpf_object__open() -> __bpf_object__open() -> bpf_object__elf_collect() -> bpf_object__init/finalize_btf()

I haven't run it through a debugger yet to verify that this is the issue, but I have reproduced it with the opensnoop code Ian posted.
I'm not sure why the deprecated style of map declaration doesn't trigger this BTF workflow while the newer one does, but I'll look through and debug today, and if I can find the cause I'll send out a message. I'd be interested to know whether the code above is doing something that triggers the reliance on BTF, though.

Which version of libbpf are you seeing this on? We've had bugs in
libbpf where we'd attempt to load kernel BTF unnecessarily, but I
believe we've fixed all those issues. Can you please double-check with
the latest released libbpf and see if that's still happening? If it is,
could you provide a repro and full libbpf debug logs for me to
investigate? Thanks!

Tristan


Re: Reading Pinned maps in eBPF Programs

Andrii Nakryiko
 

On Sun, Aug 23, 2020 at 12:36 PM Ian <icampbe14@...> wrote:

Hello! Sorry for the wait, I just started back at uni and things are a little bit crazy around here!

Anyway, this is the source code for my version of opensnoop, which is what I have been testing with. It does not contain the changes for map reading. My goal is to have this opensnoop program open/read a map with one element after it gets the PID, so it can compare them. It is also worth noting that I am tracking both open and openat within the same file.
[...]

I don't see anything needing kernel BTF in there, so if libbpf still
fails on not being able to load kernel BTF, that might be a bug in
libbpf. Can you please double-check this with the latest released (or
just plain latest) libbpf and if that's still happening, please
provide debug-level logs from libbpf. Thank you!



Re: Reading Pinned maps in eBPF Programs

Tristan Mayfield
 

I wanted to chime in and mention that I've seen the BTF error before when trying to declare maps the way shown in https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/test_pinning.c.

I have tested kernels 4.15 and 5.4 (vanilla Ubuntu 18.04 and 20.04 respectively) and both have the same issue. Looking through libbpf, it looks like the call would be coming from:

bpf_object__open() -> __bpf_object__open() -> bpf_object__elf_collect() -> bpf_object__init/finalize_btf()

I haven't run it through a debugger yet to verify that this is the issue, but I have reproduced it with the opensnoop code Ian posted.
I'm not sure why the deprecated style of map declaration doesn't trigger this BTF workflow while the newer one does, but I'll look through and debug today, and if I can find the cause I'll send out a message. I'd be interested to know whether the code above is doing something that triggers the reliance on BTF, though.

Tristan


Re: Reading Pinned maps in eBPF Programs

Ian
 

Hello! Sorry for the wait, I just started back at uni and things are a little bit crazy around here!

Anyway, this is the source code for my version of opensnoop, which is what I have been testing with. It does not contain the changes for map reading. My goal is to have this opensnoop program open/read a map with one element after it gets the PID, so it can compare them. It is also worth noting that I am tracking both open and openat within the same file.

#include <linux/bpf.h>   // BPF UAPI definitions that ship with the OS
#include "bpf_helpers.h" // bpf_helper functions
#include <linux/version.h>

// For navigating the task struct
#include <linux/sched.h>
#include <linux/nsproxy.h>
#include <linux/pid_namespace.h>
#include <linux/ns_common.h>

#define MAX_CPUS 4

/**
 * Struct to pass data to the perf buffer
 */
#pragma pack(1)
struct opensnoop_data_t {
    u32 pid;
    u32 tgid;
    char program_name[16]; // max comm length is 16
    char file[255];
    u32 namespace;
    u64 time;
};

struct sys_enter_openat_args {
    long long pad;
    long __syscall_nr;
    long dfd;
    const char *filename;
    long flags;
    long mode;
};

struct sys_enter_open_args {
    long long pad;
    long __syscall_nr;
    const char *filename;
    long flags;
    long mode;
};

/**
 * Using the magic macro SEC this struct declares
 * and creates a new bpf map of a type PERF that we
 * can use to pass data to userspace
 */
struct bpf_map_def SEC("maps") opensnoop_events = {
    .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
    .key_size = sizeof(int),
    .value_size = sizeof(u32),
    .max_entries = MAX_CPUS,
};

SEC("tracepoint/syscalls/sys_enter_openat")
int bpf_prog(struct sys_enter_openat_args *ctx) {
    struct opensnoop_data_t data = {};
    // Note: the upper 32 bits of pid_tgid hold the tgid (the user-space
    // process ID); the lower 32 bits hold the kernel pid (the thread ID).
    data.pid = bpf_get_current_pid_tgid() >> 32; // upper half: tgid
    data.tgid = bpf_get_current_pid_tgid();      // lower half: kernel pid
    data.time = bpf_ktime_get_ns();

    bpf_get_current_comm(&data.program_name, sizeof(data.program_name)); // puts current comm into char array

    // bpf_probe_read_str returns a negative value on failure
    int err = bpf_probe_read_str(data.file, sizeof(data.file), ctx->filename);
    if (err < 0) {
        char msg[] = "Err: %d\n";
        bpf_trace_printk(msg, sizeof(msg), err);
    }

    struct task_struct *task = (struct task_struct *)bpf_get_current_task(); // sched.h

    struct nsproxy *nsprox = 0;      // nsproxy.h
    struct pid_namespace *pidns = 0; // pid_namespace.h
    struct ns_common *nsc = 0;       // ns_common.h
    struct ns_common n = {};
    // Reads task->nsproxy->pid_ns_for_children->ns.inum one pointer at a time
    data.namespace = ({
        typeof(unsigned int) _val;
        __builtin_memset(&_val, 0, sizeof(_val)); // set bytes to 0
        bpf_probe_read(&_val, sizeof(_val), &({
            typeof(struct pid_namespace *) _val;
            __builtin_memset(&_val, 0, sizeof(_val));
            bpf_probe_read(&_val, sizeof(_val), &({
                typeof(struct nsproxy *) _val;
                __builtin_memset(&_val, 0, sizeof(_val));
                bpf_probe_read(&_val, sizeof(_val), &task->nsproxy);
                _val;
            })->pid_ns_for_children);
            _val;
        })->ns.inum);
        _val;
    });

#ifdef DEBUG
    char debug_msg[] = "Tracepoint on syscalls/sys_enter_openat was called for process %d\n";
    bpf_trace_printk(debug_msg, sizeof(debug_msg), data.pid);
#endif

    bpf_perf_event_output(ctx, &opensnoop_events, BPF_F_CURRENT_CPU /*run on current cpu*/, &data, sizeof(data));

    return 0;
}

SEC("tracepoint/syscalls/sys_enter_open")
int sys_enter_open_prog(struct sys_enter_open_args *ctx) {
    struct opensnoop_data_t data = {};

    data.pid = bpf_get_current_pid_tgid() >> 32; // upper half: tgid
    data.tgid = bpf_get_current_pid_tgid();      // lower half: kernel pid
    data.time = bpf_ktime_get_ns();

    bpf_get_current_comm(&data.program_name, sizeof(data.program_name)); // puts current comm into char array

    // bpf_probe_read_str returns a negative value on failure
    int err = bpf_probe_read_str(data.file, sizeof(data.file), ctx->filename);
    if (err < 0) {
        char msg[] = "Err: %d\n";
        bpf_trace_printk(msg, sizeof(msg), err);
    }

#ifdef DEBUG
    char debug_msg[] = "Tracepoint on syscalls/sys_enter_open was called for process %d\n";
    bpf_trace_printk(debug_msg, sizeof(debug_msg), data.pid);
#endif

    bpf_perf_event_output(ctx, &opensnoop_events, BPF_F_CURRENT_CPU /*run on current cpu*/, &data, sizeof(data));

    return 0;
}

u32 _version SEC("version") = LINUX_VERSION_CODE;
char _license[] SEC("license") = "GPL"; // needed to use GPL-only BPF helpers
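
Both programs above call bpf_get_current_pid_tgid(), which packs two IDs into one 64-bit value: the thread group ID (what user space calls the PID) in the upper 32 bits and the kernel's per-thread pid in the lower 32 bits. The unpacking can be sketched in user-space C as follows; task_ids and split_pid_tgid are names made up for this example.

```c
#include <stdint.h>

/* Hypothetical holder for the two halves of the packed value. */
struct task_ids {
    uint32_t tgid; /* thread group ID: the process ID seen from user space */
    uint32_t pid;  /* kernel pid: the thread ID */
};

/* Split a value laid out as (tgid << 32) | pid into its two halves. */
static struct task_ids split_pid_tgid(uint64_t pid_tgid)
{
    struct task_ids ids;
    ids.tgid = (uint32_t)(pid_tgid >> 32);
    ids.pid = (uint32_t)pid_tgid;
    return ids;
}
```

For a single-threaded process the two halves are equal, so a mix-up between them only shows up once threads are involved.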
