
Re: Future of BCC Python tools

Alexei Starovoitov
 

On Mon, Oct 26, 2020 at 3:34 PM Brendan Gregg <brendan.d.gregg@...> wrote:

> G'Day all,
>
> I have colleagues working on BCC Python tools (e.g., the recent
> enhancement of tcpconnect.py) and I'm wondering, given libbpf tools,
> what our advice should be.
>
> - Should we keep both Python and libbpf tools in sync?
> - Should we focus on libbpf only, and leave Python versions for legacy systems?

bcc python is still used by many where they need on the fly compilation.
Such cases still exist. One example is USDT support.
The libbpf and CO-RE support for USDT is still wip.
So such cases have to continue using bcc style with llvm.
The number of such cases is gradually reducing.
I think right now anyone who starts with bpf should be all set with
libbpf, BTF and CO-RE. It's much better suited for embedded setups too.
So I think bcc as a go-to place is still a great framework, but adding
a new python based tool is probably not the best investment of time
for the noobs. Experienced folks who already learned py-bcc will
keep hacking their scripts in python. That's all great.
Noobs should probably learn bpftrace for quick experiments
and libbpf-tools for standalone long-term tried-and-true tools.

Should we keep libbpf-tools and py-bcc tools in sync?
I think py tools where libbpf-tools replacement is complete could be
moved into 'deprecated' directory and not installed by default.
All major distros are built with CONFIG_DEBUG_INFO_BTF=y
so the users won't be surprised. Their favorite tools will keep
working as-is. The underlying implementation of them will quietly change.
We can document it of course, but who reads docs.


Future of BCC Python tools

Brendan Gregg
 

G'Day all,

I have colleagues working on BCC Python tools (e.g., the recent
enhancement of tcpconnect.py) and I'm wondering, given libbpf tools,
what our advice should be.

- Should we keep both Python and libbpf tools in sync?
- Should we focus on libbpf only, and leave Python versions for legacy systems?

I like the tweak-ability of the Python tools: sometimes I'm on a
production instance and I'll copy a tool and edit it on the fly. That
won't work with libbpf. Although, we also install all the bpftrace
tools on our prod instances [0], and if I'm editing tools I start with
them.

However, the llvm dependency of the Python tools is a pain, and an
obstacle for making bcc tools a default install with different
distros. I could imagine having a selection of the top 10 libbpf tools
as a package (bcc-essentials), which would be about 1.5 Mbytes (the last
time I built a libbpf tool it was 150 Kbytes stripped), and getting that
installed by default by different distros. (Ultimately, I want a
lightweight bpftrace installed by default as well.)

So, I guess I'm not certain about the future of the BCC Python tools.
What do people think? If we agree that the Python tools are legacy, we
should update the README to let everyone know.

Note: I'm just talking about the tools (tools/*.py)! I imagine BCC
Python is currently used for many other BPF things, and I'm not
suggesting that go away.

Brendan

[0] https://github.com/Netflix-Skunkworks/bpftoolkit


execveat tracepoints issues

alessandro.gario@...
 

Hello everyone!

I am experiencing some issues with the execveat tracepoints, and was wondering if others could reproduce it or help me understand what I am doing wrong.

On Arch Linux (kernel 5.9.1, perf 5.7.g3d77e6a8804a), both sys_enter_execveat and sys_exit_execveat never seem to report any event.

On Ubuntu 20.04 (kernel 5.4.0, perf 5.4.65), sys_enter_execveat will work provided there is no one else making use of that tracepoint, while sys_exit_execveat is always completely silent.

I traced the program I am using to test this with strace and verified that execveat is being called correctly. The following is the source code for that program:

---
#include <unistd.h>
#include <linux/fcntl.h>
#include <linux/unistd.h>

int main() {
    syscall(__NR_execveat, AT_FDCWD,
            "/usr/bin/bash", nullptr,
            nullptr, 0);

    return 0;
}
---

Here's a recording of what I'm experiencing on Ubuntu: https://asciinema.org/a/6EiDfoOpK1AYcDm7aPftrYqdo

Thanks for your help!

Alessandro Gario


Re: Minimum LLVM version for bcc

Yonghong Song
 

On Wed, Oct 21, 2020 at 8:57 AM Dale Hamel <daleha@...> wrote:

> Does the LLVM version used by bcc matter, for packaging purposes?

This is a good question. For packaging purposes, no, it does not matter
much. The people who build packages can choose whatever is
available to them. bcc is supposed to work with all major
llvm releases since llvm 3.7.


> I assume bcc includes some static libraries from LLVM, so I'm curious if the older versions are acceptable. For instance, on ubuntu 16.04, we use LLVM 3.7, but on ubuntu 18.04 and 20.04 it uses LLVM 6.0, based on the current debian control file.

This is probably due to historical reasons.


> Are there features of newer LLVM releases that we need? For example, does BTF require a specific minimum version of LLVM? If this is the case, perhaps we should update the dependency descriptions in the debian control file to reflect this.

For BTF support, llvm >= 10 is best. For testing purposes, we may still
want to keep an option to build with older llvm versions.


Minimum LLVM version for bcc

Dale Hamel
 

Does the LLVM version used by bcc matter, for packaging purposes?

I assume bcc includes some static libraries from LLVM, so I'm curious if the older versions are acceptable. For instance, on ubuntu 16.04, we use LLVM 3.7, but on ubuntu 18.04 and 20.04 it uses LLVM 6.0, based on the current debian control file.

Are there features of newer LLVM releases that we need? For example, does BTF require a specific minimum version of LLVM? If this is the case, perhaps we should update the dependency descriptions in the debian control file to reflect this.

-Dale


Re: [Ext] Re: [iovisor-dev] Questions about current eBPF usages

Jiada Tu
 

Thank you very much, Yonghong! Those are very helpful.


Re: [Ext] Re: [iovisor-dev] Questions about current eBPF usages

Yonghong Song
 

On Thu, Oct 15, 2020 at 11:03 PM Jiada Tu <jtu3@...> wrote:

> Thanks a lot, Yonghong. From your response:
>
>> In your case, the bpf program is to influence io scheduling decisions.
>> You could implement in a way to do kernel data structure write in
>> kernel but have a hook to a bpf program to make decision and based on
>> bpf program return value, kernel can decide what to schedule.
>
> 1. How can I make a kernel function use the return value of an eBPF program/function?

For example, in kernel/events/core.c, the perf event overflow handler has:

        rcu_read_lock();
        ret = BPF_PROG_RUN(event->prog, &ctx);
        rcu_read_unlock();
out:
        __this_cpu_dec(bpf_prog_active);
        if (!ret)
                return;

        event->orig_overflow_handler(event, data, regs);

The `ret` above is the return value from the bpf program.
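
As a rough user-space analogy of the pattern above (this is not kernel code), a caller can run an optional hook and use its return value to decide whether to continue to the default handler. The names `io_request`, `policy_hook_t`, `handle_request`, `default_dispatch`, and `high_priority_only` are hypothetical and exist only for this sketch:

```c
#include <stddef.h>

/* Hypothetical context handed to the policy hook. */
struct io_request {
    int priority;   /* higher means more urgent */
    int is_write;
};

/* Hook type: return nonzero to let the default path run, 0 to skip it. */
typedef int (*policy_hook_t)(const struct io_request *req);

/* Counts requests that reached the default path. */
int dispatched;

void default_dispatch(const struct io_request *req)
{
    (void)req;
    dispatched++;
}

/* Same shape as the overflow handler: run the hook, then act on ret. */
void handle_request(policy_hook_t hook, const struct io_request *req)
{
    int ret = 1;    /* no hook attached: always take the default path */

    if (hook)
        ret = hook(req);
    if (!ret)
        return;     /* the hook vetoed this request */

    default_dispatch(req);
}

/* Example policy: only let requests with priority >= 5 through. */
int high_priority_only(const struct io_request *req)
{
    return req->priority >= 5;
}
```

Calling `handle_request(high_priority_only, &req)` only reaches `default_dispatch` when the policy returns nonzero. In the kernel case the decision function would be a bpf program, and the kernel code around the hook would keep ownership of any data structure writes.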


> 2. A KProbes-related question: an old article, https://lwn.net/Articles/132196/, written in 2005, says:
> ```
> The current KProbes implementation, however, introduces some latency of its own in handling probes. The cause behind this latency is the single kprobe_lock which serializes the execution of probes across all CPUs on a SMP machine.
> ```
> I read it as "functions (e.g., eBPF functions) attached to KProbes are executed in serial, i.e., the same eBPF function can not be run by multiple threads at the same time". As eBPF programs frequently use KProbes to hook to kernel functions, do you know if it's true currently that the calling of an eBPF function/program is single-threaded?

The article is from 2005, and I am not sure whether this serialization of
kprobes across all CPUs is still true. The bpf subsystem won't prevent
programs from executing on all cpus in parallel if the kprobe subsystem
allows it.

We recently added kfunc-based probing. It is trampoline based, much
faster, and does not have this restriction.


Re: [Ext] Re: [iovisor-dev] Questions about current eBPF usages

Jiada Tu
 

Thanks a lot, Yonghong. From your response:

> In your case, the bpf program is to influence io scheduling decisions.
> You could implement in a way to do kernel data structure write in
> kernel but have a hook to a bpf program to make decision and based on
> bpf program return value, kernel can decide what to schedule.

1. How can I make a kernel function use the return value of an eBPF program/function?

2. A KProbes-related question: an old article, https://lwn.net/Articles/132196/, written in 2005, says:
```
The current KProbes implementation, however, introduces some latency of its own in handling probes. The cause behind this latency is the single kprobe_lock which serializes the execution of probes across all CPUs on a SMP machine.
```
I read it as "functions (e.g., eBPF functions) attached to KProbes are executed in serial, i.e., the same eBPF function can not be run by multiple threads at the same time". As eBPF programs frequently use KProbes to hook to kernel functions, do you know if it's true currently that the calling of a eBPF function/program is single-threaded?

 


Re: Questions about current eBPF usages

Yonghong Song
 

On Thu, Oct 15, 2020 at 4:06 PM Jiada Tu via lists.iovisor.org
<jtu3=hawk.iit.edu@...> wrote:

> Hello BPF community,
>
> I am looking for a way to move a user space program's disk I/O scheduling related logic down to kernel space, and then have the new kernel logic communicate with the user space program to make better I/O scheduling decisions. The reason that the user space program itself has I/O scheduling logic is because it needs to prioritize certain read or write requests.
>
> I started looking at eBPF for that purpose. After doing some research, I learned that eBPF is very good at kernel profiling and tracing, but I didn't find much information about modifying kernel functions / data-structure using eBPF.
>
> I am wondering:
>
> 1. Instead of calling eBPF function before / after calling a kernel function and then returning back to that kernel function, is it possible for eBPF programs to totally replace a kernel function or module logic?

Currently, no. The kernel has support for replacing a bpf program, but not
a kernel function. Replacing kernel functions can easily cause the kernel
to misbehave. There are some proposals to explicitly specify which
functions can be replaced, but this work is not done yet.


> 2. Is it possible for eBPF programs to tamper with the parameters and return value of a kernel function, or can eBPF programs only read kernel data structures and not modify them? (Some searching indicates that it was not possible a few years ago, but I am not sure whether that has changed recently.)

No for input parameters.
Yes for return values in certain cases. For any kernel function
annotated with ALLOW_ERROR_INJECTION, you can attach a bpf program to
that function to change its return value.

All tracing programs can read kernel data structures today with
bpf_probe_read, or with direct memory access (similar to
bpf_probe_read) in later kernels.
Writing to kernel data structures has to be done extremely carefully,
as it can easily crash the kernel or cause it to misbehave. It has to be
done in a controlled way, e.g., in networking, through specific
helpers.

In your case, the bpf program is to influence io scheduling decisions.
You could implement in a way to do kernel data structure write in
kernel but have a hook to a bpf program to make decision and based on
bpf program return value, kernel can decide what to schedule.



> Thank you!
> Jiada


Questions about current eBPF usages

Jiada Tu
 

Hello BPF community,

I am looking for a way to move a user space program's disk I/O scheduling related logic down to kernel space, and then have the new kernel logic communicate with the user space program to make better I/O scheduling decisions. The reason that the user space program itself has I/O scheduling logic is because it needs to prioritize certain read or write requests.

I started looking at eBPF for that purpose. After doing some research, I learned that eBPF is very good at kernel profiling and tracing, but I didn't find much information about modifying kernel functions / data-structure using eBPF.

I am wondering:

1. Instead of calling eBPF function before / after calling a kernel function and then returning back to that kernel function, is it possible for eBPF programs to totally replace a kernel function or module logic?

2. Is it possible for eBPF programs to tamper with the parameters and return value of a kernel function, or can eBPF programs only read kernel data structures and not modify them? (Some searching indicates that it was not possible a few years ago, but I am not sure whether that has changed recently.)


Thank you!
Jiada


Re: Tracepoint/Kprobe for tracking inbound connections

Yonghong Song
 

On Wed, Oct 14, 2020 at 11:57 AM Kanthi P <Pavuluri.kanthi@...> wrote:

> Nice, thanks Song. I am actually looking to track it till it is closed, so I might have to remove that tag when the socket goes to the closed state.
> And once I have the concurrent connections info, say in a map, I am using XDP to drop the connections after they reach a threshold.
>
> So I also wanted to ask if there is any way I can read the concurrent connections in XDP, since the kernel already keeps track of them at /proc/net/tcp*?
> That would help me avoid placing another tracepoint to track the connection count.

XDP only handles raw packets. There is no skb or other metadata
available at that point.
You either need to track the connections yourself or add another skb- or sk-level hook.


Appreciate your help!

Thanks,
Kanthi

On Thu, Oct 1, 2020 at 11:26 AM Y Song <ys114321@...> wrote:

On Tue, Sep 29, 2020 at 4:14 AM Kanthi P <Pavuluri.kanthi@...> wrote:

Hi,

I am looking to track inbound connections on a system using tracepoints/kprobes.

I was checking "trace_inet_sock_set_state", with which we can track the state changes during connection establishment and closure. Tracking total connections seems straightforward, but we only want inbound ones. One way would be to look at the IP addresses/ports the node listens on and, while tracking the state changes, check whether the local address/port matches one of them to decide whether it's an inbound connection. This looks a bit roundabout to me, so I thought of reaching out for simpler suggestions.

Another way is to store the socket address when the TCP_SYN_RECV to TCP_ESTABLISHED state change happens, and during closure check whether it is for this socket, so we know it's an inbound connection. But this would make the map grow too large, as we have about 50k concurrent connections.

Can you suggest a better way to do this?
Maybe you can use sk_local_storage? You can attach a piece of
information to the socket during TCP_SYN_RECV and later on, during
TCP_ESTABLISHED, check that data. You can delete that data from
the socket once you no longer need it,
all in the bpf program.
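
To make the bookkeeping concrete, here is a minimal user-space simulation of the tagging idea. It is not BPF code and does not use the real socket-local storage API: `struct sim_sock`, its `inbound` field, `inbound_count`, and `on_state_change` are hypothetical names for this sketch, with the `inbound` field standing in for the per-socket storage slot. It also assumes that a passive open shows up as a TCP_SYN_RECV to TCP_ESTABLISHED transition, as described in the question.

```c
#include <stdbool.h>

/* Values as in the kernel's include/net/tcp_states.h. */
enum {
    TCP_ESTABLISHED = 1,
    TCP_SYN_SENT    = 2,
    TCP_SYN_RECV    = 3,
    TCP_CLOSE       = 7
};

/* Stand-in for a kernel socket; `inbound` plays the role of the
 * per-socket storage slot (hypothetical field, not a kernel member). */
struct sim_sock {
    bool inbound;
};

/* Number of currently open inbound connections. */
long inbound_count;

/* Called once per simulated state transition of a socket. */
void on_state_change(struct sim_sock *sk, int oldstate, int newstate)
{
    if (oldstate == TCP_SYN_RECV && newstate == TCP_ESTABLISHED) {
        sk->inbound = true;     /* the tag travels with the socket */
        inbound_count++;
    } else if (newstate == TCP_CLOSE && sk->inbound) {
        sk->inbound = false;    /* drop the tag when the socket closes */
        inbound_count--;
    }
}
```

Because the tag lives with the socket rather than in a separate map keyed by address, nothing has to be looked up by address at close time, and only one counter is shared.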


Thanks,
Kanthi


Re: Tracepoint/Kprobe for tracking inbound connections

Kanthi P <Pavuluri.kanthi@...>
 

Thanks Forrest!


On Wed, Oct 7, 2020 at 1:03 PM Forrest Chen <forrest0579@...> wrote:
You can attach a kprobe to 'tcp_conn_request' for inbound connections.

--
forrest0579@...






Re: Tracepoint/Kprobe for tracking inbound connections

Kanthi P <Pavuluri.kanthi@...>
 

Nice, thanks Song. I am actually looking to track it till it is closed, so might have to remove that tag when the socket goes to closed state.
And once I have the concurrent connections info, say in a map, I am using XDP to drop the connections after they reach a threshold.
 
So also wanted to ask if there is any way I can read the concurrent connections in XDP since the kernel already keeps track of them at /proc/net/tcp*?
That would help me avoid placing another tracepoint to track the connection count.
 
Appreciate your help!
 
Thanks,
Kanthi

On Thu, Oct 1, 2020 at 11:26 AM Y Song <ys114321@...> wrote:

On Tue, Sep 29, 2020 at 4:14 AM Kanthi P <Pavuluri.kanthi@...> wrote:
>
> Hi,
>
> I am looking to track inbound connections on a system using tracepoints/kprobes.
>
> I was checking "trace_inet_sock_set_state", with which we can track the state changes during connection establishment and closure. Tracking total connections seems straightforward, but we only want inbound ones. One way would be to look at the IP addresses/ports the node listens on and, while tracking the state changes, check whether the local address/port matches one of them to decide whether it's an inbound connection. This looks a bit roundabout to me, so I thought of reaching out for simpler suggestions.
>
> Another way is to store the socket address when the TCP_SYN_RECV to TCP_ESTABLISHED state change happens, and during closure check whether it is for this socket, so we know it's an inbound connection. But this would make the map grow too large, as we have about 50k concurrent connections.
>
> Can you suggest a better way to do this?

Maybe you can use sk_local_storage? You can attach a piece of
information to the socket during TCP_SYN_RECV and later on, during
TCP_ESTABLISHED, check that data. You can delete that data from
the socket once you no longer need it,
all in the bpf program.

>
> Thanks,
> Kanthi
>


Re: Question about inet_set_socket_state trace point

Raga lahari
 

Hi,


I am observing an established-connection counter discrepancy of about 20% (a mismatch of 30-40 connections out of 200) after one day, which builds to 30% by day 2, and so on.


This observation is with this code:

TRACEPOINT_PROBE(sock, inet_sock_set_state) {
    if (args->newstate == TCP_ESTABLISHED)
        __sync_fetch_and_add(val, 1);

    if (args->oldstate == TCP_ESTABLISHED)
        __sync_fetch_and_add(val, -1);
}

There was a typo in my first message.

 


Regards,
Ragalahari


Re: Question about inet_set_socket_state trace point

Raga lahari
 

Hello,

Correcting typo in code snippet

<code>

TRACEPOINT_PROBE(sock, inet_sock_set_state) {
    if (args->newstate == TCP_ESTABLISHED)
        __sync_fetch_and_add(val, 1);

    if (args->oldstate == TCP_ESTABLISHED)
        __sync_fetch_and_add(val, -1);
}

</code>



Thanks & Regards,
Ragalahari


On Wed, Oct 14, 2020 at 10:35 AM Raga lahari <ragalahari.potti@...> wrote:

Hi everyone,


I am using the inet_sock_set_state tracepoint to get the current established connection count.

Here, I increment a counter value in a BPF map when the new state is TCP_ESTABLISHED and decrement it when the old state is TCP_ESTABLISHED.


But I observed that the map count has a discrepancy with what netstat shows. When we start the probe, it all looks fine, but when we leave it running for, say, 2-3 days, we see a difference, and this difference builds over time.

Can someone please help me see what I am missing?


<code>

TRACEPOINT_PROBE(sock, inet_sock_set_state) {
    if (args->newstate >= TCP_ESTABLISHED)
        __sync_fetch_and_add(val, 1);

    if (args->newstate >= TCP_ESTABLISHED)
        __sync_fetch_and_add(val, -1);
}

</code>


netstat -tanp  | grep -i "EST" | wc -l

Thanks,
Ragalahari


Re: Question about inet_set_socket_state trace point

Tristan Mayfield
 

Hi Ragalahari,

In your code you don't seem to check the old state before you decrement. It looks like you are adding 1 and then immediately subtracting 1 under the same condition. That might be your problem? You never stated what the difference between it and netstat is, so I can't be sure.
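
The effect can be reproduced in plain user-space C by replaying a few state transitions through both versions of the condition. This is only a sketch: `struct transition`, `count_as_posted`, and `count_corrected` are hypothetical names, a plain `long` stands in for the BPF map value, and the state numbers are the ones from the kernel's include/net/tcp_states.h.

```c
#include <stddef.h>

/* Values as in the kernel's include/net/tcp_states.h. */
enum {
    TCP_ESTABLISHED = 1,
    TCP_SYN_SENT    = 2,
    TCP_SYN_RECV    = 3,
    TCP_FIN_WAIT1   = 4,
    TCP_CLOSE       = 7
};

/* One inet_sock_set_state-style event (hypothetical struct for this sketch). */
struct transition {
    int oldstate;
    int newstate;
};

/* As first posted: both branches test newstate, so every event adds 1
 * and then subtracts 1 again. */
long count_as_posted(const struct transition *ev, size_t n)
{
    long val = 0;

    for (size_t i = 0; i < n; i++) {
        if (ev[i].newstate >= TCP_ESTABLISHED)
            val += 1;
        if (ev[i].newstate >= TCP_ESTABLISHED)
            val -= 1;
    }
    return val;
}

/* Corrected: +1 when a socket enters ESTABLISHED, -1 when it leaves. */
long count_corrected(const struct transition *ev, size_t n)
{
    long val = 0;

    for (size_t i = 0; i < n; i++) {
        if (ev[i].newstate == TCP_ESTABLISHED)
            val += 1;
        if (ev[i].oldstate == TCP_ESTABLISHED)
            val -= 1;
    }
    return val;
}
```

For two connections that become established and one that then starts closing, the posted condition leaves the counter at 0, while the corrected condition leaves it at 1.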

Tristan


Question about inet_set_socket_state trace point

Raga lahari
 

Hi everyone,


I am using the inet_sock_set_state tracepoint to get the current established connection count.

Here, I increment a counter value in a BPF map when the new state is TCP_ESTABLISHED and decrement it when the old state is TCP_ESTABLISHED.


But I observed that the map count has a discrepancy with what netstat shows. When we start the probe, it all looks fine, but when we leave it running for, say, 2-3 days, we see a difference, and this difference builds over time.

Can someone please help me see what I am missing?


<code>

TRACEPOINT_PROBE(sock, inet_sock_set_state) {
    if (args->newstate >= TCP_ESTABLISHED)
        __sync_fetch_and_add(val, 1);

    if (args->newstate >= TCP_ESTABLISHED)
        __sync_fetch_and_add(val, -1);
}

</code>


netstat -tanp  | grep -i "EST" | wc -l

Thanks,
Ragalahari


Re: [vagrant] accept PR to bring iovisor/vagrant to ubuntu 20.04 (from ubuntu 14.04)

Brenden Blanco
 

Sure I can accept a PR.

On Fri, Oct 9, 2020 at 5:59 AM <github@...> wrote:

I had to create a test environment (based on vagrant) over the last couple of days, and I did this with ubuntu 20.04 as the base image.

Is the repository https://github.com/iovisor/vagrant still active?
If yes, I would create a PR to update this repository.


[vagrant] accept PR to bring iovisor/vagrant to ubuntu 20.04 (from ubuntu 14.04)

github@...
 

I had to create a test environment (based on vagrant) over the last couple of days, and I did this with ubuntu 20.04 as the base image.

Is the repository https://github.com/iovisor/vagrant still active?
If yes, I would create a PR to update this repository.


Re: Tracepoint/Kprobe for tracking inbound connections

Forrest Chen
 

You can attach a kprobe to 'tcp_conn_request' for inbound connections.

--
forrest0579@...