
Running the CO-RE version of runqslower fails

ethercflow@...
 

I tried to run the CO-RE version of runqslower and it failed with the following errors:

libbpf: sched_wakeup is not found in vmlinux BTF
libbpf: failed to load object 'runqslower_bpf'
libbpf: failed to load BPF skeleton 'runqslower_bpf': -2
failed to load BPF object: -2
ENV
clang version 10.0.0-+rc2-1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Linux Kernel: 5.6.0-rc2+ (commit 8eece07c011f88da0ccf4127fca9a4e4faaf58ae)
CONFIG_CGROUP_BPF=y
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
CONFIG_IPV6_SEG6_BPF=y
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_BPFILTER=y
CONFIG_BPFILTER_UMH=m
CONFIG_NET_CLS_BPF=m
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_BPF_KPROBE_OVERRIDE=y
CONFIG_TEST_BPF=m

CONFIG_VIDEO_SONY_BTF_MPX=m
CONFIG_DEBUG_INFO_BTF=y

With gdb's help, I found that `btf__find_by_name_kind` returns -ENOENT.
I printed all the names (https://transfer.sh/ANKNs/log) and found that btf_trace_sched_wakeup doesn't exist.
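
A quick way to repeat this check is to search a textual BTF dump for the name. The sketch below assumes the dump comes from `bpftool btf dump file /sys/kernel/btf/vmlinux` in its default raw format, where each type is printed as `[id] KIND 'name' ...`. The helper name `btf_has_type` and the sample lines are illustrative; they are not taken from the log linked above.

```python
import re

# Matches one line of bpftool's raw BTF dump, e.g.
#   [3] TYPEDEF 'btf_trace_sched_switch' type_id=2
_BTF_LINE = re.compile(r"^\[(\d+)\]\s+(\w+)\s+'([^']*)'")


def btf_has_type(dump_text, name, kind=None):
    """Return True if `name` (optionally of the given kind) is in the dump."""
    for line in dump_text.splitlines():
        m = _BTF_LINE.match(line)
        if not m:
            continue
        _, line_kind, line_name = m.groups()
        if line_name == name and (kind is None or line_kind == kind):
            return True
    return False


# Illustrative sample shaped like bpftool output (not real kernel data).
SAMPLE = """\
[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[2] PTR '(anon)' type_id=1
[3] TYPEDEF 'btf_trace_sched_switch' type_id=2
"""

print(btf_has_type(SAMPLE, "btf_trace_sched_switch", "TYPEDEF"))
print(btf_has_type(SAMPLE, "btf_trace_sched_wakeup", "TYPEDEF"))
```

Feeding the real dump into `btf_has_type(..., "btf_trace_sched_wakeup", "TYPEDEF")` should agree with what gdb showed if the typedef is indeed missing from the kernel's BTF.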




Confused about wakeup watermark vs sample period when attaching to BPF program

Hayden Livingston
 

Please correct me if I'm wrong about anything.

When a perf_event is attached to a BPF program, and the BPF program is
going to do the processing and then output, what is the significance of
wakeup_events or wakeup_watermark for the original perf_event?

To me it seems like the BPF program will always run, but in the absence
of an mmap buffer on the original perf_event, does it matter?

Also, what should I set my BPF_OUTPUT wakeup to? Should I set it to a
large number? How can I get notified on my BPF output (not the
original perf event) every 5 seconds? Is that possible?
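
On the last question, one possible approach (an assumption, not something confirmed in this thread) is to bound the wait in user space: wait on the buffer's file descriptor with a timeout and drain whatever has accumulated when the timeout expires. The sketch below uses a pipe as a stand-in for the perf buffer fd, and `wait_readable` is a hypothetical helper name.

```python
import os
import select


def wait_readable(fd, timeout_s):
    """Block until fd is readable or timeout_s elapses; True if readable."""
    readable, _, _ = select.select([fd], [], [], timeout_s)
    return bool(readable)


# A pipe stands in for the perf buffer fd in this sketch.
read_fd, write_fd = os.pipe()

# Nothing written yet: the wait times out (0.05 s here, 5.0 in the question).
timed_out = not wait_readable(read_fd, 0.05)

# Once data arrives, the same call returns without waiting for the timeout.
os.write(write_fd, b"event")
got_data = wait_readable(read_fd, 0.05)
payload = os.read(read_fd, 16)

os.close(read_fd)
os.close(write_fd)
```

With a real perf buffer the loop would process the ring after each return, whether the call ended because of a wakeup or because the timeout expired.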


Re: Can multiple BPF programs use same per-cpu perf ring buffer?

Yonghong Song
 

On Sun, Feb 16, 2020 at 8:43 PM Hayden Livingston
<halivingston@...> wrote:

Imagine I have a per-cpu perf ring buffer for all my cpus.

Now I have two eBPF programs.

In both these eBPF programs I do bpf_update_elem(myFD, &cpunumberkey,
&fdOfCPUspecificBuffer, BPF_ANY)

Will this mean that multiple eBPF programs will be able to write their
data into a single buffer (of course associated with cpu).

This would be amazing if it is truly possible. It seems like it should
be possible.

Yes, you can do this.


I have not tried yet.
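
If two programs do share one per-CPU buffer, the reader sees a single interleaved stream and has to tell the records apart. One convention (an assumption here, not something stated in the thread) is to start every record with a small source tag. The sketch models that in plain Python; the tag values and the record layout are hypothetical.

```python
import struct

# Hypothetical record layout: u32 source tag, u32 pid, native byte order.
RECORD = struct.Struct("=II")
TAG_PROG_A = 1
TAG_PROG_B = 2


def emit(buffer, tag, pid):
    """Model of a program appending one record to the shared buffer."""
    buffer.append(RECORD.pack(tag, pid))


def demux(buffer):
    """Split the interleaved stream back into per-program pid lists."""
    by_tag = {}
    for raw in buffer:
        tag, pid = RECORD.unpack(raw)
        by_tag.setdefault(tag, []).append(pid)
    return by_tag


cpu0_buffer = []  # stands in for one CPU's ring buffer
emit(cpu0_buffer, TAG_PROG_A, 100)
emit(cpu0_buffer, TAG_PROG_B, 200)
emit(cpu0_buffer, TAG_PROG_A, 101)

print(demux(cpu0_buffer))
```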



Re: bpf_probe_read and pagefaults

Hayden Livingston
 

I should have searched. Short answer: it fails and you're out of luck.

https://lists.iovisor.org/g/iovisor-dev/topic/accessing_user_memory_and/21386221

On Sun, Feb 16, 2020 at 9:29 PM Hayden Livingston
<halivingston@...> wrote:

I'm curious to know how bpf_probe_read is able to read user-mode memory
in the face of page faulting.

I know in the helper it disables page faulting, but what does that mean?

If the memory the probe is trying to read is paged out then how does
my probe work?

It seems bpf_probe_read is best effort then. Is that true?


bpf_probe_read and pagefaults

Hayden Livingston
 

I'm curious to know how bpf_probe_read is able to read user-mode memory
in the face of page faulting.

I know in the helper it disables page faulting, but what does that mean?

If the memory the probe is trying to read is paged out then how does
my probe work?

It seems bpf_probe_read is best effort then. Is that true?


Can multiple BPF programs use same per-cpu perf ring buffer?

Hayden Livingston
 

Imagine I have a per-cpu perf ring buffer for all my cpus.

Now I have two eBPF programs.

In both these eBPF programs I do bpf_update_elem(myFD, &cpunumberkey,
&fdOfCPUspecificBuffer, BPF_ANY)

Will this mean that multiple eBPF programs will be able to write their
data into a single buffer (of course associated with cpu).

This would be amazing if it is truly possible. It seems like it should
be possible.

I have not tried yet.


Re: Why is BPF_PERF_OUTPUT max_entries set to total processor count?

Yonghong Song
 

On Sun, Feb 16, 2020 at 5:09 PM Hayden Livingston
<halivingston@...> wrote:

Thanks. I had to re-read your reply and the kernel code multiple
times, but I think I get it now. Please confirm.

This call is made by user-mode code:

fd = bpf_create_map(BPF_MAP_TYPE_PERF_EVENT_ARRAY, /*key_size*/
sizeof(int), /*value_size*/ sizeof(int), NUM_POSSIBLE_CPUS, 0);

The key is smp_processor_id; the value is the perf_events fd. This is why
both the key and the value of the map are integers.

Why so many indirections? Is it to support pinning, where the user program
can use different ring buffers?

The perf event ring buffer is per CPU.


To me it seems the kernel code that uses the CPU index to look into the array
could just be told the fd directly.

Yes, that is what the kernel does. Each array element holds one ring buffer.


On Sun, Feb 16, 2020 at 1:50 PM Y Song <ys114321@...> wrote:

The PERF_EVENT_OUTPUT map holds the per-CPU ring buffers created by
perf_event_open.
That is why its typical size is the number of CPUs on the host.

On Sun, Feb 16, 2020 at 1:52 AM Hayden Livingston
<halivingston@...> wrote:

I'm very confused why BCC creates a map of number of processors for
the perf_events output map.

I can imagine it being 1 since all it does is act as a kernel-user
mode intermediary and it is true that the code cannot be preempted.

Or if it can be preempted then I can imagine that since there can't be
more than processor count it is the max depth one has to worry about.

Is my thinking flawed? Or maybe there is a completely different reason?




Re: Why is BPF_PERF_OUTPUT max_entries set to total processor count?

Hayden Livingston
 

Thanks. I had to re-read your reply and the kernel code multiple
times, but I think I get it now. Please confirm.

This call is made by user-mode code:

fd = bpf_create_map(BPF_MAP_TYPE_PERF_EVENT_ARRAY, /*key_size*/
sizeof(int), /*value_size*/ sizeof(int), NUM_POSSIBLE_CPUS, 0);

The key is smp_processor_id; the value is the perf_events fd. This is why
both the key and the value of the map are integers.

Why so many indirections? Is it to support pinning, where the user program
can use different ring buffers?

To me it seems the kernel code that uses the CPU index to look into the array
could just be told the fd directly.

On Sun, Feb 16, 2020 at 1:50 PM Y Song <ys114321@...> wrote:

The PERF_EVENT_OUTPUT map holds the per-CPU ring buffers created by
perf_event_open.
That is why its typical size is the number of CPUs on the host.

On Sun, Feb 16, 2020 at 1:52 AM Hayden Livingston
<halivingston@...> wrote:

I'm very confused why BCC creates a map of number of processors for
the perf_events output map.

I can imagine it being 1 since all it does is act as a kernel-user
mode intermediary and it is true that the code cannot be preempted.

Or if it can be preempted then I can imagine that since there can't be
more than processor count it is the max depth one has to worry about.

Is my thinking flawed? Or maybe there is a completely different reason?



Re: Why is BPF_PERF_OUTPUT max_entries set to total processor count?

Yonghong Song
 

The PERF_EVENT_OUTPUT map holds the per-CPU ring buffers created by
perf_event_open.
That is why its typical size is the number of CPUs on the host.
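
The relationship can be modelled in a few lines. This is a toy model in Python, not the kernel or BCC API: `PerfEventArray`, `open_ring`, and `output` are made-up names. It shows one slot per CPU, each holding that CPU's ring buffer, with output indexed by the CPU the program is running on.

```python
import os


class PerfEventArray:
    """Toy model: slot i holds the ring buffer opened for CPU i."""

    def __init__(self, max_entries):
        self.slots = [None] * max_entries

    def open_ring(self, cpu):
        # User space would store a perf_event_open() fd in this slot;
        # a plain list stands in for the ring buffer here.
        self.slots[cpu] = []

    def output(self, current_cpu, data):
        """Append to the current CPU's ring; False if there is no ring."""
        if current_cpu >= len(self.slots) or self.slots[current_cpu] is None:
            return False
        self.slots[current_cpu].append(data)
        return True


ncpus = os.cpu_count() or 1
events = PerfEventArray(max_entries=ncpus)
for cpu in range(ncpus):
    events.open_ring(cpu)
print(events.output(0, b"from cpu 0"))

# With a single entry, a program running on CPU 1 has nowhere to write.
single = PerfEventArray(max_entries=1)
single.open_ring(0)
print(single.output(1, b"from cpu 1"))
```

In this model a map with one entry only serves CPU 0, which is the reason for sizing it to the CPU count.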

On Sun, Feb 16, 2020 at 1:52 AM Hayden Livingston
<halivingston@...> wrote:

I'm very confused why BCC creates a map of number of processors for
the perf_events output map.

I can imagine it being 1 since all it does is act as a kernel-user
mode intermediary and it is true that the code cannot be preempted.

Or if it can be preempted then I can imagine that since there can't be
more than processor count it is the max depth one has to worry about.

Is my thinking flawed? Or maybe there is a completely different reason?



Why is BPF_PERF_OUTPUT max_entries set to total processor count?

Hayden Livingston
 

I'm very confused why BCC creates a map of number of processors for
the perf_events output map.

I can imagine it being 1 since all it does is act as a kernel-user
mode intermediary and it is true that the code cannot be preempted.

Or if it can be preempted then I can imagine that since there can't be
more than processor count it is the max depth one has to worry about.

Is my thinking flawed? Or maybe there is a completely different reason?


eBPF tool to collect latency on all IP connections through a Linux router

vignesh_ramamurthy@...
 

Hello,

I am looking for a tool to collect latency on all IP connections through a Linux router. We do have eBPF tools set up on the box. I need something similar to tcpconnlat, but it has to capture the statistics for IP connections transiting through the router.

Please suggest the best way to capture this. 

Best Regards,
Vignesh


Re: Is there an API to get the process command line?

Ganesan Rajagopal
 

Thanks Quillian. I considered tracing sys_execve since execsnoop already provides sample code for that. I also need to trace process exits to remove the pid to command line mapping. This is a very busy build server that spawns processes like crazy, so keeping a live mapping of all the processes and command lines may be too resource intensive. I'll give it a shot and see how it goes.

Ganesan
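
The bookkeeping described above is small on the user-space side. A minimal sketch, assuming the tracer delivers exec events (pid plus argv) and exit events (pid) to two callbacks; the class and method names are hypothetical and no BPF code is shown.

```python
class CmdlineTable:
    """Live pid -> command line mapping fed by exec and exit events."""

    def __init__(self):
        self._by_pid = {}

    def on_exec(self, pid, argv):
        # A later exec in the same pid replaces the earlier entry.
        self._by_pid[pid] = " ".join(argv)

    def on_exit(self, pid):
        # Drop the entry so the table only holds live processes.
        self._by_pid.pop(pid, None)

    def lookup(self, pid):
        return self._by_pid.get(pid)

    def __len__(self):
        return len(self._by_pid)


table = CmdlineTable()
table.on_exec(4242, ["python", "build.py", "--target", "all"])
print(table.lookup(4242))
table.on_exit(4242)
print(len(table))
```

Memory use in this sketch grows with the number of live processes rather than with the total number spawned, since every exit removes an entry.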

On Fri, Jan 3, 2020 at 1:58 AM Quillian Rutherford <quillian.rutherford@...> wrote:
If you are running while the process is created, you can set an entry probe on sys_execve, which has the cmdline in its arguments. A probe like:

int enter_sys_execve(struct pt_regs *ctx,
                     const char __user *filename,
                     const char __user *const __user *__argv,
                     const char __user *const __user *__envp)
{
    /* read the strings that __argv points to and submit them here */
    return 0;
}


Then you can submit back the contents of argv.

On Wed, Jan 1, 2020 at 7:56 AM <rganesan+iovisor@...> wrote:
Hi all,

bcc monitoring tools which print a process being traced print only the command (and pid, ppid) without the full args. In many cases the monitored command is a script, so the command is just printed as (for example) "python" which isn't very useful. I couldn't find a bpf API to get the command line args.

Ganesan


Re: Is there an API to get the process command line?

Matheus Marchini <mat@...>
 

There's no API to access command line args. BPF_FUNC_get_current_comm
will give you the task name. If that's not enough, you can try to get it
via task_struct. Look at get_task_cmdline in fs/proc/base.c in the
kernel source code as a starting point for getting the cmdline from a
task_struct.
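
From user space, the command line is also exposed as `/proc/<pid>/cmdline`, where the arguments are separated by NUL bytes. A minimal sketch for resolving a pid reported by a tool; it only works while the process is still alive, and the function names are illustrative.

```python
def parse_cmdline(raw):
    """Split the NUL-separated contents of /proc/<pid>/cmdline."""
    if raw.endswith(b"\0"):
        raw = raw[:-1]  # drop the terminator after the last argument
    if not raw:
        return []
    return [arg.decode(errors="replace") for arg in raw.split(b"\0")]


def read_cmdline(pid):
    """Return the argument list for pid, or None if it cannot be read."""
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            return parse_cmdline(f.read())
    except OSError:
        return None


print(parse_cmdline(b"python\0script.py\0--verbose\0"))
```

A process that has already exited yields `None`, which is the gap that tracing execve at creation time avoids.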

On Wed, Jan 1, 2020 at 7:56 AM <rganesan+iovisor@...> wrote:

Hi all,

bcc monitoring tools which print a process being traced print only the command (and pid, ppid) without the full args. In many cases the monitored command is a script, so the command is just printed as (for example) "python" which isn't very useful. I couldn't find a bpf API to get the command line args.

Ganesan


Is there an API to get the process command line?

Ganesan Rajagopal
 

Hi all,

bcc monitoring tools which print a process being traced print only the command (and pid, ppid) without the full args. In many cases the monitored command is a script, so the command is just printed as (for example) "python" which isn't very useful. I couldn't find a bpf API to get the command line args.

Ganesan


Limiting string key size (with str() ??)

brad.lewis@...
 

Hi all,

I'm trying to trace zfs reads and writes and collect statistics for different pools using the pool name.

Here is a simplified version that just has a call count. 

kprobe:zfs_read,kprobe:zfs_write
{
$spa_name = ((struct zfsvfs *) ((struct inode *)arg0)->i_sb->s_fs_info)->z_os->os_spa->spa_name;
@zfs_count[$spa_name] = count();
}

I'm hitting the stack limit due to two large strings of length 256 being allocated on the stack: 

  %"@zfs_count_key" = alloca [256 x i8]
%"$spa_name" = alloca [256 x i8]


It seems that str( ) should be my friend here but so far no luck:  

: ERROR: str() only supports integer arguments (string provided)

 

        $spa_name = str((((struct zfsvfs *) ((struct inode *)arg0)->i_sb->s_fs_info)->z_os->os_spa->spa_name));

                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 

In this case I could use the os_spa pointer instead of the name to differentiate between the pools.

I'd like to use the string for display purposes and I'd like to understand the issues.

Thanks in advance for any insight.


Re: [bpftrace] Another release?

Daniel Xu
 

0.9.3 is tagged: https://github.com/iovisor/bpftrace/releases/tag/v0.9.3 .

Please see changelog for more details.

Daniel

On Tue, Nov 5, 2019, at 3:26 AM, Jesper Dangaard Brouer wrote:
On Mon, 04 Nov 2019 13:42:32 -0800
"Daniel Xu" <dxu@...> wrote:

I was thinking about a 0.9.3 release and I noticed we had

https://github.com/iovisor/bpftrace/milestone/6

but it appears neglected.

Any thoughts on punting those issues and cutting a release now anyways?

We've merged a bunch of useful things (BTF, signed ints, int casts, better errors,
etc) since 0.9.2.

I would VERY much like to see a release, because I have bpftrace
scripts that depend on latest git-tree (I think it's the 'int casts'
that I depend on).

My scripts are here:
https://github.com/xdp-project/xdp-project/tree/master/areas/mem/bpftrace

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer





Re: [bpftrace] Another release?

Jesper Dangaard Brouer
 

On Mon, 04 Nov 2019 13:42:32 -0800
"Daniel Xu" <dxu@...> wrote:

I was thinking about a 0.9.3 release and I noticed we had

https://github.com/iovisor/bpftrace/milestone/6

but it appears neglected.

Any thoughts on punting those issues and cutting a release now anyways?

We've merged a bunch of useful things (BTF, signed ints, int casts, better errors,
etc) since 0.9.2.

I would VERY much like to see a release, because I have bpftrace
scripts that depend on latest git-tree (I think it's the 'int casts'
that I depend on).

My scripts are here:
https://github.com/xdp-project/xdp-project/tree/master/areas/mem/bpftrace

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer


Re: [bpftrace] Another release?

Daniel Xu
 

I'll update the 0.9.3 milestone and create a 0.9.4.

In the future I'll try to keep milestones updated for PRs and issues.

I guess this is also a good time to open the floor for other things that
should go into 0.9.3.

On Mon, Nov 4, 2019, at 1:51 PM, Matheus Marchini wrote:
Yes, a new release would be good (I was thinking about that as well).
I don't feel strong about the issues on 0.9.3 milestone, we can just
kick them to 0.9.4 if no one objects.

We might want to land this first though:
https://github.com/iovisor/bpftrace/pull/942.

And I'm looking to see if there's anything we can do about
https://github.com/iovisor/bpftrace/issues/940 as well.

On Mon, Nov 4, 2019 at 1:42 PM Daniel Xu <dxu@...> wrote:

I was thinking about a 0.9.3 release and I noticed we had

https://github.com/iovisor/bpftrace/milestone/6

but it appears neglected.

Any thoughts on punting those issues and cutting a release now anyways?

We've merged a bunch of useful things (BTF, signed ints, int casts, better errors,
etc) since 0.9.2.

And yes, I am volunteering to drive the release.


[bpftrace] Another release?

Daniel Xu
 

I was thinking about a 0.9.3 release and I noticed we had

https://github.com/iovisor/bpftrace/milestone/6

but it appears neglected.

Any thoughts on punting those issues and cutting a release now anyways?

We've merged a bunch of useful things (BTF, signed ints, int casts, better errors,
etc) since 0.9.2.

And yes, I am volunteering to drive the release.


Re: libbpf-devel rpm uapi headers

Alexei Starovoitov
 

On Wed, Oct 02, 2019 at 07:43:31PM +0200, Jiri Olsa wrote:
hi,
we'd like to have bcc linked with libbpf instead of the
github submodule, initial change is discussed in here:
https://github.com/iovisor/bcc/pull/2535

In order to do that, we need to have access to uapi headers
compatible with libbpf rpm, bcc is attaching and using them
during compilation.

I added them in the fedora spec below (not submitted yet),
so libbpf would carry those headers.

Thoughts? thanks,

I think it may break a bunch of people who rely on bcc being a single library.
What is the main motivation to use libbpf as a shared library in libbcc?

I think we can have both options: libbpf as a git submodule and as a shared library.
In practice the git submodule is so much simpler to use and causes a lot fewer headaches.