Re: [agenda] IO Visor TSC/Dev Meeting
Brenden Blanco
Hi all,
Let's skip the meeting this week, since there isn't a specific agenda to discuss. Going forward, I'll send out a call-for-agenda reminder 2 days in advance so that whether to hold the meeting isn't decided at the last minute.
Thanks,
Brenden
On Wed, Jun 26, 2019 at 9:27 AM Brenden Blanco <bblanco@...> wrote:
|
|
Re: [agenda] IO Visor TSC/Dev Meeting
Brenden Blanco
On Tue, Jun 25, 2019 at 1:39 PM Brenden Blanco <bblanco@...> wrote:
I have only received the following agenda from Brendan so far:
I have a meeting clash so I can only join for a bit; my agenda items are:
Are there any further agenda items?
|
|
[agenda] IO Visor TSC/Dev Meeting
Brenden Blanco
Hi All,
As per the discussion at the last meeting, this week's meeting is contingent on having a proposed agenda rather than being free-form. Therefore, please reply if there is a topic that you would like to discuss live with the other BPF developers.
=== IO Visor Dev/TSC Meeting ===
Every 2 weeks on Wednesday, from Wednesday, January 25, 2017, to no end date
11:00 am | Pacific Daylight Time (San Francisco, GMT-07:00) | 30 min
https://bluejeans.com/568677804/
|
|
bpftrace v0.9.1
Matheus Marchini <mat@...>
We just released bpftrace v0.9.1. This turned into a pretty big release, with almost 200 commits and 3 months since v0.9.0. We'll try to have smaller development cycles for the next releases.
Some highlights of this version:
- Compound assignment operators (+= and friends)
- Support arrays and IPv6 in ntop
- Add basic support for enums
- Add basic macro definition support
- Allow comparison of two string variables
- Add pre and post behavior to ++ and -- operators
- Ban kprobes that cause CPU deadlocks
- Add unsafe-mode and make safe-mode the default execution mode
Full release notes can be found at: https://github.com/iovisor/bpftrace/releases/tag/v0.9.1
Hope y'all enjoy this version! Let us know if you find any issues.
Cheers,
|
|
Re: Performance of veth XDP
Forrest Chen
On Wed, Jun 12, 2019 at 09:31 PM, Toshiaki Makita wrote:
I cannot reproduce it.
Thanks. I have tested in non-XDP mode and the problem also happens. It may be a netperf bug...
|
|
Re: bpf_probe_read() split: bpftrace RFC
Matheus Marchini <mat@...>
How will bpf_probe_read_user/bpf_probe_read_kernel be enforced in the kernel? In other words, how will bpf_probe_read_user detect and report when it gets a kernel address as a parameter, and vice versa? Will it be accomplished by the verifier (is it even possible to do this reliably with the verifier) or only at runtime?
If the kernel will only test it at runtime, and it returns a unique error code (different from the errors that probe_read can return today; we might need to create a new error code), we could do the following for the dereference operands (*/str()):
    typedef int (probe_read_t)(void *dst, int size, void *src);

    // Assuming bpf_probe_read_[user,kernel] will return EINVALADDRSPC
    // if the user tries to access an address with the wrong function
    int err;
    // addr_space_ctx is defined according to Brendan's email
    probe_read_t *default_probe_read;
    probe_read_t *fallback_probe_read;

    if (addr_space_ctx == KERNEL) {
        default_probe_read = bpf_probe_read_kernel;
        fallback_probe_read = bpf_probe_read_user;
    } else {
        default_probe_read = bpf_probe_read_user;
        fallback_probe_read = bpf_probe_read_kernel;
    }

    // Try the context's default reader first; fall back only on the
    // (hypothetical) wrong-address-space error
    err = (*default_probe_read)(dst, size, src);
    if (err == EINVALADDRSPC) {
        err = (*fallback_probe_read)(dst, size, src);
    }
    if (err < 0) {
        bpf_trace_printk("Error while reading address %x\n", src);
        return;
    }
With this approach we can avoid breaking any scripts. The only difference is that it will add more overhead when the fallback probe_read is used (and if the user is affected by this overhead, they can still use kptr/uptr/kstr/ustr). We could also print to stdout/syslog when the fallback method is used if bpftrace is running in verbose mode, and provide a "strict" mode which would not try to run the fallback probe_read.
On Thu, Jun 13, 2019 at 11:32 AM Brendan Gregg <brendan.d.gregg@...> wrote:
|
|
bpf_probe_read() split: bpftrace RFC
Brendan Gregg
G'Day,
This is the biggest change afoot to the bpftrace API, and I think we can sort it out quickly without fuss, but it is worth sharing here. This is from https://github.com/iovisor/bpftrace/issues/614 .
bpftrace currently allows pointer dereferencing via *addr, and str(addr) for strings. But the future split of bpf_probe_read() into bpf_probe_read_user() and bpf_probe_read_kernel() (to support SPARC, etc.) may break a lot of bpftrace tools and documentation. Or it may not, if we are clever about it.
The proposal is this: add the following bpftrace builtins:
- uptr(addr): dereference user address
- ustr(addr): fetch NULL-terminated user string
- kptr(addr): dereference kernel address
- kstr(addr): fetch NULL-terminated kernel string
AND, to introduce a "context" for probe actions -- user or kernel -- where *addr and str(addr) work relative to that context. The context would be:
- kprobes/kretprobes: kernel
- uprobes/uretprobes: user
- tracepoints: kernel (with the exception of syscall tracepoints: user)
- other probe types: kernel
It's possible that this context approach leaves us with zero broken tools and documentation (i.e., there are zero cases so far where we even need to use uptr/ustr/kptr/kstr). I'm still checking and looking for exceptions.
Where you can help: can you think of a syscall tracepoint that has a kernel address as an argument? Or another non-syscall tracepoint that has a user address as an argument? Or can you think of any other problem with this plan?
thanks,
Brendan
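The proposed context rules can be sketched as a small lookup. This is only an illustration in plain C, not bpftrace's implementation: the names `addr_space`, `starts_with` and `default_addr_space` are hypothetical, and it assumes probes are identified by bpftrace-style names such as "kprobe:vfs_read" or "tracepoint:syscalls:sys_enter_open".

```c
#include <string.h>

/* Hypothetical names for illustration; not bpftrace internals. */
enum addr_space { ADDR_KERNEL, ADDR_USER };

/* Return 1 if string s begins with prefix, else 0. */
static int starts_with(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Default address space for *addr and str(addr), following the
 * context rules in the proposal:
 *   uprobes/uretprobes  -> user
 *   syscall tracepoints -> user
 *   everything else     -> kernel */
static enum addr_space default_addr_space(const char *probe)
{
    if (starts_with(probe, "uprobe:") || starts_with(probe, "uretprobe:"))
        return ADDR_USER;
    if (starts_with(probe, "tracepoint:syscalls:"))
        return ADDR_USER;
    return ADDR_KERNEL;
}
```

Under this sketch, an explicit uptr/ustr/kptr/kstr would simply bypass the lookup and name the address space directly.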
|
|
Re: BPF Virtual Machine Runtime
On Thu, Jun 13, 2019 at 10:37 AM Fulvio Risso <fulvio.risso@...> wrote:
Just a personal comment.
FWIW - I don't share the same consideration, and I'd be inclined to say that clear documentation helps educate folks, whereas inventing new terminology just because there's a wall of confusion would only lead to a greater wall in the long term.
That said, we can be a bit more explicit. Would framing it as "the BPF VM is a specification of an in-kernel virtual machine that runs BPF instructions" help here? The extra "in-kernel" prefix would make people understand there's some distinction here, while still being very true to reality.
Panagiotis Moustafellos SRE Tech Lead @ Elastic
|
|
Re: BPF Virtual Machine Runtime
Fulvio Risso
Just a personal comment.
Talking about "BPF VM" to students raises a lot of confusion, as they expect a full-fledged VM and do not understand what "VM" means in this case. Comparing it to Java does not help, as most people think about Java as a language, not as a VM. So, in my classes I started to present BPF as a virtual CPU, which is not that far from reality; this helps people understand the concept more quickly.
Cheers,
fulvio
On 12/06/2019 23:36, Daniel Borkmann wrote:
On 06/12/2019 09:52 PM, Brendan Gregg wrote:
Following on from the call... Does this sound even better? (mapping
I'd probably drop the '(defined in filter.h, etc)' part, but otherwise
|
|
Re: Performance of veth XDP
Toshiaki Makita
On 2019/06/12 17:38, forrest0579@... wrote:
In https://lists.iovisor.org/g/iovisor-dev/topic/how_to_make_redirect_map/31867035 I have built up an environment to make veth+XDP work.
1. Because the XDP core does not support checksum (or any other) offload. Any information necessary for offloading is discarded when converting an skb into an xdp_frame. Basically, veth XDP is not so fast when you only use XDP_PASS in containers.
2. I cannot reproduce it. Is it an XDP-related problem? What happens if you use a bridge in place of XDP_REDIRECT? Did you collect the tcpdump result on the server side? Also, how about checking "ethtool -S" and "nstat" (not netstat) periodically?
Toshiaki Makita
|
|
minutes: IO Visor TSC/Dev Meeting
Brenden Blanco
Hi all,
Thanks for dialing in to the meeting this week. Going forward, in order to conserve everyone's time, we will be polling ahead of time for an agenda, and forgoing the online meeting in favor of voluntary email updates if no agenda is suggested. I will send out a reminder for an agenda the day before the scheduled meeting if one hasn't been suggested.
Cheers,
Brenden
=== Discussion ===
This meeting has become somewhat repetitive, and often lacks an agenda. Discussion that revolves around status isn't too useful. The goal of the meeting should be to discuss specific issues. Please send issues to discuss prior to the meeting; otherwise send regular updates (if useful) in email format.
Brendan:
- Book is not visible online yet, waiting on publisher
- Reviewers are still looking at pre-copy-edit versions
- Question for kernel devs: how does the verifier handle divide by zero?
- Should the book call BPF a VM or a runtime? Answer: runtime. It is also an instruction set.
- Any mention of Turing completeness? ... waiting for a BPF runtime written in BPF :)
=== Attendees ===
Brenden Blanco
Michael Savisko
Bjorn Topel
Jesper Brouer
Jakub Kicinski
Daniel Borkmann
Jiong Wang
Alexei Starovoitov
Joe Stringer
Marco Leogrande
Martin Lau
Maciej Fijalkowski
Brendan Gregg
Dan Siemon
John F
Quentin Monnet
Richard Elling
|
|
Re: BPF Virtual Machine Runtime
Daniel Borkmann
On 06/12/2019 09:52 PM, Brendan Gregg wrote:
Following on from the call... Does this sound even better? (mapping
I'd probably drop the '(defined in filter.h, etc)' part, but otherwise I think it's fine.
Thanks,
Daniel
|
|
BPF Virtual Machine Runtime
Brendan Gregg
Following on from the call... Does this sound even better? (mapping from the JVM for comparison):
The JVM is a specification of a virtual machine that runs Java bytecode. It is implemented by a Java Runtime Environment, such as OpenJDK, which includes an interpreter and a JIT compiler.
The BPF VM (BVM?) is a specification of a virtual machine that runs BPF instructions (defined in filter.h, etc). It is implemented by the Linux kernel BPF runtime, which includes an interpreter and a JIT compiler.
Most of the work for the past 5 years has been developing the BPF runtime.
Brendan
|
|
Performance of veth XDP
Forrest Chen
In https://lists.iovisor.org/g/iovisor-dev/topic/how_to_make_redirect_map/31867035 I have built up an environment to make veth+XDP work.
I have some questions from my performance tests:
1. When I ran a performance test using iperf, I found that the result with XDP is nearly the same as without XDP. I guess this may be because with XDP I have to turn off the tx offload. So my question is: why would XDP affect the veth tx offload?
2. When I test using netperf with the TCP_CRR type, I find that after some connection tests, the test blocks. After debugging with tcpdump & netstat, I found that the last connection on the client side entered the FIN_WAIT2 state. The tcpdump results for a normal and an abnormal connection are shown in the gist. Every normal connection has 10 records, but the blocked abnormal connection has only 8 records, and the sequence of the first 8 records is different. I have no idea why this would happen, since all I do is redirect packets.
Does anyone have any ideas?
|
|
Re: reminder: IO Visor TSC/Dev Meeting
Alexei Starovoitov
Brenden,
thanks for the bi-weekly reminders!
All, if you have any topics to discuss, please email them to the list, so folks have a better idea of what to expect tomorrow.
Thanks!
On Tue, Jun 11, 2019 at 4:33 PM Brenden Blanco <bblanco@...> wrote:
|
|
reminder: IO Visor TSC/Dev Meeting
Brenden Blanco
Please join us tomorrow for our bi-weekly call. As usual, this meeting is open to everybody and completely optional.
You might be interested to join if:
- You want to know what is going on in BPF land
- You are doing something interesting yourself with BPF and would like to share
- You want to know what the heck BPF is
=== IO Visor Dev/TSC Meeting ===
Every 2 weeks on Wednesday, from Wednesday, January 25, 2017, to no end date
11:00 am | Pacific Daylight Time (San Francisco, GMT-07:00) | 30 min
https://bluejeans.com/568677804/
https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=6&day=12&hour=18&min=0&sec=0&p1=900
|
|
Re: Headers Parsing with fields of variable length
Raymond
On 2019-06-10 12:35 p.m., mdimolianis@... wrote:
I am trying to create a header for the DNS protocol and parse DNS queries however I cannot parse headers with variable size e.g. the domain name (due to looping constraints of XDP). Is there a method I could handle cases like that?
Wouldn't you punt that to a userland XDP listener for action? DNS packets are complicated.
|
|
Headers Parsing with fields of variable length
mdimolianis@...
Hi all,
I am trying to create a header for the DNS protocol and parse DNS queries. However, I cannot parse headers with variable-size fields, e.g. the domain name (due to the looping constraints of XDP). Is there a way I could handle cases like that?
Thank you in advance!
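One common way to cope with a variable-length field under a loop restriction is to walk it with a fixed upper bound on iterations and a bounds check before every read. The sketch below is plain userspace C, not an XDP program: `dns_skip_qname` and `MAX_LABELS` are hypothetical names, the cap of 32 labels is an arbitrary assumption, and compressed names are rejected rather than followed. Whether a given kernel's verifier accepts the equivalent loop (possibly after unrolling) is not something this sketch demonstrates.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LABELS 32 /* assumed cap on labels; keeps the loop bounded */

/* Walk a DNS QNAME starting at offset off in pkt (len bytes long).
 * Returns the offset just past the terminating zero-length label,
 * or -1 if the name is truncated, uses compression pointers, or has
 * more than MAX_LABELS labels. */
static int dns_skip_qname(const uint8_t *pkt, size_t len, size_t off)
{
    for (int i = 0; i < MAX_LABELS; i++) {
        if (off >= len)
            return -1;              /* ran off the end of the packet */
        uint8_t label_len = pkt[off];
        if (label_len == 0)
            return (int)(off + 1);  /* root label: end of name */
        if (label_len > 63)
            return -1;              /* pointer or invalid length */
        off += 1 + (size_t)label_len;
    }
    return -1;                      /* too many labels */
}
```

The fixed-size question fields (QTYPE and QCLASS) would then start at the returned offset; anything the bounded walk rejects could be passed up to userspace, as suggested in the reply.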
|
|
Re: how to make redirect_map work?
Forrest Chen
On Tue, Jun 4, 2019 at 12:36 PM, Mauricio Vasquez wrote:
I am sorry, I was not clear enough. If you attach the program in SKB mode you won't need to attach any XDP program on vbox1 and vbox2, on the other hand, if you use DRV mode you need to have an XDP pass program attached to vbox1 and vbox2 (as indicated by Toshiaki Makita).
I'm sorry, it's my fault. I've re-tested using SKB mode and it works now. I think the reason it failed before was that I didn't change the dst MAC address, so the kernel dropped the packet.
Forrest
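The destination MAC rewrite mentioned above amounts to overwriting the first six bytes of the Ethernet header before the frame is redirected. A minimal sketch in plain userspace C follows; `set_dst_mac` and the two size macros are hypothetical names, and an XDP program would do the same copy against the packet's data/data_end pointers with the bounds check the verifier expects.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_BYTES 6   /* length of a MAC address */
#define ETH_HDR_BYTES  14  /* dst MAC + src MAC + EtherType */

/* Overwrite the destination MAC of an Ethernet frame in place.
 * Returns 0 on success, -1 if the buffer is too short to hold a
 * full Ethernet header. */
static int set_dst_mac(uint8_t *frame, size_t len,
                       const uint8_t mac[ETH_ADDR_BYTES])
{
    if (len < ETH_HDR_BYTES)
        return -1;
    /* The destination address occupies the first six bytes. */
    memcpy(frame, mac, ETH_ADDR_BYTES);
    return 0;
}
```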
|
|
Re: how to make redirect_map work?
Mauricio Vasquez
On 5/30/19 9:25 PM, forrest0579@... wrote:
On Thu, May 30, 2019 at 05:40 AM, Mauricio Vasquez wrote:
I am sorry, I was not clear enough. If you attach the program in SKB mode you won't need to attach any XDP program on vbox1 and vbox2. On the other hand, if you use DRV mode you need to have an XDP pass program attached to vbox1 and vbox2 (as indicated by Toshiaki Makita).
Mauricio.
|
|