minutes: IO Visor TSC and Dev members call


Brenden Blanco <bblanco@...>
 

Thanks all for another good discussion.

There are two major updates this week.

== Tracepoints

First, from Alexei, the infrastructure to attach bpf programs to tracepoints
has been merged in net-next. This will bring a stable kernel ABI to some
critical kernel events, rather than relying on kprobes (which can break as the
kernel internals change).

TCP will likely be the first subsystem to define such events.
 - proposed events: retrans, rx estab, v4/v6 send rst, destroy sock
 - other ideas: passive open, active open, tcp state change

== XDP Early drop

I have also been working to prototype a new bpf hook early in the packet path
(driver rx), in the hope of improving throughput for some use cases. The
first simple use case is programmable early drop. The code is at the RFC
stage; see the discussion at [1] and an LWN article at [2].
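
To make the early-drop idea concrete, here is a minimal userspace sketch of
the kind of decision such a hook could make. It is not the RFC's kernel API:
every name in it (rx_verdict, load_byte, early_drop_filter, make_udp_frame)
is hypothetical, and the choice to pass truncated frames up the stack is an
assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout: a userspace model, not kernel code. */
enum rx_verdict { RX_PASS = 0, RX_DROP = 1 };

#define SK_ETH_HLEN      14      /* Ethernet header length, no VLAN tag */
#define SK_ETHERTYPE_IP4 0x0800
#define SK_IPPROTO_UDP   17

/* Bounds-checked byte read; returns -1 when off is outside the frame. */
static int load_byte(const uint8_t *pkt, size_t len, size_t off)
{
    return off < len ? pkt[off] : -1;
}

/* Drop IPv4/UDP frames sent to blocked_port; pass everything else,
 * including frames too short to parse (assumed policy). */
static enum rx_verdict early_drop_filter(const uint8_t *pkt, size_t len,
                                         uint16_t blocked_port)
{
    int hi = load_byte(pkt, len, 12);
    int lo = load_byte(pkt, len, 13);
    if (hi < 0 || lo < 0 || ((hi << 8) | lo) != SK_ETHERTYPE_IP4)
        return RX_PASS;

    int vihl = load_byte(pkt, len, SK_ETH_HLEN);
    if (vihl < 0 || (vihl >> 4) != 4)
        return RX_PASS;
    size_t ihl = (size_t)(vihl & 0x0f) * 4;   /* IPv4 header length in bytes */
    if (ihl < 20)
        return RX_PASS;

    if (load_byte(pkt, len, SK_ETH_HLEN + 9) != SK_IPPROTO_UDP)
        return RX_PASS;

    size_t l4 = SK_ETH_HLEN + ihl;            /* start of the UDP header */
    hi = load_byte(pkt, len, l4 + 2);         /* destination port, high byte */
    lo = load_byte(pkt, len, l4 + 3);
    if (hi < 0 || lo < 0)
        return RX_PASS;
    return ((hi << 8) | lo) == blocked_port ? RX_DROP : RX_PASS;
}

/* Test helper: build a zeroed 42-byte Ethernet/IPv4/UDP frame with the
 * given destination port. Returns the frame length, or 0 if cap is too small. */
static size_t make_udp_frame(uint8_t *buf, size_t cap, uint16_t dport)
{
    if (cap < 42)
        return 0;
    memset(buf, 0, 42);
    buf[12] = 0x08; buf[13] = 0x00;           /* ethertype IPv4 */
    buf[14] = 0x45;                           /* version 4, IHL 5 */
    buf[23] = SK_IPPROTO_UDP;                 /* IPv4 protocol field */
    buf[36] = (uint8_t)(dport >> 8);
    buf[37] = (uint8_t)(dport & 0xff);
    return 42;
}
```

Calling early_drop_filter() on a frame built by make_udp_frame() with a
matching port yields RX_DROP; any other port, or a frame cut short, yields
RX_PASS.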

The next step is to show that we can extend this infrastructure to include
forwarding, in addition to drop.


For xmit, defining the forward action is tricky, and we discussed some
possibilities. There was agreement that a complicated return code should be
avoided; instead, the action should be modeled on bpf_redirect().
 - Specify device rx queue
 - Specify ifindex: has the problem of needing to allocate an skb to cross
   the NIC boundary
 - Hardcode (ethtool) which rx queue maps to tx queue when fwd is picked
 - Batching should also be designed in
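
As a rough illustration of the "simple return code plus helper" shape
discussed above, here is a userspace sketch. It follows the general pattern
of bpf_redirect() (the helper records the target out of band and the program
returns a plain action code), but all names below are hypothetical, the
target is assumed to be a tx queue index, and a real design would keep the
state per CPU rather than in one global.

```c
#include <stdint.h>

/* Hypothetical names throughout: a userspace model, not kernel code. */
enum rx_action { ACT_PASS = 0, ACT_DROP = 1, ACT_REDIRECT = 2 };

/* Out-of-band state written by the helper and read by the driver.
 * A single global here; assumed to be per-CPU in a real implementation. */
struct redirect_info {
    int      valid;
    uint32_t tx_queue;
};

static struct redirect_info redirect_state;

/* Helper in the style of bpf_redirect(): stash the target and return a
 * plain action code, so the return value never has to encode the target. */
static enum rx_action sketch_redirect(uint32_t tx_queue)
{
    redirect_state.valid = 1;
    redirect_state.tx_queue = tx_queue;
    return ACT_REDIRECT;
}

/* Example program: drop empty frames, forward frames whose first byte is
 * 0x02 to tx queue 1, and pass everything else up the stack. */
static enum rx_action example_prog(const uint8_t *pkt, uint32_t len)
{
    if (len == 0)
        return ACT_DROP;
    if (pkt[0] == 0x02)
        return sketch_redirect(1);
    return ACT_PASS;
}

/* Driver side: clear the state, run the program, then consume the target.
 * Returns the chosen tx queue, or -1 when the frame is not forwarded. */
static int driver_rx(const uint8_t *pkt, uint32_t len)
{
    redirect_state.valid = 0;
    enum rx_action act = example_prog(pkt, len);
    if (act == ACT_REDIRECT && redirect_state.valid)
        return (int)redirect_state.tx_queue;
    return -1;
}
```

The point of the split is that new kinds of target (queue, ifindex, a
batch) could be added by changing the helper and its state, without touching
the set of return codes.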

Hannes also mentioned a possible way to efficiently forward from phys_dev to a
namespace/socket. This mechanism would be cleaner and lighter weight than
ipvlan.

== Misc

Brendan mentioned that Xenial will be released soon, and we should prepare a
Xenial package for folks to download.

Brendan mentioned that at SREcon he met someone from a "major tech company"
who is interested in using bpf for tracing as non-root. This will be tricky
since kprobes can reach deep into privileged kernel structures, but perhaps
the tracepoint infrastructure will allow some use cases. Let's keep this in
the back of our minds for the future.

Alexei is working on infra to simplify packet read/write and reduce overhead.
So far, two things are being prototyped: a new instruction with smarter bounds
checking, and a smarter verifier that lets packet access look more like
regular pointer arithmetic.
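
For a feel of what "packet access that looks like pointer arithmetic" could
mean, here is a userspace sketch: one explicit bounds check against the end
of the packet, followed by ordinary struct field reads. The pkt_ctx and
eth_hdr types and the read_ethertype function are hypothetical and are not
taken from the prototype.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context: start and one-past-the-end of the packet bytes. */
struct pkt_ctx {
    const uint8_t *data;
    const uint8_t *data_end;
};

/* Ethernet header laid out as bytes only, so it has no alignment needs. */
struct eth_hdr {
    uint8_t dst[6];
    uint8_t src[6];
    uint8_t proto[2];   /* ethertype, network byte order */
};

/* Returns the ethertype in host order, or -1 if the header does not fit.
 * A single length check up front stands in for what a verifier would
 * require before any of the field reads below are allowed. */
static int read_ethertype(const struct pkt_ctx *ctx)
{
    const uint8_t *p = ctx->data;

    /* Written as a length comparison to stay well defined in userspace C;
     * the intent is the same as "p + sizeof(hdr) > data_end". */
    if ((size_t)(ctx->data_end - p) < sizeof(struct eth_hdr))
        return -1;

    const struct eth_hdr *eth = (const struct eth_hdr *)p;
    return (eth->proto[0] << 8) | eth->proto[1];
}
```

Compared with reading each byte through a checked helper call, the check is
paid once per header instead of once per access.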

Daniel is working on arbitrary header push/pop. There is working code, so we
may see some patches soon.

Alexei is also working on the following enhancements, some of which have been
mentioned before:
 - mmap array
 - inline array map access

Also, someone raised the idea of having something like a static key for bpf
programs, which could be used to enable debug paths without reloading the
program.

Attendees:
Alexei Starovoitov
Brendan Gregg
Daniel Borkmann
Prem Jonnalagadda
Shehzad Ismail
Alex Bagehot
Alex Reece
Hannes Frederic Sowa
Uri Elzur
John Fastabend
Billy O Mahony