[PATCH RFC 0/3] AF_XDP support for OVS
William Tu
The patch series introduces AF_XDP support for OVS netdev.
AF_XDP is a new address family that works together with eBPF. In short,
a socket of the AF_XDP family can receive and send packets through an
eBPF/XDP program attached to the netdev. For more details about AF_XDP,
please see the Linux kernel's Documentation/networking/af_xdp.rst.

OVS has a couple of netdev types, e.g., system, tap, or internal. The
patch first adds a new netdev type called "afxdp" and implements its
configuration, packet reception, and transmit functions. Since the
AF_XDP socket, xsk, operates in userspace, once ovs-vswitchd receives
packets from the xsk, the proposed architecture reuses the existing
userspace dpif-netdev datapath. As a result, most of the packet
processing happens in userspace instead of in the Linux kernel.

Architecture
============
               _
              |   +-------------------+
              |   |    ovs-vswitchd   |<-->ovsdb-server
              |   +-------------------+
              |   |      ofproto      |<-->OpenFlow controllers
              |   +--------+-+--------+
              |   | netdev | |ofproto-|
    userspace |   +--------+ |  dpif  |
              |   | netdev | +--------+
              |   |provider| |  dpif  |
              |   +---||---+ +--------+
              |       ||     |  dpif- |
              |       ||     | netdev |
              |_      ||     +--------+
                      ||
               _  +---||-----+--------+
              |   | af_xdp prog +     |
       kernel |   |   xsk_map         |
              |_  +--------||---------+
                           ||
                        physical
                           NIC

To get started, create an OVS userspace bridge using dpif-netdev by
setting the datapath_type to netdev:
  # ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev

Then attach a Linux netdev with type afxdp:
  # ovs-vsctl add-port br0 afxdp-p0 -- \
      set interface afxdp-p0 type="afxdp"

Most of the implementation follows the AF_XDP sample code in the Linux
kernel under samples/bpf/xdpsock_user.c.
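As a rough illustration of the socket-level setup that such sample code
performs, here is a minimal sketch against the kernel's linux/if_xdp.h
UAPI: allocate a page-aligned umem, open an AF_XDP socket, register the
umem, and size the fill/completion and rx/tx rings. It is not code from
this series. NUM_FRAMES, FRAME_SIZE, RING_SIZE, umem_frame_addr() and
xsk_sketch_open() are names and values assumed for the example, and
mapping the rings and binding to a device queue are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if_xdp.h>

/* Fallbacks for older libc headers; values match the kernel's. */
#ifndef AF_XDP
#define AF_XDP 44
#endif
#ifndef SOL_XDP
#define SOL_XDP 283
#endif

/* Illustrative sizes (assumptions, not taken from the patch). */
enum { NUM_FRAMES = 1024, FRAME_SIZE = 2048, RING_SIZE = 512 };

/* Byte offset of frame 'idx' inside the umem area. */
static uint64_t umem_frame_addr(uint32_t idx)
{
    assert(idx < (uint32_t) NUM_FRAMES);
    return (uint64_t) idx * FRAME_SIZE;
}

/* Opens an AF_XDP socket, registers a freshly allocated umem and sizes
 * the four rings.  Returns the socket fd and stores the umem pointer in
 * '*umem_area' on success; returns a negative errno value and leaves
 * '*umem_area' untouched on failure. */
static int xsk_sketch_open(void **umem_area)
{
    size_t len = (size_t) NUM_FRAMES * FRAME_SIZE;
    int ring_size = RING_SIZE;      /* Must be a power of two. */
    void *area = NULL;
    int fd, err;

    /* The umem must start on a page boundary. */
    err = posix_memalign(&area, (size_t) sysconf(_SC_PAGESIZE), len);
    if (err) {
        return -err;
    }

    fd = socket(AF_XDP, SOCK_RAW, 0);
    if (fd < 0) {
        err = errno;
        free(area);
        return -err;
    }

    struct xdp_umem_reg mr = {
        .addr = (uint64_t) (uintptr_t) area,
        .len = len,
        .chunk_size = FRAME_SIZE,
        .headroom = 0,
    };

    if (setsockopt(fd, SOL_XDP, XDP_UMEM_REG, &mr, sizeof mr)
        || setsockopt(fd, SOL_XDP, XDP_UMEM_FILL_RING,
                      &ring_size, sizeof ring_size)
        || setsockopt(fd, SOL_XDP, XDP_UMEM_COMPLETION_RING,
                      &ring_size, sizeof ring_size)
        || setsockopt(fd, SOL_XDP, XDP_RX_RING,
                      &ring_size, sizeof ring_size)
        || setsockopt(fd, SOL_XDP, XDP_TX_RING,
                      &ring_size, sizeof ring_size)) {
        err = errno;
        close(fd);
        free(area);
        return -err;
    }

    *umem_area = area;
    return fd;
}
```

Calling xsk_sketch_open() typically needs CAP_NET_RAW and a kernel built
with AF_XDP; without them it returns a negative errno such as -EPERM or
-EAFNOSUPPORT. A complete setup would go on to mmap() the rings and
bind() the socket to an interface and queue id using struct
sockaddr_xdp.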

Configuration
=============
When a new afxdp netdev is added to OVS, the patch performs the
following configuration steps:
1) attach the afxdp program and map to the netdev (see bpf/xdp.h)
2) create an AF_XDP socket (XSK) for the netdev
3) allocate a virtually contiguous memory region, called umem, and
   register this memory with the XSK
4) set up the rx/tx rings and the umem's fill/completion rings

Packet Flow
===========
Currently, the af_xdp program loaded on the netdev does nothing but
forward the packet to queue id 0. The patch simplifies buffer/ring
management by copying each received packet from umem into OVS's
internal buffer, and by copying each packet sent to another netdev into
that netdev's umem.

An AF_XDP packet forwarded from one netdev (ovs-p0) to another netdev
(ovs-p1) goes through the following path:
1) the xdp program at ovs-p0 copies the packet to the kernel (SKB_MODE)
2) the kernel maps the packet to the userspace umem
3) ovs-vswitchd receives the packet from ovs-p0 and copies it to its
   internal packet buffer
4) ovs-vswitchd copies the packet to the umem of ovs-p1, kick_tx
5) the kernel copies the packet from umem to the ovs-p1 tx queue

Thus, the total number of copies between two ports is 4 (steps 1, 3, 4
and 5; step 2 is a mapping, not a copy), which limits performance.
Hopefully, by using AF_XDP zero copy mode, 1) and 5) will be removed,
and in ovs-vswitchd it should be possible to combine 3) and 4) into a
single copy. So the best case will be one copy between two netdevs.

Test Framework
==============
  # make check-afxdp
starts two end-to-end tests using veth peers and namespaces:

AFXDP netdev datapath-sanity

  1: datapath - ping between two ports               ok
  2: datapath - http between two ports               ok

The patch series is based on the ovs-ebpf implementation. A copy is
available at:
https://github.com/williamtu/ovs-ebpf/ branch afxdp-v1

William Tu (3):
  afxdp: add ebpf code for afxdp and xskmap.
  netdev-linux: add new netdev type afxdp.
  tests: add afxdp test cases.

 acinclude.m4                    |    1 +
 bpf/api.h                       |    6 +
 bpf/helpers.h                   |    2 +
 bpf/maps.h                      |   12 +
 bpf/xdp.h                       |   34 +-
 lib/automake.mk                 |    3 +-
 lib/bpf.c                       |   41 ++-
 lib/bpf.h                       |    6 +-
 lib/dpif-netdev.c               |   74 +++-
 lib/if_xdp.h                    |   79 +++++
 lib/netdev-dummy.c              |    1 +
 lib/netdev-linux.c              |  741 +++++++++++++++++++++++++++++++++++++++-
 lib/netdev-provider.h           |    2 +
 lib/netdev-vport.c              |    4 +
 lib/netdev.c                    |   11 +
 lib/netdev.h                    |    1 +
 tests/automake.mk               |   17 +
 tests/ofproto-macros.at         |    1 +
 tests/system-afxdp-macros.at    |  148 ++++++++
 tests/system-afxdp-testsuite.at |   25 ++
 tests/system-afxdp-traffic.at   |   38 +++
 vswitchd/bridge.c               |    1 +
 22 files changed, 1228 insertions(+), 20 deletions(-)
 create mode 100644 lib/if_xdp.h
 create mode 100644 tests/system-afxdp-macros.at
 create mode 100644 tests/system-afxdp-testsuite.at
 create mode 100644 tests/system-afxdp-traffic.at

-- 
2.7.4