[PATCHv3 RFC 0/3] AF_XDP netdev support for OVS
William Tu
The patch series introduces AF_XDP support for OVS netdev.
AF_XDP is a new address family that works together with eBPF. In short,
a socket of the AF_XDP family can receive and send packets through an
eBPF/XDP program attached to the netdev. For more details about AF_XDP,
please see the Linux kernel's Documentation/networking/af_xdp.rst.

OVS has several netdev types, e.g., system, tap, or internal. The patch
first adds a new netdev type called "afxdp" and implements its
configuration, packet reception, and transmit functions. Since the
AF_XDP socket, xsk, operates in userspace, once ovs-vswitchd receives
packets from the xsk, the proposed architecture re-uses the existing
userspace dpif-netdev datapath. As a result, most of the packet
processing happens in userspace instead of in the Linux kernel.

Architecture
============
               _
              |   +-------------------+
              |   |    ovs-vswitchd   |<-->ovsdb-server
              |   +-------------------+
              |   |      ofproto      |<-->OpenFlow controllers
              |   +--------+-+--------+
              |   | netdev | |ofproto-|
    userspace |   +--------+ |  dpif  |
              |   | netdev | +--------+
              |   |provider| |  dpif  |
              |   +---||---+ +--------+
              |       ||     |  dpif- |
              |       ||     | netdev |
              |_      ||     +--------+
                      ||
               _  +---||-----+--------+
              |   | af_xdp prog +     |
       kernel |   |   xsk_map         |
              |_  +--------||---------+
                           ||
                        physical
                           NIC

To get started, create an OVS userspace bridge using dpif-netdev by
setting the datapath_type to netdev:
  # ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev

And attach a Linux netdev with type afxdp:
  # ovs-vsctl add-port br0 afxdp-p0 -- \
      set interface afxdp-p0 type="afxdp"

Documentation
=============
Most of the design details are described in the paper presented at
Linux Plumbers 2018, "Bringing the Power of eBPF to Open vSwitch"[1],
section 4, and slides[2].
This patch uses a not-yet-upstreamed feature called XDP_ATTACH[3],
described in section 3.1, which is a built-in XDP program for AF_XDP.
This greatly simplifies the management of XDP/eBPF programs.
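To make the "socket of the AF_XDP family" from the overview concrete, here is a minimal sketch (not code from this series) that only checks whether the running kernel lets a process create such a socket. The function name afxdp_probe() is hypothetical. The fallback value 44 for AF_XDP matches the Linux UAPI and is only used when the libc headers are too old to define it. No umem, rings, or XDP program are set up here.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* AF_XDP is 44 in the Linux UAPI; older libc headers may not define it. */
#ifndef AF_XDP
#define AF_XDP 44
#endif

/* Hypothetical helper, not part of the patch series.
 *
 * Returns 1 if an AF_XDP socket can be created,
 *         0 if the kernel does not support the address family,
 *        -1 on any other error (for example, missing privileges). */
int
afxdp_probe(void)
{
    int fd = socket(AF_XDP, SOCK_RAW, 0);

    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return errno == EAFNOSUPPORT ? 0 : -1;
}
```

A return value of 0 suggests the kernel was built without AF_XDP support, in which case the afxdp port type above cannot be expected to work.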

[1] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-afxdp.pdf
[2] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-lpc18-presentation.pdf
[3] http://vger.kernel.org/lpc_net2018_talks/lpc18_paper_af_xdp_perf-v2.pdf

Test Cases
==========
Test cases are created using namespaces and veth pairs, with an AF_XDP
socket attached to the veth (thus the SKB_MODE). By issuing
"make check-afxdp", the patch shows the following:

AF_XDP netdev datapath-sanity

  1: datapath - ping between two ports               ok
  2: datapath - http between two ports               ok
  3: datapath - ping between two ports on vlan       ok
  4: datapath - ping6 between two ports              ok
  5: datapath - ping6 between two ports on vlan      ok
  6: datapath - ping over vxlan tunnel               ok
  7: datapath - ping over vxlan6 tunnel              ok
  8: datapath - ping over gre tunnel                 ok
  9: datapath - ping over erspan v1 tunnel           ok
 10: datapath - ping over erspan v2 tunnel           ok
 11: datapath - ping over ip6erspan v1 tunnel        ok
 12: datapath - ping over ip6erspan v2 tunnel        ok
 13: datapath - ping over geneve tunnel              ok
 14: datapath - ping over geneve6 tunnel             ok
 15: datapath - clone action                         ok
 16: datapath - mpls actions                         ok
 17: datapath - basic truncate action                ok

conntrack

 18: conntrack - controller                          ok
 19: conntrack - force commit                        ok
 20: conntrack - ct flush by 5-tuple                 ok
 21: conntrack - IPv4 ping                           ok
 22: conntrack - get_nconns and get/set_maxconns     ok
 23: conntrack - IPv6 ping                           ok
 24: conntrack - preserve registers                  ok
 25: conntrack - invalid                             ok
 26: conntrack - zones                               ok
 27: conntrack - zones from field                    ok
 28: conntrack - multiple bridges                    ok
 29: conntrack - multiple zones                      ok
 30: conntrack - multiple namespaces, internal ports skipped (system-afxdp-traffic.at:1298)
 31: conntrack - ct_mark                             ok
 32: conntrack - ct_mark bit-fiddling                ok

system-ovn

 36: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT ok
 37: ovn -- 2 LRs connected via LS, gateway router, easy SNAT ok
 38: ovn -- multiple gateway routers, SNAT and DNAT  ok
 39: ovn -- load-balancing                           ok
 40: ovn -- load-balancing - same subnet.            ok
 41: ovn -- load balancing in gateway router         ok
 42: ovn -- multiple gateway routers, load-balancing ok
 43: ovn -- load balancing in router with gateway router port ok
 44: ovn -- DNAT and SNAT on distributed router - N/S ok
 45: ovn -- DNAT and SNAT on distributed router - E/W ok

---
v1->v2:
- add a list to maintain unused umem elements
- remove copy from rx umem to ovs internal buffer
- use hugetlb to reduce misses (not much difference)
- use pmd mode netdev in OVS (huge performance improvement)
- remove malloc dp_packet, instead put dp_packet in umem

v2->v3:
- rebase on the OVS master, 7ab4b0653784
  ("configure: Check for more specific function to pull in pthread library.")
- remove the dependency on libbpf and dpif-bpf;
  instead, use the built-in XDP_ATTACH feature.
- data structure optimizations for better performance, see [1]
- more test cases support

William Tu (3):
  netdev-afxdp: add new netdev type for AF_XDP
  tests: add AF_XDP netdev test cases.
  FIXME: work around the failed cases.

 acinclude.m4                    |   13 +
 configure.ac                    |    1 +
 lib/automake.mk                 |    6 +-
 lib/dp-packet.c                 |   20 +
 lib/dp-packet.h                 |   29 +-
 lib/dpif-netdev.c               |    2 +-
 lib/netdev-afxdp.c              |  703 ++++++++++++++++++
 lib/netdev-afxdp.h              |   41 ++
 lib/netdev-linux.c              |   72 +-
 lib/netdev-provider.h           |    1 +
 lib/netdev.c                    |    1 +
 lib/xdpsock.c                   |  171 +++++
 lib/xdpsock.h                   |  144 ++++
 tests/automake.mk               |   17 +
 tests/system-afxdp-macros.at    |  155 ++++
 tests/system-afxdp-testsuite.at |   26 +
 tests/system-afxdp-traffic.at   | 1541 +++++++++++++++++++++++++++++++++++++++
 17 files changed, 2937 insertions(+), 6 deletions(-)
 create mode 100644 lib/netdev-afxdp.c
 create mode 100644 lib/netdev-afxdp.h
 create mode 100644 lib/xdpsock.c
 create mode 100644 lib/xdpsock.h
 create mode 100644 tests/system-afxdp-macros.at
 create mode 100644 tests/system-afxdp-testsuite.at
 create mode 100644 tests/system-afxdp-traffic.at

--
2.7.4
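
As an illustration of the v1->v2 item "add a list to maintain unused umem elements", the sketch below shows one possible shape for such a list: a fixed-size LIFO stack of free umem frame offsets. It is not the data structure used in lib/xdpsock.c. The names umem_pool, umem_pool_init(), umem_pool_get(), and umem_pool_put(), as well as NUM_FRAMES and FRAME_SIZE, are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_FRAMES 8      /* Hypothetical number of umem frames. */
#define FRAME_SIZE 2048   /* Hypothetical frame size in bytes. */

/* LIFO stack of umem frame offsets that are currently unused. */
struct umem_pool {
    uint64_t free_addrs[NUM_FRAMES];
    size_t n_free;
};

/* Marks every frame as unused; frame i lives at offset i * FRAME_SIZE. */
void
umem_pool_init(struct umem_pool *pool)
{
    for (size_t i = 0; i < NUM_FRAMES; i++) {
        pool->free_addrs[i] = (uint64_t) i * FRAME_SIZE;
    }
    pool->n_free = NUM_FRAMES;
}

/* Takes one unused frame.  Returns 0 and stores its offset in '*addr',
 * or returns -1 if no unused frame is left. */
int
umem_pool_get(struct umem_pool *pool, uint64_t *addr)
{
    if (pool->n_free == 0) {
        return -1;
    }
    *addr = pool->free_addrs[--pool->n_free];
    return 0;
}

/* Gives a frame back once the kernel or the datapath is done with it. */
void
umem_pool_put(struct umem_pool *pool, uint64_t addr)
{
    assert(pool->n_free < NUM_FRAMES);
    assert(addr % FRAME_SIZE == 0);
    pool->free_addrs[pool->n_free++] = addr;
}
```

A LIFO order is chosen here because the most recently returned frame is the most likely to still be cache-warm. A real implementation would also need locking or per-thread pools when several pmd threads share one umem.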