On 08/08/2018 09:18 PM, Dan Siemon wrote:
I listened in on the IOVisor call today but wasn't sure if user
questions and use cases were appropriate there, so I started writing
this instead. There are two things I was going to discuss:
We use clsact and eBPF extensively (https://www.preseem.com).
I'm hoping someone can point me in the right direction... we track
metrics per IP address (latency, loss, others) via an eBPF program and
also manage traffic via HTB and FQ-CoDel. A given end customer can have
several IP addresses assigned, which all need to go through the same HTB
class (and attached FQ-CoDel instance) to enforce the plan rate.
At present, we extract the metrics in clsact, which is pre-qdisc. The
problem with this is that the byte and packet counts are too high,
because packets can be dropped in the qdiscs. It also skews the
loss and latency calculations a bit.
Is there a way to hook post-qdisc? I looked a bit at XDP, but it seems
that it is Rx-only for now?
There's a tracepoint right before netdev_start_xmit(), called
trace_net_dev_start_xmit(). So you could combine sch_clsact egress with
cls_bpf and a BPF prog on that tracepoint, right before the skb is handed
to the driver. They could share a map, for example for the tuple-to-counters
mapping, so you would still be able to do the major work in cls_bpf
outside the qdisc lock.
The Cilium BPF docs are great, but one thing I don't have a good handle
on is concurrent access to map data. Taking the hash map as an example,
does the map update function need to be called within the eBPF program
for each update to make it safe against concurrent access via the
userspace bpf syscall? At present we do some map data updates just using
the pointer from the lookup function.
The latter works with e.g. counters using BPF_XADD instructions, or by
switching to a per-CPU map for counters to avoid the atomic op altogether.
For a single (non-per-CPU) map with non-XADD updates, you might need to
use the map update function instead to avoid races.