Re: handle pool of elements in eBPF maps


Yonghong Song
 

On Wed, Feb 28, 2018 at 2:24 PM, Mauricio Vasquez via iovisor-dev
<iovisor-dev@...> wrote:
Hello Teng,

(Adding the list back, as it was dropped in the last message.)

On 02/28/2018 02:37 AM, Teng Qin wrote:



On Tue, Feb 27, 2018 at 12:27 PM, Mauricio Vasquez via iovisor-dev
<iovisor-dev@...> wrote:

Dear All,

We know from our experience implementing network functions in eBPF that
some services need to keep a pool of elements, for example the addresses
and ports in a NAT. So far we haven't found a way to do this entirely in
eBPF. We have implemented some workarounds as described in [1] (for
example, an array map plus a counter), and for some applications we have
moved this logic into user space; however, none of these solutions
fulfills our requirements.

We want to open a discussion about a possible extension to eBPF maps. We
think the right way to go is to have a map that supports push and pop
methods.

I think we could (kind of) simulate a stack-like data structure now, by
using a normal BPF array as storage, along with another global variable
(i.e., an array of size 1) to keep track of the stack top index, and
inc / dec it on push / pop. There could be concurrency issues, so maybe
use the per-CPU versions of those?
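
A minimal sketch of the array-plus-top-index idea described above, written
as plain user-space C rather than as an eBPF program. The names
pool_stack, pool_push, pool_pop and POOL_CAPACITY are hypothetical and only
model the two maps; they are not part of any BPF API. The sketch is
deliberately not thread-safe, which is exactly the concurrency issue
mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capacity; stands in for the array map's max_entries. */
#define POOL_CAPACITY 4

/* Models the two maps: a value array plus a one-slot "top" counter. */
struct pool_stack {
    uint32_t slots[POOL_CAPACITY]; /* the normal BPF array used as storage */
    uint32_t top;                  /* the size-1 array holding the top index */
};

/* Push: store at the current top, then increment the index. */
static bool pool_push(struct pool_stack *s, uint32_t value)
{
    if (s->top >= POOL_CAPACITY)
        return false;              /* pool is full */
    s->slots[s->top++] = value;
    return true;
}

/* Pop: decrement the index, then read the element it now points at. */
static bool pool_pop(struct pool_stack *s, uint32_t *value)
{
    if (s->top == 0)
        return false;              /* pool is empty */
    s->top--;
    assert(s->top < POOL_CAPACITY); /* index stays within the array */
    *value = s->slots[s->top];
    return true;
}
```

The read of the index and its update are separate steps here, so two
callers running at the same time could claim the same slot. That is the
race a shared (non per-CPU) version would have to deal with.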


Unfortunately per-CPU maps would not work: the set of elements has to be
shared across all CPUs.
We could think about using a synchronization primitive to avoid potential
problems, but that synchronization would also have to be available from
user space, because in our use case the eBPF programs are the consumers
while an application in user space is the producer.


I agree it feels complicated and error-prone; a native stack / queue map
type would definitely make such use cases nicer.


If there is consensus about this map, we could propose an implementation.
If a particular map (e.g., LRU map) serves your general need, and you only
need two more map operations (e.g., pop and push), you can just add two
new operations for that map. If this is only available from BPF programs,
you do not even need to introduce a subcommand.

I think if you can prototype this, it will be great for people to see
your needs and provide suggestions.


/Mauricio.




Any thoughts on this?

Thanks,

Mauricio

[1]
https://lists.iovisor.org/pipermail/iovisor-dev/2017-January/000614.html

_______________________________________________
iovisor-dev mailing list
iovisor-dev@...
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


