Re: [RFC][Proposal] BPF Control MAP

Alexei Starovoitov

On Tue, Mar 19, 2019 at 06:54:08PM +0000, Saeed Mahameed wrote:

This is exactly the purpose of map_{create/delete}: to define what the
key/value format will be (it is not going to be ifindex for all
control maps). To make it clear, the key doesn't have to be an ifindex
at all; it depends on the map_attr.ctrl_type which the user requests on
map creation, so different layouts are already supported by this

examples of different map_attr.ctrl_type:

fd = create_map(BPF_MAP_TYPE_CONTROL, map_attr.ctrl_type = XDP_ATTR)
// Key layout == ifindex, value format if_xdp_attributes

fd = create_map(BPF_MAP_TYPE_CONTROL, map_attr.ctrl_type =
// Key layout == prog_fd, value format struct bpf_prog_stats

fd = create_map(BPF_MAP_TYPE_CONTROL, map_attr.ctrl_type =
// Key layout == socket_fd, value layout struct bpf_sock_attr

Extending this can be done in any Linux BPF subsystem by introducing
new map_attr.ctrl_types and new key/value layouts for that control type.
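
To make the ctrl_type idea concrete, here is a minimal user-space
sketch of a table that maps each control type to its key and value
layout. Everything in it is hypothetical: only XDP_ATTR is named in
the mail, the other enum names, the example_* structs and their fields
are invented for illustration, and none of this is existing kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control types. Only XDP_ATTR is named in the mail;
 * the other two names are made up for illustration. */
enum example_ctrl_type {
	EXAMPLE_CTRL_XDP_ATTR,   /* key: ifindex */
	EXAMPLE_CTRL_PROG_STATS, /* key: prog_fd */
	EXAMPLE_CTRL_SOCK_ATTR,  /* key: socket_fd */
	EXAMPLE_CTRL_TYPE_MAX,
};

/* Placeholder value layouts; the real fields are not given in the thread. */
struct example_xdp_attributes { uint32_t flags; uint32_t prog_id; };
struct example_prog_stats { uint64_t run_cnt; uint64_t run_time_ns; };
struct example_sock_attr { uint32_t mark; uint32_t priority; };

/* One entry per control type: what the key is and how big key/value are. */
struct example_ctrl_layout {
	const char *key_name;
	size_t key_size;
	size_t value_size;
};

static const struct example_ctrl_layout example_layouts[EXAMPLE_CTRL_TYPE_MAX] = {
	[EXAMPLE_CTRL_XDP_ATTR] = {
		"ifindex", sizeof(uint32_t), sizeof(struct example_xdp_attributes)
	},
	[EXAMPLE_CTRL_PROG_STATS] = {
		"prog_fd", sizeof(int), sizeof(struct example_prog_stats)
	},
	[EXAMPLE_CTRL_SOCK_ATTR] = {
		"socket_fd", sizeof(int), sizeof(struct example_sock_attr)
	},
};

/* Return the layout for a control type, or NULL if the type is unknown. */
const struct example_ctrl_layout *example_ctrl_layout(int type)
{
	if (type < 0 || type >= EXAMPLE_CTRL_TYPE_MAX)
		return NULL;
	return &example_layouts[type];
}
```

Adding a new control type in this sketch means adding one enum value
and one table entry; a lookup with an unknown type returns NULL.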

On map creation we will attach the BTF format of the (key, value) pair
to the map created for that user.

In your examples above, does the netdev with the corresponding ifindex
exist before the map is created?
Does prog_fd exist? And the socket?
In all cases, yes, they do. Hence creation of a 'map' (even a pseudo map)
is an unnecessary step.

User space providing BTF for input and output makes little sense to me.
What is the kernel supposed to do with it?

In case of XDP stats (that could be different between drivers)
the driver would provide a BTF to describe the stats it's collecting.
So it's kernel supplied BTF instead of user.

I think we need something else here. Using BTF to describe
output stats is nice, but using BTF to describe input query is
problematic, since the user cannot know beforehand what the kernel can
and cannot accept.
imo input should be stable uapi with fixed constants whereas
stats-like output can be BTF based and vary from driver to driver
and from one nic version to another.

Well, we can decide to use static stable uapi, and still use this
special map to leverage the map API as described in this doc.

But we can also allow dynamic value layouts, provided any extension is
made only at the end of a value structure, and on lookup we copy only
the size that the user already recognizes? Or we can assume/force the
user to use the map_attributes to figure out format layouts and sizes.
Either way we guarantee backward compatibility at the kernel level by
keeping the old format the same and allowing extensions only at the
end of the value structures.
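
The "extend only at the end, copy only what the user recognizes" rule
can be sketched as follows. This is a user-space illustration, not
kernel code: the two struct versions and ctrl_copy_value() are
hypothetical names, and plain memcpy() stands in for whatever the
kernel would use to copy to user memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical value layout as an old user program knows it. */
struct example_attr_v1 {
	uint32_t flags;
	uint32_t prog_id;
};

/* Hypothetical newer layout: same prefix, one field appended at the end. */
struct example_attr_v2 {
	uint32_t flags;
	uint32_t prog_id;
	uint64_t rx_dropped; /* new field, appended only */
};

/*
 * Copy at most user_size bytes of the kernel-side value into user_buf.
 * If the user buffer is larger than the kernel value, the tail is
 * zero-filled. Returns the number of bytes of real data copied.
 */
size_t ctrl_copy_value(void *user_buf, size_t user_size,
		       const void *kern_val, size_t kern_size)
{
	size_t n = user_size < kern_size ? user_size : kern_size;

	memcpy(user_buf, kern_val, n);
	if (user_size > n)
		memset((char *)user_buf + n, 0, user_size - n);
	return n;
}
```

An old program passing sizeof(struct example_attr_v1) gets back exactly
the fields it knows, even when the kernel side already uses the v2
layout, because the v1 fields keep their offsets.
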
For input queries - yes. See how 'union bpf_attr' works.
It can accommodate any number of new commands, each with its own
arguments, to query XDP stats.
For output the driver can supply BTF back to user space along with
a blob of data.
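
A rough sketch of that split, assuming a command-tagged union for
input in the style of 'union bpf_attr' and a reply that carries a BTF
id plus an opaque blob. All example_* names, the command numbers and
the fake stats array are hypothetical and exist only for illustration;
this is not an existing kernel interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical commands: the stable, fixed-constant input side. */
enum example_cmd {
	EXAMPLE_QUERY_XDP_STATS,
	EXAMPLE_QUERY_PROG_STATS,
};

/* One member per command; new commands add new members. */
union example_attr {
	struct { uint32_t ifindex; uint32_t flags; } xdp_stats;
	struct { uint32_t prog_fd; } prog_stats;
};

/* Output side: a blob plus the id of the BTF that describes it. */
struct example_reply {
	uint32_t btf_id;   /* identifies driver-supplied BTF for the blob */
	uint32_t data_len; /* size of the blob in bytes */
	const void *data;  /* driver-specific stats, opaque without the BTF */
};

/* Stand-in for stats a driver would collect. */
static const uint64_t example_fake_stats[2] = { 100, 3 };

/* Dispatch on the command; returns 0 on success, -1 on error. */
int example_query(int cmd, const union example_attr *attr,
		  struct example_reply *out)
{
	switch (cmd) {
	case EXAMPLE_QUERY_XDP_STATS:
		if (attr->xdp_stats.ifindex == 0)
			return -1; /* treat ifindex 0 as invalid here */
		out->btf_id = 1; /* made-up id */
		out->data = example_fake_stats;
		out->data_len = sizeof(example_fake_stats);
		return 0;
	default:
		return -1; /* unknown or unimplemented command */
	}
}
```

The input stays a fixed layout per command, while the meaning of the
returned blob can differ per driver because the BTF travels with it.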

If I got your intent correctly, you want only BPF_MAP_TYPE_CONTROL to
be processed by the generic bpf layer and everything else to be done in
the driver? All commands, values, input/output to be driver specific?
