Forcing compiler to "respect" verifier needs


Gianluca Borello <g.borello@...>
 

Hello,

As the complexity of my BPF programs increases, I am finding myself
spending a lot of time fighting with the verifier due to the way the
compiler generates optimized BPF code. I apologize in advance if the
question is naive; it might very well be caused by my poor knowledge
of LLVM.

For example, this is a very common scenario where I get in trouble.
I'm trying to push a variable-sized frame through perf:


#define SCRATCH_SIZE 0x3fff
#define MAX_STR_LEN 0x3ff

void write_perf_pdu(void *ctx)
{
	/* scratch_map is a per-cpu array used as an extended
	 * stack space
	 */
	int id = 0;
	char *scratchp = bpf_map_lookup_elem(&scratch_map, &id);
	if (!scratchp)
		return;

	int len = 0;
	char *p = scratchp;

	/* Write first frame element (variable-size string) */
	int res = bpf_probe_read_str(p, MAX_STR_LEN, NULL);
	if (res < 0)
		return;
	/* Check boundaries (needed to write the second
	 * frame element at p[res])
	 */
	res &= MAX_STR_LEN;
	len += res;
	p += res;
	/* Write the second frame element at scratchp[res] */
	*(int *) p = res;
	len += sizeof(int);

	/* Do some other stuff that will cause the compiler
	 * to spill "len" onto the stack. It really doesn't
	 * take much, just a "switch" with a few cases is often
	 * enough.
	 */
	do_some_work();

	/* This is needed because "len" was spilled so we need to
	 * re-tell the verifier that the range is safe. However,
	 * this instruction will never be emitted because the
	 * compiler already knows that len is < SCRATCH_SIZE
	 * (because of the previous &= MAX_STR_LEN), so the verifier
	 * will refuse to load the program.
	 */
	len &= SCRATCH_SIZE;
	++len;
	bpf_perf_event_output(ctx, &perf_map,
			      bpf_get_smp_processor_id(), scratchp,
			      len);
}
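
To make the compiler's reasoning concrete, here is a small host-side
sketch (not BPF code) that checks the arithmetic. It assumes, as in
write_perf_pdu() above, that len equals res + sizeof(int) with res
already masked by MAX_STR_LEN when the second mask is reached. The
function name mask_is_noop is made up for this illustration.

```c
#define SCRATCH_SIZE 0x3fff
#define MAX_STR_LEN 0x3ff

/* Returns 1 if "len & SCRATCH_SIZE" leaves len unchanged for every
 * value len can hold right before the second mask, 0 otherwise.
 */
int mask_is_noop(void)
{
	for (int res = 0; res <= MAX_STR_LEN; res++) {
		/* len = (res & MAX_STR_LEN) + sizeof(int) in the original */
		int len = res + (int)sizeof(int);

		if ((len & SCRATCH_SIZE) != len)
			return 0;
	}
	return 1;
}
```

Since the mask never changes the value, an optimizer that tracks
value ranges is free to delete it, which is what seems to happen here.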


In case my commentary was not clear enough, here is a commented
version of the generated BPF code:


13: (85) call bpf_probe_read_str#45

// res &= MAX_STR_LEN;
// verifier knows r0 is now in the proper range
17: (57) r0 &= 1023

// res/len is spilled onto the stack, and the verifier forgets the
// range information for the spilled copy
18: (7b) *(u64 *)(r10 -80) = r0

// *(int *) p = res;
// before recycling r0, the range-checked r0 is used for a variable
// access inside the map
19: (bf) r1 = r8
20: (0f) r1 += r0
21: (63) *(u32 *)(r1 +0) = r0

...
// do_some_work() is called
...

// len is restored from the stack
149: (79) r6 = *(u64 *)(r10 -80)

// len += sizeof(int);
// ++len;
// However, len &= SCRATCH_SIZE is dropped because it's a no-op from
// the compiler's point of view
150: (07) r6 += 5
...

// as a result, the verifier rejects the helper call because the
// length in r5 is not range checked
157: (bf) r5 = r6
158: (85) call bpf_perf_event_output#25
R5 min value is negative, either use unsigned or 'var &= const'


Notice that, if I manually edit the BPF code and add an explicit check
on r5 right before instruction 158, it will just work fine. But the
compiler decides, for optimization reasons, to not do that.

This happens in several other circumstances to the point where it's my
biggest struggle with BPF at the moment, hence this email.

The solutions I see would be:

- Manually reorganize the code to make sure do_some_work() is called
before or after the range-checked block: this was initially my
strategy, but as my needs are evolving I'm more frequently generating
automatic BPF code based on some metadata description of my tracing
needs, so I can't really tweak the code too much.

- Allow the verifier to track spilled SCALAR_VALUE registers: I think
this was discussed a while ago and the consensus was it would be a
large patch and would also make the verifier state tracking explode?
Maybe things changed in the last few months?

- Tell llvm to always emit an instruction via some asm("") trick or
similar? This is where my poor knowledge of llvm penalizes me, but by
quickly looking in the bcc code base I couldn't find any example of
such usage.

- Take the generated BPF program and patch it by manually emitting
range check instructions before helper calls prior to loading it with
bpf()? No thanks :)
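
For the asm("") option, a minimal sketch of what it might look like,
assuming GCC/Clang extended inline asm is usable for the BPF target.
The macro name FORGET_RANGE and the function bounded_len are made up
for this illustration, and whether the BPF backend then keeps the
masked value in the register passed to the helper is not guaranteed.

```c
#define SCRATCH_SIZE 0x3fff
#define MAX_STR_LEN 0x3ff

/* Hypothetical macro: an empty asm statement that claims to read and
 * write "var", so the optimizer can no longer assume anything about
 * its value afterwards and cannot drop a later mask as redundant.
 */
#define FORGET_RANGE(var) __asm__ volatile("" : "+r"(var))

/* Mirrors the length computation at the end of write_perf_pdu() and
 * returns the value that would be passed to the helper.
 */
int bounded_len(int res)
{
	int len = 0;

	res &= MAX_STR_LEN;
	len += res;
	len += (int)sizeof(int);

	/* Placed right before the mask, after any spill/reload */
	FORGET_RANGE(len);
	len &= SCRATCH_SIZE;
	++len;
	return len;
}
```

The empty asm emits no instruction of its own; it only hides the
value from the optimizer, so the result is unchanged while the mask
should survive in the generated code.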

Thanks for reading
