math between pkt pointer and register with unbounded min value is not allowed #verifier
Simon
I'm playing with bcc to prototype an UDP load balancer. I'm facing an issue that I didn't succeed to understand... In my code I tried to validate my UDP packet using code like this : struct udphdr *udp; udp = iph + 1; if (udp + 1 > data_end) return XDP_DROP; __u16 udp_len = bpf_ntohs(udp->len); //__u16 udp_len = 8; if (udp_len < 8) return XDP_DROP; if (udp_len > 512) // TODO use a more approriate max value return XDP_DROP; if ((void *) udp + udp_len > data_end) return XDP_DROP;And the verifier does not like it .. 28: (71) r2 = *(u8 *)(r7 +23) 29: (b7) r0 = 2 30: (55) if r2 != 0x11 goto pc+334 R0=inv2 R1=pkt_end(id=0,off=0,imm=0) R2=inv17 R3=inv5 R6=ctx(id=0,off=0,imm=0) R7=pkt(id=0,off=0,r=34,imm=0) R8=pkt(id=0,off=34,r=34,imm=0) R9=pkt(id=0,off=14,r=34,imm=0) R10=fp0,call_-1 31: (bf) r2 = r8 32: (07) r2 += 8 33: (b7) r0 = 1 34: (2d) if r2 > r1 goto pc+330 R0=inv1 R1=pkt_end(id=0,off=0,imm=0) R2=pkt(id=0,off=42,r=42,imm=0) R3=inv5 R6=ctx(id=0,off=0,imm=0) R7=pkt(id=0,off=0,r=42,imm=0) R8=pkt(id=0,off=34,r=42,imm=0) R9=pkt(id=0,off=14,r=42,imm=0) R10=fp0,call_-1 35: (69) r3 = *(u16 *)(r7 +38) 36: (dc) r3 = be16 r3 37: (bf) r2 = r3 38: (07) r2 += -8 39: (57) r2 &= 65535 40: (b7) r0 = 1 41: (25) if r2 > 0x1f8 goto pc+323 R0=inv1 R1=pkt_end(id=0,off=0,imm=0) R2=inv(id=0,umax_value=504,var_off=(0x0; 0x1ff)) R3=inv(id=0) R6=ctx(id=0,off=0,imm=0) R7=pkt(id=0,off=0,r=42,imm=0) R8=pkt(id=0,off=34,r=42,imm=0) R9=pkt(id=0,off=14,r=42,imm=0) R10=fp0,call_-1 42: (bf) r2 = r7 43: (0f) r2 += r3 math between pkt pointer and register with unbounded min value is not allowed I'm pretty sure the issue is about (I'm using python-bpfcc 0.8.0-4 from debian sid with a 4.19.12 kernel)
|
|
Yonghong Song
On Wed, Mar 6, 2019 at 7:08 AM <contact@...> wrote:
This is caused by compiler optimizations. r3 get the value from memory, its value could be any one as permitted by the type. 37: (bf) r2 = r3test is done by r2. We indeed get better range for r2 (below: R2=inv(id=0,umax_value=504,var_off=(0x0; 0x1ff)) ) but r3 range is not tightened. R0=inv1 R1=pkt_end(id=0,off=0,imm=0) R2=inv(id=0,umax_value=504,var_off=(0x0; 0x1ff)) R3=inv(id=0) R6=ctx(id=0,off=0,imm=0) R7=pkt(id=0,off=0,r=42,imm=0) R8=pkt(id=0,off=34,r=42,imm=0) R9=pkt(id=0,off=14,r=42,imm=0) R10=fp0,call_-1Here, we use r3 to do arith and the verifier not happy. If we use the tightened old r2, it may work. The compiler does the right thing, just verifier is not advanced enough. Yes, you will need some source workaround. You could try below (untested): if (udp_len > 512) // TODO use a more approriate max value return XDP_DROP; + udp_len = udp_len & 0x1ff; if ((void *) udp + udp_len > data_end) return XDP_DROP;
|
|
Jiong Wang
Yonghong Song writes:
On Wed, Mar 6, 2019 at 7:08 AM <contact@...> wrote:I had run into similar issue when debugging some other rejection beforeThis is caused by compiler optimizations. JMP32 introduced when LLVM was generating similar sequences under defult 64-bit mode, but IIRC LLVM generates betweer sequences with -mattr=alu32, under which it will just use w3 (as the type should be optimized into 32-bit) for the comparison. So, I guess this testcase could have easier sequence for verifier under ALU32 mode. But for this case, BPF_END is used which doesn't have sub-register code-gen support inside LLVM for be16 and be32 at the moment (noticed this several days ago when doing some other benchmarking). If I have .i file, I could do a quick prototype to see if ALU32 could improve this. Regards, Jiong
|
|
Simon
35: (69) r3 = *(u16 *)(r7 +38)r3 get the value from memory, its value could be any one as permitted Does it mean that r3 is considered as be16 ? I do not understand why as I explicitly convert it in u16. This output language is a readable format of bpf bytecode, right ? Is there any documentation to lean/understand it ? The compiler does the right thing, just verifier is not advanced enough.Is it worthy to share this issue of verifier.c with bpf maintainers ? The compiler which is used here is clang which is called by bcc, right ? Yes, you will need some source workaround. You could try below (untested): I tested it and it seems to work. Thx a lot !! But that means I can not use the u16 max value ?
|
|
Yonghong Song
On Fri, Mar 8, 2019 at 9:22 AM Simon <contact@...> wrote:
The be16 is to convert r3 with big endian encoding. If the host system is big endian, it will do nothing. Otherwise, it will convert from little endian to big endian. Yes, there is no documentation. It intends to be self explanatory. I guess "be16" is special and may need some documentation. Otherwise assembly-style codes should be easy to understand. I am also a regular kernel/bpf reviewer. The bpf maintainers/community are aware of this limitation. As you mentioned, the verifier is already very complex. To implement complex tracking like described in this thread will make verifier even more complex, hence this is delayed. One of reason is that we have reasonable, although unpleasant, workarounds. Yes, it is compiled with clang. You can. I add that because you have a test to limit the range of the value to 511.
|
|
Simon
Hi Jiong, I didn't finished yet, but I have a reduced version compared to the one I written for bcc and I face the same issue, so this time I can get .i file easily. So the .i file is as attachment. The error is still The verifier output is pretty much the same : 27: (71) r4 = *(u8 *)(r1 +23) 28: (b7) r0 = 2 29: (55) if r4 != 0x11 goto pc+15 R0=inv2 R1=pkt(id=0,off=0,r=34,imm=0) R2=pkt_end(id=0,off=0,imm=0) R3=pkt(id=0,off=34,r=34,imm=0) R4=inv17 R5=inv5 R10=fp0,call_-1 30: (07) r3 += 8 31: (b7) r0 = 1 32: (2d) if r3 > r2 goto pc+12 R0=inv1 R1=pkt(id=0,off=0,r=42,imm=0) R2=pkt_end(id=0,off=0,imm=0) R3=pkt(id=0,off=42,r=42,imm=0) R4=inv17 R5=inv5 R10=fp0,call_-1 33: (69) r3 = *(u16 *)(r1 +38) 34: (dc) r3 = be16 r3 35: (bf) r4 = r3 36: (07) r4 += -8 37: (57) r4 &= 65535 38: (b7) r0 = 1 39: (25) if r4 > 0x1f8 goto pc+5 R0=inv1 R1=pkt(id=0,off=0,r=42,imm=0) R2=pkt_end(id=0,off=0,imm=0) R3=inv(id=0) R4=inv(id=0,umax_value=504,var_off=(0x0; 0x1ff)) R5=inv5 R10=fp0,call_-1 40: (0f) r1 += r3 math between pkt pointer and register with unbounded min value is not allowed I'm pretty sure the bcc version I used before linked statically clang/llvm v7.0. The funny part ... This modification about just adding some logs/printk makes this error disappear : https://github.com/sbernard31/udploadbalancer/commit/0145538c7b35e2a6bb92225f69a45f4bee120a6d All of those erifier errors make me a bit crazy (╥_╥) HTH Simon
|
|
Jiong Wang
On 27 Mar 2019, at 14:53, Simon <contact@...> wrote: Hi Simon, Thanks for the .i, I prototyped some byteswap code-gen change, but seems doesn’t help your issue which could narrow down to the following general code pattern: unsigned char cal(unsigned int a, unsigned char *b) { if (a < 8 || a > 512) return 0; return b[a]; } LLVM is doing some optimisation, instead of generating two separate comparison, a < 8 and a > 512, it is combining them because a negative value when casted into unsigned must be bigger than 504, so above code turned into unsigned char cal(unsigned int a, unsigned char *b) { unsigned tmp = a - 8; if (tmp > 504) return 0; return b[a]; } The consequence of such optimisation is new variable “tmp” is used for comparison And verifier now know “tmp”'s value range instead of the original “a” which is used later adding to a packet pointer. A unknown value range of “a” then caused the verifier rejection. So, I suspect any code using above c code pattern will likely be rejected. 1. combinable comparisons 2. the variable involved in the comparison used later in places requiring value range Regards, Jiong
|
|
Jiong Wang
On 27 Mar 2019, at 16:11, Jiong Wang via Lists.Iovisor.Org <jiong.wang=netronome.com@...> wrote:And in your code, after you insert those printk, they made the following two comparisons non-combinable any more, so udp_len is used for the comparison and got correct value range to pass the later pkt pointer addition check. if (udp_len < 8) { return XDP_DROP; } if (udp_len > 512) { return XDP_DROP; }
|
|
Simon
Thx a lot for your time Jiong. The more I played with bpf/xdp, the more I understand that the challenge is about making "optimized byte code" compliant for the verifier. How could I do this kind of checks my self ? I mean looking how llvm optimized my code ? (to be able to do same kind of analyses you do above?)
|
|
Jiong Wang
On 27 Mar 2019, at 16:43, Simon <contact@...> wrote:Just my humble opinion, I would recommend: 1. get used to verifier rejection information, for example: R0=inv1 R1=pkt(id=0,off=0,r=42,imm=0) R2=pkt_end(id=0,off=0,imm=0) R3=inv(id=0) R4=inv(id=0,umax_value=504,var_off=(0x0; 0x1ff)) R5=inv5 R10=fp0,call_-1 40: (0f) r1 += r3 math between pkt pointer and register with unbounded min value is not allowed It tells you the status of each registers at the rejection point, for example, now R3 is “inv”, meaning a scalar value (not a pointer), and is without value range, then r4 has value range, and maximum value is 504. 2. known what verifier will reject. Could refer to: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/verifier?id=473c5daa86ffe91e937856cc32b4faa61db2e3e3 those are unit examples of what will be rejected, and some of them are with meaningful test name or comments so could be easy to understand. Regards, Jiong
|
|
Yonghong Song
On Wed, Mar 27, 2019 at 10:17 AM Jiong Wang <jiong.wang@...> wrote:
If you use BPF constructor debug=16 flag, it will print out theOn 27 Mar 2019, at 16:43, Simon <contact@...> wrote:Just my humble opinion, I would recommend: register state for every insn if you are even more curious. To resolve this issue, llvm may need to do more: - prevent/undo optimization which may cause ultimate verifier rejections. - provide hints (e.g., through BTF) to verifier so verifier may selectively do some analysis or enable some tracking for the cases where BTF instructed to handle. For example, BTF may tell verifier two register have the same state at a particular point and verifier only needs to check these two registers with limited range and no others, etc. Regards,
|
|
Pablo Alvarez
Why is it required that llvm compile the BPF code with -O2? That seems to be part of what is causing these verifier problems...
toggle quoted messageShow quoted text
On 3/27/19 4:23 PM, Yonghong Song wrote:
On Wed, Mar 27, 2019 at 10:17 AM Jiong Wang <jiong.wang@...> wrote:If you use BPF constructor debug=16 flag, it will print out theOn 27 Mar 2019, at 16:43, Simon <contact@...> wrote:Just my humble opinion, I would recommend:
|
|
Yonghong Song
On Wed, Mar 27, 2019 at 1:48 PM Pablo Alvarez via Lists.Iovisor.Org
<palvarez=akamai.com@...> wrote: -O0 won't work as helper call will become an indirect call static void *(*bpf_map_lookup_elem)(void *map, void *key) = (void *) BPF_FUNC_map_lookup_elem; Another reason is below value tracking in verifier. The verifier does not handle value spill well. It kept track of map pointers, packet pointers, etc. But if a particular value is spilled, later on, when reload from stack, it will become unknown. For example, r1 = 10; /* verifier state: r1 = 10 */ *(r10 - 40) = r1; r2 = *(r10 - 40); /* verifier state: r2, unknown */ The reason is that keep tracking of values could increase verification time quite a bit. This is measured sometime back, but it may warrant to do some measurement again at this moment to see whether can relax this restriction. So practically -O0 won't work. -O1 may or may not work and most people do not use it. -O2 is preferred. Also for performance reasons, we want to make -O2 work as BPF JIT inside the kernel does not do any optimization, it is simple one insn each time translation.
|
|
Yonghong Song
Hi, Simon,
toggle quoted messageShow quoted text
Finally got some time to deep dive into the verifier for this issue. I filed an issue https://github.com/iovisor/bcc/issues/2463 with possible suggestion to add a new checksum calculation helper for xdp. I forgot what is your ultimate workaround for this issue. Could you comment on the issue? Thanks! Yonghong
On Thu, Mar 7, 2019 at 9:37 AM Y Song <ys114321@...> wrote:
|
|