On Thu, 21 Sep 2017 11:56:55 -0700, Alexei Starovoitov wrote:
On Wed, Sep 20, 2017 at 12:20:40AM +0100, Jiong Wang via iovisor-dev wrote:
On 18/09/2017 22:29, Daniel Borkmann wrote: I don't think we need to gate 32bit alu generation with a flag.
On 09/18/2017 10:47 PM, Jiong Wang wrote: Hi Daniel,
Hi,

Great to see work in this direction! Can we also enable to use / emit
Currently, the LLVM eBPF backend always generates code in 64-bit mode,
which can cause trouble when JITing to 32-bit targets.
For example, it is quite common for an XDP eBPF program to access
fields through base + offset, for which eBPF will by default generate
BPF_ALU64 instructions for the address formation. Later, when JITing to
32-bit hardware, these have to be expanded into 32-bit ALU sequences even
though the address is 32-bit, so the high bits are not significant.
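To make the cost concrete, here is a minimal sketch (plain Python, not taken from the patch set) of what a 32-bit target has to do for one 64-bit add, next to the single operation that suffices when the address fits in 32 bits. The function names and the lo/hi register-pair representation are assumptions for illustration only; a real JIT emits target-specific add and add-with-carry instructions.

```python
MASK32 = 0xFFFFFFFF

def add64_on_32bit_target(dst_lo, dst_hi, src_lo, src_hi):
    """Emulate a 64-bit add with two 32-bit adds and a carry (hypothetical helper)."""
    lo = dst_lo + src_lo                 # low-word add
    carry = lo >> 32                     # carry out of the low word
    hi = (dst_hi + src_hi + carry) & MASK32  # high-word add with carry
    return lo & MASK32, hi

def add32(dst_lo, src_lo):
    """The single 32-bit add that is enough when the high bits are not significant."""
    return (dst_lo + src_lo) & MASK32

# base + offset where the low word overflows: the 64-bit form needs the carry step.
pair = add64_on_32bit_target(0xFFFFFFFF, 0x0, 0x1, 0x0)
single = add32(0x1000, 0x10)
```

With a 32-bit address space the high-word add and the carry propagation are pure overhead, which is the motivation given above.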
While a complete 32-bit mode implementation may need a new ABI
(something like -target-abi=ilp32), this patch set first adds some initial
code so we can construct 32-bit eBPF tests through hand-written assembly.
A new 32-bit register set is introduced; its register names carry a "w"
prefix, and the LLVM assembler will encode statements like "w1 += w2" into
the following opcode:

  BPF_ADD | BPF_X | BPF_ALU

That is, BPF_ALU will be used instead of BPF_ALU64.
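As a rough illustration of the encoding difference, the sketch below packs "w1 += w2" and "r1 += r2" into the 8-byte eBPF instruction format. The constant values are the ones from the kernel's BPF uapi headers; the helper name encode_add_reg is hypothetical, and the byte layout assumes a little-endian target (dst register in the low nibble of the second byte).

```python
import struct

# Opcode fields as defined in the kernel's BPF uapi headers.
BPF_ALU = 0x04    # 32-bit ALU instruction class
BPF_ALU64 = 0x07  # 64-bit ALU instruction class
BPF_X = 0x08      # source operand is a register
BPF_ADD = 0x00    # add operation

def encode_add_reg(dst, src, is_32bit):
    """Encode 'dst += src' as one 8-byte instruction (hypothetical helper)."""
    opcode = BPF_ADD | BPF_X | (BPF_ALU if is_32bit else BPF_ALU64)
    regs = (src << 4) | dst  # little-endian layout assumed
    # opcode:u8, regs:u8, off:s16, imm:s32
    return struct.pack("<BBhi", opcode, regs, 0, 0)

w_insn = encode_add_reg(1, 2, is_32bit=True)   # w1 += w2
r_insn = encode_add_reg(1, 2, is_32bit=False)  # r1 += r2
print(w_insn.hex())  # 0c21000000000000
print(r_insn.hex())  # 0f21000000000000
```

Only the instruction class bits in the first byte differ between the two forms; the register numbers are the same, which is why "w1" and "r1" name the same underlying register.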
NOTE: currently you can only use "w" registers with ALU statements, not
with others such as branches, as they don't have a different encoding for
32-bit.
all the 32bit BPF_ALU instructions whenever possible for the currently
available bpf targets while at it (which only use BPF_ALU64 right now)?
Thanks for the feedback.
I think we could also enable the use of all the 32bit BPF_ALU under the
currently available bpf targets. As we now have 32bit register set support,
we could mark i32 as a legal type to prevent it from being promoted to i64,
then hook it up with 32bit ALU patterns. I will look into this.
Though the interpreter and JITs have supported 32-bit since day one, the
verifier has never seen such programs before, so some valid programs may get
rejected. After some time passes and we're sure that all progs
still work fine when they're optimized with 32-bit alu, we can flip
the switch in llvm and make it default.
Thinking about next steps - do we expect the 32b operations to clear the
upper halves of the registers? The interpreter does it, and so does
x86. I don't think we can load 32bit-only programs on 64bit hosts, so
we would need some form of data flow analysis in the kernel to prune
the zeroing for 32bit offload targets. Is that correct?
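The zero-extension behaviour in question can be modelled in a few lines. This is only a sketch of the semantics described above (a 32-bit ALU op writes the low word and clears the upper half), with registers modelled as Python integers; alu32_add and alu64_add are hypothetical names, not kernel functions.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def alu64_add(dst, src):
    """64-bit add: the full register participates."""
    return (dst + src) & MASK64

def alu32_add(dst, src):
    """32-bit add: operates on the low words and zeroes the upper half of dst."""
    return ((dst & MASK32) + (src & MASK32)) & MASK32

r1 = 0xDEADBEEF00000001  # stale data in the upper half
r2 = 2

after_alu32 = alu32_add(r1, r2)
after_alu64 = alu64_add(r1, r2)
print(hex(after_alu32))  # 0x3
print(hex(after_alu64))  # 0xdeadbeef00000003
```

On a 32-bit offload target the explicit clearing of the upper half is extra work unless an analysis can show that nothing later reads those bits, which is the data flow question raised above.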