[PATCH, BPF 4/5] BPF: Add 32-bit move patterns


Richard Henderson <rth@...>
 

Since all 32-bit operations zero-extend, the move operations are
immediately usable for loading unsigned constants and zero-extension.
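
For illustration, a small C sketch of the property being relied on here
(the function name and comments are mine, not part of the patch):

#include <stdint.h>

/* eBPF defines every 32-bit (BPF_ALU) result to be zero-extended into
 * the full 64-bit register, so a 32-bit "movw dst, src" computes
 * dst = (u32)src.  That makes one instruction serve both as a
 * zero-extension and as a loader for unsigned 32-bit constants. */
uint64_t zext_via_mov32(uint64_t src)
{
    /* Before this patch the backend lowered this AND as a shift pair
     * (dst <<= 32; dst >>= 32); with MOV_rw it selects one move. */
    return src & 0xffffffffu;
}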

Signed-off-by: Richard Henderson <rth@...>
---
lib/Target/BPF/BPFInstrInfo.td | 39 +++++++++++++++++++++++++++++++++------
1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/lib/Target/BPF/BPFInstrInfo.td b/lib/Target/BPF/BPFInstrInfo.td
index 5b9bba6..33481b9 100644
--- a/lib/Target/BPF/BPFInstrInfo.td
+++ b/lib/Target/BPF/BPFInstrInfo.td
@@ -53,6 +53,9 @@ def u64imm : Operand<i64> {
def i64immSExt32 : PatLeaf<(imm),
[{return isInt<32>(N->getSExtValue()); }]>;

+def i64immZExt32 : PatLeaf<(imm),
+ [{return isUInt<32>(N->getZExtValue()); }]>;
+
// Addressing modes.
def ADDRri : ComplexPattern<i64, 2, "SelectAddr", [], []>;
def FIri : ComplexPattern<i64, 2, "SelectFIAddr", [add, or], []>;
@@ -191,7 +194,21 @@ let isReMaterializable = 1, isAsCheapAsAMove = 1 in {
def MOV_rr
: F_COF<7 /* BPF_ALU64 */, 0xb /* BPF_MOV */, 1 /* BPF_X */,
(outs GPR:$dst), (ins GPR:$src),
- "mov\t$dst, $src", []> {
+ "mov\t$dst, $src",
+ [(set GPR:$dst, GPR:$src)]> {
+ bits<4> dst;
+ bits<4> src;
+ let BPFDst = dst;
+ let BPFSrc = src;
+ let BPFOff = 0;
+ let BPFImm = 0;
+ }
+
+ def MOV_rw
+ : F_COF<4 /* BPF_ALU */, 0xb /* BPF_MOV */, 1 /* BPF_X */,
+ (outs GPR:$dst), (ins GPR:$src),
+ "movw\t$dst, $src",
+ [(set GPR:$dst, (and GPR:$src, 0xffffffff))]> {
bits<4> dst;
bits<4> src;
let BPFDst = dst;
@@ -214,10 +231,24 @@ let isReMaterializable = 1, isAsCheapAsAMove = 1 in {
let BPFImm = imm;
}

+ def MOV_ru
+ : F_COF<4 /* BPF_ALU */, 0xb /* BPF_MOV */, 0 /* BPF_K */,
+ (outs GPR:$dst), (ins i64imm:$imm),
+ "movw\t$dst, $imm",
+ [(set GPR:$dst, (i64 i64immZExt32:$imm))]> {
+ bits<4> dst;
+ bits<32> imm;
+ let BPFDst = dst;
+ let BPFSrc = 0;
+ let BPFOff = 0;
+ let BPFImm = imm;
+ }
+
def LD_imm64
: F_CMS<0 /* BPF_LD */, 0 /* BPF_IMM */, 3 /* BPF_DW */,
(outs GPR:$dst), (ins u64imm:$imm),
- "ld_64\t$dst, $imm", []> {
+ "ld_64\t$dst, $imm",
+ [(set GPR:$dst, i64:$imm)]> {
bits<4> dst;
bits<64> imm;
let BPFDst = dst;
@@ -367,10 +398,6 @@ let usesCustomInserter = 1 in {
// load 64-bit global addr into register
def : Pat<(BPFWrapper tglobaladdr:$in), (LD_imm64 tglobaladdr:$in)>;

-// 0xffffFFFF doesn't fit into simm32, optimize common case
-def : Pat<(i64 (and (i64 GPR:$src), 0xffffFFFF)),
- (SRL_ri (SLL_ri (i64 GPR:$src), 32), 32)>;
-
// Calls
def : Pat<(BPFcall tglobaladdr:$dst), (JAL tglobaladdr:$dst)>;
def : Pat<(BPFcall imm:$dst), (JAL imm:$dst)>;
--
2.5.5
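
As a quick sketch of the immediate classification behind the
i64immSExt32 and i64immZExt32 PatLeafs above (this mirrors LLVM's
isInt<32>/isUInt<32> helpers but is not the LLVM code itself):

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors isInt<32>: does v survive a round trip through int32_t? */
static bool fits_simm32(int64_t v)  { return v == (int64_t)(int32_t)v; }

/* Mirrors isUInt<32>: does v survive a round trip through uint32_t? */
static bool fits_uimm32(uint64_t v) { return v == (uint64_t)(uint32_t)v; }

int main(void)
{
    /* 0xffffffff is the interesting case: as an int32_t it
     * sign-extends back to -1, so it fails the simm32 check but
     * passes the uimm32 one.  MOV_ru can therefore load it (and any
     * other u32 constant) in one instruction instead of a full
     * LD_imm64, and MOV_rw makes the removed shift-pair zext
     * pattern at the end of the diff unnecessary. */
    assert(!fits_simm32(0xffffffffLL));
    assert(fits_uimm32(0xffffffffULL));
    return 0;
}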


Alexei Starovoitov
 

On Wed, Jun 15, 2016 at 2:37 PM, Richard Henderson via iovisor-dev
<iovisor-dev@...> wrote:
[...]
+ def MOV_ru
+ : F_COF<4 /* BPF_ALU */, 0xb /* BPF_MOV */, 0 /* BPF_K */,
+ (outs GPR:$dst), (ins i64imm:$imm),
+ "movw\t$dst, $imm",
overall a great improvement,
but before we go too far with new asm syntax like 'movw',
do you want to revamp the whole thing?
imo the kernel verifier asm output is easier to read,
since it's C-like and by now more people understand it
than what the llvm bpf backend produces.
The canonical assembler style used here has an intel-vs-gnu
operand-order ambiguity which the C-style avoids.
The concern is that if we continue with the existing llvm bpf asm
syntax, gnu as and other assemblers will come along
and it will be too late to change.
Right now we have two asms: what the kernel verifier prints
and the llvm bpf asm output. The kernel side we cannot drastically
change anymore, but minor changes should be ok if they
help us converge on one common bpf asm syntax.
A few folks have requested support for bpf inline asm in C too.
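
To make the operand-order ambiguity concrete, a hypothetical rendering
of the same register-to-register move in the three styles under
discussion (register numbers invented for the example):

#include <stdio.h>

int main(void)
{
    int dst = 1, src = 2;
    /* All three lines describe the same operation, r1 <- r2. */
    printf("mov r%d, r%d\n", dst, src); /* intel order: dst first */
    printf("mov r%d, r%d\n", src, dst); /* gnu/AT&T order: src first */
    printf("r%d = r%d\n", dst, src);    /* verifier C-style: nothing
                                         * to disambiguate */
    return 0;
}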


Richard Henderson <rth@...>
 

On 06/15/2016 10:38 PM, Alexei Starovoitov wrote:
but before we go too far with new asm syntax like 'movw',
do you want to revamp the whole thing?
imo the kernel verifier asm output is easier to read,
since it's C-like and by now more people understand it
than what the llvm bpf backend produces.
While the kernel verifier format is ok for the 64-bit opcodes, I personally
can't stand what it does for the 32-bit opcodes, e.g.

(u32) r0 += (u32) r1

That's not only unnecessarily verbose, it's bad C.

Right now we have two asms: what the kernel verifier prints
and the llvm bpf asm output. The kernel side we cannot drastically
change anymore, but minor changes should be ok if they
help us converge on one common bpf asm syntax.
Why, is the kernel verifier output considered part of the abi now? That seems
like an odd position to take.

I also have a disassembler written for elfutils, so that it ties in nicely with
objdump. I started with tweaks to the kernel output (which can be seen in the
post I quoted earlier), but have since changed to use the llvm disassembly (and
have not yet re-posted).


r~


Alexei Starovoitov
 

On Thu, Jun 16, 2016 at 10:28 AM, Richard Henderson <rth@...> wrote:
On 06/15/2016 10:38 PM, Alexei Starovoitov wrote:
but before we go too far with new asm syntax like 'movw',
do you want to revamp the whole thing?
imo the kernel verifier asm output is easier to read,
since it's C-like and by now more people understand it
than what the llvm bpf backend produces.
While the kernel verifier format is ok for the 64-bit opcodes, I personally
can't stand what it does for the 32-bit opcodes, e.g.

(u32) r0 += (u32) r1

That's not only unnecessarily verbose, it's bad C.
sure. we didn't have llvm producing 32-bit opcodes,
so that part was a quick hack.
Your elfutils patch does:
r%1$d = (u32)r%1$d + (u32)r%2$d
and that totally makes sense to me.
I can change the kernel to do that as well.
What I'm suggesting is to teach the llvm backend to output
that elfutils style as well.
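
As an aside, that elfutils line is a POSIX positional printf format;
a minimal sketch of how it renders (register numbers invented):

#include <stdio.h>

int main(void)
{
    /* %1$d reuses argument 1 for both the destination and the first
     * source operand, so one format string prints both mentions. */
    printf("r%1$d = (u32)r%1$d + (u32)r%2$d\n", 0, 1);
    /* prints: r0 = (u32)r0 + (u32)r1 */
    return 0;
}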

Right now we have two asms: what the kernel verifier prints
and the llvm bpf asm output. The kernel side we cannot drastically
change anymore, but minor changes should be ok if they
help us converge on one common bpf asm syntax.
Why, is the kernel verifier output considered part of the abi now? That seems
like an odd position to take.

I also have a disassembler written for elfutils, so that it ties in nicely with
objdump. I started with tweaks to the kernel output (which can be seen in the
post I quoted earlier), but have since changed to use the llvm disassembly (and
have not yet re-posted).
what was posted in elfutils is the one I prefer.
We can clean up the kernel output for 32-bit like you did,
but we cannot change it drastically, since plenty of users have
already learned to understand the verifier output, which is
especially important in the case of program rejects.