-
Alexei Starovoitov authored
Compilers optimize conditional operators at will, but often bpf programmers want to force compilers to keep the same operator in asm as it's written in C. Introduce bpf_cmp_likely/unlikely(var1, conditional_op, var2) macros that can be used as: - if (seen >= 1000) + if (bpf_cmp_unlikely(seen, >=, 1000)) The macros take advantage of BPF assembly that is C like. The macros check the sign of variable 'seen' and emits either signed or unsigned compare. For example: int a; bpf_cmp_unlikely(a, >, 0) will be translated to 'if rX s> 0 goto' in BPF assembly. unsigned int a; bpf_cmp_unlikely(a, >, 0) will be translated to 'if rX > 0 goto' in BPF assembly. C type conversions coupled with comparison operator are tricky. int i = -1; unsigned int j = 1; if (i < j) // this is false. long i = -1; unsigned int j = 1; if (i < j) // this is true. Make sure BPF program is compiled with -Wsign-compare then the macros will catch the mistake. The macros check LHS (left hand side) only to figure out the sign of compare. 'if 0 < rX goto' is not allowed in the assembly, so the users have to use a variable on LHS anyway. The patch updates few tests to demonstrate the use of the macros. The macro allows to use BPF_JSET in C code, since LLVM doesn't generate it at present. For example: if (i & j) compiles into r0 &= r1; if r0 == 0 goto while if (bpf_cmp_unlikely(i, &, j)) compiles into if r0 & r1 goto Note that the macros has to be careful with RHS assembly predicate. Since: u64 __rhs = 1ull << 42; asm goto("if r0 < %[rhs] goto +1" :: [rhs] "ri" (__rhs)); LLVM will silently truncate 64-bit constant into s32 imm. Note that [lhs] "r"((short)LHS) the type cast is a workaround for LLVM issue. When LHS is exactly 32-bit LLVM emits redundant <<=32, >>=32 to zero upper 32-bits. When LHS is 64 or 16 or 8-bit variable there are no shifts. When LHS is 32-bit the (u64) cast doesn't help. Hence use (short) cast. It does _not_ truncate the variable before it's assigned to a register. Traditional likely()/unlikely() macros that use __builtin_expect(!!(x), 1 or 0) have no effect on these macros, hence macros implement the logic manually. bpf_cmp_unlikely() macro preserves compare operator as-is while bpf_cmp_likely() macro flips the compare. Consider two cases: A. for() { if (foo >= 10) { bar += foo; } other code; } B. for() { if (foo >= 10) break; other code; } It's ok to use either bpf_cmp_likely or bpf_cmp_unlikely macros in both cases, but consider that 'break' is effectively 'goto out_of_the_loop'. Hence it's better to use bpf_cmp_unlikely in the B case. While 'bar += foo' is better to keep as 'fallthrough' == likely code path in the A case. When it's written as: A. for() { if (bpf_cmp_likely(foo, >=, 10)) { bar += foo; } other code; } B. for() { if (bpf_cmp_unlikely(foo, >=, 10)) break; other code; } The assembly will look like: A. for() { if r1 < 10 goto L1; bar += foo; L1: other code; } B. for() { if r1 >= 10 goto L2; other code; } L2: The bpf_cmp_likely vs bpf_cmp_unlikely changes basic block layout, hence it will greatly influence the verification process. The number of processed instructions will be different, since the verifier walks the fallthrough first. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20231226191148.48536-3-alexei.starovoitov@gmail.com
a8b242d7