You need to sign in or sign up before continuing.
  • Yonghong Song's avatar
    bpf: Get better reg range with ldsx and 32bit compare · 9f5469b8
    Yonghong Song authored
    
    
    With latest llvm19, the selftest iters/iter_arr_with_actual_elem_count
    failed with -mcpu=v4.
    
    The following are the details:
      0: R1=ctx() R10=fp0
      ; int iter_arr_with_actual_elem_count(const void *ctx) @ iters.c:1420
      0: (b4) w7 = 0                        ; R7_w=0
      ; int i, n = loop_data.n, sum = 0; @ iters.c:1422
      1: (18) r1 = 0xffffc90000191478       ; R1_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144)
      3: (61) r6 = *(u32 *)(r1 +128)        ; R1_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144) R6_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
      ; if (n > ARRAY_SIZE(loop_data.data)) @ iters.c:1424
      4: (26) if w6 > 0x20 goto pc+27       ; R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f))
      5: (bf) r8 = r10                      ; R8_w=fp0 R10=fp0
      6: (07) r8 += -8                      ; R8_w=fp-8
      ; bpf_for(i, 0, n) { @ iters.c:1427
      7: (bf) r1 = r8                       ; R1_w=fp-8 R8_w=fp-8
      8: (b4) w2 = 0                        ; R2_w=0
      9: (bc) w3 = w6                       ; R3_w=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R6_w=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f))
      10: (85) call bpf_iter_num_new#45179          ; R0=scalar() fp-8=iter_num(ref_id=2,state=active,depth=0) refs=2
      11: (bf) r1 = r8                      ; R1=fp-8 R8=fp-8 refs=2
      12: (85) call bpf_iter_num_next#45181 13: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R6=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
      ; bpf_for(i, 0, n) { @ iters.c:1427
      13: (15) if r0 == 0x0 goto pc+2       ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) refs=2
      14: (81) r1 = *(s32 *)(r0 +0)         ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) refs=2
      15: (ae) if w1 < w6 goto pc+4 20: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1=scalar(smin=0xffffffff80000000,smax=smax32=umax32=31,umax=0xffffffff0000001f,smin32=0,var_off=(0x0; 0xffffffff0000001f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
      ; sum += loop_data.data[i]; @ iters.c:1429
      20: (67) r1 <<= 2                     ; R1_w=scalar(smax=0x7ffffffc0000007c,umax=0xfffffffc0000007c,smin32=0,smax32=umax32=124,var_off=(0x0; 0xfffffffc0000007c)) refs=2
      21: (18) r2 = 0xffffc90000191478      ; R2_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144) refs=2
      23: (0f) r2 += r1
      math between map_value pointer and register with unbounded min value is not allowed
    
    The source code:
      int iter_arr_with_actual_elem_count(const void *ctx)
      {
            int i, n = loop_data.n, sum = 0;
    
            if (n > ARRAY_SIZE(loop_data.data))
                    return 0;
    
            bpf_for(i, 0, n) {
                    /* no rechecking of i against ARRAY_SIZE(loop_data.n) */
                    sum += loop_data.data[i];
            }
    
            return sum;
      }
    
    The insn #14 is a sign-extenstion load which is related to 'int i'.
    The insn #15 did a subreg comparision. Note that smin=0xffffffff80000000 and this caused later
    insn #23 failed verification due to unbounded min value.
    
    Actually insn #15 R1 smin range can be better. Before insn #15, we have
      R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
    With the above range, we know for R1, upper 32bit can only be 0xffffffff or 0.
    Otherwise, the value range for R1 could be beyond [smin=0xffffffff80000000,smax=0x7fffffff].
    
    After insn #15, for the true patch, we know smin32=0 and smax32=32. With the upper 32bit 0xffffffff,
    then the corresponding value is [0xffffffff00000000, 0xffffffff00000020]. The range is
    obviously beyond the original range [smin=0xffffffff80000000,smax=0x7fffffff] and the
    range is not possible. So the upper 32bit must be 0, which implies smin = smin32 and
    smax = smax32.
    
    This patch fixed the issue by adding additional register deduction after 32-bit compare
    insn. If the signed 32-bit register range is non-negative then 64-bit smin is
    in range of [S32_MIN, S32_MAX], then the actual 64-bit smin/smax should be the same
    as 32-bit smin32/smax32.
    
    With this patch, iters/iter_arr_with_actual_elem_count succeeded with better register range:
    
    from 15 to 20: R0=rdonly_mem(id=7,ref_obj_id=2,sz=4) R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=31,var_off=(0x0; 0x1f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=scalar(id=9,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R8=scalar(id=9,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=3) refs=2
    Acked-by: default avatarEduard Zingerman <eddyz87@gmail.com>
    Acked-by: default avatarShung-Hsi Yu <shung-hsi.yu@suse.com>
    Signed-off-by: default avatarYonghong Song <yonghong.song@linux.dev>
    Link: https://lore.kernel.org/r/20240723162933.2731620-1-yonghong.song@linux.dev
    
    Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
    9f5469b8
verifier.c 657 KB