1. 17 Jun, 2022 13 commits
  2. 16 Jun, 2022 2 commits
    • libbpf: Fix internal USDT address translation logic for shared libraries · 3e6fe5ce
      Andrii Nakryiko authored
      Perform the same virtual address to file offset translation that libbpf
      is doing for executable ELF binaries also for shared libraries.
      Currently libbpf is making a simplifying and sometimes wrong assumption
      that for shared libraries relative virtual addresses inside ELF are
      always equal to file offsets.
      
      Unfortunately, this is not always the case with LLVM's lld linker, which
      now by default generates a considerably more complicated ELF segment layout.
      E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from
      readelf output listing ELF segments (a.k.a. program headers):
      
        Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
        PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R   0x8
        LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0005e4 0x0005e4 R   0x1000
        LOAD           0x0005f0 0x00000000000015f0 0x00000000000015f0 0x000160 0x000160 R E 0x1000
        LOAD           0x000750 0x0000000000002750 0x0000000000002750 0x000210 0x000210 RW  0x1000
        LOAD           0x000960 0x0000000000003960 0x0000000000003960 0x000028 0x000029 RW  0x1000
      
      Compare that to what is generated by GNU ld (or by LLVM's lld with the
      extra -zseparate-code argument, which disables this cleverness done in
      the name of file size reduction):
      
        Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
        LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x000550 0x000550 R   0x1000
        LOAD           0x001000 0x0000000000001000 0x0000000000001000 0x000131 0x000131 R E 0x1000
        LOAD           0x002000 0x0000000000002000 0x0000000000002000 0x0000ac 0x0000ac R   0x1000
        LOAD           0x002dc0 0x0000000000003dc0 0x0000000000003dc0 0x000262 0x000268 RW  0x1000
      
      In the first example above, for the executable (Flg == "R E") PT_LOAD
      segment (LOAD #2), the Offset column doesn't match the VirtAddr column.
      In the second case (GNU ld output), the two columns do match for the
      executable segment.
      
      This is important because all the addresses, including USDT specs,
      operate in a virtual address space, while the kernel expects file
      offsets when performing uprobe attach. Such mismatches therefore have
      to be detected and compensated for by libbpf, which is what this patch
      fixes.
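
The translation described above can be sketched in plain C. This is a minimal illustration, not libbpf's actual code: the `load_seg` struct and the `vaddr_to_file_off()` helper are hypothetical names, and the segment table is copied from the lld readelf excerpt earlier in this message. It assumes an address belongs to a segment when it falls inside [VirtAddr, VirtAddr + MemSiz).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one PT_LOAD program header. */
struct load_seg {
	uint64_t offset; /* Offset column (p_offset) */
	uint64_t vaddr;  /* VirtAddr column (p_vaddr) */
	uint64_t memsz;  /* MemSiz column (p_memsz) */
};

/* PT_LOAD segments from the lld-linked liburandom_read.so excerpt. */
static const struct load_seg lld_segs[] = {
	{ 0x000000, 0x0000, 0x5e4 },
	{ 0x0005f0, 0x15f0, 0x160 }, /* the "R E" segment */
	{ 0x000750, 0x2750, 0x210 },
	{ 0x000960, 0x3960, 0x029 },
};

/*
 * Translate a virtual address into a file offset.
 * Returns 0 and stores the offset on success, -1 if no segment covers vaddr.
 */
static int vaddr_to_file_off(const struct load_seg *segs, size_t n,
			     uint64_t vaddr, uint64_t *file_off)
{
	for (size_t i = 0; i < n; i++) {
		if (vaddr >= segs[i].vaddr &&
		    vaddr < segs[i].vaddr + segs[i].memsz) {
			*file_off = vaddr - segs[i].vaddr + segs[i].offset;
			return 0;
		}
	}
	return -1;
}
```

With this table, virtual address 0x1600 lies 0x10 bytes into the executable segment and maps to file offset 0x600, whereas assuming "offset equals address" would have pointed the uprobe at 0x1600 in the file.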
      
      The patch also clarifies a few function and variable names and updates
      comments to reflect this important distinction (virtaddr vs file offset)
      and to emphasize that shared libraries are not all that different from
      executables in this regard.
      
      This patch also changes the selftests/bpf Makefile to force urandom_read
      and liburandom_read.so to be built with Clang and LLVM's lld (and to
      explicitly request this ELF file size optimization through the
      -znoseparate-code linker parameter) to validate libbpf logic and ensure
      regressions don't happen in the future. I've bundled these selftests
      changes together with the libbpf changes to keep the above description
      tied to both.
      
      Fixes: 74cc6311 ("libbpf: Add USDT notes parsing and resolution logic")
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220616055543.3285835-1-andrii@kernel.org
    • Zhengchao Shao's avatar
      de5bb438
  3. 14 Jun, 2022 7 commits
    • selftests/bpf: Avoid skipping certain subtests · 3831cd1f
      Yonghong Song authored
      Commit 704c91e5 ('selftests/bpf: Test "bpftool gen min_core_btf"')
      added a test, test_core_btfgen, to test core relocation with BTF
      generated by 'bpftool gen min_core_btf'. Currently,
      among 76 subtests, 25 are skipped.
      
        ...
        #46/69   core_reloc_btfgen/enumval:OK
        #46/70   core_reloc_btfgen/enumval___diff:OK
        #46/71   core_reloc_btfgen/enumval___val3_missing:OK
        #46/72   core_reloc_btfgen/enumval___err_missing:SKIP
        #46/73   core_reloc_btfgen/enum64val:OK
        #46/74   core_reloc_btfgen/enum64val___diff:OK
        #46/75   core_reloc_btfgen/enum64val___val3_missing:OK
        #46/76   core_reloc_btfgen/enum64val___err_missing:SKIP
        ...
        #46      core_reloc_btfgen:SKIP
        Summary: 1/51 PASSED, 25 SKIPPED, 0 FAILED
      
      Alexei found that, in the output above,
      core_reloc_btfgen/enum64val___err_missing should not be skipped.
      
      Currently, the core_reloc tests include some negative tests.
      In commit 704c91e5, for core_reloc_btfgen, all negative tests
      are skipped by the following condition:
        if (!test_case->btf_src_file || test_case->fails) {
                test__skip();
                continue;
        }
      This is too conservative. Negative tests that do not fail
      mkstemp() and run_btfgen() should not be skipped.
      A few negative tests do fail run_btfgen(), and this patch adds
      'run_btfgen_fails' to mark them so that they can still be skipped
      for btfgen tests. With this, we have
        ...
        #46/69   core_reloc_btfgen/enumval:OK
        #46/70   core_reloc_btfgen/enumval___diff:OK
        #46/71   core_reloc_btfgen/enumval___val3_missing:OK
        #46/72   core_reloc_btfgen/enumval___err_missing:OK
        #46/73   core_reloc_btfgen/enum64val:OK
        #46/74   core_reloc_btfgen/enum64val___diff:OK
        #46/75   core_reloc_btfgen/enum64val___val3_missing:OK
        #46/76   core_reloc_btfgen/enum64val___err_missing:OK
        ...
        Summary: 1/62 PASSED, 14 SKIPPED, 0 FAILED
      
      In total, 14 subtests are skipped instead of 25.
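
The difference between the old and the narrower skip condition can be sketched as two predicates. This is an illustration only: the `btfgen_case` struct and both helper names are hypothetical, and the assumption that the new check simply substitutes 'run_btfgen_fails' for 'fails' is inferred from the description above rather than copied from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the selftest's test case struct. */
struct btfgen_case {
	const char *btf_src_file; /* NULL when the case has no BTF source file */
	bool fails;               /* negative test: relocation is expected to fail */
	bool run_btfgen_fails;    /* run_btfgen() itself is expected to fail */
};

/* Old behaviour: every negative test is skipped. */
static bool skip_before(const struct btfgen_case *tc)
{
	return !tc->btf_src_file || tc->fails;
}

/* Assumed new behaviour: skip only when btfgen cannot run for this case. */
static bool skip_after(const struct btfgen_case *tc)
{
	return !tc->btf_src_file || tc->run_btfgen_fails;
}
```

A negative test for which run_btfgen() succeeds is skipped by the first predicate but exercised under the second, which matches the err_missing subtests moving from SKIP to OK in the output above.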
      Reported-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/r/20220614055526.628299-1-yhs@fb.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Fix test_varlen verification failure with latest llvm · 96752e1e
      Yonghong Song authored
      With the latest llvm15, test_varlen failed with the following verifier log:
      
        17: (85) call bpf_probe_read_kernel_str#115   ; R0_w=scalar(smin=-4095,smax=256)
        18: (bf) r1 = r0                      ; R0_w=scalar(id=1,smin=-4095,smax=256) R1_w=scalar(id=1,smin=-4095,smax=256)
        19: (67) r1 <<= 32                    ; R1_w=scalar(smax=1099511627776,umax=18446744069414584320,var_off=(0x0; 0xffffffff00000000),s32_min=0,s32_max=0,u32_max=)
        20: (bf) r2 = r1                      ; R1_w=scalar(id=2,smax=1099511627776,umax=18446744069414584320,var_off=(0x0; 0xffffffff00000000),s32_min=0,s32_max=0,u32)
        21: (c7) r2 s>>= 32                   ; R2=scalar(smin=-2147483648,smax=256)
        ; if (len >= 0) {
        22: (c5) if r2 s< 0x0 goto pc+7       ; R2=scalar(umax=256,var_off=(0x0; 0x1ff))
        ; payload4_len1 = len;
        23: (18) r2 = 0xffffc90000167418      ; R2_w=map_value(off=1048,ks=4,vs=1572,imm=0)
        25: (63) *(u32 *)(r2 +0) = r0         ; R0=scalar(id=1,smin=-4095,smax=256) R2_w=map_value(off=1048,ks=4,vs=1572,imm=0)
        26: (77) r1 >>= 32                    ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
        ; payload += len;
        27: (18) r6 = 0xffffc90000167424      ; R6_w=map_value(off=1060,ks=4,vs=1572,imm=0)
        29: (0f) r6 += r1                     ; R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=map_value(off=1060,ks=4,vs=1572,umax=4294967295,var_off=(0)
        ; len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in2[0]);
        30: (bf) r1 = r6                      ; R1_w=map_value(off=1060,ks=4,vs=1572,umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=map_value(off=1060,ks=4,vs=1572,um)
        31: (b7) r2 = 256                     ; R2_w=256
        32: (18) r3 = 0xffffc90000164100      ; R3_w=map_value(off=256,ks=4,vs=1056,imm=0)
        34: (85) call bpf_probe_read_kernel_str#115
        R1 unbounded memory access, make sure to bounds check any such access
        processed 27 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1
        -- END PROG LOAD LOG --
        libbpf: failed to load program 'handler32_signed'
      
      The failure is due to
        20: (bf) r2 = r1                      ; R1_w=scalar(id=2,smax=1099511627776,umax=18446744069414584320,var_off=(0x0; 0xffffffff00000000),s32_min=0,s32_max=0,u32)
        21: (c7) r2 s>>= 32                   ; R2=scalar(smin=-2147483648,smax=256)
        22: (c5) if r2 s< 0x0 goto pc+7       ; R2=scalar(umax=256,var_off=(0x0; 0x1ff))
        26: (77) r1 >>= 32                    ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
        29: (0f) r6 += r1                     ; R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=map_value(off=1060,ks=4,vs=1572,umax=4294967295,var_off=(0)
      where r1 has a more conservative value range than r2, and r1 is used later.
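
To see why the two registers end up with different ranges, the two shift sequences can be modelled in plain C. This is only an illustration of the arithmetic, not verifier code; `sext32()` and `zext32()` are hypothetical helper names, and the signed conversion relies on the two's-complement behaviour of mainstream compilers.

```c
#include <stdint.h>

/* Models "r1 <<= 32; r2 = r1; r2 s>>= 32": sign-extend the low 32 bits. */
static int64_t sext32(uint64_t r0)
{
	return (int64_t)(int32_t)(uint32_t)r0;
}

/* Models "r1 <<= 32; ...; r1 >>= 32": zero-extend the low 32 bits. */
static uint64_t zext32(uint64_t r0)
{
	return (uint64_t)(uint32_t)r0;
}
```

For a non-negative result such as 256 both helpers return 256, so after the "r2 s< 0" check the value in r1 is in fact also at most 256. For an error value such as -4095 they differ (-4095 versus 0xfffff001). The bounds learned from the check are attached to r2 only, so r1 keeps the full u32 range when it is added to the map pointer.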
      
      In llvm, commit [1] triggered the above code generation and caused
      the verification failure.

      It may take a while for llvm to address this issue. In the meantime,
      let us change the type of the variable 'len' to 'long' and adjust the
      condition accordingly.
      Tested with llvm14 and the latest llvm15; both worked fine.
      
       [1] https://reviews.llvm.org/D126647

      Signed-off-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/r/20220613233449.2860753-1-yhs@fb.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpftool: Do not check return value from libbpf_set_strict_mode() · 93270357
      Quentin Monnet authored
      The function always returns 0, so we don't need to check whether the
      return value is 0 or not.
      
      This change was first introduced in commit a777e18f ("bpftool: Use
      libbpf 1.0 API mode instead of RLIMIT_MEMLOCK"), but later reverted to
      restore the unconditional rlimit bump in bpftool. Let's re-add it.
      Co-developed-by: Yafang Shao <laoar.shao@gmail.com>
      Signed-off-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220610112648.29695-3-quentin@isovalent.com
    • Revert "bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK" · 6b4384ff
      Quentin Monnet authored
      This reverts commit a777e18f.
      
      In commit a777e18f ("bpftool: Use libbpf 1.0 API mode instead of
      RLIMIT_MEMLOCK"), we removed the rlimit bump in bpftool, because the
      kernel has switched to memcg-based memory accounting. Thanks to the
      LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK, we attempted to keep compatibility
      with other systems and ask libbpf to raise the limit for us if
      necessary.
      
      How do we know if memcg-based accounting is supported? There is a probe
      in libbpf to check this. But this probe currently relies on the
      availability of a given BPF helper, bpf_ktime_get_coarse_ns(), which
      landed in the same kernel version as the memory accounting change. This
      works in the generic case, but it may fail, for example, if the helper
      function has been backported to an older kernel. This has been observed
      for Google Cloud's Container-Optimized OS (COS), where the helper is
      available but rlimit is still in use. The probe succeeds, the rlimit is
      not raised, and probing features with bpftool, for example, fails.
      
      A patch was submitted [0] to update this probe in libbpf, based on what
      the cilium/ebpf Go library does [1]. It would lower the soft rlimit to
      0, attempt to load a BPF object, and reset the rlimit. But it may induce
      some hard-to-debug flakiness if another process starts, or the current
      application is killed, while the rlimit is reduced, and the approach was
      discarded.
      
      As a workaround to ensure that the rlimit bump does not depend on the
      availability of a given helper, we restore the unconditional rlimit bump
      in bpftool for now.
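
An unconditional rlimit bump of the kind described above can be sketched with the standard setrlimit() call. The helper name `bump_memlock_rlimit()` is hypothetical and this is a minimal sketch, not a copy of bpftool's code.

```c
#include <sys/resource.h>

/*
 * Raise RLIMIT_MEMLOCK to "unlimited" without probing the kernel first.
 * Returns 0 on success and -1 on failure (for example, when the process
 * lacks the privilege to raise its hard limit).
 */
static int bump_memlock_rlimit(void)
{
	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };

	return setrlimit(RLIMIT_MEMLOCK, &rinf);
}
```

Because the call does not depend on any feature probe, a backported helper cannot cause it to be skipped. On kernels with memcg-based accounting the raised limit is simply not needed.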
      
        [0] https://lore.kernel.org/bpf/20220609143614.97837-1-quentin@isovalent.com/
        [1] https://github.com/cilium/ebpf/blob/v0.9.0/rlimit/rlimit.go#L39

      Signed-off-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Yafang Shao <laoar.shao@gmail.com>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/bpf/20220610112648.29695-2-quentin@isovalent.com
    • bpf, arm: Remove unused function emit_a32_alu_r() · fc386ba7
      YueHaibing authored
      Since commit b18bea2a ("ARM: net: bpf: improve 64-bit ALU implementation")
      this function is no longer used, so remove it.
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220611040904.8976-1-yuehaibing@huawei.com
    • libbpf: Fix an unsigned < 0 bug · c49a44b3
      Yonghong Song authored
      Andrii reported a bug with the following information:
      
        2859 	if (enum64_placeholder_id == 0) {
        2860 		enum64_placeholder_id = btf__add_int(btf, "enum64_placeholder", 1, 0);
        >>>     CID 394804:  Control flow issues  (NO_EFFECT)
        >>>     This less-than-zero comparison of an unsigned value is never true. "enum64_placeholder_id < 0U".
        2861 		if (enum64_placeholder_id < 0)
        2862 			return enum64_placeholder_id;
        2863    	...
      
      Here enum64_placeholder_id is declared as '__u32', so
      enum64_placeholder_id < 0 is always false. Declare enum64_placeholder_id
      as 'int' in order to capture the potential error properly.
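
The effect is easy to reproduce in isolation. In the sketch below, `add_placeholder_stub()` is a hypothetical stand-in for a btf__add_int() call that fails with a negative error code; the two checkers differ only in the type of the variable that holds the result.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stub: pretend the call failed with -22 (-EINVAL). */
static int add_placeholder_stub(void)
{
	return -22;
}

/* Buggy variant: the negative error wraps to a large unsigned value. */
static bool error_seen_unsigned(void)
{
	uint32_t id = add_placeholder_stub();

	return id < 0; /* never true for an unsigned type */
}

/* Fixed variant: a signed variable preserves the negative error code. */
static bool error_seen_signed(void)
{
	int id = add_placeholder_stub();

	return id < 0;
}
```

Only the signed variant reports the failure. Compilers may also flag the unsigned comparison when warnings such as -Wtype-limits are enabled, which is the same class of issue the static analysis report above points out.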
      
      Fixes: f2a62588 ("libbpf: Add enum64 sanitization")
      Reported-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220613054314.1251905-1-yhs@fb.com
    • bpf: Fix spelling in bpf_verifier.h · 6dbdc9f3
      Hongyi Lu authored
      Minor spelling fix spotted in bpf_verifier.h. Spelling is no big deal,
      but it is still an improvement when reading through the code.
      Signed-off-by: Hongyi Lu <jwnhy0@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220613211633.58647-1-jwnhy0@gmail.com
  4. 11 Jun, 2022 3 commits
  5. 09 Jun, 2022 3 commits
  6. 07 Jun, 2022 12 commits