15 May, 2023 6 commits
    • bpf: Fix memleak due to fentry attach failure · 108598c3
      Yafang Shao authored
      If fentry attachment fails, the allocated bpf trampoline image is left
      behind in the system. That can be verified by checking /proc/kallsyms.
      
      This memleak can be reproduced with a simple bpf program as follows:
      
        SEC("fentry/trap_init")
        int fentry_run()
        {
            return 0;
        }
      
      Attaching to trap_init fails because that function is freed after kernel
      init, and the leftover trampoline image can then be found in the system
      by checking /proc/kallsyms.
      
        $ tail /proc/kallsyms
        ffffffffc0613000 t bpf_trampoline_6442453466_1  [bpf]
        ffffffffc06c3000 t bpf_trampoline_6442453466_1  [bpf]
      
        $ bpftool btf dump file /sys/kernel/btf/vmlinux | grep "FUNC 'trap_init'"
        [2522] FUNC 'trap_init' type_id=119 linkage=static
      
        $ echo $((6442453466 & 0x7fffffff))
        2522
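
      The mask is explained by how the trampoline key that names the ksym is
      built. Below is a runnable toy decode, assuming the
      (btf_obj_id << 32) | 0x80000000 | btf_id encoding used by
      bpf_trampoline_compute_key() for BTF targets (the flag bit is an
      assumption here, simplified for illustration):

        #include <stdio.h>

        int main(void)
        {
            unsigned long long key = 6442453466ULL; /* from the ksym name */

            /* assumed layout: low 31 bits carry the BTF id,
             * high 32 bits the BTF object id (1 == vmlinux) */
            printf("btf_obj_id=%llu btf_id=%llu\n",
                   key >> 32, key & 0x7fffffff);
            return 0;
        }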
      
      Note that two bpf trampoline images are left behind; that is because
      libbpf falls back to a raw tracepoint attach if -EINVAL is returned.
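
      For illustration, a minimal libbpf loader sketch that exercises this
      failure path (the object file name is hypothetical; error handling is
      abridged):

        #include <errno.h>
        #include <stdio.h>
        #include <bpf/libbpf.h>

        int main(void)
        {
            struct bpf_object *obj;
            struct bpf_program *prog;
            struct bpf_link *link;

            obj = bpf_object__open_file("fentry_run.bpf.o", NULL);
            if (!obj || bpf_object__load(obj))
                return 1;
            prog = bpf_object__find_program_by_name(obj, "fentry_run");
            link = bpf_program__attach(prog); /* fails: trap_init is gone */
            if (!link)
                fprintf(stderr, "attach failed: %d\n", -errno);
            /* before the fix, the trampoline image allocated during the
             * failed attach stayed visible in /proc/kallsyms */
            bpf_object__close(obj);
            return 0;
        }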
      
      Fixes: e21aa341 ("bpf: Fix fexit trampoline.")
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Song Liu <song@kernel.org>
      Cc: Jiri Olsa <olsajiri@gmail.com>
      Link: https://lore.kernel.org/bpf/20230515130849.57502-2-laoar.shao@gmail.com
    • bpf: Remove bpf trampoline selector · 47e79cbe
      Yafang Shao authored
      After commit e21aa341 ("bpf: Fix fexit trampoline."), the selector only
      indicates how many times the bpf trampoline image has been updated, and
      it is displayed in the trampoline ksym name. After the trampoline is
      freed, the selector starts from 0 again, so it is a useless value to the
      user. We can remove it.
      
      If the user wants to check whether the bpf trampoline image has been
      updated, they can compare the address: each time the trampoline image is
      updated, the address changes accordingly. Jiri also pointed out another
      issue: perf is still using the old name "bpf_trampoline_%lu", so this
      change also fixes that issue in perf.
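
      For illustration, a small sketch of comparing addresses from userspace
      by scanning /proc/kallsyms for trampoline ksyms (run as root so the
      addresses are not zeroed out):

        #include <stdio.h>
        #include <string.h>

        int main(void)
        {
            char line[256];
            FILE *f = fopen("/proc/kallsyms", "r");

            if (!f)
                return 1;
            while (fgets(line, sizeof(line), f))
                /* the address changes each time the image is updated */
                if (strstr(line, "bpf_trampoline_"))
                    fputs(line, stdout);
            fclose(f);
            return 0;
        }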
      
      Fixes: e21aa341 ("bpf: Fix fexit trampoline.")
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Song Liu <song@kernel.org>
      Cc: Jiri Olsa <olsajiri@gmail.com>
      Link: https://lore.kernel.org/bpf/ZFvOOlrmHiY9AgXE@krava
      Link: https://lore.kernel.org/bpf/20230515130849.57502-3-laoar.shao@gmail.com
    • bpf, arm64: Support struct arguments in the BPF trampoline · 90564f1e
      Florent Revest authored
      This extends the BPF trampoline JIT to support attachment to functions
      that take small structures (up to 128 bits) as arguments. This is
      trivially achieved by saving/restoring a number of "argument registers"
      rather than a number of arguments.
      
      Section 6.8.2 of the AAPCS64 describes the parameter passing ABI.
      "Composite types" (like C structs) below 16 bytes (as enforced by the
      BPF verifier) are passed in the 8 argument registers, as explained in
      section C.12.
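
      For illustration, a BPF-side sketch of what this enables (the target
      kernel function and struct are hypothetical; a real target must match a
      function in vmlinux BTF):

        #include <linux/types.h>
        #include <bpf/bpf_helpers.h>
        #include <bpf/bpf_tracing.h>

        /* hypothetical 16-byte composite: fits in two argument registers */
        struct pair {
            __u64 a;
            __u64 b;
        };

        /* hypothetical target: void consume_pair(struct pair p, int x) */
        SEC("fentry/consume_pair")
        int BPF_PROG2(on_consume_pair, struct pair, p, int, x)
        {
            bpf_printk("a=%llu b=%llu x=%d", p.a, p.b, x);
            return 0;
        }

        char LICENSE[] SEC("license") = "GPL";
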
      Signed-off-by: Florent Revest <revest@chromium.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yhs@fb.com>
      Acked-by: Xu Kuohai <xukuohai@huawei.com>
      Link: https://lore.kernel.org/bpf/20230511140507.514888-1-revest@chromium.org
    • bpftool: JIT limited misreported as negative value on aarch64 · 04cb8453
      Alan Maguire authored
      On aarch64, "bpftool feature" reports an incorrect BPF JIT limit:
      
      $ sudo /sbin/bpftool feature
      Scanning system configuration...
      bpf() syscall restricted to privileged users
      JIT compiler is enabled
      JIT compiler hardening is disabled
      JIT compiler kallsyms exports are enabled for root
      skipping kernel config, can't open file: No such file or directory
      Global memory limit for JIT compiler for unprivileged users is -201326592 bytes
      
      This is because /proc/sys/net/core/bpf_jit_limit reports
      
      $ sudo cat /proc/sys/net/core/bpf_jit_limit
      68169519595520
      
      ...and an int is assumed in read_procfs(), which cannot hold that value,
      so it wraps and is reported as negative. Change read_procfs() to return
      a long to avoid negative value reporting.
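
      For illustration, a sketch of the shape of the fix (the real helper
      lives in bpftool's feature probing code; details abridged, and
      <errno.h>, <stdio.h>, <stdlib.h> are assumed to be included):

        /* parse a procfs value as a long so large limits don't wrap */
        static long read_procfs(const char *path)
        {
            char *endptr, *line = NULL;
            size_t len = 0;
            FILE *fd;
            long res;

            fd = fopen(path, "r");
            if (!fd)
                return -1;
            res = getline(&line, &len, fd);
            fclose(fd);
            if (res < 0)
                return -1;
            errno = 0;
            res = strtol(line, &endptr, 10);
            if (errno || *line == '\0' || (*endptr != '\0' && *endptr != '\n'))
                res = -1;
            free(line);
            return res;
        }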
      
      Fixes: 7a4522bb ("tools: bpftool: add probes for /proc/ eBPF parameters")
      Reported-by: Nicky Veitch <nicky.veitch@oracle.com>
      Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Quentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/bpf/20230512113134.58996-1-alan.maguire@oracle.com
    • bpf: fix calculation of subseq_idx during precision backtracking · d84b1a67
      Andrii Nakryiko authored
      The subsequent instruction index (subseq_idx) is the index of the
      instruction that was verified/executed by the verifier after the
      currently processed instruction. It is maintained during precision
      backtracking processing and is used to detect various subprog calling
      conditions.
      
      This patch fixes a bug where subseq_idx was incorrectly reset to -1 when
      going from a child state to its parent state during backtracking. If we
      don't maintain a correct subseq_idx, we can misidentify subprog calls,
      leading to precision tracking bugs.
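
      For illustration, a runnable toy model of the fix (state boundaries are
      taken from the log further below; this is not the actual verifier code):

        #include <stdio.h>

        /* each state covers insns [first_insn_idx, last_insn_idx]; the insn
         * verified right after a parent state is its child's first insn */
        struct state {
            int first_insn_idx;
            int last_insn_idx;
            struct state *parent;
        };

        int main(void)
        {
            struct state grand  = { 16, 27, NULL };
            struct state parent = { 28, 36, &grand };
            struct state cur    = { 38, 45, &parent };
            int subseq_idx = -1; /* nothing follows the newest state */

            for (struct state *st = &cur; st; st = st->parent) {
                printf("last_idx %d first_idx %d subseq_idx %d\n",
                       st->last_insn_idx, st->first_insn_idx, subseq_idx);
                /* the fix: carry the child's first insn over instead of
                 * resetting subseq_idx to -1 on every parent state */
                subseq_idx = st->first_insn_idx;
            }
            return 0;
        }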
      
      One such case was triggered by the test_global_funcs/global_func9 test,
      where a global subprog call happened to be the very last instruction in
      the parent state, leading to subseq_idx == -1 and triggering a WARN_ONCE:
      
        [   36.045754] verifier backtracking bug
        [   36.045764] WARNING: CPU: 13 PID: 2073 at kernel/bpf/verifier.c:3503 __mark_chain_precision+0xcc6/0xde0
        [   36.046819] Modules linked in: aesni_intel(E) crypto_simd(E) cryptd(E) kvm_intel(E) kvm(E) irqbypass(E) i2c_piix4(E) serio_raw(E) i2c_core(E) crc32c_intel)
        [   36.048040] CPU: 13 PID: 2073 Comm: test_progs Tainted: G        W  OE      6.3.0-07976-g4d585f48-dirty #972
        [   36.048783] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
        [   36.049648] RIP: 0010:__mark_chain_precision+0xcc6/0xde0
        [   36.050038] Code: 3d 82 c6 05 bb 35 32 02 01 e8 66 21 ec ff 0f 0b b8 f2 ff ff ff e9 30 f5 ff ff 48 c7 c7 f3 61 3d 82 4c 89 0c 24 e8 4a 21 ec ff <0f> 0b 4c0
      
      With the fix, precision tracking across multiple states now works
      correctly:
      
      mark_precise: frame0: last_idx 45 first_idx 38 subseq_idx -1
      mark_precise: frame0: regs=r8 stack= before 44: (61) r7 = *(u32 *)(r10 -4)
      mark_precise: frame0: regs=r8 stack= before 43: (85) call pc+41
      mark_precise: frame0: regs=r8 stack= before 42: (07) r1 += -48
      mark_precise: frame0: regs=r8 stack= before 41: (bf) r1 = r10
      mark_precise: frame0: regs=r8 stack= before 40: (63) *(u32 *)(r10 -48) = r1
      mark_precise: frame0: regs=r8 stack= before 39: (b4) w1 = 0
      mark_precise: frame0: regs=r8 stack= before 38: (85) call pc+38
      mark_precise: frame0: parent state regs=r8 stack=:  R0_w=scalar() R1_w=map_value(off=4,ks=4,vs=8,imm=0) R6=1 R7_w=scalar() R8_r=P0 R10=fpm
      mark_precise: frame0: last_idx 36 first_idx 28 subseq_idx 38
      mark_precise: frame0: regs=r8 stack= before 36: (18) r1 = 0xffff888104f2ed14
      mark_precise: frame0: regs=r8 stack= before 35: (85) call pc+33
      mark_precise: frame0: regs=r8 stack= before 33: (18) r1 = 0xffff888104f2ed10
      mark_precise: frame0: regs=r8 stack= before 32: (85) call pc+36
      mark_precise: frame0: regs=r8 stack= before 31: (07) r1 += -4
      mark_precise: frame0: regs=r8 stack= before 30: (bf) r1 = r10
      mark_precise: frame0: regs=r8 stack= before 29: (63) *(u32 *)(r10 -4) = r7
      mark_precise: frame0: regs=r8 stack= before 28: (4c) w7 |= w0
      mark_precise: frame0: parent state regs=r8 stack=:  R0_rw=scalar() R6=1 R7_rw=scalar() R8_rw=P0 R10=fp0 fp-48_r=mmmmmmmm
      mark_precise: frame0: last_idx 27 first_idx 16 subseq_idx 28
      mark_precise: frame0: regs=r8 stack= before 27: (85) call pc+31
      mark_precise: frame0: regs=r8 stack= before 26: (b7) r1 = 0
      mark_precise: frame0: regs=r8 stack= before 25: (b7) r8 = 0
      
      Note how subseq_idx starts out as -1 and is then preserved as 38 and
      then 28 as we go up the parent state chain.
      Reported-by: Alexei Starovoitov <ast@kernel.org>
      Fixes: fde2a388 ("bpf: support precision propagation in the presence of subprogs")
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230515180710.1535018-1-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Remove anonymous union in bpf_kfunc_call_arg_meta · 4d585f48
      Dave Marchevsky authored
      For kfuncs like bpf_obj_drop and bpf_refcount_acquire - which take
      user-defined types as input - the verifier needs to track the specific
      type passed in when checking a particular kfunc call. This requires
      tracking a (btf, btf_id) tuple. In commit 7c50b1cb
      ("bpf: Add bpf_refcount_acquire kfunc") I added an anonymous union with
      inner structs named after the specific kfuncs tracking this information,
      with the goal of making it more obvious which kfunc this data was being
      tracked / expected to be tracked on behalf of.
      
      In a recent series adding a new user of this tuple, Alexei mentioned
      that he didn't like this union usage as it doesn't really help with
      readability or bug-proofing ([0]). In an offline convo we agreed to
      have the tuple be fields (arg_btf, arg_btf_id), with comments in
      bpf_kfunc_call_arg_meta definition enumerating the uses of the fields by
      kfunc-specific handling logic. Such a pattern is used by struct
      bpf_reg_state without trouble.
      
      Accordingly, this patch removes the anonymous union in favor of arg_btf
      and arg_btf_id fields and a comment enumerating their current uses. The
      patch also removes struct btf_and_id, which was only being used by the
      removed union's inner structs.
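
      For illustration, a before/after sketch of the change (the inner union
      member names are illustrative; the surrounding bpf_kfunc_call_arg_meta
      layout is abridged and a kernel context providing struct btf and u32 is
      assumed):

        /* before: tuple tracked via an anonymous union */
        struct btf_and_id {
            struct btf *btf;
            u32 btf_id;
        };

        struct bpf_kfunc_call_arg_meta_before { /* abridged */
            union {
                struct btf_and_id arg_obj_drop;
                struct btf_and_id arg_refcount_acquire;
            };
        };

        /* after: plain fields with a comment enumerating their users */
        struct bpf_kfunc_call_arg_meta_after { /* abridged */
            /* arg_{btf,btf_id} are used by kfunc-specific handling,
             * e.g. bpf_obj_drop and bpf_refcount_acquire
             */
            struct btf *arg_btf;
            u32 arg_btf_id;
        };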
      
      This is a mechanical change; the existing linked_list and rbtree tests
      will validate that the correct (btf, btf_id) is being passed.
      
        [0]: https://lore.kernel.org/bpf/20230505021707.vlyiwy57vwxglbka@dhcp-172-26-102-232.dhcp.thefacebook.com
      Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
      Link: https://lore.kernel.org/r/20230510213047.1633612-1-davemarchevsky@fb.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>