1. 17 May, 2023 19 commits
  2. 16 May, 2023 7 commits
  3. 15 May, 2023 14 commits
    • Yafang Shao's avatar
      bpf: Fix memleak due to fentry attach failure · 108598c3
      Yafang Shao authored
      If it fails to attach fentry, the allocated bpf trampoline image will be
      left in the system. That can be verified by checking /proc/kallsyms.
      
      This meamleak can be verified by a simple bpf program as follows:
      
        SEC("fentry/trap_init")
        int fentry_run()
        {
            return 0;
        }
      
      It will fail to attach trap_init because this function is freed after
      kernel init, and then we can find the trampoline image is left in the
      system by checking /proc/kallsyms.
      
        $ tail /proc/kallsyms
        ffffffffc0613000 t bpf_trampoline_6442453466_1  [bpf]
        ffffffffc06c3000 t bpf_trampoline_6442453466_1  [bpf]
      
        $ bpftool btf dump file /sys/kernel/btf/vmlinux | grep "FUNC 'trap_init'"
        [2522] FUNC 'trap_init' type_id=119 linkage=static
      
        $ echo $((6442453466 & 0x7fffffff))
        2522
      
      Note that there are two left bpf trampoline images, that is because the
      libbpf will fallback to raw tracepoint if -EINVAL is returned.
      
      Fixes: e21aa341 ("bpf: Fix fexit trampoline.")
      Signed-off-by: default avatarYafang Shao <laoar.shao@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarSong Liu <song@kernel.org>
      Cc: Jiri Olsa <olsajiri@gmail.com>
      Link: https://lore.kernel.org/bpf/20230515130849.57502-2-laoar.shao@gmail.com
      108598c3
    • Yafang Shao's avatar
      bpf: Remove bpf trampoline selector · 47e79cbe
      Yafang Shao authored
      After commit e21aa341 ("bpf: Fix fexit trampoline."), the selector is only
      used to indicate how many times the bpf trampoline image are updated and been
      displayed in the trampoline ksym name. After the trampoline is freed, the
      selector will start from 0 again. So the selector is a useless value to the
      user. We can remove it.
      
      If the user want to check whether the bpf trampoline image has been updated
      or not, the user can compare the address. Each time the trampoline image is
      updated, the address will change consequently. Jiri also pointed out another
      issue that perf is still using the old name "bpf_trampoline_%lu", so this
      change can fix the issue in perf.
      
      Fixes: e21aa341 ("bpf: Fix fexit trampoline.")
      Signed-off-by: default avatarYafang Shao <laoar.shao@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarSong Liu <song@kernel.org>
      Cc: Jiri Olsa <olsajiri@gmail.com>
      Link: https://lore.kernel.org/bpf/ZFvOOlrmHiY9AgXE@krava
      Link: https://lore.kernel.org/bpf/20230515130849.57502-3-laoar.shao@gmail.com
      47e79cbe
    • Florent Revest's avatar
      bpf, arm64: Support struct arguments in the BPF trampoline · 90564f1e
      Florent Revest authored
      This extends the BPF trampoline JIT to support attachment to functions
      that take small structures (up to 128bit) as argument. This is trivially
      achieved by saving/restoring a number of "argument registers" rather
      than a number of arguments.
      
      The AAPCS64 section 6.8.2 describes the parameter passing ABI.
      "Composite types" (like C structs) below 16 bytes (as enforced by the
      BPF verifier) are provided as part of the 8 argument registers as
      explained in the section C.12.
      Signed-off-by: default avatarFlorent Revest <revest@chromium.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Acked-by: default avatarXu Kuohai <xukuohai@huawei.com>
      Link: https://lore.kernel.org/bpf/20230511140507.514888-1-revest@chromium.org
      90564f1e
    • Alan Maguire's avatar
      bpftool: JIT limited misreported as negative value on aarch64 · 04cb8453
      Alan Maguire authored
      On aarch64, "bpftool feature" reports an incorrect BPF JIT limit:
      
      $ sudo /sbin/bpftool feature
      Scanning system configuration...
      bpf() syscall restricted to privileged users
      JIT compiler is enabled
      JIT compiler hardening is disabled
      JIT compiler kallsyms exports are enabled for root
      skipping kernel config, can't open file: No such file or directory
      Global memory limit for JIT compiler for unprivileged users is -201326592 bytes
      
      This is because /proc/sys/net/core/bpf_jit_limit reports
      
      $ sudo cat /proc/sys/net/core/bpf_jit_limit
      68169519595520
      
      ...and an int is assumed in read_procfs().  Change read_procfs()
      to return a long to avoid negative value reporting.
      
      Fixes: 7a4522bb ("tools: bpftool: add probes for /proc/ eBPF parameters")
      Reported-by: default avatarNicky Veitch <nicky.veitch@oracle.com>
      Signed-off-by: default avatarAlan Maguire <alan.maguire@oracle.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarQuentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/bpf/20230512113134.58996-1-alan.maguire@oracle.com
      04cb8453
    • Andrii Nakryiko's avatar
      bpf: fix calculation of subseq_idx during precision backtracking · d84b1a67
      Andrii Nakryiko authored
      Subsequent instruction index (subseq_idx) is an index of an instruction
      that was verified/executed by verifier after the currently processed
      instruction. It is maintained during precision backtracking processing
      and is used to detect various subprog calling conditions.
      
      This patch fixes the bug with incorrectly resetting subseq_idx to -1
      when going from child state to parent state during backtracking. If we
      don't maintain correct subseq_idx we can misidentify subprog calls
      leading to precision tracking bugs.
      
      One such case was triggered by test_global_funcs/global_func9 test where
      global subprog call happened to be the very last instruction in parent
      state, leading to subseq_idx==-1, triggering WARN_ONCE:
      
        [   36.045754] verifier backtracking bug
        [   36.045764] WARNING: CPU: 13 PID: 2073 at kernel/bpf/verifier.c:3503 __mark_chain_precision+0xcc6/0xde0
        [   36.046819] Modules linked in: aesni_intel(E) crypto_simd(E) cryptd(E) kvm_intel(E) kvm(E) irqbypass(E) i2c_piix4(E) serio_raw(E) i2c_core(E) crc32c_intel)
        [   36.048040] CPU: 13 PID: 2073 Comm: test_progs Tainted: G        W  OE      6.3.0-07976-g4d585f48-dirty #972
        [   36.048783] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
        [   36.049648] RIP: 0010:__mark_chain_precision+0xcc6/0xde0
        [   36.050038] Code: 3d 82 c6 05 bb 35 32 02 01 e8 66 21 ec ff 0f 0b b8 f2 ff ff ff e9 30 f5 ff ff 48 c7 c7 f3 61 3d 82 4c 89 0c 24 e8 4a 21 ec ff <0f> 0b 4c0
      
      With the fix precision tracking across multiple states works correctly now:
      
      mark_precise: frame0: last_idx 45 first_idx 38 subseq_idx -1
      mark_precise: frame0: regs=r8 stack= before 44: (61) r7 = *(u32 *)(r10 -4)
      mark_precise: frame0: regs=r8 stack= before 43: (85) call pc+41
      mark_precise: frame0: regs=r8 stack= before 42: (07) r1 += -48
      mark_precise: frame0: regs=r8 stack= before 41: (bf) r1 = r10
      mark_precise: frame0: regs=r8 stack= before 40: (63) *(u32 *)(r10 -48) = r1
      mark_precise: frame0: regs=r8 stack= before 39: (b4) w1 = 0
      mark_precise: frame0: regs=r8 stack= before 38: (85) call pc+38
      mark_precise: frame0: parent state regs=r8 stack=:  R0_w=scalar() R1_w=map_value(off=4,ks=4,vs=8,imm=0) R6=1 R7_w=scalar() R8_r=P0 R10=fpm
      mark_precise: frame0: last_idx 36 first_idx 28 subseq_idx 38
      mark_precise: frame0: regs=r8 stack= before 36: (18) r1 = 0xffff888104f2ed14
      mark_precise: frame0: regs=r8 stack= before 35: (85) call pc+33
      mark_precise: frame0: regs=r8 stack= before 33: (18) r1 = 0xffff888104f2ed10
      mark_precise: frame0: regs=r8 stack= before 32: (85) call pc+36
      mark_precise: frame0: regs=r8 stack= before 31: (07) r1 += -4
      mark_precise: frame0: regs=r8 stack= before 30: (bf) r1 = r10
      mark_precise: frame0: regs=r8 stack= before 29: (63) *(u32 *)(r10 -4) = r7
      mark_precise: frame0: regs=r8 stack= before 28: (4c) w7 |= w0
      mark_precise: frame0: parent state regs=r8 stack=:  R0_rw=scalar() R6=1 R7_rw=scalar() R8_rw=P0 R10=fp0 fp-48_r=mmmmmmmm
      mark_precise: frame0: last_idx 27 first_idx 16 subseq_idx 28
      mark_precise: frame0: regs=r8 stack= before 27: (85) call pc+31
      mark_precise: frame0: regs=r8 stack= before 26: (b7) r1 = 0
      mark_precise: frame0: regs=r8 stack= before 25: (b7) r8 = 0
      
      Note how subseq_idx starts out as -1, then is preserved as 38 and then 28 as we
      go up the parent state chain.
      Reported-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Fixes: fde2a388 ("bpf: support precision propagation in the presence of subprogs")
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20230515180710.1535018-1-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      d84b1a67
    • Dave Marchevsky's avatar
      bpf: Remove anonymous union in bpf_kfunc_call_arg_meta · 4d585f48
      Dave Marchevsky authored
      For kfuncs like bpf_obj_drop and bpf_refcount_acquire - which take
      user-defined types as input - the verifier needs to track the specific
      type passed in when checking a particular kfunc call. This requires
      tracking (btf, btf_id) tuple. In commit 7c50b1cb
      ("bpf: Add bpf_refcount_acquire kfunc") I added an anonymous union with
      inner structs named after the specific kfuncs tracking this information,
      with the goal of making it more obvious which kfunc this data was being
      tracked / expected to be tracked on behalf of.
      
      In a recent series adding a new user of this tuple, Alexei mentioned
      that he didn't like this union usage as it doesn't really help with
      readability or bug-proofing ([0]). In an offline convo we agreed to
      have the tuple be fields (arg_btf, arg_btf_id), with comments in
      bpf_kfunc_call_arg_meta definition enumerating the uses of the fields by
      kfunc-specific handling logic. Such a pattern is used by struct
      bpf_reg_state without trouble.
      
      Accordingly, this patch removes the anonymous union in favor of arg_btf
      and arg_btf_id fields and comment enumerating their current uses. The
      patch also removes struct btf_and_id, which was only being used by the
      removed union's inner structs.
      
      This is a mechanical change, existing linked_list and rbtree tests will
      validate that correct (btf, btf_id) are being passed.
      
        [0]: https://lore.kernel.org/bpf/20230505021707.vlyiwy57vwxglbka@dhcp-172-26-102-232.dhcp.thefacebook.comSigned-off-by: default avatarDave Marchevsky <davemarchevsky@fb.com>
      Link: https://lore.kernel.org/r/20230510213047.1633612-1-davemarchevsky@fb.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      4d585f48
    • Krzysztof Kozlowski's avatar
      nfc: llcp: fix possible use of uninitialized variable in nfc_llcp_send_connect() · 0d9b41da
      Krzysztof Kozlowski authored
      If sock->service_name is NULL, the local variable
      service_name_tlv_length will not be assigned by nfc_llcp_build_tlv(),
      later leading to using value frmo the stack.  Smatch warning:
      
        net/nfc/llcp_commands.c:442 nfc_llcp_send_connect() error: uninitialized symbol 'service_name_tlv_length'.
      
      Fixes: de9e5aeb ("NFC: llcp: Fix usage of llcp_add_tlv()")
      Signed-off-by: default avatarKrzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0d9b41da
    • Yang Li's avatar
      octeontx2-pf: mcs: Remove unneeded semicolon · d1e4632b
      Yang Li authored
      ./drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c:242:2-3: Unneeded semicolon
      ./drivers/net/ethernet/marvell/octeontx2/nic/cn10k_macsec.c:476:2-3: Unneeded semicolon
      Reported-by: default avatarAbaci Robot <abaci@linux.alibaba.com>
      Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4947Signed-off-by: default avatarYang Li <yang.lee@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d1e4632b
    • Anup Sharma's avatar
      net: ethernet: microchip: vcap: Remove extra semicolon · c515a444
      Anup Sharma authored
      Remove the extra semicolon at end. Issue identified using
      semicolon.cocci Coccinelle semantic patch.
      
      drivers/net/ethernet/microchip/vcap/vcap_api.c:1124:3-4: Unneeded semicolon
      drivers/net/ethernet/microchip/vcap/vcap_api.c:1165:3-4: Unneeded semicolon
      drivers/net/ethernet/microchip/vcap/vcap_api.c:1239:3-4: Unneeded semicolon
      drivers/net/ethernet/microchip/vcap/vcap_api.c:1287:3-4: Unneeded semicolon
      Signed-off-by: default avatarAnup Sharma <anupnewsmail@gmail.com>
      
      Changes:
      V1 -> V2: Target tree included in the subject line.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c515a444
    • David S. Miller's avatar
      Merge branch 'octeontx2-pf-HTB' · 57f1915f
      David S. Miller authored
      Hariprasad Kelam says:
      
      ====================
      octeontx2-pf: HTB offload support
      
      octeontx2 silicon and CN10K transmit interface consists of five
      transmit levels starting from MDQ, TL4 to TL1. Once packets are
      submitted to MDQ, hardware picks all active MDQs using strict
      priority, and MDQs having the same priority level are chosen using
      round robin. Each packet will traverse MDQ, TL4 to TL1 levels.
      Each level contains an array of queues to support scheduling and
      shaping.
      
      As HTB supports classful queuing mechanism by supporting rate and
      ceil and allow the user to control the absolute bandwidth to
      particular classes of traffic the same can be achieved by
      configuring shapers and schedulers on different transmit levels.
      
      This series of patches adds support for HTB offload,
      
      Patch1: Allow strict priority parameter in HTB offload mode.
      
      Patch2: Rename existing total tx queues for better readability
      
      Patch3: defines APIs such that the driver can dynamically initialize/
              deinitialize the send queues.
      
      Patch4: Refactors transmit alloc/free calls as preparation for QOS
              offload code.
      
      Patch5: moves rate limiting logic to common header which will be used
              by qos offload code.
      
      Patch6: Adds actual HTB offload support.
      
      Patch7: exposes qos send queue stats over ethtool.
      
      Patch8: Add documentation about htb offload flow in driver
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      57f1915f
    • Hariprasad Kelam's avatar
      docs: octeontx2: Add Documentation for QOS · efe10306
      Hariprasad Kelam authored
      Add QOS example configuration along with tc-htb commands
      Signed-off-by: default avatarHariprasad Kelam <hkelam@marvell.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      efe10306
    • Hariprasad Kelam's avatar
      octeontx2-pf: ethtool expose qos stats · 6cebb6a4
      Hariprasad Kelam authored
      This patch extends ethtool stats support for QoS send queues as well.
      upon the number of transmit channels change request, Ensures the real
      number of transmit queues are equal to active QoS send queues plus
      configured transmit queues.
      
          ethtool -S eth0
          txq_qos0: bytes: 3021391800
          txq_qos0: frames: 1998275
          txq_qos1: bytes: 4619766312
          txq_qos1: frames: 3055401
          ...
          ...
      Signed-off-by: default avatarHariprasad Kelam <hkelam@marvell.com>
      Signed-off-by: default avatarSunil Kovvuri Goutham <sgoutham@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6cebb6a4
    • Naveen Mamindlapalli's avatar
      octeontx2-pf: Add support for HTB offload · 5e6808b4
      Naveen Mamindlapalli authored
      This patch registers callbacks to support HTB offload.
      
      Below are features supported,
      
      - supports traffic shaping on the given class by honoring rate and ceil
      configuration.
      
      - supports traffic scheduling,  which prioritizes different types of
      traffic based on strict priority values.
      
      - supports the creation of leaf to inner classes such that parent node
      rate limits apply to all child nodes.
      Signed-off-by: default avatarNaveen Mamindlapalli <naveenm@marvell.com>
      Signed-off-by: default avatarHariprasad Kelam <hkelam@marvell.com>
      Signed-off-by: default avatarSunil Kovvuri Goutham <sgoutham@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5e6808b4
    • Hariprasad Kelam's avatar
      octeontx2-pf: Prepare for QOS offload · cb748a7e
      Hariprasad Kelam authored
      This patch moves rate limiting definitions to a common header file and
      adds csr definitions required for QOS code.
      Signed-off-by: default avatarHariprasad Kelam <hkelam@marvell.com>
      Signed-off-by: default avatarSunil Kovvuri Goutham <sgoutham@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cb748a7e