1. 16 Aug, 2023 7 commits
  2. 15 Aug, 2023 5 commits
    • net: Fix slab-out-of-bounds in inet[6]_steal_sock · 8897562f
      Lorenz Bauer authored
      Kumar reported a KASAN splat in tcp_v6_rcv:
      
        bash-5.2# ./test_progs -t btf_skc_cls_ingress
        ...
        [   51.810085] BUG: KASAN: slab-out-of-bounds in tcp_v6_rcv+0x2d7d/0x3440
        [   51.810458] Read of size 2 at addr ffff8881053f038c by task test_progs/226
      
      The problem is that inet[6]_steal_sock accesses sk->sk_protocol without
      accounting for request or timewait sockets. To fix this we can't just
      check sock_common->skc_reuseport since that flag is present on timewait
      sockets.
      
      Instead, add a fullsock check to avoid the out-of-bounds access of sk_protocol.
      
      Fixes: 9c02bec9 ("bpf, net: Support SO_REUSEPORT sockets with bpf_sk_assign")
      Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
      Link: https://lore.kernel.org/r/20230815-bpf-next-v2-1-95126eaa4c1b@isovalent.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
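The fullsock check described in the commit above can be illustrated with a small userspace sketch. All names here (`sketch_sock_common`, `sketch_full_sock`, `sketch_get_protocol`, the state values) are hypothetical stand-ins and not the kernel's definitions; the point is only that a field living outside the shared header must not be read until the socket is known to be a full socket.

```c
#include <stdbool.h>

/* Hypothetical socket states; only the distinction between full and
 * "mini" sockets matters for this sketch. */
enum sketch_state {
	SKETCH_ESTABLISHED = 1,
	SKETCH_TIME_WAIT = 6,
	SKETCH_LISTEN = 10,
	SKETCH_NEW_SYN_RECV = 12,
};

/* Fields shared by every socket flavour (stand-in for sock_common). */
struct sketch_sock_common {
	enum sketch_state state;
	bool reuseport;
};

/* A full socket embeds the common part first and adds more fields. */
struct sketch_full_sock {
	struct sketch_sock_common common;
	unsigned short protocol;
};

/* A timewait or request socket is smaller: it has no protocol field. */
struct sketch_mini_sock {
	struct sketch_sock_common common;
};

/* True only for sockets that are backed by the full structure. */
static bool sketch_fullsock(const struct sketch_sock_common *skc)
{
	return skc->state != SKETCH_TIME_WAIT &&
	       skc->state != SKETCH_NEW_SYN_RECV;
}

/* Returns true and stores the protocol only when it is safe to read it. */
static bool sketch_get_protocol(const struct sketch_sock_common *skc,
				unsigned short *protocol)
{
	if (!sketch_fullsock(skc))
		return false;
	*protocol = ((const struct sketch_full_sock *)skc)->protocol;
	return true;
}
```

Note that the `reuseport` flag is set on both kinds of socket in this model, which is why it cannot serve as the guard.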
    • Merge branch 'Update and document struct_ops' · dda77040
      Martin KaFai Lau authored
      David Vernet says:
      
      ====================
      The struct bpf_struct_ops structure in BPF is a framework that allows
      subsystems to extend themselves using BPF. In commit 68b04864
      ("bpf: Create links for BPF struct_ops maps") and commit aef56f2e
      ("bpf: Update the struct_ops of a bpf_link"), the structure was updated
      to include new ->validate() and ->update() callbacks respectively in
      support of allowing struct_ops maps to be created with BPF_F_LINK.
      
      The intention was that struct bpf_struct_ops implementations could
      support map updates through the link. Because map validation and
      registration would take place in two separate steps for struct_ops
      maps managed by the link (the first in map update elem, and the latter
      in link create), the ->validate() callback was added, and any struct_ops
      implementation that wished to use BPF_F_LINK, even just for lifetime
      management, would then be required to define both it and ->update().
      
      Not all struct_ops implementations can or will support update, however.
      For example, the sched_ext struct_ops implementation proposed in [0]
      will not be able to support atomic map updates because it can race with
      sysrq, has to cycle tasks through various states in order to safely
      transition, etc. It can, however, benefit from letting the BPF link
      automatically evict the struct_ops map when the application exits (e.g.
      if it crashes).
      
      This patch set therefore:
      
      1. Updates the struct_ops implementation to support default values for
         ->validate() and ->update() so that struct_ops implementations can
         benefit from BPF_F_LINK management even if they can't support
         updates.
      2. Documents struct bpf_struct_ops so that the semantics are clear and
         well defined.
      ---
      v2: https://lore.kernel.org/bpf/0f5ea3de-c6e7-490f-b5ec-b5c7cd288687@gmail.com/T/
      Changes from v2 -> v3:
      - Add patch 2/2 that documents the struct bpf_struct_ops structure.
      - Add Kui-Feng's Acked-by tag to patch 1/2.
      
      v1: https://lore.kernel.org/lkml/20230811150934.GA542801@maniforge/
      Changes from v1 -> v2:
      - Move the if (!st_map->st_ops->update) check outside of the critical
        section before we acquire the update_mutex.
      ====================
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpf: Document struct bpf_struct_ops fields · bb48cf16
      David Vernet authored
      Subsystems that want to implement a struct bpf_struct_ops structure to
      enable struct_ops maps must currently reverse engineer how the structure
      works. Given that this is meant to be a way for subsystem maintainers to
      extend their subsystems using BPF, let's document it to make it a bit
      easier on them.
      Signed-off-by: David Vernet <void@manifault.com>
      Link: https://lore.kernel.org/r/20230814185908.700553-3-void@manifault.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpf: Support default .validate() and .update() behavior for struct_ops links · 8ba651ed
      David Vernet authored
      Currently, if a struct_ops map is loaded with BPF_F_LINK, it must also
      define the .validate() and .update() callbacks in its corresponding
      struct bpf_struct_ops in the kernel. Enabling struct_ops link is useful
      in its own right to ensure that the map is unloaded if an application
      crashes. For example, with sched_ext, we want to automatically unload
      the host-wide scheduler if the application crashes. We would likely
      never support updating elements of a sched_ext struct_ops map, so we'd
      have to implement these callbacks showing that they _can't_ support
      element updates just to benefit from the basic lifetime management of
      struct_ops links.
      
      Let's enable struct_ops maps to work with BPF_F_LINK even if they
      haven't defined these callbacks, by assuming that a struct_ops map
      element cannot be updated by default.
      Acked-by: Kui-Feng Lee <thinker.li@gmail.com>
      Signed-off-by: David Vernet <void@manifault.com>
      Link: https://lore.kernel.org/r/20230814185908.700553-2-void@manifault.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
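The default behavior described in the commit above amounts to treating both callbacks as optional. The following is a minimal sketch of that pattern, not the kernel code: `sketch_struct_ops` and the two helpers are hypothetical names, and the choice of `-EOPNOTSUPP` as the error for a missing update callback is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct bpf_struct_ops. */
struct sketch_struct_ops {
	int (*validate)(void *kdata);                 /* optional */
	int (*update)(void *kdata, void *old_kdata);  /* optional */
};

/* Validation step: a missing ->validate() means "nothing to check". */
static int sketch_validate(const struct sketch_struct_ops *ops, void *kdata)
{
	if (!ops->validate)
		return 0;
	return ops->validate(kdata);
}

/* Link update step: a missing ->update() means updates are refused.
 * The error code is an assumption of this sketch. */
static int sketch_link_update(const struct sketch_struct_ops *ops,
			      void *kdata, void *old_kdata)
{
	if (!ops->update)
		return -EOPNOTSUPP;
	return ops->update(kdata, old_kdata);
}
```

An implementation that leaves both pointers NULL would still pass validation and gain link-based lifetime management, while any update attempt would fail cleanly.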
    • selftests/bpf: Add various more tcx test cases · ccd9a8be
      Daniel Borkmann authored
      Add several new tcx test cases to improve test coverage. This also includes
      a few new tests with ingress instead of clsact qdisc, to cover the fix from
      commit dc644b54 ("tcx: Fix splat in ingress_destroy upon tcx_entry_free").
      
        # ./test_progs -t tc
        [...]
        #234     tc_links_after:OK
        #235     tc_links_append:OK
        #236     tc_links_basic:OK
        #237     tc_links_before:OK
        #238     tc_links_chain_classic:OK
        #239     tc_links_chain_mixed:OK
        #240     tc_links_dev_cleanup:OK
        #241     tc_links_dev_mixed:OK
        #242     tc_links_ingress:OK
        #243     tc_links_invalid:OK
        #244     tc_links_prepend:OK
        #245     tc_links_replace:OK
        #246     tc_links_revision:OK
        #247     tc_opts_after:OK
        #248     tc_opts_append:OK
        #249     tc_opts_basic:OK
        #250     tc_opts_before:OK
        #251     tc_opts_chain_classic:OK
        #252     tc_opts_chain_mixed:OK
        #253     tc_opts_delete_empty:OK
        #254     tc_opts_demixed:OK
        #255     tc_opts_detach:OK
        #256     tc_opts_detach_after:OK
        #257     tc_opts_detach_before:OK
        #258     tc_opts_dev_cleanup:OK
        #259     tc_opts_invalid:OK
        #260     tc_opts_mixed:OK
        #261     tc_opts_prepend:OK
        #262     tc_opts_replace:OK
        #263     tc_opts_revision:OK
        [...]
        Summary: 44/38 PASSED, 0 SKIPPED, 0 FAILED
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/8699efc284b75ccdc51ddf7062fa2370330dc6c0.1692029283.git.daniel@iogearbox.net
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
  3. 14 Aug, 2023 3 commits
  4. 12 Aug, 2023 2 commits
  5. 11 Aug, 2023 16 commits
  6. 10 Aug, 2023 7 commits
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 6a1ed143
      Jakub Kicinski authored
      Martin KaFai Lau says:
      
      ====================
      pull-request: bpf-next 2023-08-09
      
      We've added 19 non-merge commits during the last 6 day(s) which contain
      a total of 25 files changed, 369 insertions(+), 141 deletions(-).
      
      The main changes are:
      
      1) Fix array-index-out-of-bounds access when detaching from an
         already empty mprog entry from Daniel Borkmann.
      
      2) Adjust bpf selftest because of a recent llvm change
         related to the cpu-v4 ISA from Eduard Zingerman.
      
      3) Add uprobe support for the bpf_get_func_ip helper from Jiri Olsa.
      
      4) Fix a KASAN splat due to the kernel incorrectly accepting
         an invalid program using the recent cpu-v4 instruction from
         Yonghong Song.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
        bpf: btf: Remove two unused function declarations
        bpf: lru: Remove unused declaration bpf_lru_promote()
        selftests/bpf: relax expected log messages to allow emitting BPF_ST
        selftests/bpf: remove duplicated functions
        bpf, docs: Fix small typo and define semantics of sign extension
        selftests/bpf: Add bpf_get_func_ip test for uprobe inside function
        selftests/bpf: Add bpf_get_func_ip tests for uprobe on function entry
        bpf: Add support for bpf_get_func_ip helper for uprobe program
        selftests/bpf: Add a movsx selftest for sign-extension of R10
        bpf: Fix an incorrect verification success with movsx insn
        bpf, docs: Formalize type notation and function semantics in ISA standard
        bpf: change bpf_alu_sign_string and bpf_movsx_string to static
        libbpf: Use local includes inside the library
        bpf: fix bpf_dynptr_slice() to stop return an ERR_PTR.
        bpf: fix inconsistent return types of bpf_xdp_copy_buf().
        selftests/bpf: fix the incorrect verification of port numbers.
        selftests/bpf: Add test for detachment on empty mprog entry
        bpf: Fix mprog detachment for empty mprog entry
        bpf: bpf_struct_ops: Remove unnecessary initial values of variables
      ====================
      
      Link: https://lore.kernel.org/r/20230810055123.109578-1-martin.lau@linux.dev
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4d016ae4
      Jakub Kicinski authored
      Cross-merge networking fixes after downstream PR.
      
      No conflicts.
      
      Adjacent changes:
      
      drivers/net/ethernet/intel/igc/igc_main.c
        06b41258 ("igc: Add lock to safeguard global Qbv variables")
        d3750076 ("igc: Add TransmissionOverrun counter")
      
      drivers/net/ethernet/microsoft/mana/mana_en.c
        a7dfeda6 ("net: mana: Fix MANA VF unload when hardware is unresponsive")
        a9ca9f9c ("page_pool: split types and declarations from page_pool.h")
        92272ec4 ("eth: add missing xdp.h includes in drivers")
      
      net/mptcp/protocol.h
        511b90e3 ("mptcp: fix disconnect vs accept race")
        b8dc6d6c ("mptcp: fix rcv buffer auto-tuning")
      
      tools/testing/selftests/net/mptcp/mptcp_join.sh
        c8c101ae ("selftests: mptcp: join: fix 'implicit EP' test")
        03668c65 ("selftests: mptcp: join: rework detailed report")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'net-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 25aa0beb
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from netfilter, wireless and bpf.
      
        Still trending up in size but the good news is that the "current"
        regressions are resolved, AFAIK.
      
        We're getting weirdly many fixes for Wake-on-LAN and suspend/resume
        handling on embedded this week (most not merged yet), not sure why.
        But those are all for older bugs.
      
        Current release - regressions:
      
         - tls: set MSG_SPLICE_PAGES consistently when handing encrypted data
           over to TCP
      
        Current release - new code bugs:
      
         - eth: mlx5: correct IDs on VFs internal to the device (IPU)
      
        Previous releases - regressions:
      
         - phy: at803x: fix WoL support / reporting on AR8032
      
         - bonding: fix incorrect deletion of ETH_P_8021AD protocol VID from
           slaves, leading to BUG_ON()
      
         - tun: prevent tun_build_skb() from exceeding the packet size limit
      
         - wifi: rtw89: fix 8852AE disconnection caused by RX full flags
      
         - eth/PCI: enetc: fix probing after 6fffbc7a ("PCI: Honor
           firmware's device disabled status"), keep PCI devices around even
           if they are disabled / not going to be probed to be able to apply
           quirks on them
      
         - eth: prestera: fix handling IPv4 routes with nexthop IDs
      
        Previous releases - always broken:
      
         - netfilter: re-work garbage collection to avoid races between
           user-facing API and timeouts
      
         - tunnels: fix generating ipv4 PMTU error on non-linear skbs
      
         - nexthop: fix infinite nexthop bucket dump when using maximum
           nexthop ID
      
         - wifi: nl80211: fix integer overflow in nl80211_parse_mbssid_elems()
      
        Misc:
      
         - unix: use consistent error code in SO_PEERPIDFD
      
         - ipv6: adjust ndisc_is_useropt() to include PREFIX_INFO, in prep for
           upcoming IETF RFC"
      
      * tag 'net-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
        net: hns3: fix strscpy causing content truncation issue
        net: tls: set MSG_SPLICE_PAGES consistently
        ibmvnic: Ensure login failure recovery is safe from other resets
        ibmvnic: Do partial reset on login failure
        ibmvnic: Handle DMA unmapping of login buffs in release functions
        ibmvnic: Unmap DMA login rsp buffer on send login fail
        ibmvnic: Enforce stronger sanity checks on login response
        net: mana: Fix MANA VF unload when hardware is unresponsive
        netfilter: nf_tables: remove busy mark and gc batch API
        netfilter: nft_set_hash: mark set element as dead when deleting from packet path
        netfilter: nf_tables: adapt set backend to use GC transaction API
        netfilter: nf_tables: GC transaction API to avoid race with control plane
        selftests/bpf: Add sockmap test for redirecting partial skb data
        selftests/bpf: fix a CI failure caused by vsock sockmap test
        bpf, sockmap: Fix bug that strp_done cannot be called
        bpf, sockmap: Fix map type error in sock_map_del_link
        xsk: fix refcount underflow in error path
        ipv6: adjust ndisc_is_useropt() to also return true for PIO
        selftests: forwarding: bridge_mdb: Make test more robust
        selftests: forwarding: bridge_mdb_max: Fix failing test with old libnet
        ...
    • net: hns3: fix strscpy causing content truncation issue · 5e3d2061
      Hao Chen authored
      hns3_dbg_fill_content()/hclge_dbg_fill_content() aims to combine several
      items into one content string, and we add '\n' and '\0' in the last
      two bytes of content.
      
      strscpy() adds '\0' as the last byte of the destination buffer (one of
      the items), which ends the content string early and truncates part of
      the dump.
      
      An example of the erroneous output:
      cat mac_list/uc
      UC MAC_LIST:
      
      Expected:
      UC MAC_LIST:
      FUNC_ID  MAC_ADDR            STATE
      pf       00:2b:19:05:03:00   ACTIVE
      
      The destination buffer is length-bounded and not required to be
      NUL-terminated, so just change strscpy() to memcpy() to fix it.
      
      Fixes: 1cf3d556 ("net: hns3: fix strncpy() not using dest-buf length as length issue")
      Signed-off-by: Hao Chen <chenhao418@huawei.com>
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Link: https://lore.kernel.org/r/20230809020902.1941471-1-shaojijie@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
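The truncation described in the commit above can be reproduced in userspace with a rough model. In this sketch, `sketch_copy_terminated()` is a hypothetical stand-in for a bounded copy that always NUL-terminates (it is not the kernel's strscpy()), and the column offsets and widths are invented for illustration. Filling fixed-width columns with a terminating copy leaves a '\0' in the middle of the line, whereas memcpy() leaves the padding intact.

```c
#include <stddef.h>
#include <string.h>

#define LINE_LEN 32

/* Hypothetical bounded copy that always NUL-terminates, standing in for
 * the behaviour described for strscpy(); it is not the kernel function. */
static void sketch_copy_terminated(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size == 0)
		return;
	if (len >= size)
		len = size - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
}

/* Build one space-padded line with two columns, ending in "\n\0". */
static void sketch_fill_line(char line[LINE_LEN], int use_memcpy)
{
	static const char *const items[] = { "pf", "ACTIVE" };
	static const size_t offsets[] = { 0, 9 };
	static const size_t widths[] = { 9, 21 };
	size_t i;

	memset(line, ' ', LINE_LEN);
	for (i = 0; i < 2; i++) {
		char *dst = line + offsets[i];

		if (use_memcpy)
			memcpy(dst, items[i], strlen(items[i]));
		else
			sketch_copy_terminated(dst, items[i], widths[i]);
	}
	line[LINE_LEN - 2] = '\n';
	line[LINE_LEN - 1] = '\0';
}
```

Printing the line built with the terminating copy stops right after the first column, which matches the symptom of a header followed by an empty dump.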
    • net: tls: set MSG_SPLICE_PAGES consistently · 6b486676
      Jakub Kicinski authored
      We used to change the flags for the last segment, because
      non-last segments had the MSG_SENDPAGE_NOTLAST flag set.
      That flag is no longer a thing so remove the setting.
      
      Since flags most likely don't have MSG_SPLICE_PAGES set
      this avoids passing parts of the sg as splice and parts
      as non-splice. Before commit under Fixes we'd have called
      tcp_sendpage() which would add the MSG_SPLICE_PAGES.
      
      Why this leads to trouble remains unclear but Tariq
      reports hitting the WARN_ON(!sendpage_ok()) due to
      page refcount of 0.
      
      Fixes: e117dcfd ("tls: Inline do_tcp_sendpages()")
      Reported-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/all/4c49176f-147a-4283-f1b1-32aac7b4b996@gmail.com/
      Tested-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20230808180917.1243540-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'dmaengine-fix-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · 30813656
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
      
       - HAS_IOMEM fixes for fsl edma and intel idma
      
       - return-value fix, interrupt vector setting and typo fix for xilinx
         xdma
      
       - email updates for codeaurora email domain move
      
       - correct pause status for pl330 driver
      
       - idxd clear flag on disable fix
      
       - function documentation fix for owl dma
      
       - potential un-allocated memory fix for mcf driver
      
      * tag 'dmaengine-fix-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
        dmaengine: xilinx: xdma: Fix typo
        dmaengine: xilinx: xdma: Fix interrupt vector setting
        dmaengine: owl-dma: Modify mismatched function name
        dmaengine: idxd: Clear PRS disable flag when disabling IDXD device
        dmaengine: pl330: Return DMA_PAUSED when transaction is paused
        dmaengine: qcom_hidma: Update codeaurora email domain
        dmaengine: mcf-edma: Fix a potential un-allocated memory access
        dmaengine: xilinx: xdma: Fix Judgment of the return value
        idmaengine: make FSL_EDMA and INTEL_IDMA64 depends on HAS_IOMEM
    • Merge tag 'nf-23-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 3e91b0eb
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The existing attempt to resolve races between control plane and GC work
      is error prone: as reported by Bien Pham <phamnnb@sea.com>, some places
      forgot to call nft_set_elem_mark_busy(), leading to double-deactivation
      of elements.
      
      This series contains the following patches:
      
      1) Do not skip expired elements during walk otherwise elements might
         never decrement the reference counter on data, leading to memleak.
      
      2) Add a GC transaction API to replace the former attempt to deal with
         races between control plane and GC. The GC worker sets
         NFT_SET_ELEM_DEAD_BIT on elements and creates a GC transaction to
         remove the expired elements; the GC transaction can abort in case of
         interference with the control plane and be retried later (GC async).
         Set backends such as rbtree and pipapo also perform GC from the
         control plane (GC sync); in that case, element deactivation and
         removal is safe because the mutex is held, and collected elements
         are then released via call_rcu().
      
      3) Adapt existing set backends to use the GC transaction API.
      
      4) Update the rhash set backend to set the _DEAD bit to report elements
         deleted from the datapath for GC.
      
      5) Remove old GC batch API and the NFT_SET_ELEM_BUSY_BIT.
      
      * tag 'nf-23-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: remove busy mark and gc batch API
        netfilter: nft_set_hash: mark set element as dead when deleting from packet path
        netfilter: nf_tables: adapt set backend to use GC transaction API
        netfilter: nf_tables: GC transaction API to avoid race with control plane
        netfilter: nf_tables: don't skip expired elements during walk
      ====================
      
      Link: https://lore.kernel.org/r/20230810070830.24064-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
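The asynchronous GC flow summarized in the series above (mark expired elements dead, batch them, abort if the control plane interfered) might be modelled loosely as follows. This is a sketch under stated assumptions rather than the nf_tables implementation: every name is hypothetical, and detecting interference by comparing a sequence counter is an assumption of this example.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BATCH_MAX 8

struct sketch_elem {
	long expires_at;  /* hypothetical expiry timestamp */
	bool dead;        /* plays the role of the "dead" bit */
};

struct sketch_gc_batch {
	unsigned int seq;  /* control-plane sequence seen at collection */
	struct sketch_elem *elems[SKETCH_BATCH_MAX];
	size_t count;
};

/* Mark expired elements dead and queue them; returns how many were queued. */
static size_t sketch_gc_collect(struct sketch_gc_batch *batch,
				struct sketch_elem *set, size_t n,
				long now, unsigned int seq_now)
{
	size_t i;

	batch->seq = seq_now;
	batch->count = 0;
	for (i = 0; i < n && batch->count < SKETCH_BATCH_MAX; i++) {
		if (set[i].expires_at > now)
			continue;
		set[i].dead = true;
		batch->elems[batch->count++] = &set[i];
	}
	return batch->count;
}

/* The batch may only be applied if the control plane did not run in
 * between; otherwise the caller drops it and retries later. */
static bool sketch_gc_can_commit(const struct sketch_gc_batch *batch,
				 unsigned int seq_now)
{
	return batch->seq == seq_now;
}
```

A caller would collect a batch, then check `sketch_gc_can_commit()` with the current sequence before removing the queued elements, and schedule another pass if the check fails.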