1. 13 Jul, 2022 2 commits
    • bnxt_en: Fix bnxt_reinit_after_abort() code path · 4279414b
      Michael Chan authored
      bnxt_reinit_after_abort() is called during ifup when a previous
      FW reset sequence has aborted or a previous ifup has failed after
      detecting FW reset.  In all cases, it is safe to assume that a
      previous FW reset has completed and the driver may not have fully
      reinitialized.
      
      Prior to this patch, it is assumed that the
      FUNC_DRV_IF_CHANGE_RESP_FLAGS_HOT_FW_RESET_DONE flag will always be
      set by the firmware in bnxt_hwrm_if_change().  This may not be true if
      the driver has already attempted to register with the firmware.  The
      firmware may not set the RESET_DONE flag again after the driver has
      registered, assuming that the driver has seen the flag already.
      
      Fix it to always go through the FW reset initialization path if
      the BNXT_STATE_FW_RESET_DET flag is set.  This flag is always set
      by the driver after successfully going through bnxt_reinit_after_abort().
      
      Fixes: 6882c36c ("bnxt_en: attempt to reinitialize after aborted reset")
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: reclaim max resources if sriov enable fails · c5b744d3
      Kashyap Desai authored
      If bnxt_sriov_enable() fails after some resources have been reserved
      for the VFs, the current code is not unwinding properly and the
      reserved resources become unavailable afterwards.  Fix it by
      properly unwinding with a call to bnxt_hwrm_func_qcaps() to
      reset all maximum resources.
      
      Also, add the missing bnxt_ulp_sriov_cfg() call to let the RDMA
      driver know to abort.
      
      Fixes: c0c050c5 ("bnxt_en: New Broadcom ethernet driver.")
      Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 12 Jul, 2022 2 commits
  3. 11 Jul, 2022 4 commits
  4. 09 Jul, 2022 7 commits
    • netfilter: nf_tables: replace BUG_ON by element length check · c39ba4de
      Pablo Neira Ayuso authored
      BUG_ON can be triggered from userspace with an element that has a large
      userdata area. Replace it with a length check and return EINVAL instead.
      Over time extensions have been growing in size.
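
The idea of rejecting an oversized element with an error code, rather than asserting on it, can be sketched in plain userspace C. This is only an illustration under stated assumptions: `ELEM_MAX_LEN`, `struct elem_desc`, and `elem_check_len()` are hypothetical names and the limit is invented; none of this is the nf_tables code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical limit: the largest total size one element may occupy. */
#define ELEM_MAX_LEN 255u

/* Hypothetical descriptor: built-in extensions plus user-supplied data. */
struct elem_desc {
	size_t ext_len;      /* bytes used by built-in extensions */
	size_t userdata_len; /* bytes requested from userspace */
};

/*
 * Validate the element size up front. Input that userspace controls
 * should produce an error code, never an assertion failure.
 * The subtraction form avoids overflow when adding the two lengths.
 */
int elem_check_len(const struct elem_desc *d)
{
	if (d->userdata_len > ELEM_MAX_LEN ||
	    d->ext_len > ELEM_MAX_LEN - d->userdata_len)
		return -EINVAL;
	return 0;
}
```

A caller would run this check before allocating the element and pass the error back to the requester.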
      
      Pick a sufficiently old Fixes: tag to propagate this fix.
      
      Fixes: 7d740264 ("netfilter: nf_tables: variable sized set element keys / data")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • vlan: fix memory leak in vlan_newlink() · 72a0b329
      Eric Dumazet authored
      Blamed commit added back a bug I fixed in commit 9bbd917e
      ("vlan: fix memory leak in vlan_dev_set_egress_priority")
      
      If a memory allocation fails in vlan_changelink() after other allocations
      succeeded, we need to call vlan_dev_free_egress_priority()
      to free all allocated memory because after a failed ->newlink()
      we do not call any methods like ndo_uninit() or dev->priv_destructor().
      
      In the following example, if the allocation for the last element 2000:2001
      fails, we need to free the eight prior allocations:
      
      ip link add link dummy0 dummy0.100 type vlan id 100 \
      	egress-qos-map 1:2 2:3 3:4 4:5 5:6 6:7 7:8 8:9 2000:2001
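
The unwinding described above can be modelled in userspace C. This is a minimal sketch, not the vlan code: `struct prio_map`, `struct dev_state`, the `fail_at` test hook, and every function name here are hypothetical, and the allocation failure is simulated.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for one egress priority mapping. */
struct prio_map {
	struct prio_map *next;
	unsigned int from, to;
};

struct dev_state {
	struct prio_map *maps; /* singly linked list of mappings */
	int nr_maps;           /* mappings currently allocated */
	int nr_allocs;         /* allocation attempts so far */
	int fail_at;           /* test hook: fail the Nth attempt (0 = never) */
};

int add_map(struct dev_state *d, unsigned int from, unsigned int to)
{
	struct prio_map *m;

	if (++d->nr_allocs == d->fail_at)
		return -ENOMEM; /* simulated allocation failure */
	m = malloc(sizeof(*m));
	if (!m)
		return -ENOMEM;
	m->from = from;
	m->to = to;
	m->next = d->maps;
	d->maps = m;
	d->nr_maps++;
	return 0;
}

/* Free every mapping allocated so far. */
void free_maps(struct dev_state *d)
{
	while (d->maps) {
		struct prio_map *m = d->maps;

		d->maps = m->next;
		free(m);
		d->nr_maps--;
	}
}

/* On any failure, unwind here: no later teardown hook will run. */
int configure(struct dev_state *d, const unsigned int (*pairs)[2], int n)
{
	for (int i = 0; i < n; i++) {
		int err = add_map(d, pairs[i][0], pairs[i][1]);

		if (err) {
			free_maps(d);
			return err;
		}
	}
	return 0;
}

/* Configure the nine pairs from the example; report what is still allocated. */
int maps_left_after_config(int fail_at)
{
	static const unsigned int pairs[9][2] = {
		{1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 6},
		{6, 7}, {7, 8}, {8, 9}, {2000, 2001},
	};
	struct dev_state d = { .fail_at = fail_at };
	int left;

	configure(&d, pairs, 9);
	left = d.nr_maps;
	free_maps(&d);
	return left;
}
```

With `fail_at` set to 9, the ninth attempt fails and the eight earlier mappings are released before `configure()` returns.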
      
      syzbot report was:
      
      BUG: memory leak
      unreferenced object 0xffff888117bd1060 (size 32):
      comm "syz-executor408", pid 3759, jiffies 4294956555 (age 34.090s)
      hex dump (first 32 bytes):
      09 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 ................
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
      backtrace:
      [<ffffffff83fc60ad>] kmalloc include/linux/slab.h:600 [inline]
      [<ffffffff83fc60ad>] vlan_dev_set_egress_priority+0xed/0x170 net/8021q/vlan_dev.c:193
      [<ffffffff83fc6628>] vlan_changelink+0x178/0x1d0 net/8021q/vlan_netlink.c:128
      [<ffffffff83fc67c8>] vlan_newlink+0x148/0x260 net/8021q/vlan_netlink.c:185
      [<ffffffff838b1278>] rtnl_newlink_create net/core/rtnetlink.c:3363 [inline]
      [<ffffffff838b1278>] __rtnl_newlink+0xa58/0xdc0 net/core/rtnetlink.c:3580
      [<ffffffff838b1629>] rtnl_newlink+0x49/0x70 net/core/rtnetlink.c:3593
      [<ffffffff838ac66c>] rtnetlink_rcv_msg+0x21c/0x5c0 net/core/rtnetlink.c:6089
      [<ffffffff839f9c37>] netlink_rcv_skb+0x87/0x1d0 net/netlink/af_netlink.c:2501
      [<ffffffff839f8da7>] netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
      [<ffffffff839f8da7>] netlink_unicast+0x397/0x4c0 net/netlink/af_netlink.c:1345
      [<ffffffff839f9266>] netlink_sendmsg+0x396/0x710 net/netlink/af_netlink.c:1921
      [<ffffffff8384dbf6>] sock_sendmsg_nosec net/socket.c:714 [inline]
      [<ffffffff8384dbf6>] sock_sendmsg+0x56/0x80 net/socket.c:734
      [<ffffffff8384e15c>] ____sys_sendmsg+0x36c/0x390 net/socket.c:2488
      [<ffffffff838523cb>] ___sys_sendmsg+0x8b/0xd0 net/socket.c:2542
      [<ffffffff838525b8>] __sys_sendmsg net/socket.c:2571 [inline]
      [<ffffffff838525b8>] __do_sys_sendmsg net/socket.c:2580 [inline]
      [<ffffffff838525b8>] __se_sys_sendmsg net/socket.c:2578 [inline]
      [<ffffffff838525b8>] __x64_sys_sendmsg+0x78/0xf0 net/socket.c:2578
      [<ffffffff845ad8d5>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      [<ffffffff845ad8d5>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
      [<ffffffff8460006a>] entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      Fixes: 37aa50c5 ("vlan: introduce vlan_dev_free_egress_priority")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Reviewed-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: fix issue of skb segments exceeds descriptor limitation · 9c840d5f
      Baowen Zheng authored
      TCP packets are dropped if the number of segments in the tx skb
      exceeds the descriptor limit when sending iperf3 traffic with the
      --zerocopy option.
      
      We make the following changes:
      
      Get nr_frags in nfp_nfdk_tx_maybe_close_block() instead of passing it in
      from outside, because it changes after the skb_linearize() operation.
      
      Fill the maximum dma_len in the first tx descriptor to make sure the whole
      head is included in the first descriptor.
      
      Fixes: c10d12e3 ("nfp: add support for NFDK data path")
      Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
      Reviewed-by: Louis Peens <louis.peens@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: nf_log: incorrect offset to network header · 7a847c00
      Pablo Neira Ayuso authored
      NFPROTO_ARP is expecting to find the ARP header at the network offset.
      
      In the particular case of ARP, HTYPE= field shows the initial bytes of
      the ethernet header destination MAC address.
      
       netdev out: IN= OUT=bridge0 MACSRC=c2:76:e5:71:e1:de MACDST=36:b0:4a:e2:72:ea MACPROTO=0806 ARP HTYPE=14000 PTYPE=0x4ae2 OPCODE=49782
      
      NFPROTO_NETDEV egress hook is also expecting to find the IP headers at
      the network offset.
      
      Fixes: 35b93951 ("netfilter: add generic ARP packet logger")
      Reported-by: Tom Yan <tom.ty89@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge branch 'selftests-forwarding-install-two-missing-tests' · 6676d727
      Jakub Kicinski authored
      Martin Blumenstingl says:
      
      ====================
      selftests: forwarding: Install two missing tests
      
      For some distributions (e.g. OpenWrt) we don't want to rely on rsync
      to copy the tests to the target as some extra dependencies need to be
      installed. The Makefile in tools/testing/selftests/net/forwarding
      already installs most of the tests.
      
      This series adds the two missing tests to the list of installed tests.
      That way a downstream distribution can build a package using this
      Makefile (and add dependencies there as needed).
      ====================
      
      Link: https://lore.kernel.org/r/20220707135532.1783925-1-martin.blumenstingl@googlemail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: Install no_forwarding.sh · cfbba7b4
      Martin Blumenstingl authored
      When using the Makefile from tools/testing/selftests/net/forwarding/
      all tests should be installed. Add no_forwarding.sh to the list of
      "to be installed tests" where it has been missing so far.
      
      Fixes: 476a4f05 ("selftests: forwarding: add a no_forwarding.sh test")
      Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: Install local_termination.sh · 437ac259
      Martin Blumenstingl authored
      When using the Makefile from tools/testing/selftests/net/forwarding/
      all tests should be installed. Add local_termination.sh to the list of
      "to be installed tests" where it has been missing so far.
      
      Fixes: 90b9566a ("selftests: forwarding: add a test for local_termination.sh")
      Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
      Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. 08 Jul, 2022 20 commits
  6. 07 Jul, 2022 5 commits
    • netfilter: conntrack: fix crash due to confirmed bit load reordering · 0ed8f619
      Florian Westphal authored
      Kajetan Puchalski reports a crash on ARM, with a backtrace of:
      
      __nf_ct_delete_from_lists
      nf_ct_delete
      early_drop
      __nf_conntrack_alloc
      
      Unlike atomic_inc_not_zero, refcount_inc_not_zero is not a full barrier.
      conntrack uses SLAB_TYPESAFE_BY_RCU, i.e. it is possible that a 'newly'
      allocated object is still in use on another CPU:
      
      CPU1						CPU2
      						encounter 'ct' during hlist walk
       delete_from_lists
       refcount drops to 0
       kmem_cache_free(ct);
       __nf_conntrack_alloc() // returns same object
      						refcount_inc_not_zero(ct); /* might fail */
      
      						/* If set, ct is public/in the hash table */
      						test_bit(IPS_CONFIRMED_BIT, &ct->status);
      
      In case CPU1 already set refcount back to 1, refcount_inc_not_zero()
      will succeed.
      
      The expected possibilities for a CPU that obtained the object 'ct'
      (but no reference so far) are:
      
      1. refcount_inc_not_zero() fails.  CPU2 ignores the object and moves to
         the next entry in the list.  This happens for objects that are about
         to be free'd, that have been free'd, or that have been reallocated
         by __nf_conntrack_alloc(), but where the refcount has not been
         increased back to 1 yet.
      
      2. refcount_inc_not_zero() succeeds. CPU2 checks the CONFIRMED bit
         in ct->status.  If set, the object is public/in the table.
      
         If not, the object must be skipped; CPU2 calls nf_ct_put() to
         un-do the refcount increment and moves to the next object.
      
      Parallel deletion from the hlists is prevented by a
      'test_and_set_bit(IPS_DYING_BIT, &ct->status);' check, i.e. only one
      cpu will do the unlink, the other one will only drop its reference count.
      
      Because refcount_inc_not_zero is not a full barrier, CPU2 may try to
      delete an object that is not on any list:
      
      1. refcount_inc_not_zero() successful (refcount inited to 1 on other CPU)
      2. CONFIRMED test also successful (load was reordered or zeroing
         of ct->status not yet visible)
      3. delete_from_lists unlinks entry not on the hlist, because
         IPS_DYING_BIT is 0 (already cleared).
      
      2) is already wrong: CPU2 will handle a partially initialised object
      that is supposed to be private to CPU1.
      
      Add needed barriers when refcount_inc_not_zero() is successful.
      
      It also inserts a smp_wmb() before the refcount is set to 1 during
      allocation.
      
      Because other CPUs might still see the object, refcount_set(1)
      "resurrects" it, so we need to make sure that other CPUs will also observe
      the right content.  In particular, the CONFIRMED bit test must only pass
      once the object is fully initialised and either in the hash or about to be
      inserted (with locks held to delay possible unlink from early_drop or
      gc worker).
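
The ordering requirement can be illustrated with C11 atomics in userspace. This is a sketch under assumptions, not the conntrack code: `struct obj`, `CONFIRMED`, and all function names are hypothetical, the release fence stands in for smp_wmb(), and the acquire fence stands in for smp_acquire__after_ctrl_dep().

```c
#include <stdatomic.h>
#include <stdbool.h>

#define CONFIRMED 0x1u /* hypothetical stand-in for the CONFIRMED bit */

struct obj {
	atomic_uint refcnt; /* 0 means free or not yet published */
	atomic_uint status; /* flag word that readers test */
};

/* Allocation side: initialise everything, then publish via the refcount. */
void obj_init(struct obj *o)
{
	atomic_store_explicit(&o->status, 0, memory_order_relaxed);
	/* Make the stores above visible before the refcount becomes 1. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&o->refcnt, 1, memory_order_relaxed);
}

/* Reader side: take a reference only if the count is non-zero. */
bool obj_get_unless_zero(struct obj *o)
{
	unsigned int old = atomic_load_explicit(&o->refcnt, memory_order_relaxed);

	do {
		if (old == 0)
			return false;
	} while (!atomic_compare_exchange_weak_explicit(&o->refcnt, &old, old + 1,
							memory_order_relaxed,
							memory_order_relaxed));
	/* Order the later status load after the successful increment. */
	atomic_thread_fence(memory_order_acquire);
	return true;
}

void obj_put(struct obj *o)
{
	atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_release);
}

/* Returns true (holding a reference) only for a live, confirmed object. */
bool obj_lookup(struct obj *o)
{
	if (!obj_get_unless_zero(o))
		return false;
	if (!(atomic_load_explicit(&o->status, memory_order_relaxed) & CONFIRMED)) {
		obj_put(o); /* not public yet: undo the increment and skip it */
		return false;
	}
	return true;
}

/* Single-threaded walk through the three cases; returns the final refcount. */
unsigned int refs_after_lookup(bool live, bool confirmed)
{
	struct obj o;

	atomic_init(&o.refcnt, 0);
	atomic_init(&o.status, 0);
	if (live)
		obj_init(&o);
	if (confirmed)
		atomic_fetch_or_explicit(&o.status, CONFIRMED, memory_order_release);
	obj_lookup(&o);
	return atomic_load_explicit(&o.refcnt, memory_order_relaxed);
}
```

The helper only exercises the control flow on one thread; the fences matter when `obj_init()` and `obj_lookup()` run concurrently on different CPUs.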
      
      I did not change flow_offload_alloc(), as far as I can see it should call
      refcount_inc(), not refcount_inc_not_zero(): the ct object is attached to
      the skb so its refcount should be >= 1 in all cases.
      
      v2: prefer smp_acquire__after_ctrl_dep to smp_rmb (Will Deacon).
      v3: keep smp_acquire__after_ctrl_dep close to refcount_inc_not_zero call
          add comment in nf_conntrack_netlink, no control dependency there
          due to locks.
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/all/Yr7WTfd6AVTQkLjI@e126311.manchester.arm.com/
      Reported-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
      Diagnosed-by: Will Deacon <will@kernel.org>
      Fixes: 71977437 ("netfilter: conntrack: convert to refcount_t api")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Will Deacon <will@kernel.org>
    • bpf: Make sure mac_header was set before using it · 0326195f
      Eric Dumazet authored
      Classic BPF has a way to load bytes starting from the mac header.
      
      Some skbs do not have a mac header, and skb_mac_header()
      in this case returns a pointer that is 65535 bytes after
      skb->head.
      
      The existing range check in bpf_internal_load_pointer_neg_helper()
      was properly kicking in and no illegal access was happening.
      
      New sanity check in skb_mac_header() is firing, so we need
      to avoid it.
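
The "check before use" idea can be modelled in userspace C. This is a sketch under assumptions: `struct pkt`, `MAC_HEADER_UNSET`, and the function names are hypothetical stand-ins, with the unset offset modelled as 65535 to match the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* 65535: sentinel offset meaning "no mac header was recorded". */
#define MAC_HEADER_UNSET UINT16_MAX

/* Hypothetical, much reduced stand-in for a packet buffer. */
struct pkt {
	uint8_t *head;       /* start of the buffer */
	uint16_t mac_header; /* offset from head, or MAC_HEADER_UNSET */
};

int pkt_mac_header_was_set(const struct pkt *p)
{
	return p->mac_header != MAC_HEADER_UNSET;
}

/*
 * Return a pointer to the mac header, or NULL if it was never set.
 * Testing the sentinel first means the pointer is never computed
 * from a meaningless offset.
 */
uint8_t *pkt_mac_header_or_null(const struct pkt *p)
{
	if (!pkt_mac_header_was_set(p))
		return NULL;
	return p->head + p->mac_header;
}

/* Small driver: offset of the returned pointer from head, or -1 if unset. */
ptrdiff_t mac_offset_or_minus1(uint16_t off)
{
	uint8_t buf[64] = { 0 };
	struct pkt p = { buf, off };
	uint8_t *mac = pkt_mac_header_or_null(&p);

	return mac ? mac - p.head : -1;
}
```

A caller that loads bytes relative to the mac header would treat the NULL return as "nothing to load" instead of relying on a later range check.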
      
      WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 skb_mac_header include/linux/skbuff.h:2785 [inline]
      WARNING: CPU: 1 PID: 28990 at include/linux/skbuff.h:2785 bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74
      Modules linked in:
      CPU: 1 PID: 28990 Comm: syz-executor.0 Not tainted 5.19.0-rc4-syzkaller-00865-g4874fb94 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022
      RIP: 0010:skb_mac_header include/linux/skbuff.h:2785 [inline]
      RIP: 0010:bpf_internal_load_pointer_neg_helper+0x1b1/0x1c0 kernel/bpf/core.c:74
      Code: ff ff 45 31 f6 e9 5a ff ff ff e8 aa 27 40 00 e9 3b ff ff ff e8 90 27 40 00 e9 df fe ff ff e8 86 27 40 00 eb 9e e8 2f 2c f3 ff <0f> 0b eb b1 e8 96 27 40 00 e9 79 fe ff ff 90 41 57 41 56 41 55 41
      RSP: 0018:ffffc9000309f668 EFLAGS: 00010216
      RAX: 0000000000000118 RBX: ffffffffffeff00c RCX: ffffc9000e417000
      RDX: 0000000000040000 RSI: ffffffff81873f21 RDI: 0000000000000003
      RBP: ffff8880842878c0 R08: 0000000000000003 R09: 000000000000ffff
      R10: 000000000000ffff R11: 0000000000000001 R12: 0000000000000004
      R13: ffff88803ac56c00 R14: 000000000000ffff R15: dffffc0000000000
      FS: 00007f5c88a16700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fdaa9f6c058 CR3: 000000003a82c000 CR4: 00000000003506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <TASK>
      ____bpf_skb_load_helper_32 net/core/filter.c:276 [inline]
      bpf_skb_load_helper_32+0x191/0x220 net/core/filter.c:264
      
      Fixes: f9aefd6b ("net: warn if mac header was not set")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20220707123900.945305-1-edumazet@google.com
    • Merge tag 'net-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · ef4ab3ba
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bpf, netfilter, can, and bluetooth.
      
        Current release - regressions:
      
         - bluetooth: fix deadlock on hci_power_on_sync
      
        Previous releases - regressions:
      
         - sched: act_police: allow 'continue' action offload
      
         - eth: usbnet: fix memory leak in error case
      
         - eth: ibmvnic: properly dispose of all skbs during a failover
      
        Previous releases - always broken:
      
         - bpf:
             - fix insufficient bounds propagation from
               adjust_scalar_min_max_vals
             - clear page contiguity bit when unmapping pool
      
         - netfilter: nft_set_pipapo: release elements in clone from
           abort path
      
         - mptcp: netlink: issue MP_PRIO signals from userspace PMs
      
         - can:
             - rcar_canfd: fix data transmission failed on R-Car V3U
             - gs_usb: gs_usb_open/close(): fix memory leak
      
        Misc:
      
         - add Wenjia as SMC maintainer"
      
      * tag 'net-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
        wireguard: Kconfig: select CRYPTO_CHACHA_S390
        crypto: s390 - do not depend on CRYPTO_HW for SIMD implementations
        wireguard: selftests: use microvm on x86
        wireguard: selftests: always call kernel makefile
        wireguard: selftests: use virt machine on m68k
        wireguard: selftests: set fake real time in init
        r8169: fix accessing unset transport header
        net: rose: fix UAF bug caused by rose_t0timer_expiry
        usbnet: fix memory leak in error case
        Revert "tls: rx: move counting TlsDecryptErrors for sync"
        mptcp: update MIB_RMSUBFLOW in cmd_sf_destroy
        mptcp: fix local endpoint accounting
        selftests: mptcp: userspace PM support for MP_PRIO signals
        mptcp: netlink: issue MP_PRIO signals from userspace PMs
        mptcp: Acquire the subflow socket lock before modifying MP_PRIO flags
        mptcp: Avoid acquiring PM lock for subflow priority changes
        mptcp: fix locking in mptcp_nl_cmd_sf_destroy()
        net/mlx5e: Fix matchall police parameters validation
        net/sched: act_police: allow 'continue' action offload
        net: lan966x: hardcode the number of external ports
        ...
    • Merge tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 651a8536
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
      
       - Tag Intel pin control as supported in MAINTAINERS
      
       - Fix a NULL pointer exception in the Aspeed driver
      
       - Correct some NAND functions in the Sunxi A83T driver
      
       - Use the right offset for some Sunxi pins
      
       - Fix a zero base offset in the Freescale (NXP) i.MX93
      
       - Fix the IRQ support in the STM32 driver
      
      * tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: stm32: fix optional IRQ support to gpios
        pinctrl: imx: Add the zero base flag for imx93
        pinctrl: sunxi: sunxi_pconf_set: use correct offset
        pinctrl: sunxi: a83t: Fix NAND function name for some pins
        pinctrl: aspeed: Fix potential NULL dereference in aspeed_pinmux_set_mux()
        MAINTAINERS: Update Intel pin control to Supported
    • signal handling: don't use BUG_ON() for debugging · a382f8fe
      Linus Torvalds authored
      These are indeed "should not happen" situations, but it turns out recent
      changes made the 'task_is_stopped_or_trace()' case trigger (fix for that
      exists, is pending more testing), and the BUG_ON() makes it
      unnecessarily hard to actually debug for no good reason.
      
      It's been that way for a long time, but let's make it clear: BUG_ON() is
      not good for debugging, and should never be used in situations where you
      could just say "this shouldn't happen, but we can continue".
      
      Use WARN_ON_ONCE() instead to make sure it gets logged, and then just
      continue running, rather than making the system basically unusable
      because you crashed the machine while potentially holding some very core
      locks (eg this function is commonly called while holding 'tasklist_lock'
      for writing).
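
The "log once and keep going" behaviour can be approximated in userspace C. This is only a sketch: `warn_once()`, `warnings_logged`, and `handle_state()` are hypothetical names, and the real WARN_ON_ONCE() macro also prints a backtrace, which is not modelled here.

```c
#include <stdio.h>

int warnings_logged; /* counts the messages actually printed */

/*
 * Report a "should not happen" condition the first time it is seen at a
 * given call site, then stay quiet. Returns the condition so the caller
 * can branch on it and carry on.
 */
int warn_once(int cond, int *already_warned, const char *what)
{
	if (cond && !*already_warned) {
		*already_warned = 1;
		warnings_logged++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical caller: a negative state is unexpected but recoverable. */
int handle_state(int state)
{
	static int warned; /* one flag per call site */

	if (warn_once(state < 0, &warned, "state < 0"))
		return 0; /* skip the work, keep running */
	return state * 2;
}
```

Calling `handle_state()` twice with a negative value prints a single warning, and later valid calls still work.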
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>