1. 07 Jun, 2021 7 commits
  2. 04 Jun, 2021 19 commits
    • cxgb4: avoid link re-train during TC-MQPRIO configuration · 3822d067
      Rahul Lakkireddy authored
      When configuring TC-MQPRIO offload, only turn off netdev carrier and
      don't bring physical link down in hardware. Otherwise, when the
      physical link is brought up again after configuration, it gets
      re-trained and stalls ongoing traffic.
      
      Also, when firmware is no longer accessible or crashed, avoid sending
      FLOWC and waiting for reply that will never come.
      
      Fix the following hung_task_timeout_secs trace seen in these cases.
      
      INFO: task tc:20807 blocked for more than 122 seconds.
            Tainted: G S                5.13.0-rc3+ #122
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      task:tc   state:D stack:14768 pid:20807 ppid: 19366 flags:0x00000000
      Call Trace:
       __schedule+0x27b/0x6a0
       schedule+0x37/0xa0
       schedule_preempt_disabled+0x5/0x10
       __mutex_lock.isra.14+0x2a0/0x4a0
       ? netlink_lookup+0x120/0x1a0
       ? rtnl_fill_ifinfo+0x10f0/0x10f0
       __netlink_dump_start+0x70/0x250
       rtnetlink_rcv_msg+0x28b/0x380
       ? rtnl_fill_ifinfo+0x10f0/0x10f0
       ? rtnl_calcit.isra.42+0x120/0x120
       netlink_rcv_skb+0x4b/0xf0
       netlink_unicast+0x1a0/0x280
       netlink_sendmsg+0x216/0x440
       sock_sendmsg+0x56/0x60
       __sys_sendto+0xe9/0x150
       ? handle_mm_fault+0x6d/0x1b0
       ? do_user_addr_fault+0x1c5/0x620
       __x64_sys_sendto+0x1f/0x30
       do_syscall_64+0x3c/0x80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f7f73218321
      RSP: 002b:00007ffd19626208 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 000055b7c0a8b240 RCX: 00007f7f73218321
      RDX: 0000000000000028 RSI: 00007ffd19626210 RDI: 0000000000000003
      RBP: 000055b7c08680ff R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000055b7c085f5f6
      R13: 000055b7c085f60a R14: 00007ffd19636470 R15: 00007ffd196262a0
      
      Fixes: b1396c2b ("cxgb4: parse and configure TC-MQPRIO offload")
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sch_htb: fix refcount leak in htb_parent_to_leaf_offload · 944d671d
      Yunjian Wang authored
      Commit ae81feb7 ("sch_htb: fix null pointer dereference
      on a null new_q") fixes a NULL pointer dereference bug, but it
      is not correct.
      
      htb_graft_helper properly handles the case when new_q is NULL,
      and by skipping this call the previous patch creates an
      inconsistency: dev_queue->qdisc will still point to the old
      qdisc, but cl->parent->leaf.q will point to the new one (which
      will be noop_qdisc, because new_q was NULL). The code is based
      on an assumption that these two pointers are the same, so it
      can lead to refcount leaks.
      
      The correct fix is to add a NULL pointer check to protect
      qdisc_refcount_inc inside htb_parent_to_leaf_offload.
      
      Fixes: ae81feb7 ("sch_htb: fix null pointer dereference on a null new_q")
      Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
      Suggested-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 26821ecd
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2021-06-04
      
      This series contains updates to virtchnl header file and ice driver.
      
      Brett fixes VF being unable to request a different number of queues than
      allocated and adds clearing of VF_MBX_ATQLEN register for VF reset.
      
      Haiyue handles error of rebuilding VF VSI during reset.
      
      Paul fixes reporting of autoneg to use the PHY capabilities.
      
      Dave allows LLDP packets without priority of TC_PRIO_CONTROL to be
      transmitted.
      
      Geert Uytterhoeven adds explicit padding to virtchnl_proto_hdrs
      structure in the virtchnl header file.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'wireguard-fixes' · 6fd815bb
      David S. Miller authored
      Jason A. Donenfeld says:
      
      ====================
      wireguard fixes for 5.13-rc5
      
      Here are bug fixes to WireGuard for 5.13-rc5:
      
      1-2,6) These are small, trivial tweaks to our test harness.
      
      3) Linus thinks -O3 is still dangerous to enable. The code gen wasn't so
         much different with -O2 either.
      
      4) We were accidentally calling synchronize_rcu instead of
         synchronize_net while holding the rtnl_lock, resulting in some rather
         large stalls that hit production machines.
      
      5) Peer allocation was wasting literally hundreds of megabytes on real
         world deployments, due to oddly sized large objects not fitting
         nicely into a kmalloc slab.
      
      7-9) We move from an insanely expensive O(n) algorithm to a fast O(1)
           algorithm, and clean up a massive memory leak in the process, in
           which allowed ips churn would leave dangling nodes hanging around
           without cleanup until the interface was removed. The O(1) algorithm
           eliminates packet stalls and high latency issues, in addition to
           bringing operations that took as much as 10 minutes down to less
           than a second.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • wireguard: allowedips: free empty intermediate nodes when removing single node · bf7b042d
      Jason A. Donenfeld authored
      When removing single nodes, it's possible that that node's parent is an
      empty intermediate node, in which case, it too should be removed.
      Otherwise the trie fills up and never is fully emptied, leading to
      gradual memory leaks over time for tries that are modified often. There
      was originally code to do this, but it was removed during refactoring in
      2016 and never reworked. Now that we have proper parent pointers from
      the previous commits, we can implement this properly.
      
      In order to reduce branching and expensive comparisons, we want to keep
      the double pointer for parent assignment (which lets us easily chain up
      to the root), but we still need to actually get the parent's base
      address. So encode the bit number into the last two bits of the pointer,
      and pack and unpack it as needed. This is a little bit clumsy but is the
      fastest and least memory-wasteful of the compromises. Note that we align
      the root struct here to a minimum of 4, because it's embedded into a
      larger struct, and we're relying on having the bottom two bits for our
      flag, which would only be 16-bit aligned on m68k.
      
      The existing macro-based helpers were a bit unwieldy for adding the bit
      packing to, so this commit replaces them with safer and clearer ordinary
      functions.
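The pointer packing described above can be illustrated with a small user-space sketch. This is not the WireGuard code: the struct layout and the helper names (`pack_parent`, `unpack_parent`, `unpack_bit`, `parent_slot`) are hypothetical, only the values 0 and 1 are stored in the low bits, and the `aligned(4)` attribute is a GCC/Clang extension used here to guarantee that the two lowest address bits are free.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node type; only the packed parent reference matters here. */
struct node {
    struct node *bit[2];      /* children, indexed by bit value */
    uintptr_t parent_packed;  /* parent address | index of our slot */
} __attribute__((aligned(4)));

/* Pack a parent pointer and a child index (0 or 1) into one word.
 * Relies on the parent being at least 4-byte aligned, so the two
 * lowest bits of its address are always zero. */
static uintptr_t pack_parent(struct node *parent, unsigned int bit)
{
    uintptr_t addr = (uintptr_t)parent;

    assert((addr & 3) == 0);
    assert(bit <= 1);
    return addr | bit;
}

/* Mask off the two flag bits to get the parent's base address back. */
static struct node *unpack_parent(uintptr_t packed)
{
    return (struct node *)(packed & ~(uintptr_t)3);
}

/* Extract the child index stored in the lowest bit. */
static unsigned int unpack_bit(uintptr_t packed)
{
    return (unsigned int)(packed & 1);
}

/* Recover the slot inside the parent that points at the child. */
static struct node **parent_slot(uintptr_t packed)
{
    return &unpack_parent(packed)->bit[unpack_bit(packed)];
}
```

With both the parent's base address and the slot available from one word, a removal routine can rewrite the slot and also inspect the parent itself, for instance to check whether it has become an empty intermediate node.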
      
      We add a test to the randomized/fuzzer part of the selftests, to free
      the randomized tries by-peer, refuzz it, and repeat, until it's supposed
      to be empty, and then see if that actually resulted in the whole
      thing being emptied. That combined with kmemcheck should hopefully make
      sure this commit is doing what it should. Along the way this resulted in
      various other cleanups of the tests and fixes for recent graphviz.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • wireguard: allowedips: allocate nodes in kmem_cache · dc680de2
      Jason A. Donenfeld authored
      The previous commit moved from O(n) to O(1) for removal, but in the
      process introduced an additional pointer member to a struct that
      increased the size from 60 to 68 bytes, putting nodes in the 128-byte
      slab. With deployed systems having as many as 2 million nodes, this
      represents a significant doubling in memory usage (128 MiB -> 256 MiB).
      Fix this by using our own kmem_cache, that's sized exactly right. This
      also makes wireguard's memory usage more transparent in tools like
      slabtop and /proc/slabinfo.
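The arithmetic behind those figures can be checked with a short sketch. It assumes the power-of-two size classes that the message describes, and it takes "2 million" to mean 2^21 nodes so that the totals come out as exactly 128 MiB and 256 MiB; the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Round an allocation size up to the next power of two, which is the
 * size-class model the commit message assumes for these objects. */
static size_t pow2_size_class(size_t size)
{
    size_t cls = 1;

    assert(size > 0);
    while (cls < size)
        cls <<= 1;
    return cls;
}

/* Total bytes consumed by `count` objects of `size` bytes each,
 * when every object occupies a whole size class. */
static size_t total_bytes(size_t count, size_t size)
{
    return count * pow2_size_class(size);
}
```

Under these assumptions a 60-byte node occupies 64 bytes and a 68-byte node occupies 128 bytes, so growing the struct by one pointer doubles the total. A cache sized for the object avoids the jump to the next class.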
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Suggested-by: Matthew Wilcox <willy@infradead.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • wireguard: allowedips: remove nodes in O(1) · f634f418
      Jason A. Donenfeld authored
      Previously, deleting peers would require traversing the entire trie in
      order to rebalance nodes and safely free them. This meant that removing
      1000 peers from a trie with a half million nodes would take an extremely
      long time, during which we're holding the rtnl lock. Large-scale users
      were reporting 200ms latencies added to the networking stack as a whole
      every time their userspace software would queue up significant removals.
      That's a serious situation.
      
      This commit fixes that by maintaining a double pointer to the parent's
      bit pointer for each node, and then using the already existing node list
      belonging to each peer to go directly to the node, fix up its pointers,
      and free it with RCU. This means removal is O(1) instead of O(n), and we
      don't use gobs of stack.
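The idea of keeping a double pointer to the parent's slot can be sketched in user space as follows. This is a simplified illustration, not the allowedips implementation: there is no RCU, no peer list, and the names `parent_slot`, `link_child` and `unlink_node` are hypothetical. A node with two children cannot be spliced out this way, so the sketch just reports that case.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *child[2];
    struct node **parent_slot; /* address of the pointer that points at us */
};

/* Store `n` in `slot` and remember where we were stored. */
static void link_child(struct node **slot, struct node *n)
{
    *slot = n;
    if (n)
        n->parent_slot = slot;
}

/* Unlink a node with at most one child without walking the tree.
 * Returns 0 on success, -1 if the node has two children and must stay. */
static int unlink_node(struct node *n)
{
    struct node *child;

    assert(n->parent_slot != NULL);
    if (n->child[0] && n->child[1])
        return -1;

    /* Splice the only child (or NULL) into the slot that held us. */
    child = n->child[0] ? n->child[0] : n->child[1];
    *n->parent_slot = child;
    if (child)
        child->parent_slot = n->parent_slot;

    n->child[0] = n->child[1] = NULL;
    n->parent_slot = NULL;
    return 0;
}
```

Because each node knows the address of the pointer that refers to it, the cost of removal does not depend on the number of nodes in the tree, and the root needs no special case: its slot is simply the root pointer.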
      
      The removal algorithm has the same downside as the code that it fixes:
      it won't collapse needlessly long runs of fillers.  We can enhance that
      in the future if it ever becomes a problem. This commit documents that
      limitation with a TODO comment in code, a small but meaningful
      improvement over the prior situation.
      
      Currently the biggest flaw, which the next commit addresses, is that
      this increases the node size on 64-bit machines from 60 bytes to
      68 bytes. 60 rounds up to 64, but 68 rounds up to 128. So we wind up
      using twice as much memory per node, because of power-of-two
      allocations, which is a big bummer. We'll need to figure something out
      there.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • wireguard: allowedips: initialize list head in selftest · 46cfe8ee
      Jason A. Donenfeld authored
      The randomized trie tests weren't initializing the dummy peer list head,
      resulting in a NULL pointer dereference when used. Fix this by
      initializing it in the randomized trie test, just like we do for the
      static unit test.
      
      While we're at it, all of the other strings like this have the word
      "self-test", so add it to the missing place here.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • wireguard: peer: allocate in kmem_cache · a4e9f8e3
      Jason A. Donenfeld authored
      With deployments having upwards of 600k peers now, this somewhat heavy
      structure could benefit from more fine-grained allocations.
      Specifically, instead of using a 2048-byte slab for a 1544-byte object,
      we can now use 1544-byte objects directly, thus saving almost 25%
      per-peer, or with 600k peers, that's a savings of 303 MiB. This also
      makes wireguard's memory usage more transparent in tools like slabtop
      and /proc/slabinfo.
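The per-peer saving quoted above follows from simple subtraction, sketched below. The helper name is hypothetical, and the sketch ignores any padding a dedicated cache may add for alignment, so the numbers are an upper bound on the saving rather than a measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes left unused when an object of `object_size` bytes is placed
 * in a slab slot of `slot_size` bytes. */
static size_t wasted_per_object(size_t slot_size, size_t object_size)
{
    assert(object_size <= slot_size);
    return slot_size - object_size;
}
```

For a 1544-byte peer in a 2048-byte slot this gives 504 wasted bytes, a little under 25% of the slot. Multiplied by 600,000 peers that is about 302 million bytes, in line with the figure quoted above.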
      
      Fixes: 8b5553ac ("wireguard: queueing: get rid of per-peer ring buffers")
      Suggested-by: Arnd Bergmann <arnd@arndb.de>
      Suggested-by: Matthew Wilcox <willy@infradead.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • wireguard: use synchronize_net rather than synchronize_rcu · 24b70eee
      Jason A. Donenfeld authored
      Many of the synchronization points are sometimes called under the rtnl
      lock, which means we should use synchronize_net rather than
      synchronize_rcu. Under the hood, this expands to using the expedited
      flavor of function in the event that rtnl is held, in order to not stall
      other concurrent changes.
      
      This fixes some very, very long delays when removing multiple peers at
      once, which would cause some operations to take several minutes.
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • wireguard: do not use -O3 · cc5060ca
      Jason A. Donenfeld authored
      Apparently, various versions of gcc have O3-related miscompiles. Looking
      at the difference between -O2 and -O3 for gcc 11 doesn't indicate
      miscompiles, but the difference also doesn't seem so significant for
      performance that it's worth risking.
      
      Link: https://lore.kernel.org/lkml/CAHk-=wjuoGyxDhAF8SsrTkN0-YfCx7E6jUN3ikC_tn2AKWTTsA@mail.gmail.com/
      Link: https://lore.kernel.org/lkml/CAHmME9otB5Wwxp7H8bR_i2uH2esEMvoBMC8uEXBMH9p0q1s6Bw@mail.gmail.com/
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • wireguard: selftests: make sure rp_filter is disabled on vethc · f8873d11
      Jason A. Donenfeld authored
      Some distros may enable strict rp_filter by default, which will prevent
      vethc from receiving the packets with an unrouteable reverse path address.
      
      Reported-by: Hangbin Liu <liuhangbin@gmail.com>
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • wireguard: selftests: remove old conntrack kconfig value · acf2492b
      Jason A. Donenfeld authored
      On recent kernels, this config symbol is no longer used.
      
      Reported-by: Rui Salvaterra <rsalvaterra@gmail.com>
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtchnl: Add missing padding to virtchnl_proto_hdrs · 519d8ab1
      Geert Uytterhoeven authored
      On m68k (Coldfire M547x):
      
            CC      drivers/net/ethernet/intel/i40e/i40e_main.o
          In file included from drivers/net/ethernet/intel/i40e/i40e_prototype.h:9,
      		     from drivers/net/ethernet/intel/i40e/i40e.h:41,
      		     from drivers/net/ethernet/intel/i40e/i40e_main.c:12:
          include/linux/avf/virtchnl.h:153:36: warning: division by zero [-Wdiv-by-zero]
            153 |  { virtchnl_static_assert_##X = (n)/((sizeof(struct X) == (n)) ? 1 : 0) }
      	  |                                    ^
          include/linux/avf/virtchnl.h:844:1: note: in expansion of macro ‘VIRTCHNL_CHECK_STRUCT_LEN’
            844 | VIRTCHNL_CHECK_STRUCT_LEN(2312, virtchnl_proto_hdrs);
      	  | ^~~~~~~~~~~~~~~~~~~~~~~~~
          include/linux/avf/virtchnl.h:844:33: error: enumerator value for ‘virtchnl_static_assert_virtchnl_proto_hdrs’ is not an integer constant
            844 | VIRTCHNL_CHECK_STRUCT_LEN(2312, virtchnl_proto_hdrs);
      	  |                                 ^~~~~~~~~~~~~~~~~~~
      
      On m68k, integers are aligned on addresses that are multiples of two,
      not four, bytes.  Hence the size of a structure containing integers may
      not be divisible by 4.
      
      Fix this by adding explicit padding.
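The effect of 2-byte alignment on structure size can be reproduced on other architectures with a small sketch. It uses `#pragma pack(2)` (supported by GCC and Clang) to emulate the m68k rule, and the struct and field names are illustrative rather than the real virtchnl definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulate a maximum member alignment of two bytes, as on m68k. */
#pragma pack(push, 2)
struct hdrs_unpadded {
    uint8_t tunnel_level;
    int32_t count;        /* lands at offset 2 under 2-byte alignment */
};

struct hdrs_padded {
    uint8_t tunnel_level;
    uint8_t pad[3];       /* explicit padding keeps count at offset 4 */
    int32_t count;
};
#pragma pack(pop)

/* With explicit padding the size no longer depends on how the
 * compiler aligns integers, so a fixed-length check can hold. */
_Static_assert(sizeof(struct hdrs_padded) == 8,
               "padded layout must be 8 bytes on every ABI");
```

Under the emulated rule the unpadded struct is 6 bytes, which is not a multiple of four, while the padded one is 8 bytes regardless of the alignment in force. That is why a compile-time length check like the one in the error output above can only pass everywhere once the padding is spelled out.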
      
      Fixes: 1f7ea1cd ("ice: Enable FDIR Configure for AVF")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: Allow all LLDP packets from PF to Tx · f9f83202
      Dave Ertman authored
      Currently in the ice driver, the check whether to
      allow a LLDP packet to egress the interface from the
      PF_VSI is being based on the SKB's priority field.
      It checks to see if the packet's priority is equal to
      TC_PRIO_CONTROL.  Injected LLDP packets do not always
      meet this condition.
      
      SCAPY defaults to a sk_buff->protocol value of ETH_P_ALL
      (0x0003) and does not set the priority field.  There will
      be other injection methods (even ones used by end users)
      that will not correctly configure the socket so that
      SKB fields are correctly populated.
      
      The Ethernet header has to have the correct value for
      the protocol, though.
      
      Add a check to also allow packets whose ethhdr->h_proto
      matches ETH_P_LLDP (0x88CC).
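The combined check can be sketched in user space as follows. This is not the ice driver code: the function names are hypothetical, the frame is a raw byte buffer rather than an sk_buff, and the constants are local copies of the values named above (0x88CC for LLDP, and 7 for TC_PRIO_CONTROL as defined in linux/pkt_sched.h).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LLDP_ETHERTYPE    0x88CC /* ETH_P_LLDP */
#define PRIO_CONTROL      7      /* TC_PRIO_CONTROL */
#define ETH_HEADER_LEN    14     /* 6 dst + 6 src + 2 EtherType */

/* Read the EtherType from bytes 12..13 of the frame (network order). */
static bool is_lldp_frame(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < ETH_HEADER_LEN)
        return false;
    return (((uint16_t)frame[12] << 8) | frame[13]) == LLDP_ETHERTYPE;
}

/* Allow the packet if either the priority marks it as control traffic
 * or the Ethernet header itself says it is LLDP. */
static bool allow_control_tx(uint32_t priority, const uint8_t *frame,
                             size_t len)
{
    return priority == PRIO_CONTROL || is_lldp_frame(frame, len);
}
```

Checking the header as well as the priority means an injected LLDP frame passes even when the sender never configured the socket priority.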
      
      Fixes: 0c3a6101 ("ice: Allow egress control packets from PF_VSI")
      Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: report supported and advertised autoneg using PHY capabilities · 5cd349c3
      Paul Greenwalt authored
      Ethtool incorrectly reported supported and advertised auto-negotiation
      settings for a backplane PHY image which did not support auto-negotiation.
      This can occur when using media or PHY type for reporting ethtool
      supported and advertised auto-negotiation settings.
      
      Remove setting supported and advertised auto-negotiation settings based
      on PHY type in ice_phy_type_to_ethtool(), and MAC type in
      ice_get_link_ksettings().
      
      Ethtool supported and advertised auto-negotiation settings should be
      based on the PHY image using the AQ command get PHY capabilities with
      media. Add setting supported and advertised auto-negotiation settings
      based on get PHY capabilities with media in ice_get_link_ksettings().
      
      Fixes: 48cb27f2 ("ice: Implement handlers for ethtool PHY/link operations")
      Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: handle the VF VSI rebuild failure · c7ee6ce1
      Haiyue Wang authored
      VSI rebuild can fail for LAN queue config, in which case the VF's VSI
      will be NULL. The VF reset should then be stopped, with the VF entering
      the disabled state.
      
      Fixes: 12bb018c ("ice: Refactor VF reset")
      Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared · 8679f07a
      Brett Creeley authored
      Some AVF drivers expect the VF_MBX_ATQLEN register to be cleared for any
      type of VFR/VFLR. Fix this by clearing the VF_MBX_ATQLEN register at the
      same time as VF_MBX_ARQLEN.
      
      Fixes: 82ba0128 ("ice: clear VF ARQLEN register on reset")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ice: Fix allowing VF to request more/less queues via virtchnl · f0457690
      Brett Creeley authored
      Commit 12bb018c ("ice: Refactor VF reset") caused a regression
      that removes the ability for a VF to request a different amount of
      queues via VIRTCHNL_OP_REQUEST_QUEUES. This prevents VF drivers from
      either increasing or decreasing the number of queue pairs they are
      allocated. Fix this by using the variable vf->num_req_qs when
      determining the vf->num_vf_qs during VF VSI creation.
      
      Fixes: 12bb018c ("ice: Refactor VF reset")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  3. 03 Jun, 2021 14 commits
    • Merge tag 'for-net-2021-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 579028de
      David S. Miller authored
      bluetooth pull request for net:
      
       - Fixes UAF and CVE-2021-3564
       - Fix VIRTIO_ID_BT to use an unassigned ID
       - Fix firmware loading on some Intel Controllers
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: fix for skb_over_panic inside big mode · 1a802423
      Xuan Zhuo authored
      In virtio-net's large packet mode, there is a hole in the space behind
      buf.
      
          hdr_padded_len - hdr_len
      
      We must take this into account when calculating tailroom.
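A minimal sketch of the calculation, assuming a buffer laid out as headroom, header, hole, payload, tailroom. The struct and function names are hypothetical and this is not the driver's page_to_skb() code; it only shows how ignoring the hole overstates the free space at the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one receive buffer:
 * [headroom][header: hdr_len][hole: hdr_padded_len - hdr_len][payload][tailroom] */
struct rx_buf_layout {
    size_t truesize;        /* total size of the buffer */
    size_t headroom;        /* bytes reserved before the header */
    size_t hdr_len;         /* bytes of header actually used */
    size_t hdr_padded_len;  /* space reserved for the header */
    size_t payload_len;     /* packet data after the header area */
};

/* Tailroom computed as if the payload followed the header directly. */
static size_t tailroom_ignoring_hole(const struct rx_buf_layout *b)
{
    assert(b->headroom + b->hdr_len + b->payload_len <= b->truesize);
    return b->truesize - b->headroom - b->hdr_len - b->payload_len;
}

/* Tailroom that accounts for the full padded header area. */
static size_t tailroom_with_hole(const struct rx_buf_layout *b)
{
    assert(b->hdr_padded_len >= b->hdr_len);
    assert(b->headroom + b->hdr_padded_len + b->payload_len <= b->truesize);
    return b->truesize - b->headroom - b->hdr_padded_len - b->payload_len;
}
```

The two results differ by exactly hdr_padded_len - hdr_len. Code that trusts the larger value may write that many bytes past the end of the buffer.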
      
      [   44.544385] skb_put.cold (net/core/skbuff.c:5254 (discriminator 1) net/core/skbuff.c:5252 (discriminator 1))
      [   44.544864] page_to_skb (drivers/net/virtio_net.c:485)
      [   44.545361] receive_buf (drivers/net/virtio_net.c:849 drivers/net/virtio_net.c:1131)
      [   44.545870] ? netif_receive_skb_list_internal (net/core/dev.c:5714)
      [   44.546628] ? dev_gro_receive (net/core/dev.c:6103)
      [   44.547135] ? napi_complete_done (./include/linux/list.h:35 net/core/dev.c:5867 net/core/dev.c:5862 net/core/dev.c:6565)
      [   44.547672] virtnet_poll (drivers/net/virtio_net.c:1427 drivers/net/virtio_net.c:1525)
      [   44.548251] __napi_poll (net/core/dev.c:6985)
      [   44.548744] net_rx_action (net/core/dev.c:7054 net/core/dev.c:7139)
      [   44.549264] __do_softirq (./arch/x86/include/asm/jump_label.h:19 ./include/linux/jump_label.h:200 ./include/trace/events/irq.h:142 kernel/softirq.c:560)
      [   44.549762] irq_exit_rcu (kernel/softirq.c:433 kernel/softirq.c:637 kernel/softirq.c:649)
      [   44.551384] common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 13))
      [   44.551991] ? asm_common_interrupt (./arch/x86/include/asm/idtentry.h:638)
      [   44.552654] asm_common_interrupt (./arch/x86/include/asm/idtentry.h:638)
      
      Fixes: fb32856b ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom")
      Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Reported-by: Corentin Noël <corentin.noel@collabora.com>
      Tested-by: Corentin Noël <corentin.noel@collabora.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'ieee802154-for-davem-2021-06-03' of... · e31d57ca
      David S. Miller authored
      Merge tag 'ieee802154-for-davem-2021-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
      
      Stefan Schmidt says:
      
      ====================
      An update from ieee802154 for your *net* tree.
      
      This time we have fixes for the ieee802154 netlink code, as well as a driver
      fix. Zhen Lei, Wei Yongjun and Yang Li each had a patch to clean up some return
      code handling, ensuring we actually get a real error code when things fail.
      
      Dan Robertson fixed a potential null dereference in our netlink handling.
      
      Andy Shevchenko removed of_match_ptr() usage in the mrf24j40 driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions · 821bbf79
      Coco Li authored
      Reported by syzbot:
      HEAD commit:    90c911ad Merge tag 'fixes' of git://git.kernel.org/pub/scm..
      git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
      dashboard link: https://syzkaller.appspot.com/bug?extid=123aa35098fd3c000eb7
      compiler:       Debian clang version 11.0.1-2
      
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in fib6_nh_get_excptn_bucket net/ipv6/route.c:1604 [inline]
      BUG: KASAN: slab-out-of-bounds in fib6_nh_flush_exceptions+0xbd/0x360 net/ipv6/route.c:1732
      Read of size 8 at addr ffff8880145c78f8 by task syz-executor.4/17760
      
      CPU: 0 PID: 17760 Comm: syz-executor.4 Not tainted 5.12.0-rc8-syzkaller #0
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x202/0x31e lib/dump_stack.c:120
       print_address_description+0x5f/0x3b0 mm/kasan/report.c:232
       __kasan_report mm/kasan/report.c:399 [inline]
       kasan_report+0x15c/0x200 mm/kasan/report.c:416
       fib6_nh_get_excptn_bucket net/ipv6/route.c:1604 [inline]
       fib6_nh_flush_exceptions+0xbd/0x360 net/ipv6/route.c:1732
       fib6_nh_release+0x9a/0x430 net/ipv6/route.c:3536
       fib6_info_destroy_rcu+0xcb/0x1c0 net/ipv6/ip6_fib.c:174
       rcu_do_batch kernel/rcu/tree.c:2559 [inline]
       rcu_core+0x8f6/0x1450 kernel/rcu/tree.c:2794
       __do_softirq+0x372/0x7a6 kernel/softirq.c:345
       invoke_softirq kernel/softirq.c:221 [inline]
       __irq_exit_rcu+0x22c/0x260 kernel/softirq.c:422
       irq_exit_rcu+0x5/0x20 kernel/softirq.c:434
       sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1100
       </IRQ>
       asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
      RIP: 0010:lock_acquire+0x1f6/0x720 kernel/locking/lockdep.c:5515
      Code: f6 84 24 a1 00 00 00 02 0f 85 8d 02 00 00 f7 c3 00 02 00 00 49 bd 00 00 00 00 00 fc ff df 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 3d 00 00 00 00 00 4b c7 44 3d 09 00 00 00 00 43 c7 44 3d
      RSP: 0018:ffffc90009e06560 EFLAGS: 00000206
      RAX: 1ffff920013c0cc0 RBX: 0000000000000246 RCX: dffffc0000000000
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: ffffc90009e066e0 R08: dffffc0000000000 R09: fffffbfff1f992b1
      R10: fffffbfff1f992b1 R11: 0000000000000000 R12: 0000000000000000
      R13: dffffc0000000000 R14: 0000000000000000 R15: 1ffff920013c0cb4
       rcu_lock_acquire+0x2a/0x30 include/linux/rcupdate.h:267
       rcu_read_lock include/linux/rcupdate.h:656 [inline]
       ext4_get_group_info+0xea/0x340 fs/ext4/ext4.h:3231
       ext4_mb_prefetch+0x123/0x5d0 fs/ext4/mballoc.c:2212
       ext4_mb_regular_allocator+0x8a5/0x28f0 fs/ext4/mballoc.c:2379
       ext4_mb_new_blocks+0xc6e/0x24f0 fs/ext4/mballoc.c:4982
       ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4238
       ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638
       ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848
       ext4_bread+0x2a/0x1c0 fs/ext4/inode.c:900
       ext4_append+0x1a4/0x360 fs/ext4/namei.c:67
       ext4_init_new_dir+0x337/0xa10 fs/ext4/namei.c:2768
       ext4_mkdir+0x4b8/0xc00 fs/ext4/namei.c:2814
       vfs_mkdir+0x45b/0x640 fs/namei.c:3819
       ovl_do_mkdir fs/overlayfs/overlayfs.h:161 [inline]
       ovl_mkdir_real+0x53/0x1a0 fs/overlayfs/dir.c:146
       ovl_create_real+0x280/0x490 fs/overlayfs/dir.c:193
       ovl_workdir_create+0x425/0x600 fs/overlayfs/super.c:788
       ovl_make_workdir+0xed/0x1140 fs/overlayfs/super.c:1355
       ovl_get_workdir fs/overlayfs/super.c:1492 [inline]
       ovl_fill_super+0x39ee/0x5370 fs/overlayfs/super.c:2035
       mount_nodev+0x52/0xe0 fs/super.c:1413
       legacy_get_tree+0xea/0x180 fs/fs_context.c:592
       vfs_get_tree+0x86/0x270 fs/super.c:1497
       do_new_mount fs/namespace.c:2903 [inline]
       path_mount+0x196f/0x2be0 fs/namespace.c:3233
       do_mount fs/namespace.c:3246 [inline]
       __do_sys_mount fs/namespace.c:3454 [inline]
       __se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3431
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x4665f9
      Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f68f2b87188 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
      RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 00000000004665f9
      RDX: 00000000200000c0 RSI: 0000000020000000 RDI: 000000000040000a
      RBP: 00000000004bfbb9 R08: 0000000020000100 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60
      R13: 00007ffe19002dff R14: 00007f68f2b87300 R15: 0000000000022000
      
      Allocated by task 17768:
       kasan_save_stack mm/kasan/common.c:38 [inline]
       kasan_set_track mm/kasan/common.c:46 [inline]
       set_alloc_info mm/kasan/common.c:427 [inline]
       ____kasan_kmalloc+0xc2/0xf0 mm/kasan/common.c:506
       kasan_kmalloc include/linux/kasan.h:233 [inline]
       __kmalloc+0xb4/0x380 mm/slub.c:4055
       kmalloc include/linux/slab.h:559 [inline]
       kzalloc include/linux/slab.h:684 [inline]
       fib6_info_alloc+0x2c/0xd0 net/ipv6/ip6_fib.c:154
       ip6_route_info_create+0x55d/0x1a10 net/ipv6/route.c:3638
       ip6_route_add+0x22/0x120 net/ipv6/route.c:3728
       inet6_rtm_newroute+0x2cd/0x2260 net/ipv6/route.c:5352
       rtnetlink_rcv_msg+0xb34/0xe70 net/core/rtnetlink.c:5553
       netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2502
       netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
       netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1338
       netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1927
       sock_sendmsg_nosec net/socket.c:654 [inline]
       sock_sendmsg net/socket.c:674 [inline]
       ____sys_sendmsg+0x5a2/0x900 net/socket.c:2350
       ___sys_sendmsg net/socket.c:2404 [inline]
       __sys_sendmsg+0x319/0x400 net/socket.c:2433
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Last potentially related work creation:
       kasan_save_stack+0x27/0x50 mm/kasan/common.c:38
       kasan_record_aux_stack+0xee/0x120 mm/kasan/generic.c:345
       __call_rcu kernel/rcu/tree.c:3039 [inline]
       call_rcu+0x1b1/0xa30 kernel/rcu/tree.c:3114
       fib6_info_release include/net/ip6_fib.h:337 [inline]
       ip6_route_info_create+0x10c4/0x1a10 net/ipv6/route.c:3718
       ip6_route_add+0x22/0x120 net/ipv6/route.c:3728
       inet6_rtm_newroute+0x2cd/0x2260 net/ipv6/route.c:5352
       rtnetlink_rcv_msg+0xb34/0xe70 net/core/rtnetlink.c:5553
       netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2502
       netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
       netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1338
       netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1927
       sock_sendmsg_nosec net/socket.c:654 [inline]
       sock_sendmsg net/socket.c:674 [inline]
       ____sys_sendmsg+0x5a2/0x900 net/socket.c:2350
       ___sys_sendmsg net/socket.c:2404 [inline]
       __sys_sendmsg+0x319/0x400 net/socket.c:2433
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Second to last potentially related work creation:
       kasan_save_stack+0x27/0x50 mm/kasan/common.c:38
       kasan_record_aux_stack+0xee/0x120 mm/kasan/generic.c:345
       insert_work+0x54/0x400 kernel/workqueue.c:1331
       __queue_work+0x981/0xcc0 kernel/workqueue.c:1497
       queue_work_on+0x111/0x200 kernel/workqueue.c:1524
       queue_work include/linux/workqueue.h:507 [inline]
       call_usermodehelper_exec+0x283/0x470 kernel/umh.c:433
       kobject_uevent_env+0x1349/0x1730 lib/kobject_uevent.c:617
       kvm_uevent_notify_change+0x309/0x3b0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4809
       kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:877 [inline]
       kvm_put_kvm+0x9c/0xd10 arch/x86/kvm/../../../virt/kvm/kvm_main.c:920
       kvm_vcpu_release+0x53/0x60 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3120
       __fput+0x352/0x7b0 fs/file_table.c:280
       task_work_run+0x146/0x1c0 kernel/task_work.c:140
       tracehook_notify_resume include/linux/tracehook.h:189 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:174 [inline]
       exit_to_user_mode_prepare+0x10b/0x1e0 kernel/entry/common.c:208
       __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
       syscall_exit_to_user_mode+0x26/0x70 kernel/entry/common.c:301
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      The buggy address belongs to the object at ffff8880145c7800
       which belongs to the cache kmalloc-192 of size 192
      The buggy address is located 56 bytes to the right of
       192-byte region [ffff8880145c7800, ffff8880145c78c0)
      The buggy address belongs to the page:
      page:ffffea00005171c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x145c7
      flags: 0xfff00000000200(slab)
      raw: 00fff00000000200 ffffea00006474c0 0000000200000002 ffff888010c41a00
      raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8880145c7780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
       ffff8880145c7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      >ffff8880145c7880: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
                                                                      ^
       ffff8880145c7900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8880145c7980: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      ==================================================================
      
      In ip6_route_info_create(), when the nh pointer is not NULL, the
      fib6_nh in fib6_info has not been allocated.
      When this error path frees the fib6_info with fib6_info_release(),
      that function calls fib6_info_destroy_rcu(), which in turn calls
      fib6_nh_release(f6i->fib6_nh).
      Because f6i->fib6_nh was never allocated, it holds no refcount yet,
      and accessing it causes the memory issue reported above.
      The fix is therefore to free the object directly instead of going
      through fib6_info_release().
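
      The hazard can be modelled in userspace C. This is only a sketch:
      toy_nh, toy_info and the toy_* helpers are hypothetical stand-ins,
      not the kernel types or the actual patch. It shows an object whose
      trailing member is allocated only in one case, and a cleanup path
      that avoids the full release routine when that member is absent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for fib6_nh and fib6_info; not the kernel types. */
struct toy_nh {
	int refcnt;
};

struct toy_info {
	bool has_trailing_nh;   /* true only if nh[] was allocated */
	struct toy_nh nh[];     /* flexible array member, may be absent */
};

/* Allocate the object; the trailing nh[] slot exists only when
 * with_trailing_nh is true. */
static struct toy_info *toy_info_alloc(bool with_trailing_nh)
{
	size_t sz = sizeof(struct toy_info);
	struct toy_info *info;

	if (with_trailing_nh)
		sz += sizeof(struct toy_nh);
	info = calloc(1, sz);
	if (!info)
		return NULL;
	info->has_trailing_nh = with_trailing_nh;
	if (with_trailing_nh)
		info->nh[0].refcnt = 1;
	return info;
}

/* Full release path: only valid for objects that own a trailing nh[]. */
static void toy_info_release(struct toy_info *info)
{
	/* Touching nh[] without the trailing slot would read out of bounds. */
	assert(info->has_trailing_nh);
	info->nh[0].refcnt--;
	free(info);
}

/* Error-path cleanup: pick the path that matches what was allocated.
 * Returns true if the full release path was used. */
static bool toy_info_cleanup(struct toy_info *info)
{
	if (info->has_trailing_nh) {
		toy_info_release(info);
		return true;
	}
	free(info);   /* nothing beyond the base object to release */
	return false;
}
```

      Calling toy_info_cleanup() on an object allocated without the
      trailing slot frees it directly and never reads past the base
      allocation.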
      
      Fixes: f88d8ea6 ("ipv6: Plumb support for nexthop object in a fib6_info")
      Fixes: 706ec919 ("ipv6: Fix nexthop refcnt leak when creating ipv6 route info")
      Signed-off-by: Coco Li <lixiaoyan@google.com>
      Cc: David Ahern <dsahern@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      821bbf79
    • David S. Miller's avatar
      Merge tag 'wireless-drivers-2021-06-03' of... · 5e7a2c64
      David S. Miller authored
      Merge tag 'wireless-drivers-2021-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for v5.13
      
      We have only mt76 fixes this time, most important being the fix for
      A-MSDU injection attacks.
      
      mt76
      
      * mitigate A-MSDU injection attacks (CVE-2020-24588)
      
      * fix possible array out of bound access in mt7921_mcu_tx_rate_report
      
      * various aggregation and HE setting fixes
      
      * suspend/resume fix for pci devices
      
      * mt7615: fix crash when runtime-pm is not supported
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5e7a2c64
    • Zheng Yongjun's avatar
      fib: Return the correct errno code · 59607863
      Zheng Yongjun authored
      When kmalloc or kmemdup fails, return ENOMEM rather than ENOBUFS.
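
      A minimal userspace sketch of the convention, assuming a
      hypothetical helper named dup_buffer() (not a kernel function):
      an allocation failure is reported as -ENOMEM, while ENOBUFS is
      left for cases where a buffer or queue has no space.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate len bytes from src into a new buffer.
 * Returns 0 on success or a negative errno value on failure. */
static int dup_buffer(const void *src, size_t len, void **out)
{
	void *copy;

	assert(out != NULL);
	if (!src)
		return -EINVAL;

	copy = malloc(len ? len : 1);
	if (!copy)
		return -ENOMEM;   /* allocation failure means out of memory */

	memcpy(copy, src, len);
	*out = copy;
	return 0;
}
```

      Callers can then tell memory exhaustion apart from other failures
      by checking for -ENOMEM.
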
      Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      59607863
    • Zheng Yongjun's avatar
      net: Return the correct errno code · 49251cd0
      Zheng Yongjun authored
      When kmalloc or kmemdup fails, return ENOMEM rather than ENOBUFS.
      Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      49251cd0
    • Zheng Yongjun's avatar
      net/x25: Return the correct errno code · d7736958
      Zheng Yongjun authored
      When kmalloc or kmemdup fails, return ENOMEM rather than ENOBUFS.
      Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d7736958
    • Rahul Lakkireddy's avatar
      cxgb4: fix regression with HASH tc prio value update · a27fb314
      Rahul Lakkireddy authored
      Commit db43b30c ("cxgb4: add ethtool n-tuple filter deletion")
      moved the search for the next highest priority HASH filter rule to
      cxgb4_flow_rule_destroy(), which searches the rhashtable before
      the rule is removed from it and hence always finds at least 1 entry.
      Fix by removing the rule from the rhashtable before calling
      cxgb4_flow_rule_destroy(), which avoids fetching stale info.
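
      The ordering problem can be illustrated with a toy model. This is
      a sketch under stated assumptions: toy_table and the toy_*
      functions are hypothetical, a plain array stands in for the
      rhashtable, and "next highest priority" is modelled as the
      maximum of the remaining entries.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_RULES 8

/* Hypothetical table of active rule priorities; 0 marks an empty slot.
 * It stands in for the rhashtable and is not the driver's structure. */
struct toy_table {
	unsigned int prio[TOY_MAX_RULES];
	unsigned int cached_prio;   /* highest priority among active rules */
};

static void toy_table_remove(struct toy_table *t, size_t slot)
{
	assert(slot < TOY_MAX_RULES);
	t->prio[slot] = 0;
}

/* Recompute the cached priority from whatever is still in the table. */
static void toy_rule_destroy(struct toy_table *t)
{
	unsigned int best = 0;
	size_t i;

	for (i = 0; i < TOY_MAX_RULES; i++)
		if (t->prio[i] > best)
			best = t->prio[i];
	t->cached_prio = best;
}

/* Buggy order: the scan runs first, so it still sees the doomed rule. */
static unsigned int toy_delete_buggy(struct toy_table *t, size_t slot)
{
	toy_rule_destroy(t);
	toy_table_remove(t, slot);
	return t->cached_prio;
}

/* Fixed order: remove first, then scan only the remaining rules. */
static unsigned int toy_delete_fixed(struct toy_table *t, size_t slot)
{
	toy_table_remove(t, slot);
	toy_rule_destroy(t);
	return t->cached_prio;
}
```

      With a single rule in the table, the buggy order leaves the
      cached priority pointing at the deleted rule, while the fixed
      order resets it.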
      
      Fixes: db43b30c ("cxgb4: add ethtool n-tuple filter deletion")
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a27fb314
    • David S. Miller's avatar
      Merge branch 'caif-fixes' · e0310182
      David S. Miller authored
      Pavel Skripkin says:
      
      ====================
      This patch series fixes two memory leaks in the caif
      interface.

      Syzbot reported a memory leak in cfserl_create().
      The problem was in the cfcnfg_add_phy_layer() function.
      This function accepts struct cflayer *link_support and
      assigns it to the corresponding structures, but it can fail
      in some cases.

      These cases must be handled to avoid leaking the allocated
      struct cflayer *link_support pointer: if an error occurs
      before link_support is assigned anywhere, the pointer
      must be freed.
      
      Fail log:
      
      [   49.051872][ T7010] caif:cfcnfg_add_phy_layer(): Too many CAIF Link Layers (max 6)
      [   49.110236][ T7042] caif:cfcnfg_add_phy_layer(): Too many CAIF Link Layers (max 6)
      [   49.134936][ T7045] caif:cfcnfg_add_phy_layer(): Too many CAIF Link Layers (max 6)
      [   49.163083][ T7043] caif:cfcnfg_add_phy_layer(): Too many CAIF Link Layers (max 6)
      [   55.248950][ T6994] kmemleak: 4 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
      
      int cfcnfg_add_phy_layer(..., struct cflayer *link_support, ...)
      {
      ...
      	/* CAIF protocol allow maximum 6 link-layers */
      	for (i = 0; i < 7; i++) {
      		phyid = (dev->ifindex + i) & 0x7;
      		if (phyid == 0)
      			continue;
      		if (cfcnfg_get_phyinfo_rcu(cnfg, phyid) == NULL)
      			goto got_phyid;
      	}
      	pr_warn("Too many CAIF Link Layers (max 6)\n");
      	goto out;
      ...
      	if (link_support != NULL) {
      		link_support->id = phyid;
      		layer_set_dn(frml, link_support);
      		layer_set_up(link_support, frml);
      		layer_set_dn(link_support, phy_layer);
      		layer_set_up(phy_layer, link_support);
      	}
      ...
      }
      
      As you can see, if cfcnfg_add_phy_layer() fails before the
      layer_set_*() calls, link_support is leaked.

      So, in this series, I made cfcnfg_add_phy_layer()
      return an int and added error handling code to prevent
      leaking the link_support pointer in the caif_device_notify()
      and cfusbl_device_notify() functions.
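
      The caller-side pattern can be sketched in userspace C. The names
      toy_layer, toy_add_phy_layer() and toy_device_notify() are
      hypothetical, and the -ENODEV return value is an assumption made
      for illustration; this is not the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cflayer. */
struct toy_layer {
	int id;
};

/* Hypothetical registry with room for six link layers. */
#define TOY_MAX_LINKS 6
static struct toy_layer *toy_links[TOY_MAX_LINKS];

/* Takes ownership of link_support only on success. Returns -ENODEV
 * (an assumed error code) when every slot is already taken. */
static int toy_add_phy_layer(struct toy_layer *link_support)
{
	int i;

	assert(link_support != NULL);
	for (i = 0; i < TOY_MAX_LINKS; i++) {
		if (!toy_links[i]) {
			link_support->id = i + 1;
			toy_links[i] = link_support;
			return 0;
		}
	}
	return -ENODEV;
}

/* Caller-side pattern: free the layer if enrollment did not take it. */
static int toy_device_notify(void)
{
	struct toy_layer *link_support = calloc(1, sizeof(*link_support));
	int res;

	if (!link_support)
		return -ENOMEM;

	res = toy_add_phy_layer(link_support);
	if (res)
		free(link_support);   /* nobody else holds the pointer */
	return res;
}
```

      Once the six slots are used, further calls fail and the freshly
      allocated layer is freed instead of leaked.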
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e0310182
    • Pavel Skripkin's avatar
      net: caif: fix memory leak in cfusbl_device_notify · 7f5d8666
      Pavel Skripkin authored
      If caif_enroll_dev() fails, the allocated
      link_support is not assigned to the corresponding
      structure. So simply free the allocated pointer on
      error.
      
      Fixes: 7ad65bf6 ("caif: Add support for CAIF over CDC NCM USB interface")
      Cc: stable@vger.kernel.org
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7f5d8666
    • Pavel Skripkin's avatar
      net: caif: fix memory leak in caif_device_notify · b53558a9
      Pavel Skripkin authored
      If caif_enroll_dev() fails, the allocated
      link_support is not assigned to the corresponding
      structure. So simply free the allocated pointer on
      error.
      
      Fixes: 7c18d220 ("caif: Restructure how link caif link layer enroll")
      Cc: stable@vger.kernel.org
      Reported-and-tested-by: syzbot+7ec324747ce876a29db6@syzkaller.appspotmail.com
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b53558a9
    • Pavel Skripkin's avatar
      net: caif: add proper error handling · a2805dca
      Pavel Skripkin authored
      caif_enroll_dev() can fail in some cases. Ignoring
      these cases can lead to a memory leak, because the
      link_support pointer is never assigned anywhere.
      
      Fixes: 7c18d220 ("caif: Restructure how link caif link layer enroll")
      Cc: stable@vger.kernel.org
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a2805dca
    • Pavel Skripkin's avatar
      net: caif: added cfserl_release function · bce130e7
      Pavel Skripkin authored
      Added cfserl_release() function.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bce130e7