1. 25 Feb, 2022 13 commits
    • ibmvnic: define flush_reset_queue helper · 83da53f7
      Sukadev Bhattiprolu authored
      Define and use a helper to flush the reset queue.
      
      Fixes: 2770a798 ("ibmvnic: Introduce hard reset recovery")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: initialize rc before completing wait · 765559b1
      Sukadev Bhattiprolu authored
      We should initialize ->init_done_rc before calling complete(). Otherwise
      the waiting thread may see ->init_done_rc as 0 before we have updated it
      and may assume that the CRQ was successful.
      
      Fixes: 6b278c0c ("ibmvnic delay complete()")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
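The ordering described in the commit above can be illustrated with a minimal userspace sketch. It is not the driver's code: `struct init_state`, `finish_init()` and `wait_for_init()` are hypothetical names, and a C11 atomic flag stands in for the kernel completion. The point is only that the result code is written before the "done" signal is published, so a waiter that observes the signal also observes the result.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a result code guarded by a completion. */
struct init_state {
    int         init_done_rc;  /* result; must be written before 'done' */
    atomic_bool done;          /* plays the role of the completion */
};

/* Responder side: store the result first, then publish completion. */
static void finish_init(struct init_state *s, int rc)
{
    s->init_done_rc = rc;
    /* Release ordering makes the rc store visible to whoever sees done. */
    atomic_store_explicit(&s->done, true, memory_order_release);
}

/* Waiter side: wait for completion, then read the result. */
static int wait_for_init(struct init_state *s)
{
    while (!atomic_load_explicit(&s->done, memory_order_acquire))
        ;  /* busy-wait; a real waiter would sleep instead */
    return s->init_done_rc;
}
```

If the two statements in `finish_init()` were swapped, a waiter could return the stale initial value of `init_done_rc`, which is the failure mode the commit message describes.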
    • ibmvnic: free reset-work-item when flushing · 8d0657f3
      Sukadev Bhattiprolu authored
      Fix a tiny memory leak when flushing the reset work queue.
      
      Fixes: 2770a798 ("ibmvnic: Introduce hard reset recovery")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 31372fe9
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      1) Fix PMTU for IPv6 if the reported MTU minus the ESP overhead is
         smaller than 1280. From Jiri Bohac.
      
      2) Fix xfrm interface ID and inter address family tunneling when
         migrating xfrm states. From Yan Yan.
      
      3) Add missing xfrm interface ID initialization on xfrmi_changelink.
         From Antony Antony.
      
      4) Enforce validity of xfrm offload input flags so that userspace can't
         send undefined flags to the offload driver.
         From Leon Romanovsky.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dcb: flush lingering app table entries for unregistered devices · 91b0383f
      Vladimir Oltean authored
      If I'm not mistaken (and I don't think I am), the way in which the
      dcbnl_ops work is that drivers call dcb_ieee_setapp() and this populates
      the application table with dynamically allocated struct dcb_app_type
      entries that are kept in the module-global dcb_app_list.
      
      However, nobody keeps exact track of these entries, and although
      dcb_ieee_delapp() is supposed to remove them, nobody does so when the
      interface goes away (example: driver unbinds from device). So the
      dcb_app_list will contain lingering entries with an ifindex that no
      longer matches any device in dcb_app_lookup().
      
      Reclaim the lost memory by listening for the NETDEV_UNREGISTER event and
      flushing the app table entries of interfaces that are now gone.
      
      In fact something like this used to be done as part of the initial
      commit (blamed below), but it was done in dcbnl_exit() -> dcb_flushapp(),
      essentially at module_exit time. That became dead code after commit
      7a6b6f51 ("DCB: fix kconfig option") which essentially merged
      "tristate config DCB" and "bool config DCBNL" into a single "bool config
      DCB", so net/dcb/dcbnl.c could not be built as a module anymore.
      
      Commit 36b9ad80 ("net/dcb: make dcbnl.c explicitly non-modular")
      recognized this and deleted dcbnl_exit() and dcb_flushapp() altogether,
      leaving us with the version we have today.
      
      Since flushing application table entries can and should be done as soon
      as the netdevice disappears, fundamentally the commit that is to blame
      is the one that introduced the design of this API.
      
      Fixes: 9ab933ab ("dcbnl: add appliction tlv handlers")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
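The idea of flushing lingering entries for a vanished interface can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel implementation: `struct app_entry`, `app_add()`, `app_flush_ifindex()` and `app_count()` are hypothetical names, and a plain singly linked list stands in for the module-global app list.

```c
#include <stdlib.h>

/* Hypothetical app table entry keyed by interface index. */
struct app_entry {
    int ifindex;
    int priority;
    struct app_entry *next;
};

static struct app_entry *app_list;  /* stand-in for a global app list */

/* Add an entry at the head of the list; returns 0 on success. */
static int app_add(int ifindex, int priority)
{
    struct app_entry *e = malloc(sizeof(*e));

    if (!e)
        return -1;
    e->ifindex = ifindex;
    e->priority = priority;
    e->next = app_list;
    app_list = e;
    return 0;
}

/* Unlink and free every entry that belongs to ifindex; returns the count. */
static int app_flush_ifindex(int ifindex)
{
    struct app_entry **pp = &app_list;
    int freed = 0;

    while (*pp) {
        struct app_entry *e = *pp;

        if (e->ifindex == ifindex) {
            *pp = e->next;   /* unlink before freeing */
            free(e);
            freed++;
        } else {
            pp = &e->next;
        }
    }
    return freed;
}

/* Count the entries that remain in the list. */
static int app_count(void)
{
    int n = 0;

    for (struct app_entry *e = app_list; e; e = e->next)
        n++;
    return n;
}
```

In this model, the flush would be called from whatever handler observes that the interface has been unregistered, so no entry outlives the device it refers to.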
    • net/smc: fix connection leak · 9f1c50cf
      D. Wythe authored
      There's a potential leak issue under the following execution sequence:
      
      smc_release  				smc_connect_work
      if (sk->sk_state == SMC_INIT)
      					send_clc_confirim
      	tcp_abort();
      					...
      					sk.sk_state = SMC_ACTIVE
      smc_close_active
      switch(sk->sk_state) {
      ...
      case SMC_ACTIVE:
      	smc_close_final()
      	// then wait peer closed
      
      Unfortunately, tcp_abort() may discard CLC CONFIRM messages that are
      still in the tcp send buffer, in which case our connection token cannot
      be delivered to the server side, which means that we cannot get a
      passive close message at all. Therefore, it is impossible for the
      connection to be disconnected at all.
      
      This patch takes a very simple approach to avoid the issue: once the
      state has changed to SMC_ACTIVE after tcp_abort(), we actively abort
      the smc connection. Since the state was SMC_INIT before tcp_abort(),
      abandoning the complete disconnection process should not cause much
      of a problem.
      
      In fact, this problem may exist as long as the CLC CONFIRM message is
      not received by the server. Whether a timer should be added after
      smc_close_final() needs to be discussed in the future. Even so, this
      patch releases the connection faster in the above case, so it should
      still be valuable.
      
      Fixes: 39f41f36 ("net/smc: common release code for non-accepted sockets")
      Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
      Acked-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: only enable DMA interrupts when ready · 087a7b94
      Vincent Whitchurch authored
      In this driver's ->ndo_open() callback, it enables DMA interrupts,
      starts the DMA channels, then requests interrupts with request_irq(),
      and then finally enables napi.
      
      If RX DMA interrupts are received before napi is enabled, no processing
      is done because napi_schedule_prep() will return false.  If the network
      has a lot of broadcast/multicast traffic, then the RX ring could fill up
      completely before napi is enabled.  When this happens, no further RX
      interrupts will be delivered, and the driver will fail to receive any
      packets.
      
      Fix this by only enabling DMA interrupts after all other initialization
      is complete.
      
      Fixes: 523f11b5 ("net: stmmac: move hardware setup for stmmac_open to new function")
      Reported-by: Lars Persson <larper@axis.com>
      Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen/netfront: destroy queues before real_num_tx_queues is zeroed · dcf4ff7a
      Marek Marczykowski-Górecki authored
      xennet_destroy_queues() relies on info->netdev->real_num_tx_queues to
      delete queues. Since d7dac083
      ("net-sysfs: update the queue counts in the unregistration path"),
      unregister_netdev() indirectly sets real_num_tx_queues to 0. Those two
      facts together mean that xennet_destroy_queues() called from
      xennet_remove() cannot do its job, because it's called after
      unregister_netdev(). This results in kfree-ing queues that are still
      linked in napi, which ultimately crashes:
      
          BUG: kernel NULL pointer dereference, address: 0000000000000000
          #PF: supervisor read access in kernel mode
          #PF: error_code(0x0000) - not-present page
          PGD 0 P4D 0
          Oops: 0000 [#1] PREEMPT SMP PTI
          CPU: 1 PID: 52 Comm: xenwatch Tainted: G        W         5.16.10-1.32.fc32.qubes.x86_64+ #226
          RIP: 0010:free_netdev+0xa3/0x1a0
          Code: ff 48 89 df e8 2e e9 00 00 48 8b 43 50 48 8b 08 48 8d b8 a0 fe ff ff 48 8d a9 a0 fe ff ff 49 39 c4 75 26 eb 47 e8 ed c1 66 ff <48> 8b 85 60 01 00 00 48 8d 95 60 01 00 00 48 89 ef 48 2d 60 01 00
          RSP: 0000:ffffc90000bcfd00 EFLAGS: 00010286
          RAX: 0000000000000000 RBX: ffff88800edad000 RCX: 0000000000000000
          RDX: 0000000000000001 RSI: ffffc90000bcfc30 RDI: 00000000ffffffff
          RBP: fffffffffffffea0 R08: 0000000000000000 R09: 0000000000000000
          R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800edad050
          R13: ffff8880065f8f88 R14: 0000000000000000 R15: ffff8880066c6680
          FS:  0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 0000000000000000 CR3: 00000000e998c006 CR4: 00000000003706e0
          Call Trace:
           <TASK>
           xennet_remove+0x13d/0x300 [xen_netfront]
           xenbus_dev_remove+0x6d/0xf0
           __device_release_driver+0x17a/0x240
           device_release_driver+0x24/0x30
           bus_remove_device+0xd8/0x140
           device_del+0x18b/0x410
           ? _raw_spin_unlock+0x16/0x30
           ? klist_iter_exit+0x14/0x20
           ? xenbus_dev_request_and_reply+0x80/0x80
           device_unregister+0x13/0x60
           xenbus_dev_changed+0x18e/0x1f0
           xenwatch_thread+0xc0/0x1a0
           ? do_wait_intr_irq+0xa0/0xa0
           kthread+0x16b/0x190
           ? set_kthread_struct+0x40/0x40
           ret_from_fork+0x22/0x30
           </TASK>
      
      Fix this by calling xennet_destroy_queues() from xennet_uninit(),
      when real_num_tx_queues is still available. This ensures that queues are
      destroyed when real_num_tx_queues is set to 0, regardless of how
      unregister_netdev() was called.
      
      Originally reported at
      https://github.com/QubesOS/qubes-issues/issues/7257
      
      Fixes: d7dac083 ("net-sysfs: update the queue counts in the unregistration path")
      Cc: stable@vger.kernel.org
      Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mptcp-fixes-for-5-17' · a6df953f
      Jakub Kicinski authored
      Mat Martineau says:
      
      ====================
      mptcp: Fixes for 5.17
      
      Patch 1 fixes an issue with the SIOCOUTQ ioctl in MPTCP sockets that
      have performed a fallback to TCP.
      
      Patch 2 is a selftest fix to correctly remove temp files.
      
      Patch 3 fixes a shift-out-of-bounds issue found by syzkaller.
      ====================
      
      Link: https://lore.kernel.org/r/20220225005259.318898-1-mathew.j.martineau@linux.intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mptcp: Correctly set DATA_FIN timeout when number of retransmits is large · 877d11f0
      Mat Martineau authored
      Syzkaller with UBSAN uncovered a scenario where a large number of
      DATA_FIN retransmits caused a shift-out-of-bounds in the DATA_FIN
      timeout calculation:
      
      ================================================================================
      UBSAN: shift-out-of-bounds in net/mptcp/protocol.c:470:29
      shift exponent 32 is too large for 32-bit type 'unsigned int'
      CPU: 1 PID: 13059 Comm: kworker/1:0 Not tainted 5.17.0-rc2-00630-g5fbf21c90c60 #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
      Workqueue: events mptcp_worker
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
       ubsan_epilogue+0xb/0x5a lib/ubsan.c:151
       __ubsan_handle_shift_out_of_bounds.cold+0xb2/0x20e lib/ubsan.c:330
       mptcp_set_datafin_timeout net/mptcp/protocol.c:470 [inline]
       __mptcp_retrans.cold+0x72/0x77 net/mptcp/protocol.c:2445
       mptcp_worker+0x58a/0xa70 net/mptcp/protocol.c:2528
       process_one_work+0x9df/0x16d0 kernel/workqueue.c:2307
       worker_thread+0x95/0xe10 kernel/workqueue.c:2454
       kthread+0x2f4/0x3b0 kernel/kthread.c:377
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
       </TASK>
      ================================================================================
      
      This change limits the maximum timeout by limiting the size of the
      shift, which keeps all intermediate values in-bounds.
      
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/259
      Fixes: 6477dd39 ("mptcp: Retransmit DATA_FIN")
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
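The approach described above, capping the shift so that every intermediate value stays in range, can be sketched as follows. This is a userspace illustration, not the code in net/mptcp/protocol.c: the constants `RTO_MIN_MS` and `RTO_MAX_MS` and the helpers `ilog2_u32()` and `datafin_timeout_ms()` are assumptions chosen for the example.

```c
#include <stdint.h>

/* Assumed bounds for the example, in milliseconds. */
#define RTO_MIN_MS 200u
#define RTO_MAX_MS 120000u

/* floor(log2(v)) for v > 0. */
static unsigned int ilog2_u32(uint32_t v)
{
    unsigned int r = 0;

    while (v >>= 1)
        r++;
    return r;
}

/*
 * Exponential backoff with a bounded shift: the exponent is clamped so
 * that RTO_MIN_MS << shift never exceeds RTO_MAX_MS and the shift count
 * never reaches the width of the type.
 */
static uint32_t datafin_timeout_ms(uint32_t retransmits)
{
    uint32_t max_shift = ilog2_u32(RTO_MAX_MS / RTO_MIN_MS);
    uint32_t shift = retransmits < max_shift ? retransmits : max_shift;

    return RTO_MIN_MS << shift;
}
```

With these assumed constants the clamp is 9, so a retransmit count of 32 (the exponent in the UBSAN report) yields the same timeout as a count of 9 instead of an out-of-bounds shift.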
    • selftests: mptcp: do complete cleanup at exit · 63bb8239
      Paolo Abeni authored
      After commit 05be5e27 ("selftests: mptcp: add disconnect tests")
      the mptcp selftests leave behind a couple of tmp files after
      each run. run_tests_disconnect() misnames a few variables used to
      track them. Address the issue by setting the appropriate global
      variables.
      
      Fixes: 05be5e27 ("selftests: mptcp: add disconnect tests")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mptcp: accurate SIOCOUTQ for fallback socket · 07c2c7a3
      Paolo Abeni authored
      The MPTCP SIOCOUTQ implementation is not very accurate in
      case of fallback: it only measures the data in the MPTCP-level
      write queue, but it does not take into account the subflow
      write queue utilization. In case of fallback the former can be
      empty, while the latter is not.
      
      The above produces sporadic self-test failures and can fool
      legitimate user-space applications.
      
      Fix the issue by additionally querying the subflow in case of fallback.
      
      Fixes: 644807e3 ("mptcp: add SIOCINQ, OUTQ and OUTQNSD ioctls")
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/260
      Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
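A toy model of the accounting described above, assuming that in the fallback case the subflow's pending bytes are simply added to the MPTCP-level count. The struct and function names (`struct conn_model`, `outq_bytes()`) are hypothetical and do not correspond to kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two write queues of one connection. */
struct conn_model {
    bool     fallback;        /* connection fell back to plain TCP */
    uint32_t mptcp_pending;   /* bytes pending in the MPTCP-level queue */
    uint32_t subflow_pending; /* bytes pending in the subflow queue */
};

/* Report pending output bytes, consulting the subflow only on fallback. */
static uint32_t outq_bytes(const struct conn_model *c)
{
    uint32_t n = c->mptcp_pending;

    if (c->fallback)
        n += c->subflow_pending;  /* assumption: the counts are additive */
    return n;
}
```

In this model a fallback connection with an empty MPTCP-level queue but data still sitting in the subflow reports a non-zero value, which is the case the commit message says was previously missed.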
    • Merge tag 'for-net-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 8a727100
      Jakub Kicinski authored
      Luiz Augusto von Dentz says:
      
      ====================
      bluetooth pull request for net:
      
       - Fix regression with RFCOMM
       - Fix regression with LE devices using Privacy (RPA)
       - Fix regression with LE devices not waiting proper timeout to
         establish connections
       - Fix race in smp
      
      * tag 'for-net-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
        Bluetooth: hci_sync: Fix not using conn_timeout
        Bluetooth: hci_sync: Fix hci_update_accept_list_sync
        Bluetooth: assign len after null check
        Bluetooth: Fix bt_skb_sendmmsg not allocating partial chunks
        Bluetooth: fix data races in smp_unregister(), smp_del_chan()
        Bluetooth: hci_core: Fix leaking sent_cmd skb
      ====================
      
      Link: https://lore.kernel.org/r/20220224210838.197787-1-luiz.dentz@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 24 Feb, 2022 27 commits