1. 15 Mar, 2023 15 commits
    • Daniel Golle's avatar
      net: ethernet: mtk_eth_soc: only write values if needed · 6e933a80
      Daniel Golle authored
      Only restart auto-negotiation and write link timer if actually
      necessary. This prevents losing the link in case of minor
      changes.
      
      Fixes: 7e538372 ("net: ethernet: mediatek: Re-add support SGMII")
      Reviewed-by: default avatarRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Tested-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDaniel Golle <daniel@makrotopia.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6e933a80
    • Daniel Golle's avatar
      net: ethernet: mtk_eth_soc: reset PCS state · 611e2dab
      Daniel Golle authored
      Reset the internal PCS state machine when changing interface mode.
      This prevents confusing the state machine when changing interface
      modes, e.g. from SGMII to 2500Base-X or vice-versa.
      
      Fixes: 7e538372 ("net: ethernet: mediatek: Re-add support SGMII")
      Reviewed-by: default avatarRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Tested-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDaniel Golle <daniel@makrotopia.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      611e2dab
    • Szymon Heidrich's avatar
      net: usb: smsc75xx: Limit packet length to skb->len · d8b22831
      Szymon Heidrich authored
      Packet length retrieved from skb data may be larger than
      the actual socket buffer length (up to 9026 bytes). In such
      case the cloned skb passed up the network stack will leak
      kernel memory contents.
      
      Fixes: d0cad871 ("smsc75xx: SMSC LAN75xx USB gigabit ethernet adapter driver")
      Signed-off-by: default avatarSzymon Heidrich <szymon.heidrich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d8b22831
    • David S. Miller's avatar
      Merge branch 'net-smc-fixes' · fd6ad75f
      David S. Miller authored
      Wenjia Zhang says:
      
      ====================
      net/smc: Fixes 2023-03-01
      
      The 1st patch solves the problem that CLC message initialization was
      not properly reversed in error handling path. And the 2nd one fixes
      the possible deadlock triggered by cancel_delayed_work_sync().
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fd6ad75f
    • Stefan Raspl's avatar
      net/smc: Fix device de-init sequence · 9d876d3e
      Stefan Raspl authored
      CLC message initialization was not properly reversed in error handling path.
      Reported-and-suggested-by: default avatarAlexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: default avatarStefan Raspl <raspl@linux.ibm.com>
      Signed-off-by: default avatarWenjia Zhang <wenjia@linux.ibm.com>
      Reviewed-by: default avatarTony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9d876d3e
    • Wenjia Zhang's avatar
      net/smc: fix deadlock triggered by cancel_delayed_work_syn() · 13085e1b
      Wenjia Zhang authored
      The following LOCKDEP was detected:
      		Workqueue: events smc_lgr_free_work [smc]
      		WARNING: possible circular locking dependency detected
      		6.1.0-20221027.rc2.git8.56bc5b569087.300.fc36.s390x+debug #1 Not tainted
      		------------------------------------------------------
      		kworker/3:0/176251 is trying to acquire lock:
      		00000000f1467148 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0},
      			at: __flush_workqueue+0x7a/0x4f0
      		but task is already holding lock:
      		0000037fffe97dc8 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0},
      			at: process_one_work+0x232/0x730
      		which lock already depends on the new lock.
      		the existing dependency chain (in reverse order) is:
      		-> #4 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0}:
      		       __lock_acquire+0x58e/0xbd8
      		       lock_acquire.part.0+0xe2/0x248
      		       lock_acquire+0xac/0x1c8
      		       __flush_work+0x76/0xf0
      		       __cancel_work_timer+0x170/0x220
      		       __smc_lgr_terminate.part.0+0x34/0x1c0 [smc]
      		       smc_connect_rdma+0x15e/0x418 [smc]
      		       __smc_connect+0x234/0x480 [smc]
      		       smc_connect+0x1d6/0x230 [smc]
      		       __sys_connect+0x90/0xc0
      		       __do_sys_socketcall+0x186/0x370
      		       __do_syscall+0x1da/0x208
      		       system_call+0x82/0xb0
      		-> #3 (smc_client_lgr_pending){+.+.}-{3:3}:
      		       __lock_acquire+0x58e/0xbd8
      		       lock_acquire.part.0+0xe2/0x248
      		       lock_acquire+0xac/0x1c8
      		       __mutex_lock+0x96/0x8e8
      		       mutex_lock_nested+0x32/0x40
      		       smc_connect_rdma+0xa4/0x418 [smc]
      		       __smc_connect+0x234/0x480 [smc]
      		       smc_connect+0x1d6/0x230 [smc]
      		       __sys_connect+0x90/0xc0
      		       __do_sys_socketcall+0x186/0x370
      		       __do_syscall+0x1da/0x208
      		       system_call+0x82/0xb0
      		-> #2 (sk_lock-AF_SMC){+.+.}-{0:0}:
      		       __lock_acquire+0x58e/0xbd8
      		       lock_acquire.part.0+0xe2/0x248
      		       lock_acquire+0xac/0x1c8
      		       lock_sock_nested+0x46/0xa8
      		       smc_tx_work+0x34/0x50 [smc]
      		       process_one_work+0x30c/0x730
      		       worker_thread+0x62/0x420
      		       kthread+0x138/0x150
      		       __ret_from_fork+0x3c/0x58
      		       ret_from_fork+0xa/0x40
      		-> #1 ((work_completion)(&(&smc->conn.tx_work)->work)){+.+.}-{0:0}:
      		       __lock_acquire+0x58e/0xbd8
      		       lock_acquire.part.0+0xe2/0x248
      		       lock_acquire+0xac/0x1c8
      		       process_one_work+0x2bc/0x730
      		       worker_thread+0x62/0x420
      		       kthread+0x138/0x150
      		       __ret_from_fork+0x3c/0x58
      		       ret_from_fork+0xa/0x40
      		-> #0 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0}:
      		       check_prev_add+0xd8/0xe88
      		       validate_chain+0x70c/0xb20
      		       __lock_acquire+0x58e/0xbd8
      		       lock_acquire.part.0+0xe2/0x248
      		       lock_acquire+0xac/0x1c8
      		       __flush_workqueue+0xaa/0x4f0
      		       drain_workqueue+0xaa/0x158
      		       destroy_workqueue+0x44/0x2d8
      		       smc_lgr_free+0x9e/0xf8 [smc]
      		       process_one_work+0x30c/0x730
      		       worker_thread+0x62/0x420
      		       kthread+0x138/0x150
      		       __ret_from_fork+0x3c/0x58
      		       ret_from_fork+0xa/0x40
      		other info that might help us debug this:
      		Chain exists of:
      		  (wq_completion)smc_tx_wq-00000000#2
      	  	  --> smc_client_lgr_pending
      		  --> (work_completion)(&(&lgr->free_work)->work)
      		 Possible unsafe locking scenario:
      		       CPU0                    CPU1
      		       ----                    ----
      		  lock((work_completion)(&(&lgr->free_work)->work));
      		                   lock(smc_client_lgr_pending);
      		                   lock((work_completion)
      					(&(&lgr->free_work)->work));
      		  lock((wq_completion)smc_tx_wq-00000000#2);
      		 *** DEADLOCK ***
      		2 locks held by kworker/3:0/176251:
      		 #0: 0000000080183548
      			((wq_completion)events){+.+.}-{0:0},
      				at: process_one_work+0x232/0x730
      		 #1: 0000037fffe97dc8
      			((work_completion)
      			 (&(&lgr->free_work)->work)){+.+.}-{0:0},
      				at: process_one_work+0x232/0x730
      		stack backtrace:
      		CPU: 3 PID: 176251 Comm: kworker/3:0 Not tainted
      		Hardware name: IBM 8561 T01 701 (z/VM 7.2.0)
      		Call Trace:
      		 [<000000002983c3e4>] dump_stack_lvl+0xac/0x100
      		 [<0000000028b477ae>] check_noncircular+0x13e/0x160
      		 [<0000000028b48808>] check_prev_add+0xd8/0xe88
      		 [<0000000028b49cc4>] validate_chain+0x70c/0xb20
      		 [<0000000028b4bd26>] __lock_acquire+0x58e/0xbd8
      		 [<0000000028b4cf6a>] lock_acquire.part.0+0xe2/0x248
      		 [<0000000028b4d17c>] lock_acquire+0xac/0x1c8
      		 [<0000000028addaaa>] __flush_workqueue+0xaa/0x4f0
      		 [<0000000028addf9a>] drain_workqueue+0xaa/0x158
      		 [<0000000028ae303c>] destroy_workqueue+0x44/0x2d8
      		 [<000003ff8029af26>] smc_lgr_free+0x9e/0xf8 [smc]
      		 [<0000000028adf3d4>] process_one_work+0x30c/0x730
      		 [<0000000028adf85a>] worker_thread+0x62/0x420
      		 [<0000000028aeac50>] kthread+0x138/0x150
      		 [<0000000028a63914>] __ret_from_fork+0x3c/0x58
      		 [<00000000298503da>] ret_from_fork+0xa/0x40
      		INFO: lockdep is turned off.
      ===================================================================
      
      This deadlock occurs because cancel_delayed_work_sync() waits for
      the work(&lgr->free_work) to finish, while the &lgr->free_work
      waits for the work(lgr->tx_wq), which needs the sk_lock-AF_SMC, that
      is already used under the mutex_lock.
      
      The solution is to use cancel_delayed_work() instead, which kills
      off a pending work.
      
      Fixes: a52bcc91 ("net/smc: improve termination processing")
      Signed-off-by: default avatarWenjia Zhang <wenjia@linux.ibm.com>
      Reviewed-by: default avatarJan Karcher <jaka@linux.ibm.com>
      Reviewed-by: default avatarKarsten Graul <kgraul@linux.ibm.com>
      Reviewed-by: default avatarTony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      13085e1b
    • Ido Schimmel's avatar
      mlxsw: spectrum: Fix incorrect parsing depth after reload · 35c35692
      Ido Schimmel authored
      Spectrum ASICs have a configurable limit on how deep into the packet
      they parse. By default, the limit is 96 bytes.
      
      There are several cases where this parsing depth is not enough and there
      is a need to increase it. For example, timestamping of PTP packets and a
      FIB multipath hash policy that requires hashing on inner fields. The
      driver therefore maintains a reference count that reflects the number of
      consumers that require an increased parsing depth.
      
      During reload_down() the parsing depth reference count does not
      necessarily drop to zero, but the parsing depth itself is restored to
      the default during reload_up() when the firmware is reset. It is
      therefore possible to end up in situations where the driver thinks that
      the parsing depth was increased (reference count is non-zero), when it
      is not.
      
      Fix by making sure that all the consumers that increase the parsing
      depth reference count also decrease it during reload_down().
      Specifically, make sure that when the routing code is de-initialized it
      drops the reference count if it was increased because of a FIB multipath
      hash policy that requires hashing on inner fields.
      
      Add a warning if the reference count is not zero after the driver was
      de-initialized and explicitly reset it to zero during initialization for
      good measures.
      
      Fixes: 2d91f080 ("mlxsw: spectrum: Add infrastructure for parsing configuration")
      Reported-by: default avatarMaksym Yaremchuk <maksymy@nvidia.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: default avatarPetr Machata <petrm@nvidia.com>
      Link: https://lore.kernel.org/r/9c35e1b3e6c1d8f319a2449d14e2b86373f3b3ba.1678727526.git.petrm@nvidia.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      35c35692
    • Lorenzo Bianconi's avatar
      veth: rely on rtnl_dereference() instead of on rcu_dereference() in veth_set_xdp_features() · 5ce76fe1
      Lorenzo Bianconi authored
      Fix the following kernel warning in veth_set_xdp_features routine
      relying on rtnl_dereference() instead of on rcu_dereference():
      
      =============================
      WARNING: suspicious RCU usage
      6.3.0-rc1-00144-g064d7052 #149 Not tainted
      -----------------------------
      drivers/net/veth.c:1265 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by ip/135:
      (net/core/rtnetlink.c:6172)
      
      stack backtrace:
      CPU: 1 PID: 135 Comm: ip Not tainted 6.3.0-rc1-00144-g064d7052 #149
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1
      04/01/2014
      Call Trace:
       <TASK>
      dump_stack_lvl (lib/dump_stack.c:107)
      lockdep_rcu_suspicious (include/linux/context_tracking.h:152)
      veth_set_xdp_features (drivers/net/veth.c:1265 (discriminator 9))
      veth_newlink (drivers/net/veth.c:1892)
      ? veth_set_features (drivers/net/veth.c:1774)
      ? kasan_save_stack (mm/kasan/common.c:47)
      ? kasan_save_stack (mm/kasan/common.c:46)
      ? kasan_set_track (mm/kasan/common.c:52)
      ? alloc_netdev_mqs (include/linux/slab.h:737)
      ? rcu_read_lock_sched_held (kernel/rcu/update.c:125)
      ? trace_kmalloc (include/trace/events/kmem.h:54)
      ? __xdp_rxq_info_reg (net/core/xdp.c:188)
      ? alloc_netdev_mqs (net/core/dev.c:10657)
      ? rtnl_create_link (net/core/rtnetlink.c:3312)
      rtnl_newlink_create (net/core/rtnetlink.c:3440)
      ? rtnl_link_get_net_capable.constprop.0 (net/core/rtnetlink.c:3391)
      __rtnl_newlink (net/core/rtnetlink.c:3657)
      ? lock_downgrade (kernel/locking/lockdep.c:5321)
      ? rtnl_link_unregister (net/core/rtnetlink.c:3487)
      rtnl_newlink (net/core/rtnetlink.c:3671)
      rtnetlink_rcv_msg (net/core/rtnetlink.c:6174)
      ? rtnl_link_fill (net/core/rtnetlink.c:6070)
      ? mark_usage (kernel/locking/lockdep.c:4914)
      ? mark_usage (kernel/locking/lockdep.c:4914)
      netlink_rcv_skb (net/netlink/af_netlink.c:2574)
      ? rtnl_link_fill (net/core/rtnetlink.c:6070)
      ? netlink_ack (net/netlink/af_netlink.c:2551)
      ? lock_acquire (kernel/locking/lockdep.c:467)
      ? net_generic (include/linux/rcupdate.h:805)
      ? netlink_deliver_tap (include/linux/rcupdate.h:805)
      netlink_unicast (net/netlink/af_netlink.c:1340)
      ? netlink_attachskb (net/netlink/af_netlink.c:1350)
      netlink_sendmsg (net/netlink/af_netlink.c:1942)
      ? netlink_unicast (net/netlink/af_netlink.c:1861)
      ? netlink_unicast (net/netlink/af_netlink.c:1861)
      sock_sendmsg (net/socket.c:727)
      ____sys_sendmsg (net/socket.c:2501)
      ? kernel_sendmsg (net/socket.c:2448)
      ? __copy_msghdr (net/socket.c:2428)
      ___sys_sendmsg (net/socket.c:2557)
      ? mark_usage (kernel/locking/lockdep.c:4914)
      ? do_recvmmsg (net/socket.c:2544)
      ? lock_acquire (kernel/locking/lockdep.c:467)
      ? find_held_lock (kernel/locking/lockdep.c:5159)
      ? __lock_release (kernel/locking/lockdep.c:5345)
      ? __might_fault (mm/memory.c:5625)
      ? lock_downgrade (kernel/locking/lockdep.c:5321)
      ? __fget_light (include/linux/atomic/atomic-arch-fallback.h:227)
      __sys_sendmsg (include/linux/file.h:31)
      ? __sys_sendmsg_sock (net/socket.c:2572)
      ? rseq_get_rseq_cs (kernel/rseq.c:275)
      ? lockdep_hardirqs_on_prepare.part.0 (kernel/locking/lockdep.c:4263)
      do_syscall_64 (arch/x86/entry/common.c:50)
      entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
      RIP: 0033:0x7f0d1aadeb17
      Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e
      fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00
      f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
      
      Fixes: fccca038 ("veth: take into account device reconfiguration for xdp_features flag")
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarMatthieu Baerts <matthieu.baerts@tessares.net>
      Link: https://lore.kernel.org/netdev/cover.1678364612.git.lorenzo@kernel.org/T/#me4c9d8e985ec7ebee981cfdb5bc5ec651ef4035dSigned-off-by: default avatarLorenzo Bianconi <lorenzo@kernel.org>
      Reported-by: syzbot+c3d0d9c42d59ff644ea6@syzkaller.appspotmail.com
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Tested-by: default avatarMatthieu Baerts <matthieu.baerts@tessares.net>
      Link: https://lore.kernel.org/r/dfd6a9a7d85e9113063165e1f47b466b90ad7b8a.1678748579.git.lorenzo@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      5ce76fe1
    • Zheng Wang's avatar
      nfc: st-nci: Fix use after free bug in ndlc_remove due to race condition · 5000fe6c
      Zheng Wang authored
      This bug influences both st_nci_i2c_remove and st_nci_spi_remove.
      Take st_nci_i2c_remove as an example.
      
      In st_nci_i2c_probe, it called ndlc_probe and bound &ndlc->sm_work
      with llt_ndlc_sm_work.
      
      When it calls ndlc_recv or timeout handler, it will finally call
      schedule_work to start the work.
      
      When we call st_nci_i2c_remove to remove the driver, there
      may be a sequence as follows:
      
      Fix it by finishing the work before cleanup in ndlc_remove
      
      CPU0                  CPU1
      
                          |llt_ndlc_sm_work
      st_nci_i2c_remove   |
        ndlc_remove       |
           st_nci_remove  |
           nci_free_device|
           kfree(ndev)    |
      //free ndlc->ndev   |
                          |llt_ndlc_rcv_queue
                          |nci_recv_frame
                          |//use ndlc->ndev
      
      Fixes: 35630df6 ("NFC: st21nfcb: Add driver for STMicroelectronics ST21NFCB NFC chip")
      Signed-off-by: default avatarZheng Wang <zyytlz.wz@163.com>
      Reviewed-by: default avatarKrzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Link: https://lore.kernel.org/r/20230312160837.2040857-1-zyytlz.wz@163.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      5000fe6c
    • Jakub Kicinski's avatar
      Merge branch 'tcp-fix-bind-regression-for-dual-stack-wildcard-address' · cf18d55e
      Jakub Kicinski authored
      Kuniyuki Iwashima says:
      
      ====================
      tcp: Fix bind() regression for dual-stack wildcard address.
      
      The first patch fixes the regression reported in [0], and the second
      patch adds a test for similar cases to catch future regression.
      
      [0]: https://lore.kernel.org/netdev/e21bf153-80b0-9ec0-15ba-e04a4ad42c34@redhat.com/
      ====================
      
      Link: https://lore.kernel.org/r/20230312031904.4674-1-kuniyu@amazon.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      cf18d55e
    • Kuniyuki Iwashima's avatar
      selftest: Add test for bind() conflicts. · 13715acf
      Kuniyuki Iwashima authored
      The test checks if (IPv4, IPv6) address pair properly conflict or not.
      
        * IPv4
          * 0.0.0.0
          * 127.0.0.1
      
        * IPv6
          * ::
          * ::1
      
      If the IPv6 address is [::], the second bind() always fails.
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      13715acf
    • Kuniyuki Iwashima's avatar
      tcp: Fix bind() conflict check for dual-stack wildcard address. · d9ba9934
      Kuniyuki Iwashima authored
      Paul Holzinger reported [0] that commit 5456262d ("net: Fix
      incorrect address comparison when searching for a bind2 bucket")
      introduced a bind() regression.  Paul also gave a nice repro that
      calls two types of bind() on the same port, both of which now
      succeed, but the second call should fail:
      
        bind(fd1, ::, port) + bind(fd2, 127.0.0.1, port)
      
      The cited commit added address family tests in three functions to
      fix the uninit-value KMSAN report. [1]  However, the test added to
      inet_bind2_bucket_match_addr_any() removed a necessary conflict
      check; the dual-stack wildcard address no longer conflicts with
      an IPv4 non-wildcard address.
      
      If tb->family is AF_INET6 and sk->sk_family is AF_INET in
      inet_bind2_bucket_match_addr_any(), we still need to check
      if tb has the dual-stack wildcard address.
      
      Note that the IPv4 wildcard address does not conflict with
      IPv6 non-wildcard addresses.
      
      [0]: https://lore.kernel.org/netdev/e21bf153-80b0-9ec0-15ba-e04a4ad42c34@redhat.com/
      [1]: https://lore.kernel.org/netdev/CAG_fn=Ud3zSW7AZWXc+asfMhZVL5ETnvuY44Pmyv4NPv-ijN-A@mail.gmail.com/
      
      Fixes: 5456262d ("net: Fix incorrect address comparison when searching for a bind2 bucket")
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reported-by: default avatarPaul Holzinger <pholzing@redhat.com>
      Link: https://lore.kernel.org/netdev/CAG_fn=Ud3zSW7AZWXc+asfMhZVL5ETnvuY44Pmyv4NPv-ijN-A@mail.gmail.com/Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Tested-by: default avatarPaul Holzinger <pholzing@redhat.com>
      Reviewed-by: default avatarMartin KaFai Lau <martin.lau@kernel.org>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      d9ba9934
    • Heiner Kallweit's avatar
      net: phy: smsc: bail out in lan87xx_read_status if genphy_read_status fails · c22c3bbf
      Heiner Kallweit authored
      If genphy_read_status fails then further access to the PHY may result
      in unpredictable behavior. To prevent this bail out immediately if
      genphy_read_status fails.
      
      Fixes: 4223dbff ("net: phy: smsc: Re-enable EDPD mode for LAN87xx")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/026aa4f2-36f5-1c10-ab9f-cdb17dda6ac4@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c22c3bbf
    • Eric Dumazet's avatar
      net: tunnels: annotate lockless accesses to dev->needed_headroom · 4b397c06
      Eric Dumazet authored
      IP tunnels can apparently update dev->needed_headroom
      in their xmit path.
      
      This patch takes care of three tunnels xmit, and also the
      core LL_RESERVED_SPACE() and LL_RESERVED_SPACE_EXTRA()
      helpers.
      
      More changes might be needed for completeness.
      
      BUG: KCSAN: data-race in ip_tunnel_xmit / ip_tunnel_xmit
      
      read to 0xffff88815b9da0ec of 2 bytes by task 888 on cpu 1:
      ip_tunnel_xmit+0x1270/0x1730 net/ipv4/ip_tunnel.c:803
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      
      write to 0xffff88815b9da0ec of 2 bytes by task 2379 on cpu 0:
      ip_tunnel_xmit+0x1294/0x1730 net/ipv4/ip_tunnel.c:804
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip6_finish_output2+0x9bc/0xc50 net/ipv6/ip6_output.c:134
      __ip6_finish_output net/ipv6/ip6_output.c:195 [inline]
      ip6_finish_output+0x39a/0x4e0 net/ipv6/ip6_output.c:206
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip6_output+0xeb/0x220 net/ipv6/ip6_output.c:227
      dst_output include/net/dst.h:444 [inline]
      NF_HOOK include/linux/netfilter.h:302 [inline]
      mld_sendpack+0x438/0x6a0 net/ipv6/mcast.c:1820
      mld_send_cr net/ipv6/mcast.c:2121 [inline]
      mld_ifc_work+0x519/0x7b0 net/ipv6/mcast.c:2653
      process_one_work+0x3e6/0x750 kernel/workqueue.c:2390
      worker_thread+0x5f2/0xa10 kernel/workqueue.c:2537
      kthread+0x1ac/0x1e0 kernel/kthread.c:376
      ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
      
      value changed: 0x0dd4 -> 0x0e14
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 2379 Comm: kworker/0:0 Not tainted 6.3.0-rc1-syzkaller-00002-g8ca09d5f-dirty #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
      Workqueue: mld mld_ifc_work
      
      Fixes: 8eb30be0 ("ipv6: Create ip6_tnl_xmit")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20230310191109.2384387-1-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      4b397c06
    • Dave Ertman's avatar
      ice: avoid bonding causing auxiliary plug/unplug under RTNL lock · 248401cb
      Dave Ertman authored
      RDMA is not supported in ice on a PF that has been added to a bonded
      interface. To enforce this, when an interface enters a bond, we unplug
      the auxiliary device that supports RDMA functionality.  This unplug
      currently happens in the context of handling the netdev bonding event.
      This event is sent to the ice driver under RTNL context.  This is causing
      a deadlock where the RDMA driver is waiting for the RTNL lock to complete
      the removal.
      
      Defer the unplugging/re-plugging of the auxiliary device to the service
      task so that it is not performed under the RTNL lock context.
      
      Cc: stable@vger.kernel.org # 6.1.x
      Reported-by: default avatarJaroslav Pulchart <jaroslav.pulchart@gooddata.com>
      Link: https://lore.kernel.org/netdev/CAK8fFZ6A_Gphw_3-QMGKEFQk=sfCw1Qmq0TVZK3rtAi7vb621A@mail.gmail.com/
      Fixes: 5cb1ebdb ("ice: Fix race condition during interface enslave")
      Fixes: 4eace75e ("RDMA/irdma: Report the correct link speed")
      Signed-off-by: default avatarDave Ertman <david.m.ertman@intel.com>
      Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Reviewed-by: default avatarLeon Romanovsky <leonro@nvidia.com>
      Link: https://lore.kernel.org/r/20230310194833.3074601-1-anthony.l.nguyen@intel.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      248401cb
  2. 14 Mar, 2023 4 commits
  3. 13 Mar, 2023 3 commits
  4. 11 Mar, 2023 18 commits