1. 15 Mar, 2023 29 commits
    • net/mlx5e: TC, fix cloned flow attribute · b23bf10c
      Oz Shlomo authored
      Currently the cloned flow attr resets the original tc action cookies
      count.
      Fix that by resetting the cloned flow attribute.
      
      Fixes: cca7eac1 ("net/mlx5e: TC, store tc action cookies per attr")
      Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
      Reviewed-by: Paul Blakey <paulb@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: TC, fix missing error code · 1166add4
      Oz Shlomo authored
      Set the missing error code when mlx5e_tc_act_stats_create() fails.
      
      Fixes: d13674b1 ("net/mlx5e: TC, map tc action cookie to a hw counter")
      Reported-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
      Reviewed-by: Paul Blakey <paulb@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/sched: TC, fix raw counter initialization · d1a0075a
      Oz Shlomo authored
      Freed counters may be reused by fs core.
      As such, raw counters may not be initialized to zero.
      
      Cache the counter values when the action stats object is initialized to
      have a proper base value for calculating the difference from the previous
      query.
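The base-value caching described above can be illustrated with a small Python model. This is only a sketch: `HwCounter` and `ActionStats` are hypothetical names chosen for illustration, not the mlx5 data structures, and the real driver reads counters through the flow-steering core.

```python
class HwCounter:
    """Hypothetical stand-in for a hardware counter that may be recycled."""

    def __init__(self, packets=0):
        self.packets = packets  # raw value; not guaranteed to start at zero


class ActionStats:
    """Reports per-action deltas relative to the last observed raw value."""

    def __init__(self, counter):
        self.counter = counter
        # Cache the raw value at creation time so that a recycled counter's
        # leftover count is not attributed to this action.
        self.last_packets = counter.packets

    def query(self):
        """Return packets counted since the previous query (or creation)."""
        current = self.counter.packets
        delta = current - self.last_packets
        self.last_packets = current
        return delta


recycled = HwCounter(packets=1000)  # counter reused after a previous owner
stats = ActionStats(recycled)
recycled.packets += 5               # five packets hit the new action
first_delta = stats.query()
second_delta = stats.query()
```

Without the cached base, the first query in this model would report 1005 packets instead of 5.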
      
      Fixes: 2b68d659 ("net/mlx5e: TC, support per action stats")
      Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
      Reviewed-by: Paul Blakey <paulb@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Lower maximum allowed MTU in XSK to match XDP prerequisites · 78dee7be
      Adham Faris authored
      XDP programs that redirect to XSK require linear packets, which
      restricts the MTU. For PAGE_SIZE=4K, the MTU must not exceed 3498.
      
      Features that conflict with XDP, such as HW-LRO and HW-GRO, are
      checked by the driver in advance, during XSK params validation. The
      MTU was the exception: it was not checked before this patch.
      
      This was spotted in the following test scenario: attach the xdpsock
      program (PAGE_SIZE=4K) with MTU < 3498, detach the XDP program, change
      the MTU to an arbitrary value in the range [3499, 3754], then attach
      the XDP program again. The last step fails because the MTU is > 3498.
      
      This commit lowers the XSK MTU limit to match the XDP MTU limit, since
      an XSK socket is meaningless without an XDP program.
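One way to see where a figure like 3498 can come from is the arithmetic below. It is a sketch under stated assumptions: the commit text gives only PAGE_SIZE and the limit, while the headroom, tailroom and L2 overhead constants are assumed values that happen to reproduce it. The function names are hypothetical.

```python
PAGE_SIZE = 4096          # from the commit text (PAGE_SIZE=4K)
XDP_HEADROOM = 256        # assumption: headroom reserved in front of the packet
SHINFO_TAILROOM = 320     # assumption: aligned skb_shared_info size on x86_64
L2_OVERHEAD = 14 + 4 + 4  # assumption: Ethernet header + VLAN tag + FCS


def max_linear_mtu(page_size=PAGE_SIZE):
    """Largest MTU whose frame fits in one page under the assumptions above."""
    return page_size - XDP_HEADROOM - SHINFO_TAILROOM - L2_OVERHEAD


def xsk_mtu_ok(mtu):
    """After the patch, XSK validation applies the same bound as XDP."""
    return mtu <= max_linear_mtu()


limit = max_linear_mtu()
```

With these constants, any MTU in the range [3499, 3754] from the test scenario is rejected at XSK validation time rather than later, when the XDP program is attached.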
      Signed-off-by: Adham Faris <afaris@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Set BREAK_FW_WAIT flag first when removing driver · 031a163f
      Shay Drory authored
      Currently, the BREAK_FW_WAIT flag is set after syncing with fw_reset.
      However, fw_reset can call mlx5_load_one(), which waits for the fw
      init bit, and this is exactly the wait that the BREAK_FW_WAIT flag is
      intended to stop. In other words, the driver might wait in a loop it
      should exit.
      Fix it by setting the flag before syncing with fw_reset.
      
      Fixes: 8324a02c ("net/mlx5: Add exit route when waiting for FW")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: kTLS, Fix missing error unwind on unsupported cipher type · dd645724
      Gal Pressman authored
      Do proper error unwinding when adding an unsupported TX/RX cipher type.
      Move the switch case prior to key creation so there's less to unwind,
      and change the goto label name to describe the action performed instead
      of what failed.
      
      Fixes: 4960c414 ("net/mlx5e: Support 256 bit keys with kTLS device offload")
      Signed-off-by: Gal Pressman <gal@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix cleanup null-ptr deref on encap lock · c9668f0b
      Paul Blakey authored
      When the module is unloaded while a peer tc flow is still offloaded,
      first the peer uplink rep profile is changed to a nic profile, and so
      the neigh encap lock is destroyed. Next, during unload, the VF rep
      netdevs are unregistered, which causes the original non-peer tc flow
      to be deleted, which in turn deletes the peer flow. The peer flow
      deletion detaches the encap entry and tries to take the already
      destroyed encap lock, causing the trace below.
      
      Fix this by clearing peer flows during tc eswitch cleanup
      (mlx5e_tc_esw_cleanup()).
      
      Relevant trace:
      [ 4316.837128] BUG: kernel NULL pointer dereference, address: 00000000000001d8
      [ 4316.842239] RIP: 0010:__mutex_lock+0xb5/0xc40
      [ 4316.851897] Call Trace:
      [ 4316.852481]  <TASK>
      [ 4316.857214]  mlx5e_rep_neigh_entry_release+0x93/0x790 [mlx5_core]
      [ 4316.858258]  mlx5e_rep_encap_entry_detach+0xa7/0xf0 [mlx5_core]
      [ 4316.859134]  mlx5e_encap_dealloc+0xa3/0xf0 [mlx5_core]
      [ 4316.859867]  clean_encap_dests.part.0+0x5c/0xe0 [mlx5_core]
      [ 4316.860605]  mlx5e_tc_del_fdb_flow+0x32a/0x810 [mlx5_core]
      [ 4316.862609]  __mlx5e_tc_del_fdb_peer_flow+0x1a2/0x250 [mlx5_core]
      [ 4316.863394]  mlx5e_tc_del_flow+0x(/0x630 [mlx5_core]
      [ 4316.864090]  mlx5e_flow_put+0x5f/0x100 [mlx5_core]
      [ 4316.864771]  mlx5e_delete_flower+0x4de/0xa40 [mlx5_core]
      [ 4316.865486]  tc_setup_cb_reoffload+0x20/0x80
      [ 4316.865905]  fl_reoffload+0x47c/0x510 [cls_flower]
      [ 4316.869181]  tcf_block_playback_offloads+0x91/0x1d0
      [ 4316.869649]  tcf_block_unbind+0xe7/0x1b0
      [ 4316.870049]  tcf_block_offload_cmd.isra.0+0x1ee/0x270
      [ 4316.879266]  tcf_block_offload_unbind+0x61/0xa0
      [ 4316.879711]  __tcf_block_put+0xa4/0x310
      
      Fixes: 04de7dda ("net/mlx5e: Infrastructure for duplicated offloading of TC flows")
      Fixes: 1418ddd9 ("net/mlx5e: Duplicate offloaded TC eswitch rules under uplink LAG")
      Signed-off-by: Paul Blakey <paulb@nvidia.com>
      Reviewed-by: Chris Mi <cmi@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: E-switch, Fix missing set of split_count when forward to ovs internal port · 28d3815a
      Maor Dickman authored
      Rules with mirror actions are split into two FTEs when the actions
      after the mirror action contain pedit, vlan push/pop or ct. Forwarding
      to an ovs internal port adds an implicit header rewrite (pedit), but
      the trigger to do the split was missing.
      
      Fix by setting split_count when forwarding to an ovs internal port,
      which triggers the split in mirror rules.
      
      Fixes: 27484f71 ("net/mlx5e: Offload tc rules that redirect to ovs internal port")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: E-switch, Fix wrong usage of source port rewrite in split rules · 1313d78a
      Maor Dickman authored
      In a few cases, rules with a mirror action are split into two FTEs:
      the first does the mirror action and forwards to the second FTE, which
      does the rest of the rule actions and the second redirect action.
      For mirror rules that are split and forward to an ovs internal port or
      VF stack devices, source port rewrite should be used in the second
      FTE, but it is wrongly also set in the first FTE, which breaks the
      offload.
      
      Fix this issue by removing the wrong check of whether source port
      rewrite is needed on the first FTE of the split, and instead return
      EOPNOTSUPP. This blocks offload of rules that mirror to an ovs internal
      port or VF stack devices, which isn't supported.
      
      Fixes: 10742efc ("net/mlx5e: VF tunnel TX traffic offloading")
      Fixes: a508728a ("net/mlx5e: VF tunnel RX traffic offloading")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Disable eswitch before waiting for VF pages · 7ba930fc
      Daniel Jurgens authored
      The offending commit changed the ordering of moving to legacy mode and
      waiting for the VF pages. Moving to legacy mode is important in
      bluefield, because it sends the host driver into error state, and frees
      its pages. Without this transition we end up waiting 2 minutes for
      pages that aren't coming before carrying on with the unload process.
      
      Fixes: f019679e ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode")
      Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Fix setting ec_function bit in MANAGE_PAGES · ba5d8f72
      Parav Pandit authored
      When ECPF is a page supplier, page reclaim failed to honor the
      ec_function bit provided by the firmware. It always set ec_function
      to true during the driver unload flow for ECPF. This is incorrect.
      
      Honor the ec_function bit provided by the device during the page
      allocation request event.
      
      Fixes: d6945242 ("net/mlx5: Hold pages RB tree per VF")
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Don't cache tunnel offloads capability · 9a92fe1d
      Parav Pandit authored
      When mlx5e attaches again after device health recovery, the device
      capabilities might have been changed by the eswitch manager.
      
      For example, when ECPF changes the eswitch mode between legacy and
      switchdev, it updates the flow table tunnel capability.
      
      The cached value is only used in one place, so just check the capability
      there instead.
      
      Fixes: 5bef709d ("net/mlx5: Enable host PF HCA after eswitch is initialized")
      Signed-off-by: Parav Pandit <parav@nvidia.com>
      Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix macsec ASO context alignment · 37beabe9
      Emeel Hakim authored
      Currently the mlx5e_macsec_umr struct does not satisfy the hardware
      memory alignment requirement. Hence the result of querying the advanced
      steering operation (ASO) is not copied to the memory region as expected.
      
      Fix by satisfying the hardware memory alignment requirement, and move
      the context to be the first field in the struct for better readability.
      
      Fixes: 1f53da67 ("net/mlx5e: Create advanced steering operation (ASO) object for MACsec")
      Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • Merge branch 'mtk_eth_soc-SGMII-fixes' · 75014826
      David S. Miller authored
      Daniel Golle says:
      
      ====================
      net: ethernet: mtk_eth_soc: minor SGMII fixes
      
      This small series brings two minor fixes for the SGMII unit found in
      MediaTek's router SoCs.
      
      The first patch resets the PCS internal state machine on major
      configuration changes, just like it is also done in MediaTek's SDK.
      
      The second patch makes sure we only write values and restart AN if
      actually needed, thus preventing unnecessary loss of an existing link
      in some cases.
      
      Both patches have previously been submitted as part of the series
      "net: ethernet: mtk_eth_soc: various enhancements" which grew a bit
      too big and it has correctly been criticized that some of the patches
      should rather go as fixes to net-next.
      
      This new series tries to address this.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mtk_eth_soc: only write values if needed · 6e933a80
      Daniel Golle authored
      Only restart auto-negotiation and write link timer if actually
      necessary. This prevents losing the link in case of minor
      changes.
      
      Fixes: 7e538372 ("net: ethernet: mediatek: Re-add support SGMII")
      Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Tested-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: Daniel Golle <daniel@makrotopia.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mtk_eth_soc: reset PCS state · 611e2dab
      Daniel Golle authored
      Reset the internal PCS state machine when changing interface mode.
      This prevents confusing the state machine when changing interface
      modes, e.g. from SGMII to 2500Base-X or vice-versa.
      
      Fixes: 7e538372 ("net: ethernet: mediatek: Re-add support SGMII")
      Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Tested-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: Daniel Golle <daniel@makrotopia.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: usb: smsc75xx: Limit packet length to skb->len · d8b22831
      Szymon Heidrich authored
      The packet length retrieved from skb data may be larger than
      the actual socket buffer length (up to 9026 bytes). In such a
      case the cloned skb passed up the network stack will leak
      kernel memory contents.
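The kind of bounds check this calls for can be modelled in a few lines of Python. This is a sketch only: `extract_packet`, `offset` and `claimed_len` are hypothetical names, not identifiers from the smsc75xx driver. Python slicing cannot read past the end of a buffer, so the model shows the validation logic rather than the leak itself.

```python
def extract_packet(buf, offset, claimed_len):
    """Return the payload, or None if the claimed length overruns the buffer.

    buf         -- bytes actually received (stands in for the skb data)
    offset      -- where the payload starts (hypothetical header size)
    claimed_len -- length field parsed from the device-supplied header
    """
    available = len(buf) - offset
    if claimed_len > available:
        # Reject the frame rather than trusting a length that points
        # beyond the data that was really received.
        return None
    return buf[offset:offset + claimed_len]


received = bytes(64)
good = extract_packet(received, 4, 60)    # fits exactly
bad = extract_packet(received, 4, 9026)   # claims more than was received
```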
      
      Fixes: d0cad871 ("smsc75xx: SMSC LAN75xx USB gigabit ethernet adapter driver")
      Signed-off-by: Szymon Heidrich <szymon.heidrich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-smc-fixes' · fd6ad75f
      David S. Miller authored
      Wenjia Zhang says:
      
      ====================
      net/smc: Fixes 2023-03-01
      
      The 1st patch solves the problem that CLC message initialization was
      not properly reversed in error handling path. And the 2nd one fixes
      the possible deadlock triggered by cancel_delayed_work_sync().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: Fix device de-init sequence · 9d876d3e
      Stefan Raspl authored
      CLC message initialization was not properly reversed in error handling path.
      Reported-and-suggested-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
      Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
      Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: fix deadlock triggered by cancel_delayed_work_syn() · 13085e1b
      Wenjia Zhang authored
      The following LOCKDEP was detected:
      		Workqueue: events smc_lgr_free_work [smc]
      		WARNING: possible circular locking dependency detected
      		6.1.0-20221027.rc2.git8.56bc5b569087.300.fc36.s390x+debug #1 Not tainted
      		------------------------------------------------------
      		kworker/3:0/176251 is trying to acquire lock:
      		00000000f1467148 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0},
      			at: __flush_workqueue+0x7a/0x4f0
      		but task is already holding lock:
      		0000037fffe97dc8 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0},
      			at: process_one_work+0x232/0x730
      		which lock already depends on the new lock.
      		the existing dependency chain (in reverse order) is:
      		-> #4 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0}:
      		       __lock_acquire+0x58e/0xbd8
      		       lock_acquire.part.0+0xe2/0x248
      		       lock_acquire+0xac/0x1c8
      		       __flush_work+0x76/0xf0
      		       __cancel_work_timer+0x170/0x220
      		       __smc_lgr_terminate.part.0+0x34/0x1c0 [smc]
      		       smc_connect_rdma+0x15e/0x418 [smc]
      		       __smc_connect+0x234/0x480 [smc]
      		       smc_connect+0x1d6/0x230 [smc]
      		       __sys_connect+0x90/0xc0
      		       __do_sys_socketcall+0x186/0x370
      		       __do_syscall+0x1da/0x208
      		       system_call+0x82/0xb0
      		-> #3 (smc_client_lgr_pending){+.+.}-{3:3}:
      		       __lock_acquire+0x58e/0xbd8
      		       lock_acquire.part.0+0xe2/0x248
      		       lock_acquire+0xac/0x1c8
      		       __mutex_lock+0x96/0x8e8
      		       mutex_lock_nested+0x32/0x40
      		       smc_connect_rdma+0xa4/0x418 [smc]
      		       __smc_connect+0x234/0x480 [smc]
      		       smc_connect+0x1d6/0x230 [smc]
      		       __sys_connect+0x90/0xc0
      		       __do_sys_socketcall+0x186/0x370
      		       __do_syscall+0x1da/0x208
      		       system_call+0x82/0xb0
      		-> #2 (sk_lock-AF_SMC){+.+.}-{0:0}:
      		       __lock_acquire+0x58e/0xbd8
      		       lock_acquire.part.0+0xe2/0x248
      		       lock_acquire+0xac/0x1c8
      		       lock_sock_nested+0x46/0xa8
      		       smc_tx_work+0x34/0x50 [smc]
      		       process_one_work+0x30c/0x730
      		       worker_thread+0x62/0x420
      		       kthread+0x138/0x150
      		       __ret_from_fork+0x3c/0x58
      		       ret_from_fork+0xa/0x40
      		-> #1 ((work_completion)(&(&smc->conn.tx_work)->work)){+.+.}-{0:0}:
      		       __lock_acquire+0x58e/0xbd8
      		       lock_acquire.part.0+0xe2/0x248
      		       lock_acquire+0xac/0x1c8
      		       process_one_work+0x2bc/0x730
      		       worker_thread+0x62/0x420
      		       kthread+0x138/0x150
      		       __ret_from_fork+0x3c/0x58
      		       ret_from_fork+0xa/0x40
      		-> #0 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0}:
      		       check_prev_add+0xd8/0xe88
      		       validate_chain+0x70c/0xb20
      		       __lock_acquire+0x58e/0xbd8
      		       lock_acquire.part.0+0xe2/0x248
      		       lock_acquire+0xac/0x1c8
      		       __flush_workqueue+0xaa/0x4f0
      		       drain_workqueue+0xaa/0x158
      		       destroy_workqueue+0x44/0x2d8
      		       smc_lgr_free+0x9e/0xf8 [smc]
      		       process_one_work+0x30c/0x730
      		       worker_thread+0x62/0x420
      		       kthread+0x138/0x150
      		       __ret_from_fork+0x3c/0x58
      		       ret_from_fork+0xa/0x40
      		other info that might help us debug this:
      		Chain exists of:
      		  (wq_completion)smc_tx_wq-00000000#2
      	  	  --> smc_client_lgr_pending
      		  --> (work_completion)(&(&lgr->free_work)->work)
      		 Possible unsafe locking scenario:
      		       CPU0                    CPU1
      		       ----                    ----
      		  lock((work_completion)(&(&lgr->free_work)->work));
      		                   lock(smc_client_lgr_pending);
      		                   lock((work_completion)
      					(&(&lgr->free_work)->work));
      		  lock((wq_completion)smc_tx_wq-00000000#2);
      		 *** DEADLOCK ***
      		2 locks held by kworker/3:0/176251:
      		 #0: 0000000080183548
      			((wq_completion)events){+.+.}-{0:0},
      				at: process_one_work+0x232/0x730
      		 #1: 0000037fffe97dc8
      			((work_completion)
      			 (&(&lgr->free_work)->work)){+.+.}-{0:0},
      				at: process_one_work+0x232/0x730
      		stack backtrace:
      		CPU: 3 PID: 176251 Comm: kworker/3:0 Not tainted
      		Hardware name: IBM 8561 T01 701 (z/VM 7.2.0)
      		Call Trace:
      		 [<000000002983c3e4>] dump_stack_lvl+0xac/0x100
      		 [<0000000028b477ae>] check_noncircular+0x13e/0x160
      		 [<0000000028b48808>] check_prev_add+0xd8/0xe88
      		 [<0000000028b49cc4>] validate_chain+0x70c/0xb20
      		 [<0000000028b4bd26>] __lock_acquire+0x58e/0xbd8
      		 [<0000000028b4cf6a>] lock_acquire.part.0+0xe2/0x248
      		 [<0000000028b4d17c>] lock_acquire+0xac/0x1c8
      		 [<0000000028addaaa>] __flush_workqueue+0xaa/0x4f0
      		 [<0000000028addf9a>] drain_workqueue+0xaa/0x158
      		 [<0000000028ae303c>] destroy_workqueue+0x44/0x2d8
      		 [<000003ff8029af26>] smc_lgr_free+0x9e/0xf8 [smc]
      		 [<0000000028adf3d4>] process_one_work+0x30c/0x730
      		 [<0000000028adf85a>] worker_thread+0x62/0x420
      		 [<0000000028aeac50>] kthread+0x138/0x150
      		 [<0000000028a63914>] __ret_from_fork+0x3c/0x58
      		 [<00000000298503da>] ret_from_fork+0xa/0x40
      		INFO: lockdep is turned off.
      ===================================================================
      
      This deadlock occurs because cancel_delayed_work_sync() waits for
      the work (&lgr->free_work) to finish, while &lgr->free_work
      waits for the work on lgr->tx_wq, which needs sk_lock-AF_SMC, which
      is already held under the mutex_lock.
      
      The solution is to use cancel_delayed_work() instead, which kills
      off pending work without waiting for it.
      
      Fixes: a52bcc91 ("net/smc: improve termination processing")
      Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
      Reviewed-by: Jan Karcher <jaka@linux.ibm.com>
      Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
      Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Fix incorrect parsing depth after reload · 35c35692
      Ido Schimmel authored
      Spectrum ASICs have a configurable limit on how deep into the packet
      they parse. By default, the limit is 96 bytes.
      
      There are several cases where this parsing depth is not enough and there
      is a need to increase it. For example, timestamping of PTP packets and a
      FIB multipath hash policy that requires hashing on inner fields. The
      driver therefore maintains a reference count that reflects the number of
      consumers that require an increased parsing depth.
      
      During reload_down() the parsing depth reference count does not
      necessarily drop to zero, but the parsing depth itself is restored to
      the default during reload_up() when the firmware is reset. It is
      therefore possible to end up in situations where the driver thinks that
      the parsing depth was increased (reference count is non-zero), when it
      is not.
      
      Fix by making sure that all the consumers that increase the parsing
      depth reference count also decrease it during reload_down().
      Specifically, make sure that when the routing code is de-initialized it
      drops the reference count if it was increased because of a FIB multipath
      hash policy that requires hashing on inner fields.
      
      Add a warning if the reference count is not zero after the driver was
      de-initialized and explicitly reset it to zero during initialization for
      good measure.
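The mismatch between the reference count and the hardware state can be reproduced with a toy model. This is a sketch under assumptions: `ParsingState`, `reload` and the increased depth of 128 bytes are hypothetical, and the model assumes a single consumer that takes a reference again when it is re-initialized in reload_up().

```python
DEFAULT_DEPTH = 96     # bytes, from the commit text
INCREASED_DEPTH = 128  # assumption: illustrative value only


class ParsingState:
    def __init__(self):
        self.refcount = 0
        self.hw_depth = DEFAULT_DEPTH

    def depth_inc(self):
        # Only the first consumer actually programs the hardware.
        if self.refcount == 0:
            self.hw_depth = INCREASED_DEPTH
        self.refcount += 1

    def depth_dec(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.hw_depth = DEFAULT_DEPTH

    def firmware_reset(self):
        # The firmware reset restores the default; the refcount is untouched.
        self.hw_depth = DEFAULT_DEPTH


def reload(state, consumer_releases_on_down):
    """Model reload_down() followed by reload_up() for one consumer."""
    if consumer_releases_on_down:
        state.depth_dec()   # fixed behaviour: drop the reference on the way down
    state.firmware_reset()
    state.depth_inc()       # consumer is re-initialized on the way up


buggy = ParsingState()
buggy.depth_inc()
reload(buggy, consumer_releases_on_down=False)

fixed = ParsingState()
fixed.depth_inc()
reload(fixed, consumer_releases_on_down=True)
```

In the buggy run the reference count ends at 2 while the modelled hardware depth stays at the 96-byte default. In the fixed run the count is 1 and the depth is increased again.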
      
      Fixes: 2d91f080 ("mlxsw: spectrum: Add infrastructure for parsing configuration")
      Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Link: https://lore.kernel.org/r/9c35e1b3e6c1d8f319a2449d14e2b86373f3b3ba.1678727526.git.petrm@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • veth: rely on rtnl_dereference() instead of on rcu_dereference() in veth_set_xdp_features() · 5ce76fe1
      Lorenzo Bianconi authored
      Fix the following kernel warning in the veth_set_xdp_features() routine
      by relying on rtnl_dereference() instead of rcu_dereference():
      
      =============================
      WARNING: suspicious RCU usage
      6.3.0-rc1-00144-g064d7052 #149 Not tainted
      -----------------------------
      drivers/net/veth.c:1265 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by ip/135:
      (net/core/rtnetlink.c:6172)
      
      stack backtrace:
      CPU: 1 PID: 135 Comm: ip Not tainted 6.3.0-rc1-00144-g064d7052 #149
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1
      04/01/2014
      Call Trace:
       <TASK>
      dump_stack_lvl (lib/dump_stack.c:107)
      lockdep_rcu_suspicious (include/linux/context_tracking.h:152)
      veth_set_xdp_features (drivers/net/veth.c:1265 (discriminator 9))
      veth_newlink (drivers/net/veth.c:1892)
      ? veth_set_features (drivers/net/veth.c:1774)
      ? kasan_save_stack (mm/kasan/common.c:47)
      ? kasan_save_stack (mm/kasan/common.c:46)
      ? kasan_set_track (mm/kasan/common.c:52)
      ? alloc_netdev_mqs (include/linux/slab.h:737)
      ? rcu_read_lock_sched_held (kernel/rcu/update.c:125)
      ? trace_kmalloc (include/trace/events/kmem.h:54)
      ? __xdp_rxq_info_reg (net/core/xdp.c:188)
      ? alloc_netdev_mqs (net/core/dev.c:10657)
      ? rtnl_create_link (net/core/rtnetlink.c:3312)
      rtnl_newlink_create (net/core/rtnetlink.c:3440)
      ? rtnl_link_get_net_capable.constprop.0 (net/core/rtnetlink.c:3391)
      __rtnl_newlink (net/core/rtnetlink.c:3657)
      ? lock_downgrade (kernel/locking/lockdep.c:5321)
      ? rtnl_link_unregister (net/core/rtnetlink.c:3487)
      rtnl_newlink (net/core/rtnetlink.c:3671)
      rtnetlink_rcv_msg (net/core/rtnetlink.c:6174)
      ? rtnl_link_fill (net/core/rtnetlink.c:6070)
      ? mark_usage (kernel/locking/lockdep.c:4914)
      ? mark_usage (kernel/locking/lockdep.c:4914)
      netlink_rcv_skb (net/netlink/af_netlink.c:2574)
      ? rtnl_link_fill (net/core/rtnetlink.c:6070)
      ? netlink_ack (net/netlink/af_netlink.c:2551)
      ? lock_acquire (kernel/locking/lockdep.c:467)
      ? net_generic (include/linux/rcupdate.h:805)
      ? netlink_deliver_tap (include/linux/rcupdate.h:805)
      netlink_unicast (net/netlink/af_netlink.c:1340)
      ? netlink_attachskb (net/netlink/af_netlink.c:1350)
      netlink_sendmsg (net/netlink/af_netlink.c:1942)
      ? netlink_unicast (net/netlink/af_netlink.c:1861)
      ? netlink_unicast (net/netlink/af_netlink.c:1861)
      sock_sendmsg (net/socket.c:727)
      ____sys_sendmsg (net/socket.c:2501)
      ? kernel_sendmsg (net/socket.c:2448)
      ? __copy_msghdr (net/socket.c:2428)
      ___sys_sendmsg (net/socket.c:2557)
      ? mark_usage (kernel/locking/lockdep.c:4914)
      ? do_recvmmsg (net/socket.c:2544)
      ? lock_acquire (kernel/locking/lockdep.c:467)
      ? find_held_lock (kernel/locking/lockdep.c:5159)
      ? __lock_release (kernel/locking/lockdep.c:5345)
      ? __might_fault (mm/memory.c:5625)
      ? lock_downgrade (kernel/locking/lockdep.c:5321)
      ? __fget_light (include/linux/atomic/atomic-arch-fallback.h:227)
      __sys_sendmsg (include/linux/file.h:31)
      ? __sys_sendmsg_sock (net/socket.c:2572)
      ? rseq_get_rseq_cs (kernel/rseq.c:275)
      ? lockdep_hardirqs_on_prepare.part.0 (kernel/locking/lockdep.c:4263)
      do_syscall_64 (arch/x86/entry/common.c:50)
      entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
      RIP: 0033:0x7f0d1aadeb17
      Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e
      fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00
      f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
      
      Fixes: fccca038 ("veth: take into account device reconfiguration for xdp_features flag")
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Link: https://lore.kernel.org/netdev/cover.1678364612.git.lorenzo@kernel.org/T/#me4c9d8e985ec7ebee981cfdb5bc5ec651ef4035d
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Reported-by: syzbot+c3d0d9c42d59ff644ea6@syzkaller.appspotmail.com
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Link: https://lore.kernel.org/r/dfd6a9a7d85e9113063165e1f47b466b90ad7b8a.1678748579.git.lorenzo@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • nfc: st-nci: Fix use after free bug in ndlc_remove due to race condition · 5000fe6c
      Zheng Wang authored
      This bug influences both st_nci_i2c_remove and st_nci_spi_remove.
      Take st_nci_i2c_remove as an example.
      
      st_nci_i2c_probe calls ndlc_probe, which binds &ndlc->sm_work
      to llt_ndlc_sm_work.
      
      When ndlc_recv or the timeout handler runs, it eventually calls
      schedule_work to start the work.
      
      When we call st_nci_i2c_remove to remove the driver, there
      may be a sequence as follows:
      
      CPU0                  CPU1
      
                          |llt_ndlc_sm_work
      st_nci_i2c_remove   |
        ndlc_remove       |
           st_nci_remove  |
           nci_free_device|
           kfree(ndev)    |
      //free ndlc->ndev   |
                          |llt_ndlc_rcv_queue
                          |nci_recv_frame
                          |//use ndlc->ndev
      
      Fix it by finishing the work before cleanup in ndlc_remove.
      
      Fixes: 35630df6 ("NFC: st21nfcb: Add driver for STMicroelectronics ST21NFCB NFC chip")
      Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
      Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Link: https://lore.kernel.org/r/20230312160837.2040857-1-zyytlz.wz@163.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'tcp-fix-bind-regression-for-dual-stack-wildcard-address' · cf18d55e
      Jakub Kicinski authored
      Kuniyuki Iwashima says:
      
      ====================
      tcp: Fix bind() regression for dual-stack wildcard address.
      
      The first patch fixes the regression reported in [0], and the second
      patch adds a test for similar cases to catch future regression.
      
      [0]: https://lore.kernel.org/netdev/e21bf153-80b0-9ec0-15ba-e04a4ad42c34@redhat.com/
      ====================
      
      Link: https://lore.kernel.org/r/20230312031904.4674-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftest: Add test for bind() conflicts. · 13715acf
      Kuniyuki Iwashima authored
      The test checks whether each (IPv4, IPv6) address pair conflicts as expected.
      
        * IPv4
          * 0.0.0.0
          * 127.0.0.1
      
        * IPv6
          * ::
          * ::1
      
      If the IPv6 address is [::], the second bind() always fails.
      
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: Fix bind() conflict check for dual-stack wildcard address. · d9ba9934
      Kuniyuki Iwashima authored
      Paul Holzinger reported [0] that commit 5456262d ("net: Fix
      incorrect address comparison when searching for a bind2 bucket")
      introduced a bind() regression.  Paul also gave a nice repro that
      calls two types of bind() on the same port, both of which now
      succeed, but the second call should fail:
      
        bind(fd1, ::, port) + bind(fd2, 127.0.0.1, port)
      
      The cited commit added address family tests in three functions to
      fix the uninit-value KMSAN report. [1]  However, the test added to
      inet_bind2_bucket_match_addr_any() removed a necessary conflict
      check; the dual-stack wildcard address no longer conflicts with
      an IPv4 non-wildcard address.
      
      If tb->family is AF_INET6 and sk->sk_family is AF_INET in
      inet_bind2_bucket_match_addr_any(), we still need to check
      if tb has the dual-stack wildcard address.
      
      Note that the IPv4 wildcard address does not conflict with
      IPv6 non-wildcard addresses.
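      
      The family rules above can be sketched as a small userspace model. This
      is not the kernel function: struct bucket_model and
      bucket_matches_addr_any() are hypothetical names, and the model ignores
      port, netns, l3mdev, IPV6_V6ONLY and v4-mapped addresses, all of which
      are assumed to be handled elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <arpa/inet.h>    /* htonl */
#include <netinet/in.h>   /* struct in6_addr, INADDR_ANY, IN6_IS_ADDR_UNSPECIFIED */
#include <sys/socket.h>   /* AF_INET, AF_INET6 */

/* Hypothetical, simplified stand-in for a bind2 bucket. */
struct bucket_model {
    int family;            /* AF_INET or AF_INET6 */
    struct in6_addr v6;    /* meaningful when family == AF_INET6 */
    unsigned int v4;       /* meaningful when family == AF_INET, network order */
};

/*
 * Returns true when the bucket holds a wildcard address that conflicts
 * with a socket of family sk_family binding a non-wildcard address.
 */
static bool bucket_matches_addr_any(const struct bucket_model *tb, int sk_family)
{
    assert(sk_family == AF_INET || sk_family == AF_INET6);

    if (sk_family != tb->family) {
        /* An IPv4 socket still conflicts with the dual-stack wildcard [::]. */
        if (sk_family == AF_INET)
            return IN6_IS_ADDR_UNSPECIFIED(&tb->v6);
        /* The IPv4 wildcard does not conflict with an IPv6 non-wildcard address. */
        return false;
    }

    if (sk_family == AF_INET6)
        return IN6_IS_ADDR_UNSPECIFIED(&tb->v6);
    return tb->v4 == htonl(INADDR_ANY);
}
```

      In this model, a bucket bound to [::] matches a later IPv4 bind on the
      same port, which corresponds to the bind(fd1, ::, port) +
      bind(fd2, 127.0.0.1, port) case where the second call should fail.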
      
      [0]: https://lore.kernel.org/netdev/e21bf153-80b0-9ec0-15ba-e04a4ad42c34@redhat.com/
      [1]: https://lore.kernel.org/netdev/CAG_fn=Ud3zSW7AZWXc+asfMhZVL5ETnvuY44Pmyv4NPv-ijN-A@mail.gmail.com/
      
      Fixes: 5456262d ("net: Fix incorrect address comparison when searching for a bind2 bucket")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reported-by: Paul Holzinger <pholzing@redhat.com>
      Link: https://lore.kernel.org/netdev/CAG_fn=Ud3zSW7AZWXc+asfMhZVL5ETnvuY44Pmyv4NPv-ijN-A@mail.gmail.com/
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Paul Holzinger <pholzing@redhat.com>
      Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: phy: smsc: bail out in lan87xx_read_status if genphy_read_status fails · c22c3bbf
      Heiner Kallweit authored
      If genphy_read_status fails, further access to the PHY may result
      in unpredictable behavior. To prevent this, bail out immediately if
      genphy_read_status fails.
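      
      A minimal userspace sketch of the early-return pattern follows. It does
      not use the real phylib API: struct phy_model, generic_read_status() and
      read_status() are hypothetical stand-ins, and the extra_accesses counter
      only represents the PHY accesses that must be skipped on failure.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a PHY; not the kernel's struct phy_device. */
struct phy_model {
    int fail_read;        /* non-zero makes the generic status read fail */
    int extra_accesses;   /* counts driver-specific accesses after the read */
};

/* Stand-in for the generic status read: 0 on success, negative errno on failure. */
static int generic_read_status(const struct phy_model *phy)
{
    return phy->fail_read ? -EIO : 0;
}

/* Driver-specific status read that bails out before touching the PHY again. */
static int read_status(struct phy_model *phy)
{
    int err = generic_read_status(phy);

    if (err)
        return err;           /* do not access the PHY any further */

    phy->extra_accesses++;    /* driver-specific handling would go here */
    return 0;
}
```

      With fail_read set, read_status() returns the error and extra_accesses
      stays unchanged.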
      
      Fixes: 4223dbff ("net: phy: smsc: Re-enable EDPD mode for LAN87xx")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/026aa4f2-36f5-1c10-ab9f-cdb17dda6ac4@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: tunnels: annotate lockless accesses to dev->needed_headroom · 4b397c06
      Eric Dumazet authored
      IP tunnels can apparently update dev->needed_headroom
      in their xmit path.
      
      This patch takes care of three tunnel xmit paths, and also the
      core LL_RESERVED_SPACE() and LL_RESERVED_SPACE_EXTRA()
      helpers.
      
      More changes might be needed for completeness.
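      
      As an illustration of what such an annotation looks like, here is a
      userspace sketch. The READ_ONCE()/WRITE_ONCE() macros below are
      simplified volatile-based stand-ins, not the kernel definitions (they
      rely on the GCC/Clang __typeof__ extension), struct netdev_model is
      hypothetical, and ll_reserved_space() omits the alignment rounding that
      the real helper applies.

```c
#include <assert.h>

/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical model of the one field of interest. */
struct netdev_model {
    unsigned short needed_headroom;
};

/* xmit path: raise the headroom without a lock if this packet needs more. */
static void update_headroom(struct netdev_model *dev, unsigned int max_headroom)
{
    if (max_headroom > READ_ONCE(dev->needed_headroom))
        WRITE_ONCE(dev->needed_headroom, max_headroom);
}

/* Reader side: a single annotated load of a value that may change concurrently. */
static unsigned int ll_reserved_space(const struct netdev_model *dev,
                                      unsigned int hard_header_len)
{
    return hard_header_len + READ_ONCE(dev->needed_headroom);
}
```

      The annotations do not add locking. They mark the accesses as
      intentionally lockless, so the compiler performs each one as a single
      load or store and KCSAN stops reporting the pair as a data race.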
      
      BUG: KCSAN: data-race in ip_tunnel_xmit / ip_tunnel_xmit
      
      read to 0xffff88815b9da0ec of 2 bytes by task 888 on cpu 1:
      ip_tunnel_xmit+0x1270/0x1730 net/ipv4/ip_tunnel.c:803
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip_finish_output2+0x740/0x840 net/ipv4/ip_output.c:228
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:316
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:430
      dst_output include/net/dst.h:444 [inline]
      ip_local_out+0x64/0x80 net/ipv4/ip_output.c:126
      iptunnel_xmit+0x34a/0x4b0 net/ipv4/ip_tunnel_core.c:82
      ip_tunnel_xmit+0x1451/0x1730 net/ipv4/ip_tunnel.c:813
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      
      write to 0xffff88815b9da0ec of 2 bytes by task 2379 on cpu 0:
      ip_tunnel_xmit+0x1294/0x1730 net/ipv4/ip_tunnel.c:804
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:661
      __netdev_start_xmit include/linux/netdevice.h:4881 [inline]
      netdev_start_xmit include/linux/netdevice.h:4895 [inline]
      xmit_one net/core/dev.c:3580 [inline]
      dev_hard_start_xmit+0x127/0x400 net/core/dev.c:3596
      __dev_queue_xmit+0x1007/0x1eb0 net/core/dev.c:4246
      dev_queue_xmit include/linux/netdevice.h:3051 [inline]
      neigh_direct_output+0x17/0x20 net/core/neighbour.c:1623
      neigh_output include/net/neighbour.h:546 [inline]
      ip6_finish_output2+0x9bc/0xc50 net/ipv6/ip6_output.c:134
      __ip6_finish_output net/ipv6/ip6_output.c:195 [inline]
      ip6_finish_output+0x39a/0x4e0 net/ipv6/ip6_output.c:206
      NF_HOOK_COND include/linux/netfilter.h:291 [inline]
      ip6_output+0xeb/0x220 net/ipv6/ip6_output.c:227
      dst_output include/net/dst.h:444 [inline]
      NF_HOOK include/linux/netfilter.h:302 [inline]
      mld_sendpack+0x438/0x6a0 net/ipv6/mcast.c:1820
      mld_send_cr net/ipv6/mcast.c:2121 [inline]
      mld_ifc_work+0x519/0x7b0 net/ipv6/mcast.c:2653
      process_one_work+0x3e6/0x750 kernel/workqueue.c:2390
      worker_thread+0x5f2/0xa10 kernel/workqueue.c:2537
      kthread+0x1ac/0x1e0 kernel/kthread.c:376
      ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
      
      value changed: 0x0dd4 -> 0x0e14
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 2379 Comm: kworker/0:0 Not tainted 6.3.0-rc1-syzkaller-00002-g8ca09d5f-dirty #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
      Workqueue: mld mld_ifc_work
      
      Fixes: 8eb30be0 ("ipv6: Create ip6_tnl_xmit")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20230310191109.2384387-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ice: avoid bonding causing auxiliary plug/unplug under RTNL lock · 248401cb
      Dave Ertman authored
      RDMA is not supported in ice on a PF that has been added to a bonded
      interface. To enforce this, when an interface enters a bond, we unplug
      the auxiliary device that supports RDMA functionality.  This unplug
      currently happens in the context of handling the netdev bonding event.
      This event is sent to the ice driver under RTNL context.  This is causing
      a deadlock where the RDMA driver is waiting for the RTNL lock to complete
      the removal.
      
      Defer the unplugging/re-plugging of the auxiliary device to the service
      task so that it is not performed under the RTNL lock context.
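      
      The deferral can be pictured with a small single-threaded model. All
      names here (struct pf_model, on_bond_join(), service_task()) are
      hypothetical and do not come from the ice driver; the rtnl_held flag
      only models which context a function is assumed to run in.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of a PF. */
struct pf_model {
    bool rtnl_held;        /* true while a netdev event handler runs under RTNL */
    bool unplug_pending;   /* request recorded by the bonding event handler */
    bool aux_plugged;      /* whether the RDMA auxiliary device is plugged */
};

/* Bonding event handler: runs under RTNL, so it only records the request. */
static void on_bond_join(struct pf_model *pf)
{
    assert(pf->rtnl_held);
    pf->unplug_pending = true;
}

/* Service task: runs later, outside RTNL, and performs the deferred unplug. */
static void service_task(struct pf_model *pf)
{
    assert(!pf->rtnl_held);
    if (pf->unplug_pending) {
        pf->unplug_pending = false;
        pf->aux_plugged = false;   /* the actual unplug would happen here */
    }
}
```

      Because the unplug happens only in service_task(), a removal path that
      itself needs the RTNL lock is never entered while the event handler
      still holds it.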
      
      Cc: stable@vger.kernel.org # 6.1.x
      Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
      Link: https://lore.kernel.org/netdev/CAK8fFZ6A_Gphw_3-QMGKEFQk=sfCw1Qmq0TVZK3rtAi7vb621A@mail.gmail.com/
      Fixes: 5cb1ebdb ("ice: Fix race condition during interface enslave")
      Fixes: 4eace75e ("RDMA/irdma: Report the correct link speed")
      Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
      Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Link: https://lore.kernel.org/r/20230310194833.3074601-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 14 Mar, 2023 4 commits
  3. 13 Mar, 2023 3 commits
  4. 11 Mar, 2023 4 commits