1. 20 Jan, 2023 10 commits
    • bnxt: Do not read past the end of test names · d3e599c0
      Kees Cook authored
      Test names were being concatenated based on an offset beyond the end of
      the first name, which tripped the buffer overflow detection logic:
      
       detected buffer overflow in strnlen
       [...]
       Call Trace:
       bnxt_ethtool_init.cold+0x18/0x18
      
      Refactor struct hwrm_selftest_qlist_output to use an actual array,
      and adjust the concatenation to use snprintf() rather than a series of
      strncat() calls.
      Reported-by: Niklas Cassel <Niklas.Cassel@wdc.com>
      Link: https://lore.kernel.org/lkml/Y8F%2F1w1AZTvLglFX@x1-carbon/
      Tested-by: Niklas Cassel <Niklas.Cassel@wdc.com>
      Fixes: eb513658 ("bnxt_en: Add basic ethtool -t selftest support.")
      Cc: Michael Chan <michael.chan@broadcom.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: netdev@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
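The snprintf() approach described in the bnxt commit above can be illustrated with a minimal user-space sketch. The structure layout, the buffer sizes, and the helper name `format_test_name` are assumptions made for illustration; they are not the driver's actual definitions.

```c
#include <stdio.h>
#include <string.h>

#define TEST_NAME_LEN 32   /* hypothetical fixed width of each firmware name */
#define OUT_NAME_LEN  64   /* hypothetical size of the destination buffer */

/* Hypothetical stand-in for a firmware reply that stores the names as a
 * real 2-D array instead of a run of separately named fixed-size fields. */
struct selftest_names {
	char test_name[8][TEST_NAME_LEN];
};

/* Build "<name> test (offline)" or "<name> test (online)" in one bounded
 * call. The "%.*s" precision caps the read at TEST_NAME_LEN bytes, so a
 * name that fills its slot without a terminating NUL is not read past
 * the end of that slot. */
static int format_test_name(char *dst, size_t dst_len,
			    const struct selftest_names *fw, int idx,
			    int offline)
{
	return snprintf(dst, dst_len, "%.*s test (%s)",
			TEST_NAME_LEN, fw->test_name[idx],
			offline ? "offline" : "online");
}
```

Compared with a chain of strncat() calls, a single snprintf() keeps the destination size in one place and always terminates the output.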
    • net: stmmac: enable all safety features by default · fdfc76a1
      Andrew Halaney authored
      In the original implementation of dwmac5
      commit 8bf993a5 ("net: stmmac: Add support for DWMAC5 and implement Safety Features")
      all safety features were enabled by default.
      
      Later it seems some implementations didn't have support for all the
      features, so in
      commit 5ac712dc ("net: stmmac: enable platform specific safety features")
      the safety_feat_cfg structure was added to the callback and defined for
      some platforms to selectively enable these safety features.
      
      The problem is that only certain platforms were given that software
      support. If the automotive safety package bit is set in the hardware
      features register the safety feature callback is called for the platform,
      and for platforms that didn't get a safety_feat_cfg defined this results
      in the following NULL pointer dereference:
      
      [    7.933303] Call trace:
      [    7.935812]  dwmac5_safety_feat_config+0x20/0x170 [stmmac]
      [    7.941455]  __stmmac_open+0x16c/0x474 [stmmac]
      [    7.946117]  stmmac_open+0x38/0x70 [stmmac]
      [    7.950414]  __dev_open+0x100/0x1dc
      [    7.954006]  __dev_change_flags+0x18c/0x204
      [    7.958297]  dev_change_flags+0x24/0x6c
      [    7.962237]  do_setlink+0x2b8/0xfa4
      [    7.965827]  __rtnl_newlink+0x4ec/0x840
      [    7.969766]  rtnl_newlink+0x50/0x80
      [    7.973353]  rtnetlink_rcv_msg+0x12c/0x374
      [    7.977557]  netlink_rcv_skb+0x5c/0x130
      [    7.981500]  rtnetlink_rcv+0x18/0x2c
      [    7.985172]  netlink_unicast+0x2e8/0x340
      [    7.989197]  netlink_sendmsg+0x1a8/0x420
      [    7.993222]  ____sys_sendmsg+0x218/0x280
      [    7.997249]  ___sys_sendmsg+0xac/0x100
      [    8.001103]  __sys_sendmsg+0x84/0xe0
      [    8.004776]  __arm64_sys_sendmsg+0x24/0x30
      [    8.008983]  invoke_syscall+0x48/0x114
      [    8.012840]  el0_svc_common.constprop.0+0xcc/0xec
      [    8.017665]  do_el0_svc+0x38/0xb0
      [    8.021071]  el0_svc+0x2c/0x84
      [    8.024212]  el0t_64_sync_handler+0xf4/0x120
      [    8.028598]  el0t_64_sync+0x190/0x194
      
      Go back to the original behavior: if the automotive safety package
      is found to be supported in hardware, enable all the features unless
      safety_feat_cfg is passed in saying this particular platform only
      supports a subset of the features.
      
      Fixes: 5ac712dc ("net: stmmac: enable platform specific safety features")
      Reported-by: Ning Cai <ncai@quicinc.com>
      Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
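The fallback described in the stmmac commit above (enable everything unless the platform supplies a restriction) might look roughly like the following sketch. The structure `safety_cfg`, its three flags, and the function `resolve_safety_cfg` are hypothetical simplifications, not the driver's real types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-platform configuration: each flag enables
 * one safety feature. The real structure has more fields. */
struct safety_cfg {
	bool ecc_en;
	bool parity_en;
	bool timeout_en;
};

/* Resolve the effective configuration.
 * asp_supported: the hardware reports the automotive safety package.
 * cfg: optional platform restriction, may be NULL. */
static struct safety_cfg resolve_safety_cfg(bool asp_supported,
					    const struct safety_cfg *cfg)
{
	struct safety_cfg all_on = { true, true, true };
	struct safety_cfg all_off = { false, false, false };

	if (!asp_supported)
		return all_off;
	/* No platform data: fall back to enabling everything instead of
	 * dereferencing a NULL pointer. */
	if (!cfg)
		return all_on;
	return *cfg;
}
```

The point of the sketch is only the NULL check: a platform without its own configuration gets the original "all features" behavior.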
    • Merge branch 'octeontx2-af-CPT' · b4fbf0b2
      David S. Miller authored
      Srujana Challa says:
      
      ====================
      octeontx2-af: Miscellaneous changes for CPT
      
      This patchset consists of miscellaneous changes for CPT.
      - Adds a new mailbox to reset the requested CPT LF.
      - Modifies the FLR sequence as suggested by the HW team.
      - Adds support to recover CPT engines when they fault.
      - Updates CPT inbound inline IPsec configuration mailbox,
        as per new generation of the OcteonTX2 chips.
      - Adds a new mailbox to return CPT FLT Interrupt info.
      
      ---
      v2:
      - Addressed a review comment.
      v1:
      - Dropped patch "octeontx2-af: Fix interrupt name strings completely"
        to submit to net.
      ---
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: add mbox to return CPT_AF_FLT_INT info · 8299ffe3
      Srujana Challa authored
      CPT HW triggers the CPT AF FLT interrupt when CPT engines hit
      uncorrectable errors; the AF receives the interrupt and recovers
      the engines.
      This patch adds a mailbox for CPT VFs to request information about
      faulted and recovered CPT engines.
      Signed-off-by: Srujana Challa <schalla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: update cpt lf alloc mailbox · c0688ec0
      Srujana Challa authored
      The CN10K CPT coprocessor contains a context processor
      to accelerate updates to the IPsec security association
      contexts. The context processor contains a context cache.
      This patch updates the CPT LF ALLOC mailbox to configure the ctx_ilen
      requested by VFs. CPT_LF_ALLOC:ctx_ilen is the size of the
      initial context fetch.
      Signed-off-by: Srujana Challa <schalla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: restore rxc conf after teardown sequence · d5b2e0a2
      Nithin Dabilpuram authored
      The CN10K CPT coprocessor includes a component named RXC, which
      is responsible for reassembly of inner IP packets. RXC can
      evict the oldest entries based on age/threshold.
      At teardown, the age/threshold is set to minimum values to evict
      all entries.
      This patch adds code to restore the timeout and threshold config
      after the teardown sequence is complete, as it is a global config.
      Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: optimize cpt pf identification · 9adb04ff
      Srujana Challa authored
      Optimize CPT PF identification in mbox handling for faster
      mbox response by doing it at AF driver probe instead of
      every mbox message.
      Signed-off-by: Srujana Challa <schalla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
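The optimization in the commit above, doing the CPT PF lookup once at probe time rather than on every mailbox message, can be sketched as follows. All names here (`af_dev`, `scan_for_cpt_pf`, `af_probe`, `is_cpt_pf`) and the placeholder PF number are hypothetical and only illustrate the caching idea.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical AF driver state; field names are illustrative only. */
struct af_dev {
	uint16_t cpt_pf_num;   /* cached at probe time */
	bool cpt_pf_valid;
	unsigned int scans;    /* counts slow lookups, for demonstration */
};

/* Stand-in for the slow lookup that previously ran on every message. */
static uint16_t scan_for_cpt_pf(struct af_dev *af)
{
	af->scans++;
	return 3; /* arbitrary placeholder PF number */
}

/* Probe: perform the lookup once and cache the result. */
static void af_probe(struct af_dev *af)
{
	af->cpt_pf_num = scan_for_cpt_pf(af);
	af->cpt_pf_valid = true;
}

/* Mailbox handler path: a cheap comparison against the cached value. */
static bool is_cpt_pf(const struct af_dev *af, uint16_t pf_num)
{
	return af->cpt_pf_valid && af->cpt_pf_num == pf_num;
}
```

However many messages arrive, the slow lookup runs a single time, which is where the faster mailbox response would come from.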
    • octeontx2-af: modify FLR sequence for CPT · 1286c50a
      Srujana Challa authored
      On OcteonTX2 platform CPT instruction enqueue is only
      possible via LMTST operations.
      The existing FLR sequence mentioned in HRM requires
      a dummy LMTST to CPT but LMTST can't be submitted from
      AF driver. So, HW team provided a new sequence to avoid
      dummy LMTST. This patch adds code for the same.
      Signed-off-by: Srujana Challa <schalla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: add mbox for CPT LF reset · f58cf765
      Srujana Challa authored
      On the OcteonTX2 SoC, the admin function (AF) is the only one with all
      privileges to configure HW and allocate resources; PFs and their VFs
      have to request the AF via mailbox for all their needs.
      This patch adds a new mailbox for CPT VFs to request a CPT LF
      reset.
      Signed-off-by: Srujana Challa <schalla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: recover CPT engine when it gets fault · 07ea567d
      Srujana Challa authored
      When CPT engine has uncorrectable errors, it will get halted and
      must be disabled and re-enabled. This patch adds code for the same.
      Signed-off-by: Srujana Challa <schalla@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 19 Jan, 2023 10 commits
  3. 18 Jan, 2023 18 commits
    • l2tp: prevent lockdep issue in l2tp_tunnel_register() · b9fb10d1
      Eric Dumazet authored
      lockdep complains with the following lock/unlock sequence:
      
           lock_sock(sk);
           write_lock_bh(&sk->sk_callback_lock);
      [1]  release_sock(sk);
      [2]  write_unlock_bh(&sk->sk_callback_lock);
      
      We need to swap [1] and [2] to fix this issue.
      
      Fixes: 0b2c5972 ("l2tp: close all race conditions in l2tp_tunnel_register()")
      Reported-by: syzbot+bbd35b345c7cab0d9a08@syzkaller.appspotmail.com
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/netdev/20230114030137.672706-1-xiyou.wangcong@gmail.com/T/#m1164ff20628671b0f326a24cb106ab3239c70ce3
      Cc: Cong Wang <cong.wang@bytedance.com>
      Cc: Guillaume Nault <gnault@redhat.com>
      Reviewed-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
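The swap of [1] and [2] described in the l2tp commit above amounts to releasing the locks in the reverse order of acquisition. The sketch below only records call order in user space; the enum, the `trace` structure, and `register_tunnel_fixed` are hypothetical, and the kernel helpers appear only in comments.

```c
#include <stddef.h>

/* User-space stand-ins that only record call order. These are NOT the
 * kernel implementations of the helpers named in the comments. */
enum ev { EV_LOCK_SOCK, EV_WRITE_LOCK, EV_WRITE_UNLOCK, EV_RELEASE_SOCK };

struct trace {
	enum ev events[8];
	size_t n;
};

static void record(struct trace *t, enum ev e)
{
	if (t->n < sizeof(t->events) / sizeof(t->events[0]))
		t->events[t->n++] = e;
}

/* Corrected sequence: the inner callback lock is dropped before the outer
 * socket lock is released, i.e. [2] now runs before [1]. */
static void register_tunnel_fixed(struct trace *t)
{
	record(t, EV_LOCK_SOCK);      /* lock_sock(sk) */
	record(t, EV_WRITE_LOCK);     /* write_lock_bh(&sk->sk_callback_lock) */
	/* ... work done while both locks are held ... */
	record(t, EV_WRITE_UNLOCK);   /* write_unlock_bh(&sk->sk_callback_lock) */
	record(t, EV_RELEASE_SOCK);   /* release_sock(sk) */
}
```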
    • virtio-net: correctly enable callback during start_xmit · d71ebe81
      Jason Wang authored
      Commit a7766ef1 ("virtio_net: disable cb aggressively") enables
      virtqueue callback via the following statement:
      
              do {
                      if (use_napi)
                              virtqueue_disable_cb(sq->vq);

                      free_old_xmit_skbs(sq, false);

              } while (use_napi && kick &&
                       unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
      
      When NAPI is used and kick is false, the callback won't be enabled
      here. And when the virtqueue is about to be full, the tx will be
      disabled, but we still don't enable tx interrupt which will cause a TX
      hang. This could be observed when using pktgen with burst enabled.
      
      To be consistent with the logic that tries to disable cb only for
      NAPI, fix this by trying to enable the delayed callback only when NAPI
      is enabled and the queue is about to be full.
      
      Fixes: a7766ef1 ("virtio_net: disable cb aggressively")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Tested-by: Laurent Vivier <lvivier@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: fix PTP TX timestamp failure due to packet padding · 7b90f5a6
      Robert Hancock authored
      PTP TX timestamp handling was observed to be broken with this driver
      when using the raw Layer 2 PTP encapsulation. ptp4l was not receiving
      the expected TX timestamp after transmitting a packet, causing it to
      enter a failure state.
      
      The problem appears to be due to the way that the driver pads packets
      which are smaller than the Ethernet minimum of 60 bytes. If headroom
      space was available in the SKB, this caused the driver to move the data
      back to utilize it. However, this appears to cause other data references
      in the SKB to become inconsistent. In particular, this caused the
      ptp_one_step_sync function to later (in the TX completion path) falsely
      detect the packet as a one-step SYNC packet, even when it was not, which
      caused the TX timestamp to not be processed when it should be.
      
      Using the headroom for this purpose seems like an unnecessary complexity
      as this is not a hot path in the driver, and in most cases it appears
      that there is sufficient tailroom to not require using the headroom
      anyway. Remove this usage of headroom to prevent this inconsistency from
      occurring and causing other problems.
      
      Fixes: 653e92a9 ("net: macb: add support for padding and fcs computation")
      Signed-off-by: Robert Hancock <robert.hancock@calian.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com> # on SAMA7G5
      Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
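The tail-only padding strategy from the macb commit above can be modeled in user space as below. The `pkt` structure is a heavily simplified, hypothetical stand-in for an SKB, and `pkt_pad_tail` is not the driver's function; the sketch only shows padding without ever moving the payload into headroom.

```c
#include <stdlib.h>
#include <string.h>

#define MIN_FRAME_LEN 60  /* Ethernet minimum quoted in the commit text */

/* Hypothetical, heavily simplified packet buffer (not struct sk_buff). */
struct pkt {
	unsigned char *head;  /* start of allocation */
	size_t data_off;      /* offset of payload start (headroom size) */
	size_t len;           /* payload length */
	size_t cap;           /* total allocation size */
};

static size_t pkt_tailroom(const struct pkt *p)
{
	return p->cap - p->data_off - p->len;
}

/* Pad with zeros at the tail only. data_off never changes, so offsets
 * computed relative to the payload start are left untouched.
 * Returns 0 on success, -1 on allocation failure. */
static int pkt_pad_tail(struct pkt *p)
{
	size_t padlen;

	if (p->len >= MIN_FRAME_LEN)
		return 0;
	padlen = MIN_FRAME_LEN - p->len;

	if (pkt_tailroom(p) < padlen) {
		/* Grow the allocation rather than shifting data into headroom. */
		size_t new_cap = p->data_off + MIN_FRAME_LEN;
		unsigned char *nh = realloc(p->head, new_cap);

		if (!nh)
			return -1;
		p->head = nh;
		p->cap = new_cap;
	}
	memset(p->head + p->data_off + p->len, 0, padlen);
	p->len += padlen;
	return 0;
}
```

Growing the buffer costs a reallocation in the rare case where tailroom is short, which matches the commit's observation that this is not a hot path.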
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 3e134696
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      
      The following patchset contains Netfilter fixes for net:
      
      1) Fix syn-retransmits until initiator gives up when connection is re-used
         due to rst marked as invalid, from Florian Westphal.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mlx5: eliminate anonymous module_init & module_exit · 2c1e1b94
      Randy Dunlap authored
      Eliminate anonymous module_init() and module_exit(), which can lead to
      confusion or ambiguity when reading System.map, crashes/oops/bugs,
      or an initcall_debug log.
      
      Give each of these init and exit functions unique driver-specific
      names to eliminate the anonymous names.
      
      Example 1: (System.map)
       ffffffff832fc78c t init
       ffffffff832fc79e t init
       ffffffff832fc8f8 t init
      
      Example 2: (initcall_debug log)
       calling  init+0x0/0x12 @ 1
       initcall init+0x0/0x12 returned 0 after 15 usecs
       calling  init+0x0/0x60 @ 1
       initcall init+0x0/0x60 returned 0 after 2 usecs
       calling  init+0x0/0x9a @ 1
       initcall init+0x0/0x9a returned 0 after 74 usecs
      
      Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Eli Cohen <eli@mellanox.com>
      Cc: Saeed Mahameed <saeedm@nvidia.com>
      Cc: Leon Romanovsky <leon@kernel.org>
      Cc: linux-rdma@vger.kernel.org
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: E-switch, Fix switchdev mode after devlink reload · 7c83d1f4
      Chris Mi authored
      The cited commit removes eswitch mode none. So after devlink reload
      in switchdev mode, eswitch mode is not changed. But actually eswitch
      is disabled during devlink reload.
      
      Fix it by setting eswitch mode to legacy when disabling eswitch
      which is called by reload_down.
      
      Fixes: f019679e ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode")
      Signed-off-by: Chris Mi <cmi@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Protect global IPsec ASO · e4d38c45
      Leon Romanovsky authored
      ASO operations are global to the whole IPsec as they share one DMA address
      for all operations. As such, all WQE operations need to be protected with
      a lock. In this case, it must be a spinlock to allow mlx5e_ipsec_aso_query()
      to operate in atomic context.
      
      Fixes: 1ed78fc0 ("net/mlx5e: Update IPsec soft and hard limits")
      Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
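The idea in the commit above, guarding one shared buffer with a lock that never sleeps, can be sketched in user space with a C11 `atomic_flag` busy-wait lock. The `shared_aso` structure, the helper names, and the fake "reply" computation are assumptions for illustration and do not mirror the mlx5 code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared state: one scratch buffer reused by every operation,
 * standing in for the single DMA address shared by all ASO operations. */
struct shared_aso {
	atomic_flag lock;     /* busy-wait lock: never sleeps */
	uint32_t scratch[4];  /* shared result area */
};

static void aso_lock(struct shared_aso *aso)
{
	while (atomic_flag_test_and_set_explicit(&aso->lock,
						 memory_order_acquire))
		; /* spin until the current holder clears the flag */
}

static void aso_unlock(struct shared_aso *aso)
{
	atomic_flag_clear_explicit(&aso->lock, memory_order_release);
}

/* Post a (simulated) query and copy the result out while still holding
 * the lock, so a concurrent caller cannot overwrite the shared buffer. */
static uint32_t aso_query(struct shared_aso *aso, uint32_t obj_id)
{
	uint32_t result;

	aso_lock(aso);
	aso->scratch[0] = obj_id * 2u;  /* placeholder for the hardware reply */
	result = aso->scratch[0];
	aso_unlock(aso);
	return result;
}
```

A sleeping lock would not be usable from atomic context, which is why the commit insists on a spinlock.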
    • net/mlx5e: Remove optimization which prevented update of ESN state · 16bccbaa
      Leon Romanovsky authored
      aso->use_cache variable introduced in commit 8c582ddf ("net/mlx5e: Handle
      hardware IPsec limits events") was an optimization to skip recurrent calls
      to mlx5e_ipsec_aso_query(). Such calls are possible when lifetime event is
      generated:
       -> mlx5e_ipsec_handle_event()
        -> mlx5e_ipsec_aso_query() - first call
        -> xfrm_state_check_expire()
         -> mlx5e_xfrm_update_curlft()
          -> mlx5e_ipsec_aso_query() - second call
      
      However, this optimization is not really effective, as mlx5e_ipsec_aso_query()
      needs to be called to update the ESN anyway, which was missed due to a misplaced
      use_cache assignment.
      
      Fixes: cee137a6 ("net/mlx5e: Handle ESN update events")
      Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Set decap action based on attr for sample · ffa99b53
      Chris Mi authored
      Currently decap action is set based on tunnel_id. That means it is
      set unconditionally. But for decap, ct and sample actions, decap is
      done before ct. No need to decap again in sample.
      
      And the actions are set correctly when parsing. So set decap action
      based on attr instead of tunnel_id.
      
      Fixes: 2741f223 ("net/mlx5e: TC, Support sample offload action for tunneled traffic")
      Signed-off-by: Chris Mi <cmi@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: QoS, Fix wrongfully setting parent_element_id on MODIFY_SCHEDULING_ELEMENT · 4ddf77f9
      Maor Dickman authored
      According to HW spec parent_element_id field should be reserved (0x0) when calling
      MODIFY_SCHEDULING_ELEMENT command.
      
      This patch removes the wrong initialization of the reserved field, parent_element_id, in
      mlx5_qos_update_node.
      
      Fixes: 214baf22 ("net/mlx5e: Support HTB offload")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Eli Cohen <elic@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: E-switch, Fix setting of reserved fields on MODIFY_SCHEDULING_ELEMENT · f51471d1
      Maor Dickman authored
      According to HW spec element_type, element_attributes and parent_element_id fields
      should be reserved (0x0) when calling MODIFY_SCHEDULING_ELEMENT command.
      
      This patch removes the initialization of these fields when calling the command.
      
      Fixes: bd77bf1c ("net/mlx5: Add SRIOV VF max rate configuration support")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Eli Cohen <elic@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Remove redundant xsk pointer check in mlx5e_mpwrq_validate_xsk · 6624bfee
      Adham Faris authored
      This validation function is relevant only for XSK cases, hence it
      assumes it is called only with xsk != NULL.
      Thus checking for invalid xsk pointer is redundant and misleads static
      code analyzers.
      This commit removes redundant xsk pointer check.
      
      This solves the following smatch warning:
      drivers/net/ethernet/mellanox/mlx5/core/en/params.c:481
      mlx5e_mpwrq_validate_xsk() error: we previously assumed 'xsk' could be
      null (see line 478)
      
      Fixes: 6470d2e7 ("net/mlx5e: xsk: Use KSM for unaligned XSK")
      Signed-off-by: Adham Faris <afaris@nvidia.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Reported-by: Dan Carpenter <error27@gmail.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Avoid false lock dependency warning on tc_ht even more · 5aa56105
      Vlad Buslov authored
      The cited commit changed class of tc_ht internal mutex in order to avoid
      false lock dependency with fs_core node and flow_table hash table
      structures. However, hash table implementation internally also includes a
      workqueue task with its own lockdep map which causes similar bogus lockdep
      splat[0]. Fix it by also adding dedicated class for hash table workqueue
      work structure of tc_ht.
      
      [0]:
      
      [ 1139.672465] ======================================================
      [ 1139.673552] WARNING: possible circular locking dependency detected
      [ 1139.674635] 6.1.0_for_upstream_debug_2022_12_12_17_02 #1 Not tainted
      [ 1139.675734] ------------------------------------------------------
      [ 1139.676801] modprobe/5998 is trying to acquire lock:
      [ 1139.677726] ffff88811e7b93b8 (&node->lock){++++}-{3:3}, at: down_write_ref_node+0x7c/0xe0 [mlx5_core]
      [ 1139.679662]
                     but task is already holding lock:
      [ 1139.680703] ffff88813c1f96a0 (&tc_ht_lock_key){+.+.}-{3:3}, at: rhashtable_free_and_destroy+0x38/0x6f0
      [ 1139.682223]
                     which lock already depends on the new lock.
      
      [ 1139.683640]
                     the existing dependency chain (in reverse order) is:
      [ 1139.684887]
                     -> #2 (&tc_ht_lock_key){+.+.}-{3:3}:
      [ 1139.685975]        __mutex_lock+0x12c/0x14b0
      [ 1139.686659]        rht_deferred_worker+0x35/0x1540
      [ 1139.687405]        process_one_work+0x7c2/0x1310
      [ 1139.688134]        worker_thread+0x59d/0xec0
      [ 1139.688820]        kthread+0x28f/0x330
      [ 1139.689444]        ret_from_fork+0x1f/0x30
      [ 1139.690106]
                     -> #1 ((work_completion)(&ht->run_work)){+.+.}-{0:0}:
      [ 1139.691250]        __flush_work+0xe8/0x900
      [ 1139.691915]        __cancel_work_timer+0x2ca/0x3f0
      [ 1139.692655]        rhashtable_free_and_destroy+0x22/0x6f0
      [ 1139.693472]        del_sw_flow_table+0x22/0xb0 [mlx5_core]
      [ 1139.694592]        tree_put_node+0x24c/0x450 [mlx5_core]
      [ 1139.695686]        tree_remove_node+0x6e/0x100 [mlx5_core]
      [ 1139.696803]        mlx5_destroy_flow_table+0x187/0x690 [mlx5_core]
      [ 1139.698017]        mlx5e_tc_nic_cleanup+0x2f8/0x400 [mlx5_core]
      [ 1139.699217]        mlx5e_cleanup_nic_rx+0x2b/0x210 [mlx5_core]
      [ 1139.700397]        mlx5e_detach_netdev+0x19d/0x2b0 [mlx5_core]
      [ 1139.701571]        mlx5e_suspend+0xdb/0x140 [mlx5_core]
      [ 1139.702665]        mlx5e_remove+0x89/0x190 [mlx5_core]
      [ 1139.703756]        auxiliary_bus_remove+0x52/0x70
      [ 1139.704492]        device_release_driver_internal+0x3c1/0x600
      [ 1139.705360]        bus_remove_device+0x2a5/0x560
      [ 1139.706080]        device_del+0x492/0xb80
      [ 1139.706724]        mlx5_rescan_drivers_locked+0x194/0x6a0 [mlx5_core]
      [ 1139.707961]        mlx5_unregister_device+0x7a/0xa0 [mlx5_core]
      [ 1139.709138]        mlx5_uninit_one+0x5f/0x160 [mlx5_core]
      [ 1139.710252]        remove_one+0xd1/0x160 [mlx5_core]
      [ 1139.711297]        pci_device_remove+0x96/0x1c0
      [ 1139.722721]        device_release_driver_internal+0x3c1/0x600
      [ 1139.723590]        unbind_store+0x1b1/0x200
      [ 1139.724259]        kernfs_fop_write_iter+0x348/0x520
      [ 1139.725019]        vfs_write+0x7b2/0xbf0
      [ 1139.725658]        ksys_write+0xf3/0x1d0
      [ 1139.726292]        do_syscall_64+0x3d/0x90
      [ 1139.726942]        entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [ 1139.727769]
                     -> #0 (&node->lock){++++}-{3:3}:
      [ 1139.728698]        __lock_acquire+0x2cf5/0x62f0
      [ 1139.729415]        lock_acquire+0x1c1/0x540
      [ 1139.730076]        down_write+0x8e/0x1f0
      [ 1139.730709]        down_write_ref_node+0x7c/0xe0 [mlx5_core]
      [ 1139.731841]        mlx5_del_flow_rules+0x6f/0x610 [mlx5_core]
      [ 1139.732982]        __mlx5_eswitch_del_rule+0xdd/0x560 [mlx5_core]
      [ 1139.734207]        mlx5_eswitch_del_offloaded_rule+0x14/0x20 [mlx5_core]
      [ 1139.735491]        mlx5e_tc_rule_unoffload+0x104/0x2b0 [mlx5_core]
      [ 1139.736716]        mlx5e_tc_unoffload_fdb_rules+0x10c/0x1f0 [mlx5_core]
      [ 1139.738007]        mlx5e_tc_del_fdb_flow+0xc3c/0xfa0 [mlx5_core]
      [ 1139.739213]        mlx5e_tc_del_flow+0x146/0xa20 [mlx5_core]
      [ 1139.740377]        _mlx5e_tc_del_flow+0x38/0x60 [mlx5_core]
      [ 1139.741534]        rhashtable_free_and_destroy+0x3be/0x6f0
      [ 1139.742351]        mlx5e_tc_ht_cleanup+0x1b/0x30 [mlx5_core]
      [ 1139.743512]        mlx5e_cleanup_rep_tx+0x4a/0xe0 [mlx5_core]
      [ 1139.744683]        mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core]
      [ 1139.745860]        mlx5e_netdev_change_profile+0xd9/0x1c0 [mlx5_core]
      [ 1139.747098]        mlx5e_netdev_attach_nic_profile+0x1b/0x30 [mlx5_core]
      [ 1139.748372]        mlx5e_vport_rep_unload+0x16a/0x1b0 [mlx5_core]
      [ 1139.749590]        __esw_offloads_unload_rep+0xb1/0xd0 [mlx5_core]
      [ 1139.750813]        mlx5_eswitch_unregister_vport_reps+0x409/0x5f0 [mlx5_core]
      [ 1139.752147]        mlx5e_rep_remove+0x62/0x80 [mlx5_core]
      [ 1139.753293]        auxiliary_bus_remove+0x52/0x70
      [ 1139.754028]        device_release_driver_internal+0x3c1/0x600
      [ 1139.754885]        driver_detach+0xc1/0x180
      [ 1139.755553]        bus_remove_driver+0xef/0x2e0
      [ 1139.756260]        auxiliary_driver_unregister+0x16/0x50
      [ 1139.757059]        mlx5e_rep_cleanup+0x19/0x30 [mlx5_core]
      [ 1139.758207]        mlx5e_cleanup+0x12/0x30 [mlx5_core]
      [ 1139.759295]        mlx5_cleanup+0xc/0x49 [mlx5_core]
      [ 1139.760384]        __x64_sys_delete_module+0x2b5/0x450
      [ 1139.761166]        do_syscall_64+0x3d/0x90
      [ 1139.761827]        entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [ 1139.762663]
                     other info that might help us debug this:
      
      [ 1139.763925] Chain exists of:
                       &node->lock --> (work_completion)(&ht->run_work) --> &tc_ht_lock_key
      
      [ 1139.765743]  Possible unsafe locking scenario:
      
      [ 1139.766688]        CPU0                    CPU1
      [ 1139.767399]        ----                    ----
      [ 1139.768111]   lock(&tc_ht_lock_key);
      [ 1139.768704]                                lock((work_completion)(&ht->run_work));
      [ 1139.769869]                                lock(&tc_ht_lock_key);
      [ 1139.770770]   lock(&node->lock);
      [ 1139.771326]
                      *** DEADLOCK ***
      
      [ 1139.772345] 2 locks held by modprobe/5998:
      [ 1139.772994]  #0: ffff88813c1ff0e8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x8d/0x600
      [ 1139.774399]  #1: ffff88813c1f96a0 (&tc_ht_lock_key){+.+.}-{3:3}, at: rhashtable_free_and_destroy+0x38/0x6f0
      [ 1139.775822]
                     stack backtrace:
      [ 1139.776579] CPU: 3 PID: 5998 Comm: modprobe Not tainted 6.1.0_for_upstream_debug_2022_12_12_17_02 #1
      [ 1139.777935] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      [ 1139.779529] Call Trace:
      [ 1139.779992]  <TASK>
      [ 1139.780409]  dump_stack_lvl+0x57/0x7d
      [ 1139.781015]  check_noncircular+0x278/0x300
      [ 1139.781687]  ? print_circular_bug+0x460/0x460
      [ 1139.782381]  ? rcu_read_lock_sched_held+0x3f/0x70
      [ 1139.783121]  ? lock_release+0x487/0x7c0
      [ 1139.783759]  ? orc_find.part.0+0x1f1/0x330
      [ 1139.784423]  ? mark_lock.part.0+0xef/0x2fc0
      [ 1139.785091]  __lock_acquire+0x2cf5/0x62f0
      [ 1139.785754]  ? register_lock_class+0x18e0/0x18e0
      [ 1139.786483]  lock_acquire+0x1c1/0x540
      [ 1139.787093]  ? down_write_ref_node+0x7c/0xe0 [mlx5_core]
      [ 1139.788195]  ? lockdep_hardirqs_on_prepare+0x3f0/0x3f0
      [ 1139.788978]  ? register_lock_class+0x18e0/0x18e0
      [ 1139.789715]  down_write+0x8e/0x1f0
      [ 1139.790292]  ? down_write_ref_node+0x7c/0xe0 [mlx5_core]
      [ 1139.791380]  ? down_write_killable+0x220/0x220
      [ 1139.792080]  ? find_held_lock+0x2d/0x110
      [ 1139.792713]  down_write_ref_node+0x7c/0xe0 [mlx5_core]
      [ 1139.793795]  mlx5_del_flow_rules+0x6f/0x610 [mlx5_core]
      [ 1139.794879]  __mlx5_eswitch_del_rule+0xdd/0x560 [mlx5_core]
      [ 1139.796032]  ? __esw_offloads_unload_rep+0xd0/0xd0 [mlx5_core]
      [ 1139.797227]  ? xa_load+0x11a/0x200
      [ 1139.797800]  ? __xa_clear_mark+0xf0/0xf0
      [ 1139.798438]  mlx5_eswitch_del_offloaded_rule+0x14/0x20 [mlx5_core]
      [ 1139.799660]  mlx5e_tc_rule_unoffload+0x104/0x2b0 [mlx5_core]
      [ 1139.800821]  mlx5e_tc_unoffload_fdb_rules+0x10c/0x1f0 [mlx5_core]
      [ 1139.802049]  ? mlx5_eswitch_get_uplink_priv+0x25/0x80 [mlx5_core]
      [ 1139.803260]  mlx5e_tc_del_fdb_flow+0xc3c/0xfa0 [mlx5_core]
      [ 1139.804398]  ? __cancel_work_timer+0x1c2/0x3f0
      [ 1139.805099]  ? mlx5e_tc_unoffload_from_slow_path+0x460/0x460 [mlx5_core]
      [ 1139.806387]  mlx5e_tc_del_flow+0x146/0xa20 [mlx5_core]
      [ 1139.807481]  _mlx5e_tc_del_flow+0x38/0x60 [mlx5_core]
      [ 1139.808564]  rhashtable_free_and_destroy+0x3be/0x6f0
      [ 1139.809336]  ? mlx5e_tc_del_flow+0xa20/0xa20 [mlx5_core]
      [ 1139.809336]  ? mlx5e_tc_del_flow+0xa20/0xa20 [mlx5_core]
      [ 1139.810455]  mlx5e_tc_ht_cleanup+0x1b/0x30 [mlx5_core]
      [ 1139.811552]  mlx5e_cleanup_rep_tx+0x4a/0xe0 [mlx5_core]
      [ 1139.812655]  mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core]
      [ 1139.813768]  mlx5e_netdev_change_profile+0xd9/0x1c0 [mlx5_core]
      [ 1139.814952]  mlx5e_netdev_attach_nic_profile+0x1b/0x30 [mlx5_core]
      [ 1139.816166]  mlx5e_vport_rep_unload+0x16a/0x1b0 [mlx5_core]
      [ 1139.817336]  __esw_offloads_unload_rep+0xb1/0xd0 [mlx5_core]
      [ 1139.818507]  mlx5_eswitch_unregister_vport_reps+0x409/0x5f0 [mlx5_core]
      [ 1139.819788]  ? mlx5_eswitch_uplink_get_proto_dev+0x30/0x30 [mlx5_core]
      [ 1139.821051]  ? kernfs_find_ns+0x137/0x310
      [ 1139.821705]  mlx5e_rep_remove+0x62/0x80 [mlx5_core]
      [ 1139.822778]  auxiliary_bus_remove+0x52/0x70
      [ 1139.823449]  device_release_driver_internal+0x3c1/0x600
      [ 1139.824240]  driver_detach+0xc1/0x180
      [ 1139.824842]  bus_remove_driver+0xef/0x2e0
      [ 1139.825504]  auxiliary_driver_unregister+0x16/0x50
      [ 1139.826245]  mlx5e_rep_cleanup+0x19/0x30 [mlx5_core]
      [ 1139.827322]  mlx5e_cleanup+0x12/0x30 [mlx5_core]
      [ 1139.828345]  mlx5_cleanup+0xc/0x49 [mlx5_core]
      [ 1139.829382]  __x64_sys_delete_module+0x2b5/0x450
      [ 1139.830119]  ? module_flags+0x300/0x300
      [ 1139.830750]  ? task_work_func_match+0x50/0x50
      [ 1139.831440]  ? task_work_cancel+0x20/0x20
      [ 1139.832088]  ? lockdep_hardirqs_on_prepare+0x273/0x3f0
      [ 1139.832873]  ? syscall_enter_from_user_mode+0x1d/0x50
      [ 1139.833661]  ? trace_hardirqs_on+0x2d/0x100
      [ 1139.834328]  do_syscall_64+0x3d/0x90
      [ 1139.834922]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [ 1139.835700] RIP: 0033:0x7f153e71288b
      [ 1139.836302] Code: 73 01 c3 48 8b 0d 9d 75 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6d 75 0e 00 f7 d8 64 89 01 48
      [ 1139.838866] RSP: 002b:00007ffe0a3ed938 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
      [ 1139.840020] RAX: ffffffffffffffda RBX: 0000564c2cbf8220 RCX: 00007f153e71288b
      [ 1139.841043] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000564c2cbf8288
      [ 1139.842072] RBP: 0000564c2cbf8220 R08: 0000000000000000 R09: 0000000000000000
      [ 1139.843094] R10: 00007f153e7a3ac0 R11: 0000000000000206 R12: 0000564c2cbf8288
      [ 1139.844118] R13: 0000000000000000 R14: 0000564c2cbf7ae8 R15: 00007ffe0a3efcb8
      
      Fixes: 9ba33339 ("net/mlx5e: Avoid false lock depenency warning on tc_ht")
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Eli Cohen <elic@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      5aa56105
    • Yang Yingliang's avatar
      net/mlx5: fix missing mutex_unlock in mlx5_fw_fatal_reporter_err_work() · 90e7cb78
      Yang Yingliang authored
      Add missing mutex_unlock() before returning from
      mlx5_fw_fatal_reporter_err_work().
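
As a rough user-space illustration of this class of bug, the sketch below routes an early exit through a single unlock label so the mutex is released on every path. It uses pthreads instead of kernel mutexes, and `demo_dev`, `in_probe` and `demo_err_work` are hypothetical names invented for the example; it is not the mlx5 code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for device state; not the mlx5 structures. */
struct demo_dev {
	pthread_mutex_t lock;
	bool in_probe;   /* hypothetical "skip recovery" condition */
	int recoveries;  /* counts how often recovery actually ran */
};

/* Returns 0 after running recovery, -EBUSY when recovery is skipped. */
int demo_err_work(struct demo_dev *dev)
{
	int err = 0;

	assert(dev);
	pthread_mutex_lock(&dev->lock);
	if (dev->in_probe) {
		/* Early exit: jump to the unlock instead of returning directly. */
		err = -EBUSY;
		goto unlock;
	}
	dev->recoveries++;
unlock:
	pthread_mutex_unlock(&dev->lock);
	return err;
}
```

After a call that takes the early-exit path, the lock can be acquired again by the caller, which would not be the case if the function returned with the mutex still held.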
      
      Fixes: 9078e843 ("net/mlx5: Avoid recovery in probe flows")
      Reported-by: kernel test robot <lkp@intel.com>
      Reported-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      90e7cb78
    • Jakub Kicinski's avatar
      Merge tag 'for-net-2023-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 010a74f5
      Jakub Kicinski authored
      Luiz Augusto von Dentz says:
      
      ====================
      bluetooth pull request for net:
      
       - Fix a buffer overflow in mgmt_mesh_add
       - Fix use HCI_OP_LE_READ_BUFFER_SIZE_V2
       - Fix hci_qca shutdown on closed serdev
       - Fix possible circular locking dependencies on ISO code
       - Fix possible deadlock in rfcomm_sk_state_change
      
      * tag 'for-net-2023-01-17' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
        Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
        Bluetooth: ISO: Fix possible circular locking dependency
        Bluetooth: hci_event: Fix Invalid wait context
        Bluetooth: ISO: Fix possible circular locking dependency
        Bluetooth: hci_sync: fix memory leak in hci_update_adv_data()
        Bluetooth: hci_qca: Fix driver shutdown on closed serdev
        Bluetooth: hci_conn: Fix memory leaks
        Bluetooth: hci_sync: Fix use HCI_OP_LE_READ_BUFFER_SIZE_V2
        Bluetooth: Fix a buffer overflow in mgmt_mesh_add()
      ====================
      
      Link: https://lore.kernel.org/r/20230118002944.1679845-1-luiz.dentz@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      010a74f5
    • Jakub Kicinski's avatar
      Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 423c1d36
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      bpf 2023-01-16
      
      We've added 6 non-merge commits during the last 8 day(s) which contain
      a total of 6 files changed, 22 insertions(+), 24 deletions(-).
      
      The main changes are:
      
      1) Mitigate a Spectre v4 leak in unprivileged BPF from speculative
         pointer-as-scalar type confusion, from Luis Gerhorst.
      
      2) Fix a splat when pid 1 attaches a BPF program that attempts to
         send killing signal to itself, from Hao Sun.
      
      3) Fix BPF program ID information in BPF_AUDIT_UNLOAD as well as
         PERF_BPF_EVENT_PROG_UNLOAD events, from Paul Moore.
      
      4) Fix BPF verifier warning triggered from invalid kfunc call in
         backtrack_insn, also from Hao Sun.
      
      5) Fix potential deadlock in htab_lock_bucket from same bucket index
         but different map_locked index, from Tonghao Zhang.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        bpf: Fix pointer-leak due to insufficient speculative store bypass mitigation
        bpf: hash map, avoid deadlock with suitable hash mask
        bpf: remove the do_idr_lock parameter from bpf_prog_free_id()
        bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD
        bpf: Skip task with pid=1 in send_signal_common()
        bpf: Skip invalid kfunc call in backtrack_insn
      ====================
      
      Link: https://lore.kernel.org/r/20230116230745.21742-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      423c1d36
    • Shyam Sundar S K's avatar
      MAINTAINERS: Update AMD XGBE driver maintainers · 441717b6
      Shyam Sundar S K authored
      Due to additional responsibilities, Tom is no longer able to
      support the AMD XGBE driver.
      
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
      Link: https://lore.kernel.org/r/20230116085015.443127-1-Shyam-sundar.S-k@amd.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      441717b6
    • Caleb Connolly's avatar
      net: ipa: disable ipa interrupt during suspend · 9ec9b2a3
      Caleb Connolly authored
      The IPA interrupt can fire when pm_runtime is disabled due to it racing
      with the PM suspend/resume code. This causes a splat in the interrupt
      handler when it tries to call pm_runtime_get().
      
      Explicitly disable the interrupt in our ->suspend callback and
      re-enable it in ->resume to avoid this. The interrupt is a wake_irq,
      so even when disabled, if it fires it will wake the system from
      suspend and cancel any suspend transition that may be in progress.
      If an interrupt is pending, the ipa_isr_thread handler will be
      called after resuming.
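
The disable-in-suspend, handle-after-resume behaviour described above can be modelled with a small toy state machine. This is only a sketch under stated assumptions: `demo_irq` and the `demo_*` functions are hypothetical, and nothing here uses the kernel IRQ or PM APIs or the IPA driver's own code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an interrupt line; not the kernel IRQ API. */
struct demo_irq {
	bool enabled;  /* handler may run right away */
	bool pending;  /* fired while disabled, not yet handled */
	int handled;   /* number of handler invocations */
};

/* The interrupt fires: run the handler now, or remember it for later. */
void demo_irq_fire(struct demo_irq *irq)
{
	assert(irq);
	if (irq->enabled)
		irq->handled++;
	else
		irq->pending = true;
}

/* Suspend path: mask the interrupt so the handler cannot run. */
void demo_suspend(struct demo_irq *irq)
{
	assert(irq);
	irq->enabled = false;
}

/* Resume path: unmask, then handle anything that fired while masked. */
void demo_resume(struct demo_irq *irq)
{
	assert(irq);
	irq->enabled = true;
	if (irq->pending) {
		irq->pending = false;
		irq->handled++;
	}
}
```

In this model an interrupt that fires between `demo_suspend()` and `demo_resume()` is not lost; it is counted as handled only once the resume path has run.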
      
      Fixes: 1aac309d ("net: ipa: use autosuspend")
      Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
      Reviewed-by: Alex Elder <elder@linaro.org>
      Link: https://lore.kernel.org/r/20230115175925.465918-1-caleb.connolly@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9ec9b2a3
  4. 17 Jan, 2023 2 commits
    • Ying Hsu's avatar
      Bluetooth: Fix possible deadlock in rfcomm_sk_state_change · 1d80d57f
      Ying Hsu authored
      syzbot reports a possible deadlock in rfcomm_sk_state_change [1].
      rfcomm_sock_connect acquires the sk lock and then waits for the
      rfcomm lock, while rfcomm_sock_release may hold the rfcomm lock
      and then wait for the sk lock, so the two paths can deadlock.
      Here's a simplified flow:
      
      rfcomm_sock_connect:
        lock_sock(sk)
        rfcomm_dlc_open:
          rfcomm_lock()
      
      rfcomm_sock_release:
        rfcomm_sock_shutdown:
          rfcomm_lock()
          __rfcomm_dlc_close:
              rfcomm_sk_state_change:
                lock_sock(sk)
      
      This patch drops the sk lock before calling rfcomm_dlc_open to
      avoid the possible deadlock and holds sk's reference count to
      prevent use-after-free after rfcomm_dlc_open completes.
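
The lock-ordering idea can be sketched in user space as follows. This is an assumption-laden illustration with pthread mutexes and a plain integer reference count, not the kernel patch: `demo_sock`, `demo_rfcomm_lock`, `demo_dlc_open`, `demo_connect` and `demo_release` are hypothetical stand-ins for the socket, the rfcomm lock and the two paths in the flow above.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins; not the kernel's struct sock or rfcomm code. */
struct demo_sock {
	pthread_mutex_t sk_lock;
	int refcnt;
};

pthread_mutex_t demo_rfcomm_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the open step, which needs the global lock. */
int demo_dlc_open(void)
{
	pthread_mutex_lock(&demo_rfcomm_lock);
	pthread_mutex_unlock(&demo_rfcomm_lock);
	return 0;
}

/* Connect path: never waits for the global lock while holding sk_lock. */
int demo_connect(struct demo_sock *sk)
{
	int err;

	pthread_mutex_lock(&sk->sk_lock);
	assert(sk->refcnt > 0);
	sk->refcnt++;                        /* keep sk alive while unlocked */
	pthread_mutex_unlock(&sk->sk_lock);  /* drop sk_lock first */

	err = demo_dlc_open();               /* takes only the global lock */

	pthread_mutex_lock(&sk->sk_lock);
	sk->refcnt--;                        /* release the temporary reference */
	pthread_mutex_unlock(&sk->sk_lock);
	return err;
}

/* Release path: global lock first, then sk_lock. */
void demo_release(struct demo_sock *sk)
{
	pthread_mutex_lock(&demo_rfcomm_lock);
	pthread_mutex_lock(&sk->sk_lock);
	/* ... state change would happen here ... */
	pthread_mutex_unlock(&sk->sk_lock);
	pthread_mutex_unlock(&demo_rfcomm_lock);
}
```

With this shape the only nested acquisition is global lock then `sk_lock`, so no path takes the two locks in the opposite order; the extra reference stands in for keeping the socket valid across the window where `sk_lock` is not held.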
      
      Reported-by: syzbot+d7ce59...@syzkaller.appspotmail.com
      Fixes: 1804fdf6 ("Bluetooth: btintel: Combine setting up MSFT extension")
      Link: https://syzkaller.appspot.com/bug?extid=d7ce59b06b3eb14fd218 [1]
      Signed-off-by: Ying Hsu <yinghsu@chromium.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      1d80d57f
    • Luiz Augusto von Dentz's avatar
      Bluetooth: ISO: Fix possible circular locking dependency · 506d9b40
      Luiz Augusto von Dentz authored
      This attempts to fix the following trace:
      
      iso-tester/52 is trying to acquire lock:
      ffff8880024e0070 (&hdev->lock){+.+.}-{3:3}, at:
      iso_sock_listen+0x29e/0x440
      
      but task is already holding lock:
      ffff888001978130 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}, at:
      iso_sock_listen+0x8b/0x440
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #2 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}:
             lock_acquire+0x176/0x3d0
             lock_sock_nested+0x32/0x80
             iso_connect_cfm+0x1a3/0x630
             hci_cc_le_setup_iso_path+0x195/0x340
             hci_cmd_complete_evt+0x1ae/0x500
             hci_event_packet+0x38e/0x7c0
             hci_rx_work+0x34c/0x980
             process_one_work+0x5a5/0x9a0
             worker_thread+0x89/0x6f0
             kthread+0x14e/0x180
             ret_from_fork+0x22/0x30
      
      -> #1 (hci_cb_list_lock){+.+.}-{3:3}:
             lock_acquire+0x176/0x3d0
             __mutex_lock+0x13b/0xf50
             hci_le_remote_feat_complete_evt+0x17e/0x320
             hci_event_packet+0x38e/0x7c0
             hci_rx_work+0x34c/0x980
             process_one_work+0x5a5/0x9a0
             worker_thread+0x89/0x6f0
             kthread+0x14e/0x180
             ret_from_fork+0x22/0x30
      
      -> #0 (&hdev->lock){+.+.}-{3:3}:
             check_prev_add+0xfc/0x1190
             __lock_acquire+0x1e27/0x2750
             lock_acquire+0x176/0x3d0
             __mutex_lock+0x13b/0xf50
             iso_sock_listen+0x29e/0x440
             __sys_listen+0xe6/0x160
             __x64_sys_listen+0x25/0x30
             do_syscall_64+0x42/0x90
             entry_SYSCALL_64_after_hwframe+0x62/0xcc
      
      other info that might help us debug this:
      
      Chain exists of:
        &hdev->lock --> hci_cb_list_lock --> sk_lock-AF_BLUETOOTH-BTPROTO_ISO
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(sk_lock-AF_BLUETOOTH-BTPROTO_ISO);
                                     lock(hci_cb_list_lock);
                                     lock(sk_lock-AF_BLUETOOTH-BTPROTO_ISO);
        lock(&hdev->lock);
      
       *** DEADLOCK ***
      
      1 lock held by iso-tester/52:
       #0: ffff888001978130 (sk_lock-AF_BLUETOOTH-BTPROTO_ISO){+.+.}-{0:0}, at:
       iso_sock_listen+0x8b/0x440
      
      Fixes: f764a6c2 ("Bluetooth: ISO: Add broadcast support")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      506d9b40