1. 24 Aug, 2022 16 commits
  2. 23 Aug, 2022 16 commits
  3. 22 Aug, 2022 8 commits
    • net/mlx5: Unlock on error in mlx5_sriov_enable() · 35419025
      Dan Carpenter authored
      Unlock before returning if mlx5_device_enable_sriov() fails.
      
      Fixes: 84a433a4 ("net/mlx5: Lock mlx5 devlink reload callbacks")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
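      The unlock-on-error pattern named in this commit can be sketched in userspace C as below. This is only an illustration, not the driver code: the `demo_*` names, the pthread mutex, and the "more than 8 VFs fails" rule are all assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins; none of these names come from the mlx5 driver. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Pretend device step that fails when asked for more than 8 VFs. */
static int demo_enable_vfs(int num_vfs)
{
    return num_vfs > 8 ? -1 : 0;
}

/* Every exit path, including the error path, goes through the unlock. */
int demo_sriov_enable(int num_vfs)
{
    int err;

    assert(num_vfs >= 0);

    pthread_mutex_lock(&demo_lock);
    err = demo_enable_vfs(num_vfs);
    if (err)
        goto unlock; /* do not return with the lock still held */

    /* ... further setup would go here ... */

unlock:
    pthread_mutex_unlock(&demo_lock);
    return err;
}

/* Returns 1 if the lock can be taken, i.e. no earlier path leaked it. */
int demo_lock_is_free(void)
{
    if (pthread_mutex_trylock(&demo_lock) != 0)
        return 0;
    pthread_mutex_unlock(&demo_lock);
    return 1;
}
```

      Routing the failure through a single `unlock` label keeps the lock and unlock calls paired even when more error paths are added later.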
    • net/mlx5e: Fix use after free in mlx5e_fs_init() · 21234e3a
      Dan Carpenter authored
      Call mlx5e_fs_vlan_free(fs) before kvfree(fs).
      
      Fixes: af8bbf73 ("net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
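      The ordering rule behind this fix (release what hangs off a structure before releasing the structure itself) can be sketched as follows. The `demo_*` types and functions are hypothetical, and plain `calloc`/`free` stand in for the kernel allocators; this is not the mlx5e code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types; they only mimic "a struct that owns a sub-object". */
struct demo_vlan { int id; };
struct demo_fs { struct demo_vlan *vlan; };

/* Counts how many times the sub-object was released (for the usage check). */
int demo_vlan_frees;

/* Dereferences fs, so it must run while fs is still allocated. */
static void demo_fs_vlan_free(struct demo_fs *fs)
{
    assert(fs != NULL);
    free(fs->vlan);
    fs->vlan = NULL;
    demo_vlan_frees++;
}

/* fail_late simulates an error after the sub-object has been allocated. */
struct demo_fs *demo_fs_init(int fail_late)
{
    struct demo_fs *fs = calloc(1, sizeof(*fs));

    if (!fs)
        return NULL;
    fs->vlan = calloc(1, sizeof(*fs->vlan));
    if (!fs->vlan)
        goto err_free_fs;
    if (fail_late)
        goto err_free_vlan;
    return fs;

err_free_vlan:
    demo_fs_vlan_free(fs); /* first: still needs a live fs */
err_free_fs:
    free(fs);              /* last: nothing touches fs after this */
    return NULL;
}

void demo_fs_cleanup(struct demo_fs *fs)
{
    demo_fs_vlan_free(fs);
    free(fs);
}
```

      Swapping the two error labels would make `demo_fs_vlan_free()` read `fs->vlan` from freed memory, which is the class of bug the commit describes.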
    • net/mlx5e: kTLS, Use _safe() iterator in mlx5e_tls_priv_tx_list_cleanup() · 6514210b
      Dan Carpenter authored
      Use the list_for_each_entry_safe() macro to prevent dereferencing "obj"
      after it has been freed.
      
      Fixes: c4dfe704 ("net/mlx5e: kTLS, Recycle objects of device-offloaded TLS TX connections")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
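      The idea behind a "_safe" iterator is to save the successor before the current element is freed. A minimal userspace sketch, assuming a hand-rolled singly linked list rather than the kernel's `list_head` (the `demo_*` names are hypothetical):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node; the kernel code uses struct list_head instead. */
struct demo_obj {
    int id;
    struct demo_obj *next;
};

/* Pushes a new node at the head and returns the new head. */
struct demo_obj *demo_push(struct demo_obj *head, int id)
{
    struct demo_obj *obj = malloc(sizeof(*obj));

    assert(obj != NULL);
    obj->id = id;
    obj->next = head;
    return obj;
}

/* Frees every node and returns how many were freed. */
int demo_list_cleanup(struct demo_obj **head)
{
    struct demo_obj *obj, *tmp;
    int freed = 0;

    for (obj = *head; obj != NULL; obj = tmp) {
        tmp = obj->next; /* read the successor while obj is still valid */
        free(obj);
        freed++;
    }
    *head = NULL;
    return freed;
}
```

      A loop that advances with `obj = obj->next` after `free(obj)` would dereference freed memory; the extra `tmp` cursor plays the role of the second argument that `list_for_each_entry_safe()` takes.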
    • net/mlx5: unlock on error path in esw_vfs_changed_event_handler() · b868c8fe
      Dan Carpenter authored
      Unlock before returning on this error path.
      
      Fixes: f1bc646c ("net/mlx5: Use devl_ API in mlx5_esw_offloads_devlink_port_register")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix wrong tc flag used when set hw-tc-offload off · 550f9643
      Maor Dickman authored
      The cited commit reintroduced the ability to set hw-tc-offload
      in switchdev mode by reusing the NIC mode calls without modifying them
      to support both modes. This can cause an illegal memory access
      when trying to turn hw-tc-offload off.
      
      Fix this by using the right TC_FLAG when checking if tc rules
      are installed while disabling hw-tc-offload.
      
      Fixes: d3cbd425 ("net/mlx5e: Add ndo_set_feature for uplink representor")
      Signed-off-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: TC, Add missing policer validation · f7a4e867
      Roi Dayan authored
      There is a missing policer validation when offloading police action
      with tc action api. Add it.
      
      Fixes: 7d1a5ce4 ("net/mlx5e: TC, Support tc action api for police")
      Signed-off-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Fix wrong application of the LRO state · 7b3707fc
      Aya Levin authored
      The driver caches the packet merge type in the mlx5e_params instance, which
      must stay in perfect sync with the netdev features bit.
      Prior to this patch, under certain conditions (*) the LRO state was set in
      mlx5e_params while the netdev features bit was off, causing LRO to
      be applied on the RQs (HW level).
      
      (*) This can happen only on profile init (mlx5e_build_nic_params()),
      when the RQ expects a non-linear SKB and PCI is fast enough in comparison to
      link width.
      
      Solution: remove setting of packet merge type from
      mlx5e_build_nic_params() as netdev features are not updated.
      
      Fixes: 619a8f2a ("net/mlx5e: Use linear SKB in Striding RQ")
      Signed-off-by: Aya Levin <ayal@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Avoid false positive lockdep warning by adding lock_class_key · d59b73a6
      Moshe Shemesh authored
      Add a lock_class_key per mlx5 device to avoid a false positive
      "possible circular locking dependency" warning by lockdep, on flows
      which lock more than one mlx5 device, such as adding SF.
      
      kernel log:
       ======================================================
       WARNING: possible circular locking dependency detected
       5.19.0-rc8+ #2 Not tainted
       ------------------------------------------------------
       kworker/u20:0/8 is trying to acquire lock:
       ffff88812dfe0d98 (&dev->intf_state_mutex){+.+.}-{3:3}, at: mlx5_init_one+0x2e/0x490 [mlx5_core]
      
       but task is already holding lock:
       ffff888101aa7898 (&(&notifier->n_head)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x5a/0x130
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&(&notifier->n_head)->rwsem){++++}-{3:3}:
              down_write+0x90/0x150
              blocking_notifier_chain_register+0x53/0xa0
              mlx5_sf_table_init+0x369/0x4a0 [mlx5_core]
              mlx5_init_one+0x261/0x490 [mlx5_core]
              probe_one+0x430/0x680 [mlx5_core]
              local_pci_probe+0xd6/0x170
              work_for_cpu_fn+0x4e/0xa0
              process_one_work+0x7c2/0x1340
              worker_thread+0x6f6/0xec0
              kthread+0x28f/0x330
              ret_from_fork+0x1f/0x30
      
       -> #0 (&dev->intf_state_mutex){+.+.}-{3:3}:
              __lock_acquire+0x2fc7/0x6720
              lock_acquire+0x1c1/0x550
              __mutex_lock+0x12c/0x14b0
              mlx5_init_one+0x2e/0x490 [mlx5_core]
              mlx5_sf_dev_probe+0x29c/0x370 [mlx5_core]
              auxiliary_bus_probe+0x9d/0xe0
              really_probe+0x1e0/0xaa0
              __driver_probe_device+0x219/0x480
              driver_probe_device+0x49/0x130
              __device_attach_driver+0x1b8/0x280
              bus_for_each_drv+0x123/0x1a0
              __device_attach+0x1a3/0x460
              bus_probe_device+0x1a2/0x260
              device_add+0x9b1/0x1b40
              __auxiliary_device_add+0x88/0xc0
              mlx5_sf_dev_state_change_handler+0x67e/0x9d0 [mlx5_core]
              blocking_notifier_call_chain+0xd5/0x130
              mlx5_vhca_state_work_handler+0x2b0/0x3f0 [mlx5_core]
              process_one_work+0x7c2/0x1340
              worker_thread+0x59d/0xec0
              kthread+0x28f/0x330
              ret_from_fork+0x1f/0x30
      
        other info that might help us debug this:
      
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(&(&notifier->n_head)->rwsem);
                                      lock(&dev->intf_state_mutex);
                                      lock(&(&notifier->n_head)->rwsem);
         lock(&dev->intf_state_mutex);
      
        *** DEADLOCK ***
      
       4 locks held by kworker/u20:0/8:
        #0: ffff888150612938 ((wq_completion)mlx5_events){+.+.}-{0:0}, at: process_one_work+0x6e2/0x1340
        #1: ffff888100cafdb8 ((work_completion)(&work->work)#3){+.+.}-{0:0}, at: process_one_work+0x70f/0x1340
        #2: ffff888101aa7898 (&(&notifier->n_head)->rwsem){++++}-{3:3}, at: blocking_notifier_call_chain+0x5a/0x130
        #3: ffff88813682d0e8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x76/0x460
      
       stack backtrace:
       CPU: 6 PID: 8 Comm: kworker/u20:0 Not tainted 5.19.0-rc8+
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       Workqueue: mlx5_events mlx5_vhca_state_work_handler [mlx5_core]
       Call Trace:
        <TASK>
        dump_stack_lvl+0x57/0x7d
        check_noncircular+0x278/0x300
        ? print_circular_bug+0x460/0x460
        ? lock_chain_count+0x20/0x20
        ? register_lock_class+0x1880/0x1880
        __lock_acquire+0x2fc7/0x6720
        ? register_lock_class+0x1880/0x1880
        ? register_lock_class+0x1880/0x1880
        lock_acquire+0x1c1/0x550
        ? mlx5_init_one+0x2e/0x490 [mlx5_core]
        ? lockdep_hardirqs_on_prepare+0x400/0x400
        __mutex_lock+0x12c/0x14b0
        ? mlx5_init_one+0x2e/0x490 [mlx5_core]
        ? mlx5_init_one+0x2e/0x490 [mlx5_core]
        ? _raw_read_unlock+0x1f/0x30
        ? mutex_lock_io_nested+0x1320/0x1320
        ? __ioremap_caller.constprop.0+0x306/0x490
        ? mlx5_sf_dev_probe+0x269/0x370 [mlx5_core]
        ? iounmap+0x160/0x160
        mlx5_init_one+0x2e/0x490 [mlx5_core]
        mlx5_sf_dev_probe+0x29c/0x370 [mlx5_core]
        ? mlx5_sf_dev_remove+0x130/0x130 [mlx5_core]
        auxiliary_bus_probe+0x9d/0xe0
        really_probe+0x1e0/0xaa0
        __driver_probe_device+0x219/0x480
        ? auxiliary_match_id+0xe9/0x140
        driver_probe_device+0x49/0x130
        __device_attach_driver+0x1b8/0x280
        ? driver_allows_async_probing+0x140/0x140
        bus_for_each_drv+0x123/0x1a0
        ? bus_for_each_dev+0x1a0/0x1a0
        ? lockdep_hardirqs_on_prepare+0x286/0x400
        ? trace_hardirqs_on+0x2d/0x100
        __device_attach+0x1a3/0x460
        ? device_driver_attach+0x1e0/0x1e0
        ? kobject_uevent_env+0x22d/0xf10
        bus_probe_device+0x1a2/0x260
        device_add+0x9b1/0x1b40
        ? dev_set_name+0xab/0xe0
        ? __fw_devlink_link_to_suppliers+0x260/0x260
        ? memset+0x20/0x40
        ? lockdep_init_map_type+0x21a/0x7d0
        __auxiliary_device_add+0x88/0xc0
        ? auxiliary_device_init+0x86/0xa0
        mlx5_sf_dev_state_change_handler+0x67e/0x9d0 [mlx5_core]
        blocking_notifier_call_chain+0xd5/0x130
        mlx5_vhca_state_work_handler+0x2b0/0x3f0 [mlx5_core]
        ? mlx5_vhca_event_arm+0x100/0x100 [mlx5_core]
        ? lock_downgrade+0x6e0/0x6e0
        ? lockdep_hardirqs_on_prepare+0x286/0x400
        process_one_work+0x7c2/0x1340
        ? lockdep_hardirqs_on_prepare+0x400/0x400
        ? pwq_dec_nr_in_flight+0x230/0x230
        ? rwlock_bug.part.0+0x90/0x90
        worker_thread+0x59d/0xec0
        ? process_one_work+0x1340/0x1340
        kthread+0x28f/0x330
        ? kthread_complete_and_exit+0x20/0x20
        ret_from_fork+0x1f/0x30
        </TASK>
      
      Fixes: 6a327321 ("net/mlx5: SF, Port function state change support")
      Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
      Reviewed-by: Shay Drory <shayd@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>