Commit fbc1449d authored by Jakub Kicinski

Merge tag 'mlx5-updates-2023-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-04-20

1) Dragos improves the RX page pool and provides some fixes to his previous
   series:
 1.1) Fix releasing page_pool for striding RQ and legacy RQ nonlinear case
 1.2) Hook NAPIs to page pools to gain more performance.

2) From Roi, Some cleanups to TC and eswitch modules.

3) Maher migrates vnic diagnostic counter reporting from debugfs to a
   dedicated devlink health reporter.

Maher says:
===========
 net/mlx5: Expose vnic diagnostic counters using devlink

Currently, vnic diagnostic counters are exposed through the following
debugfs directory:

$ ls /sys/kernel/debug/mlx5/0000:08:00.0/esw/vf_0/vnic_diag/
cq_overrun
quota_exceeded_command
total_q_under_processor_handle
invalid_command
send_queue_priority_update_flow
nic_receive_steering_discard

The current design does not allow the hypervisor to view the diagnostic
counters of its VFs once the VFs are bound to a VM; in other words, the
counters are not exposed for representor interfaces. Furthermore, the
debugfs design scales poorly should the driver need to report more
counters in the future.

As these counters pertain to vNIC health, it is more appropriate to
utilize the devlink health reporter to expose them.

Thus, this patchset includes the following changes:

* Drop the current vnic diagnostic counters debugfs interface.
* Add a vnic devlink health reporter for PF/VF core devices, which,
  when diagnosed, dumps the vnic diagnostic counter values queried
  from FW.
* Add a vnic devlink health reporter for the representor interface, which
  serves the same purpose as the previous point and additionally allows
  the hypervisor to view its VFs' diagnostic counters, even when the VFs
  are bound to external VMs.

An example of devlink health reporter usage:
$ devlink health diagnose pci/0000:08:00.0 reporter vnic
 vNIC env counters:
    total_error_queues: 0 send_queue_priority_update_flow: 0
    comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
    invalid_command: 0 quota_exceeded_command: 0
    nic_receive_steering_discard: 0

===========

4) SW steering fixes and improvements

Yevgeny Kliteynik says:
=======================
This short patch series brings small fixes / improvements for
SW steering:

 - Patch 1: Fix dumping of legacy modify_hdr in debug dump to
   align with what the parser expects
 - Patch 2: Use a separate threshold for ICM sync per ICM type
   (see the worked example after this list)
 - Patch 3: Add more info to the steering debug dump - Linux
   version and device name
 - Patch 4: Keep track of the number of buddies that are currently
   in use, per domain per buddy type

=======================
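
To make patch 2 concrete: previously every ICM pool triggered a sync once
hot memory exceeded a uniform 1/4 of the maximum pool size; with per-type
thresholds, a hypothetical 32 MiB pool now syncs at 8 MiB when it holds
STEs (25%), at 16 MiB for modify-header patterns (50%), and only at
~28.8 MiB for modify actions (90%).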

* tag 'mlx5-updates-2023-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Update op_mode to op_mod for port selection
  net/mlx5: E-Switch, Remove unused mlx5_esw_offloads_vport_metadata_set()
  net/mlx5: E-Switch, Remove redundant dev arg from mlx5_esw_vport_alloc()
  net/mlx5: Include linux/pci.h for pci_msix_can_alloc_dyn()
  net/mlx5e: RX, Hook NAPIs to page pools
  net/mlx5e: RX, Fix XDP_TX page release for legacy rq nonlinear case
  net/mlx5e: RX, Fix releasing page_pool pages twice for striding RQ
  net/mlx5e: Add vnic devlink health reporter to representors
  net/mlx5: Add vnic devlink health reporter to PFs/VFs
  Revert "net/mlx5: Expose vnic diagnostic counters for eswitch managed vports"
  Revert "net/mlx5: Expose steering dropped packets counter"
  net/mlx5: DR, Add memory statistics for domain object
  net/mlx5: DR, Add more info in domain dbg dump
  net/mlx5: DR, Calculate sync threshold of each pool according to its type
  net/mlx5: DR, Fix dumping of legacy modify_hdr in debug dump
====================

Link: https://lore.kernel.org/r/20230421013850.349646-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
parents 9a82cdc2 f9c895a7
@@ -257,3 +257,36 @@ User commands examples:
$ devlink health dump show pci/0000:82:00.1 reporter fw_fatal
NOTE: This command can run only on PF.
vnic reporter
-------------
The vnic reporter implements only the `diagnose` callback.
It is responsible for querying the vnic diagnostic counters from FW and
displaying them in real time.
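As no `recover` callback is provided, `devlink health recover` is not
applicable to this reporter and fails with EOPNOTSUPP.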
Description of the vnic counters:
total_q_under_processor_handle: number of queues in an error state due to
an async error or errored command.
send_queue_priority_update_flow: number of QP/SQ priority/SL update
events.
cq_overrun: number of times CQ entered an error state due to an
overflow.
async_eq_overrun: number of times an EQ mapped to async events was
overrun.
comp_eq_overrun: number of times an EQ mapped to completion events was
overrun.
quota_exceeded_command: number of commands issued and failed due to quota
exceeded.
invalid_command: number of commands issued and failed due to any reason
other than quota exceeded.
nic_receive_steering_discard: number of packets that completed RX flow
steering but were discarded due to a mismatch in flow table.
User commands examples:
- Diagnose PF/VF vnic counters
$ devlink health diagnose pci/0000:82:00.1 reporter vnic
- Diagnose representor vnic counters (by supplying the devlink port of the
representor, which can be obtained via the devlink port command)
$ devlink health diagnose pci/0000:82:00.1/65537 reporter vnic
NOTE: This command can run over all interfaces such as PF/VF and representor ports.
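The diagnose output can also be consumed by scripts via devlink's standard
JSON mode, e.g.:
$ devlink -j health diagnose pci/0000:82:00.1 reporter vnic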
@@ -16,7 +16,7 @@ mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
transobj.o vport.o sriov.o fs_cmd.o fs_core.o pci_irq.o \
fs_counters.o fs_ft_pool.o rl.o lag/debugfs.o lag/lag.o dev.o events.o wq.o lib/gid.o \
lib/devcom.o lib/pci_vsc.o lib/dm.o lib/fs_ttc.o diag/fs_tracepoint.o \
diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o \
diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o diag/reporter_vnic.o \
fw_reset.o qos.o lib/tout.o lib/aso.o
#
@@ -69,7 +69,7 @@ mlx5_core-$(CONFIG_MLX5_TC_SAMPLE) += en/tc/sample.o
#
mlx5_core-$(CONFIG_MLX5_ESWITCH) += eswitch.o eswitch_offloads.o eswitch_offloads_termtbl.o \
ecpf.o rdma.o esw/legacy.o \
esw/debugfs.o esw/devlink_port.o esw/vporttbl.o esw/qos.o
esw/devlink_port.o esw/vporttbl.o esw/qos.o
mlx5_core-$(CONFIG_MLX5_ESWITCH) += esw/acl/helper.o \
esw/acl/egress_lgcy.o esw/acl/egress_ofld.o \
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. */
#include "reporter_vnic.h"
#include "devlink.h"
#define VNIC_ENV_GET64(vnic_env_stats, c) \
MLX5_GET64(query_vnic_env_out, (vnic_env_stats)->query_vnic_env_out, \
vport_env.c)
struct mlx5_vnic_diag_stats {
__be64 query_vnic_env_out[MLX5_ST_SZ_QW(query_vnic_env_out)];
};
int mlx5_reporter_vnic_diagnose_counters(struct mlx5_core_dev *dev,
struct devlink_fmsg *fmsg,
u16 vport_num, bool other_vport)
{
u32 in[MLX5_ST_SZ_DW(query_vnic_env_in)] = {};
struct mlx5_vnic_diag_stats vnic;
int err;
MLX5_SET(query_vnic_env_in, in, opcode, MLX5_CMD_OP_QUERY_VNIC_ENV);
MLX5_SET(query_vnic_env_in, in, vport_number, vport_num);
MLX5_SET(query_vnic_env_in, in, other_vport, !!other_vport);
err = mlx5_cmd_exec_inout(dev, query_vnic_env, in, &vnic.query_vnic_env_out);
if (err)
return err;
err = devlink_fmsg_pair_nest_start(fmsg, "vNIC env counters");
if (err)
return err;
err = devlink_fmsg_obj_nest_start(fmsg);
if (err)
return err;
err = devlink_fmsg_u64_pair_put(fmsg, "total_error_queues",
VNIC_ENV_GET64(&vnic, total_error_queues));
if (err)
return err;
err = devlink_fmsg_u64_pair_put(fmsg, "send_queue_priority_update_flow",
VNIC_ENV_GET64(&vnic, send_queue_priority_update_flow));
if (err)
return err;
err = devlink_fmsg_u64_pair_put(fmsg, "comp_eq_overrun",
VNIC_ENV_GET64(&vnic, comp_eq_overrun));
if (err)
return err;
err = devlink_fmsg_u64_pair_put(fmsg, "async_eq_overrun",
VNIC_ENV_GET64(&vnic, async_eq_overrun));
if (err)
return err;
err = devlink_fmsg_u64_pair_put(fmsg, "cq_overrun",
VNIC_ENV_GET64(&vnic, cq_overrun));
if (err)
return err;
err = devlink_fmsg_u64_pair_put(fmsg, "invalid_command",
VNIC_ENV_GET64(&vnic, invalid_command));
if (err)
return err;
err = devlink_fmsg_u64_pair_put(fmsg, "quota_exceeded_command",
VNIC_ENV_GET64(&vnic, quota_exceeded_command));
if (err)
return err;
err = devlink_fmsg_u64_pair_put(fmsg, "nic_receive_steering_discard",
VNIC_ENV_GET64(&vnic, nic_receive_steering_discard));
if (err)
return err;
err = devlink_fmsg_obj_nest_end(fmsg);
if (err)
return err;
err = devlink_fmsg_pair_nest_end(fmsg);
if (err)
return err;
return 0;
}
static int mlx5_reporter_vnic_diagnose(struct devlink_health_reporter *reporter,
struct devlink_fmsg *fmsg,
struct netlink_ext_ack *extack)
{
struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
return mlx5_reporter_vnic_diagnose_counters(dev, fmsg, 0, false);
}
static const struct devlink_health_reporter_ops mlx5_reporter_vnic_ops = {
.name = "vnic",
.diagnose = mlx5_reporter_vnic_diagnose,
};
void mlx5_reporter_vnic_create(struct mlx5_core_dev *dev)
{
struct mlx5_core_health *health = &dev->priv.health;
struct devlink *devlink = priv_to_devlink(dev);
health->vnic_reporter =
devlink_health_reporter_create(devlink,
&mlx5_reporter_vnic_ops,
0, dev);
if (IS_ERR(health->vnic_reporter))
mlx5_core_warn(dev,
"Failed to create vnic reporter, err = %ld\n",
PTR_ERR(health->vnic_reporter));
}
void mlx5_reporter_vnic_destroy(struct mlx5_core_dev *dev)
{
struct mlx5_core_health *health = &dev->priv.health;
if (!IS_ERR_OR_NULL(health->vnic_reporter))
devlink_health_reporter_destroy(health->vnic_reporter);
}
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.
*/
#ifndef __MLX5_REPORTER_VNIC_H
#define __MLX5_REPORTER_VNIC_H
#include "mlx5_core.h"
void mlx5_reporter_vnic_create(struct mlx5_core_dev *dev);
void mlx5_reporter_vnic_destroy(struct mlx5_core_dev *dev);
int mlx5_reporter_vnic_diagnose_counters(struct mlx5_core_dev *dev,
struct devlink_fmsg *fmsg,
u16 vport_num, bool other_vport);
#endif /* __MLX5_REPORTER_VNIC_H */
@@ -857,6 +857,7 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
pp_params.pool_size = pool_size;
pp_params.nid = node;
pp_params.dev = rq->pdev;
pp_params.napi = rq->cq.napi;
pp_params.dma_dir = rq->buff.map_dir;
pp_params.max_len = PAGE_SIZE;
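
For context, `pp_params.napi` above is the generic page_pool hook rather than
anything mlx5-specific: binding a pool to the single NAPI instance that
consumes its pages lets freed pages be recycled into the pool's lockless
cache directly from the driver's softirq context. A minimal sketch of how a
driver might wire this up follows; the helper name, pool size, and flags are
illustrative, not taken from this patch:

#include <linux/dma-mapping.h>
#include <net/page_pool.h>

/* Create a page pool bound to one RX queue's NAPI context. With .napi
 * set, pages released from that NAPI's poll loop can go straight into
 * the pool's lockless cache instead of the slower ptr_ring.
 */
static struct page_pool *rxq_create_pool(struct device *dev,
                                         struct napi_struct *napi,
                                         int numa_node)
{
        struct page_pool_params pp = {
                .flags          = PP_FLAG_DMA_MAP,
                .order          = 0,
                .pool_size      = 256,          /* roughly the RX ring size */
                .nid            = numa_node,
                .dev            = dev,
                .napi           = napi,         /* the hook shown above */
                .dma_dir        = DMA_FROM_DEVICE,
                .max_len        = PAGE_SIZE,
        };

        return page_pool_create(&pp);           /* ERR_PTR() on failure */
}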
@@ -53,6 +53,7 @@
#include "lib/vxlan.h"
#define CREATE_TRACE_POINTS
#include "diag/en_rep_tracepoint.h"
#include "diag/reporter_vnic.h"
#include "en_accel/ipsec.h"
#include "en/tc/int_port.h"
#include "en/ptp.h"
@@ -1294,6 +1295,50 @@ static unsigned int mlx5e_ul_rep_stats_grps_num(struct mlx5e_priv *priv)
return ARRAY_SIZE(mlx5e_ul_rep_stats_grps);
}
static int
mlx5e_rep_vnic_reporter_diagnose(struct devlink_health_reporter *reporter,
struct devlink_fmsg *fmsg,
struct netlink_ext_ack *extack)
{
struct mlx5e_rep_priv *rpriv = devlink_health_reporter_priv(reporter);
struct mlx5_eswitch_rep *rep = rpriv->rep;
return mlx5_reporter_vnic_diagnose_counters(rep->esw->dev, fmsg,
rep->vport, true);
}
static const struct devlink_health_reporter_ops mlx5_rep_vnic_reporter_ops = {
.name = "vnic",
.diagnose = mlx5e_rep_vnic_reporter_diagnose,
};
static void mlx5e_rep_vnic_reporter_create(struct mlx5e_priv *priv,
struct devlink_port *dl_port)
{
struct mlx5e_rep_priv *rpriv = priv->ppriv;
struct devlink_health_reporter *reporter;
reporter = devl_port_health_reporter_create(dl_port,
&mlx5_rep_vnic_reporter_ops,
0, rpriv);
if (IS_ERR(reporter)) {
mlx5_core_err(priv->mdev,
"Failed to create representor vnic reporter, err = %ld\n",
PTR_ERR(reporter));
return;
}
rpriv->rep_vnic_reporter = reporter;
}
static void mlx5e_rep_vnic_reporter_destroy(struct mlx5e_priv *priv)
{
struct mlx5e_rep_priv *rpriv = priv->ppriv;
if (!IS_ERR_OR_NULL(rpriv->rep_vnic_reporter))
devl_health_reporter_destroy(rpriv->rep_vnic_reporter);
}
static const struct mlx5e_profile mlx5e_rep_profile = {
.init = mlx5e_init_rep,
.cleanup = mlx5e_cleanup_rep,
@@ -1394,8 +1439,10 @@ mlx5e_vport_vf_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
dl_port = mlx5_esw_offloads_devlink_port(dev->priv.eswitch,
rpriv->rep->vport);
if (dl_port)
if (dl_port) {
SET_NETDEV_DEVLINK_PORT(netdev, dl_port);
mlx5e_rep_vnic_reporter_create(priv, dl_port);
}
err = register_netdev(netdev);
if (err) {
@@ -1408,8 +1455,8 @@ mlx5e_vport_vf_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
return 0;
err_detach_netdev:
mlx5e_rep_vnic_reporter_destroy(priv);
mlx5e_detach_netdev(netdev_priv(netdev));
err_cleanup_profile:
priv->profile->cleanup(priv);
@@ -1458,6 +1505,7 @@ mlx5e_vport_rep_unload(struct mlx5_eswitch_rep *rep)
}
unregister_netdev(netdev);
mlx5e_rep_vnic_reporter_destroy(priv);
mlx5e_detach_netdev(priv);
priv->profile->cleanup(priv);
mlx5e_destroy_netdev(priv);
@@ -118,6 +118,7 @@ struct mlx5e_rep_priv {
struct rtnl_link_stats64 prev_vf_vport_stats;
struct mlx5_flow_handle *send_to_vport_meta_rule;
struct rhashtable tc_ht;
struct devlink_health_reporter *rep_vnic_reporter;
};
static inline
@@ -861,6 +861,11 @@ static void mlx5e_dealloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
struct mlx5e_mpw_info *wi = mlx5e_get_mpw_info(rq, ix);
/* This function is called on rq/netdev close. */
mlx5e_free_rx_mpwqe(rq, wi);
/* Avoid a second release of the wqe pages: dealloc is called also
* for missing wqes on an already flushed RQ.
*/
bitmap_fill(wi->skip_release_bitmap, rq->mpwqe.pages_per_wqe);
}
INDIRECT_CALLABLE_SCOPE bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)
@@ -1741,10 +1746,10 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
prog = rcu_dereference(rq->xdp_prog);
if (prog && mlx5e_xdp_handle(rq, prog, &mxbuf)) {
if (test_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
int i;
struct mlx5e_wqe_frag_info *pwi;
for (i = wi - head_wi; i < rq->wqe.info.num_frags; i++)
mlx5e_put_rx_frag(rq, &head_wi[i]);
for (pwi = head_wi; pwi < wi; pwi++)
pwi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE);
}
return NULL; /* page/packet was consumed by XDP */
}
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/* Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
#include <linux/debugfs.h>
#include "eswitch.h"
enum vnic_diag_counter {
MLX5_VNIC_DIAG_TOTAL_Q_UNDER_PROCESSOR_HANDLE,
MLX5_VNIC_DIAG_SEND_QUEUE_PRIORITY_UPDATE_FLOW,
MLX5_VNIC_DIAG_COMP_EQ_OVERRUN,
MLX5_VNIC_DIAG_ASYNC_EQ_OVERRUN,
MLX5_VNIC_DIAG_CQ_OVERRUN,
MLX5_VNIC_DIAG_INVALID_COMMAND,
MLX5_VNIC_DIAG_QOUTA_EXCEEDED_COMMAND,
MLX5_VNIC_DIAG_RX_STEERING_DISCARD,
};
static int mlx5_esw_query_vnic_diag(struct mlx5_vport *vport, enum vnic_diag_counter counter,
u64 *val)
{
u32 out[MLX5_ST_SZ_DW(query_vnic_env_out)] = {};
u32 in[MLX5_ST_SZ_DW(query_vnic_env_in)] = {};
struct mlx5_core_dev *dev = vport->dev;
u16 vport_num = vport->vport;
void *vnic_diag_out;
int err;
MLX5_SET(query_vnic_env_in, in, opcode, MLX5_CMD_OP_QUERY_VNIC_ENV);
MLX5_SET(query_vnic_env_in, in, vport_number, vport_num);
if (!mlx5_esw_is_manager_vport(dev->priv.eswitch, vport_num))
MLX5_SET(query_vnic_env_in, in, other_vport, 1);
err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
if (err)
return err;
vnic_diag_out = MLX5_ADDR_OF(query_vnic_env_out, out, vport_env);
switch (counter) {
case MLX5_VNIC_DIAG_TOTAL_Q_UNDER_PROCESSOR_HANDLE:
*val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, total_error_queues);
break;
case MLX5_VNIC_DIAG_SEND_QUEUE_PRIORITY_UPDATE_FLOW:
*val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out,
send_queue_priority_update_flow);
break;
case MLX5_VNIC_DIAG_COMP_EQ_OVERRUN:
*val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, comp_eq_overrun);
break;
case MLX5_VNIC_DIAG_ASYNC_EQ_OVERRUN:
*val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, async_eq_overrun);
break;
case MLX5_VNIC_DIAG_CQ_OVERRUN:
*val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, cq_overrun);
break;
case MLX5_VNIC_DIAG_INVALID_COMMAND:
*val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, invalid_command);
break;
case MLX5_VNIC_DIAG_QOUTA_EXCEEDED_COMMAND:
*val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, quota_exceeded_command);
break;
case MLX5_VNIC_DIAG_RX_STEERING_DISCARD:
*val = MLX5_GET64(vnic_diagnostic_statistics, vnic_diag_out,
nic_receive_steering_discard);
break;
}
return 0;
}
static int __show_vnic_diag(struct seq_file *file, struct mlx5_vport *vport,
enum vnic_diag_counter type)
{
u64 val = 0;
int ret;
ret = mlx5_esw_query_vnic_diag(vport, type, &val);
if (ret)
return ret;
seq_printf(file, "%llu\n", val);
return 0;
}
static int total_q_under_processor_handle_show(struct seq_file *file, void *priv)
{
return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_TOTAL_Q_UNDER_PROCESSOR_HANDLE);
}
static int send_queue_priority_update_flow_show(struct seq_file *file, void *priv)
{
return __show_vnic_diag(file, file->private,
MLX5_VNIC_DIAG_SEND_QUEUE_PRIORITY_UPDATE_FLOW);
}
static int comp_eq_overrun_show(struct seq_file *file, void *priv)
{
return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_COMP_EQ_OVERRUN);
}
static int async_eq_overrun_show(struct seq_file *file, void *priv)
{
return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_ASYNC_EQ_OVERRUN);
}
static int cq_overrun_show(struct seq_file *file, void *priv)
{
return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_CQ_OVERRUN);
}
static int invalid_command_show(struct seq_file *file, void *priv)
{
return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_INVALID_COMMAND);
}
static int quota_exceeded_command_show(struct seq_file *file, void *priv)
{
return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_QOUTA_EXCEEDED_COMMAND);
}
static int rx_steering_discard_show(struct seq_file *file, void *priv)
{
return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_RX_STEERING_DISCARD);
}
DEFINE_SHOW_ATTRIBUTE(total_q_under_processor_handle);
DEFINE_SHOW_ATTRIBUTE(send_queue_priority_update_flow);
DEFINE_SHOW_ATTRIBUTE(comp_eq_overrun);
DEFINE_SHOW_ATTRIBUTE(async_eq_overrun);
DEFINE_SHOW_ATTRIBUTE(cq_overrun);
DEFINE_SHOW_ATTRIBUTE(invalid_command);
DEFINE_SHOW_ATTRIBUTE(quota_exceeded_command);
DEFINE_SHOW_ATTRIBUTE(rx_steering_discard);
void mlx5_esw_vport_debugfs_destroy(struct mlx5_eswitch *esw, u16 vport_num)
{
struct mlx5_vport *vport = mlx5_eswitch_get_vport(esw, vport_num);
debugfs_remove_recursive(vport->dbgfs);
vport->dbgfs = NULL;
}
/* vnic diag dir name is "pf", "ecpf" or "{vf/sf}_xxxx" */
#define VNIC_DIAG_DIR_NAME_MAX_LEN 8
void mlx5_esw_vport_debugfs_create(struct mlx5_eswitch *esw, u16 vport_num, bool is_sf, u16 sf_num)
{
struct mlx5_vport *vport = mlx5_eswitch_get_vport(esw, vport_num);
struct dentry *vnic_diag;
char dir_name[VNIC_DIAG_DIR_NAME_MAX_LEN];
int err;
if (!MLX5_CAP_GEN(esw->dev, vport_group_manager))
return;
if (vport_num == MLX5_VPORT_PF) {
strcpy(dir_name, "pf");
} else if (vport_num == MLX5_VPORT_ECPF) {
strcpy(dir_name, "ecpf");
} else {
err = snprintf(dir_name, VNIC_DIAG_DIR_NAME_MAX_LEN, "%s_%d", is_sf ? "sf" : "vf",
is_sf ? sf_num : vport_num - MLX5_VPORT_FIRST_VF);
if (WARN_ON(err < 0))
return;
}
vport->dbgfs = debugfs_create_dir(dir_name, esw->dbgfs);
vnic_diag = debugfs_create_dir("vnic_diag", vport->dbgfs);
if (MLX5_CAP_GEN(esw->dev, vnic_env_queue_counters)) {
debugfs_create_file("total_q_under_processor_handle", 0444, vnic_diag, vport,
&total_q_under_processor_handle_fops);
debugfs_create_file("send_queue_priority_update_flow", 0444, vnic_diag, vport,
&send_queue_priority_update_flow_fops);
}
if (MLX5_CAP_GEN(esw->dev, eq_overrun_count)) {
debugfs_create_file("comp_eq_overrun", 0444, vnic_diag, vport,
&comp_eq_overrun_fops);
debugfs_create_file("async_eq_overrun", 0444, vnic_diag, vport,
&async_eq_overrun_fops);
}
if (MLX5_CAP_GEN(esw->dev, vnic_env_cq_overrun))
debugfs_create_file("cq_overrun", 0444, vnic_diag, vport, &cq_overrun_fops);
if (MLX5_CAP_GEN(esw->dev, invalid_command_count))
debugfs_create_file("invalid_command", 0444, vnic_diag, vport,
&invalid_command_fops);
if (MLX5_CAP_GEN(esw->dev, quota_exceeded_count))
debugfs_create_file("quota_exceeded_command", 0444, vnic_diag, vport,
&quota_exceeded_command_fops);
if (MLX5_CAP_GEN(esw->dev, nic_receive_steering_discard))
debugfs_create_file("rx_steering_discard", 0444, vnic_diag, vport,
&rx_steering_discard_fops);
}
@@ -36,7 +36,6 @@
#include <linux/mlx5/vport.h>
#include <linux/mlx5/fs.h>
#include <linux/mlx5/mpfs.h>
#include <linux/debugfs.h>
#include "esw/acl/lgcy.h"
#include "esw/legacy.h"
#include "esw/qos.h"
@@ -1056,7 +1055,6 @@ int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num,
if (err)
return err;
mlx5_esw_vport_debugfs_create(esw, vport_num, false, 0);
err = esw_offloads_load_rep(esw, vport_num);
if (err)
goto err_rep;
@@ -1064,7 +1062,6 @@ int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num,
return err;
err_rep:
mlx5_esw_vport_debugfs_destroy(esw, vport_num);
mlx5_esw_vport_disable(esw, vport_num);
return err;
}
@@ -1072,7 +1069,6 @@ int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num,
void mlx5_eswitch_unload_vport(struct mlx5_eswitch *esw, u16 vport_num)
{
esw_offloads_unload_rep(esw, vport_num);
mlx5_esw_vport_debugfs_destroy(esw, vport_num);
mlx5_esw_vport_disable(esw, vport_num);
}
@@ -1510,7 +1506,7 @@ int mlx5_esw_sf_max_hpf_functions(struct mlx5_core_dev *dev, u16 *max_sfs, u16 *
return err;
}
static int mlx5_esw_vport_alloc(struct mlx5_eswitch *esw, struct mlx5_core_dev *dev,
static int mlx5_esw_vport_alloc(struct mlx5_eswitch *esw,
int index, u16 vport_num)
{
struct mlx5_vport *vport;
@@ -1564,7 +1560,7 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw)
xa_init(&esw->vports);
err = mlx5_esw_vport_alloc(esw, dev, idx, MLX5_VPORT_PF);
err = mlx5_esw_vport_alloc(esw, idx, MLX5_VPORT_PF);
if (err)
goto err;
if (esw->first_host_vport == MLX5_VPORT_PF)
@@ -1572,7 +1568,7 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw)
idx++;
for (i = 0; i < mlx5_core_max_vfs(dev); i++) {
err = mlx5_esw_vport_alloc(esw, dev, idx, idx);
err = mlx5_esw_vport_alloc(esw, idx, idx);
if (err)
goto err;
xa_set_mark(&esw->vports, idx, MLX5_ESW_VPT_VF);
@@ -1581,7 +1577,7 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw)
}
base_sf_num = mlx5_sf_start_function_id(dev);
for (i = 0; i < mlx5_sf_max_functions(dev); i++) {
err = mlx5_esw_vport_alloc(esw, dev, idx, base_sf_num + i);
err = mlx5_esw_vport_alloc(esw, idx, base_sf_num + i);
if (err)
goto err;
xa_set_mark(&esw->vports, base_sf_num + i, MLX5_ESW_VPT_SF);
@@ -1592,7 +1588,7 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw)
if (err)
goto err;
for (i = 0; i < max_host_pf_sfs; i++) {
err = mlx5_esw_vport_alloc(esw, dev, idx, base_sf_num + i);
err = mlx5_esw_vport_alloc(esw, idx, base_sf_num + i);
if (err)
goto err;
xa_set_mark(&esw->vports, base_sf_num + i, MLX5_ESW_VPT_SF);
@@ -1600,12 +1596,12 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw)
}
if (mlx5_ecpf_vport_exists(dev)) {
err = mlx5_esw_vport_alloc(esw, dev, idx, MLX5_VPORT_ECPF);
err = mlx5_esw_vport_alloc(esw, idx, MLX5_VPORT_ECPF);
if (err)
goto err;
idx++;
}
err = mlx5_esw_vport_alloc(esw, dev, idx, MLX5_VPORT_UPLINK);
err = mlx5_esw_vport_alloc(esw, idx, MLX5_VPORT_UPLINK);
if (err)
goto err;
return 0;
@@ -1672,7 +1668,6 @@ int mlx5_eswitch_init(struct mlx5_core_dev *dev)
dev->priv.eswitch = esw;
BLOCKING_INIT_NOTIFIER_HEAD(&esw->n_head);
esw->dbgfs = debugfs_create_dir("esw", mlx5_debugfs_get_dev_root(esw->dev));
esw_info(dev,
"Total vports %d, per vport: max uc(%d) max mc(%d)\n",
esw->total_vports,
@@ -1696,7 +1691,6 @@ void mlx5_eswitch_cleanup(struct mlx5_eswitch *esw)
esw_info(esw->dev, "cleanup\n");
debugfs_remove_recursive(esw->dbgfs);
esw->dev->priv.eswitch = NULL;
destroy_workqueue(esw->work_queue);
WARN_ON(refcount_read(&esw->qos.refcnt));
@@ -195,7 +195,6 @@ struct mlx5_vport {
enum mlx5_eswitch_vport_event enabled_events;
int index;
struct devlink_port *dl_port;
struct dentry *dbgfs;
};
struct mlx5_esw_indir_table;
@@ -343,7 +342,6 @@ struct mlx5_eswitch {
u32 large_group_num;
} params;
struct blocking_notifier_head n_head;
struct dentry *dbgfs;
};
void esw_offloads_disable(struct mlx5_eswitch *esw);
@@ -356,7 +354,6 @@ mlx5_eswitch_add_send_to_vport_meta_rule(struct mlx5_eswitch *esw, u16 vport_num
void mlx5_eswitch_del_send_to_vport_meta_rule(struct mlx5_flow_handle *rule);
bool mlx5_esw_vport_match_metadata_supported(const struct mlx5_eswitch *esw);
int mlx5_esw_offloads_vport_metadata_set(struct mlx5_eswitch *esw, bool enable);
u32 mlx5_esw_match_metadata_alloc(struct mlx5_eswitch *esw);
void mlx5_esw_match_metadata_free(struct mlx5_eswitch *esw, u32 metadata);
@@ -704,9 +701,6 @@ int mlx5_esw_offloads_devlink_port_register(struct mlx5_eswitch *esw, u16 vport_
void mlx5_esw_offloads_devlink_port_unregister(struct mlx5_eswitch *esw, u16 vport_num);
struct devlink_port *mlx5_esw_offloads_devlink_port(struct mlx5_eswitch *esw, u16 vport_num);
void mlx5_esw_vport_debugfs_create(struct mlx5_eswitch *esw, u16 vport_num, bool is_sf, u16 sf_num);
void mlx5_esw_vport_debugfs_destroy(struct mlx5_eswitch *esw, u16 vport_num);
int mlx5_esw_devlink_sf_port_register(struct mlx5_eswitch *esw, struct devlink_port *dl_port,
u16 vport_num, u32 controller, u32 sfnum);
void mlx5_esw_devlink_sf_port_unregister(struct mlx5_eswitch *esw, u16 vport_num);
@@ -2939,28 +2939,6 @@ static int esw_offloads_metadata_init(struct mlx5_eswitch *esw)
return err;
}
int mlx5_esw_offloads_vport_metadata_set(struct mlx5_eswitch *esw, bool enable)
{
int err = 0;
down_write(&esw->mode_lock);
if (mlx5_esw_is_fdb_created(esw)) {
err = -EBUSY;
goto done;
}
if (!mlx5_esw_vport_match_metadata_supported(esw)) {
err = -EOPNOTSUPP;
goto done;
}
if (enable)
esw->flags |= MLX5_ESWITCH_VPORT_MATCH_METADATA;
else
esw->flags &= ~MLX5_ESWITCH_VPORT_MATCH_METADATA;
done:
up_write(&esw->mode_lock);
return err;
}
int
esw_vport_create_offloads_acl_tables(struct mlx5_eswitch *esw,
struct mlx5_vport *vport)
@@ -3828,14 +3806,12 @@ int mlx5_esw_offloads_sf_vport_enable(struct mlx5_eswitch *esw, struct devlink_p
if (err)
goto devlink_err;
mlx5_esw_vport_debugfs_create(esw, vport_num, true, sfnum);
err = mlx5_esw_offloads_rep_load(esw, vport_num);
if (err)
goto rep_err;
return 0;
rep_err:
mlx5_esw_vport_debugfs_destroy(esw, vport_num);
mlx5_esw_devlink_sf_port_unregister(esw, vport_num);
devlink_err:
mlx5_esw_vport_disable(esw, vport_num);
@@ -3845,7 +3821,6 @@ int mlx5_esw_offloads_sf_vport_enable(struct mlx5_eswitch *esw, struct devlink_p
void mlx5_esw_offloads_sf_vport_disable(struct mlx5_eswitch *esw, u16 vport_num)
{
mlx5_esw_offloads_rep_unload(esw, vport_num);
mlx5_esw_vport_debugfs_destroy(esw, vport_num);
mlx5_esw_devlink_sf_port_unregister(esw, vport_num);
mlx5_esw_vport_disable(esw, vport_num);
}
@@ -42,6 +42,7 @@
#include "lib/pci_vsc.h"
#include "lib/tout.h"
#include "diag/fw_tracer.h"
#include "diag/reporter_vnic.h"
enum {
MAX_MISSES = 3,
@@ -898,6 +899,7 @@ void mlx5_health_cleanup(struct mlx5_core_dev *dev)
cancel_delayed_work_sync(&health->update_fw_log_ts_work);
destroy_workqueue(health->wq);
mlx5_reporter_vnic_destroy(dev);
mlx5_fw_reporters_destroy(dev);
}
@@ -907,6 +909,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
char *name;
mlx5_fw_reporters_create(dev);
mlx5_reporter_vnic_create(dev);
health = &dev->priv.health;
name = kmalloc(64, GFP_KERNEL);
@@ -926,6 +929,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
return 0;
out_err:
mlx5_reporter_vnic_destroy(dev);
mlx5_fw_reporters_destroy(dev);
return -ENOMEM;
}
@@ -717,7 +717,7 @@ static int handle_hca_cap_port_selection(struct mlx5_core_dev *dev,
MLX5_ST_SZ_BYTES(port_selection_cap));
MLX5_SET(port_selection_cap, set_hca_cap, port_select_flow_table_bypass, 1);
err = set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MODE_PORT_SELECTION);
err = set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_PORT_SELECTION);
return err;
}
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/* Copyright (c) 2019 Mellanox Technologies. */
#include <linux/pci.h>
#include <linux/interrupt.h>
#include <linux/notifier.h>
#include <linux/mlx5/driver.h>
@@ -4,6 +4,7 @@
#include <linux/debugfs.h>
#include <linux/kernel.h>
#include <linux/seq_file.h>
#include <linux/version.h>
#include "dr_types.h"
#define DR_DBG_PTR_TO_ID(p) ((u64)(uintptr_t)(p) & 0xFFFFFFFFULL)
@@ -153,14 +154,16 @@ dr_dump_rule_action_mem(struct seq_file *file, const u64 rule_id,
DR_DUMP_REC_TYPE_ACTION_MODIFY_HDR, action_id,
rule_id, action->rewrite->index,
action->rewrite->single_action_opt,
action->rewrite->num_of_actions,
ptrn_arg ? action->rewrite->num_of_actions : 0,
ptrn_arg ? ptrn->index : 0,
ptrn_arg ? mlx5dr_arg_get_obj_id(arg) : 0);
if (ptrn_arg) {
for (i = 0; i < action->rewrite->num_of_actions; i++) {
seq_printf(file, ",0x%016llx",
be64_to_cpu(((__be64 *)rewrite_data)[i]));
}
}
seq_puts(file, "\n");
break;
@@ -630,9 +633,18 @@ dr_dump_domain(struct seq_file *file, struct mlx5dr_domain *dmn)
u64 domain_id = DR_DBG_PTR_TO_ID(dmn);
int ret;
seq_printf(file, "%d,0x%llx,%d,0%x,%d,%s\n", DR_DUMP_REC_TYPE_DOMAIN,
seq_printf(file, "%d,0x%llx,%d,0%x,%d,%u.%u.%u,%s,%d,%u,%u,%u\n",
DR_DUMP_REC_TYPE_DOMAIN,
domain_id, dmn->type, dmn->info.caps.gvmi,
dmn->info.supp_sw_steering, pci_name(dmn->mdev->pdev));
dmn->info.supp_sw_steering,
/* package version */
LINUX_VERSION_MAJOR, LINUX_VERSION_PATCHLEVEL,
LINUX_VERSION_SUBLEVEL,
pci_name(dmn->mdev->pdev),
0, /* domain flags */
dmn->num_buddies[DR_ICM_TYPE_STE],
dmn->num_buddies[DR_ICM_TYPE_MODIFY_ACTION],
dmn->num_buddies[DR_ICM_TYPE_MODIFY_HDR_PTRN]);
ret = dr_dump_domain_info(file, &dmn->info, domain_id);
if (ret < 0)
@@ -4,7 +4,9 @@
#include "dr_types.h"
#define DR_ICM_MODIFY_HDR_ALIGN_BASE 64
#define DR_ICM_POOL_HOT_MEMORY_FRACTION 4
#define DR_ICM_POOL_STE_HOT_MEM_PERCENT 25
#define DR_ICM_POOL_MODIFY_HDR_PTRN_HOT_MEM_PERCENT 50
#define DR_ICM_POOL_MODIFY_ACTION_HOT_MEM_PERCENT 90
struct mlx5dr_icm_hot_chunk {
struct mlx5dr_icm_buddy_mem *buddy_mem;
@@ -29,6 +31,8 @@ struct mlx5dr_icm_pool {
struct mlx5dr_icm_hot_chunk *hot_chunks_arr;
u32 hot_chunks_num;
u64 hot_memory_size;
/* hot memory size threshold for triggering sync */
u64 th;
};
struct mlx5dr_icm_dm {
@@ -284,6 +288,8 @@ static int dr_icm_buddy_create(struct mlx5dr_icm_pool *pool)
/* add it to the -start- of the list in order to search in it first */
list_add(&buddy->list_node, &pool->buddy_mem_list);
pool->dmn->num_buddies[pool->icm_type]++;
return 0;
err_cleanup_buddy:
@@ -297,13 +303,17 @@ static int dr_icm_buddy_create(struct mlx5dr_icm_pool *pool)
static void dr_icm_buddy_destroy(struct mlx5dr_icm_buddy_mem *buddy)
{
enum mlx5dr_icm_type icm_type = buddy->pool->icm_type;
dr_icm_pool_mr_destroy(buddy->icm_mr);
mlx5dr_buddy_cleanup(buddy);
if (buddy->pool->icm_type == DR_ICM_TYPE_STE)
if (icm_type == DR_ICM_TYPE_STE)
dr_icm_buddy_cleanup_ste_cache(buddy);
buddy->pool->dmn->num_buddies[icm_type]--;
kvfree(buddy);
}
@@ -330,15 +340,7 @@ dr_icm_chunk_init(struct mlx5dr_icm_chunk *chunk,
static bool dr_icm_pool_is_sync_required(struct mlx5dr_icm_pool *pool)
{
int allow_hot_size;
/* sync when hot memory reaches a certain fraction of the pool size */
allow_hot_size =
mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz,
pool->icm_type) /
DR_ICM_POOL_HOT_MEMORY_FRACTION;
return pool->hot_memory_size > allow_hot_size;
return pool->hot_memory_size > pool->th;
}
static void dr_icm_pool_clear_hot_chunks_arr(struct mlx5dr_icm_pool *pool)
@@ -503,8 +505,9 @@ void mlx5dr_icm_pool_free_htbl(struct mlx5dr_icm_pool *pool, struct mlx5dr_ste_h
struct mlx5dr_icm_pool *mlx5dr_icm_pool_create(struct mlx5dr_domain *dmn,
enum mlx5dr_icm_type icm_type)
{
u32 num_of_chunks, entry_size, max_hot_size;
u32 num_of_chunks, entry_size;
struct mlx5dr_icm_pool *pool;
u32 max_hot_size = 0;
pool = kvzalloc(sizeof(*pool), GFP_KERNEL);
if (!pool)
@@ -520,12 +523,21 @@ struct mlx5dr_icm_pool *mlx5dr_icm_pool_create(struct mlx5dr_domain *dmn,
switch (icm_type) {
case DR_ICM_TYPE_STE:
pool->max_log_chunk_sz = dmn->info.max_log_sw_icm_sz;
max_hot_size = mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz,
pool->icm_type) *
DR_ICM_POOL_STE_HOT_MEM_PERCENT / 100;
break;
case DR_ICM_TYPE_MODIFY_ACTION:
pool->max_log_chunk_sz = dmn->info.max_log_action_icm_sz;
max_hot_size = mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz,
pool->icm_type) *
DR_ICM_POOL_MODIFY_ACTION_HOT_MEM_PERCENT / 100;
break;
case DR_ICM_TYPE_MODIFY_HDR_PTRN:
pool->max_log_chunk_sz = dmn->info.max_log_modify_hdr_pattern_icm_sz;
max_hot_size = mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz,
pool->icm_type) *
DR_ICM_POOL_MODIFY_HDR_PTRN_HOT_MEM_PERCENT / 100;
break;
default:
WARN_ON(icm_type);
@@ -533,11 +545,8 @@ struct mlx5dr_icm_pool *mlx5dr_icm_pool_create(struct mlx5dr_domain *dmn,
entry_size = mlx5dr_icm_pool_dm_type_to_entry_size(pool->icm_type);
max_hot_size = mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz,
pool->icm_type) /
DR_ICM_POOL_HOT_MEMORY_FRACTION;
num_of_chunks = DIV_ROUND_UP(max_hot_size, entry_size) + 1;
pool->th = max_hot_size;
pool->hot_chunks_arr = kvcalloc(num_of_chunks,
sizeof(struct mlx5dr_icm_hot_chunk),
@@ -72,6 +72,7 @@ enum mlx5dr_icm_type {
DR_ICM_TYPE_STE,
DR_ICM_TYPE_MODIFY_ACTION,
DR_ICM_TYPE_MODIFY_HDR_PTRN,
DR_ICM_TYPE_MAX,
};
static inline enum mlx5dr_icm_chunk_size
@@ -955,6 +956,8 @@ struct mlx5dr_domain {
struct list_head dbg_tbl_list;
struct mlx5dr_dbg_dump_info dump_info;
struct xarray definers_xa;
/* memory management statistics */
u32 num_buddies[DR_ICM_TYPE_MAX];
};
struct mlx5dr_table_rx_tx {
@@ -439,6 +439,7 @@ struct mlx5_core_health {
struct work_struct report_work;
struct devlink_health_reporter *fw_reporter;
struct devlink_health_reporter *fw_fatal_reporter;
struct devlink_health_reporter *vnic_reporter;
struct delayed_work update_fw_log_ts_work;
};
@@ -69,7 +69,7 @@ enum {
MLX5_SET_HCA_CAP_OP_MOD_ATOMIC = 0x3,
MLX5_SET_HCA_CAP_OP_MOD_ROCE = 0x4,
MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE2 = 0x20,
MLX5_SET_HCA_CAP_OP_MODE_PORT_SELECTION = 0x25,
MLX5_SET_HCA_CAP_OP_MOD_PORT_SELECTION = 0x25,
};
enum {