Commits · 5056860cf8eab681a08913ec03432c57c83e71be · Kirill Smelkov / linux

19 Jun, 2024 9 commits

dt-bindings: net: Add IEP interrupt · 5056860c

Diogo Ivo authored Jun 17, 2024

The IEP interrupt is used in order to support both capture events, where
an incoming external signal gets timestamped on arrival, and compare
events, where an interrupt is generated internally when the IEP counter
reaches a programmed value.
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: ti: icss-iep: Remove spinlock-based synchronization · 5758e03c

Diogo Ivo authored Jun 17, 2024

As all sources of concurrency in hardware register access occur in
non-interrupt context, eliminate spinlock-based synchronization and
rely on the mutex-based synchronization that is already present.
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: ti: icssg-prueth: Enable PTP timestamping support for SR1.0 devices · 5e1e4389

Diogo Ivo authored Jun 17, 2024

Enable PTP support for AM65x SR1.0 devices by registering with the IEP
infrastructure in order to expose a PTP clock to userspace.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


rds:Simplify the allocation of slab caches · 9f1f70dd

Hongfu Li authored Jun 17, 2024

Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.
Signed-off-by: Hongfu Li <lihongfu@kylinos.cn>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
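
As a rough userspace illustration of what the macro saves: to the best of my understanding, KMEM_CACHE() derives the cache name, object size and alignment from the struct type itself, so the caller only names the struct and passes flags. The sketch below mimics that shape with a stand-in; `demo_cache`, `demo_cache_create()`, `DEMO_KMEM_CACHE()` and `struct demo_conn` are hypothetical names invented for this example and are not kernel or rds code.

```c
#include <stddef.h>

/* Hypothetical record of what a cache-creation call was asked for. */
struct demo_cache {
	const char *name;
	size_t size;
	size_t align;
	unsigned int flags;
};

/* Stand-in for kmem_cache_create(): it only records its arguments. */
static struct demo_cache demo_cache_create(const char *name, size_t size,
					   size_t align, unsigned int flags)
{
	struct demo_cache c = { name, size, align, flags };

	return c;
}

/*
 * Mimics the shape of KMEM_CACHE(): the struct tag is stringified for
 * the cache name, and size and alignment come from the type.
 */
#define DEMO_KMEM_CACHE(type_name, cache_flags)			\
	demo_cache_create(#type_name, sizeof(struct type_name),	\
			  _Alignof(struct type_name), (cache_flags))

/* Hypothetical object type that would live in the cache. */
struct demo_conn {
	long id;
	char state;
};

struct demo_cache demo_make_conn_cache(void)
{
	/* One argument pair replaces the explicit name/size/align triple. */
	return DEMO_KMEM_CACHE(demo_conn, 0);
}
```

The benefit is mostly that the name, size and alignment can no longer drift out of sync with the struct definition.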


net: mana: Add support for page sizes other than 4KB on ARM64 · 382d1741

Haiyang Zhang authored Jun 17, 2024

As defined by the MANA Hardware spec, the queue size for DMA must be
at least 4KB and a power of 2. The HWC queue size has to be exactly
4KB.

To support page sizes other than 4KB on ARM64, define the minimal
queue size as a macro separate from PAGE_SIZE, which was always
assumed to be 4KB before ARM64 support was added.

Also, add MANA specific macros and update code related to size
alignment, DMA region calculations, etc.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/1718655446-6576-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
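
The size rules in this message can be sketched in plain C, independent of the kernel's PAGE_SIZE. This is a minimal sketch under stated assumptions: `DEMO_MANA_PAGE_SIZE` and the `demo_*` helpers are hypothetical names chosen for illustration and are not the macros the patch actually adds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed hardware page size, decoupled from the CPU page size. */
#define DEMO_MANA_PAGE_SIZE ((size_t)4096)

/* A DMA queue must be at least 4KB and a power of 2. */
bool demo_queue_size_ok(size_t size)
{
	return size >= DEMO_MANA_PAGE_SIZE && (size & (size - 1)) == 0;
}

/* The HWC queue must be exactly 4KB. */
bool demo_hwc_queue_size_ok(size_t size)
{
	return size == DEMO_MANA_PAGE_SIZE;
}

/* Round a length up to the next 4KB boundary. */
size_t demo_align_up(size_t len)
{
	return (len + DEMO_MANA_PAGE_SIZE - 1) & ~(DEMO_MANA_PAGE_SIZE - 1);
}

/* Number of 4KB hardware pages needed to describe a DMA region. */
size_t demo_region_pages(size_t len)
{
	return demo_align_up(len) / DEMO_MANA_PAGE_SIZE;
}
```

With a 64KB CPU page, a single page-sized buffer would then be described as 16 hardware pages rather than 1, which is the kind of calculation that goes wrong when PAGE_SIZE is assumed to be 4KB.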


Merge branch 'net-mlx4_en-use-ethtool_puts-sprintf' · 2c6a4b96

Jakub Kicinski authored Jun 18, 2024

Kamal Heib says:

====================
net/mlx4_en: Use ethtool_puts/sprintf

This patchset updates the mlx4_en driver to use the ethtool_puts and
ethtool_sprintf helper functions.
Signed-off-by: Kamal Heib <kheib@redhat.com>
====================

Link: https://lore.kernel.org/r/20240617172329.239819-1-kheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


net/mlx4_en: Use ethtool_puts/sprintf to fill stats strings · 6c7dd432

Kamal Heib authored Jun 17, 2024

Use the ethtool_puts/ethtool_sprintf helper to print the stats strings
into the ethtool strings interface.
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240617172329.239819-4-kheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
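
For readers unfamiliar with these helpers, the pattern they encapsulate is roughly: write one string into a fixed-size slot of the strings buffer, then advance the cursor by one slot. The userspace sketch below imitates that behaviour as I understand it; `demo_puts()`, `demo_sprintf()` and `DEMO_GSTRING_LEN` are stand-ins written for this example (the slot size mirrors ETH_GSTRING_LEN, which is 32), not the kernel implementations.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Size of one string slot; mirrors the kernel's ETH_GSTRING_LEN. */
#define DEMO_GSTRING_LEN 32

/* Copy a fixed string into the current slot and step to the next slot. */
void demo_puts(uint8_t **data, const char *str)
{
	snprintf((char *)*data, DEMO_GSTRING_LEN, "%s", str);
	*data += DEMO_GSTRING_LEN;
}

/* Format a string into the current slot and step to the next slot. */
void demo_sprintf(uint8_t **data, const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf((char *)*data, DEMO_GSTRING_LEN, fmt, args);
	va_end(args);
	*data += DEMO_GSTRING_LEN;
}
```

A driver-style caller would keep a single cursor into the buffer and call the plain variant for constant names and the formatting variant for per-queue names, instead of tracking an index and multiplying by the slot size by hand.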


net/mlx4_en: Use ethtool_puts to fill selftest strings · 4454929c

Kamal Heib authored Jun 17, 2024

Use the ethtool_puts helper to print the selftest strings into the
ethtool strings interface.
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240617172329.239819-3-kheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


net/mlx4_en: Use ethtool_puts to fill priv flags strings · e52e0103

Kamal Heib authored Jun 17, 2024

Use the ethtool_puts helper to print the priv flags strings into the
ethtool strings interface.
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240617172329.239819-2-kheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


18 Jun, 2024 9 commits

net: microchip: Constify struct vcap_operations · 8c379e3c

Christophe JAILLET authored Jun 16, 2024

"struct vcap_operations" are not modified in these drivers.

Constifying this structure moves some data to a read-only section, which
increases overall security.

In order to do it, "struct vcap_control" also needs to be adjusted to this
new const qualifier.

As an example, on x86_64, with allmodconfig:
Before:
======
   text	   data	    bss	    dec	    hex	filename
  15176	   1094	     16	  16286	   3f9e	drivers/net/ethernet/microchip/lan966x/lan966x_vcap_impl.o

After:
=====
   text	   data	    bss	    dec	    hex	filename
  15268	    998	     16	  16282	   3f9a	drivers/net/ethernet/microchip/lan966x/lan966x_vcap_impl.o
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Link: https://lore.kernel.org/r/d8e76094d2e98ebb5bfc8205799b3a9db0b46220.1718524644.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
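
The constification pattern described here can be illustrated with a small standalone sketch: an operations table that is never modified is declared const, and the structure that refers to it holds a pointer-to-const. All names below (`demo_ops`, `demo_control`, and so on) are hypothetical and only stand in for the roles of "struct vcap_operations" and "struct vcap_control".

```c
/* Hypothetical operations table: function pointers that never change. */
struct demo_ops {
	int (*add)(int a, int b);
};

static int demo_add(int a, int b)
{
	return a + b;
}

/* Declared const, so the table can be placed in a read-only section. */
const struct demo_ops demo_default_ops = {
	.add = demo_add,
};

/* The user of the table must hold a pointer-to-const to accept it. */
struct demo_control {
	const struct demo_ops *ops;
};

int demo_run(const struct demo_control *ctrl, int a, int b)
{
	return ctrl->ops->add(a, b);
}
```

Because the function pointers are then no longer in writable data, an errant or malicious write cannot redirect them, which is the security argument the message makes; the size tables show the corresponding shift from data to text.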


Merge branch 'introduce-phy-mode-10g-qxgmii' · e845bb84

Paolo Abeni authored Jun 18, 2024

Luo Jie says:

====================
Introduce PHY mode 10G-QXGMII

This patch series adds 10G-QXGMII mode for PHY driver. The patch
series is split from the QCA8084 PHY driver patch series below.
https://lore.kernel.org/all/20231215074005.26976-1-quic_luoj@quicinc.com/

Per Andrew Lunn’s advice, submitting this patch series for acceptance
as they already include the necessary 'Reviewed-by:' tags. This way,
they need not wait for QCA8084 series patches to conclude review.

Changes in v2:
	* remove PHY_INTERFACE_MODE_10G_QXGMII from workaround of
	  validation in the phylink_validate_phy. 10G_QXGMII will
	  be set into phy->possible_interfaces in its .config_init
	  method of PHY driver that supports it.
====================

Link: https://lore.kernel.org/r/20240615120028.2384732-1-quic_luoj@quicinc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


dt-bindings: net: ethernet-controller: add 10g-qxgmii mode · 5dfabcdd

Vladimir Oltean authored Jun 15, 2024

Add the new interface mode 10g-qxgmii, which is similar to
usxgmii but extended to 4 channels to support a maximum of 4
ports with link speeds of 10M/100M/1G/2.5G.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Luo Jie <quic_luoj@quicinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


net: phy: introduce core support for phy-mode = "10g-qxgmii" · 777b8afb

Vladimir Oltean authored Jun 15, 2024

10G-QXGMII is a MAC-to-PHY interface defined by the USXGMII multiport
specification. It uses the same signaling as USXGMII, but it multiplexes
4 ports over the link, resulting in a maximum speed of 2.5G per port.

Some in-tree SoCs like the NXP LS1028A use "usxgmii" when they mean
either the single-port USXGMII or the quad-port 10G-QXGMII variant, and
they could get away just fine with that thus far. But there is a need to
distinguish between the 2 as far as SerDes drivers are concerned.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Luo Jie <quic_luoj@quicinc.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


net: stmmac: Enable TSO on VLANs · 041cc86b

Furong Xu authored Jun 15, 2024

The TSO engine works well when the frames are not VLAN tagged,
but it produces broken segments when frames are VLAN tagged.

The first segment is fine, while the second through the last
segments are broken: they lack the required VLAN tag.

An example here:
========
// 1st segment of a VLAN Tagged TSO frame, nothing wrong.
MacSrc > MacDst, ethertype 802.1Q (0x8100), length 1518: vlan 100, p 1, ethertype IPv4 (0x0800), HostA:42643 > HostB:5201: Flags [.], seq 1:1449

// 2nd to last segments of a VLAN Tagged TSO frame, VLAN tag is missing.
MacSrc > MacDst, ethertype IPv4 (0x0800), length 1514: HostA:42643 > HostB:5201: Flags [.], seq 1449:2897
MacSrc > MacDst, ethertype IPv4 (0x0800), length 1514: HostA:42643 > HostB:5201: Flags [.], seq 2897:4345
MacSrc > MacDst, ethertype IPv4 (0x0800), length 1514: HostA:42643 > HostB:5201: Flags [.], seq 4345:5793
MacSrc > MacDst, ethertype IPv4 (0x0800), length 1514: HostA:42643 > HostB:5201: Flags [P.], seq 5793:7241

// normal VLAN Tagged non-TSO frame, nothing wrong.
MacSrc > MacDst, ethertype 802.1Q (0x8100), length 1022: vlan 100, p 1, ethertype IPv4 (0x0800), HostA:42643 > HostB:5201: Flags [P.], seq 7241:8193
MacSrc > MacDst, ethertype 802.1Q (0x8100), length 70: vlan 100, p 1, ethertype IPv4 (0x0800), HostA:42643 > HostB:5201: Flags [F.], seq 8193
========

When transmitting VLAN tagged TSO frames, never let the HW insert the
VLAN tag; always insert the VLAN tag into the SKB payload instead. TSO
then works well on VLANs for all MAC cores.

Tested on DWMAC CORE 5.10a, DWMAC CORE 5.20a and DWXGMAC CORE 3.20a
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240615095611.517323-1-0x1207@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
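
What "insert the VLAN tag into the payload" means at the byte level can be sketched as follows: the 4-byte 802.1Q tag (TPID 0x8100 followed by the TCI) is placed right after the two MAC addresses, pushing the original ethertype and the rest of the frame back by 4 bytes. This is a standalone illustration on a raw buffer, not the driver's code; `demo_vlan_insert()` and the `DEMO_*` constants are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN    6      /* bytes in one MAC address */
#define DEMO_VLAN_HLEN   4      /* bytes in one 802.1Q tag */
#define DEMO_ETH_P_8021Q 0x8100 /* 802.1Q tag protocol identifier */

/*
 * Insert an 802.1Q tag after the destination and source MAC addresses.
 * 'frame' holds 'len' valid bytes in a buffer of 'cap' bytes.
 * Returns the new frame length, or 0 if the frame is too short or the
 * buffer has no room for the tag.
 */
size_t demo_vlan_insert(uint8_t *frame, size_t len, size_t cap, uint16_t tci)
{
	const size_t off = 2 * DEMO_ETH_ALEN;

	if (len < off + 2 || cap < len + DEMO_VLAN_HLEN)
		return 0;

	/* Shift the ethertype and payload back to make room for the tag. */
	memmove(frame + off + DEMO_VLAN_HLEN, frame + off, len - off);

	/* Tag fields are big-endian on the wire. */
	frame[off]     = (uint8_t)(DEMO_ETH_P_8021Q >> 8);
	frame[off + 1] = (uint8_t)(DEMO_ETH_P_8021Q & 0xff);
	frame[off + 2] = (uint8_t)(tci >> 8);
	frame[off + 3] = (uint8_t)(tci & 0xff);

	return len + DEMO_VLAN_HLEN;
}
```

For the capture above, "vlan 100, p 1" corresponds to a TCI of (1 << 13) | 100 = 0x2064, and the 4 extra bytes account for the difference between the 1518-byte tagged segment and the 1514-byte untagged ones.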


net: Move dev_set_hwtstamp_phylib to net/core/dev.h · efb45930

Kory Maincent authored Jun 12, 2024

This declaration was added to the header to be called from ethtool.
ethtool is separated from core for code organization, but it is not really
a separate entity; it controls very core things.
As ethtool is internal, it is not wise to have the declaration in netdevice.h.
Move the declaration to net/core/dev.h instead.

Remove the EXPORT_SYMBOL_GPL call as ethtool can not be built as a module.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://lore.kernel.org/r/20240612-feature_ptp_netnext-v15-2-b2a086257b63@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


net: dwc-xlgmac: fix missing MODULE_DESCRIPTION() warning · 0d9bb144

Jeff Johnson authored Jun 16, 2024

With ARCH=hexagon, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/synopsys/dwc-xlgmac.o

With most other ARCH settings the MODULE_DESCRIPTION() is provided by
the macro invocation in dwc-xlgmac-pci.c. However, for hexagon, the
PCI bus is not enabled, and hence CONFIG_DWC_XLGMAC_PCI is not set.
As a result, dwc-xlgmac-pci.c is not compiled, and hence is not linked
into dwc-xlgmac.o.

To avoid this issue, relocate the MODULE_DESCRIPTION() and other
related macros from dwc-xlgmac-pci.c to dwc-xlgmac-common.c, since
that file already has an existing MODULE_LICENSE() and it is
unconditionally linked into dwc-xlgmac.o.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240616-md-hexagon-drivers-net-ethernet-synopsys-v1-1-55852b60aef8@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


net: mana: Use mana_cleanup_port_context() for rxq cleanup · e275e19c

Shradha Gupta authored Jun 14, 2024

To clean up rxqs in port context structures, instead of duplicating the
code, use existing function mana_cleanup_port_context() which does
the exact cleanup that's needed.
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Link: https://lore.kernel.org/r/1718349548-28697-1-git-send-email-shradhagupta@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


fou: remove warn in gue_gro_receive on unsupported protocol · dd89a81d

Willem de Bruijn authored Jun 14, 2024

Drop the WARN_ON_ONCE in gue_gro_receive if the encapsulated type is
not known or does not have a GRO handler.

Such a packet is easily constructed. Syzbot generates them and sets
off this warning.

Remove the warning as it is expected and not actionable.

The warning was previously reduced from WARN_ON to WARN_ON_ONCE in
commit 27013661 ("fou: Do WARN_ON_ONCE in gue_gro_receive for bad
proto callbacks").
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240614122552.1649044-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


17 Jun, 2024 8 commits

Merge branch 'net-smc-IPPROTO_SMC' · 4314175a

David S. Miller authored Jun 17, 2024

D. Wythe says:

====================
Introduce IPPROTO_SMC

This patch allows creating SMC sockets via AF_INET,
as in the following code:

/* create v4 smc sock */
v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);

/* create v6 smc sock */
v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);

There are several reasons why we believe it is appropriate here:

1. SMC sockets actually use IPv4 (AF_INET) or IPv6 (AF_INET6)
addresses. There is no AF_SMC address at all.

2. Creating SMC sockets in the AF_INET(6) path allows us to reuse
the infrastructure of the AF_INET(6) path, such as common eBPF hooks.
Otherwise, SMC would have to implement it again in the AF_SMC path. For example:
  1. Replace IPPROTO_TCP with IPPROTO_SMC in the socket() syscall
     initiated by the user, without the use of LD_PRELOAD.
  2. Select whether immediate fallback is required based on peer's port/ip
     before connect().

A very significant result is that we can now use eBPF to implement smc_run
instead of LD_PRELOAD, which is completely ineffective in scenarios of static
linking.

Another potential value is that we are attempting to optimize the
performance of fallback socks, where merging socks is an important part,
and it relies on the creation of SMC sockets under the AF_INET path.
(More information :
https://lore.kernel.org/netdev/1699442703-25015-1-git-send-email-alibuda@linux.alibaba.com/T/)

v2 -> v1:

- Code formatting, mainly including alignment and annotation repair.
- move inet_smc proto ops to inet_smc.c, avoiding af_smc.c becoming too bulky.
- Fix the issue where refactoring affects the initialization order.
- Fix compile warning (unused out_inet_prot) while CONFIG_IPV6 was not set.

v3 -> v2:

- Add Alibaba's copyright information to the new file

v4 -> v3:

- Fix some spelling errors
- Align function naming style with smc_sock_init() to smc_sk_init()
- Reversing the order of the conditional checks on clcsock to make the code more intuitive

v5 -> v4:

- Fix some spelling errors
- Added comment, "/* CONFIG_IPV6 */", after the final #endif directive.
- Rename smc_inet.h and smc_inet.c to smc_inet.h and smc_inet.c
- Encapsulate the initialization and destruction of inet_smc in inet_smc.c,
  rather than implementing it directly in af_smc.c.
- Remove useless header files in smc_inet.h
- Make smc_inet_prot_xxx and smc_inet_sock_init() to be static, since it's
  only used in smc_inet.c

v6 -> v5:

- Wrapping lines to not exceed 80 characters
- Combine initialization and error handling of smc_inet6 into the same #if
  macro block.

v7 -> v6:

- Modify the value of IPPROTO_SMC to 256 so that it does not affect IPPROTO_MAX

v8 -> v7:

- Remove useless declarations.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


net/smc: Introduce IPPROTO_SMC · d25a92cc

D. Wythe authored Jun 14, 2024

This patch allows creating SMC sockets via AF_INET,
as in the following code:

/* create v4 smc sock */
v4 = socket(AF_INET, SOCK_STREAM, IPPROTO_SMC);

/* create v6 smc sock */
v6 = socket(AF_INET6, SOCK_STREAM, IPPROTO_SMC);

There are several reasons why we believe it is appropriate here:

1. SMC sockets actually use IPv4 (AF_INET) or IPv6 (AF_INET6)
addresses. There is no AF_SMC address at all.

2. Creating SMC sockets in the AF_INET(6) path allows us to reuse
the infrastructure of the AF_INET(6) path, such as common eBPF hooks.
Otherwise, SMC would have to implement it again in the AF_SMC path.
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Tested-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
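
The two socket() calls in this message can be wrapped into compilable helpers along the following lines. This is a minimal sketch: the fallback define uses the value 256 mentioned in the series changelog for systems whose libc headers do not yet provide IPPROTO_SMC, and the TCP fallback policy in `demo_smc_or_tcp_socket()` is an assumption made for this example, not something the patch prescribes. Both `demo_*` function names are hypothetical.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Assumed fallback for older headers; 256 is the value the series settled on. */
#ifndef IPPROTO_SMC
#define IPPROTO_SMC 256
#endif

/* Create an SMC socket via AF_INET or AF_INET6; returns fd or -1 with errno set. */
int demo_smc_socket(int family)
{
	return socket(family, SOCK_STREAM, IPPROTO_SMC);
}

/*
 * Try SMC first and fall back to plain TCP when the running kernel
 * does not know the protocol (illustrative policy, not from the patch).
 */
int demo_smc_or_tcp_socket(int family)
{
	int fd = demo_smc_socket(family);

	if (fd < 0 && (errno == EPROTONOSUPPORT || errno == EINVAL))
		fd = socket(family, SOCK_STREAM, IPPROTO_TCP);

	return fd;
}
```

On a kernel without this support, the first call is expected to fail, so callers that must run on both old and new kernels need some fallback of this kind.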


net/smc: expose smc proto operations · 13543d02

D. Wythe authored Jun 14, 2024

Externalize smc proto operations (smc_xxx) to allow
access from files other than af_smc.c

This is in preparation for the subsequent implementation
of the AF_INET version of SMC.
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Tested-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net/smc: refactoring initialization of smc sock · d0e35656

D. Wythe authored Jun 14, 2024

This patch aims to isolate the shared components of SMC socket
allocation by introducing smc_sk_init() for sock initialization
and __smc_create_clcsk() for the initialization of clcsock.

This is in preparation for the subsequent implementation of the
AF_INET version of SMC.
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Tested-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net: make for_each_netdev_dump() a little more bug-proof · f22b4b55

Jakub Kicinski authored Jun 13, 2024

I find the behavior of xa_for_each_start() slightly counter-intuitive.
It doesn't end the iteration by making the index point after the last
element. IOW calling xa_for_each_start() again after it "finished"
will run the body of the loop for the last valid element, instead
of doing nothing.

This works fine for netlink dumps if they terminate correctly
(i.e. coalesce or carefully handle NLM_DONE), but as we keep getting
reminded, legacy dumps are unlikely to go away.

Fixing this generically at the xa_for_each_start() level seems hard -
there is no index reserved for "end of iteration".
ifindexes are 31b wide, tho, and iterator is ulong so for
for_each_netdev_dump() it's safe to go to the next element.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
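
The difference described above can be reproduced with a toy model in userspace. The sketch below is not the xarray API: `demo_present`, `demo_find()`, `demo_dump_old()` and `demo_dump_new()` are hypothetical stand-ins. The "old" loop leaves the cursor on the last element it visited, so resuming visits that element again; the "new" loop steps the cursor past each visited element, so resuming after the end does nothing.

```c
#include <stddef.h>

/* Toy stand-in for the device table: sorted indexes that hold an entry. */
static const unsigned long demo_present[] = { 1, 2, 5 };
#define DEMO_COUNT (sizeof(demo_present) / sizeof(demo_present[0]))

/*
 * Find the first present index >= *index. On success store it in *index
 * and return 1; on failure leave *index untouched and return 0.
 */
static int demo_find(unsigned long *index)
{
	for (size_t i = 0; i < DEMO_COUNT; i++) {
		if (demo_present[i] >= *index) {
			*index = demo_present[i];
			return 1;
		}
	}
	return 0;
}

/* Old behaviour: the cursor ends up on the last visited element. */
unsigned int demo_dump_old(unsigned long *index)
{
	unsigned int visited = 0;
	int found = demo_find(index);

	while (found) {
		unsigned long next = *index + 1;

		visited++;
		found = demo_find(&next);
		if (found)
			*index = next;
	}
	return visited;
}

/* New behaviour: the cursor is stepped past every visited element. */
unsigned int demo_dump_new(unsigned long *index)
{
	unsigned int visited = 0;

	for (; demo_find(index); (*index)++)
		visited++;
	return visited;
}
```

Starting from 0, both variants visit three entries; calling the old variant again with the saved cursor visits index 5 a second time, while the new variant visits nothing. The increment cannot wrap here for the reason given in the message: ifindexes are 31 bits wide and the iterator is an unsigned long.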


Merge branch 'mlx5-genl-queue-stats' · 69776921

David S. Miller authored Jun 17, 2024

Joe Damato says:

====================
mlx5: Add netdev-genl queue stats

Welcome to v5.

Switched from RFC to just a v5, because I think this is pretty close.
Minor changes from v4 summarized below in the changelog.

Note that my NIC does not seem to support PTP and I couldn't get the
mlnx-tools mlnx_qos script to work, so I was only able to test the
following cases:

- device up at boot
- adjusting queue counts
- device down (e.g. ip link set dev eth4 down)

Please see the commit message of patch 2/2 for more details on output
and test cases.

rfcv4 thread:
  https://lore.kernel.org/linux-kernel/20240604004629.299699-1-jdamato@fastly.com/T/

rfcv4 -> v5:
 - Patch 1/2: change variable name 'mlx5e_qid' to 'txq_ix'.
 - Patch 2/2:
    - remove logic in mlx5e_get_queue_stats_rx for PTP. PTP RX are
      always reported in base.
    - report PTP TX in mlx5e_get_base_stats only if:
      - PTP has ever been opened, and
      - either PTP is NULL (closed) or the MLX5E_PTP_STATE_TX bit in its
        state is not set

    Otherwise, PTP TX will be reported when the txq_ix is passed into
    mlx5e_get_queue_stats_tx

rfcv3 -> rfcv4:
 - Patch 1/2 now creates a mapping (priv->txq2sq_stats) which maps txq
   indices to sq_stats structures so stats can be accessed directly.
   This mapping is kept up to date alongside txq2sq.

 - Patch 2/2:
   - All mutex_lock/unlock on state_lock has been dropped.
   - mlx5e_get_queue_stats_rx now uses ASSERT_RTNL() and has a special
     case for PTP. If PTP was ever opened, is currently opened, and the
     channel index matches, stats for PTP RX are output.
   - mlx5e_get_queue_stats_tx rewritten to use priv->txq2sq_stats. No
     corner cases are needed here because any txq idx (passed in as i)
     will have an up to date mapping in priv->txq2sq_stats.
   - mlx5e_get_base_stats:
     - in the RX case:
       - iterates from [params.num_channels, stats_nch) collecting
         stats.
       - if ptp was ever opened but is currently closed, add the PTP
         stats.
     - in the TX case:
       - handle 2 cases:
         - the channel is available, so sum only the unavailable TCs
           [mlx5e_get_dcb_num_tc, max_opened_tc).
         - the channel is unavailable, so sum all TCs [0, max_opened_tc).
       - if ptp was ever opened but is currently closed, add the PTP
         sq stats.

v2 -> rfcv3:
 - Added patch 1/2 which creates some helpers for computing the txq_ix
   and ch_ix/tc_ix.

 - Patch 2/2 modified in several ways:
   - Fixed variable declarations in mlx5e_get_queue_stats_rx to be at
     the start of the function.
   - mlx5e_get_queue_stats_tx rewritten to access sq stats directly by
     using the helpers added in the previous patch.
   - mlx5e_get_base_stats modified in several ways:
     - Took the state_lock when accessing priv->channels.
     - For the base RX stats, code was simplified to call
       mlx5e_get_queue_stats_rx instead of repeating the same code.
     - For the base TX stats, I attempted to implement what I think
       Tariq suggested in the previous thread:
         - for available channels, only unavailable TC stats are summed
         - for unavailable channels, all stats for TCs up to
           max_opened_tc are summed.

v1 -> v2:
  - Essentially a full rewrite after comments from Jakub, Tariq, and
    Zhu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>


net/mlx5e: Add per queue netdev-genl stats · 7b66ae53

Joe Damato authored Jun 12, 2024

./cli.py --spec netlink/specs/netdev.yaml \
         --dump qstats-get --json '{"scope": "queue"}'

...snip

 {'ifindex': 7,
  'queue-id': 62,
  'queue-type': 'rx',
  'rx-alloc-fail': 0,
  'rx-bytes': 105965251,
  'rx-packets': 179790},
 {'ifindex': 7,
  'queue-id': 0,
  'queue-type': 'tx',
  'tx-bytes': 9402665,
  'tx-packets': 17551},

...snip

Also tested with the script tools/testing/selftests/drivers/net/stats.py
in several scenarios to ensure stats tallying was correct:

- on boot (default queue counts)
- adjusting queue count up or down (ethtool -L eth0 combined ...)

The tools/testing/selftests/drivers/net/stats.py brings the device up,
so to test with the device down, I did the following:

$ ip link show eth4
7: eth4: <BROADCAST,MULTICAST> mtu 9000 qdisc mq state DOWN [..snip..]
  [..snip..]

$ cat /proc/net/dev | grep eth4
eth4: 235710489  434811 [..snip rx..] 2878744 21227  [..snip tx..]

$ ./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
           --dump qstats-get --json '{"ifindex": 7}'
[{'ifindex': 7,
  'rx-alloc-fail': 0,
  'rx-bytes': 235710489,
  'rx-packets': 434811,
  'tx-bytes': 2878744,
  'tx-packets': 21227}]

Note that the values in /proc/net/dev match the output of cli for the same
device, even while the device is down.

Note that while the device is down, per queue stats output nothing
(because the device is down there are no queues):

$ ./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
           --dump qstats-get --json '{"scope": "queue", "ifindex": 7}'
[]
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


net/mlx5e: Add txq to sq stats mapping · 0a3e5c1b

Joe Damato authored Jun 12, 2024

mlx5 currently maps txqs to an sq via priv->txq2sq. It is useful to map
txqs to sq_stats, as well, for direct access to stats.

Add priv->txq2sq_stats and insert mappings. The mappings will be used
next to tabulate stats information.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


15 Jun, 2024 14 commits

net: micro-optimize skb_datagram_iter · 934c2999

Sagi Grimberg authored Jun 13, 2024

We only use the mapping in a single context in a short and contained scope,
so kmap_local_page is sufficient and cheaper. This will also allow
skb_datagram_iter to be called from softirq context.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://lore.kernel.org/r/20240613113504.1079860-1-sagi@grimberg.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


Merge branch 'mlxsw-handle-mtu-values' · abef8495

Jakub Kicinski authored Jun 14, 2024

Petr Machata says:

====================
mlxsw: Handle MTU values

Amit Cohen writes:

The driver uses two values for maximum MTU, but neither is accurate.
In addition, the value which is configured to hardware is not calculated
correctly. Handle these issues and expose accurate values for minimum
and maximum MTU per netdevice.

Add test cases to check that the exposed values are really supported.

Patch set overview:
Patches #1-#3 set the driver to use accurate values for MTU
Patch #4 aligns the driver to always use the same value for maximum MTU
Patch #5 adds a test
====================

Link: https://lore.kernel.org/r/cover.1718275854.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
