Commits · da8ff2a278b94088f501e55f543644f88b3ffe37 · Kirill Smelkov / linux

29 Jun, 2022 19 commits

Merge branch 'mlxsw-unified-bridge-conversion-part-5' · da8ff2a2

David S. Miller authored Jun 29, 2022

Ido Schimmel says:

====================
mlxsw: Unified bridge conversion - part 5/6

This is the fifth part of the conversion of mlxsw to the unified bridge
model.

The previous part that was merged in commit d521bc0a ("Merge branch
'mlxsw-unified-bridge-conversion-part-4-6'") converted the flooding code
to use the new APIs of the unified bridge model. As part of this
conversion, the flooding code started accessing the port group table
(PGT) directly in order to allocate MID indexes and configure the ports
via which a packet needs to be replicated.

MDB entries in the device also make use of the PGT table, but the
related code has its own PGT allocator and does not make use of the
common core that was added in the previous patchset. This patchset
converts the MDB code to use the common PGT code.

The first nine patches prepare the MDB code for the conversion that is
performed by the last patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_switchdev: Convert MDB code to use PGT APIs · e28cd993

Amit Cohen authored Jun 29, 2022

The previous patches added common APIs for maintaining the PGT (Port Group
Table). In the legacy model, software did not interact with this
table directly. Instead, it was accessed by firmware in response to
registers such as SFTR and SMID. In the new model, software has full
control over the PGT using the SMID register.

The configuration of MDB entries is already done via SMID, so the new
PGT APIs can also be used with the legacy model. The only difference is
that the MID index should be aligned to the bridge model. See a previous
patch which added an API for that.

The main changes are:
- MDB code no longer maintains a bitmap of ports in the MDB entry. Instead,
  it stores a list of ports with additional information.
- MDB code no longer configures the SMID register directly. This is
  done via the PGT API when a port is first added or removed.
- Today MDB code does not update SMID when a port is added/removed while
  multicast is disabled. Instead, it maintains a bitmap of ports and once
  multicast is enabled, it rewrites the entry to hardware. Using PGT APIs,
  the entry will be updated also when multicast is disabled, but the
  mapping between {MAC, FID}->{MID} will not appear in the SFD register. It
  means that SMID will be updated all the time and disabling/enabling
  multicast will impact only the SFD configuration.
- For a multicast router, today only SMID is updated and the bitmap is not
  updated. Using the new list of ports, there is a reference count for each
  port, so it can be saved in software also. For such a port,
  'struct mlxsw_sp_mdb_entry.ports_count' will not be updated and the
  port in the list will be marked as 'mrouter'.
- Finally, 'struct mlxsw_sp_mid.in_hw' is not needed anymore.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_switchdev: Flush port from MDB entries according to FID index · 4c3f7442

Amit Cohen authored Jun 29, 2022

Currently, flushing a port from all MDB entries is done when the last VLAN
is removed. This behavior is inaccurate: a port can be removed while
another port still uses the same VLAN. In that case the removed port is
not the last user of the VLAN, so no flush happens, yet the port is still
supposed to be removed from the MDB entries.

Flush the port from MDB when it is removed, regardless of the state of
other ports. Flush only the MDB entries which are relevant for the same FID
index.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_switchdev: Add support for getting and putting MDB entry · 7434ed61

Amit Cohen authored Jun 29, 2022

A previous patch added support for init() and fini() for MDB entries. An
MDB entry can be updated: ports can be added to and removed from the entry.
Add get() and put() functions. The first one checks if the entry already
exists and otherwise initializes the entry. The second removes the entry
only if there are no more ports in this entry.

Use the list of the ports which was added in a previous patch. When the
list contains only one port which is not a multicast router, and this port
is removed, the MDB entry can be removed. Use
'struct mlxsw_sp_mdb_entry.ports_count' to know how many ports use the
entry, regardless of the use of multicast router ports.

When mlxsw_sp_mc_mdb_entry_put() is called with a specific port which is
supposed to be removed, check if the removal will cause a deletion of
the entry. If this is the case, call mlxsw_sp_mc_mdb_entry_fini(), which
first deletes the MDB entry and then releases the PGT entry. This avoids a
temporary situation in which the MDB entry points to an empty PGT entry,
as otherwise packets will be temporarily dropped instead of being flooded.

The new functions will be used in the next patches.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
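
The get()/put() behavior described in this commit message can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the driver's code: every name below (`mdb_table`, `mdb_entry_get()`, `mdb_entry_put()` and so on) is hypothetical, a small fixed array stands in for the real data structures, and port handling is collapsed into a single counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES 4 /* arbitrary size for the sketch */

struct mdb_entry {
	bool in_use;
	uint8_t addr[6];
	uint16_t fid;
	unsigned int ports_count; /* ports that joined the group */
	bool mdb_record;          /* simulated {MAC, FID}->{MID} record */
	bool pgt_entry;           /* simulated PGT entry */
};

struct mdb_table {
	struct mdb_entry entries[MAX_ENTRIES];
};

struct mdb_entry *mdb_entry_lookup(struct mdb_table *t,
				   const uint8_t addr[6], uint16_t fid)
{
	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		struct mdb_entry *e = &t->entries[i];

		if (e->in_use && e->fid == fid &&
		    memcmp(e->addr, addr, 6) == 0)
			return e;
	}
	return NULL;
}

/* Return the existing entry or initialize a new one; NULL if full. */
struct mdb_entry *mdb_entry_get(struct mdb_table *t,
				const uint8_t addr[6], uint16_t fid)
{
	struct mdb_entry *e = mdb_entry_lookup(t, addr, fid);

	for (size_t i = 0; !e && i < MAX_ENTRIES; i++) {
		if (t->entries[i].in_use)
			continue;
		e = &t->entries[i];
		memset(e, 0, sizeof(*e));
		memcpy(e->addr, addr, 6);
		e->fid = fid;
		e->in_use = e->pgt_entry = e->mdb_record = true;
	}
	if (e)
		e->ports_count++; /* the caller's port joins the entry */
	return e;
}

/* Drop one port; tear the entry down when the last one leaves. */
void mdb_entry_put(struct mdb_entry *e)
{
	assert(e->in_use && e->ports_count > 0);
	if (--e->ports_count > 0)
		return;
	e->mdb_record = false; /* 1. delete the MDB record first */
	e->pgt_entry = false;  /* 2. only then release the PGT entry */
	e->in_use = false;
}
```

The order inside `mdb_entry_put()` mirrors the reasoning in the message: the record that points at the PGT entry goes away before the PGT entry itself is released.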

mlxsw: spectrum_switchdev: Implement mlxsw_sp_mc_mdb_entry_{init, fini}() · ea0f58d6

Amit Cohen authored Jun 29, 2022

The next patches will convert MDB code to use PGT APIs. The change will
move the responsibility of allocating MID indexes and writing PGT
configurations to hardware to PGT code. As part of this change, most of the
MDB code will be changed and improved.

As a preparation for the above mentioned change, implement
mlxsw_sp_mc_mdb_entry_{init, fini}(). Currently, there is a function
__mlxsw_sp_mc_alloc(), which does more than just allocate a MID. In
addition, there is no equivalent function to free the MID. When
mlxsw_sp_port_remove_from_mid() removes the last port, it handles MID
removal. Instead, add init() and fini() functions, which use PGT APIs.

The differences between the existing and the new functions are as follows:
1. Today MDB code does not update SMID when a port is added/removed while
   multicast is disabled. It maintains a bitmap of ports and once multicast
   is enabled, it writes the entry to hardware. Instead, using PGT APIs,
   the entry will be updated also when multicast is disabled, but the
   mapping between {MAC, FID}->{MID} (which is configured using SFD) will
   be updated according to multicast state. It means that SMID will be
   updated all the time and disabling/enabling multicast will impact only
   SFD configuration.

2. Today the allocation of the MID index is done as part of
   mlxsw_sp_mc_write_mdb_entry(). Because the entry will be written to
   hardware all the time, the allocation of the index moves to the MDB
   entry initialization. PGT API is used for the allocation.

3. Today the update of multicast router ports is done as part of
   mlxsw_sp_mc_write_mdb_entry(). Instead, add functions to add/remove
   all multicast router ports when an entry is first added or removed. When
   a new multicast router port is added/removed, the dedicated API will
   be used to add/remove it from the existing entries.

4. A list of ports will be stored per MDB entry instead of the existing
   bitmap. The list will contain the multicast router ports and maintain
   a reference counter per port.

Add mlxsw_sp_mdb_entry_write() which is almost identical to
mlxsw_sp_port_mdb_op(). Use a clearer name and align the MID index to the
bridge model using PGT API. The existing function will be removed in the
next patches.

Note that PGT APIs configure the firmware using the SMID register, like the
driver already does today for MDB entries, so PGT APIs can also be used
with the legacy bridge model.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_switchdev: Add support for maintaining list of ports per MDB entry · d2994e13

Amit Cohen authored Jun 29, 2022

As part of converting MDB code to use PGT APIs, PGT code stores which ports
are mapped to each PGT entry. PGT code is not aware of the type of the port
(multicast router or not), as it is not relevant there.

To be able to release an MDB entry when there are no ports which are
not multicast routers, the entry should be aware of the state of its
ports. Add support for maintaining a list of ports per MDB entry.

Each port will hold a reference count as multiple MDB entries can use the
same hardware MDB entry. It occurs because MDB entries in the Linux bridge
are keyed according to their multicast IP. When these entries are notified
to device drivers via switchdev, the multicast IP is converted to a
multicast MAC. This conversion might cause collisions, for example,
ff0e::1 and ff0e:1234::1 are both mapped to the multicast MAC
33:33:00:00:00:01.
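
The collision can be reproduced with a few lines of C. The sketch below applies the standard IPv6-to-Ethernet multicast mapping (33:33 followed by the last four bytes of the address, RFC 2464); the function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Map a 16-byte IPv6 multicast address to an Ethernet multicast MAC:
 * 33:33 followed by the last four bytes of the address (RFC 2464).
 */
void ipv6_mc_to_mac(const uint8_t ip6[16], uint8_t mac[6])
{
	assert(ip6[0] == 0xff); /* multicast addresses start with ff */
	mac[0] = 0x33;
	mac[1] = 0x33;
	memcpy(&mac[2], &ip6[12], 4);
}

/* Helper for the sketch: do two IPv6 groups share one multicast MAC? */
int ipv6_mc_collide(const uint8_t a[16], const uint8_t b[16])
{
	uint8_t mac_a[6], mac_b[6];

	ipv6_mc_to_mac(a, mac_a);
	ipv6_mc_to_mac(b, mac_b);
	return memcmp(mac_a, mac_b, 6) == 0;
}
```

Only the low 32 bits of the group address survive the mapping, so any two groups that differ only in the upper bits land on the same MAC.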

A multicast router port takes a reference only once and is marked as
'mrouter'. When a port in the list is a multicast router and its
reference count is one, the port is in the list only because it is a
multicast router, so the entry can be removed if no other ports were
added to the multicast group. For that, maintain a counter per MDB entry
that counts the ports in the list which were added to the multicast group,
and not because they are multicast routers. When this counter is zero, the
entry can be removed.

Add mlxsw_sp_mdb_entry_port_{get,put}() for regular ports and
mlxsw_sp_mdb_entry_mrouter_port_{get,put}() for multicast router ports.
Call PGT API to add or remove port from PGT entry when port is first added
or removed, according to the reference counting.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
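
The reference counting rules in this commit message can be sketched as follows. This is a simplified user-space model, not the driver's implementation: all names are hypothetical, a fixed array indexed by local port stands in for the list of ports, and `pgt_updates` merely counts the points where a PGT API call would be needed.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PORTS 8 /* arbitrary size for the sketch */

struct mdb_port {
	unsigned int refcount; /* 0 means the port is not in the entry */
	bool mrouter;          /* one reference is held as mrouter */
};

struct mdb_entry {
	struct mdb_port ports[MAX_PORTS]; /* stands in for the list */
	unsigned int ports_count;         /* ports that joined the group */
	unsigned int pgt_updates;         /* simulated PGT API calls */
};

/* References held because the port joined the multicast group. */
unsigned int group_refs(const struct mdb_port *p)
{
	return p->refcount - (p->mrouter ? 1 : 0);
}

void port_get(struct mdb_entry *e, unsigned int port)
{
	struct mdb_port *p;

	assert(port < MAX_PORTS);
	p = &e->ports[port];
	if (p->refcount == 0)
		e->pgt_updates++; /* first user: add port to PGT entry */
	if (group_refs(p) == 0)
		e->ports_count++;
	p->refcount++;
}

void port_put(struct mdb_entry *e, unsigned int port)
{
	struct mdb_port *p;

	assert(port < MAX_PORTS);
	p = &e->ports[port];
	if (group_refs(p) == 0)
		return; /* no group reference to drop */
	p->refcount--;
	if (group_refs(p) == 0)
		e->ports_count--;
	if (p->refcount == 0)
		e->pgt_updates++; /* last user: remove port from PGT entry */
}

void mrouter_port_get(struct mdb_entry *e, unsigned int port)
{
	struct mdb_port *p;

	assert(port < MAX_PORTS);
	p = &e->ports[port];
	if (p->mrouter)
		return; /* the mrouter reference is taken only once */
	if (p->refcount == 0)
		e->pgt_updates++;
	p->mrouter = true;
	p->refcount++;
}

void mrouter_port_put(struct mdb_entry *e, unsigned int port)
{
	struct mdb_port *p;

	assert(port < MAX_PORTS);
	p = &e->ports[port];
	if (!p->mrouter)
		return;
	p->mrouter = false;
	if (--p->refcount == 0)
		e->pgt_updates++;
}

/* The entry can go away once no port is left in the group itself. */
bool entry_can_be_removed(const struct mdb_entry *e)
{
	return e->ports_count == 0;
}
```

In this model a port that is both a group member and a multicast router stays in the hardware entry until both kinds of reference are dropped, while only group membership keeps the MDB entry alive.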

mlxsw: spectrum_switchdev: Add support for maintaining hash table of MDB entries · 5d0512e5

Amit Cohen authored Jun 29, 2022

Currently MDB entries are stored in a list as part of
'struct mlxsw_sp_bridge_device'. Storing them in a hash table in
addition to the list will allow finding a specific entry more efficiently.

Add support for the required hash table. The next patches will insert
and remove MDB entries from the table. The existing code which adds and
removes entries will be removed and replaced by new code in the next
patches, so there is no point in adjusting the existing code.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_switchdev: Save MAC and FID as a key in 'struct mlxsw_sp_mdb_entry' · 0ac98543

Amit Cohen authored Jun 29, 2022

The next patch will add support for storing all the MDB entries in a hash
table. As a preparation, save the MAC address and the FID in a
separate structure. This structure will be used later as a key for the
hash table.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
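
As a rough illustration of why a separate {MAC, FID} structure is convenient as a hash table key, the sketch below defines such a key with an equality check and a hash. The structure layout, the function names and the choice of 32-bit FNV-1a are assumptions made for this example; the commit message does not say which hash table implementation or hash function is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical key: the {MAC, FID} pair that identifies an MDB entry. */
struct mdb_key {
	uint8_t addr[ETH_ALEN];
	uint16_t fid;
};

bool mdb_key_equal(const struct mdb_key *a, const struct mdb_key *b)
{
	return a->fid == b->fid && memcmp(a->addr, b->addr, ETH_ALEN) == 0;
}

/* 32-bit FNV-1a over the key fields; any reasonable hash would do. */
uint32_t mdb_key_hash(const struct mdb_key *key)
{
	uint32_t hash = 2166136261u;
	size_t i;

	for (i = 0; i < ETH_ALEN; i++)
		hash = (hash ^ key->addr[i]) * 16777619u;
	hash = (hash ^ (key->fid & 0xff)) * 16777619u;
	hash = (hash ^ (key->fid >> 8)) * 16777619u;
	return hash;
}

/* Pick a bucket in a table with `nbuckets` slots. */
size_t mdb_key_bucket(const struct mdb_key *key, size_t nbuckets)
{
	assert(nbuckets > 0);
	return mdb_key_hash(key) % nbuckets;
}
```

Keeping both fields in one structure lets lookup, insertion and removal all pass a single key object around instead of a separate MAC and FID.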

mlxsw: spectrum_switchdev: Rename MIDs list · eaa0791a

Amit Cohen authored Jun 29, 2022

Currently, the list which stores the MDB entries for a given bridge
instance is called 'mids_list'.

This name is not accurate as a MID entry stores a bitmap of ports to
which a packet needs to be replicated and an MDB entry stores the mapping
from {MAC, FID} to PGT index (MID).

Rename it to 'mdb_list'.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_switchdev: Rename MID structure · eede53a4

Amit Cohen authored Jun 29, 2022

Currently the structure which represents an MDB entry is called
'struct mlxsw_sp_mid'. This name is not accurate as a MID entry stores a
bitmap of ports to which a packet needs to be replicated and an MDB entry
stores the mapping from {MAC, FID} to PGT index (MID).

Rename the structure to 'struct mlxsw_sp_mdb_entry'. The structure
'mlxsw_sp_mid' is defined as part of spectrum.h. The only file which
uses it is spectrum_switchdev.c, so there is no reason to expose it to
other files. Move the definition to spectrum_switchdev.c.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: Align PGT index to legacy bridge model · 4abaa5cc

Amit Cohen authored Jun 29, 2022

FID code reserves about 15K entries in the PGT table for flooding. These
entries are just allocated and are not used yet because the code that uses
them is skipped now.

The next patches will convert MDB code to use PGT APIs. The allocation of
indexes for multicast is done after FID code reserves 15K entries.
Currently, the legacy bridge model is used and firmware manages the PGT
table. That means that the indexes which are allocated using the PGT API
are too high when the legacy bridge model is used. To not exceed the
firmware limitation for MDB entries, add an API that returns the correct
'mid_index', based on the bridge model. For the legacy model, subtract the
number of flood entries from the PGT index. Use it to write the correct MID
to the SMID register. This API will also be used from MDB code in the next
patches.

PGT should not be aware of the different usage by MDB and FID. This API is
temporary and will be removed once the unified bridge model is used.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
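
The index alignment amounts to a single subtraction. A minimal sketch, assuming a hypothetical helper name and passing the size of the flood region as a parameter because the exact number of reserved entries is not given here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Translate an index allocated from the PGT into the MID value to
 * write to hardware. `flood_entries` is the number of PGT entries
 * reserved for flooding ahead of the multicast region.
 */
uint16_t mid_index_get(bool unified_bridge, uint16_t pgt_index,
		       uint16_t flood_entries)
{
	if (unified_bridge)
		return pgt_index;

	/* Multicast indexes are allocated after the flood region. */
	assert(pgt_index >= flood_entries);
	return (uint16_t)(pgt_index - flood_entries);
}
```

For example, with an illustrative flood region of 15000 entries, PGT index 15010 becomes MID 10 under the legacy model and stays 15010 under the unified model.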

net: mptcp: fix some spelling mistake in mptcp · d640516a

Menglong Dong authored Jun 27, 2022

codespell finds some spelling mistakes in mptcp:

net/mptcp/subflow.c:1624: interaces ==> interfaces
net/mptcp/pm_netlink.c:1130: regarless ==> regardless

Just fix them.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20220627121626.1595732-1-imagedong@tencent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Revert the ARM/dts changes for Renesas RZ/N1 · eba3a981

Jakub Kicinski authored Jun 27, 2022

Based on a request from Geert:

Revert "ARM: dts: r9a06g032-rzn1d400-db: add switch description"
This reverts commit 9aab31d6.

Revert "ARM: dts: r9a06g032: describe switch"
This reverts commit cf9695d8.

Revert "ARM: dts: r9a06g032: describe GMAC2"
This reverts commit 3f5261f1.

Revert "ARM: dts: r9a06g032: describe MII converter"
This reverts commit 066c3bd3.

to let these changes flow through the platform and SoC trees.

Link: https://lore.kernel.org/r/CAMuHMdUvSLFU56gsp1a9isOiP9otdCJ2-BqhbrffcoHuA6JNig@mail.gmail.com/
Link: https://lore.kernel.org/r/20220627173900.3136386-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-phylink-cleanup-pcs-code' · 957b96e3

Jakub Kicinski authored Jun 28, 2022

Russell King says:

====================
net: phylink: cleanup pcs code

These two patches were part of the larger series for the mv88e6xxx
phylink pcs conversion. As this is delayed, I've decided to send these
two patches now.
====================

Link: https://lore.kernel.org/r/YrmYEC2N9mVpg9g6@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phylink: disable PCS polling over major configuration · bfac8c49

Russell King (Oracle) authored Jun 27, 2022

While we are performing a major configuration, there is no point having
the PCS polling timer running. Stop it before we begin preparing for
the configuration change, and restart it only once we've successfully
completed the change.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
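
The stop/restart ordering described above can be sketched with a boolean standing in for the polling timer. All names here are hypothetical and the callbacks exist only to exercise both outcomes; this is a model of the ordering, not the phylink code.

```c
#include <assert.h>
#include <stdbool.h>

struct pcs_link {
	bool pcs_poll;           /* the PCS asks to be polled */
	bool poll_timer_running; /* stands in for the polling timer */
};

void pcs_poll_stop(struct pcs_link *l)
{
	l->poll_timer_running = false;
}

void pcs_poll_start(struct pcs_link *l)
{
	if (l->pcs_poll)
		l->poll_timer_running = true;
}

/* Sample callbacks: a configuration change that works, and one that fails. */
int apply_ok(void)   { return 0; }
int apply_fail(void) { return -1; }

int major_config(struct pcs_link *l, int (*apply)(void))
{
	int err;

	pcs_poll_stop(l);  /* no polling while the link is reconfigured */
	err = apply();
	if (err)
		return err; /* polling is not restarted after a failure */
	pcs_poll_start(l); /* restart only after a successful change */
	return 0;
}
```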

net: phylink: remove pcs_ops member · 4f1dd48f

Russell King (Oracle) authored Jun 27, 2022

Remove the pcs_ops member from struct phylink, using the one stored in
struct phylink_pcs instead.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tcp: diag: add support for TIME_WAIT sockets to tcp_abort() · af9784d0

Eric Dumazet authored Jun 27, 2022

Currently, "ss -K -ta ..." does not support TIME_WAIT sockets.

The issue has been raised at least twice in the past [1] [2];
it is time to fix it.

[1] https://lore.kernel.org/netdev/ba65f579-4e69-ae0d-4770-bc6234beb428@gmail.com/
[2] https://lore.kernel.org/netdev/CANn89i+R9RgmD=AQ4vX1Vb_SQAj4c3fi7-ZtQz-inYY4Sq4CMQ@mail.gmail.com/T/

While we are at it, use inet_sk_state_load() while tcp_abort()
does not hold a lock on the socket.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20220627121038.226500-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/funeth: Support for ethtool -m · f03c8a1e

Dimitris Michailidis authored Jun 27, 2022

Add the FW command for reading port module memory pages and implement
ethtool's get_module_eeprom_by_page operation.
Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
Link: https://lore.kernel.org/r/20220627182000.8198-1-dmichail@fungible.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

af_unix: Do not call kmemdup() for init_net's sysctl table. · 849d5aa3

Kuniyuki Iwashima authored Jun 27, 2022

While setting up init_net's sysctl table, we need not duplicate the
global table and can use it directly as ipv4_sysctl_init_net() does.

Unlike IPv4, AF_UNIX does not have a huge sysctl table for now, so the
duplication is not a problem, but this patch makes the code consistent.
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20220627233627.51646-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
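
The pattern described in this commit, using the global table directly for the initial namespace and duplicating it only for the others, can be modelled in user-space C. The structure, the names and the default value below are made up for the illustration, and malloc()/memcpy() stand in for kmemdup().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct ctl_entry {
	const char *name;
	int *data;
};

int global_max_dgram_qlen = 10; /* illustrative value */

struct ctl_entry global_table[] = {
	{ "max_dgram_qlen", &global_max_dgram_qlen },
};

/* Return the table to register for a namespace; NULL if out of memory. */
struct ctl_entry *table_for_ns(bool is_init_ns, int *ns_value)
{
	struct ctl_entry *table;

	if (is_init_ns)
		return global_table; /* no copy for the initial namespace */

	table = malloc(sizeof(global_table));
	if (!table)
		return NULL;
	memcpy(table, global_table, sizeof(global_table));
	table[0].data = ns_value; /* point the copy at per-namespace data */
	return table;
}

/* Only duplicated tables are owned by the namespace and freed. */
void table_release(bool is_init_ns, struct ctl_entry *table)
{
	if (!is_init_ns)
		free(table);
}
```

The release path has to make the same distinction as the setup path, otherwise the global table would be freed.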

28 Jun, 2022 21 commits

Merge branch 'mlxsw-unified-bridge-conversion-part-4-6' · d521bc0a

Paolo Abeni authored Jun 28, 2022

Ido Schimmel says:

====================
mlxsw: Unified bridge conversion - part 4/6

This is the fourth part of the conversion of mlxsw to the unified bridge
model.

Unlike previous parts that prepared mlxsw for the conversion, this part
actually starts the conversion. It focuses on flooding configuration and
converts mlxsw to the more "raw" APIs of the unified bridge model.

The patches configure the different stages of the flooding pipeline in
Spectrum that looks as follows (at a high-level):

         +------------+                +----------+           +-------+
  {FID,  |            | {Packet type,  |          |           |       |  MID
   DMAC} | FDB lookup |  Bridge type}  |   SFGC   | MID base  |       | Index
+-------->   (miss)   +----------------> register +-----------> Adder +------->
         |            |                |          |           |       |
         |            |                |          |           |       |
         +------------+                +----+-----+           +---^---+
                                            |                     |
                                    Table   |                     |
                                     type   |                     | Offset
                                            |      +-------+      |
                                            |      |       |      |
                                            |      |       |      |
                                            +----->+  Mux  +------+
                                                   |       |
                                                   |       |
                                                   +-^---^-+
                                                     |   |
                                                  FID|   |FID
                                                     |   |offset
                                                     +   +

The multicast identifier (MID) index is used as an index to the port
group table (PGT) that contains a bitmap of ports via which a packet
needs to be replicated.
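
Read as arithmetic, the pipeline in the diagram above computes the MID index as the MID base from SFGC plus an offset that the table type selects. A minimal sketch, assuming hypothetical type and field names and only the two mux inputs that the diagram shows:

```c
#include <assert.h>
#include <stdint.h>

/* The two offset sources feeding the mux in the diagram. */
enum table_type {
	TABLE_TYPE_FID,
	TABLE_TYPE_FID_OFFSET,
};

/* What the SFGC stage yields for a {packet type, bridge type} pair. */
struct sfgc_result {
	uint32_t mid_base;
	enum table_type type;
};

uint32_t mid_index(struct sfgc_result sfgc, uint32_t fid,
		   uint32_t fid_offset)
{
	uint32_t offset;

	/* Mux: the table type selects which value is used as the offset. */
	assert(sfgc.type == TABLE_TYPE_FID ||
	       sfgc.type == TABLE_TYPE_FID_OFFSET);
	offset = sfgc.type == TABLE_TYPE_FID ? fid : fid_offset;

	/* Adder: MID index = MID base + offset. */
	return sfgc.mid_base + offset;
}
```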

From the PGT table, the packet continues to the multicast port egress
(MPE) table that determines the packet's egress VLAN. This is a
two-dimensional table that is indexed by port and switch multicast port
to egress (SMPE) index. The latter can be thought of as a FID. Without
it, all the packets replicated via a certain port would get the same
VLAN, regardless of the bridge domain (FID).

Logically, these two steps look as follows:

                     PGT table                           MPE table
             +-----------------------+               +---------------+
             |                       | {Local port,  |               | Egress
  MID index  | Local ports bitmap #1 |  SMPE index}  |               |  VID
+------------>        ...            +--------------->               +-------->
             | Local ports bitmap #N |               |               |
             |                       |          SMPE |               |
             +-----------------------+               +---------------+
                                                        Local port
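
The two lookups can also be written out as a loop over the PGT bitmap followed by an MPE read per replicated port. The table sizes, structure and function names below are assumptions chosen to keep the sketch small; real tables are far larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_MIDS  3 /* arbitrary sizes for the sketch */
#define NUM_PORTS 4
#define NUM_SMPE  2

struct flood_tables {
	uint8_t pgt[NUM_MIDS];             /* bit i set: replicate via port i */
	uint16_t mpe[NUM_PORTS][NUM_SMPE]; /* egress VID per {port, SMPE} */
};

/*
 * Resolve the copies of a packet: the PGT entry selects the ports and
 * the MPE table selects the egress VID of each copy. Returns the
 * number of copies written to out_port[] and out_vid[].
 */
size_t flood_lookup(const struct flood_tables *t, unsigned int mid,
		    unsigned int smpe, unsigned int out_port[NUM_PORTS],
		    uint16_t out_vid[NUM_PORTS])
{
	size_t n = 0;

	assert(mid < NUM_MIDS && smpe < NUM_SMPE);
	for (unsigned int port = 0; port < NUM_PORTS; port++) {
		if (!(t->pgt[mid] & (1u << port)))
			continue;
		out_port[n] = port;
		out_vid[n] = t->mpe[port][smpe];
		n++;
	}
	return n;
}
```

Because the MPE table is indexed by the SMPE index as well as the port, two bridge domains that replicate through the same port can still leave with different VLANs.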

Patchset overview:

Patch #1 adds a variable to guard against mixed model configuration.
It will be removed in part 6 when mlxsw is fully converted to the unified
model.

Patches #2-#5 introduce two new FID attributes required for flooding
configuration in the new model:

1. 'flood_rsp': Instructs the firmware to handle flooding configuration
for this FID. Only set for router FIDs (rFIDs) which are used to connect
a {Port, VLAN} to the router block.

2. 'bridge_type': Allows the device to determine the flood table (i.e.,
base index to the PGT table) for the FID. The first type will be used
for FIDs in a VLAN-aware bridge and the second for FIDs representing
VLAN-unaware bridges.

Patch #6 configures the MPE table that determines the egress VLAN of a
packet that is forwarded according to L2 multicast / flood.

Patches #7-#11 add the PGT table and related APIs to allocate entries
and set / clear ports in them.

Patches #12-#13 convert the flooding configuration to use the new PGT
APIs.
====================

Link: https://lore.kernel.org/r/20220627070621.648499-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>