Commits · f3514a5d67403f803eadd39cf61986638101e755 · Kirill Smelkov / linux

19 Jul, 2023 4 commits

selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled · f3514a5d

Dave Marchevsky authored Jul 18, 2023

The test added in previous patch will fail with bpf_refcount_acquire
disabled. Until all races are fixed and bpf_refcount_acquire is
re-enabled on bpf-next, disable the test so CI doesn't complain.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230718083813.3416104-6-davemarchevsky@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

f3514a5d

selftests/bpf: Add rbtree test exercising race which 'owner' field prevents · fdf48dc2

Dave Marchevsky authored Jul 18, 2023

This patch adds a runnable version of one of the races described by
Kumar in [0]. Specifically, this interleaving:

(rbtree1 and list head protected by lock1, rbtree2 protected by lock2)

Prog A                          Prog B
======================================
n = bpf_obj_new(...)
m = bpf_refcount_acquire(n)
kptr_xchg(map, m)

                                m = kptr_xchg(map, NULL)
                                lock(lock2)
				bpf_rbtree_add(rbtree2, m->r, less)
				unlock(lock2)

lock(lock1)
bpf_list_push_back(head, n->l)
/* make n non-owning ref */
bpf_rbtree_remove(rbtree1, n->r)
unlock(lock1)

The above interleaving, the node's struct bpf_rb_node *r can be used to
add it to either rbtree1 or rbtree2, which are protected by different
locks. If the node has been added to rbtree2, we should not be allowed
to remove it while holding rbtree1's lock.

Before changes in the previous patch in this series, the rbtree_remove
in the second part of Prog A would succeed as the verifier has no way of
knowing which tree owns a particular node at verification time. The
addition of 'owner' field results in bpf_rbtree_remove correctly
failing.

The test added in this patch splits "Prog A" above into two separate BPF
programs - A1 and A2 - and uses a second mapval + kptr_xchg to pass n
from A1 to A2 similarly to the pass from A1 to B. If the test is run
without the fix applied, the remove will succeed.

Kumar's example had the two programs running on separate CPUs. This
patch doesn't do this as it's not necessary to exercise the broken
behavior / validate fixed behavior.

  [0]: https://lore.kernel.org/bpf/d7hyspcow5wtjcmw4fugdgyp3fwhljwuscp3xyut5qnwivyeru@ysdq543otzv2Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230718083813.3416104-5-davemarchevsky@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

fdf48dc2

bpf: Add 'owner' field to bpf_{list,rb}_node · c3c510ce

Dave Marchevsky authored Jul 18, 2023

As described by Kumar in [0], in shared ownership scenarios it is
necessary to do runtime tracking of {rb,list} node ownership - and
synchronize updates using this ownership information - in order to
prevent races. This patch adds an 'owner' field to struct bpf_list_node
and bpf_rb_node to implement such runtime tracking.

The owner field is a void * that describes the ownership state of a
node. It can have the following values:

  NULL           - the node is not owned by any data structure
  BPF_PTR_POISON - the node is in the process of being added to a data
                   structure
  ptr_to_root    - the pointee is a data structure 'root'
                   (bpf_rb_root / bpf_list_head) which owns this node

The field is initially NULL (set by bpf_obj_init_field default behavior)
and transitions states in the following sequence:

  Insertion: NULL -> BPF_PTR_POISON -> ptr_to_root
  Removal:   ptr_to_root -> NULL

Before a node has been successfully inserted, it is not protected by any
root's lock, and therefore two programs can attempt to add the same node
to different roots simultaneously. For this reason the intermediate
BPF_PTR_POISON state is necessary. For removal, the node is protected
by some root's lock so this intermediate hop isn't necessary.

Note that bpf_list_pop_{front,back} helpers don't need to check owner
before removing as the node-to-be-removed is not passed in as input and
is instead taken directly from the list. Do the check anyways and
WARN_ON_ONCE in this unexpected scenario.

Selftest changes in this patch are entirely mechanical: some BTF
tests have hardcoded struct sizes for structs that contain
bpf_{list,rb}_node fields, those were adjusted to account for the new
sizes. Selftest additions to validate the owner field are added in a
further patch in the series.

  [0]: https://lore.kernel.org/bpf/d7hyspcow5wtjcmw4fugdgyp3fwhljwuscp3xyut5qnwivyeru@ysdq543otzv2Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230718083813.3416104-4-davemarchevsky@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

c3c510ce

bpf: Introduce internal definitions for UAPI-opaque bpf_{rb,list}_node · 0a1f7bfe

Dave Marchevsky authored Jul 18, 2023

Structs bpf_rb_node and bpf_list_node are opaquely defined in
uapi/linux/bpf.h, as BPF program writers are not expected to touch their
fields - nor does the verifier allow them to do so.

Currently these structs are simple wrappers around structs rb_node and
list_head and linked_list / rbtree implementation just casts and passes
to library functions for those data structures. Later patches in this
series, though, will add an "owner" field to bpf_{rb,list}_node, such
that they're not just wrapping an underlying node type. Moreover, the
bpf linked_list and rbtree implementations will deal with these owner
pointers directly in a few different places.

To avoid having to do

void *owner = (void*)bpf_list_node + sizeof(struct list_head)

with opaque UAPI node types, add bpf_{list,rb}_node_kern struct
definitions to internal headers and modify linked_list and rbtree to use
the internal types where appropriate.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230718083813.3416104-3-davemarchevsky@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

0a1f7bfe

17 Jul, 2023 24 commits

Merge branch 'phy-at803x-support' · 60cc1f7d

David S. Miller authored Jul 17, 2023

Luo Jie says:

====================
net: phy: at803x: support qca8081 1G version chip

This patch series add supporting qca8081 1G version chip, the 1G version
chip can be identified by the register mmd7.0x901d bit0.

In addition, qca8081 does not support 1000BaseX mode and the sgmii fifo
reset is added on the link changed, which assert the fifo on the link
down, deassert the fifo on the link up.

Changes in v1:
	* switch to use genphy_c45_pma_read_abilities.
	* remove the patch [remove 1000BaseX mode of qca8081].
	* move the sgmii fifo reset to link_change_notify.

Changes in v2:
	* split the qca8081 1G chip support patch.
	* improve the slave seed config, disable it if master preferred.

Changes in v3:
	* fix the comments.
	* add the help function qca808x_has_fast_retrain_or_slave_seed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

60cc1f7d

net: phy: at803x: add qca8081 fifo reset on the link changed · 723970af

Luo Jie authored Jul 16, 2023

The qca8081 sgmii fifo needs to be reset on link down and
released on the link up in case of any abnormal issue
such as the packet blocked on the PHY.
Signed-off-by: Luo Jie <quic_luoj@quicinc.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

723970af

net: phy: at803x: remove qca8081 1G fast retrain and slave seed config · df9401ff

Luo Jie authored Jul 16, 2023

The fast retrain and slave seed configs are only applicable when the 2.5G
ability is supported.
Signed-off-by: Luo Jie <quic_luoj@quicinc.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

df9401ff

net: phy: at803x: support qca8081 1G chip type · fea7cfb8

Luo Jie authored Jul 16, 2023

The qca8081 1G chip version does not support 2.5 capability, which
is distinguished from qca8081 2.5G chip according to the bit0 of
register mmd7.0x901d, the 1G version chip also has the same PHY ID
as the normal qca8081 2.5G chip.
Signed-off-by: Luo Jie <quic_luoj@quicinc.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

fea7cfb8

net: phy: at803x: enable qca8081 slave seed conditionally · 7cc32095

Luo Jie authored Jul 16, 2023

qca8081 is the single port PHY, the slave prefer mode is used
by default.

if the phy master perfer mode is configured, the slave seed
configuration should not be enabled, since the slave seed
enablement is for making PHY linked as slave mode easily.

disable slave seed if the master mode is preferred.
Signed-off-by: Luo Jie <quic_luoj@quicinc.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

7cc32095

net: phy: at803x: merge qca8081 slave seed function · f3db55ae

Luo Jie authored Jul 16, 2023

merge the seed enablement and seed value configuration into
one function, since the random seed value is needed to be
configured when the seed is enabled.
Signed-off-by: Luo Jie <quic_luoj@quicinc.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

f3db55ae

net: phy: at803x: support qca8081 genphy_c45_pma_read_abilities · 8b8bc13d

Luo Jie authored Jul 16, 2023

qca8081 PHY supports to use genphy_c45_pma_read_abilities for
getting the PHY features supported except for the autoneg ability

but autoneg ability exists in MDIO_STAT1 instead of MMD7.1, add it
manually after calling genphy_c45_pma_read_abilities.
Signed-off-by: Luo Jie <quic_luoj@quicinc.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

8b8bc13d

Merge branch 'qrtr-fixes' · ae02f8d4

David S. Miller authored Jul 17, 2023

Vignesh Viswanathan says:

====================
net: qrtr: Few fixes in QRTR

Add fixes in QRTR ns to change server and nodes radix tree to xarray to
avoid a use-after-free while iterating through the server or nodes
radix tree.

Also fix the destination port value for IPCR control buffer on older
targets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ae02f8d4

net: qrtr: Handle IPCR control port format of older targets · 69940b88

Vignesh Viswanathan authored Jul 14, 2023

The destination port value in the IPCR control buffer on older
targets is 0xFFFF. Handle the same by updating the dst_port to
QRTR_PORT_CTRL.
Signed-off-by: Vignesh Viswanathan <quic_viswanat@quicinc.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

69940b88

net: qrtr: ns: Change nodes radix tree to xarray · f26b32ef

Vignesh Viswanathan authored Jul 14, 2023

There is a use after free scenario while iterating through the nodes
radix tree despite the ns being a single threaded process. This can
happen when the radix tree APIs are not synchronized with the
rcu_read_lock() APIs.

Convert the radix tree for nodes to xarray to take advantage of the
built in rcu lock usage provided by xarray.
Signed-off-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Vignesh Viswanathan <quic_viswanat@quicinc.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f26b32ef

net: qrtr: ns: Change servers radix tree to xarray · 608a147a

Vignesh Viswanathan authored Jul 14, 2023

There is a use after free scenario while iterating through the servers
radix tree despite the ns being a single threaded process. This can
happen when the radix tree APIs are not synchronized with the
rcu_read_lock() APIs.

Convert the radix tree for servers to xarray to take advantage of the
built in rcu lock usage provided by xarray.
Signed-off-by: Chris Lew <quic_clew@quicinc.com>
Signed-off-by: Vignesh Viswanathan <quic_viswanat@quicinc.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

608a147a

Merge branch 'brcm-asp-2.0-support' · 89e970ea

David S. Miller authored Jul 17, 2023

Justin Chen says:

====================
Brcm ASP 2.0 Ethernet Controller

Add support for the Broadcom ASP 2.0 Ethernet controller which is first
introduced with 72165.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

89e970ea

MAINTAINERS: ASP 2.0 Ethernet driver maintainers · 3abf3d15

Justin Chen authored Jul 13, 2023

Add maintainers entry for ASP 2.0 Ethernet driver.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3abf3d15

net: phy: bcm7xxx: Add EPHY entry for 74165 · 9fa0bba0

Florian Fainelli authored Jul 13, 2023

74165 is a 16nm process SoC with a 10/100 integrated Ethernet PHY,
utilize the recently defined 16nm EPHY macro to configure that PHY.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9fa0bba0

net: phy: mdio-bcm-unimac: Add asp v2.0 support · 9de2b402

Justin Chen authored Jul 13, 2023

Add mdio compat string for ASP 2.0 ethernet driver.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9de2b402

net: bcmasp: Add support for ethtool driver stats · 7c10691e

Justin Chen authored Jul 13, 2023

Add support for ethernet driver specific stats.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c10691e

net: bcmasp: Add support for ethtool standard stats · 64931534

Justin Chen authored Jul 13, 2023

Add support for eth_mac_stats, rmon_stats, and eth_ctrl_stats.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

64931534

net: bcmasp: Add support for eee mode · 550e6f34

Justin Chen authored Jul 13, 2023

Add support for eee mode.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

550e6f34

net: bcmasp: Add support for wake on net filters · c5d511c4

Justin Chen authored Jul 13, 2023

Add support for wake on network filters. The max match is 256 bytes.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c5d511c4

net: bcmasp: Add support for WoL magic packet · a2f07512

Justin Chen authored Jul 13, 2023

Add support for Wake-On-Lan magic packet and magic packet with password.
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a2f07512

net: bcmasp: Add support for ASP2.0 Ethernet controller · 490cb412

Justin Chen authored Jul 13, 2023

Add support for the Broadcom ASP 2.0 Ethernet controller which is first
introduced with 72165. This controller features two distinct Ethernet
ports that can be independently operated.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

490cb412

dt-bindings: net: Brcm ASP 2.0 Ethernet controller · a29401be

Florian Fainelli authored Jul 13, 2023

Add a binding document for the Broadcom ASP 2.0 Ethernet
controller.
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a29401be

dt-bindings: net: brcm,unimac-mdio: Add asp-v2.0 · 27312c43

Justin Chen authored Jul 13, 2023

The ASP 2.0 Ethernet controller uses a brcm unimac.
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27312c43

net: fec: Refactor: rename `adapter` to `fep` · f08469d0

Csókás Bence authored Jul 13, 2023

Rename local `struct fec_enet_private *adapter` to `fep` in `fec_ptp_gettime()` to match the rest of the driver
Signed-off-by: Csókás Bence <csokas.bence@prolan.hu>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f08469d0

14 Jul, 2023 12 commits

gve: trivial spell fix Recive to Receive · 68af9000

Jesper Dangaard Brouer authored Jul 13, 2023

Spotted this trivial spell mistake while casually reading
the google GVE driver code.
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

68af9000

Merge branch 'mlxsw-rif-pvid' · 382d7dcf

David S. Miller authored Jul 14, 2023

Petr Machata says:

====================
mlxsw: Manage RIF across PVID changes

The mlxsw driver currently makes the assumption that the user applies
configuration in a bottom-up manner. Thus netdevices need to be added to
the bridge before IP addresses are configured on that bridge or SVI added
on top of it. Enslaving a netdevice to another netdevice that already has
uppers is in fact forbidden by mlxsw for this reason. Despite this safety,
it is rather easy to get into situations where the offloaded configuration
is just plain wrong.

As an example, take a front panel port, configure an IP address: it gets a
RIF. Now enslave the port to the bridge, and the RIF is gone. Remove the
port from the bridge again, but the RIF never comes back. There is a number
of similar situations, where changing the configuration there and back
utterly breaks the offload.

The situation is going to be made better by implementing a range of replays
and post-hoc offloads.

In this patch set, address the ordering issues related to creation of
bridge RIFs. Currently, mlxsw has several shortcomings with regards to RIF
handling due to PVID changes:

- In order to cause RIF for a bridge device to be created, the user is
  expected first to set PVID, then to add an IP address. The reverse
  ordering is disallowed, which is not very user-friendly.

- When such bridge gets a VLAN upper whose VID was the same as the existing
  PVID, and this VLAN netdevice gets an IP address, a RIF is created for
  this netdevice. The new RIF is then assigned to the 802.1Q FID for the
  given VID. This results in a working configuration. However, then, when
  the VLAN netdevice is removed again, the RIF for the bridge itself is
  never reassociated to the PVID.

- PVID cannot be changed once the bridge has uppers. Presumably this is
  because the driver does not manage RIFs properly in face of PVID changes.
  However, as the previous point shows, it is still possible to get into
  invalid configurations.

This patch set addresses these issues and relaxes some of the ordering
requirements that mlxsw had. The patch set proceeds as follows:

- In patch #1, pass extack to mlxsw_sp_br_ban_rif_pvid_change()

- To relax ordering between setting PVID and adding an IP address to a
  bridge, mlxsw must be able to request that a RIF is created with a given
  VLAN ID, instead of trying to deduce it from the current netdevice
  settings, which do not reflect the user-requested values yet. This is
  done in patches #2 and #3.

- Similarly, mlxsw_sp_inetaddr_bridge_event() will need to make decisions
  based on the user-requested value of PVID, not the current value. Thus in
  patches #4 and #5, add a new argument which carries the requested PVID
  value.

- Finally in patch #6 relax the ban on PVID changes when a bridge has
  uppers. Instead, add the logic necessary for creation of a RIF as a
  result of PVID change.

- Relevant selftests are presented afterwards. In patch #7 a preparatory
  helper is added to lib.sh. Patches #8, #9, #10 and #11 include selftests
  themselves.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

382d7dcf

selftests: router_bridge_pvid_vlan_upper: Add a new selftest · 9cbb3da4

Petr Machata authored Jul 13, 2023

This tests whether addition and deletion of a VLAN upper that coincides
with the current PVID setting throws off forwarding.

This selftests is specifically geared towards offloading drivers. In
particular, mlxsw used to fail this selftest, and an earlier patch in this
patchset fixes the issue. However, there's nothing HW-specific in the test
itself (it absolutely is supposed to pass on SW datapath), and therefore it
is put into the generic forwarding directory.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9cbb3da4

selftests: router_bridge_vlan_upper_pvid: Add a new selftest · b0307b77

Petr Machata authored Jul 13, 2023

This tests whether changes to PVID that coincide with an existing VLAN
upper throw off forwarding. This selftests is specifically geared towards
offloading drivers, but since there's nothing HW-specific in the test
itself (it absolutely is supposed to pass on SW datapath), it is put into
the generic forwarding directory.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b0307b77

selftests: router_bridge_vlan: Add PVID change test · d4172a93

Petr Machata authored Jul 13, 2023

Add an alternative path involving VLAN 777 instead of the current 555. Then
add tests that verify that marking 777 as PVID makes the 555 path not work,
and the 777 path work.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d4172a93

selftests: router_bridge: Add tests to remove and add PVID · c7203a29

Petr Machata authored Jul 13, 2023

This test relies on PVID being configured on the bridge itself. Thus when
it is deconfigured, the system should lose the ability to forward traffic.
Later when it is added again, the ability to forward traffic should be
regained. Add tests to exercise these configuration changes and verify
results.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c7203a29

selftests: forwarding: lib: Add ping6_, ping_test_fails() · 5f44a714

Petr Machata authored Jul 13, 2023

Add two helpers to run a ping test that succeeds when the pings themselves
fail.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5f44a714

mlxsw: spectrum_switchdev: Manage RIFs on PVID change · a5b52692

Petr Machata authored Jul 13, 2023

Currently, mlxsw has several shortcomings with regards to RIF handling due
to PVID changes:

- In order to cause RIF for a bridge device to be created, the user is
  expected first to set PVID, then to add an IP address. The reverse
  ordering is disallowed, which is not very user-friendly.

- When such bridge gets a VLAN upper whose VID was the same as the existing
  PVID, and this VLAN netdevice gets an IP address, a RIF is created for
  this netdevice. The new RIF is then assigned to the 802.1Q FID for the
  given VID. This results in a working configuration. However, then, when
  the VLAN netdevice is removed again, the RIF for the bridge itself is
  never reassociated to the VLAN.

- PVID cannot be changed once the bridge has uppers. Presumably this is
  because the driver does not manage RIFs properly in face of PVID changes.
  However, as the previous point shows, it is still possible to get into
  invalid configurations.

In this patch, add the logic necessary for creation of a RIF as a result of
PVID change. Moreover, when a VLAN upper is created whose VID matches lower
PVID, do not create RIF for this netdevice.

These changes obviate the need for ordering of IP address additions and
PVID configuration, so stop forbidding addition of an IP address to a
PVID-less bridge. Instead, bail out quietly. Also stop preventing PVID
changes when the bridge has uppers.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a5b52692

mlxsw: spectrum_router: mlxsw_sp_inetaddr_bridge_event: Add an argument · 3430f2cf

Petr Machata authored Jul 13, 2023

For purposes of replay, mlxsw_sp_inetaddr_bridge_event() will need to make
decisions based on the proposed value of PVID. Querying PVID reveals the
current settings, not the in-flight values that the user requested and that
the notifiers are acting upon. Add a parameter, lower_pvid, which carries
the proposed PVID of the lower bridge, or -1 if the lower is not a bridge.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3430f2cf

mlxsw: spectrum_router: Adjust mlxsw_sp_inetaddr_vlan_event() coding style · a24a4d29

Petr Machata authored Jul 13, 2023

The bridge branch of the dispatch in this function is going to get more
code and will need curly braces. Per the doctrine, that means the whole
if-else chain should get them.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a24a4d29

mlxsw: spectrum_router: Take VID for VLAN FIDs from RIF params · a0944b24

Petr Machata authored Jul 13, 2023

Currently, when an IP address is added to a bridge that has no PVID, the
operation is rejected. An IP address addition is interpreted as a request
to create a RIF for the bridge device, but without a PVID there is no VLAN
for which the RIF should be created. Thus the correct way to create a RIF
for a bridge as a user is to first add a PVID, and then add the IP address.

Ideally this ordering requirement would not exist. RIF would be created
either because an IP address is added, or because a PVID is added,
depending on which comes last.

For that, the switchdev code (which notices the PVID change request) must
be able to request that a RIF is created with a given VLAN ID, because at
the time that the PVID notification is distributed, the PVID setting is not
yet visible for querying.

Therefore when creating a VLAN-based RIF, use mlxsw_sp_rif_params.vid to
communicate the VID, and do not determine it ad-hoc in the fid_get
callback.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a0944b24

mlxsw: spectrum_router: Pass struct mlxsw_sp_rif_params to fid_get · 5ca9f42c

Petr Machata authored Jul 13, 2023

The fid_get callback is called to allocate a FID for the newly-created RIF.
In a following patch, the fid_get implementation for VLANs will be modified
to take the VLAN ID from the parameters instead of deducing it from the
netdevice. To that end, propagate the RIF parameters to the fid_get
callback.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5ca9f42c