1. 17 Feb, 2022 36 commits
  2. 16 Feb, 2022 4 commits
    • David S. Miller's avatar
      Merge branch 'Replay-and-offload-host-VLAN-entries-in-DSA' · f0ead99e
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      Replay and offload host VLAN entries in DSA
      
      v2->v3:
      - make the bridge stop notifying switchdev for !BRENTRY VLANs
      - create precommit and commit wrappers around __vlan_add_flags().
      - special-case the BRENTRY transition from false to true, instead of
        treating it as a change of flags and letting drivers figure out that
        it really isn't.
      - avoid setting *changed unless we know that functions will not error
        out later.
      - drop "old_flags" from struct switchdev_obj_port_vlan, nobody needs it
        now, in v2 only DSA needed it to filter out BRENTRY transitions, that
        is now solved cleaner.
      - no BRIDGE_VLAN_INFO_BRENTRY flag checks and manipulations in DSA
        whatsoever, use the "bool changed" bit as-is after changing what it
        means.
      - merge dsa_slave_host_vlan_{add,del}() with
        dsa_slave_foreign_vlan_{add,del}(), since now they do the same thing,
        because the host_vlan functions no longer need to mangle the vlan
        BRENTRY flags and bool changed.
      
      v1->v2:
      - prune switchdev VLAN additions with no actual change differently
      - no longer need to revert struct net_bridge_vlan changes on error from
        switchdev
      - no longer need to first delete a changed VLAN before readding it
      - pass 'bool changed' and 'u16 old_flags' through switchdev_obj_port_vlan
        so that DSA can do some additional post-processing with the
        BRIDGE_VLAN_INFO_BRENTRY flag
      - support VLANs on foreign interfaces
      - fix the same -EOPNOTSUPP error in mv88e6xxx, this time on removal, due
        to VLAN deletion getting replayed earlier than FDB deletion
      
      The motivation behind these patches is that Rafael reported the
      following error with mv88e6xxx when the first switch port joins a
      bridge:
      
      mv88e6085 0x0000000008b96000:00: port 0 failed to add a6:ef:77:c8:5f:3d vid 1 to fdb: -95 (-EOPNOTSUPP)
      
      The FDB entry that's added is the MAC address of the bridge, in VID 1
      (the default_pvid), being replayed as part of br_add_if() -> ... ->
      nbp_switchdev_sync_objs().
      
      -EOPNOTSUPP is the mv88e6xxx driver's way of saying that VID 1 doesn't
      exist in the VTU, so it can't program the ATU with a FID, something
      which it needs.
      
      It appears to be a race, but it isn't, since we only end up installing
      VID 1 in the VTU by coincidence. DSA's approximation of programming
      VLANs on the CPU port together with the user ports breaks down with
      host FDB entries on mv88e6xxx, since that strictly requires the VTU to
      contain the VID. But the user may freely add VLANs pointing just towards
      the bridge, and FDB entries in those VLANs, and DSA will not be aware of
      them, because it only listens for VLANs on user ports.
      
      To create a solution that scales properly to cross-chip setups and
      doesn't leak entries behind, some changes in the bridge driver are
      required. I believe that these are for the better overall, but I may be
      wrong. Namely, the same refcounting procedure that DSA has in place for
      host FDB and MDB entries can be replicated for VLANs, except that it's
      garbage in, garbage out: the VLAN addition and removal notifications
      from switchdev aren't balanced. So the first 2 patches attempt to deal
      with that.
      
      This patch set has been superficially tested on a board with 3 mv88e6xxx
      switches in a daisy chain and appears to produce the primary desired
      effect - the driver no longer returns -EOPNOTSUPP when the first port
      joins a bridge, and is successful in performing local termination under
      a VLAN-aware bridge.
      As an additional side effect, it silences the annoying "p%d: already a
      member of VLAN %d\n" warning messages that the mv88e6xxx driver produces
      when coupled with systemd-networkd, and a few VLANs are configured.
      Furthermore, it advances Florian's idea from a few years back, which
      never got merged:
      https://lore.kernel.org/lkml/20180624153339.13572-1-f.fainelli@gmail.com/
      v2 has also been tested on the NXP LS1028A felix switch.
      
      Some testing:
      
      root@debian:~# bridge vlan add dev br0 vid 101 pvid self
      [  100.709220] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_add: port 9 vlan 101
      [  100.873426] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_add: port 10 vlan 101
      [  100.892314] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_add: port 9 vlan 101
      [  101.053392] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_add: port 10 vlan 101
      [  101.076994] mv88e6085 d0032004.mdio-mii:12: mv88e6xxx_port_vlan_add: port 9 vlan 101
      root@debian:~# bridge vlan add dev br0 vid 101 pvid self
      root@debian:~# bridge vlan add dev br0 vid 101 pvid self
      root@debian:~# bridge vlan
      port              vlan-id
      eth0              1 PVID Egress Untagged
      lan9              1 PVID Egress Untagged
      lan10             1 PVID Egress Untagged
      lan11             1 PVID Egress Untagged
      lan12             1 PVID Egress Untagged
      lan13             1 PVID Egress Untagged
      lan14             1 PVID Egress Untagged
      lan15             1 PVID Egress Untagged
      lan16             1 PVID Egress Untagged
      lan17             1 PVID Egress Untagged
      lan18             1 PVID Egress Untagged
      lan19             1 PVID Egress Untagged
      lan20             1 PVID Egress Untagged
      lan21             1 PVID Egress Untagged
      lan22             1 PVID Egress Untagged
      lan23             1 PVID Egress Untagged
      lan24             1 PVID Egress Untagged
      sfp               1 PVID Egress Untagged
      lan1              1 PVID Egress Untagged
      lan2              1 PVID Egress Untagged
      lan3              1 PVID Egress Untagged
      lan4              1 PVID Egress Untagged
      lan5              1 PVID Egress Untagged
      lan6              1 PVID Egress Untagged
      lan7              1 PVID Egress Untagged
      lan8              1 PVID Egress Untagged
      br0               1 Egress Untagged
                        101 PVID
      root@debian:~# bridge vlan del dev br0 vid 101 pvid self
      [  108.340487] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_del: port 9 vlan 101
      [  108.379167] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_del: port 10 vlan 101
      [  108.402319] mv88e6085 d0032004.mdio-mii:12: mv88e6xxx_port_vlan_del: port 9 vlan 101
      [  108.425866] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_del: port 9 vlan 101
      [  108.452280] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_del: port 10 vlan 101
      root@debian:~# bridge vlan del dev br0 vid 101 pvid self
      root@debian:~# bridge vlan del dev br0 vid 101 pvid self
      root@debian:~# bridge vlan
      port              vlan-id
      eth0              1 PVID Egress Untagged
      lan9              1 PVID Egress Untagged
      lan10             1 PVID Egress Untagged
      lan11             1 PVID Egress Untagged
      lan12             1 PVID Egress Untagged
      lan13             1 PVID Egress Untagged
      lan14             1 PVID Egress Untagged
      lan15             1 PVID Egress Untagged
      lan16             1 PVID Egress Untagged
      lan17             1 PVID Egress Untagged
      lan18             1 PVID Egress Untagged
      lan19             1 PVID Egress Untagged
      lan20             1 PVID Egress Untagged
      lan21             1 PVID Egress Untagged
      lan22             1 PVID Egress Untagged
      lan23             1 PVID Egress Untagged
      lan24             1 PVID Egress Untagged
      sfp               1 PVID Egress Untagged
      lan1              1 PVID Egress Untagged
      lan2              1 PVID Egress Untagged
      lan3              1 PVID Egress Untagged
      lan4              1 PVID Egress Untagged
      lan5              1 PVID Egress Untagged
      lan6              1 PVID Egress Untagged
      lan7              1 PVID Egress Untagged
      lan8              1 PVID Egress Untagged
      br0               1 Egress Untagged
      root@debian:~# bridge vlan del dev br0 vid 101 pvid self
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f0ead99e
    • Vladimir Oltean's avatar
      net: dsa: offload bridge port VLANs on foreign interfaces · 164f861b
      Vladimir Oltean authored
      DSA now explicitly handles VLANs installed with the 'self' flag on the
      bridge as host VLANs, instead of just replicating every bridge port VLAN
      also on the CPU port and never deleting it, which is what it did before.
      
      However, this leaves a corner case uncovered, as explained by
      Tobias Waldekranz:
      https://patchwork.kernel.org/project/netdevbpf/patch/20220209213044.2353153-6-vladimir.oltean@nxp.com/#24735260
      
      Forwarding towards a bridge port VLAN installed on a bridge port foreign
      to DSA (separate NIC, Wi-Fi AP) used to work by virtue of the fact that
      DSA itself needed to have at least one port in that VLAN (therefore, it
      also had the CPU port in said VLAN). However, now that the CPU port may
      not be member of all VLANs that user ports are members of, we need to
      ensure this isn't the case if software forwarding to a foreign interface
      is required.
      
      The solution is to treat bridge port VLANs on standalone interfaces in
      the exact same way as host VLANs. From DSA's perspective, there is no
      difference between local termination and software forwarding; packets in
      that VLAN must reach the CPU in both cases.
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      164f861b
    • Vladimir Oltean's avatar
      net: dsa: add explicit support for host bridge VLANs · 134ef238
      Vladimir Oltean authored
      Currently, DSA programs VLANs on shared (DSA and CPU) ports each time it
      does so on user ports. This is good for basic functionality but has
      several limitations:
      
      - the VLAN group which must reach the CPU may be radically different
        from the VLAN group that must be autonomously forwarded by the switch.
        In other words, the admin may want to isolate noisy stations and avoid
        traffic from them going to the control processor of the switch, where
        it would just waste useless cycles. The bridge already supports
        independent control of VLAN groups on bridge ports and on the bridge
        itself, and when VLAN-aware, it will drop packets in software anyway
        if their VID isn't added as a 'self' entry towards the bridge device.
      
      - Replaying host FDB entries may depend, for some drivers like mv88e6xxx,
        on replaying the host VLANs as well. The 2 VLAN groups are
        approximately the same in most regular cases, but there are corner
        cases when timing matters, and DSA's approximation of replicating
        VLANs on shared ports simply does not work.
      
      - If a user makes the bridge (implicitly the CPU port) join a VLAN by
        accident, there is no way for the CPU port to isolate itself from that
        noisy VLAN except by rebooting the system. This is because for each
        VLAN added on a user port, DSA will add it on shared ports too, but
        for each VLAN deletion on a user port, it will remain installed on
        shared ports, since DSA has no good indication of whether the VLAN is
        still in use or not.
      
      Now that the bridge driver emits well-balanced SWITCHDEV_OBJ_ID_PORT_VLAN
      addition and removal events, DSA has a simple and straightforward task
      of separating the bridge port VLANs (these have an orig_dev which is a
      DSA slave interface, or a LAG interface) from the host VLANs (these have
      an orig_dev which is a bridge interface), and to keep a simple reference
      count of each VID on each shared port.
      
      Forwarding VLANs must be installed on the bridge ports and on all DSA
      ports interconnecting them. We don't have a good view of the exact
      topology, so we simply install forwarding VLANs on all DSA ports, which
      is what has been done until now.
      
      Host VLANs must be installed primarily on the dedicated CPU port of each
      bridge port. More subtly, they must also be installed on upstream-facing
      and downstream-facing DSA ports that are connecting the bridge ports and
      the CPU. This ensures that the mv88e6xxx's problem (VID of host FDB
      entry may be absent from VTU) is still addressed even if that switch is
      in a cross-chip setup, and it has no local CPU port.
      
      Therefore:
      - user ports contain only bridge port (forwarding) VLANs, and no
        refcounting is necessary
      - DSA ports contain both forwarding and host VLANs. Refcounting is
        necessary among these 2 types.
      - CPU ports contain only host VLANs. Refcounting is also necessary.
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      134ef238
    • Vladimir Oltean's avatar
      net: switchdev: introduce switchdev_handle_port_obj_{add,del} for foreign interfaces · c4076cdd
      Vladimir Oltean authored
      The switchdev_handle_port_obj_add() helper is good for replicating a
      port object on the lower interfaces of @dev, if that object was emitted
      on a bridge, or on a bridge port that is a LAG.
      
      However, drivers that use this helper limit themselves to a box from
      which they can no longer intercept port objects notified on neighbor
      ports ("foreign interfaces").
      
      One such driver is DSA, where software bridging with foreign interfaces
      such as standalone NICs or Wi-Fi APs is an important use case. There, a
      VLAN installed on a neighbor bridge port roughly corresponds to a
      forwarding VLAN installed on the DSA switch's CPU port.
      
      To support this use case while also making use of the benefits of the
      switchdev_handle_* replication helper for port objects, introduce a new
      variant of these functions that crawls through the neighbor ports of
      @dev, in search of potentially compatible switchdev ports that are
      interested in the event.
      
      The strategy is identical to switchdev_handle_fdb_event_to_device():
      if @dev wasn't a switchdev interface, then go one step upper, and
      recursively call this function on the bridge that this port belongs to.
      At the next recursion step, __switchdev_handle_port_obj_add() will
      iterate through the bridge's lower interfaces. Among those, some will be
      switchdev interfaces, and one will be the original @dev that we came
      from. To prevent infinite recursion, we must suppress reentry into the
      original @dev, and just call the @add_cb for the switchdev_interfaces.
      
      It looks like this:
      
                      br0
                     / | \
                    /  |  \
                   /   |   \
                 swp0 swp1 eth0
      
      1. __switchdev_handle_port_obj_add(eth0)
         -> check_cb(eth0) returns false
         -> eth0 has no lower interfaces
         -> eth0's bridge is br0
         -> switchdev_lower_dev_find(br0, check_cb, foreign_dev_check_cb))
            finds br0
      
      2. __switchdev_handle_port_obj_add(br0)
         -> check_cb(br0) returns false
         -> netdev_for_each_lower_dev
            -> check_cb(swp0) returns true, so we don't skip this interface
      
      3. __switchdev_handle_port_obj_add(swp0)
         -> check_cb(swp0) returns true, so we call add_cb(swp0)
      
      (back to netdev_for_each_lower_dev from 2)
            -> check_cb(swp1) returns true, so we don't skip this interface
      
      4. __switchdev_handle_port_obj_add(swp1)
         -> check_cb(swp1) returns true, so we call add_cb(swp1)
      
      (back to netdev_for_each_lower_dev from 2)
            -> check_cb(eth0) returns false, so we skip this interface to
               avoid infinite recursion
      
      Note: eth0 could have been a LAG, and we don't want to suppress the
      recursion through its lowers if those exist, so when check_cb() returns
      false, we still call switchdev_lower_dev_find() to estimate whether
      there's anything worth a recursion beneath that LAG. Using check_cb()
      and foreign_dev_check_cb(), switchdev_lower_dev_find() not only figures
      out whether the lowers of the LAG are switchdev, but also whether they
      actively offload the LAG or not (whether the LAG is "foreign" to the
      switchdev interface or not).
      
      The port_obj_info->orig_dev is preserved across recursive calls, so
      switchdev drivers still know on which device was this notification
      originally emitted.
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c4076cdd