    net: add generic PF_BRIDGE:RTM_ FDB hooks · 77162022
    John Fastabend authored
    This adds two new flags, NTF_MASTER and NTF_SELF, that can
    now be used to specify where PF_BRIDGE netlink commands should
    be sent. NTF_MASTER sends the commands to the 'dev->master'
    device for parsing. Typically this will be the Linux net/bridge
    or open-vswitch devices. When no flags are set, the command is
    also handled by the master device, so that current user space
    tools continue to work as expected.
    
    The NTF_SELF flag will push the PF_BRIDGE commands to the
    device. In the basic example below the commands are then parsed
    and programmed in the embedded bridge.
    
    Note that if both the NTF_SELF and NTF_MASTER bits are set, the
    command will be sent to both 'dev->master' and 'dev'. This allows
    user space to easily keep the embedded bridge and software bridge
    in sync.
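
The routing rules above can be modeled in a few lines. This is only an illustrative user space sketch, not the kernel code: the function name `fdb_targets` and the "master"/"self" labels are made up for this example, and the flag values are assumed to match the kernel's neighbour header.

```python
# Flag values assumed to match the kernel's neighbour header.
NTF_SELF = 0x02
NTF_MASTER = 0x04


def fdb_targets(ndm_flags):
    """Return which devices would receive a PF_BRIDGE FDB command.

    "master" stands for 'dev->master', "self" for 'dev' itself.
    Hypothetical helper for illustration only.
    """
    targets = []
    # No flags at all keeps the legacy behavior: the master handles it.
    if ndm_flags == 0 or ndm_flags & NTF_MASTER:
        targets.append("master")
    # NTF_SELF pushes the command to the device (embedded bridge).
    if ndm_flags & NTF_SELF:
        targets.append("self")
    return targets


if __name__ == "__main__":
    for flags in (0, NTF_MASTER, NTF_SELF, NTF_MASTER | NTF_SELF):
        print(hex(flags), fdb_targets(flags))
```

The order in which the two targets are listed is an assumption of the sketch; the description above only says the command goes to both.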
    
    The case with both flags set is slightly complicated when an
    error occurs. To resolve this, the rtnl handler clears each NTF_
    flag in the netlink ack once its operation completes successfully,
    so any flag still set identifies an operation that did not
    complete. The add/del handlers abort as soon as any error occurs.
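
A rough model of that error handling, for readers who want to see how the ack flags could be interpreted. Everything here is a sketch under assumptions: `run_fdb_command`, `master_op` and `self_op` are hypothetical names, the callables return 0 on success or a negative errno-style value, and trying the master before the device is an assumed ordering.

```python
NTF_SELF = 0x02
NTF_MASTER = 0x04


def run_fdb_command(ndm_flags, master_op, self_op):
    """Return (error, flags_left_in_ack) for one add/del request.

    A flag is cleared only after its operation succeeds, so a flag
    that is still set in the result marks work that did not complete.
    """
    flags = ndm_flags
    if flags == 0 or flags & NTF_MASTER:
        err = master_op()
        if err:
            return err, flags      # abort at the first error
        flags &= ~NTF_MASTER       # master step completed
    if flags & NTF_SELF:
        err = self_op()
        if err:
            return err, flags      # NTF_SELF stays set in the ack
        flags &= ~NTF_SELF         # device step completed
    return 0, flags


if __name__ == "__main__":
    both = NTF_MASTER | NTF_SELF
    # Master succeeds, device fails: only NTF_SELF remains set.
    print(run_fdb_command(both, lambda: 0, lambda: -95))
```

With both flags set and a failure on the device, user space sees an error plus an ack in which only NTF_SELF is still set, which tells it the software bridge was updated but the embedded bridge was not.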
    
    To support this, new net device ops were added to call into
    the device, and the existing bridging code was refactored
    to use them. No user space changes should be required
    to support the current bridge behavior.
    
    A basic setup with an SR-IOV enabled NIC looks like this:
    
              veth0  veth2
                |      |
              ------------
              |  bridge0 |   <---- software bridging
              ------------
                   /
                   /
      ethx.y      ethx
        VF         PF
         \         \          <---- propagate FDB entries to HW
         \         \
      --------------------
      |  Embedded Bridge |    <---- hardware offloaded switching
      --------------------
    
    In this case the embedded bridge must be managed to allow 'veth0'
    to communicate with 'ethx.y' correctly. At present drivers managing
    the embedded bridge either send frames onto the network which
    then get dropped by the switch OR the embedded bridge will flood
    these frames. With this patch we have a mechanism to manage the
    embedded bridge correctly from user space. This example is specific
    to SR-IOV but replacing the VF with another PF or dropping this
    into the DSA framework generates similar management issues.
    
    Example session using the 'br'[1] tool to add, dump and then
    delete a MAC address, with a new "embedded" option and an enabled
    ixgbe driver:
    
    # br fdb add 22:35:19:ac:60:59 dev eth3
    # br fdb
    port    mac addr                flags
    veth0   22:35:19:ac:60:58       static
    veth0   9a:5f:81:f7:f6:ec       local
    eth3    00:1b:21:55:23:59       local
    eth3    22:35:19:ac:60:59       static
    veth0   22:35:19:ac:60:57       static
    # br fdb add 22:35:19:ac:60:59 embedded dev eth3
    # br fdb
    port    mac addr                flags
    veth0   22:35:19:ac:60:58       static
    veth0   9a:5f:81:f7:f6:ec       local
    eth3    00:1b:21:55:23:59       local
    eth3    22:35:19:ac:60:59       static
    veth0   22:35:19:ac:60:57       static
    eth3    22:35:19:ac:60:59       local embedded
    # br fdb del 22:35:19:ac:60:59 embedded dev eth3
    
    The only change to 'br' was a couple of lines to set the flags
    correctly. In my opinion, the merit of this patch is that embedded
    and SW bridges can now both be modeled correctly in user space
    using very nearly the same message passing.
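
To illustrate how small the user space difference is, here is a sketch that packs the fixed neighbour message header (struct ndmsg) with and without NTF_SELF. It assumes the "embedded" option corresponds to setting NTF_SELF, uses a made-up ifindex, and takes NUD_PERMANENT as an arbitrary default state. A complete request would also need the netlink header and the MAC address attribute, which are omitted.

```python
import struct

AF_BRIDGE = 7          # same numeric value as PF_BRIDGE
NTF_SELF = 0x02
NTF_MASTER = 0x04
NUD_PERMANENT = 0x80

# struct ndmsg: family (u8), pad1 (u8), pad2 (u16), ifindex (s32),
# state (u16), flags (u8), type (u8) = 12 bytes in total.
NDMSG = struct.Struct("=BBHiHBB")


def pack_ndmsg(ifindex, flags, state=NUD_PERMANENT):
    """Pack only the ndmsg header; hypothetical helper for illustration."""
    return NDMSG.pack(AF_BRIDGE, 0, 0, ifindex, state, flags, 0)


if __name__ == "__main__":
    ifindex = 3  # made-up interface index
    plain = pack_ndmsg(ifindex, 0)             # handled by the master
    embedded = pack_ndmsg(ifindex, NTF_SELF)   # pushed to the device
    # The two headers differ only in the flags byte.
    print(plain.hex())
    print(embedded.hex())
```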
    
    [1] 'br' tool was published as an RFC here and will be renamed 'bridge'
        http://patchwork.ozlabs.org/patch/117664/
    
    Thanks to Jamal Hadi Salim, Stephen Hemminger and Ben Hutchings for
    valuable feedback, suggestions, and review.
    
    v2: fixed api descriptions and error case with both NTF_SELF and
        NTF_MASTER set plus updated patch description.
    Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>