1. 15 Jan, 2020 15 commits
    • netdevsim: fib: Add dummy implementation for FIB offload · 48bb9eb4
      Ido Schimmel authored
      Implement dummy IPv4 and IPv6 FIB "offload" in the driver by storing
      currently "programmed" routes in a hash table. Each route in the hash
      table is marked with "trap" indication. The indication is cleared when
      the route is replaced or when the netdevsim instance is deleted.
      
      This will later allow us to test the route offload API on top of
      netdevsim.
      
      v2:
      * Convert to new fib_alias_hw_flags_set() interface
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Set hardware flags for routes · ee5a0448
      Ido Schimmel authored
      Previous patches added support for two hardware flags for IPv4 and IPv6
      routes: 'RTM_F_OFFLOAD' and 'RTM_F_TRAP'. Both indicate the presence of
      the route in hardware. The first indicates that traffic is actually
      offloaded from the kernel, whereas the second indicates that packets
      hitting such routes are trapped to the kernel for processing (e.g., host
      routes).
      
      Use these two flags in mlxsw. The flags are modified in two places.
      Firstly, whenever a route is updated in the device's table. This
      includes the addition, deletion or update of a route. For example, when
      a host route is promoted to perform NVE decapsulation, its action in the
      device is updated, the 'RTM_F_OFFLOAD' flag set and the 'RTM_F_TRAP'
      flag cleared.
      
      Secondly, when a route is replaced and overwritten by another route, its
      flags are cleared.
      
      v2:
      * Convert to new fib_alias_hw_flags_set() interface
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Separate nexthop offload indication from route · 8c5a5b9b
      Ido Schimmel authored
      The driver currently uses the 'RTNH_F_OFFLOAD' flag for both routes and
      nexthops, which is cumbersome and unnecessary now that we have a separate
      flag for the route itself.
      
      Separate the offload indication for nexthops from routes and call it
      whenever the offload state within the nexthop group changes.
      
      Note that IPv6 (unlike IPv4) does not share the same nexthop group
      between different routes, whereas mlxsw does. Therefore, whenever the
      offload indication within an IPv6 nexthop group changes, all the linked
      routes need to be updated.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Add "offload" and "trap" indications to routes · bb3c4ab9
      Ido Schimmel authored
      In a similar fashion to the previous patch, add "offload" and "trap"
      indications to IPv6 routes.
      
      This is done by using two unused bits in 'struct fib6_info' to hold
      these indications. Capable drivers are expected to set these when
      processing the various in-kernel route notifications.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Add "offload" and "trap" indications to routes · 90b93f1b
      Ido Schimmel authored
      When performing L3 offload, routes and nexthops are usually programmed
      into two different tables in the underlying device. Therefore, the fact
      that a nexthop resides in hardware does not necessarily mean that all
      the associated routes also reside in hardware and vice-versa.
      
      While the kernel can signal to user space the presence of a nexthop in
      hardware (via 'RTNH_F_OFFLOAD'), it does not have a corresponding flag
      for routes. In addition, the fact that a route resides in hardware does
      not necessarily mean that the traffic is offloaded. For example,
      unreachable routes (i.e., 'RTN_UNREACHABLE') are programmed to trap
      packets to the CPU so that the kernel will be able to generate the
      appropriate ICMP error packet.
      
      This patch adds "offload" and "trap" indications to IPv4 routes, so
      that users will have better visibility into the offload process.
      
      'struct fib_alias' is extended with two new fields that indicate if the
      route resides in hardware or not and if it is offloading traffic from
      the kernel or trapping packets to it. Note that the new fields are added
      in the 6 bytes hole and therefore the struct still fits in a single
      cache line [1].
      
      Capable drivers are expected to invoke fib_alias_hw_flags_set() with the
      route's key in order to set the flags.
      
      The indications are dumped to user space via new flags (i.e.,
      'RTM_F_OFFLOAD' and 'RTM_F_TRAP') in the 'rtm_flags' field in the
      ancillary header.
      
      v2:
      * Make use of 'struct fib_rt_info' in fib_alias_hw_flags_set()
      
      [1]
      struct fib_alias {
              struct hlist_node  fa_list;                      /*     0    16 */
              struct fib_info *          fa_info;              /*    16     8 */
              u8                         fa_tos;               /*    24     1 */
              u8                         fa_type;              /*    25     1 */
              u8                         fa_state;             /*    26     1 */
              u8                         fa_slen;              /*    27     1 */
              u32                        tb_id;                /*    28     4 */
              s16                        fa_default;           /*    32     2 */
              u8                         offload:1;            /*    34: 0  1 */
              u8                         trap:1;               /*    34: 1  1 */
              u8                         unused:6;             /*    34: 2  1 */
      
              /* XXX 5 bytes hole, try to pack */
      
              struct callback_head rcu __attribute__((__aligned__(8))); /*    40    16 */
      
              /* size: 56, cachelines: 1, members: 12 */
              /* sum members: 50, holes: 1, sum holes: 5 */
              /* sum bitfield members: 8 bits (1 bytes) */
              /* forced alignments: 1, forced holes: 1, sum forced holes: 5 */
              /* last cacheline: 56 bytes */
      } __attribute__((__aligned__(8)));
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
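      As a rough illustration of how user space might interpret the two flags
      described in the commit above, the sketch below classifies an 'rtm_flags'
      value. The helper name route_hw_state() is hypothetical, and the flag
      values are an assumption based on what include/uapi/linux/rtnetlink.h is
      understood to define; they are redefined only so the example builds
      without kernel headers.

```c
/*
 * Assumed flag values (believed to match the uapi rtnetlink header).
 * Guarded so that real headers take precedence when they are included.
 */
#ifndef RTM_F_OFFLOAD
#define RTM_F_OFFLOAD 0x4000 /* traffic is offloaded from the kernel */
#endif
#ifndef RTM_F_TRAP
#define RTM_F_TRAP 0x8000 /* packets hitting the route are trapped to the kernel */
#endif

/* Hypothetical helper: summarize the hardware state encoded in rtm_flags. */
static const char *route_hw_state(unsigned int rtm_flags)
{
	if (rtm_flags & RTM_F_OFFLOAD)
		return "offload";
	if (rtm_flags & RTM_F_TRAP)
		return "trap";
	return "not in hardware";
}
```

      Both flags mean the route is present in hardware; a route with neither
      flag set is handled purely in software.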
    • ipv4: Encapsulate function arguments in a struct · 1e301fd0
      Ido Schimmel authored
      fib_dump_info() is used to prepare RTM_{NEW,DEL}ROUTE netlink messages
      using the passed arguments. Currently, the function takes 11 arguments,
      6 of which are attributes of the route being dumped (e.g., prefix, TOS).
      
      The next patch will need the function to also dump to user space an
      indication if the route is present in hardware or not. Instead of
      passing yet another argument, change the function to take a struct
      containing the different route attributes.
      
      v2:
      * Name last argument of fib_dump_info()
      * Move 'struct fib_rt_info' to include/net/ip_fib.h so that it could
        later be passed to fib_alias_hw_flags_set()
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Replace route in list before notifying · 6324d0fa
      Ido Schimmel authored
      Subsequent patches will add an offload / trap indication to routes which
      will signal if the route is present in hardware or not.
      
      After programming the route to the hardware, drivers will have to ask
      the IPv4 code to set the flags by passing the route's key.
      
      In the case of route replace, the new route is notified before it is
      actually inserted into the FIB alias list. This can prevent simple
      drivers (e.g., netdevsim) that program the route to the hardware in the
      same context it is notified in from being able to set the flag.
      
      Solve this by first inserting the new route into the list and rolling back
      the operation in case the route was vetoed.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
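      The ordering described in the commit above (insert first, notify second,
      roll back on veto) can be modeled in a few lines. This is a simplified
      user-space sketch, not the kernel code: 'struct route', route_replace()
      and both notifier functions are hypothetical names, and a single pointer
      slot stands in for the FIB alias list.

```c
#include <stddef.h>

struct route {
	int metric;
};

/* Hypothetical notifier type: a nonzero return value vetoes the route. */
typedef int (*route_notifier)(struct route **slot, struct route *new_rt);

/* Replace *slot with new_rt, then notify; restore the old route on veto. */
static int route_replace(struct route **slot, struct route *new_rt,
			 route_notifier notify)
{
	struct route *old = *slot;
	int err;

	*slot = new_rt;             /* insert before notifying */
	err = notify(slot, new_rt); /* listener can already look the route up */
	if (err)
		*slot = old;        /* vetoed: roll back */
	return err;
}

/* Example listener that succeeds only if the new route is already visible. */
static int notify_needs_visible(struct route **slot, struct route *new_rt)
{
	return *slot == new_rt ? 0 : -1;
}

/* Example listener that vetoes every route. */
static int notify_veto(struct route **slot, struct route *new_rt)
{
	(void)slot;
	(void)new_rt;
	return -1;
}
```

      With the old ordering (notify, then insert) a listener like
      notify_needs_visible() would fail, which is the situation the commit
      describes for drivers that program the route in the notification context.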
    • net: socionext: get rid of huge dma sync in netsec_alloc_rx_data · 0fadc0a2
      Lorenzo Bianconi authored
      The Socionext driver can run on DMA-coherent and non-coherent devices.
      Get rid of the huge dma_sync_single_for_device() call in
      netsec_alloc_rx_data(), since the driver can now let the page_pool API
      manage the needed DMA sync.
      Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'QRTR-flow-control-improvements' · 0c73ffc7
      David S. Miller authored
      Bjorn Andersson says:
      
      ====================
      QRTR flow control improvements
      
      In order to prevent overconsumption of resources on the remote side QRTR
      implements a flow control mechanism.
      
      Move the handling of the incoming confirm_rx to the receiving process to
      ensure incoming flow is controlled. Then implement outgoing flow
      control, using the recommended algorithm of counting outstanding
      non-confirmed messages and blocking when hitting a limit. The last three
      patches refactor the node assignment and port lookup, in order to
      remove the worker in the receive path.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qrtr: Remove receive worker · e04df98a
      Bjorn Andersson authored
      Rather than enqueuing messages and scheduling a worker to deliver them
      to the individual sockets we can now, thanks to the previous work, move
      this directly into the endpoint callback.
      
      This saves us a context switch per incoming message and removes the
      possibility of an opportunistic suspend happening between the message
      arriving from the endpoint and it ending up in the socket's receive
      buffer.
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qrtr: Make qrtr_port_lookup() use RCU · f16a4b26
      Bjorn Andersson authored
      The important part of qrtr_port_lookup() wrt synchronization is that the
      function returns a reference counted struct qrtr_sock, or fails.

      As such we need only to ensure that no decrement of the object's
      refcount happens in between the finding of the object in the idr and
      qrtr_port_lookup()'s own increment of the object.
      
      By using RCU and putting a synchronization point after we remove the
      mapping from the idr, but before it can be released we achieve this -
      with the benefit of not having to hold the mutex in qrtr_port_lookup().
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qrtr: Migrate node lookup tree to spinlock · 0a7e0d0e
      Bjorn Andersson authored
      Move operations on the qrtr_nodes radix tree under a separate spinlock
      and make the qrtr_nodes tree GFP_ATOMIC, to allow operation from atomic
      context in a subsequent patch.
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: qrtr: Implement outgoing flow control · 5fdeb0d3
      Bjorn Andersson authored
      In order to prevent overconsumption of resources on the remote side QRTR
      implements a flow control mechanism.
      
      The mechanism works by the sender keeping track of the number of
      outstanding unconfirmed messages that have been transmitted to a
      particular node/port pair.

      When the count reaches a low watermark (L), the confirm_rx bit is set in
      the outgoing message, and when the count reaches a high watermark (H),
      transmission is blocked until a resume_tx message is received from the
      remote, which resets the counter to 0.
      
      This guarantees that there will be at most 2H - L messages in flight.
      Values chosen for L and H are 5 and 10 respectively.
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
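      The watermark scheme in the commit above can be modeled as a small
      per-destination counter. The sketch below is a user-space approximation
      under stated assumptions: the type and function names are hypothetical,
      blocking is represented by a return value rather than by sleeping, and
      only the values L = 5 and H = 10 come from the commit message.

```c
#define TX_FLOW_LOW 5   /* L: request a confirmation at this count */
#define TX_FLOW_HIGH 10 /* H: stop sending at this count */

enum tx_action {
	TX_SEND,         /* send normally */
	TX_SEND_CONFIRM, /* send with the confirm_rx bit set */
	TX_BLOCK,        /* do not send; wait for resume_tx */
};

/* Hypothetical per node/port flow state. */
struct tx_flow {
	int pending; /* unconfirmed messages sent so far */
};

/* Decide what to do with the next outgoing message. */
static enum tx_action tx_flow_next(struct tx_flow *flow)
{
	if (flow->pending >= TX_FLOW_HIGH)
		return TX_BLOCK;

	flow->pending++;
	if (flow->pending == TX_FLOW_LOW)
		return TX_SEND_CONFIRM;
	return TX_SEND;
}

/* Called when a resume_tx message arrives from the remote. */
static void tx_flow_resume(struct tx_flow *flow)
{
	flow->pending = 0;
}
```

      In this model the fifth message carries the confirmation request, the
      eleventh attempt is refused, and a resume_tx re-opens the window.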
    • net: qrtr: Move resume-tx transmission to recvmsg · cb6530b9
      Bjorn Andersson authored
      The confirm-rx bit is used to implement a per port flow control, in
      order to make sure that no messages are dropped due to resource
      exhaustion. Move the resume-tx transmission to recvmsg to only confirm
      messages as they are consumed by the application.
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pktgen: Allow configuration of IPv6 source address range · 7786a1af
      Niu Xilei authored
      Pktgen can use only one IPv6 source address, taken either from the output
      device or from the src6 command. Stress tests need to create more than
      65535 sessions, so add the src6_min and src6_max commands to set a source
      address range.
      Signed-off-by: Niu Xilei <niu_xilei@163.com>
      
      Changes since v3:
       - function set_src_in6_addr use static instead of static inline
       - precompute min_in6_l,min_in6_h,max_in6_h,max_in6_l in setup time
      Changes since v2:
       - reword subject line
      Changes since v1:
       - only create IPv6 source address over least significant 64 bit range
      Signed-off-by: David S. Miller <davem@davemloft.net>
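      To make the "least significant 64 bit range" note in the changelog above
      concrete, the sketch below walks the lower half of an IPv6 address
      between two bounds while the upper half stays fixed. It is an assumption
      made for illustration, not pktgen's code: the struct, the function name
      and the sequential wrap-around policy are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical range state: only the low 64 bits of the address vary. */
struct src6_range {
	uint64_t hi;     /* fixed upper 64 bits */
	uint64_t min_lo; /* lower bound of the low 64 bits (inclusive) */
	uint64_t max_lo; /* upper bound of the low 64 bits (inclusive) */
	uint64_t cur_lo; /* value to hand out next */
};

/* Return the current low half and advance, wrapping from max to min. */
static uint64_t src6_next_lo(struct src6_range *r)
{
	uint64_t lo = r->cur_lo;

	/* Compare before incrementing so max_lo == UINT64_MAX cannot overflow. */
	r->cur_lo = (lo >= r->max_lo) ? r->min_lo : lo + 1;
	return lo;
}
```

      Restricting the range to 64 bits keeps the arithmetic within a single
      native integer, which may be one reason for the v1 change noted above.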
  2. 14 Jan, 2020 25 commits
    • nfc: No need to set .owner platform_driver_register · a4d35e77
      Tian Tao authored
      the i2c_add_driver will set the .owner to THIS_MODULE
      Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'skb_list_walk_safe-refactoring' · 2b133adf
      David S. Miller authored
      Jason A. Donenfeld says:
      
      ====================
      skb_list_walk_safe refactoring for net/*'s skb_gso_segment usage
      
      This patchset adjusts all return values of skb_gso_segment in net/* to
      use the new skb_list_walk_safe helper.
      
      First we fix a minor bug in the helper macro that didn't come up in the
      last patchset's uses. Then we adjust several cases throughout net/. The
      xfrm changes were a bit hairy, but doable. Reading and thinking about
      the code in mac80211 indicates a memory leak, which the commit
      addresses. All the other cases were pretty trivial.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mac80211: use skb_list_walk_safe helper for gso segments · 9f3ef3d7
      Jason A. Donenfeld authored
      This is a conversion case for the new function, keeping the flow of the
      existing code as intact as possible. We also switch over to using
      skb_mark_not_on_list instead of a null write to skb->next.
      
      Finally, this code appeared to have a memory leak in the case where
      header building fails before the last gso segment. In that case, the
      remaining segments are not freed. So this commit also adds the proper
      kfree_skb_list call for the remainder of the skbs.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: netfilter: use skb_list_walk_safe helper for gso segments · 2670ee77
      Jason A. Donenfeld authored
      This is a straight-forward conversion case for the new function, keeping
      the flow of the existing code as intact as possible.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv4: use skb_list_walk_safe helper for gso segments · 88bebdf5
      Jason A. Donenfeld authored
      This is a straight-forward conversion case for the new function, keeping
      the flow of the existing code as intact as possible.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: use skb_list_walk_safe helper for gso segments · b950d8a5
      Jason A. Donenfeld authored
      This is a straight-forward conversion case for the new function, keeping
      the flow of the existing code as intact as possible.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: use skb_list_walk_safe helper for gso segments · 2cec4448
      Jason A. Donenfeld authored
      This is a straight-forward conversion case for the new function, keeping
      the flow of the existing code as intact as possible.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: xfrm: use skb_list_walk_safe helper for gso segments · c3b18e0d
      Jason A. Donenfeld authored
      This converts xfrm segment iteration to use the new function, keeping
      the flow of the existing code as intact as possible. One case is very
      straight-forward, whereas the other case has some more subtle code that
      likes to peek at ->next and relink skbs. By keeping the variables the
      same as before, we can upgrade this code with minimal surgery required.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: udp: use skb_list_walk_safe helper for gso segments · 1a186c14
      Jason A. Donenfeld authored
      This is a straight-forward conversion case for the new function,
      iterating over the return value from udp_rcv_segment, which actually is
      a wrapper around skb_gso_segment.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: skbuff: disambiguate argument and member for skb_list_walk_safe helper · 5eee7bd7
      Jason A. Donenfeld authored
      This worked before, because we made all callers name their next pointer
      "next". But in trying to be more "drop-in" ready, the silliness here is
      revealed. This commit fixes the problem by making the macro argument and
      the member use different names.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
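      The name clash fixed by the commit above is easy to reproduce outside the
      kernel. The sketch below uses a hypothetical 'struct node' and a
      hypothetical list_walk_safe() macro shaped like the helper; it is an
      illustration of the preprocessor issue, not the kernel's definition.

```c
#include <stddef.h>

struct node {
	struct node *next;
	int val;
};

/*
 * The third parameter must not be named "next". If it were, a caller
 * passing "tmp" would see "(cur)->next" rewritten to "(cur)->tmp", since
 * the preprocessor substitutes the parameter name everywhere in the body.
 */
#define list_walk_safe(first, cur, next_node)                             \
	for ((cur) = (first), (next_node) = (cur) ? (cur)->next : NULL;   \
	     (cur);                                                       \
	     (cur) = (next_node), (next_node) = (cur) ? (cur)->next : NULL)

/*
 * Sum all values while unlinking each node. This is safe because the
 * successor is saved before the loop body modifies the current node.
 */
static int sum_and_unlink(struct node *head)
{
	struct node *cur, *tmp;
	int sum = 0;

	list_walk_safe(head, cur, tmp) {
		sum += cur->val;
		cur->next = NULL;
	}
	return sum;
}
```

      Naming the iteration variable "tmp" here is the "drop-in" case the commit
      refers to: callers no longer have to call their variable "next".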
    • Merge branch 'macsec-hw-offload' · ec22ab00
      David S. Miller authored
      Antoine Tenart says:
      
      ====================
      net: macsec: initial support for hardware offloading
      
      This series intends to add support for offloading MACsec transformations
      to hardware enabled devices. The series adds the necessary
      infrastructure for offloading MACsec configurations to hardware drivers,
      in patches 1 to 5; then introduces MACsec offloading support in the
      Microsemi MSCC PHY driver, in patches 6 to 10.
      
      The series can also be found at:
      https://github.com/atenart/linux/tree/net-next/macsec
      
      IProute2 modifications can be found at:
      https://github.com/atenart/iproute2/tree/macsec
      
      MACsec hardware offloading infrastructure
      -----------------------------------------
      
      Linux has a software implementation of the MACsec standard. There are
      hardware engines supporting MACsec operations, such as the Intel ixgbe
      NIC and some Microsemi PHYs (the one we use in this series). This means
      the MACsec offloading infrastructure should support networking PHY and
      MAC drivers. Note that MAC driver preliminary support is part of this
      series, but should not be merged before we actually have a provider for
      this.
      
      We do intend in this series to re-use the logic, netlink API and data
      structures of the existing MACsec software implementation. This avoids
      duplicating definitions and structures that store the same information,
      and allows the same userspace tools to configure both software and
      hardware offloaded MACsec flows (with `ip macsec`).
      
      When adding a new MACsec virtual interface the existing logic is kept:
      offloading is disabled by default. A user driven configuration choice is
      needed to switch to offloading mode (a patch in iproute2 is needed for
      this). A single MACsec interface can be offloaded for now, and some
      limitations are there: no flow can be moved from one implementation to
      the other so the decision needs to be done before configuring the
      interface.
      
      MACsec offloading ops are called in 2 steps: a preparation one, and a
      commit one. The first step is allowed to fail and should be used to
      check if a provided configuration is compatible with a given MACsec
      capable hardware. The second step is not allowed to fail and should
      only be used to enable a given MACsec configuration.
      
      A limitation as of now is the counters and statistics are not reported
      back from the hardware to the software MACsec implementation. This
      isn't an issue when using offloaded MACsec transformations, but it
      should be added in the future so that the MACsec state can be reported
      to the user (which would also improve the debug).
      
      Microsemi PHY MACsec support
      ----------------------------
      
      In order to add support for the MACsec offloading feature in the
      Microsemi MSCC PHY driver, the __phy_read_page and __phy_write_page
      helpers had to be exported. This is because the initialization of the
      PHY is done while holding the MDIO bus lock, and we need to change the
      page to configure the MACsec block.
      
      The support itself is then added in three patches. The first one adds
      support for configuring the MACsec block within the PHY, so that it is
      up, running and available for future configuration, but is not doing any
      modification on the traffic passing through the PHY. The second patch
      implements the phy_device MACsec ops in the Microsemi MSCC PHY driver,
      and introduces helpers to configure MACsec transformations and flows to
      match specific packets. The last one adds support for PN rollover.
      
      Thanks!
      Antoine
      
      Since v5:
        - Fixed a compilation issue due to an inclusion from an UAPI header.
        - Added an EXPORT_SYMBOL_GPL for the PN rollover helper, to fix module
          compilation issues.
        - Added a dependency for the MSCC driver on MACSEC || MACSEC=n.
        - Removed the patches including the MAC offloading support as they are
          not to be applied for now.
      
      Since v4:
        - Reworked the MACsec read and write functions in the MSCC PHY driver
          to remove the conditional locking.
      
      Since v3:
        - Fixed a check when enabling offloading that was too restrictive.
        - Fixed the propagation of the changelink event to the underlying
          device drivers.
      
      Since v2:
        - Allow selecting the offloading mode from userspace, defaulting to the
          software implementation when adding a new MACsec interface. The
          offloading mode is now also reported through netlink.
        - Added support for letting MKA packets in and out when using MACsec
          (there are rules to let them bypass the MACsec h/w engine within the
          PHY).
        - Added support for PN rollover (following what's currently done in
          the software implementation: the flow is disabled).
        - Split patches to remove MAC offloading support for now, as there is
          currently no provider for this (patches are still included).
        - Improved a few parts of the MACsec support within the MSCC PHY
          driver (e.g. default rules now block non-MACsec traffic, depending
          on the configuration).
        - Many cosmetic fixes & small improvements.
      
      Since v1:
        - Reworked the MACsec offloading API, moving from a single helper
          called for all MACsec configuration operations, to a per-operation
          function that is provided by the underlying hardware drivers.
        - Those functions now contain a verb to describe the configuration
          action they're offloading.
        - Improved the error handling in the MACsec genl helpers to revert
          the configuration to its previous state when the offloading call
          failed.
        - Reworked the file inclusions.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: mscc: PN rollover support · 781449a4
      Antoine Tenart authored
      This patch adds support for handling MACsec PN rollover in the mscc PHY
      driver. When a flow rolls over, an interrupt is fired. This patch adds
      the logic to check all flows and identify the one rolling over in the
      handle_interrupt PHY helper, then disables the flow and reports the event
      to the MACsec core.
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macsec: PN wrap callback · 5c937de7
      Antoine Tenart authored
      Allow hardware drivers to call macsec_pn_wrapped to notify the core when a
      PN rolls over. Some drivers might use an interrupt to implement this.
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: mscc: macsec support · 28c5107a
      Antoine Tenart authored
      This patch adds MACsec offloading support to some Microsemi PHYs, to
      configure flows and transformations so that matched packets can be
      processed by the MACsec engine, either at egress, or at ingress.
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: mscc: macsec initialization · 1bbe0ecc
      Antoine Tenart authored
      This patch adds support for initializing the MACsec engine found within
      some Microsemi PHYs. The engine is initialized in a passthrough mode and
      does not modify any incoming or outgoing packet. But thanks to this it
      now can be configured to perform MACsec transformations on packets,
      which will be supported by a future patch.
      
      The MACsec read and write functions are wrapped into two versions: one
      called during the init phase, and the other one later on. This is
      because the init functions in the Microsemi PHY driver are called while
      the MDIO bus lock is taken.
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macsec: add nla support for changing the offloading selection · dcb780fb
      Antoine Tenart authored
      MACsec offloading to underlying hardware devices is disabled by default
      (the software implementation is used). This patch adds support for
      changing this setting through the MACsec netlink interface. Many checks
      are done when enabling offloading on a given MACsec interface as there
      are limitations (it must be supported by the hardware, only a single
      interface can be offloaded on a given physical device at a time, rules
      can't be moved for now).
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macsec: hardware offloading infrastructure · 3cf3227a
      Antoine Tenart authored
      This patch introduces the MACsec hardware offloading infrastructure.
      
      The main idea here is to re-use the logic and data structures of the
      software MACsec implementation. This avoids duplicating definitions and
      structures that store the same kind of information. It also allows a
      unified genlink interface to be used for both MACsec implementations (so
      that the same userspace tool, `ip macsec`, is used with the same arguments).
      The MACsec offloading support cannot be disabled if an interface
      supports it at the moment.
      
      The MACsec configuration is passed to device drivers supporting it
      through macsec_ops which are called from the MACsec genl helpers. Those
      functions call the macsec ops of PHY and Ethernet drivers in two steps:
      a preparation one, and a commit one. The first step is allowed to fail
      and should be used to check if a provided configuration is compatible
      with the features provided by a MACsec engine, while the second step is
      not allowed to fail and should only be used to enable a given MACsec
      configuration. Two extra calls are made: when a virtual MACsec interface
      is created and when it is deleted, so that the hardware driver can stay
      in sync.
      
      The Rx and Tx handlers are modified to take into account the special
      case where the MACsec transformation happens in the hardware, whether in
      a PHY or in a MAC, as the packets seen by the networking stack on both
      the physical and MACsec virtual interface are exactly the same. This
      leads to some limitations: the hardware and software implementations
      can't be used on the same physical interface, as the policies would be
      impossible to fulfill (such as strict validation of the frames). Also,
      only a single virtual MACsec interface can be offloaded to a physical
      port supporting hardware offloading, as it would otherwise be impossible
      to guess which interface a given packet should go to (for ingress
      traffic).
      
      Another limitation as of now is that the counters and statistics are not
      reported back from the hardware to the software MACsec implementation.
      This isn't an issue when using offloaded MACsec transformations, but it
      should be added in the future so that the MACsec state can be reported
      to the user (which would also make debugging easier).
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3cf3227a
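The two-step prepare/commit calling convention described in this commit message can be illustrated with a small userspace sketch. Everything here is a simplified assumption for illustration: `sketch_ctx`, `demo_add_secy()` and `sketch_offload()` are hypothetical names, and the key-length check is an invented example of a hardware limitation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified context handed to a driver callback. */
struct sketch_ctx {
	bool prepare;         /* true during the prepare step, false on commit */
	unsigned int key_len; /* example configuration value, in bytes */
};

/* Example driver callback for an imaginary engine that only handles
 * 16-byte keys (an invented limitation). */
static int demo_add_secy(struct sketch_ctx *ctx)
{
	if (ctx->prepare) {
		/* Prepare step: only validate, allowed to fail. */
		return ctx->key_len == 16 ? 0 : -EOPNOTSUPP;
	}

	/* Commit step: apply the already validated configuration. */
	return 0;
}

/* Call one driver operation in two steps: prepare, then commit. */
static int sketch_offload(int (*op)(struct sketch_ctx *),
			  struct sketch_ctx *ctx)
{
	int ret;

	if (op == NULL)
		return -EOPNOTSUPP; /* driver does not implement this op */

	ctx->prepare = true;
	ret = op(ctx);
	if (ret)
		return ret;         /* incompatible configuration: stop here */

	ctx->prepare = false;
	ret = op(ctx);
	assert(ret == 0);       /* the commit step is not allowed to fail */
	return ret;
}
```

The point of the split is that all validation happens before any state is changed, so a rejected configuration leaves nothing half-applied.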
    • Antoine Tenart's avatar
      net: phy: add MACsec ops in phy_device · 2e181358
      Antoine Tenart authored
      This patch adds a reference to MACsec ops in the phy_device, to allow
      PHYs to support offloading MACsec operations. The phydev lock will be
      held while calling those helpers.
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2e181358
    • Antoine Tenart's avatar
      net: macsec: introduce MACsec ops · 0830e20b
      Antoine Tenart authored
      This patch introduces MACsec ops for drivers to support offloading
      MACsec operations.
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0830e20b
    • Antoine Tenart's avatar
      net: macsec: introduce the macsec_context structure · 76564261
      Antoine Tenart authored
      This patch introduces the macsec_context structure. It will be used
      in the kernel to exchange information between the common MACsec
      implementation (macsec.c) and the MACsec hardware offloading
      implementations. This structure contains pointers to MACsec specific
      structures which contain the actual MACsec configuration, and to the
      underlying device (phydev for now).
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      76564261
    • Antoine Tenart's avatar
      net: macsec: move some definitions in a dedicated header · c0e4eadf
      Antoine Tenart authored
      This patch moves some structure, type and identifier definitions into a
      MACsec specific header. This patch does not modify how the MACsec code
      runs and only moves things around. This is a preparation for the
      future MACsec hardware offloading support, which will re-use those
      definitions outside macsec.c.
      Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c0e4eadf
    • David S. Miller's avatar
      Merge branch 'netns-Optimise-netns-ID-lookups' · 169af346
      David S. Miller authored
      Guillaume Nault says:
      
      ====================
      netns: Optimise netns ID lookups
      
      Netns ID lookups can be easily protected by RCU, rather than by holding
      a spinlock.
      
      Patch 1 prepares the code, patch 2 does the RCU conversion, and finally
      patch 3 stops disabling BHs on updates (patch 2 makes that unnecessary).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      169af346
    • Guillaume Nault's avatar
      netns: don't disable BHs when locking "nsid_lock" · 8d7e5dee
      Guillaume Nault authored
      When peernet2id() had to lock "nsid_lock" before iterating through the
      nsid table, we had to disable BHs, because VXLAN can call peernet2id()
      from the xmit path:
        vxlan_xmit() -> vxlan_fdb_miss() -> vxlan_fdb_notify()
          -> __vxlan_fdb_notify() -> vxlan_fdb_info() -> peernet2id().
      
      Now that peernet2id() uses RCU protection, "nsid_lock" isn't used in BH
      context anymore. Therefore, we can safely use plain
      spin_lock()/spin_unlock() and let BHs run when holding "nsid_lock".
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8d7e5dee
    • Guillaume Nault's avatar
      netns: protect netns ID lookups with RCU · 2dce224f
      Guillaume Nault authored
      __peernet2id() can be protected by RCU as it only calls idr_for_each(),
      which is RCU-safe, and never modifies the nsid table.
      
      rtnl_net_dumpid() can also do lockless lookups. It does two nested
      idr_for_each() calls on nsid tables (one direct call and one indirect
      call because of rtnl_net_dumpid_one() calling __peernet2id()). The
      netnsid tables are never updated. Therefore it is safe to not take the
      nsid_lock and run within an RCU-critical section instead.
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2dce224f
    • Guillaume Nault's avatar
      netns: Remove __peernet2id_alloc() · 49052941
      Guillaume Nault authored
      __peernet2id_alloc() was used for both plain lookups and for netns ID
      allocations (depending on the value of '*alloc'). Let's separate lookups
      from allocations instead. That is, integrate the lookup code into
      __peernet2id() and make peernet2id_alloc() responsible for allocating
      new netns IDs when necessary.
      
      This makes it clear that __peernet2id() doesn't modify the idr and
      prepares the code for lockless lookups.
      
      Also, mark the 'net' argument of __peernet2id() as 'const', since we're
      modifying this line.
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      49052941
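The separation of lookups from allocations described in this commit message can be pictured with a small userspace sketch. It uses a fixed array instead of an idr and no locking, and `sketch_table`, `sketch_lookup()` and `sketch_lookup_alloc()` are hypothetical names chosen for the illustration, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_IDS 8
#define SKETCH_ID_UNASSIGNED (-1)

/* Hypothetical ID table: the slot index is the ID, NULL marks a free slot. */
struct sketch_table {
	const void *peers[SKETCH_MAX_IDS];
};

/* Pure lookup: takes a const table, so it visibly never modifies it.
 * This is the property that makes a lockless read side possible. */
static int sketch_lookup(const struct sketch_table *t, const void *peer)
{
	if (peer == NULL)
		return SKETCH_ID_UNASSIGNED;

	for (int id = 0; id < SKETCH_MAX_IDS; id++)
		if (t->peers[id] == peer)
			return id;

	return SKETCH_ID_UNASSIGNED;
}

/* Lookup first, allocate a new ID only when none exists yet.
 * All table modifications are confined to this function. */
static int sketch_lookup_alloc(struct sketch_table *t, const void *peer)
{
	int id;

	assert(t != NULL);

	if (peer == NULL)
		return SKETCH_ID_UNASSIGNED;

	id = sketch_lookup(t, peer);
	if (id != SKETCH_ID_UNASSIGNED)
		return id;

	for (id = 0; id < SKETCH_MAX_IDS; id++) {
		if (t->peers[id] == NULL) {
			t->peers[id] = peer;
			return id;
		}
	}

	return SKETCH_ID_UNASSIGNED; /* table full */
}
```

With the write path isolated this way, the const qualifier on the lookup argument documents the same guarantee that the commit message states for __peernet2id().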