1. 18 Mar, 2021 40 commits
    • net: dsa: mv88e6xxx: Offload bridge broadcast flooding flag · 8d1d8298
      Tobias Waldekranz authored
      These switches have two modes of classifying broadcast:
      
      1. Broadcast is multicast.
      2. Broadcast is its own unique thing that is always flooded
         everywhere.
      
      This driver uses the first option, making sure to load the broadcast
      address into all active databases. Because of this, we can support
      per-port broadcast flooding by (1) making sure to only set the subset
      of ports that have it enabled whenever joining a new bridge or VLAN,
      and (2) by updating all active databases whenever the setting is
      changed on a port.
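
Step (1) above amounts to computing a port vector from per-port flags before loading the broadcast address into a database. A minimal sketch of that idea, assuming a hypothetical `struct port_state` and helper name (neither is taken from the driver):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port state: bridge membership and the flooding flag. */
struct port_state {
	bool in_bridge;
	bool bcast_flood;
};

/*
 * Build the destination port vector for the broadcast address entry:
 * one bit per port that is a bridge member and has flooding enabled.
 */
static uint16_t bcast_port_vector(const struct port_state *ports, int n)
{
	uint16_t vec = 0;

	for (int i = 0; i < n; i++)
		if (ports[i].in_bridge && ports[i].bcast_flood)
			vec |= (uint16_t)(1u << i);
	return vec;
}
```

Step (2) would then recompute this vector and rewrite the entry in every active database whenever a port's flag changes.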
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: Offload bridge learning flag · 041bd545
      Tobias Waldekranz authored
      Allow a user to control automatic learning per port.
      
      Many chips have an explicit "LearningDisable"-bit that can be used for
      this, but we opt for setting/clearing the PAV instead, as it works on
      all devices at least as far back as 6083.
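
Setting or clearing the PAV (port association vector) can be pictured as follows. This is only a sketch: it assumes that a port learns when its own bit is set in its PAV and does not learn when the PAV is zero, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return the PAV value for a port: its own bit when learning, else 0. */
static uint16_t pav_for_port(int port, bool learning)
{
	return learning ? (uint16_t)(1u << port) : 0;
}
```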
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: Flood all traffic classes on standalone ports · 7b9f16fe
      Tobias Waldekranz authored
      In accordance with the comment in dsa_port_bridge_leave, standalone
      ports shall be configured to flood all types of traffic. This change
      aligns the mv88e6xxx driver with that policy.
      
      Previously a standalone port would initially not egress any unknown
      traffic, but after joining and then leaving a bridge, it would.
      
      This does not matter that much since we only ever send FROM_CPUs on
      standalone ports, but it seems prudent to make sure that the initial
      values match those that are applied after a bridging/unbridging cycle.
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: Use standard helper for broadcast address · 0806dd46
      Tobias Waldekranz authored
      Use the conventional declaration style of a MAC address in the
      kernel (u8 addr[ETH_ALEN]) for the broadcast address, then set it
      using the existing helper.
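
For illustration, the declaration style and helper can be mimicked in userspace. The commit message does not name the helper; the sketch assumes it is the kernel's `eth_broadcast_addr()`, reimplemented here as a stand-in:

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6 /* octets in an Ethernet MAC address */

/* Userspace stand-in for the kernel helper: fill addr with ff:ff:ff:ff:ff:ff. */
static void eth_broadcast_addr(uint8_t *addr)
{
	memset(addr, 0xff, ETH_ALEN);
}
```

A caller would declare `uint8_t addr[ETH_ALEN];` and pass it to the helper instead of spelling out the six `0xff` bytes by hand.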
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: Remove some bureaucracy around querying the VTU · 34065c58
      Tobias Waldekranz authored
      The hardware has a somewhat quirky protocol for reading out the VTU
      entry for a particular VID. But there is no reason why we cannot
      create a better API for ourselves in the driver.
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: Provide generic VTU iterator · d89ef4b8
      Tobias Waldekranz authored
      Move the intricacies of correctly iterating over the VTU to a common
      implementation.
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: Avoid useless attempts to fast-age LAGs · ffcec3f2
      Tobias Waldekranz authored
      When a port is a part of a LAG, the ATU will create dynamic entries
      belonging to the LAG ID when learning is enabled. So trying to
      fast-age those out using the constituent port will have no
      effect. Unfortunately the hardware does not support move operations on
      LAGs so there is no obvious way to transform the request to target the
      LAG instead.
      
      Instead we document this known limitation and at least avoid wasting
      any time on it.
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: Add helper to resolve bridge port from DSA port · cc76ce9e
      Tobias Waldekranz authored
      In order for a driver to be able to query a bridge for information
      about itself, e.g. reading out port flags, it has to use a netdev that
      is known to the bridge. In the simple case, that is just the netdev
      representing the port, e.g. swp0 or swp1 in this example:
      
         br0
         / \
      swp0 swp1
      
      But in the case of an offloaded lag, this will be the bond or team
      interface, e.g. bond0 in this example:
      
           br0
           /
        bond0
         / \
      swp0 swp1
      
      Add a helper that hides some of this complexity from the
      drivers. Then, redefine dsa_port_offloads_bridge_port using the helper
      to avoid double accounting of the set of possible offloaded uppers.
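
The two topologies above suggest the shape of such a helper. The following is a simplified sketch, not the DSA core code: the struct layouts and the name `to_bridge_port` are hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct net_device {
	const char *name;
};

struct dsa_port_sketch {
	struct net_device *bridge_dev; /* NULL when the port is not bridged */
	struct net_device *lag_dev;    /* NULL when the port is not in a LAG */
	struct net_device *port_dev;   /* the port's own netdev, e.g. swp0 */
};

/* Return the netdev the bridge knows this port by, or NULL if unbridged. */
static struct net_device *to_bridge_port(const struct dsa_port_sketch *dp)
{
	if (!dp->bridge_dev)
		return NULL;
	return dp->lag_dev ? dp->lag_dev : dp->port_dev;
}
```

With this in place, a check such as "does this port offload bridge port X" reduces to comparing X with the helper's return value.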
      Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ipa-32bit' · 44b958a6
      David S. Miller authored
      Alex Elder says:
      
      ====================
      net: ipa: support 32-bit targets
      
      There is currently a configuration dependency that restricts IPA to
      be supported only on 64-bit machines.  There are only a few things
      that really require that, and those are fixed in this series.  The
      last patch in the series removes the CONFIG_64BIT build dependency
      for IPA.
      
      Version 2 of this series uses upper_32_bits() rather than creating
      a new function to extract bits out of a DMA address.  Version 3
      uses lower_32_bits() as well.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: relax 64-bit build requirement · 99e75a37
      Alex Elder authored
      We currently assume the IPA driver is built only for a 64-bit kernel.
      
      When this constraint was put in place it eliminated some do_div()
      calls, replacing them with the "/" and "%" operators.  We now only
      use these operations on u32 and size_t objects.  In a 32-bit kernel
      build, size_t will be 32 bits wide, so there remains no reason to
      use do_div() for divide and modulo.
      
      A few recent commits also fix some code that assumes that DMA
      addresses are 64 bits wide.
      
      With that, we can get rid of the 64-bit build requirement.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: fix table alignment requirement · e5d4e96b
      Alex Elder authored
      We currently have a build-time check to ensure that the minimum DMA
      allocation alignment satisfies the constraint that IPA filter and
      route tables must point to rules that are 128-byte aligned.
      
      But what's really important is that the actual allocated DMA memory
      has that alignment, even if the minimum is smaller than that.
      
      Remove the BUILD_BUG_ON() call checking against minimum DMA alignment
      and instead verify at runtime that the allocated memory is properly
      aligned.
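
A runtime check of this kind can be as small as a mask test on the returned address. A sketch, assuming a hypothetical constant and function name (the 128-byte figure is the one stated above):

```c
#include <stdbool.h>
#include <stdint.h>

#define TABLE_ALIGN 128u /* rule alignment stated in the commit message */

/* True when addr is a multiple of TABLE_ALIGN (which is a power of two). */
static bool table_addr_aligned(uint64_t addr)
{
	return (addr & (TABLE_ALIGN - 1)) == 0;
}
```

Unlike a build-time check on the allocator's minimum alignment, this tests the address that was actually handed back.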
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: use upper_32_bits() · 3c54b7be
      Alex Elder authored
      Use upper_32_bits() to extract the high-order 32 bits of a DMA
      address.  This avoids doing a 32-position shift on a DMA address
      if it happens not to be 64 bits wide.  Use lower_32_bits() to
      extract the low-order 32 bits (because that's what it's for).
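
The reason a plain `addr >> 32` is a problem is that shifting a 32-bit value by 32 positions is undefined in C. Userspace equivalents of the two macros, written as a sketch that shifts twice by 16 to stay well defined for either width:

```c
#include <stdint.h>

/* Userspace equivalents of the kernel macros (sketch, not the kernel header). */
#define upper_32_bits(n) ((uint32_t)(((n) >> 16) >> 16))
#define lower_32_bits(n) ((uint32_t)((n) & 0xffffffff))
```

For a 32-bit address type `upper_32_bits()` simply evaluates to 0, so the same calling code works whether DMA addresses are 32 or 64 bits wide.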
      Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Alex Elder <elder@linaro.org>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: fix assumptions about DMA address size · d2fd2311
      Alex Elder authored
      Some build time checks in ipa_table_validate_build() assume that a
      DMA address is 64 bits wide.  That is more restrictive than it has
      to be.  A route or filter table is 64 bits wide no matter what the
      size of a DMA address is on the AP.  The code actually uses a
      pointer to __le64 to access table entries, and a fixed constant
      IPA_TABLE_ENTRY_SIZE to describe the size of those entries.
      
      Loosen up two checks so they still verify some requirements, but
      such that they do not assume the size of a DMA address is 64 bits.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 's390-qeth-next' · 5108802a
      David S. Miller authored
      Julian Wiedmann says:
      
      ====================
      s390/qeth: updates 2021-03-18
      
      please apply the following patch series for qeth to netdev's net-next
      tree.
      
      This brings two small optimizations (replace a hard-coded GFP_ATOMIC,
      pass through the NAPI budget to enable napi_consume_skb()), and removes
      some redundant VLAN filter code.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: remove RX VLAN filter stubs in L3 driver · d96a8c69
      Julian Wiedmann authored
      The callbacks have been slimmed down to a level where they no longer do
      any actual work. So stop pretending that we support the
      NETIF_F_HW_VLAN_CTAG_FILTER feature.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: enable napi_consume_skb() for pending TX buffers · ad4bbd72
      Julian Wiedmann authored
      Pending TX buffers are completed from the same NAPI code as normal
      TX buffers. Pass the NAPI budget to qeth_tx_complete_buf() so that
      the freeing of the completed skbs can be deferred.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/qeth: allocate initial TX Buffer structs with GFP_KERNEL · e47ded97
      Julian Wiedmann authored
      qeth_init_qdio_out_buf() is typically called during initialization, and
      the GFP_ATOMIC is only needed for a very specific & rare case during TX
      completion.
      
      Allow callers to specify a gfp mask, so that the initialization path can
      select GFP_KERNEL. While at it also clarify the function name.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-xps-improve-the-xps-maps-handling' · c2ed62b9
      David S. Miller authored
      Antoine Tenart says:
      
      ====================
      net: xps: improve the xps maps handling
      
      This series aims at fixing various issues with the xps code, including
      out-of-bound accesses and use-after-free. While doing so we try to
      improve the xps code maintainability and readability.
      
      The main change is moving dev->num_tc and dev->nr_ids into the xps maps, to
      avoid out-of-bound accesses as those two fields can be updated after the
      maps have been allocated. This allows further reworks, to improve the
      xps code readability and to stop taking the rtnl lock when
      reading the maps in sysfs. The maps are moved to an array in net_device,
      which simplifies the code a lot.
      
      One future improvement may be to remove the use of xps_map_mutex from
      net/core/dev.c, but that may require extra care.
      
      Thanks!
      Antoine
      
      Since v3:
        - Removed the 3 patches about the rtnl lock and __netif_set_xps_queue
          as there are extra issues. Those patches were not tied to the
          others, and I'll see what can be done as a separate effort.
        - One small fix in patch 12.
      
      Since v2:
        - Patches 13-16 are new to the series.
        - Fixed another issue I found while preparing v3 (use after free of
          old xps maps).
        - Kept the rtnl lock when calling netdev_get_tx_queue and
          netdev_txq_to_tc.
        - Use get_device/put_device when using the sb_dev.
        - Take the rtnl lock in mlx5 and virtio_net when calling
          netif_set_xps_queue.
        - Fixed a coding style issue.
      
      Since v1:
        - Reordered the patches to improve readability and avoid introducing
          issues in between patches.
        - Use dev_maps->nr_ids to allocate the mask in xps_queue_show but
          still default to nr_cpu_ids/dev->num_rx_queues in xps_queue_show
          when dev_maps hasn't been allocated yet for backward
          compatibility.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: NULL the old xps map entries when freeing them · 75b2758a
      Antoine Tenart authored
      In __netif_set_xps_queue, old map entries from the old dev_maps are
      freed but their corresponding entries in the old dev_maps aren't NULLed.
      Fix this.
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fix use after free in xps · 2d05bf01
      Antoine Tenart authored
      When setting up a new dev_maps in __netif_set_xps_queue, we remove and
      free maps from unused CPUs/rx-queues near the end of the function, by
      calling remove_xps_queue. However it's possible those maps are also part
      of the old not-freed-yet dev_maps, which might be used concurrently.
      When that happens, a map can be freed while its corresponding entry in
      the old dev_maps table isn't NULLed, leading to: "BUG: KASAN:
      use-after-free" in different places.
      
      This fixes the map freeing logic for unused CPUs/rx-queues, to also NULL
      the map entries from the old dev_maps table.
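
The pattern being fixed is "free a shared object but leave a stale pointer in a second table". A userspace sketch of the corrected logic, with hypothetical types and names; the kernel code relies on RCU for deferred freeing, which this sketch replaces with a plain `free()`:

```c
#include <stdlib.h>

struct xps_map_sketch {
	int len;
};

/* Hypothetical table of map pointers, standing in for dev_maps. */
struct maps_table {
	int nr_ids;
	struct xps_map_sketch *maps[4];
};

/*
 * Free the map at idx and clear every table slot that referenced it, so a
 * user of the old table cannot follow a dangling pointer.
 */
static void free_map_everywhere(struct maps_table *new_t,
				struct maps_table *old_t, int idx)
{
	struct xps_map_sketch *map = new_t->maps[idx];

	if (old_t && idx < old_t->nr_ids && old_t->maps[idx] == map)
		old_t->maps[idx] = NULL;
	new_t->maps[idx] = NULL;
	free(map);
}
```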
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-sysfs: move the xps cpus/rxqs retrieval in a common function · 2db6cdae
      Antoine Tenart authored
      Most of the xps_cpus_show and xps_rxqs_show functions share the same
      logic. Having it in two different functions does not help maintenance.
      This patch moves their common logic into a new function, xps_queue_show,
      to improve this.
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-sysfs: move the rtnl unlock up in the xps show helpers · d7be87a6
      Antoine Tenart authored
      Now that nr_ids and num_tc are stored in the xps dev_maps, which are RCU
      protected, we no longer need to protect the maps with the rtnl lock.
      Move the rtnl unlock up so we reduce the rtnl locking section.
      
      We also increase the reference count on the subordinate device if any,
      as we don't want this device to be freed while we use it (now that the
      rtnl lock isn't protecting it in the whole function).
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: improve queue removal readability in __netif_set_xps_queue · 132f743b
      Antoine Tenart authored
      Improve the readability of the loop removing tx-queue from unused
      CPUs/rx-queues in __netif_set_xps_queue. The change should only be
      cosmetic.
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add an helper to copy xps maps to the new dev_maps · 402fbb99
      Antoine Tenart authored
      This patch adds a helper, xps_copy_dev_maps, to copy maps from dev_maps
      to new_dev_maps at a given index. The logic should be the same, with
      improved code readability and maintainability.
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: move the xps maps to an array · 044ab86d
      Antoine Tenart authored
      Move the xps maps (xps_cpus_map and xps_rxqs_map) to an array in
      net_device. That simplifies the code a lot, removing the need for lots
      of if/else conditionals as the correct map will be available using its
      offset in the array.
      
      This should not modify the xps maps behaviour in any way.
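
The effect of the array can be shown with a small sketch. The enum and struct names below are illustrative assumptions, not copied from the kernel headers:

```c
#include <stddef.h>

/* Assumed index names for the two map kinds. */
enum xps_map_type { XPS_CPUS = 0, XPS_RXQS, XPS_MAPS_MAX };

struct xps_dev_maps_sketch {
	unsigned int nr_ids;
};

/* Hypothetical slice of net_device holding both maps in one array. */
struct net_device_sketch {
	struct xps_dev_maps_sketch *xps_maps[XPS_MAPS_MAX];
};

/* One lookup replaces an if/else choosing between two separate fields. */
static struct xps_dev_maps_sketch *
get_xps_maps(const struct net_device_sketch *dev, enum xps_map_type type)
{
	return type < XPS_MAPS_MAX ? dev->xps_maps[type] : NULL;
}
```

Code that previously branched on "is this the rxqs map?" can pass the type through as an index instead.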
      Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: remove the xps possible_mask · 6f36158e
      Antoine Tenart authored
      Remove the xps possible_mask. It was an optimization but we can just
      loop from 0 to nr_ids now that it is embedded in the xps dev_maps. That
      simplifies the code a bit.
      Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: embed nr_ids in the xps maps · 5478fcd0
      Antoine Tenart authored
      Embed nr_ids (the number of CPUs for the xps cpus map, and the number of
      rxqs for the xps rxqs map) in dev_maps. That helps avoid out-of-bounds
      memory accesses if those values change after dev_maps was allocated.
      Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: embed num_tc in the xps maps · 255c04a8
      Antoine Tenart authored
      The xps cpus/rxqs map is accessed using dev->num_tc, which is used when
      allocating the map. But later updates of dev->num_tc can lead to having
      a mismatch between the maps and how they're accessed. In such cases the
      map values do not make any sense and out of bound accesses can occur
      (that can be easily seen using KASAN).
      
      This patch aims at fixing this by embedding num_tc into the maps, using
      the value at the time the map is created. This brings two improvements:
      - The maps can be accessed using the embedded num_tc, so we know for
        sure we won't have out of bound accesses.
      - Checks can be made before accessing the maps so we know the values
        retrieved will make sense.
      
      We also update __netif_set_xps_queue to conditionally copy old maps from
      dev_maps in the new one only if the number of traffic classes from both
      maps match.
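
The idea of embedding the allocation-time sizes can be sketched as follows. The struct layout, the flat `id * num_tc + tc` indexing, and the function name are assumptions made for the example:

```c
#include <stddef.h>

/* Hypothetical map table that records the sizes it was allocated with. */
struct dev_maps_sketch {
	unsigned int nr_ids; /* CPUs or rx-queues at allocation time */
	int num_tc;          /* traffic classes at allocation time */
	const int *attr_map; /* nr_ids * num_tc entries */
};

/* Look up an entry using the embedded sizes; NULL if out of range. */
static const int *maps_lookup(const struct dev_maps_sketch *m,
			      unsigned int id, int tc)
{
	if (id >= m->nr_ids || tc < 0 || tc >= m->num_tc)
		return NULL;
	return &m->attr_map[(size_t)id * m->num_tc + tc];
}
```

Because the bounds travel with the table, a later change to the device-level traffic class count cannot push an access past the end of an older allocation.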
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-sysfs: make xps_cpus_show and xps_rxqs_show consistent · 73f5e52b
      Antoine Tenart authored
      Make the implementations of xps_cpus_show and xps_rxqs_show converge,
      as the two share the same logic but diverged over time. This should not
      modify their behaviour but will help future changes and improve
      maintenance.
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-sysfs: store the return of get_netdev_queue_index in an unsigned int · d9a063d2
      Antoine Tenart authored
      In net-sysfs, get_netdev_queue_index returns an unsigned int. Some of
      its callers use an unsigned long to store the returned value. Update the
      code to be consistent; this should only be cosmetic.
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-sysfs: convert xps_cpus_show to bitmap_zalloc · ea4fe7e8
      Antoine Tenart authored
      Use bitmap_zalloc instead of zalloc_cpumask_var in xps_cpus_show to
      align with xps_rxqs_show. This will improve maintenance and allow us to
      factorize the two functions. The function should behave the same.
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: bcm_sf2: fix BCM4908 RGMII reg(s) · 6859d915
      Rafał Miłecki authored
      BCM4908 has only 1 RGMII reg for controlling port 7.
      
      Fixes: 73b7a604 ("net: dsa: bcm_sf2: support BCM4908's integrated switch")
      Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: bcm_sf2: add function finding RGMII register · 55cfeb39
      Rafał Miłecki authored
      A simple macro like REG_RGMII_CNTRL_P() is insufficient as:
      1. It doesn't validate the port argument
      2. It doesn't support chipsets with a non-linear RGMII register layout
      
      Missing port validation could result in reading a register offset from
      outside the array. Random memory -> random offset -> random reads/writes.
      It affected e.g. BCM4908 for REG_RGMII_CNTRL_P(7).
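
A lookup function addresses both points by validating the port and by not assuming consecutive registers. A sketch with made-up offsets and names (the real layouts are chipset specific and are not given here):

```c
#include <stddef.h>

struct rgmii_reg {
	int port;
	int offset;
};

/* Hypothetical non-linear layout: only ports 0, 1 and 7 have a register. */
static const struct rgmii_reg rgmii_regs[] = {
	{ 0, 0x40 }, { 1, 0x44 }, { 7, 0x48 },
};

/* Return the register offset for a port, or -1 when the port has none. */
static int rgmii_reg_offset(int port)
{
	for (size_t i = 0; i < sizeof(rgmii_regs) / sizeof(rgmii_regs[0]); i++)
		if (rgmii_regs[i].port == port)
			return rgmii_regs[i].offset;
	return -1;
}
```

Callers are expected to check for the negative return value before touching the hardware, which a bare indexing macro cannot enforce.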
      
      Fixes: a78e86ed ("net: dsa: bcm_sf2: Prepare for different register layouts")
      Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: mmap: Add device tree support · a5538a77
      Álvaro Fernández Rojas authored
      Add device tree support to b53_mmap.c while keeping platform devices support.
      Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'stmmac-EST-interrupts-and-ethtool' · 7b78702e
      David S. Miller authored
      Mohammad Athari Bin Ismail says:
      
      ====================
      net: stmmac: EST interrupts and ethtool
      
      This patchset adds support for handling EST interrupts and reporting EST
      errors. Additionally, the errors are added to the ethtool statistics.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Add EST errors into ethtool statistic · 9f298959
      Ong Boon Leong authored
      The following EST errors are added to the ethtool statistics:
      1) Constant Gate Control Error (CGCE):
         The counter "mtl_est_cgce" increases every time a CGCE interrupt is
         triggered.
      
      2) Head-of-Line Blocking due to Scheduling (HLBS):
         The counter "mtl_est_hlbs" increases every time an HLBS interrupt is
         triggered.
      
      3) Head-of-Line Blocking due to Frame Size (HLBF):
         The counter "mtl_est_hlbf" increases every time an HLBF interrupt is
         triggered.
      
      4) Base Time Register error (BTRE):
         The counter "mtl_est_btre" increases every time a BTRE interrupt is
         triggered but BTRL has not reached its maximum value of 15.
      
      5) Base Time Register Error Loop Count (BTRL) reaches maximum value:
         The counter "mtl_est_btrlm" increases every time a BTRE interrupt is
         triggered and BTRL has reached its maximum value of 15.
      
      Please refer to the MTL_EST_STATUS register in the DesignWare Cores Ethernet
      Quality-of-Service Databook for a more detailed explanation.
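
The counting rules above, in particular the split between "mtl_est_btre" and "mtl_est_btrlm", can be sketched as a small status decoder. The bit positions and the BTRL field location below are placeholders chosen for the example; the real layout is defined by the MTL_EST_STATUS register description in the databook.

```c
#include <stdint.h>

/* Placeholder bit positions, NOT the real MTL_EST_STATUS layout. */
#define EST_CGCE       (1u << 0)
#define EST_HLBS       (1u << 1)
#define EST_HLBF       (1u << 2)
#define EST_BTRE       (1u << 3)
#define EST_BTRL_SHIFT 8
#define EST_BTRL_MASK  (0xfu << EST_BTRL_SHIFT)
#define EST_BTRL_MAX   15u

struct est_stats {
	unsigned long mtl_est_cgce;
	unsigned long mtl_est_hlbs;
	unsigned long mtl_est_hlbf;
	unsigned long mtl_est_btre;
	unsigned long mtl_est_btrlm;
};

/* Bump one counter per error flag found in a status snapshot. */
static void est_count_errors(uint32_t status, struct est_stats *s)
{
	if (status & EST_CGCE)
		s->mtl_est_cgce++;
	if (status & EST_HLBS)
		s->mtl_est_hlbs++;
	if (status & EST_HLBF)
		s->mtl_est_hlbf++;
	if (status & EST_BTRE) {
		uint32_t btrl = (status & EST_BTRL_MASK) >> EST_BTRL_SHIFT;

		/* BTRE counts separately once the loop count hits 15. */
		if (btrl == EST_BTRL_MAX)
			s->mtl_est_btrlm++;
		else
			s->mtl_est_btre++;
	}
}
```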
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
      Co-developed-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
      Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: EST interrupts handling and error reporting · e49aa315
      Voon Weifeng authored
      Enable the following EST-related interrupts:
      1) Constant Gate Control Error (CGCE)
      2) Head-of-Line Blocking due to Scheduling (HLBS)
      3) Head-of-Line Blocking due to Frame Size (HLBF)
      4) Base Time Register error (BTRE)
      5) Switch to S/W owned list Complete (SWLC)
      
      For HLBS, the user gets the info of all the queues that show this
      error. For HLBF, the user gets the info of all the queues with the
      latest frame size that caused the error. Frame size 0 indicates no
      error.
      
      The ISR handling takes place when the EST feature is enabled by the user.
      Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Co-developed-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
      Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'stmmac-vlan-priority-rx-steering' · 09bef832
      David S. Miller authored
      Ong Boon Leong says:
      
      ====================
      stmmac: add VLAN priority based RX steering
      
      The current tc flower implementation in stmmac supports both L3 and L4
      filter offloading. This patch adds support for VLAN priority based
      RX frame steering into different RX queues.
      
      The patches have been tested with both a configuration test (including
      L3/L4) and a traffic test (multiple VLAN ping streams with RX frame
      steering), shown below:
      
      > tc qdisc delete dev eth0 ingress
      
      > tc qdisc del dev eth0 parent root 2> /dev/null
      > tc qdisc del dev eth0 parent ffff: 2> /dev/null
      
      > tc qdisc add dev eth0 ingress
      
      > tc filter add dev eth0 parent ffff: protocol ip flower dst_ip 192.168.0.1 \
        src_ip 192.168.1.1 ip_proto tcp dst_port 5201 src_port 6201 action drop
      
      > tc filter add dev eth0 parent ffff: protocol ip flower dst_ip 192.168.0.2 \
        src_ip 192.168.1.2 ip_proto tcp dst_port 5202 src_port 6202 action drop
      
      > tc filter show dev eth0 ingress
      filter parent ffff: protocol ip pref 49151 flower chain 0
      filter parent ffff: protocol ip pref 49151 flower chain 0 handle 0x1
        eth_type ipv4
        ip_proto tcp
        dst_ip 192.168.0.2
        src_ip 192.168.1.2
        dst_port 5202
        src_port 6202
        in_hw in_hw_count 1
              action order 1: gact action drop
               random type none pass val 0
               index 2 ref 1 bind 1
      
      filter parent ffff: protocol ip pref 49152 flower chain 0
      filter parent ffff: protocol ip pref 49152 flower chain 0 handle 0x1
        eth_type ipv4
        ip_proto tcp
        dst_ip 192.168.0.1
        src_ip 192.168.1.1
        dst_port 5201
        src_port 6201
        in_hw in_hw_count 1
              action order 1: gact action drop
               random type none pass val 0
               index 1 ref 1 bind 1
      
      > tc qdisc delete dev eth0 ingress
      
      > tc qdisc del dev eth0 parent root 2> /dev/null
      > tc qdisc del dev eth0 parent ffff: 2> /dev/null
      
      > tc qdisc add dev eth0 ingress
      
      > tc qdisc add dev eth0 root mqprio num_tc 4 \
        map 0 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 \
        queues 1@0 1@1 1@2 1@3 hw 0
      
      > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 0 hw_tc 3
      
      > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 1 hw_tc 2
      
      > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 2 hw_tc 1
      
      > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 3 hw_tc 0
      
      > tc filter show dev eth0 ingress
      filter parent ffff: protocol 802.1Q pref 49149 flower chain 0
      filter parent ffff: protocol 802.1Q pref 49149 flower chain 0 handle 0x1 hw_tc 0
        vlan_prio 3
        in_hw in_hw_count 1
      filter parent ffff: protocol 802.1Q pref 49150 flower chain 0
      filter parent ffff: protocol 802.1Q pref 49150 flower chain 0 handle 0x1 hw_tc 1
        vlan_prio 2
        in_hw in_hw_count 1
      filter parent ffff: protocol 802.1Q pref 49151 flower chain 0
      filter parent ffff: protocol 802.1Q pref 49151 flower chain 0 handle 0x1 hw_tc 2
        vlan_prio 1
        in_hw in_hw_count 1
      filter parent ffff: protocol 802.1Q pref 49152 flower chain 0
      filter parent ffff: protocol 802.1Q pref 49152 flower chain 0 handle 0x1 hw_tc 3
        vlan_prio 0
        in_hw in_hw_count 1
      
      > tc qdisc delete dev eth0 ingress
      
      > ip address flush dev eth0
      > ip address add 169.254.1.11/24 dev eth0
      
      > ip link delete dev eth0.vlan1 2> /dev/null
      > ip link add link eth0 name eth0.vlan1 type vlan id 1
      > ip address flush dev eth0.vlan1 2> /dev/null
      > ip address add 169.254.11.11/24 dev eth0.vlan1
      
      > ip link delete dev eth0.vlan2 2> /dev/null
      > ip link add link eth0 name eth0.vlan2 type vlan id 2
      > ip address flush dev eth0.vlan2 2> /dev/null
      > ip address add 169.254.12.11/24 dev eth0.vlan2
      
      > ip link delete dev eth0.vlan3 2> /dev/null
      > ip link add link eth0 name eth0.vlan3 type vlan id 3
      > ip address flush dev eth0.vlan3 2> /dev/null
      > ip address add 169.254.13.11/24 dev eth0.vlan3
      
      > ip link delete dev eth0.vlan4 2> /dev/null
      > ip link add link eth0 name eth0.vlan4 type vlan id 4
      > ip address flush dev eth0.vlan4 2> /dev/null
      > ip address add 169.254.14.11/24 dev eth0.vlan4
      
      > ip address flush dev eth0
      > ip address add 169.254.1.22/24 dev eth0
      
      > ip link delete dev eth0.vlan1 2> /dev/null
      > ip link add link eth0 name eth0.vlan1 type vlan id 1
      > ip address flush dev eth0.vlan1 2> /dev/null
      > ip address add 169.254.11.22/24 dev eth0.vlan1
      
      > ip link delete dev eth0.vlan2 2> /dev/null
      > ip link add link eth0 name eth0.vlan2 type vlan id 2
      > ip address flush dev eth0.vlan2 2> /dev/null
      > ip address add 169.254.12.22/24 dev eth0.vlan2
      
      > ip link delete dev eth0.vlan3 2> /dev/null
      > ip link add link eth0 name eth0.vlan3 type vlan id 3
      > ip address flush dev eth0.vlan3 2> /dev/null
      > ip address add 169.254.13.22/24 dev eth0.vlan3
      
      > ip link delete dev eth0.vlan4 2> /dev/null
      > ip link add link eth0 name eth0.vlan4 type vlan id 4
      > ip address flush dev eth0.vlan4 2> /dev/null
      > ip address add 169.254.14.22/24 dev eth0.vlan4
      
      > mkdir -p /sys/fs/cgroup/net_prio/grp0
      > echo eth0 0 > /sys/fs/cgroup/net_prio/grp0/net_prio.ifpriomap
      > echo eth0.vlan1 0 > /sys/fs/cgroup/net_prio/grp0/net_prio.ifpriomap
      > mkdir -p /sys/fs/cgroup/net_prio/grp1
      > echo eth0 0 > /sys/fs/cgroup/net_prio/grp1/net_prio.ifpriomap
      > echo eth0.vlan2 1 > /sys/fs/cgroup/net_prio/grp1/net_prio.ifpriomap
      > mkdir -p /sys/fs/cgroup/net_prio/grp2
      > echo eth0 0 > /sys/fs/cgroup/net_prio/grp2/net_prio.ifpriomap
      > echo eth0.vlan3 2 > /sys/fs/cgroup/net_prio/grp2/net_prio.ifpriomap
      > mkdir -p /sys/fs/cgroup/net_prio/grp3
      > echo eth0 0 > /sys/fs/cgroup/net_prio/grp3/net_prio.ifpriomap
      > echo eth0.vlan4 3 > /sys/fs/cgroup/net_prio/grp3/net_prio.ifpriomap
      
      > tc qdisc del dev eth0 parent root > /dev/null 2>&1
      > tc qdisc del dev eth0 parent ffff: > /dev/null 2>&1
      
      > tc qdisc add dev eth0 ingress
      > tc qdisc add dev eth0 root mqprio num_tc 4 map 0 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 1@1 1@2 1@3 hw 0
      
      > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 0 hw_tc 0
      
      > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 1 hw_tc 1
      
      > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 2 hw_tc 2
      
      > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 3 hw_tc 3
      
      > ip link set eth0.vlan1 type vlan egress-qos-map 0:0
      > ip link set eth0.vlan2 type vlan egress-qos-map 1:1
      > ip link set eth0.vlan3 type vlan egress-qos-map 2:2
      > ip link set eth0.vlan4 type vlan egress-qos-map 3:3
      
      > tc filter show dev eth0 ingress
      filter parent ffff: protocol 802.1Q pref 49149 flower chain 0
      filter parent ffff: protocol 802.1Q pref 49149 flower chain 0 handle 0x1 hw_tc 3
        vlan_prio 3
        in_hw in_hw_count 1
      filter parent ffff: protocol 802.1Q pref 49150 flower chain 0
      filter parent ffff: protocol 802.1Q pref 49150 flower chain 0 handle 0x1 hw_tc 2
        vlan_prio 2
        in_hw in_hw_count 1
      filter parent ffff: protocol 802.1Q pref 49151 flower chain 0
      filter parent ffff: protocol 802.1Q pref 49151 flower chain 0 handle 0x1 hw_tc 1
        vlan_prio 1
        in_hw in_hw_count 1
      filter parent ffff: protocol 802.1Q pref 49152 flower chain 0
      filter parent ffff: protocol 802.1Q pref 49152 flower chain 0 handle 0x1 hw_tc 0
        vlan_prio 0
        in_hw in_hw_count 1
      
      > echo 1 > /proc/irq/131/smp_affinity
      > echo 1 > /proc/irq/132/smp_affinity
      
      > echo 4 > /proc/irq/133/smp_affinity
      > echo 4 > /proc/irq/134/smp_affinity
      
      > echo 4 > /proc/irq/135/smp_affinity
      > echo 4 > /proc/irq/136/smp_affinity
      
      > echo 2 > /proc/irq/137/smp_affinity
      > echo 2 > /proc/irq/138/smp_affinity
      
      > ping -i 0.001 169.254.11.22 > /dev/null 2>&1 &
      > PID1="$!"
      > echo $PID1 > /sys/fs/cgroup/net_prio/grp0/cgroup.procs
      
      > ping -i 0.001 169.254.12.22 > /dev/null 2>&1 &
      > PID2="$!"
      > echo $PID2 > /sys/fs/cgroup/net_prio/grp1/cgroup.procs
      
      > ping -i 0.001 169.254.13.22 > /dev/null 2>&1 &
      > PID3="$!"
      > echo $PID3 > /sys/fs/cgroup/net_prio/grp2/cgroup.procs
      
      > ping -i 0.001 169.254.14.22 > /dev/null 2>&1 &
      > PID4="$!"
      > echo $PID4 > /sys/fs/cgroup/net_prio/grp3/cgroup.procs
      
      > ping -i 0.001 169.254.11.11 > /dev/null 2>&1 &
      > PID1="$!"
      > echo $PID1 > /sys/fs/cgroup/net_prio/grp0/cgroup.procs
      
      > ping -i 0.001 169.254.12.11 > /dev/null 2>&1 &
      > PID2="$!"
      > echo $PID2 > /sys/fs/cgroup/net_prio/grp1/cgroup.procs
      
      > ping -i 0.001 169.254.13.11 > /dev/null 2>&1 &
      > PID3="$!"
      > echo $PID3 > /sys/fs/cgroup/net_prio/grp2/cgroup.procs
      
      > ping -i 0.001 169.254.14.11 > /dev/null 2>&1 &
      > PID4="$!"
      > echo $PID4 > /sys/fs/cgroup/net_prio/grp3/cgroup.procs
      
      > watch -n 0.5 -d "cat /proc/interrupts | grep eth0"
       131:     251918         41          0          0  IR-PCI-MSI 477184-edge      eth0:rx-0
       132:      18969          1          0          0  IR-PCI-MSI 477185-edge      eth0:tx-0
       133:          0          0     295872          0  IR-PCI-MSI 477186-edge      eth0:rx-1
       134:          0          0      16136          0  IR-PCI-MSI 477187-edge      eth0:tx-1
       135:          0          0     288042          0  IR-PCI-MSI 477188-edge      eth0:rx-2
       136:          0          0      16135          0  IR-PCI-MSI 477189-edge      eth0:tx-2
       137:          0     211177          0          0  IR-PCI-MSI 477190-edge      eth0:rx-3
       138:          2      16144          0          0  IR-PCI-MSI 477191-edge      eth0:tx-3
       139:          0          0          0          0  IR-PCI-MSI 477192-edge      eth0:rx-4
       140:          0          0          0          0  IR-PCI-MSI 477193-edge      eth0:tx-4
       141:          0          0          0          0  IR-PCI-MSI 477194-edge      eth0:rx-5
       142:          0          0          0          0  IR-PCI-MSI 477195-edge      eth0:tx-5
       143:          0          0          0          0  IR-PCI-MSI 477196-edge      eth0:rx-6
       144:          0          0          0          0  IR-PCI-MSI 477197-edge      eth0:tx-6
       145:          0          0          0          0  IR-PCI-MSI 477198-edge      eth0:rx-7
       146:          0          0          0          0  IR-PCI-MSI 477199-edge      eth0:tx-7
       157:          0          0          0          0  IR-PCI-MSI 477210-edge      eth0:safety-ue
      
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      09bef832
    • Ong Boon Leong's avatar
      net: stmmac: add RX frame steering based on VLAN priority in tc flower · 0e039f5c
      Ong Boon Leong authored
      We extend tc flower to support hardware-offloaded RX frame steering
      based on VLAN priority.
      
      To map VLAN <PCP> to Traffic Class <TC>:
        $ tc filter add dev <IFNAME> parent ffff: protocol 802.1Q flower \
             vlan_prio <PCP> hw_tc <TC>
      
        Note: <TC> must be less than N, where N is the num_tc value given
        in "tc qdisc ... num_tc N ...".
      
      To delete all tc flower configurations:
        $ tc qdisc delete dev <IFNAME> ingress
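      
      The mapping rule and its constraint can be sketched as a small dry-run
      helper. This is a hypothetical illustration, not part of the patch: the
      function name vlan_prio_to_tc_cmd and its argument order are assumptions.
      It only prints the tc command shown above after checking that <PCP> is in
      the 802.1Q range 0-7 and that <TC> is below num_tc; it does not touch any
      interface.

```shell
#!/bin/sh
# Hypothetical dry-run helper (name and arguments are assumptions, not from
# the patch). Prints the tc command for one PCP -> TC mapping; it runs
# nothing against hardware.
vlan_prio_to_tc_cmd() {
    ifname=$1 pcp=$2 tc=$3 num_tc=$4

    # 802.1Q PCP is a 3-bit field, so valid values are 0..7.
    if [ "$pcp" -lt 0 ] || [ "$pcp" -gt 7 ]; then
        echo "error: PCP $pcp out of range 0-7" >&2
        return 1
    fi

    # The traffic class must be below the num_tc given to the mqprio qdisc.
    if [ "$tc" -lt 0 ] || [ "$tc" -ge "$num_tc" ]; then
        echo "error: TC $tc must be < num_tc $num_tc" >&2
        return 1
    fi

    echo "tc filter add dev $ifname parent ffff: protocol 802.1Q flower vlan_prio $pcp hw_tc $tc"
}
```

      For example, "vlan_prio_to_tc_cmd eth0 3 0 4" prints the filter command
      that maps PCP 3 to TC 0, while a TC of 4 with num_tc 4 is rejected.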
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0e039f5c
    • Ong Boon Leong's avatar
      net: stmmac: restructure tc implementation for RX VLAN Priority steering · bd0f670e
      Ong Boon Leong authored
      The current tc_add_flow() and tc_del_flow() offload flows using the
      hardware L3 & L4 filters. The number of L3/L4 filters is read from the
      L3L4FNUM field of the MAC_HW_Feature1 register and is used to allocate
      priv->tc_entries[].
      
      For offloading RX frame steering based on VLAN priority, we use the
      MAC_RXQ_CTRL2 & MAC_RXQ_CTRL3 registers, so every VLAN priority level
      can be configured independently of the L3 & L4 filters.
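      
      As a rough illustration of why these registers are independent of the
      L3/L4 filter table, the sketch below computes which register and bit a
      given (RX queue, VLAN priority) pair would select. The layout is an
      assumption not stated in this message: each RX queue is taken to own an
      8-bit field with one bit per priority, with queues 0-3 in MAC_RXQ_CTRL2
      and queues 4-7 in MAC_RXQ_CTRL3. The function name rxq_prio_field is
      hypothetical, and nothing here reads or writes hardware.

```shell
#!/bin/sh
# Assumed layout (not stated in the commit message): each RX queue owns an
# 8-bit field, one bit per VLAN priority; queues 0-3 sit in MAC_RXQ_CTRL2
# and queues 4-7 in MAC_RXQ_CTRL3. Hypothetical helper, for illustration.
rxq_prio_field() {
    queue=$1 pcp=$2

    if [ "$queue" -lt 4 ]; then
        reg=MAC_RXQ_CTRL2
        slot=$queue
    else
        reg=MAC_RXQ_CTRL3
        slot=$((queue - 4))
    fi

    # One bit per priority, shifted into the queue's 8-bit field.
    shift_bits=$((slot * 8))
    value=$(( (1 << pcp) << shift_bits ))

    printf '%s 0x%08x\n' "$reg" "$value"
}
```

      Under that assumed layout, "rxq_prio_field 1 2" selects bit 10 of
      MAC_RXQ_CTRL2, with no L3/L4 filter entry involved.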
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bd0f670e