1. 19 Jan, 2023 14 commits
    • Hangbin Liu's avatar
      selftests/net: mv bpf/nat6to4.c to net folder · 3c107f36
      Hangbin Liu authored
      There are some issues with the bpf/nat6to4.c building.
      
      1. It use TEST_CUSTOM_PROGS, which will add the nat6to4.o to
         kselftest-list file and run by common run_tests.
      2. When building the test via `make -C tools/testing/selftests/
         TARGETS="net"`, the nat6to4.o will be build in selftests/net/bpf/
         folder. But in test udpgro_frglist.sh it refers to ../bpf/nat6to4.o.
         The correct path should be ./bpf/nat6to4.o.
      3. If building the test via `make -C tools/testing/selftests/ TARGETS="net"
         install`. The nat6to4.o will be installed to kselftest_install/net/
         folder. Then the udpgro_frglist.sh should refer to ./nat6to4.o.
      
      To fix the confusing test path, let's just move the nat6to4.c to net folder
      and build it as TEST_GEN_FILES.
      
      Fixes: edae34a3 ("selftests net: add UDP GRO fraglist + bpf self-tests")
      Tested-by: default avatarBjörn Töpel <bjorn@kernel.org>
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20230118020927.3971864-1-liuhangbin@gmail.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      3c107f36
    • Jakub Kicinski's avatar
      Merge branch 'enetc-bd-ring-cleanup' · 9ffb07a3
      Jakub Kicinski authored
      Vladimir Oltean says:
      
      ====================
      ENETC BD ring cleanup
      
      The highlights of this patch set are:
      
      - Installing a BPF program and changing PTP RX timestamping settings are
        currently implemented through a port reconfiguration procedure which
        triggers an AN restart on the PHY, and these procedures are not
        generally guaranteed to leave the port in a sane state. Patches 9/12
        and 11/12 address that.
      
      - Attempting to put the port down (or trying to reconfigure it) has the
        driver oppose some resistance if it's bombarded with RX traffic
        (it won't go down). Patch 12/12 addresses that.
      
      The other 9 patches are just cleanup in the BD ring setup/teardown code,
      which gradually led to bringing the driver in a position where resolving
      those 2 issues was possible.
      ====================
      
      Link: https://lore.kernel.org/r/20230117230234.2950873-1-vladimir.oltean@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      9ffb07a3
    • Vladimir Oltean's avatar
      net: enetc: prioritize ability to go down over packet processing · ff58fda0
      Vladimir Oltean authored
      napi_synchronize() from enetc_stop() waits until the softirq has
      finished execution and no longer wants to be rescheduled. However under
      high traffic load, this will never happen, and the interface can never
      be closed.
      
      The problem is the fact that the NAPI poll routine is written to update
      the consumer index which makes the device want to put more buffers in
      the RX ring, which restarts the madness again.
      
      Browsing around, it seems that some drivers like i40e keep a bit
      (__I40E_VSI_DOWN) which they use as communication between the control
      path and the data path. But that isn't my first choice, because
      complications ensue - since the enetc hardirq may trigger while we are
      in a theoretical ENETC_DOWN state, it may happen that enetc_msix() masks
      it, but enetc_poll() never unmasks it. To prevent a stall in that case,
      one would need to schedule all NAPI instances when ENETC_DOWN gets
      cleared, to process what's pending.
      
      I find it more desirable for the control path - enetc_stop() - to just
      quiesce the RX ring and let the softirq finish what remains there,
      without any explicit communication, just by making hardware not provide
      any more packets.
      
      This seems possible with the Enable bit of the RX BD ring (RBaMR[EN]).
      I can't seem to find an exact definition of what this bit does, but when
      the RX ring is disabled, the port seems to no longer update the producer
      index, and not react to software updates of the consumer index.
      
      In fact, the RBaMR[EN] bit is already toggled by the driver, but too
      late for what we want:
      
      enetc_close()
      -> enetc_stop()
         -> napi_synchronize()
      -> enetc_clear_bdrs()
         -> enetc_clear_rxbdr()
      
      The enetc_clear_bdrs() function contains not only logic to disable the
      RX and TX rings, but also logic to wait for the TX ring stop being busy.
      
      We split enetc_clear_bdrs() into enetc_disable_bdrs() and
      enetc_wait_bdrs(). One needs to run before napi_synchronize() and the
      other after (NAPI also processes TX completions, so we maximize our
      chances of not waiting for the ENETC_TBSR_BUSY bit - unless a packet is
      stuck for some reason, ofc).
      
      We also split off enetc_enable_bdrs() from enetc_setup_bdrs(), and call
      this from the mirror position in enetc_start() compared to enetc_stop(),
      i.e. right before netif_tx_start_all_queues().
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ff58fda0
    • Vladimir Oltean's avatar
      net: enetc: set up XDP program under enetc_reconfigure() · c33bfaf9
      Vladimir Oltean authored
      Offloading a BPF program to the RX path of the driver suffers from the
      same problems as the PTP reconfiguration - improper error checking can
      leave the driver in an invalid state, and the link on the PHY is lost.
      
      Reuse the enetc_reconfigure() procedure, but here, we need to run some
      code in the middle of the ring reconfiguration procedure - while the
      interface is still down. Introduce a callback which makes that possible.
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c33bfaf9
    • Vladimir Oltean's avatar
      net: enetc: rename "xdp" and "dev" in enetc_setup_bpf() · 766338c7
      Vladimir Oltean authored
      Follow the convention from this driver, which is to name "struct
      net_device *" as "ndev", and the convention from other drivers, to name
      "struct netdev_bpf *" as "bpf".
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      766338c7
    • Vladimir Oltean's avatar
      net: enetc: implement ring reconfiguration procedure for PTP RX timestamping · 5093406c
      Vladimir Oltean authored
      The crude enetc_stop() -> enetc_open() mechanism suffers from 2
      problems:
      
      1. improper error checking
      2. it involves phylink_stop() -> phylink_start() which loses the link
      
      Right now, the driver is prepared to offer a better alternative: a ring
      reconfiguration procedure which takes the RX BD size (normal or
      extended) as argument. It allocates new resources (failing if that
      fails), stops the traffic, and assigns the new resources to the rings.
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      5093406c
    • Vladimir Oltean's avatar
      net: enetc: move phylink_start/stop out of enetc_start/stop · 598ca0d0
      Vladimir Oltean authored
      We want to introduce a fast interface reconfiguration procedure, which
      involves temporarily stopping the rings.
      
      But we want enetc_start() and enetc_stop() to not restart PHY autoneg,
      because that can take a few seconds until it completes again.
      
      So we need part of enetc_start() and enetc_stop(), but not all of them.
      Move phylink_start() right next to phylink_of_phy_connect(), and
      phylink_stop() right next to phylink_disconnect_phy(), both still in
      ndo_open() and ndo_stop().
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      598ca0d0
    • Vladimir Oltean's avatar
      net: enetc: split ring resource allocation from assignment · f3ce29e1
      Vladimir Oltean authored
      We have a few instances in the enetc driver where the ring resources
      (BD ring iomem, software BD ring, software TSO headers, basically
      everything except RX buffers) need to be reallocated. For example, when
      RX timestamping is enabled, the RX BD format changes to an extended one
      (twice as large).
      
      Currently, this is done using a simplistic enetc_close() -> enetc_open()
      procedure. But this is quite crude, since it also invokes phylink_stop()
      -> phylink_start(), the link is lost, and a few seconds need to pass for
      autoneg to complete again.
      
      In fact it's bad also due to the improper (yolo) error checking. In case
      we fail to allocate new resources, we've already freed the old ones, so
      the interface is more or less stuck.
      
      To avoid that, we need a system where reconfiguration is possible in a
      way in which resources are allocated upfront. This means that there will
      be a higher memory usage temporarily, but the assignment of resources to
      rings can be done when both the old and new resources are still available.
      
      Introduce a struct enetc_bdr_resource which holds the resources for a
      ring, be it RX or TX. This structure duplicates a lot of fields from
      struct enetc_bdr (and access to the same fields in the ring structure
      was left duplicated, to not change cache characteristics in the fast
      path).
      
      When enetc_alloc_tx_resources() runs, it returns an array of resource
      elements (one per TX ring), in addition to the existing priv->tx_res.
      To populate priv->tx_res with that array, one must call
      enetc_assign_tx_resources(), and this also frees the old resources.
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      f3ce29e1
    • Vladimir Oltean's avatar
      net: enetc: bring "bool extended" to top-level in enetc_open() · d075db51
      Vladimir Oltean authored
      Extended RX buffer descriptors are necessary if they carry RX
      timestamps, which will be true when PTP timestamping is enabled.
      
      Right now, the rx_ring->ext_en is set from the function that allocates
      ring resources (enetc_alloc_rx_resources() -> enetc_alloc_rxbdr()), and
      also used later, in enetc_setup_rxbdr(). It is also used in the
      enetc_rxbd() and enetc_rxbd_next() fast path helpers.
      
      We want to decouple resource allocation from BD ring setup, but both
      procedures depend on BD size (extended or not). Move the "extended"
      boolean to enetc_open() and pass it both to the RX allocation procedure
      as well as to the RX ring setup procedure. The latter will set
      rx_ring->ext_en from now on.
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      d075db51
    • Vladimir Oltean's avatar
      net: enetc: drop redundant enetc_free_tx_frame() call from enetc_free_txbdr() · bbd6043f
      Vladimir Oltean authored
      The call path in enetc_close() is:
      
      enetc_close()
      -> enetc_free_rxtx_rings()
         -> enetc_free_tx_ring()
            -> enetc_free_tx_frame()
      -> enetc_free_tx_resources()
         -> enetc_free_txbdr()
            -> enetc_free_tx_frame()
      
      The enetc_free_tx_frame() function is written such that the second call
      exits without doing anything, but nonetheless, it is completely
      redundant. Delete it. This makes the TX teardown path more similar to
      the RX one, where rx_swbd freeing is done in enetc_free_rx_ring(), not
      in enetc_free_rxbdr().
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      bbd6043f
    • Vladimir Oltean's avatar
      net: enetc: rx_swbd and tx_swbd are never NULL in enetc_free_rxtx_rings() · 2c338710
      Vladimir Oltean authored
      The call path in enetc_close() is:
      
      enetc_close()
      -> enetc_free_rxtx_rings()
         -> enetc_free_rx_ring()
            -> tests whether rx_ring->rx_swbd is NULL
         -> enetc_free_tx_ring()
            -> tests whether tx_ring->tx_swbd is NULL
      -> enetc_free_rx_resources()
         -> enetc_free_rxbdr()
            -> sets rxr->rx_swbd to NULL
      -> enetc_free_tx_resources()
         -> enetc_free_txbdr()
            -> setx txr->tx_swbd to NULL
      
      From the above, it is clear that due to the function ordering, the
      checks for NULL are redundant, since the software buffer descriptor
      arrays have not yet been set to NULL. Drop these checks.
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2c338710
    • Vladimir Oltean's avatar
      net: enetc: create enetc_dma_free_bdr() · 0d6cfd0f
      Vladimir Oltean authored
      This is a refactoring change which introduces the opposite function of
      enetc_dma_alloc_bdr().
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      0d6cfd0f
    • Vladimir Oltean's avatar
      net: enetc: set up RX ring indices from enetc_setup_rxbdr() · fbf1cff9
      Vladimir Oltean authored
      There is only one place which needs to set up indices in the RX ring.
      Be consistent with what was done in the TX path and do this in
      enetc_setup_rxbdr().
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      fbf1cff9
    • Vladimir Oltean's avatar
      net: enetc: set next_to_clean/next_to_use just from enetc_setup_txbdr() · 1cbf19c5
      Vladimir Oltean authored
      enetc_alloc_txbdr() deals with allocating resources necessary for a TX
      ring to work (the array of software BDs and the array of TSO headers).
      
      The next_to_clean and next_to_use pointers are overwritten with proper
      values which are read from hardware here:
      
      enetc_open
      -> enetc_alloc_tx_resources
         -> enetc_alloc_txbdr
            -> set to zero
      -> enetc_setup_bdrs
         -> enetc_setup_txbdr
            -> read from hardware
      
      So their initialization with zeroes is pointless and confusing.
      Delete it.
      
      Consequently, since enetc_setup_txbdr() has no opposite cleanup
      function, also delete the resetting of these indices from
      enetc_free_tx_ring().
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      1cbf19c5
  2. 18 Jan, 2023 26 commits