• Vladimir Oltean's avatar
    net: enetc: prioritize ability to go down over packet processing · ff58fda0
    Vladimir Oltean authored
    napi_synchronize() from enetc_stop() waits until the softirq has
    finished execution and no longer wants to be rescheduled. However under
    high traffic load, this will never happen, and the interface can never
    be closed.
    
    The problem is the fact that the NAPI poll routine is written to update
    the consumer index which makes the device want to put more buffers in
    the RX ring, which restarts the madness again.
    
    Browsing around, it seems that some drivers like i40e keep a bit
    (__I40E_VSI_DOWN) which they use as communication between the control
    path and the data path. But that isn't my first choice, because
    complications ensue - since the enetc hardirq may trigger while we are
    in a theoretical ENETC_DOWN state, it may happen that enetc_msix() masks
    it, but enetc_poll() never unmasks it. To prevent a stall in that case,
    one would need to schedule all NAPI instances when ENETC_DOWN gets
    cleared, to process what's pending.
    
    I find it more desirable for the control path - enetc_stop() - to just
    quiesce the RX ring and let the softirq finish what remains there,
    without any explicit communication, just by making hardware not provide
    any more packets.
    
    This seems possible with the Enable bit of the RX BD ring (RBaMR[EN]).
    I can't seem to find an exact definition of what this bit does, but when
    the RX ring is disabled, the port seems to no longer update the producer
    index, and not react to software updates of the consumer index.
    
    In fact, the RBaMR[EN] bit is already toggled by the driver, but too
    late for what we want:
    
    enetc_close()
    -> enetc_stop()
       -> napi_synchronize()
    -> enetc_clear_bdrs()
       -> enetc_clear_rxbdr()
    
    The enetc_clear_bdrs() function contains not only logic to disable the
    RX and TX rings, but also logic to wait for the TX ring stop being busy.
    
    We split enetc_clear_bdrs() into enetc_disable_bdrs() and
    enetc_wait_bdrs(). One needs to run before napi_synchronize() and the
    other after (NAPI also processes TX completions, so we maximize our
    chances of not waiting for the ENETC_TBSR_BUSY bit - unless a packet is
    stuck for some reason, ofc).
    
    We also split off enetc_enable_bdrs() from enetc_setup_bdrs(), and call
    this from the mirror position in enetc_start() compared to enetc_stop(),
    i.e. right before netif_tx_start_all_queues().
    Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
    ff58fda0
enetc.c 74.2 KB