• xdp: change ndo_xdp_xmit API to support bulking · 735fc405
    Jesper Dangaard Brouer authored
    This patch changes the API for ndo_xdp_xmit to support bulking
    of xdp_frames.
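
A minimal userspace sketch of the idea, not the kernel code: a per-frame hook costs one indirect call per frame, while a bulk hook costs one indirect call per batch. All names here (toy_dev, xmit_one, xmit_bulk, send_per_frame, send_bulk) are hypothetical, and the return convention (number of frames accepted) is an assumption of the sketch. The real prototype is the one defined by the patch itself.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's xdp_frame, reduced to a stub. */
struct xdp_frame {
    void *data;
    unsigned int len;
};

struct toy_dev;

/* Per-frame hook: the caller pays one indirect call for every frame. */
typedef int (*xmit_one_fn)(struct toy_dev *dev, struct xdp_frame *frame);

/* Bulk hook: the caller pays one indirect call for n frames.
 * Assumed convention for this sketch: returns the number of frames accepted. */
typedef int (*xmit_bulk_fn)(struct toy_dev *dev, int n,
                            struct xdp_frame **frames);

/* Hypothetical device with both hook styles and two counters. */
struct toy_dev {
    xmit_one_fn xmit_one;
    xmit_bulk_fn xmit_bulk;
    unsigned long indirect_calls; /* how often a hook was entered */
    unsigned long frames_sent;    /* how many frames were accepted */
};

/* Toy driver, per-frame variant: accepts exactly one frame per call. */
static int toy_xmit_one(struct toy_dev *dev, struct xdp_frame *frame)
{
    (void)frame; /* a real driver would DMA-map and queue the frame here */
    dev->indirect_calls++;
    dev->frames_sent++;
    return 1;
}

/* Toy driver, bulk variant: entered once, then loops over the batch. */
static int toy_xmit_bulk(struct toy_dev *dev, int n, struct xdp_frame **frames)
{
    int i;

    (void)frames; /* a real driver would DMA-map and queue each frame */
    dev->indirect_calls++;
    for (i = 0; i < n; i++)
        dev->frames_sent++;
    return n;
}

/* Caller side, old style: n indirect calls for n frames. */
static int send_per_frame(struct toy_dev *dev, struct xdp_frame **frames, int n)
{
    int i, sent = 0;

    for (i = 0; i < n; i++)
        sent += dev->xmit_one(dev, frames[i]);
    return sent;
}

/* Caller side, bulk style: a single indirect call for n frames. */
static int send_bulk(struct toy_dev *dev, struct xdp_frame **frames, int n)
{
    return dev->xmit_bulk(dev, n, frames);
}
```

For a batch of 16 frames, send_per_frame() enters the hook 16 times and send_bulk() enters it once. Under CONFIG_RETPOLINE each of those entries is the expensive part.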
    
    When the kernel is compiled with CONFIG_RETPOLINE, XDP sees a huge
    slowdown. Most of the slowdown is caused by DMA API indirect function
    calls, but the net_device->ndo_xdp_xmit() call also contributes.
    
    Benchmarking this patch with CONFIG_RETPOLINE, using xdp_redirect_map
    in a single flow/core test (CPU E5-1650 v4 @ 3.60GHz), showed
    improved performance:
     for driver ixgbe: 6,042,682 pps -> 6,853,768 pps = +811,086 pps
     for driver i40e : 6,187,169 pps -> 6,724,519 pps = +537,350 pps
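
To put the numbers above in perspective, the pps figures can be converted into a relative gain and a per-packet time saving. The helpers below are a small sketch with hypothetical names. Fed with the figures above, they give roughly +13.4% for ixgbe and +8.7% for i40e, or about 19.6 ns and 12.9 ns saved per packet.

```c
/* Relative throughput gain in percent, from packets per second
 * measured before and after the change. */
static double pps_gain_percent(double pps_before, double pps_after)
{
    return (pps_after - pps_before) / pps_before * 100.0;
}

/* Average time spent per packet, in nanoseconds, at a given rate. */
static double ns_per_packet(double pps)
{
    return 1e9 / pps;
}

/* Nanoseconds saved per packet when going from one rate to another. */
static double ns_saved_per_packet(double pps_before, double pps_after)
{
    return ns_per_packet(pps_before) - ns_per_packet(pps_after);
}
```

For example, pps_gain_percent(6042682, 6853768) covers the ixgbe case and ns_saved_per_packet(6187169, 6724519) covers the i40e case.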
    
    With frames available as a bulk inside the driver's ndo_xdp_xmit call,
    further optimizations become possible, such as bulk DMA-mapping for TX.
    
    Testing without CONFIG_RETPOLINE shows the same performance for
    physical NIC drivers.
    
    The virtual NIC driver tun sees a huge performance boost, as it can
    avoid per-frame producer locking and instead amortize the locking
    cost over the bulk.
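
The locking amortization can be sketched in userspace as follows. This is not the tun driver's code: toy_ring, its C11 atomic_flag spinlock and the function names are hypothetical stand-ins for a producer-locked ring. The point is only that the bulk producer takes the lock once per batch instead of once per frame.

```c
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 64

/* Hypothetical producer-locked ring, with a counter for lock acquisitions. */
struct toy_ring {
    atomic_flag lock;
    unsigned long lock_acquisitions;
    size_t count;
    void *slots[RING_SIZE];
};

static void toy_ring_init(struct toy_ring *r)
{
    atomic_flag_clear(&r->lock);
    r->lock_acquisitions = 0;
    r->count = 0;
}

/* Spin until the producer lock is taken; count each acquisition. */
static void toy_ring_lock(struct toy_ring *r)
{
    while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
        ; /* busy-wait */
    r->lock_acquisitions++;
}

static void toy_ring_unlock(struct toy_ring *r)
{
    atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Enqueue one frame; caller must hold the lock. Returns 1 on success,
 * 0 if the ring is full. */
static int toy_ring_push_locked(struct toy_ring *r, void *frame)
{
    if (r->count == RING_SIZE)
        return 0;
    r->slots[r->count++] = frame;
    return 1;
}

/* Per-frame producer: one lock/unlock pair for every frame. */
static int produce_per_frame(struct toy_ring *r, void **frames, int n)
{
    int i, sent = 0;

    for (i = 0; i < n; i++) {
        toy_ring_lock(r);
        sent += toy_ring_push_locked(r, frames[i]);
        toy_ring_unlock(r);
    }
    return sent;
}

/* Bulk producer: one lock/unlock pair for the whole batch. */
static int produce_bulk(struct toy_ring *r, void **frames, int n)
{
    int i, sent = 0;

    toy_ring_lock(r);
    for (i = 0; i < n; i++) {
        if (!toy_ring_push_locked(r, frames[i]))
            break; /* ring full: stop, remaining frames are not sent */
        sent++;
    }
    toy_ring_unlock(r);
    return sent;
}
```

With a batch of 16 frames, produce_per_frame() acquires the lock 16 times while produce_bulk() acquires it once. Both return the number of frames actually enqueued.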
    
    V2: Fix compile errors reported by kbuild test robot <lkp@intel.com>
    V4: Isolated ndo, driver changes and callers.
    Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
i40e_txrx.h 18.3 KB