    net: Use cached copy of pfmemalloc to avoid accessing page · 9451980a
    Alexander Duyck authored
    While testing I found that the check for pfmemalloc in build_skb was
    rather expensive.  I found the issue to be two-fold.  First we have to get
    from the virtual address to the head page and that comes at the cost of
    something like 11 cycles.  Then there is the cost for reading pfmemalloc out
    of the head page which can be cache cold due to the fact that
    put_page_testzero is likely invalidating the cache-line on one or more
    CPUs as the fragments can be shared.
    
    To avoid this extra expense I have added a pfmemalloc member to the
    netdev_alloc_cache.  I then pushed pieces of __alloc_rx_skb into
    __napi_alloc_skb and __netdev_alloc_skb so that I could rewrite them to
    make use of the cached pfmemalloc value.  The result is that my perf traces
    show a reduction from 9.28% overhead to 3.7% for the code covered by
    build_skb, __alloc_rx_skb, and __napi_alloc_skb when performing a test with
    the packet being dropped instead of being handed to napi_gro_receive.
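
    As a rough illustration of the idea only (a userspace model, not the
    code in skbuff.c): every model_* name below is hypothetical, and the
    refill policy is an assumption made for the example.  The point it shows
    is that the flag is copied into the cache once at refill time, so the
    per-fragment path never has to look up or read the page itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page; the flag lives next to the data. */
struct model_page {
	bool pfmemalloc;		/* set when the page came from reserves */
	unsigned char data[4096];
};

/* Hypothetical fragment cache, loosely modelled on netdev_alloc_cache. */
struct model_alloc_cache {
	struct model_page *page;
	size_t offset;			/* bytes still available in the page */
	bool pfmemalloc;		/* cached copy, written once per refill */
};

/* Refill the cache and snapshot the flag while the page is still hot. */
static int model_refill(struct model_alloc_cache *nc, bool from_reserves)
{
	struct model_page *page = malloc(sizeof(*page));

	if (!page)
		return -1;

	page->pfmemalloc = from_reserves;
	nc->page = page;
	nc->offset = sizeof(page->data);
	nc->pfmemalloc = page->pfmemalloc;	/* the only read of the page flag */
	return 0;
}

/*
 * Hand out a fragment and report pfmemalloc from the cached copy.
 * Returns NULL when the cache is empty or cannot satisfy the request.
 */
static void *model_alloc_frag(struct model_alloc_cache *nc, size_t size,
			      bool *pfmemalloc)
{
	if (!nc->page || size > nc->offset)
		return NULL;

	nc->offset -= size;
	*pfmemalloc = nc->pfmemalloc;	/* no access to nc->page->pfmemalloc */
	return nc->page->data + nc->offset;
}
```

    In this model the caller would propagate the returned flag to the
    buffer it builds, which is the role the cached value plays in
    __napi_alloc_skb and __netdev_alloc_skb as described above.
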
    Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>