• Xi Wang's avatar
    net-tun: restructure tun_do_read for better sleep/wakeup efficiency · 9e641bdc
    Xi Wang authored
    tun_do_read always adds current thread to wait queue, even if a packet
    is ready to read. This is inefficient because both sleeper and waker
    want to acquire the wait queue spin lock when packet rate is high.
    
    We restructure the read function and use common kernel networking
    routines to handle receive, sleep and wakeup. With the change
    available packets are checked first before the reading thread is added
    to the wait queue.
    
    Ran performance tests with the following configuration:
    
     - my packet generator -> tap1 -> br0 -> tap0 -> my packet consumer
     - sender pinned to one core and receiver pinned to another core
     - sender send small UDP packets (64 bytes total) as fast as it can
     - sandy bridge cores
     - throughput are receiver side goodput numbers
    
    The results are
    
    baseline: 731k pkts/sec, cpu utilization at 1.50 cpus
     changed: 783k pkts/sec, cpu utilization at 1.53 cpus
    
    The performance difference is largely determined by packet rate and
    inter-cpu communication cost. For example, if the sender and
    receiver are pinned to different cpu sockets, the results are
    
    baseline: 558k pkts/sec, cpu utilization at 1.71 cpus
     changed: 690k pkts/sec, cpu utilization at 1.67 cpus
    Co-authored-by: default avatarEric Dumazet <edumazet@google.com>
    Signed-off-by: default avatarXi Wang <xii@google.com>
    Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    9e641bdc
tun.c 55.5 KB