    virtio_net: fix race in RX VQ processing · cbdadbbf
    Michael S. Tsirkin authored
    virtio_net called virtqueue_enable_cb on the RX path after napi_complete, so
    with NAPI_STATE_SCHED clear - outside the implicit napi lock.
    This violates the requirement to synchronize virtqueue_enable_cb wrt
    virtqueue_add_buf.  In particular, the used event can move backwards,
    causing us to lose interrupts.
    In a debug build, this can trigger a panic within START_USE.
    
    Jason Wang reports that he can trigger the races artificially,
    by adding udelay() in virtqueue_enable_cb() after virtio_mb().
    
    However, we must call napi_complete to clear NAPI_STATE_SCHED before
    polling the virtqueue for used buffers, otherwise napi_schedule_prep in
    a callback will fail, causing us to lose RX events.
    
    To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
    set (under napi lock), later call virtqueue_poll with
    NAPI_STATE_SCHED clear (outside the lock).
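
    The ordering above can be sketched with a small user-space model.
    This is an illustration only, not the driver code: every toy_*
    name and both structs are hypothetical stand-ins, the real virtqueue
    and NAPI state are more involved, and the single-threaded model
    shows only the call order, not the concurrency itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; these are NOT the kernel's structures. */
struct toy_vq {
    unsigned last_used_idx; /* next used entry the driver will consume */
    unsigned used_idx;      /* how far the device has advanced */
    unsigned used_event;    /* index at which the driver wants an interrupt */
};

struct toy_napi {
    bool sched;             /* stands in for NAPI_STATE_SCHED */
};

/* Stand-in for virtqueue_enable_cb_prepare: publish the used event and
 * return a snapshot of the driver's position for a later poll. */
static unsigned toy_enable_cb_prepare(struct toy_vq *vq)
{
    vq->used_event = vq->last_used_idx;
    return vq->last_used_idx;
}

/* Stand-in for virtqueue_poll: read-only check for buffers used since
 * the snapshot, so it is safe to run without the napi lock. */
static bool toy_poll(const struct toy_vq *vq, unsigned snapshot)
{
    return snapshot != vq->used_idx;
}

/* Stand-in for napi_complete: clear the scheduled bit. */
static void toy_napi_complete(struct toy_napi *napi)
{
    napi->sched = false;
}

/* Stand-in for napi_schedule_prep: succeeds only if not yet scheduled. */
static bool toy_napi_schedule_prep(struct toy_napi *napi)
{
    if (napi->sched)
        return false;
    napi->sched = true;
    return true;
}

/* Tail of an RX poll that ran out of packets before its budget.
 * Returns true if new work was found and polling must continue. */
static bool toy_rx_poll_tail(struct toy_napi *napi, struct toy_vq *vq)
{
    unsigned snapshot;

    assert(napi->sched);                  /* still under the napi lock */
    snapshot = toy_enable_cb_prepare(vq); /* writes used_event under lock */
    toy_napi_complete(napi);              /* lock dropped from here on */

    /* Only the read-only poll happens outside the lock. */
    return toy_poll(vq, snapshot) && toy_napi_schedule_prep(napi);
}
```

    In this model the only write to used_event happens while the
    scheduled bit is still set, which is the property the fix relies on.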
    Reported-by: Jason Wang <jasowang@redhat.com>
    Tested-by: Jason Wang <jasowang@redhat.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>