• Mohammed Gamal's avatar
    hv_netvsc: Ensure correct teardown message sequence order · a56d99d7
    Mohammed Gamal authored
    Prior to commit 0cf73780 ("hv_netvsc: netvsc_teardown_gpadl() split")
    the call sequence in netvsc_device_remove() was as follows (as
    implemented in netvsc_destroy_buf()):
    1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
    2- Teardown receive buffer GPADL
    3- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
    4- Teardown send buffer GPADL
    5- Close vmbus
    
    This didn't work for WS2016 hosts. Commit 0cf73780
    ("hv_netvsc: netvsc_teardown_gpadl() split") rearranged the
    teardown sequence as follows:
    1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
    2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
    3- Close vmbus
    4- Teardown receive buffer GPADL
    5- Teardown send buffer GPADL
    
    That worked well for WS2016 hosts, but it prevented guests on older hosts from
    shutting down after changing network settings. Commit 0ef58b0a
    ("hv_netvsc: change GPAD teardown order on older versions") ensured the
    following message sequence for older hosts
    1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message
    2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message
    3- Teardown receive buffer GPADL
    4- Teardown send buffer GPADL
    5- Close vmbus
    
    However, with this sequence calling `ip link set eth0 mtu 1000` hangs and the
    process becomes uninterruptible. On futher analysis it turns out that on tearing
    down the receive buffer GPADL the kernel is waiting indefinitely
    in vmbus_teardown_gpadl() for a completion to be signaled.
    
    Here is a snippet of where this occurs:
    int vmbus_teardown_gpadl(struct vmbus_channel *channel, u32 gpadl_handle)
    {
            struct vmbus_channel_gpadl_teardown *msg;
            struct vmbus_channel_msginfo *info;
            unsigned long flags;
            int ret;
    
            info = kmalloc(sizeof(*info) +
                           sizeof(struct vmbus_channel_gpadl_teardown), GFP_KERNEL);
            if (!info)
                    return -ENOMEM;
    
            init_completion(&info->waitevent);
            info->waiting_channel = channel;
    [....]
            ret = vmbus_post_msg(msg, sizeof(struct vmbus_channel_gpadl_teardown),
                                 true);
    
            if (ret)
                    goto post_msg_err;
    
            wait_for_completion(&info->waitevent);
    [....]
    }
    
    The completion is signaled from vmbus_ongpadl_torndown(), which gets called when
    the corresponding message is received from the host, which apparently never happens
    in that case.
    This patch works around the issue by restoring the first mentioned message sequence
    for older hosts
    
    Fixes: 0ef58b0a ("hv_netvsc: change GPAD teardown order on older versions")
    Signed-off-by: default avatarMohammed Gamal <mgamal@redhat.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    a56d99d7
netvsc.c 38.9 KB