    nvmet-rdma: fix possible bogus dereference under heavy load · 8407879c
    Sagi Grimberg authored
    Currently we always repost the recv buffer before we send a response
    capsule back to the host. Since ordering is not guaranteed for send
    and recv completions, it is possible that we will receive a new request
    from the host before we get a send completion for the response capsule.
    
    Today, we pre-allocate twice as many rsps as the queue length, but in
    reality, under heavy load nothing prevents the gap from growing until
    we exhaust all our rsps.
    
    To fix this, if we don't have any pre-allocated rsps left, we dynamically
    allocate a rsp and make sure to free it when we are done. If under memory
    pressure we fail to allocate a rsp, we silently drop the command and
    wait for the host to retry.
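
    The fallback described above can be sketched in user-space C. This is
    only an illustration under stated assumptions, not the driver code:
    the names rsp_pool, rsp_get and rsp_put are hypothetical, malloc()
    stands in for the kernel allocator, and there is no locking, which a
    real driver would need around the free list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a response; not the driver's actual structure. */
struct rsp {
    bool allocated;     /* true if it came from malloc(), not the pool */
    struct rsp *next;   /* free-list link, used only while in the pool */
};

struct rsp_pool {
    struct rsp *storage;    /* backing array of pre-allocated rsps */
    struct rsp *free_list;  /* pre-allocated rsps that are currently unused */
};

/* Pre-allocate `count` rsps (count > 0) and put them all on the free list. */
static int rsp_pool_init(struct rsp_pool *pool, size_t count)
{
    pool->storage = calloc(count, sizeof(*pool->storage));
    if (!pool->storage)
        return -1;
    pool->free_list = NULL;
    for (size_t i = 0; i < count; i++) {
        pool->storage[i].next = pool->free_list;
        pool->free_list = &pool->storage[i];
    }
    return 0;
}

/* Take a pre-allocated rsp if one is left, otherwise allocate one. */
static struct rsp *rsp_get(struct rsp_pool *pool)
{
    struct rsp *rsp = pool->free_list;

    if (rsp) {
        pool->free_list = rsp->next;
        rsp->next = NULL;
        return rsp;
    }

    /* Pool exhausted: fall back to a dynamic allocation. */
    rsp = malloc(sizeof(*rsp));
    if (!rsp)
        return NULL;    /* caller drops the command; the host retries */
    rsp->allocated = true;
    rsp->next = NULL;
    return rsp;
}

/* Release an rsp: free it if it was dynamic, else return it to the pool. */
static void rsp_put(struct rsp_pool *pool, struct rsp *rsp)
{
    assert(rsp != NULL);
    if (rsp->allocated) {
        free(rsp);
        return;
    }
    rsp->next = pool->free_list;
    pool->free_list = rsp;
}

static void rsp_pool_destroy(struct rsp_pool *pool)
{
    free(pool->storage);
    pool->storage = NULL;
    pool->free_list = NULL;
}
```

    In this sketch the allocated flag is what lets rsp_put() tell a
    dynamically allocated rsp from a pooled one, so the release path frees
    only what the fallback path allocated.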
    
    Reported-by: Steve Wise <swise@opengridcomputing.com>
    Tested-by: Steve Wise <swise@opengridcomputing.com>
    Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
    [hch: dropped a superfluous assignment]
    Signed-off-by: Christoph Hellwig <hch@lst.de>
rdma.c 40.5 KB