• Alex Elder's avatar
    rbd: fix safety of rbd_put_client() · d23a4b3f
    Alex Elder authored
    The rbd_client structure uses a kref to arrange for cleaning up and
    freeing an instance when its last reference is dropped.  The cleanup
    routine is rbd_client_release(), and one of the things it does is
    delete the rbd_client from rbd_client_list.  It acquires node_lock
    to do so, but the way it is done is still not safe.
    
    The problem is that when attempting to reuse an existing rbd_client,
    the structure found might already be in the process of getting
    destroyed and cleaned up.
    
    Here's the scenario, with "CLIENT" representing an existing
    rbd_client that's involved in the race:
    
     Thread on CPU A                | Thread on CPU B
     ---------------                | ---------------
     rbd_put_client(CLIENT)         | rbd_get_client()
       kref_put()                   |   (acquires node_lock)
         kref->refcount becomes 0   |   __rbd_client_find() returns CLIENT
         calls rbd_client_release() |   kref_get(&CLIENT->kref);
                                    |   (releases node_lock)
           (acquires node_lock)     |
           deletes CLIENT from list | ...and starts using CLIENT...
           (releases node_lock)     |
           and frees CLIENT         | <-- but CLIENT gets freed here
    
    Fix this by having rbd_put_client() acquire node_lock.  The result
    could still be improved, but at least it avoids this problem.
    Signed-off-by: default avatarAlex Elder <elder@dreamhost.com>
    Signed-off-by: default avatarSage Weil <sage@newdream.net>
    d23a4b3f
rbd.c 53.6 KB