• Mikulas Patocka's avatar
    dm snapshot: rework COW throttling to fix deadlock · b2155578
    Mikulas Patocka authored
    Commit 721b1d98 ("dm snapshot: Fix excessive memory usage and
    workqueue stalls") introduced a semaphore to limit the maximum number of
    in-flight kcopyd (COW) jobs.
    
    The implementation of this throttling mechanism is prone to a deadlock:
    
    1. One or more threads write to the origin device causing COW, which is
       performed by kcopyd.
    
    2. At some point some of these threads might reach the s->cow_count
       semaphore limit and block in down(&s->cow_count), holding a read lock
       on _origins_lock.
    
    3. Someone tries to acquire a write lock on _origins_lock, e.g.,
       snapshot_ctr(), which blocks because the threads at step (2) already
       hold a read lock on it.
    
    4. A COW operation completes and kcopyd runs dm-snapshot's completion
       callback, which ends up calling pending_complete().
       pending_complete() tries to resubmit any deferred origin bios. This
       requires acquiring a read lock on _origins_lock, which blocks.
    
       This happens because the read-write semaphore implementation gives
       priority to writers, meaning that as soon as a writer tries to enter
       the critical section, no readers will be allowed in, until all
       writers have completed their work.
    
       So, pending_complete() waits for the writer at step (3) to acquire
       and release the lock. This writer waits for the readers at step (2)
       to release the read lock and those readers wait for
       pending_complete() (the kcopyd thread) to signal the s->cow_count
       semaphore: DEADLOCK.
    
    The above was thoroughly analyzed and documented by Nikos Tsironis as
    part of his initial proposal for fixing this deadlock, see:
    https://www.redhat.com/archives/dm-devel/2019-October/msg00001.html
    
    Fix this deadlock by reworking COW throttling so that it waits without
    holding any locks. Add a variable 'in_progress' that counts how many
    kcopyd jobs are running. A function wait_for_in_progress() will sleep if
    'in_progress' is over the limit. It drops _origins_lock in order to
    avoid the deadlock.
    Reported-by: default avatarGuruswamy Basavaiah <guru2018@gmail.com>
    Reported-by: default avatarNikos Tsironis <ntsironis@arrikto.com>
    Reviewed-by: default avatarNikos Tsironis <ntsironis@arrikto.com>
    Tested-by: default avatarNikos Tsironis <ntsironis@arrikto.com>
    Fixes: 721b1d98 ("dm snapshot: Fix excessive memory usage and workqueue stalls")
    Cc: stable@vger.kernel.org # v5.0+
    Depends-on: 4a3f111a73a8c ("dm snapshot: introduce account_start_copy() and account_end_copy()")
    Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
    b2155578
dm-snap.c 67.7 KB