Btrfs: fix deadlock between alloc_mutex/chunk_mutex
This fixes a deadlock that happens between the alloc_mutex and chunk_mutex.
Process A comes in, decides to do a do_chunk_alloc, which takes the
chunk_mutex, and is holding the alloc_mutex because the only way you get to
do_chunk_alloc is by holding the alloc_mutex. btrfs_alloc_chunk does its thing
and goes to insert a new item, which results in a cow of the block.
We get into del_pending_extents from there, where if we need to be rescheduled
we drop the alloc_mutex and schedule. At this point process B comes in to do
an allocation and gets the alloc_mutex, and because process A did not do the
chunk allocation completely it thinks its a good time to do a chunk allocation
as well, and hangs on the chunk_mutex.
Process A wakes up and tries to take the alloc_mutex and cannot. The way to
fix this is do a mutex_trylock() on chunk_mutex. If we return 0 we didn't get
the lock, and if this is just a "hey it may be a good time to allocate a chunk"
then we just exit. If we are trying to force an allocation then we reschedule
and keep trying to acquire the chunk_mutex. If once we acquire it the space is
already full then we can just exit, otherwise we can continue with the chunk
allocation. Thank you,
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Showing
Please register or sign in to comment