Commit a035fc3e authored by NeilBrown

md: fix possible deadlock in handling flush requests.

As recorded in
    https://bugzilla.kernel.org/show_bug.cgi?id=24012

it is possible for a flush request through md to hang.  This is due to
an interaction between the recursion avoidance in
generic_make_request, the insistence in md of only having one flush
active at a time, and the possibility of dm (or md) submitting two
flush requests to a device from the one generic_make_request.

If a generic_make_request call into dm causes two flush requests to be
queued (as happens if the dm table has two targets - they get one
each), these two will be queued inside generic_make_request.

Assume they are for the same md device.
The first is processed and causes one or more flush requests to be sent
to lower devices.  These get queued within generic_make_request too.
Then the second flush to the md device gets handled and it blocks
waiting for the first flush to complete.  But the first won't complete
until its lower-device requests complete, and they haven't even been
submitted yet as they are still on the generic_make_request queue.
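
The ordering described above can be sketched as a small user-space model. This is only an illustration, not kernel code: the names `struct model`, `enqueue` and `second_flush_deadlocks` are hypothetical, the FIFO stands in for the list that generic_make_request keeps while it is active, and a lower flush is assumed to complete as soon as it is taken off that list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Request kinds in this toy model (illustrative names only). */
enum req_kind { MD_FLUSH, LOWER_FLUSH };

#define MAX_PENDING 8	/* enough for 2 md flushes plus up to 6 lower flushes */

struct model {
	enum req_kind pending[MAX_PENDING];	/* stand-in for the generic_make_request list */
	size_t head, tail;
	bool md_flush_active;			/* md allows only one flush at a time */
	int lower_outstanding;			/* lower flushes the active md flush waits for */
};

static void enqueue(struct model *m, enum req_kind kind)
{
	assert(m->tail < MAX_PENDING);
	m->pending[m->tail++] = kind;
}

/*
 * Two flushes for the same md device are queued by one submitter.
 * Returns true if the second one would block while the lower flushes
 * it is waiting for are still sitting on the same list behind it.
 */
static bool second_flush_deadlocks(int lower_devices)
{
	struct model m = {0};

	/* dm with two targets: one flush each, both for the same md device. */
	enqueue(&m, MD_FLUSH);
	enqueue(&m, MD_FLUSH);

	while (m.head < m.tail) {
		enum req_kind kind = m.pending[m.head++];

		if (kind == LOWER_FLUSH) {
			/* Submitted at last; modelled as completing at once. */
			if (--m.lower_outstanding == 0)
				m.md_flush_active = false;
			continue;
		}

		if (m.md_flush_active)
			return true;	/* would sleep here, so the list never drains */

		m.md_flush_active = true;
		for (int i = 0; i < lower_devices; i++) {
			/* Called from inside the loop, so these are only queued. */
			enqueue(&m, LOWER_FLUSH);
			m.lower_outstanding++;
		}
		if (m.lower_outstanding == 0)
			m.md_flush_active = false;	/* nothing to wait for */
	}
	return false;
}
```

In this model `second_flush_deadlocks(1)` and `second_flush_deadlocks(2)` report the hang, because the second md flush is reached before any lower flush leaves the list.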

The deadlock can be broken by submitting the requests to lower devices
from a separate thread.  md has a workqueue readily available for
this: md_wq.

So use it to submit these requests.
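
Why a separate context helps can be sketched with another toy user-space model. It is single-threaded and makes a strong simplifying assumption: whenever the submitter has to wait, the worker is allowed to run, which is the property a separate thread provides. All names here (`struct fifo`, `run_worker`, `flushes_completed_with_worker`) are hypothetical and are not part of md.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ITEMS 8

/* Tiny FIFO: one models the submitter's list, one models deferred work. */
struct fifo {
	int items[MAX_ITEMS];
	size_t head, tail;
};

static bool fifo_empty(const struct fifo *q)
{
	return q->head == q->tail;
}

static void fifo_push(struct fifo *q, int v)
{
	assert(q->tail < MAX_ITEMS);
	q->items[q->tail++] = v;
}

static int fifo_pop(struct fifo *q)
{
	assert(!fifo_empty(q));
	return q->items[q->head++];
}

/* Worker context: submits the lower flushes itself, so they never sit
 * on the submitter's list.  Completion is modelled as immediate. */
static void run_worker(struct fifo *worker, bool *active, int *completed)
{
	while (!fifo_empty(worker)) {
		(void)fifo_pop(worker);	/* value = number of lower devices */
		*active = false;
		(*completed)++;
	}
}

/* Returns how many of `md_flushes` flushes to one md device complete. */
static int flushes_completed_with_worker(int md_flushes, int lower_devices)
{
	struct fifo submit_list = {0};
	struct fifo worker = {0};
	bool active = false;
	int completed = 0;

	for (int i = 0; i < md_flushes; i++)
		fifo_push(&submit_list, 0);

	while (!fifo_empty(&submit_list)) {
		(void)fifo_pop(&submit_list);
		if (active)
			run_worker(&worker, &active, &completed); /* waiter lets the worker run */
		active = true;
		fifo_push(&worker, lower_devices);	/* defer instead of submitting here */
	}
	run_worker(&worker, &active, &completed);
	return completed;
}
```

With two md flushes and two lower devices, `flushes_completed_with_worker(2, 2)` returns 2: the second flush still waits for the first, but the first can now finish because its lower flushes do not depend on the submitter's list draining.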

Reported-by: Giacomo Catenazzi <cate@cateee.net>
Tested-by: Giacomo Catenazzi <cate@cateee.net>
Signed-off-by: NeilBrown <neilb@suse.de>
parent a7a07e69
@@ -373,8 +373,9 @@ static void md_end_flush(struct bio *bio, int err)
 
 static void md_submit_flush_data(struct work_struct *ws);
 
-static void submit_flushes(mddev_t *mddev)
+static void submit_flushes(struct work_struct *ws)
 {
+	mddev_t *mddev = container_of(ws, mddev_t, flush_work);
 	mdk_rdev_t *rdev;
 
 	INIT_WORK(&mddev->flush_work, md_submit_flush_data);
@@ -432,7 +433,8 @@ void md_flush_request(mddev_t *mddev, struct bio *bio)
 	mddev->flush_bio = bio;
 	spin_unlock_irq(&mddev->write_lock);
 
-	submit_flushes(mddev);
+	INIT_WORK(&mddev->flush_work, submit_flushes);
+	queue_work(md_wq, &mddev->flush_work);
 }
 EXPORT_SYMBOL(md_flush_request);
 
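
The patch changes submit_flushes to take a work_struct pointer and recover the mddev with container_of. The following user-space sketch shows that pattern in isolation. The macro is a simplified stand-in for the kernel's container_of, and `struct work_item`, `struct device_state`, `init_work` and `run_work` are hypothetical substitutes for struct work_struct, mddev_t, INIT_WORK and the workqueue machinery.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel macro: step back from a member
 * pointer to the start of the structure that embeds it. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct work_struct. */
struct work_item {
	void (*func)(struct work_item *w);
};

/* Hypothetical stand-in for mddev_t: the work item is embedded in it. */
struct device_state {
	int flushes_submitted;
	struct work_item flush_work;
};

/* Handler with the callback signature: it receives only the work item
 * and recovers the enclosing device from it. */
static void submit_flushes_model(struct work_item *w)
{
	struct device_state *dev =
		container_of(w, struct device_state, flush_work);

	dev->flushes_submitted++;
}

/* Minimal analogue of INIT_WORK: bind a handler to the work item. */
static void init_work(struct work_item *w, void (*func)(struct work_item *))
{
	w->func = func;
}

/* Minimal analogue of a worker running the item. */
static void run_work(struct work_item *w)
{
	w->func(w);
}
```

Because the handler gets only the embedded member, no extra argument has to be passed through the queue; the enclosing structure is found by pointer arithmetic alone.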