Commit 5eb46851 authored by Yasuaki Ishimatsu's avatar Yasuaki Ishimatsu Committed by Jens Axboe

cfq-iosched: fix cfq_cic_link() race confition

cfq_cic_link() has race condition. When some processes which shared ioc
issue I/O to same block device simultaneously, cfq_cic_link() returns -EEXIST
sometimes. The race condition might stop I/O by following steps:

step  1: Process A: Issue an I/O to /dev/sda
step  2: Process A: Get an ioc (iocA here) in get_io_context() which does not
		    linked with a cic for the device
step  3: Process A: Get a new cic for the device (cicA here) in
		    cfq_alloc_io_context()

step  4: Process B: Issue an I/O to /dev/sda
step  5: Process B: Get iocA in get_io_context() since process A and B share the
		    same ioc
step  6: Process B: Get a new cic for the device (cicB here) in
		    cfq_alloc_io_context() since iocA has not been linked with a
		    cic for the device yet

step  7: Process A: Link cicA to iocA in cfq_cic_link()
step  8: Process A: Dispatch I/O to driver and finish it

step  9: Process B: Try to link cicB to iocA in cfq_cic_link()
		    But it fails with showing "cfq: cic link failed!" kernel
		    message, since iocA has already linked with cicA at step 7.
step 10: Process B: Wait for finishig I/O in get_request_wait()
		    The function does not wake up, when there is no I/O to the
		    device.

When cfq_cic_link() returns -EEXIST, it means ioc has already linked with cic.
So when cfq_cic_link() return -EEXIST, retry cfq_cic_lookup().
Signed-off-by: default avatarYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
parent 2984ff38
...@@ -3184,7 +3184,7 @@ static int cfq_cic_link(struct cfq_data *cfqd, struct io_context *ioc, ...@@ -3184,7 +3184,7 @@ static int cfq_cic_link(struct cfq_data *cfqd, struct io_context *ioc,
} }
} }
if (ret) if (ret && ret != -EEXIST)
printk(KERN_ERR "cfq: cic link failed!\n"); printk(KERN_ERR "cfq: cic link failed!\n");
return ret; return ret;
...@@ -3200,6 +3200,7 @@ cfq_get_io_context(struct cfq_data *cfqd, gfp_t gfp_mask) ...@@ -3200,6 +3200,7 @@ cfq_get_io_context(struct cfq_data *cfqd, gfp_t gfp_mask)
{ {
struct io_context *ioc = NULL; struct io_context *ioc = NULL;
struct cfq_io_context *cic; struct cfq_io_context *cic;
int ret;
might_sleep_if(gfp_mask & __GFP_WAIT); might_sleep_if(gfp_mask & __GFP_WAIT);
...@@ -3207,6 +3208,7 @@ cfq_get_io_context(struct cfq_data *cfqd, gfp_t gfp_mask) ...@@ -3207,6 +3208,7 @@ cfq_get_io_context(struct cfq_data *cfqd, gfp_t gfp_mask)
if (!ioc) if (!ioc)
return NULL; return NULL;
retry:
cic = cfq_cic_lookup(cfqd, ioc); cic = cfq_cic_lookup(cfqd, ioc);
if (cic) if (cic)
goto out; goto out;
...@@ -3215,7 +3217,12 @@ cfq_get_io_context(struct cfq_data *cfqd, gfp_t gfp_mask) ...@@ -3215,7 +3217,12 @@ cfq_get_io_context(struct cfq_data *cfqd, gfp_t gfp_mask)
if (cic == NULL) if (cic == NULL)
goto err; goto err;
if (cfq_cic_link(cfqd, ioc, cic, gfp_mask)) ret = cfq_cic_link(cfqd, ioc, cic, gfp_mask);
if (ret == -EEXIST) {
/* someone has linked cic to ioc already */
cfq_cic_free(cic);
goto retry;
} else if (ret)
goto err_free; goto err_free;
out: out:
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment