    md raid10: fix NULL dereference in handle_write_completed() · 01a69cab
    Yufen Yu authored
    In the 'recover' case, an r10bio with R10BIO_WriteError and
    R10BIO_IsRecover set is processed by handle_write_completed().
    This function traverses all r10bio->devs[copies].
    If devs[m].repl_bio != NULL, it assumes that
    conf->mirrors[dev].replacement is also not NULL. However, this is
    not always true.
    
    When any rdev of the raid10 array has a replacement, every r10bio
    allocated from conf->r10buf_pool has devs[m].repl_bio != NULL.
    However, in 'recover', r10bio->devs[m].repl_bio is not cleared even
    if the corresponding replacement is NULL, which results in a NULL
    dereference of the replacement.
    
    This bug was introduced when replacement support for raid10 was
    added in Linux 3.3.
    
    As NeilBrown suggested:
    	Elsewhere the determination of "is this device part of the
    	resync/recovery" is made by testing bio->bi_end_io.
    	If this is end_sync_write, then we tried to write here.
    	If it is NULL, then we didn't try to write.
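
    The check described above can be sketched in userspace C. This is
    only a simplified model, not the kernel code: the struct layouts,
    count_replacement_writes() and demo_recover_pass() are hypothetical
    stand-ins, and only the names bi_end_io, repl_bio, replacement and
    end_sync_write come from the text above. A slot counts as written
    only if its repl_bio has bi_end_io set, so a pooled repl_bio that
    was never submitted is skipped before the replacement pointer is
    touched.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumed layouts, not the real md ones). */
struct bio;
typedef void (*bio_end_io_fn)(struct bio *);

struct bio    { bio_end_io_fn bi_end_io; };
struct rdev   { int handled; };
struct mirror { struct rdev *rdev; struct rdev *replacement; };
struct r10dev { struct bio *bio; struct bio *repl_bio; int devnum; };

/* Completion callback; a bio carrying it was actually submitted as a sync write. */
static void end_sync_write(struct bio *b) { (void)b; }

/*
 * Hypothetical helper: walk all copies and handle only those replacement
 * slots that were really written during this recovery pass.
 */
int count_replacement_writes(struct mirror *mirrors, struct r10dev *devs, int copies)
{
    int handled = 0;

    for (int m = 0; m < copies; m++) {
        struct r10dev *d = &devs[m];

        /*
         * repl_bio may be non-NULL simply because the r10bio came from a
         * pool sized for replacements. bi_end_io == NULL means no write
         * was attempted, so the replacement pointer must not be used.
         */
        if (d->repl_bio == NULL || d->repl_bio->bi_end_io == NULL)
            continue;

        struct rdev *repl = mirrors[d->devnum].replacement;
        assert(repl != NULL); /* holds because a write was issued to it */
        repl->handled++;
        handled++;
    }
    return handled;
}

/* Hypothetical scenario: mirror 0 has a replacement, mirror 1 does not. */
int demo_recover_pass(void)
{
    struct rdev spare = { 0 };
    struct mirror mirrors[2] = {
        { .rdev = NULL, .replacement = &spare },
        { .rdev = NULL, .replacement = NULL },
    };
    struct bio issued = { .bi_end_io = end_sync_write };
    struct bio idle   = { .bi_end_io = NULL }; /* allocated, never submitted */
    struct r10dev devs[2] = {
        { .bio = NULL, .repl_bio = &issued, .devnum = 0 },
        { .bio = NULL, .repl_bio = &idle,   .devnum = 1 },
    };

    return count_replacement_writes(mirrors, devs, 2);
}
```

    In this model, testing only repl_bio != NULL would reach the second
    slot and dereference a NULL replacement; testing bi_end_io as well
    skips it.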
    
    Fixes: 9ad1aefc ("md/raid10:  Handle replacement devices during resync.")
    Cc: stable (V3.3+)
    Suggested-by: NeilBrown <neilb@suse.com>
    Signed-off-by: Yufen Yu <yuyufen@huawei.com>
    Signed-off-by: Shaohua Li <sh.li@alibaba-inc.com>
raid10.c 135 KB