    md/raid10: close race that loses writes when replacement completes. · e7c0c3fa
    NeilBrown authored
    When a replacement operation completes there is a small window
    when the original device is marked 'faulty' and the replacement
    still looks like a replacement.  The faulty device should be removed
    and the replacement moved into place very quickly, but it isn't instant.
    
    So the code that writes out to the array must handle the possibility
    that the only working device for some slot is the replacement - but it
    doesn't.  If the primary device is faulty it just gives up.  This
    can lead to corruption.
    
    So make the code more robust: if either the primary or the
    replacement is present and working, write to it.  Only when
    neither is present and working do we give up.
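
    The rule above can be sketched as follows. This is a minimal
    illustration, not the actual raid10.c change: struct dev,
    struct slot, usable() and pick_write_targets() are hypothetical
    names invented for the example and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a member device.
 * This is NOT the kernel's device structure. */
struct dev {
    bool faulty;
};

/* Hypothetical slot: a primary device plus an optional replacement. */
struct slot {
    struct dev *primary;
    struct dev *replacement;
};

/* A device can take a write only if it is present and not faulty. */
static bool usable(const struct dev *d)
{
    return d != NULL && !d->faulty;
}

/*
 * Collect the devices in a slot that should receive a write.
 * Returns the number of targets (0, 1 or 2). A return of 0 means
 * neither device is present and working, which is the only case
 * in which the caller should give up on the slot.
 */
static int pick_write_targets(const struct slot *s, struct dev *targets[2])
{
    int n = 0;

    if (usable(s->primary))
        targets[n++] = s->primary;
    if (usable(s->replacement))
        targets[n++] = s->replacement;
    return n;
}
```

    In the window described earlier (primary faulty, replacement
    working) this sketch still returns one target, the replacement,
    instead of skipping the slot.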
    
    This bug has been present since replacement was introduced in
    3.3, so it is suitable for any -stable kernel since then.
    Reported-by: "George Spelvin" <linux@horizon.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: NeilBrown <neilb@suse.de>
raid10.c 128 KB