    md: Ensure resync is reported after it starts · b368856a
    Logan Gunthorpe authored
    The 07layouts test in mdadm fails on some systems. The failure
    presents itself as the backup file not being removed before the next
    layout is grown into:
    
      mdadm: /dev/md0: cannot create backup file /tmp/md-test-backup:
          File exists
    
    This is because the background mdadm process, which is responsible for
    cleaning up this backup file, gets into an infinite loop waiting for
    the reshape to start. mdadm checks the mdstat file to see whether a
    reshape is in progress and, if it is not, waits for an event on the
    file or times out after 5 seconds. On faster machines, the reshape may
    complete before the 5 second timeout expires, and thus the background
    mdadm process loops waiting for a reshape to start that has already
    finished.
    
    mdadm reads the mdstat file first, but mdstat does not report that the
    reshape has begun, even though it has. So the mdstat_wait() call (in
    mdadm), which polls on the mdstat file, does not return until it times
    out.
    
    The reason mdstat reports that the reshape has not started is an issue
    in status_resync(). recovery_active is subtracted from curr_resync, which
    results in a value of zero for the first chunk of reshaped data, and
    the resulting read reports no reshape in progress.
    
    To fix this, if "resync - recovery_active" yields one of the overloaded
    special values (anything below MD_RESYNC_ACTIVE), force the value to be
    MD_RESYNC_ACTIVE so the code reports a resync in progress.
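
The arithmetic described above can be illustrated with a small user-space sketch. This is not the kernel code: the enum values, the helper names (reported_old, reported_fixed, reports_in_progress, check_first_chunk) and the sample sector counts are assumptions made for illustration, and the real status_resync() handles further cases that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed encoding: small values of curr_resync are overloaded as
 * special states, and only values >= MD_RESYNC_ACTIVE mean "running".
 */
enum {
	MD_RESYNC_NONE = 0,
	MD_RESYNC_YIELDED = 1,
	MD_RESYNC_DELAYED = 2,
	MD_RESYNC_ACTIVE = 3,
};

/* Hypothetical model of the old behaviour: subtract unconditionally. */
static inline uint64_t reported_old(uint64_t curr_resync,
				    uint64_t recovery_active)
{
	if (curr_resync < MD_RESYNC_ACTIVE)
		return curr_resync;
	return curr_resync - recovery_active;
}

/*
 * Hypothetical model of the fix: if the subtraction would underflow or
 * land on a special value, report MD_RESYNC_ACTIVE instead.
 */
static inline uint64_t reported_fixed(uint64_t curr_resync,
				      uint64_t recovery_active)
{
	if (curr_resync < MD_RESYNC_ACTIVE)
		return curr_resync;
	if (curr_resync < recovery_active ||
	    curr_resync - recovery_active < MD_RESYNC_ACTIVE)
		return MD_RESYNC_ACTIVE;
	return curr_resync - recovery_active;
}

/* A reader of the status treats values below ACTIVE as "no resync". */
static inline bool reports_in_progress(uint64_t reported)
{
	return reported >= MD_RESYNC_ACTIVE;
}

/* First chunk: everything submitted so far is still in flight. */
static inline void check_first_chunk(void)
{
	uint64_t curr_resync = 128, recovery_active = 128; /* sample values */

	/* Old model: 128 - 128 == 0, which reads as "no reshape". */
	assert(!reports_in_progress(reported_old(curr_resync, recovery_active)));
	/* Fixed model: clamped to MD_RESYNC_ACTIVE, so it reads as running. */
	assert(reports_in_progress(reported_fixed(curr_resync, recovery_active)));
}
```

Calling check_first_chunk() from a test program exercises the case from the commit message: the old model reports zero for the first chunk, while the clamped model reports an active resync.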

    Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Song Liu <song@kernel.org>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>