    xfs: update metadata LSN in buffers during log recovery · 60a4a222
    Brian Foster authored
    Log recovery is currently broken for v5 superblocks in that it never
    updates the metadata LSN of buffers written out during recovery. The
    metadata LSN is recorded in various bits of metadata to provide recovery
    ordering criteria that prevent transient corruption states reported by
    buffer write verifiers. Without such ordering logic, buffer updates can
    be replayed out of order and lead to false positive transient corruption
    states. This is generally not a corruption vector on its own, but
    corruption detection shuts down the filesystem and ultimately prevents a
    mount if it occurs during log recovery. This requires an xfs_repair run
    that clears the log and potentially loses filesystem updates.
    
    This problem is avoided in most cases as metadata writes during normal
    filesystem operation update the metadata LSN appropriately. The problem
    with log recovery not updating metadata LSNs manifests if the system
    happens to crash shortly after log recovery itself. In this scenario, it
    is possible for log recovery to complete all metadata I/O such that the
    filesystem is consistent. If a crash occurs after that point but before
    the log tail is pushed forward by subsequent operations, however, the
    next mount performs the same log recovery over again. If a buffer is
    updated multiple times in the dirty range of the log, an earlier update
    might no longer be valid against the current state of the buffer, which
    already reflects every update replayed before the previous crash. If a
    verifier happens to detect such a problem, the filesystem reports
    corruption and immediately shuts down.
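
    The scenario above can be sketched with a small userspace model. This is
    not kernel code: struct model_buf, struct model_update, and the model_*
    functions are hypothetical names invented for illustration, and the
    "replay only if the update LSN is newer than the buffer LSN" rule is a
    simplified assumption about the ordering check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a metadata buffer; not a kernel structure. */
struct model_buf {
	uint64_t meta_lsn;	/* LSN stamped in the metadata */
	int      nentries;	/* stand-in for verifiable contents */
};

/* Hypothetical model of one logged update to that buffer. */
struct model_update {
	uint64_t lsn;		/* LSN of the transaction that logged it */
	int      nentries;	/* contents this update writes */
};

/* Assumed ordering rule: replay only updates newer than the buffer. */
static bool model_should_replay(const struct model_buf *bp,
				const struct model_update *up)
{
	return up->lsn > bp->meta_lsn;
}

/*
 * Run one recovery pass over the dirty range of the log and return the
 * number of updates replayed. stamp_lsn selects whether recovery updates
 * the metadata LSN of the buffer as it replays.
 */
static int model_recover(struct model_buf *bp,
			 const struct model_update *ups, int n,
			 bool stamp_lsn)
{
	int replayed = 0;

	for (int i = 0; i < n; i++) {
		if (!model_should_replay(bp, &ups[i]))
			continue;
		bp->nentries = ups[i].nentries;
		if (stamp_lsn)
			bp->meta_lsn = ups[i].lsn;
		replayed++;
	}
	return replayed;
}
```

    With stamp_lsn false, a second pass over the same updates replays all of
    them again, including the earlier update on top of the later state. With
    stamp_lsn true, the second pass replays nothing.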
    
    This commonly manifests in practice as directory block verifier failures
    such as the following, likely due to directory verifiers being
    particularly detailed in their checks as compared to most others:
    
      ...
      Mounting V5 Filesystem
      XFS (dm-0): Starting recovery (logdev: internal)
      XFS (dm-0): Internal error XFS_WANT_CORRUPTED_RETURN at line ... of \
        file fs/xfs/libxfs/xfs_dir2_data.c.  Caller xfs_dir3_data_verify ...
      ...
    
    Update log recovery to update the metadata LSN of recovered buffers.
    Since metadata LSNs are already updated by write verifier functions via
    attached log items, attach a dummy log item to the buffer during
    validation and explicitly set its LSN to that of the current transaction.
    This ensures that the metadata LSN of a buffer is updated based on whether
    the recovery I/O actually completes, and if so, that subsequent recovery
    attempts identify that the buffer is already up to date with respect to
    the current transaction.
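
    The dummy log item approach can be illustrated with a minimal userspace
    sketch. It is not the code in this patch: struct model_item, struct
    model_buf, and the model_* functions are hypothetical, and splitting the
    LSN into an in-memory and an on-disk copy is an assumption made only to
    show that the stamp reaches disk when the write completes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a log item carrying an LSN. */
struct model_item {
	uint64_t lsn;
};

/* Hypothetical buffer: in-memory LSN, on-disk LSN, optional item. */
struct model_buf {
	uint64_t mem_lsn;
	uint64_t disk_lsn;
	struct model_item *item;
};

/* Write verifier stand-in: stamp the LSN from the attached item, if any. */
static void model_write_verify(struct model_buf *bp)
{
	if (bp->item != NULL)
		bp->mem_lsn = bp->item->lsn;
}

/* Recovery-time validation: attach a dummy item set to the current LSN. */
static void model_recover_validate(struct model_buf *bp,
				   struct model_item *dummy,
				   uint64_t current_lsn)
{
	dummy->lsn = current_lsn;
	bp->item = dummy;
}

/* Submit the write; the on-disk LSN changes only if the I/O completes. */
static bool model_submit_write(struct model_buf *bp, bool io_completes)
{
	model_write_verify(bp);
	if (io_completes)
		bp->disk_lsn = bp->mem_lsn;
	bp->item = NULL;	/* detach the dummy item either way */
	return io_completes;
}
```

    In this model, a write that fails leaves disk_lsn unchanged, so a later
    recovery attempt still replays the update. A write that completes leaves
    disk_lsn equal to the LSN of the current transaction.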
    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>
xfs_log_recover.c 151 KB