    bcachefs: mark active journal devices on journal replicas gc · d14bfd10
    Brian Foster authored
    A simple device evacuate, remove, add test loop with concurrent
    shutdowns occasionally reproduces a problem where the filesystem
    fails to mount. The mount failure occurs because the filesystem was
    uncleanly shut down, yet no member device is marked for journal data
    in the superblock. An fsck detects the problem, restores the mark
    and allows the mount to proceed without further consistency issues.
    
    The journal data marks are missing because the gc mechanism invoked
    via bch2_journal_flush_device_pins() runs while the journal
    happens to be empty. This results in garbage collection of all journal
    replicas entries. Once the updated replicas table is written to the
    superblock, the filesystem is put in a transiently unrecoverable state
    until further journal data is written, because journal recovery expects
    to find at least one marked journal device whenever the filesystem is
    not otherwise marked clean (i.e. as on clean unmount).
    
    To fix this problem, update the journal replicas gc algorithm to always
    mark currently active journal replicas entries by writing to the
    journal. This ensures that only entries for devices that are no longer
    used for journaling are garbage collected, not just those that don't
    happen to currently hold journal data. This preserves the journal
    recovery invariant above and avoids putting the fs into a transiently
    unrecoverable state.
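
The failure and the fix described above can be illustrated with a toy mark-and-sweep model. This is only a sketch under stated assumptions, not the bcachefs implementation: every identifier here (`toy_fs`, `toy_journal_replicas_gc`, `toy_journal_recoverable`, the `mark_active` flag) is hypothetical, and the real fix marks active entries by writing to the journal, which the model reduces to a boolean.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEVS 4

/* Toy model only: every name below is hypothetical, not a bcachefs symbol. */
struct toy_fs {
	bool clean;                         /* superblock marked clean (clean unmount) */
	bool journal_replica[TOY_MAX_DEVS]; /* superblock: device marked for journal data */
	bool journal_active[TOY_MAX_DEVS];  /* device is currently used for journaling */
	int  journal_entries[TOY_MAX_DEVS]; /* journal entries currently held by the device */
};

/*
 * Mark-and-sweep gc of journal replicas entries.
 *
 * With mark_active == false, only devices that currently hold journal
 * data are marked, so an empty journal sweeps away every entry.
 * With mark_active == true, devices still used for journaling are
 * marked as well, so only entries for retired devices are dropped.
 */
void toy_journal_replicas_gc(struct toy_fs *fs, bool mark_active)
{
	bool keep[TOY_MAX_DEVS] = { false };
	size_t i;

	/* Mark phase. */
	for (i = 0; i < TOY_MAX_DEVS; i++) {
		if (fs->journal_entries[i] > 0)
			keep[i] = true;
		if (mark_active && fs->journal_active[i])
			keep[i] = true;
	}

	/* Sweep phase: any entry that was not marked is dropped. */
	for (i = 0; i < TOY_MAX_DEVS; i++)
		fs->journal_replica[i] = keep[i];
}

/*
 * Journal recovery invariant: a filesystem that is not marked clean
 * needs at least one device marked for journal data.
 */
bool toy_journal_recoverable(const struct toy_fs *fs)
{
	size_t i;

	if (fs->clean)
		return true;

	for (i = 0; i < TOY_MAX_DEVS; i++)
		if (fs->journal_replica[i])
			return true;

	return false;
}
```

In this model, running the gc with `mark_active` set to false on an unclean filesystem whose journal is empty leaves `toy_journal_recoverable()` returning false, while running it with `mark_active` set to true keeps the entry for the active device and drops only the entry for a device that is no longer used for journaling.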

    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
journal_reclaim.c 20.9 KB