• Carlos Maiolino's avatar
    Force log to disk before reading the AGF during a fstrim · 72422ef8
    Carlos Maiolino authored
    [ Upstream commit 8c81dd46 ]
    
    Forcing the log to disk after reading the agf is wrong, we might be
    calling xfs_log_force with XFS_LOG_SYNC with a metadata lock held.
    
    This can cause a deadlock when racing a fstrim with a filesystem
    shutdown.
    
    The deadlock has been identified due a miscalculation bug in device-mapper
    dm-thin, which returns lack of space to its users earlier than the device itself
    really runs out of space, changing the device-mapper volume into an error state.
    
    The problem happened while filling the filesystem with a single file,
    triggering the bug in device-mapper, consequently causing an IO error
    and shutting down the filesystem.
    
    If such file is removed, and fstrim executed before the XFS finishes the
    shut down process, the fstrim process will end up holding the buffer
    lock, and going to sleep on the cil wait queue.
    
    At this point, the shut down process will try to wake up all the threads
    waiting on the cil wait queue, but for this, it will try to hold the
    same buffer log already held my the fstrim, locking up the filesystem.
    Signed-off-by: default avatarCarlos Maiolino <cmaiolino@redhat.com>
    Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: default avatarSasha Levin <alexander.levin@microsoft.com>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    72422ef8
xfs_discard.c 6.42 KB