    xfs: invalidate block device page cache during unmount · 032e1603
    Darrick J. Wong authored
    Every now and then I see fstests failures on aarch64 (64k pages) that
    trigger on the following sequence:
    
    mkfs.xfs $dev
    mount $dev $mnt
    touch $mnt/a
    umount $mnt
    xfs_db -c 'path /a' -c 'print' $dev
    
    99% of the time this succeeds, but every now and then xfs_db cannot find
    /a and fails.  This turns out to be a race involving udev/blkid, the
    page cache for the block device, and the xfs_db process.
    
    udev is triggered whenever anyone closes a block device or unmounts it.
    The default udev rules invoke blkid to read the fs super and create
    symlinks to the bdev under /dev/disk.  For this, it uses buffered reads
    through the page cache.
    
    xfs_db also uses buffered reads to examine metadata.  There is no
    coordination between xfs_db and udev, which means that they can run
    concurrently.  Note there is no coordination between the kernel and
    blkid either.
    
    On a system with 64k pages, the page cache can cache the superblock and
    the root inode (and hence the root dir) in the same 64k page.  If
    udev spawns blkid after the mkfs and the system is busy enough that it
    is still running when xfs_db starts up, they'll both read from the same
    page in the pagecache.
    
    The unmount writes updated inode metadata to disk directly.  The XFS
    buffer cache does not use the bdev pagecache, nor does it invalidate the
    pagecache on umount.  If the above scenario occurs, the pagecache no
    longer reflects what's on disk, xfs_db reads the stale metadata, and
    fails to find /a.  Most of the time the sequence succeeds because closing
    a bdev invalidates the page cache, but when processes race, everyone loses.
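The stale read described above can be illustrated with a toy userspace model. This is only a sketch under stated assumptions, not kernel code: `struct toy_bdev`, the `toy_*` helpers, and `late_reader_sees_update()` are hypothetical names invented for the illustration. One array stands in for the disk, a second for the single cached page that holds both the superblock and the root directory.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 32

/* Hypothetical stand-in for a block device with one cached page. */
struct toy_bdev {
    char disk[TOY_PAGE_SIZE];   /* what is actually on disk */
    char page[TOY_PAGE_SIZE];   /* the bdev page cache copy */
    bool page_valid;            /* is the cached page populated? */
};

/* Buffered read: fill the page from disk only if it is not cached yet. */
static const char *toy_buffered_read(struct toy_bdev *b)
{
    if (!b->page_valid) {
        memcpy(b->page, b->disk, TOY_PAGE_SIZE);
        b->page_valid = true;
    }
    return b->page;
}

/* Write that goes straight to disk and never touches the cached page,
 * mimicking metadata writeback that bypasses the bdev page cache. */
static void toy_direct_write(struct toy_bdev *b, const char *data)
{
    memset(b->disk, 0, TOY_PAGE_SIZE);
    strncpy(b->disk, data, TOY_PAGE_SIZE - 1);
}

/* Drop the cached page so the next buffered read goes back to disk. */
static void toy_invalidate(struct toy_bdev *b)
{
    b->page_valid = false;
}

/* Replays the race: an early reader (blkid) populates the page, unmount
 * writes new metadata directly, then a late reader (xfs_db) reads.
 * Returns true if the late reader sees the updated metadata. */
bool late_reader_sees_update(bool invalidate_on_unmount)
{
    struct toy_bdev b = { .disk = "sb+rootdir: empty" };

    toy_buffered_read(&b);                    /* blkid caches the page */
    toy_direct_write(&b, "sb+rootdir: a");    /* unmount writes to disk */
    if (invalidate_on_unmount)
        toy_invalidate(&b);                   /* the proposed fix */

    return strcmp(toy_buffered_read(&b), "sb+rootdir: a") == 0;
}
```

In this model, `late_reader_sees_update(false)` returns false because the late reader is served the page cached before the direct write, while `late_reader_sees_update(true)` returns true because invalidation forces a fresh read from disk.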
    
    Fix the problem by invalidating the bdev pagecache after flushing the
    bdev, so that xfs_db will see up-to-date metadata.
    
    Signed-off-by: Darrick J. Wong <djwong@kernel.org>
    Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
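
For readers who want to try the same "flush, then invalidate" ordering from userspace, for example in a test script on a kernel without this fix, a rough analogue is sketched below. It is not the patch itself, which changes kernel code. The helper name `flush_and_invalidate_bdev()` is hypothetical. `BLKFLSBUF` is the existing Linux block device ioctl that flushes and then drops the buffer cache for a device, and it normally requires root privileges.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>    /* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: write back dirty cached data for the block device
 * at `path`, then ask the kernel to drop its cached pages for that device.
 * Returns 0 on success, -1 on failure with errno set by the failing call. */
int flush_and_invalidate_bdev(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int ret = 0;
    int saved_errno = 0;

    /* Flush first so nothing dirty is lost, then invalidate. */
    if (fsync(fd) < 0 || ioctl(fd, BLKFLSBUF, 0) < 0) {
        ret = -1;
        saved_errno = errno;
    }

    close(fd);
    if (ret < 0)
        errno = saved_errno;    /* close() may have overwritten errno */
    return ret;
}
```

Calling this on the device between `umount` and `xfs_db` would approximate the ordering the fix establishes. A process that is still holding the device open can repopulate the cache afterwards, so this is a workaround sketch rather than a substitute for the kernel change.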
xfs_buf.c 58 KB