- 29 Aug, 2023 2 commits
-
-
Christian Brauner authored
For keyed filesystems that recycle superblocks based on s_fs_info or information contained therein s_fs_info must be kept as long as the superblock is on the filesystem type super list. This isn't guaranteed as s_fs_info will be freed latest in sb->kill_sb(). The fix is simply to perform notification and list removal in kill_anon_super(). Any filesystem needs to free s_fs_info after they call the kill_*() helpers. If they don't they risk use-after-free right now so fixing it here is guaranteed that s_fs_info remain valid. For block backed filesystems notifying in pass sb->kill_sb() in deactivate_locked_super() remains unproblematic and is required because multiple other block devices can be shut down after kill_block_super() has been called from a filesystem's sb->kill_sb() handler. For example, ext4 and xfs close additional devices. Block based filesystems don't depend on s_fs_info (btrfs does use s_fs_info but also uses kill_anon_super() and not kill_block_super().). Sorry for that braino. Goal should be to unify this behavior during this cycle obviously. But let's please do a simple bugfix now. Fixes: 2c18a63b ("super: wait until we passed kill super") Fixes: syzbot+5b64180f8d9e39d3f061@syzkaller.appspotmail.com Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Reported-by: syzbot+5b64180f8d9e39d3f061@syzkaller.appspotmail.com Message-Id: <20230828-vfs-super-fixes-v1-2-b37a4a04a88f@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christian Brauner authored
Fix braino and move the lockdep assertion after put_super() otherwise we risk a use-after-free. Fixes: 2c18a63b ("super: wait until we passed kill super") Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230828-vfs-super-fixes-v1-1-b37a4a04a88f@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
- 23 Aug, 2023 2 commits
-
-
ssh://gitolite.kernel.org/pub/scm/fs/xfs/xfs-linuxChristian Brauner authored
Pull xfs online fsck update from Darrick Wong: New code for 6.6: * Allow the kernel to initiate a freeze of a filesystem. The kernel and userspace can both hold a freeze on a filesystem at the same time; the freeze is not lifted until /both/ holders lift it. This will enable us to fix a longstanding bug in XFS online fsck. * Use kernel-initated fsfreeze to fix some longstanding false negatives in online fsck of the free space and inode counters. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Message-Id: <20230822182604.GB11286@frogsfrogsfrogs> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
ssh://gitolite.kernel.org/pub/scm/fs/xfs/xfs-linuxChristian Brauner authored
Pull filesystem freezing updates from Darrick Wong: New code for 6.6: * Allow the kernel to initiate a freeze of a filesystem. The kernel and userspace can both hold a freeze on a filesystem at the same time; the freeze is not lifted until /both/ holders lift it. This will enable us to fix a longstanding bug in XFS online fsck. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Message-Id: <20230822182604.GB11286@frogsfrogsfrogs> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
- 22 Aug, 2023 1 commit
-
-
Christian Brauner authored
It's not necessary to use low-level locking helpers here. Use the higher-level locking helpers and log if the superblock is dying. Since the caller is assumed to already hold an active reference it isn't possible to observe a dying superblock. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
- 21 Aug, 2023 15 commits
-
-
Christian Brauner authored
Recent rework moved block device closing out of sb->put_super() and into sb->kill_sb() to avoid deadlocks as s_umount is held in put_super() and blkdev_put() can end up taking s_umount again. That means we need to move the removal of the superblock from @fs_supers out of generic_shutdown_super() and into deactivate_locked_super() to ensure that concurrent mounters don't fail to open block devices that are still in use because blkdev_put() in sb->kill_sb() hasn't been called yet. We can now do this as we can make iterators through @fs_super and @super_blocks wait without holding s_umount. Concurrent mounts will wait until a dying superblock is fully dead so until sb->kill_sb() has been called and SB_DEAD been set. Concurrent iterators can already discard any SB_DYING superblock. Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230818-vfs-super-fixes-v3-v3-4-9f0b1876e46b@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christian Brauner authored
Recent patches experiment with making it possible to allocate a new superblock before opening the relevant block device. Naturally this has intricate side-effects that we get to learn about while developing this. Superblock allocators such as sget{_fc}() return with s_umount of the new superblock held and lock ordering currently requires that block level locks such as bdev_lock and open_mutex rank above s_umount. Before aca740ce ("fs: open block device after superblock creation") ordering was guaranteed to be correct as block devices were opened prior to superblock allocation and thus s_umount wasn't held. But now s_umount must be dropped before opening block devices to avoid locking violations. This has consequences. The main one being that iterators over @super_blocks and @fs_supers that grab a temporary reference to the superblock can now also grab s_umount before the caller has managed to open block devices and called fill_super(). So whereas before such iterators or concurrent mounts would have simply slept on s_umount until SB_BORN was set or the superblock was discard due to initalization failure they can now needlessly spin through sget{_fc}(). If the caller is sleeping on bdev_lock or open_mutex one caller waiting on SB_BORN will always spin somewhere and potentially this can go on for quite a while. It should be possible to drop s_umount while allowing iterators to wait on a nascent superblock to either be born or discarded. This patch implements a wait_var_event() mechanism allowing iterators to sleep until they are woken when the superblock is born or discarded. This also allows us to avoid relooping through @fs_supers and @super_blocks if a superblock isn't yet born or dying. Link: aca740ce ("fs: open block device after superblock creation") Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230818-vfs-super-fixes-v3-v3-3-9f0b1876e46b@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christian Brauner authored
Make the naming consistent with the earlier introduced super_lock_{read,write}() helpers. Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230818-vfs-super-fixes-v3-v3-2-9f0b1876e46b@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christian Brauner authored
Replace the open-coded {down,up}_{read,write}() calls with simple wrappers. Follow-up patches will benefit from this as well. Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230818-vfs-super-fixes-v3-v3-1-9f0b1876e46b@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
kill_dirty has always been true for a long time, so hard code it and remove the unused return value. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-18-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
get_super is unused now, remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-17-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
BLKFLSBUF is a historic ioctl that is called on a file handle to a block device and syncs either the file system mounted on that block device if there is one, or otherwise the just the data on the block device. Replace the get_super based syncing with a holder operation to remove the last usage of get_super, and to also support syncing the file system if the block device is not the main block device stored in s_dev. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-16-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
Combine the newly merged bdev_mark_dead helper with the existing mark_dead holder operation so that all operations that invalidate a device that is dead or being removed now go through the holder ops. This allows file systems to explicitly shutdown either ASAP (for a surprise removal) or after writing back data (for an orderly removal), and do so not only for the main device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-15-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
We currently have two interfaces that take a block_devices and the find a mounted file systems to flush or invaldidate data on it. Both are a bit problematic because they only work for the "main" block devices that is used as s_dev for the super_block, and because they don't call into the file system at all. Merge the two into a new bdev_mark_dead helper that does both the syncing and invalidation and which is properly documented. This is in preparation of merging the functionality into the ->mark_dead holder operation so that it will work on additional block devices used by a file systems and give us a single entry point for invalidation of dead devices or media. Note that a single standalone fsync_bdev call for an obscure ioctl remains for now, but that one will also be deal with in a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-14-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
This message isn't exactly helpful, and file systems already print way more useful messages when shut down while active. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-13-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
Don't just write out the data, but also invalidate all caches when setting the device offline. Stop canceling the offlining when writeback fails as there is no way to recover from that anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-12-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
FDFMTBEG is used by fdformat to calibrate before formatting a disk. Neither the atari nor PC floppy driver sync data, which also seems a bit pointless for a disk hat is about to get formatted. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-11-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
While changing the format of a floppy isn't strictly speaking a media change, the effects are the same in that the content of the media changes and the diskseq should be increased and uevent should be sent. Switch from calling __invalidate_device to disk_force_media_change to do so. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-10-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
Hard code the events to DISK_EVENT_MEDIA_CHANGE as that is the only useful use case, and drop the superfluous return value. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-9-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
nbd_clear_sock_ioctl kills the socket and with that the block device. Instead of just invalidating file system buffers, mark the device as dead, which will also invalidate the buffers as part of the proper shutdown sequence. This also includes invalidating partitions if there are any. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Message-Id: <20230811100828.1897174-8-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
- 11 Aug, 2023 7 commits
-
-
Christoph Hellwig authored
Use the generic fs_holder_ops to shut down the file system when the log or RT device goes away instead of duplicating the logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230802154131.2221419-13-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
Just like get_tree_bdev needs to drop s_umount when opening the main device, we need to do the same for the xfs log and RT devices to avoid a potential lock order reversal with s_unmount for the mark_dead path. It might be preferable to just drop s_umount over ->fill_super entirely, but that will require a fairly massive audit first, so we'll do the easy version here first. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230802154131.2221419-12-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
Use the generic fs_holder_ops to shut down the file system when the log device goes away instead of duplicating the logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230802154131.2221419-11-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
Just like get_tree_bdev needs to drop s_umount when opening the main device, we need to do the same for the ext4 log device to avoid a potential lock order reversal with s_unmount for the mark_dead path. It might be preferable to just drop s_umount over ->fill_super entirely, but that will require a fairly massive audit first, so we'll do the easy version here first. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-10-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
Export fs_holder_ops so that file systems that open additional block devices can use it as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-9-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
fs_mark_dead currently uses get_super to find the superblock for the block device that is going away. This means it is limited to the main device stored in sb->s_dev, leading to a lot of code duplication for file systems that can use multiple block devices. Now that the holder for all block devices used by file systems is set to the super_block, we can instead look at that holder and then check if the file system is born and active, so do that instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-8-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
The file system type is not a very useful holder as it doesn't allow us to go back to the actual file system instance. Pass the super_block instead which is useful when passed back to the file system driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-7-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
- 10 Aug, 2023 13 commits
-
-
Christoph Hellwig authored
Check for sb->s_type which is the right place to look at the file system type, not the holder, which is just an implementation detail in the VFS helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-6-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
Use the generic setup_bdev_super helper to open the main block device and do various bits of superblock setup instead of duplicating the logic. This includes moving to the new scheme implemented in common code that only opens the block device after the superblock has allocated. It does not yet convert nilfs2 to the new mount API, but doing so will become a bit simpler after this first step. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Message-Id: <20230802154131.2221419-3-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
We'll want to use setup_bdev_super instead of duplicating it in nilfs2. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-2-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Jan Kara authored
Currently get_tree_bdev and mount_bdev open the block device before committing to allocating a super block. That creates problems for restricting the number of writers to a device, and also leads to a unusual and not very helpful holder (the fs_type). Reorganize the super block code to first look whether the superblock for a particular device does already exist and open the block device only if it doesn't. [hch: port to before the bdev_handle changes, duplicate the bdev read-only check from blkdev_get_by_path, extend the fsfree_mutex coverage to protect against freezes, fix an open bdev leak when the bdev is frozen, use the bdev local variable more, rename the s variable to sb to be more descriptive] [brauner: remove references to mounts as they're mostly irrelevant] [brauner & hch: fold fixes for romfs and cramfs for syzbot+2faac0423fdc9692822b@syzkaller.appspotmail.com] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230724175145.201318-1-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
As a rule of thumb everything allocated to the fs_context and moved into the super_block should be freed by ->kill_sb so that the teardown handling doesn't need to be duplicated between the fill_super error path and put_super. Implement an ntfs3-specific kill_sb method to do that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230809220545.1308228-14-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
kill_block_super will call sync_blockdev just a tad later already. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230809220545.1308228-13-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
put_ntfs is a rather unconventional name for a function that frees the sbi and associated resources. Give it a more descriptive name and drop the duplicate name in the top of the function comment. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230809220545.1308228-12-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
As a rule of thumb everything allocated to the fs_context and moved into the super_block should be freed by ->kill_sb so that the teardown handling doesn't need to be duplicated between the fill_super error path and put_super. Implement an exfat-specific kill_sb method to do that and share the code with the mount contex free helper for the mount error handling case. Signed-off-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230809220545.1308228-11-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
There are no RCU critical sections for accessing any information in the sbi, so drop the call_rcu indirection for freeing the sbi. Signed-off-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230809220545.1308228-10-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
blkdev_put must not be called under sb->s_umount to avoid a lock order reversal with disk->open_mutex. Move closing the external journal device into ->kill_sb to archive that. Signed-off-by: Christoph Hellwig <hch@lst.de> Message-Id: <20230809220545.1308228-9-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
Copy and paste the commit message from Darrick into a comment to explain the seemingly odd invalidate_bdev in xfs_shutdown_devices. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230809220545.1308228-8-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
blkdev_put must not be called under sb->s_umount to avoid a lock order reversal with disk->open_mutex. Move closing the buftargs into ->kill_sb to archive that. Note that the flushing of the disk caches and block device mapping invalidated needs to stay in ->put_super as the main block device is closed in kill_block_super already. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230809220545.1308228-7-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Christoph Hellwig authored
Closing the block devices logically belongs into xfs_free_buftarg, So instead of open coding it in the caller move it there and add a check for the s_bdev so that the main device isn't close as that's done by the VFS helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230809220545.1308228-6-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
-