- 07 May, 2024 40 commits
-
-
Matthew Wilcox (Oracle) authored
This is a direct conversion from pages to folios, assuming single-page folios. Also removes a few calls to compound_head() and calls to obsolete APIs. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Matthew Wilcox (Oracle) authored
Several modules use __bio_add_page() today and may need to be converted to bio_add_folio_nofail(). Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Thorsten Blum authored
Remove the duplicate include of the header file linux/blkdev.h. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Now that we have the lock_extent tightly coupled with extent_clear_unlock_delalloc we can add a cached state to extent_clear_unlock_delalloc and benefit from skipping the extra lookup when we're doing cow. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We don't need to include the time we spend in the allocator under our extent lock protection, move it after the allocator and make sure we lock the extent in the error case to ensure we're not clearing these bits without the extent lock held. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Now that we've got the extent lock pushed into cow_file_range() we can push it further down into the allocation loop. This allows us to only hold the extent lock during the dropping of the extent map range and inserting the ordered extent. This makes the error case a little trickier as we'll now have to lock the range before clearing any of the other extent bits for the range, but this is the error path so is less performance critical. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
These checks aren't reliant on the extent lock. Move this up into cow_file_range_inline(), and then update encoded writes to call this check before calling __cow_file_range_inline(). This will allow us to skip the extent lock if we're not able to inline the given extent. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Now that we've pushed the lock_extent() into cow_file_range() we can push the extent locking into cow_file_range_inline() and move the lock_extent in cow_file_range() to after we call cow_file_range_inline(). Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Now that cow_file_range is the only function that is called with the range locked, push this call into cow_file_range so we can further narrow the scope. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
This is used by zoned but also as the fallback for uncompressed extents when we fail to compress the ranges. Push the extent lock into run_delalloc_cow(), and adjust the compression case to take the extent lock after calling run_delalloc_cow(). Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Since we immediately unlock the extent range when we enter run_delalloc_compressed() simply move the lock_extent() down to cover cow_file_range() and then remove the unlock_extent() from run_delalloc_compressed. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
run_delalloc_nocow is a little special because we use the file extents to see if we can nocow a range. We don't actually need the protection of the extent lock to look at the file extents at this point however. We are currently holding the page lock for this range, so we are protected from anybody who would simultaneously be modifying the file extent items for this range. * mmap() - we're holding the page lock. * buffered writes - we're holding the page lock. * direct writes - we're holding the page lock and direct IO has to flush page cache before it's able to continue. * fallocate() - all callers flush the range and wait on ordered extents while holding the inode lock and the mmap lock, so we are again saved by the page lock. We want to use the extent lock to protect 1) The mapping tree for the given range. 2) The ordered extents for the given range. 3) The io_tree for the given range. Push the extent lock down to cover these operations. In the fallback_to_cow() case we simply lock before doing anything and rely on the cow_file_range() helper to handle its range properly. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We have the following pattern: while (1) { if (cur_offset > end) break; } which is just: while (cur_offset <= end) { ... } so adjust the code to be clearer. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
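The equivalence of the two loop shapes can be checked with a small user-space sketch. The function names and the `step` parameter are hypothetical and only illustrate the loop structure; this is not the btrfs code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Old shape: infinite loop with an explicit exit test at the top. */
int count_iterations_old(uint64_t cur_offset, uint64_t end, uint64_t step)
{
	int n = 0;

	assert(step > 0);
	while (1) {
		if (cur_offset > end)
			break;
		n++;
		cur_offset += step;
	}
	return n;
}

/* New shape: the exit test becomes the loop condition. */
int count_iterations_new(uint64_t cur_offset, uint64_t end, uint64_t step)
{
	int n = 0;

	assert(step > 0);
	while (cur_offset <= end) {
		n++;
		cur_offset += step;
	}
	return n;
}
```

Both functions visit the same offsets for any input, which is why the change is purely cosmetic.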
-
Josef Bacik authored
run_delalloc_nocow is a bit special as it walks through the file extents for the inode and determines what it can nocow and what it can't. This is the more complicated area for extent locking, so start with this function. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We want to limit the scope of the extent lock to be around operations that can change in flight. Currently we hold the extent lock through the entire writepage operation, which isn't really necessary. We want to protect to make sure nobody has updated DELALLOC. In find_lock_delalloc_range we must lock the range in order to validate the contents of our io_tree. However once we've done that we're safe to unlock the range and continue, as we have the page lock already held for the range. We are protected from all operations at this point. * mmap() - we're holding the page lock, thus are protected. * buffered writes - again, we're protected because we take the page lock for the first and last page in our range for buffered writes so we won't create new delalloc ranges in this area. * direct IO - we invalidate pagecache before attempting to write a new area, which requires the page lock, so again are protected once we're holding the page lock on this range. Additionally this behavior actually already exists for compressed, we unlock the range as soon as we start to process the async extents, and re-lock it during compression. So this is completely safe, and makes the locking more consistent. Make this simple by just pushing the extent lock into btrfs_run_delalloc_range. From there followup patches will push the lock further down into its users. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We currently don't lock the extent when we're doing a cow_file_range_inline() for a compressed extent. This isn't a problem necessarily, but it's inconsistent with the rest of our usage of cow_file_range_inline(). This also leads to some extra weird logic around whether the extent is locked or not. Fix this to lock the extent before calling cow_file_range_inline() in compression to make it consistent with the rest of the inline users. In future patches this will be pushed down into the cow_file_range_inline() helper, so we're fine with the quick and dirty locking here. This patch exists to make the behavior change obvious. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
We duplicate the extent cleanup for cow_file_range_inline() in the cow and compressed case. The encoded case doesn't need to do cleanup the same way, so rename cow_file_range_inline to __cow_file_range_inline and then make cow_file_range_inline handle the extent cleanup appropriately, and update the callers. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Since 4750af3b ("btrfs: prevent extent_clear_unlock_delalloc() to unlock page not locked by __process_pages_contig()") we have been unlocking the locked page manually instead of via extent_clear_unlock_delalloc() because of subpage blocksize support. However we actually disable inline extent creation for subpage blocksize support, so this behavior isn't necessary. Remove this code and comment. If at some point the subpage blocksize code grows support for inline extents, this can be re-evaluated. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Josef Bacik authored
Currently we have a lot of duplicated checks of the form: if (start == 0 && fs_info->sectorsize == PAGE_SIZE) cow_file_range_inline(); Instead of duplicating this check everywhere, consolidate all of the inline extent logic into a helper which documents all of the checks and then use that helper inside of cow_file_range_inline(). With this we can clean up all of the calls to either unconditionally call cow_file_range_inline(), or at least reduce the checks we're doing before we call cow_file_range_inline(). Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
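A minimal sketch of this kind of consolidation, assuming only the two conditions quoted in the message (the real helper documents further checks). The names `can_inline_extent()`, `try_inline()` and `SKETCH_PAGE_SIZE`, as well as the return convention, are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Hypothetical helper: every precondition for creating an inline extent
 * lives in one place instead of being repeated at each call site.
 */
bool can_inline_extent(uint64_t start, uint32_t sectorsize)
{
	assert(sectorsize != 0);	/* a zero sector size is never valid */

	/* Inline extents can only start at file offset 0. */
	if (start != 0)
		return false;
	/* Not attempted when the sector size differs from the page size. */
	if (sectorsize != SKETCH_PAGE_SIZE)
		return false;
	return true;
}

/*
 * Callers invoke this unconditionally. In this sketch it returns 0 if
 * the extent was inlined and 1 if the caller must fall back to a
 * regular extent.
 */
int try_inline(uint64_t start, uint32_t sectorsize)
{
	if (!can_inline_extent(start, sectorsize))
		return 1;
	/* ... the inline extent would be created here ... */
	return 0;
}
```

With the checks inside the helper, a call site shrinks to a single `try_inline()` call plus handling of the fallback result.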
-
Josef Bacik authored
In the cow path we will clone the reloc csums for relocated data extents, and if there's an error we already have an ordered extent and rely on the ordered extent finishing to clean everything up. There's a problem, however: we don't mark the ordered extent with an error, we pretend like everything was just fine. If we were at the end of our range we won't actually bubble up this error anywhere, and we could end up inserting an extent that doesn't have csums where it should have them. Fix this by adding a helper to mark the ordered extent with an error, and then use this when we fail to look up the csums in btrfs_reloc_clone_csums. Use this helper in the other place where we use the same pattern while we're here. This will prevent us from erroneously inserting the extent that doesn't have the required checksums. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
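The failure mode and the fix can be modelled in user space as follows. This is a simplified sketch: the struct, the `sketch_` functions and the `lookup_ret` parameter are hypothetical stand-ins, not the btrfs data structures or the actual helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an ordered extent. */
struct sketch_ordered_extent {
	bool io_error;	/* set when any step for this extent failed */
};

/* Hypothetical helper: record the failure on the ordered extent. */
void sketch_mark_ordered_extent_error(struct sketch_ordered_extent *oe)
{
	assert(oe != NULL);
	oe->io_error = true;
}

/* Stand-in for the csum cloning step; lookup_ret simulates the lookup. */
int sketch_clone_csums(struct sketch_ordered_extent *oe, int lookup_ret)
{
	if (lookup_ret < 0) {
		/* Without this, completion would treat the extent as good. */
		sketch_mark_ordered_extent_error(oe);
		return lookup_ret;
	}
	return 0;
}

/* Completion inserts the extent only if no error was recorded. */
bool sketch_finish_inserts_extent(const struct sketch_ordered_extent *oe)
{
	assert(oe != NULL);
	return !oe->io_error;
}
```

Because the error is recorded on the extent itself, the completion path can reject it even when the caller never sees the return value.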
-
Qu Wenruo authored
The function create_io_em() is called before we submit an IO, to update the in-memory extent map for the involved range. This patch changes the following aspects: - Does not allow BTRFS_ORDERED_NOCOW type For real NOCOW (excluding NOCOW writes into preallocated ranges) writes, we never call create_io_em(), as we do not need to update the extent map at all. So remove the sanity check allowing BTRFS_ORDERED_NOCOW type. - Add extra sanity checks * PREALLOC - @block_len == @len For uncompressed writes. * REGULAR - @block_len == @orig_block_len == @ram_bytes == @len We're creating a new uncompressed extent, and referring to all of it. - @orig_start == @start We have no offset inside the extent. * COMPRESSED - valid @compress_type - @len <= @ram_bytes This is to co-operate with encoded writes, which can cause a new file extent referring to only part of an uncompressed extent. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
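The listed sanity checks can be written down as a single predicate. The sketch below is a user-space model under stated assumptions: the enum values, the struct layout and the reading of "valid @compress_type" as "a real algorithm, not none" are hypothetical, and the kernel expresses these checks as assertions rather than a boolean function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent types and compression ids for this sketch. */
enum sketch_type { SK_PREALLOC, SK_REGULAR, SK_COMPRESSED, SK_NOCOW };
enum sketch_compress { SK_COMPRESS_NONE = 0, SK_COMPRESS_ZLIB,
		       SK_COMPRESS_LZO, SK_COMPRESS_ZSTD };

struct sketch_em_args {
	enum sketch_type type;
	uint64_t start, len, orig_start;
	uint64_t block_len, orig_block_len, ram_bytes;
	int compress_type;
};

/* Returns true when the arguments satisfy the checks for their type. */
bool sketch_io_em_args_valid(const struct sketch_em_args *a)
{
	assert(a != NULL);

	switch (a->type) {
	case SK_PREALLOC:
		/* Uncompressed write into a preallocated range. */
		return a->block_len == a->len;
	case SK_REGULAR:
		/* New uncompressed extent, fully referenced, no offset. */
		return a->block_len == a->len &&
		       a->orig_block_len == a->len &&
		       a->ram_bytes == a->len &&
		       a->orig_start == a->start;
	case SK_COMPRESSED:
		/* Encoded writes may reference only part of the data. */
		return a->compress_type > SK_COMPRESS_NONE &&
		       a->compress_type <= SK_COMPRESS_ZSTD &&
		       a->len <= a->ram_bytes;
	default:
		/* NOCOW and unknown types are rejected. */
		return false;
	}
}
```

Keeping one case per extent type mirrors the structure of the list in the message and makes it easy to see which invariant applies where.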
-
Qu Wenruo authored
With the tree-checker ensuring all inline file extents start at file offset 0 and have a length no larger than sectorsize, we can simplify the calculation to assign those fixed values directly. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
The extent_map structure is critical to btrfs, as it is involved in both the read and write paths. Unfortunately the structure is not properly explained, making it pretty hard to understand or to improve further. This patch adds extra comments explaining the major members based on my code reading. Hopefully we can find more members to clean up in the future. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Naohiro Aota authored
calcu_metadata_size() has a "reserve" argument, but the only caller always sets it to "1". The other usage (reserve = 0) was dropped by commit 0647bf56 ("Btrfs: improve forever loop when doing balance relocation"), more than 10 years ago. Drop the argument and simplify the code. Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
There's another return variable wret that is only passed to ret on error, we can simply use ret. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
First, drop err and reuse ret instead; return the error directly instead of using goto fail and then returning the same error. Do not initialize ret until it has to be initialized. Slight logic change in handling the btrfs_search_slot() and btrfs_next_leaf() return values. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
Rename ret to ret2, and then err to ret. Also, the new ret2 is found to be localized within the 'if (trans)' statement, so move its declaration there. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
In quick_update_accounting() err is used as 2nd return value, which could be achieved just with ret. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
Coding style fixes in the function relocate_tree_blocks(). After the fix, ret is the return value variable. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
Code style fix in the function build_backref_tree(). Drop the initialization of ret to 0, as we don't need it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
Rename the function's local return variables err and werr to ret. Also, align the variable declarations with the other declarations in the function for better function space alignment. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
Rename the function's local variable werr and err to ret. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
In the functions btrfs_write_marked_extents() and __btrfs_wait_marked_extents(), return the actual error when filemap_fdata<write|wait>_range() fails. Suggested-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
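One way to propagate the actual error code is sketched below. It is a user-space model only: `sketch_write_marked_ranges()` and its `results` array are hypothetical stand-ins for a loop that calls a write or wait helper per range, and stopping at the first failure is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in: results[i] is what the per-range write or wait
 * helper returned for range i (0 on success, negative errno on failure).
 */
int sketch_write_marked_ranges(const int *results, size_t count)
{
	int ret = 0;

	assert(results != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		ret = results[i];
		/* Stop at the first failure and hand back its exact code. */
		if (ret)
			break;
	}
	return ret;
}
```

Returning the code unchanged lets callers distinguish, for example, -ENOSPC from -EIO instead of seeing one generic failure value.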
-
David Sterba authored
There are open coded tests of BTRFS_FS_STATE_DUMMY_FS_INFO and we have a wrapper for that that's a compile-time constant when self-tests are not built in. As this is only for development we can save some bytes and conditions on release configs by using the helper in the remaining cases. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Filipe Manana authored
There's no need to initialize the delayed inodes xarray with a GFP_ATOMIC flag because that actually does nothing on the xarray operations. That was needed for radix trees, but for xarrays the allocation flags are passed as the last argument to xa_store() (which we are using correctly). So initialize the delayed inodes xarray with a simple xa_init(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Filipe Manana authored
Currently try_release_extent_mapping() has an int return type, but we use it as a boolean. Its only caller, the release folio callback, also returns a boolean which corresponds to try_release_extent_mapping()'s return value. So change its return value type to bool as well as its helper try_release_extent_state(). Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Filipe Manana authored
At try_release_extent_mapping(), called during the release folio callback (btrfs_release_folio() callchain), we don't release any extent maps in the range if the GFP flags don't allow blocking. This behaviour is exaggerated because: 1) Both searching for extent maps and removing them are not blocking operations. The only thing that can block is the cond_resched() call at the end of the loop that searches for and removes extent maps; 2) We currently only operate on a single page, so for the case where block size matches the page size, we can only have one extent map, and for the case where the block size is smaller than the page size, we can have at most 16 extent maps. So it's very unlikely the cond_resched() call will ever block even in the block size smaller than page size scenario. So instead of not removing any extent maps at all in case the GFP flags don't allow blocking, keep removing extent maps while we don't need to reschedule. This makes it safe for the subpage case and for a future where we can process folios with a size larger than a page. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
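The new policy can be modelled in user space roughly as follows. This is a sketch under stated assumptions: `sketch_release_extent_maps()`, `can_block` and `resched_at` are hypothetical, with `resched_at` simulating the iteration from which a reschedule is due (negative meaning never).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: nr_maps extent maps are removable in the range.
 * Returns how many were removed before the loop had to stop.
 */
int sketch_release_extent_maps(int nr_maps, bool can_block, int resched_at)
{
	int removed = 0;

	assert(nr_maps >= 0);

	for (int i = 0; i < nr_maps; i++) {
		bool need_resched = (resched_at >= 0 && i >= resched_at);

		if (need_resched) {
			/* Non-blocking callers stop but keep their progress. */
			if (!can_block)
				break;
			/* A blocking caller would yield the CPU here. */
		}
		/* Searching for and removing a map does not block. */
		removed++;
	}
	return removed;
}
```

In this model a non-blocking caller still removes every extent map when no reschedule becomes due, whereas the old behaviour described above would have removed none.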
-
Filipe Manana authored
Currently we don't attempt to release extent maps if the inode has an i_size that is not greater than 16M. This condition was added way back in 2008 by commit 70dec807 ("Btrfs: extent_io and extent_state optimizations"), without any explanation. A quick chat with Chris on Slack revealed that the goal was probably to release the extent maps for small files only when closing the inode. This however can be harmful in case we have tons of such files being kept open for very long periods of time, since we will consume more and more pages for extent maps. So remove the condition. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Filipe Manana authored
Nowadays we have the btrfs_get_fs_generation() to get the current generation of the filesystem, so there's no need anymore to lock the transaction spinlock to read it. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Filipe Manana authored
Rename the following variables: 1) "btrfs_inode" to "inode", because it's shorter to type and clear, and we don't have a VFS inode here as well, so there's no confusion; 2) "tree" to "io_tree", to be clear which tree we are dealing with, since we use 2 different trees in the function; 3) "map" to "extent_tree" since "map" gives the idea we are dealing with an extent map for example, but we are dealing with the inode's extent tree (the tree which stores extent maps). These also make the next patches simpler. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-