An error occurred fetching the project authors.
- 26 Sep, 2017 1 commit
-
-
Misono, Tomohiro authored
Currently, "btrfs quota enable" would fail after "btrfs quota disable" on the first time with syslog output "qgroup_rescan_init failed with -22", but it would succeed on the second time. When "quota disable" is called, BTRFS_FS_QUOTA_DISABLING flag bit will be set in fs_info->flags in btrfs_quota_disable(), but it will not be droppd in btrfs_run_qgroups() (which is called in btrfs_commit_transaction()) because quota_root has already been freed. If "quota enable" is called after that, both BTRFS_FS_QUOTA_DISABLING and BTRFS_FS_QUOTA_ENABLED flag would be dropped in the btrfs_run_qgroups() since quota_root is not NULL. This leads to the failure of "quota enable" on the first time. BTRFS_FS_QUOTA_DISABLING flag is not used outside of "quota disable" context and is equivalent to whether quota_root is NULL or not. btrfs_run_qgroups() checks whether quota_root is NULL or not in the first place. So, let's remove BTRFS_FS_QUOTA_DISABLING flag. Signed-off-by:
Tomohiro Misono <misono.tomohiro@jp.fujitsu.com> Reviewed-by:
David Sterba <dsterba@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 21 Aug, 2017 1 commit
-
-
Jeff Mahoney authored
btrfs_del_roots always uses the tree_root. Let's pass fs_info instead. Signed-off-by:
Jeff Mahoney <jeffm@suse.com> Reviewed-by:
David Sterba <dsterba@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 16 Aug, 2017 2 commits
-
-
David Sterba authored
The helpers append "\n" so we can keep the actual strings shorter. The extra newline will print an empty line. Some messages have been slightly modified to be more consistent with the rest (lowercase first letter). Reviewed-by:
Nikolay Borisov <nborisov@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Nikolay Borisov authored
The current code was erroneously checking for root_level > BTRFS_MAX_LEVEL. If we had a root_level of 8 then the check won't trigger and we could potentially hit a buffer overflow. The correct check should be root_level >= BTRFS_MAX_LEVEL . Signed-off-by:
Nikolay Borisov <nborisov@suse.com> Reviewed-by:
Qu Wenruo <quwenruo.btrfs@gmx.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 29 Jun, 2017 6 commits
-
-
Chris Mason authored
Dave Jones hit a WARN_ON(nr < 0) in btrfs_wait_ordered_roots() with v4.12-rc6. This was because commit 70e7af24 made it possible for calc_reclaim_items_nr() to return a negative number. It's not really a bug in that commit, it just didn't go far enough down the stack to find all the possible 64->32 bit overflows. This switches calc_reclaim_items_nr() to return a u64 and changes everyone that uses the results of that math to u64 as well. Reported-by:
Dave Jones <davej@codemonkey.org.uk> Fixes: 70e7af24 ("Btrfs: fix delalloc accounting leak caused by u32 overflow") Signed-off-by:
Chris Mason <clm@fb.com> Reviewed-by:
David Sterba <dsterba@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
[BUG] For the following case, btrfs can underflow qgroup reserved space at an error path: (Page size 4K, function name without "btrfs_" prefix) Task A | Task B ---------------------------------------------------------------------- Buffered_write [0, 2K) | |- check_data_free_space() | | |- qgroup_reserve_data() | | Range aligned to page | | range [0, 4K) <<< | | 4K bytes reserved <<< | |- copy pages to page cache | | Buffered_write [2K, 4K) | |- check_data_free_space() | | |- qgroup_reserved_data() | | Range alinged to page | | range [0, 4K) | | Already reserved by A <<< | | 0 bytes reserved <<< | |- delalloc_reserve_metadata() | | And it *FAILED* (Maybe EQUOTA) | |- free_reserved_data_space() |- qgroup_free_data() Range aligned to page range [0, 4K) Freeing 4K (Special thanks to Chandan for the detailed report and analyse) [CAUSE] Above Task B is freeing reserved data range [0, 4K) which is actually reserved by Task A. And at writeback time, page dirty by Task A will go through writeback routine, which will free 4K reserved data space at file extent insert time, causing the qgroup underflow. [FIX] For btrfs_qgroup_free_data(), add @reserved parameter to only free data ranges reserved by previous btrfs_qgroup_reserve_data(). So in above case, Task B will try to free 0 byte, so no underflow. Reported-by:
Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by:
Chandan Rajendra <chandan@linux.vnet.ibm.com> Tested-by:
Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
Introduce a new parameter, struct extent_changeset for btrfs_qgroup_reserved_data() and its callers. Such extent_changeset was used in btrfs_qgroup_reserve_data() to record which range it reserved in current reserve, so it can free it in error paths. The reason we need to export it to callers is, at buffered write error path, without knowing what exactly which range we reserved in current allocation, we can free space which is not reserved by us. This will lead to qgroup reserved space underflow. Reviewed-by:
Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
btrfs_qgroup_release/free_data() only returns 0 or a negative error number (ENOMEM is the only possible error). This is normally good enough, but sometimes we need the exact byte count it freed/released. Change it to return actually released/freed bytenr number instead of 0 for success. And slightly modify related extent_changeset structure, since in btrfs one no-hole data extent won't be larger than 128M, so "unsigned int" is large enough for the use case. Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
Quite a lot of qgroup corruption happens due to wrong time of calling btrfs_qgroup_prepare_account_extents(). Since the safest time is to call it just before btrfs_qgroup_account_extents(), there is no need to separate these 2 functions. Merging them will make code cleaner and less bug prone. Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> [ changelog and comment adjustments ] Signed-off-by:
David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
Modify btrfs_qgroup_account_extent() to exit quicker for non-fs extents. The quick exit condition is: 1) The extent belongs to a non-fs tree Only fs-tree extents can affect qgroup numbers and is the only case where extent can be shared between different trees. Although strictly speaking extent in data-reloc or tree-reloc tree can be shared, data/tree-reloc root won't appear in the result of btrfs_find_all_roots(), so we can ignore such case. So we can check the first root in old_roots/new_roots ulist. - if we find the 1st root is a not a fs/subvol root, then we can skip the extent - if we find the 1st root is a fs/subvol root, then we must continue calculation OR 2) both 'nr_old_roots' and 'nr_new_roots' are 0 This means either such extent got allocated then freed in current transaction or it's a new reloc tree extent, whose nr_new_roots is 0. Either way it won't affect qgroup accounting and can be skipped safely. Such quick exit can make trace output more quite and less confusing: (example with fs uuid and time stamp removed) Before: ------ add_delayed_tree_ref: bytenr=29556736 num_bytes=16384 action=ADD_DELAYED_REF parent=0(-) ref_root=2(EXTENT_TREE) level=0 type=TREE_BLOCK_REF seq=0 btrfs_qgroup_account_extent: bytenr=29556736 num_bytes=16384 nr_old_roots=0 nr_new_roots=1 ------ Extent tree block will trigger btrfs_qgroup_account_extent() trace point while no qgroup number is changed, as extent tree won't affect qgroup accounting. After: ------ add_delayed_tree_ref: bytenr=29556736 num_bytes=16384 action=ADD_DELAYED_REF parent=0(-) ref_root=2(EXTENT_TREE) level=0 type=TREE_BLOCK_REF seq=0 ------ Now such unrelated extent won't trigger btrfs_qgroup_account_extent() trace point, making the trace less noisy. Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> [ changelog and comment adjustments ] Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 21 Jun, 2017 1 commit
-
-
Jeff Mahoney authored
On an uncontended system, we can end up hitting soft lockups while doing replace_path. At the core, and frequently called is btrfs_qgroup_trace_leaf_items, so it makes sense to add a cond_resched there. Signed-off-by:
Jeff Mahoney <jeffm@suse.com> Reviewed-by:
David Sterba <dsterba@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 19 Jun, 2017 1 commit
-
-
Sargun Dhillon authored
This patch introduces the quota override flag to btrfs_fs_info, and a change to quota limit checking code to temporarily allow for quota to be overridden for processes with CAP_SYS_RESOURCE. It's useful for administrative programs, such as log rotation, that may need to temporarily use more disk space in order to free up a greater amount of overall disk space without yielding more disk space to the rest of userland. Eventually, we may want to add the idea of an operator-specific quota, operator reserved space, or something else to allow for administrative override, but this is perhaps the simplest solution. Signed-off-by:
Sargun Dhillon <sargun@sargun.me> Reviewed-by:
David Sterba <dsterba@suse.com> [ minor changelog edits ] Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 19 Apr, 2017 1 commit
-
-
David Sterba authored
The WARN_ON and warning from report_reserved_underflow can become very noisy and is visible unconditionally although this is namely for debugging. The patch "btrfs: Add WARN_ON for qgroup reserved underflow" (18dc22c1) went to 4.11-rc1 and the plan was to get the fix as well, but this hasn't happened. CC: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 18 Apr, 2017 5 commits
-
-
Qu Wenruo authored
Newly introduced qgroup reserved space trace points are normally nested into several common qgroup operations. While some other trace points are not well placed to co-operate with them, causing confusing output. This patch re-arrange trace_btrfs_qgroup_release_data() and trace_btrfs_qgroup_free_delayed_ref() trace points so they are triggered before reserved space ones. Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by:
Jeff Mahoney <jeffm@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
Introduce the following trace points: qgroup_update_reserve qgroup_meta_reserve These trace points are handy to trace qgroup reserve space related problems. Also export btrfs_qgroup structure, as now we directly pass btrfs_qgroup structure to trace points, so that structure needs to be exported. Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by:
David Sterba <dsterba@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Goldwyn Rodrigues authored
We are facing the same problem with EDQUOT which was experienced with ENOSPC. Not sure if we require a full ticketing system such as ENOSPC, but here is a quick fix, which may be too big a hammer. Quotas are reserved during the start of an operation, incrementing qg->reserved. However, it is written to disk in a commit_transaction which could take as long as commit_interval. In the meantime there could be deletions which are not accounted for because deletions are accounted for only while committed (free_refroot). So, when we get a EDQUOT flush the data to disk and try again. This fixes fstests btrfs/139. Here is a sample script which shows this issue. DEVICE=/dev/vdb MOUNTPOINT=/mnt TESTVOL=$MOUNTPOINT/tmp QUOTA=5 PROG=btrfs DD_BS="4k" DD_COUNT="256" RUN_TIMES=5000 mkfs.btrfs -f $DEVICE mount -o commit=240 $DEVICE $MOUNTPOINT $PROG subvolume create $TESTVOL $PROG quota enable $TESTVOL $PROG qgroup limit ${QUOTA}G $TESTVOL typeset -i DD_RUN_GOOD typeset -i QUOTA function _check_cmd() { if [[ ${?} > 0 ]]; then echo -n "$(date) E: Running previous command" echo ${*} echo "Without sync" $PROG qgroup show -pcreFf ${TESTVOL} echo "With sync" $PROG qgroup show -pcreFf --sync ${TESTVOL} exit 1 fi } while true; do DD_RUN_GOOD=$RUN_TIMES while (( ${DD_RUN_GOOD} != 0 )); do dd if=/dev/zero of=${TESTVOL}/quotatest${DD_RUN_GOOD} bs=${DD_BS} count=${DD_COUNT} _check_cmd "dd if=/dev/zero of=${TESTVOL}/quotatest${DD_RUN_GOOD} bs=${DD_BS} count=${DD_COUNT}" DD_RUN_GOOD=(${DD_RUN_GOOD}-1) done $PROG qgroup show -pcref $TESTVOL echo "----------- Cleanup ---------- " rm $TESTVOL/quotatest* done Signed-off-by:
Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Edmund Nadolski authored
Define the SEQ_LAST macro to replace (u64)-1 in places where said value triggers a special-case ref search behavior. Signed-off-by:
Edmund Nadolski <enadolski@suse.com> Reviewed-by:
Jeff Mahoney <jeffm@suse.com> Reviewed-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
The members have been effectively unused since "Btrfs: rework qgroup accounting" (fcebe456), there's no substitute for assert_qgroups_uptodate so it's removed as well. Reviewed-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 29 Mar, 2017 1 commit
-
-
Goldwyn Rodrigues authored
Using an int value is causing qg->reserved to become negative and exclusive -EDQUOT to be reached prematurely. This affects exclusive qgroups only. TEST CASE: DEVICE=/dev/vdb MOUNTPOINT=/mnt SUBVOL=$MOUNTPOINT/tmp umount $SUBVOL umount $MOUNTPOINT mkfs.btrfs -f $DEVICE mount /dev/vdb $MOUNTPOINT btrfs quota enable $MOUNTPOINT btrfs subvol create $SUBVOL umount $MOUNTPOINT mount /dev/vdb $MOUNTPOINT mount -o subvol=tmp $DEVICE $SUBVOL btrfs qgroup limit -e 3G $SUBVOL btrfs quota rescan /mnt -w for i in `seq 1 44000`; do dd if=/dev/zero of=/mnt/tmp/test_$i bs=10k count=1 if [[ $? > 0 ]]; then btrfs qgroup show -pcref $SUBVOL exit 1 fi done Signed-off-by:
Goldwyn Rodrigues <rgoldwyn@suse.com> [ add reproducer to changelog ] Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 17 Feb, 2017 12 commits
-
-
Qu Wenruo authored
Just as Filipe pointed out, the most time consuming parts of qgroup are btrfs_qgroup_account_extents() and btrfs_qgroup_prepare_account_extents(). Which both call btrfs_find_all_roots() to get old_roots and new_roots ulist. What makes things worse is, we're calling that expensive btrfs_find_all_roots() at transaction committing time with TRANS_STATE_COMMIT_DOING, which will blocks all incoming transaction. Such behavior is necessary for @new_roots search as current btrfs_find_all_roots() can't do it correctly so we do call it just before switch commit roots. However for @old_roots search, it's not necessary as such search is based on commit_root, so it will always be correct and we can move it out of transaction committing. This patch moves the @old_roots search part out of commit_transaction(), so in theory we can half the time qgroup time consumption at commit_transaction(). But please note that, this won't speedup qgroup overall, the total time consumption is still the same, just reduce the performance stall. Cc: Filipe Manana <fdmanana@suse.com> Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by:
Filipe Manana <fdmanana@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
Never used. Reviewed-by:
Liu Bo <bo.li.liu@oracle.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
Added but never needed. Reviewed-by:
Liu Bo <bo.li.liu@oracle.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
Change the name so it matches the naming we already use eg. for btrfs_path. Suggested-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
There was never need for RCU protection around reading nodesize or other fairly constant filesystem data. Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
The helper name is not too helpful and is just wrapping a simple call. Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
Status of quotas should be the first check in btrfs_qgroup_account_extent and we can return immediatelly, no need to do no-op ulist frees. Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
We can embed range_changed to the extent changeset to address following problems: - no need to allocate ulist dynamically, we also get rid of the GFP_NOFS for free - fix lack of allocation failure checking in btrfs_qgroup_reserve_data The stack consuption where extent_changeset is used slightly increases: before: 16 after: 16 - 8 (for pointer) + 32 (sizeof ulist) = 40 Which is bearable. Reviewed-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
Internal helper. Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
Qgroup relations are added/deleted from ioctl, we hold the high level qgroup lock, no deadlocks or recursion from the allocation possible here. Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
We don't need to use GFP_NOFS here as this is called from ioctls an the only lock held is the subvol_sem, which is of a high level and protects creation/renames/deletion and is never held in the writeout paths. Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
The qgroup config is read during mount, we do not have to use NOFS. Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 14 Feb, 2017 2 commits
-
-
Jeff Mahoney authored
Once a qgroup limit is exceeded, it's impossible to restore normal operation to the subvolume without modifying the limit or removing the subvolume. This is a surprising situation for many users used to the typical workflow with quotas on other file systems where it's possible to remove files until the used space is back under the limit. When we go to unlink a file and start the transaction, we'll hit the qgroup limit while trying to reserve space for the items we'll modify while removing the file. We discussed last month how best to handle this situation and agreed that there is no perfect solution. The best principle-of-least-surprise solution is to handle it similarly to how we already handle ENOSPC when unlinking, which is to allow the operation to succeed with the expectation that it will ultimately release space under most circumstances. This patch modifies the transaction start path to select whether to honor the qgroups limits. btrfs_start_transaction_fallback_global_rsv is the only caller that skips enforcement. The reservation and tracking still happens normally -- it just skips the enforcement step. Signed-off-by:
Jeff Mahoney <jeffm@suse.com> Reviewed-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
Goldwyn Rodrigues has exposed and fixed a bug which underflows btrfs qgroup reserved space, and leads to non-writable fs. This reminds us that we don't have enough underflow check for qgroup reserved space. For underflow case, we should not really underflow the numbers but warn and keeps qgroup still work. So add more check on qgroup reserved space and add WARN_ON() and btrfs_warn() for any underflow case. Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by:
David Sterba <dsterba@suse.com> Reviewed-by:
Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 06 Dec, 2016 4 commits
-
-
Jeff Mahoney authored
Now we only use the root parameter to print the root objectid in a tracepoint. We can use the root parameter from the transaction handle for that. It's also used to join the transaction with async commits, so we remove the comment that it's just for checking. Signed-off-by:
Jeff Mahoney <jeffm@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Jeff Mahoney authored
There are loads of functions in btrfs that accept a root parameter but only use it to obtain an fs_info pointer. Let's convert those to just accept an fs_info pointer directly. Signed-off-by:
Jeff Mahoney <jeffm@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Jeff Mahoney authored
In routines where someptr->fs_info is referenced multiple times, we introduce a convenience variable. This makes the code considerably more readable. Signed-off-by:
Jeff Mahoney <jeffm@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Jeff Mahoney authored
We track the node sizes per-root, but they never vary from the values in the superblock. This patch messes with the 80-column style a bit, but subsequent patches to factor out root->fs_info into a convenience variable fix it up again. Signed-off-by:
Jeff Mahoney <jeffm@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
- 30 Nov, 2016 3 commits
-
-
Qu Wenruo authored
Move account_shared_subtree() to qgroup.c and rename it to btrfs_qgroup_trace_subtree(). Do the same thing for account_leaf_items() and rename it to btrfs_qgroup_trace_leaf_items(). Since all these functions are only for qgroup, move them to qgroup.c and export them is more appropriate. Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-and-Tested-by:
Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
Qu Wenruo authored
Rename btrfs_qgroup_insert_dirty_extent(_nolock) to btrfs_qgroup_trace_extent(_nolock), according to the new reserve/trace/account naming schema. Signed-off-by:
Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-and-Tested-by:
Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by:
David Sterba <dsterba@suse.com>
-
David Sterba authored
The helpers are not meant to be generic, the name is misleading. Convert them to static inlines for type checking. Signed-off-by:
David Sterba <dsterba@suse.com>
-