- 19 Jun, 2017 40 commits
-
-
David Sterba authored
As we don't use vmalloc/vzalloc/vfree directly in ctree.c, we can now use the proper header that defines kvmalloc. Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
Now that init_ipath is called either from a safe context or with memalloc_nofs protection, we can switch to GFP_KERNEL allocations in init_path and init_data_container. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
init_ipath is called from a safe ioctl context and from scrub when printing an error. The protection is added for three reasons: * init_data_container calls vmalloc and this does not work as expected in the GFP_NOFS context, so this silently does GFP_KERNEL and might deadlock in some cases * keep the context constraint of GFP_NOFS, used by scrub * we want to use GFP_KERNEL unconditionally inside init_ipath or its callees Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
We use a growing buffer for xattrs larger than a page size, at some point vmalloc is unconditionally used for larger buffers. We can still try to avoid it using the kvmalloc helper. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
The logic of kmalloc and vmalloc fallback is opencoded in several places, we can now use the existing helper. Signed-off-by: David Sterba <dsterba@suse.com>
-
Timofey Titovets authored
Logic already skips if compression makes data bigger, let's sync lzo with zlib and also return error if compressed size is equal to input size. Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
-
Guoqing Jiang authored
bio_io_error was introduced in the commit 4246a0b6 ("block: add a bi_error field to struct bio"), so use it to simplify code. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Omar Sandoval authored
First, instead of open-coding the vmalloc() fallback, use the new kvzalloc() helper. Second, use memalloc_nofs_{save,restore}() instead of GFP_NOFS, as vmalloc() uses some GFP_KERNEL allocations internally which could lead to deadlocks. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
Observing the number of slab objects of btrfs_transaction, there's just one active on an almost quiescent filesystem, and the number of objects goes to about ten when sync is in progress. Then the nubmer goes down to 1. This matches the expectations of the transaction lifetime. For such use the separate slab cache is not justified, as we do not reuse objects frequently. For the shortlived transaction, the generic slab (size 512) should be ok. We can optimistically expect that the 512 slabs are not all used (fragmentation) and there are free slots to take when we do the allocation, compared to potentially allocating a whole new page for the separate slab. We'll lose the stats about the object use, which could be added later if we really need them. Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
The structure scrub_wr_ctx is not used anywhere just the scrub context, we can move the members there. The tgtdev is renamed so it's more clear that it belongs to the "wr" part. Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
As we now have the node/block sizes in fs_info, we can use them and can drop the local copies. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Yonghong Song authored
Return enhanced file attributes from the btrfs, including: (1). inode creation time as stx_btime, and (2). Certain BTRFS_INODE_xxx flags are mapped to stx_attributes flags. Example output: [root@localhost ~]# cat t.sh touch t chattr +aic t ~/linux/samples/statx/test-statx t chattr -aic t touch t echo "========================================" ~/linux/samples/statx/test-statx t /bin/rm t [root@localhost ~]# ./t.sh statx(t) = 0 results=fff Size: 0 Blocks: 0 IO Block: 4096 regular file Device: 00:1c Inode: 63962 Links: 1 Access: (0644/-rw-r--r--) Uid: 0 Gid: 0 Access: 2017-05-11 16:03:13.999856591-0700 Modify: 2017-05-11 16:03:13.999856591-0700 Change: 2017-05-11 16:03:14.000856663-0700 Birth: 2017-05-11 16:03:13.999856591-0700 Attributes: 0000000000000034 (........ ........ ........ ........ ........ ........ ........ .-ai.c..) ======================================== statx(t) = 0 results=fff Size: 0 Blocks: 0 IO Block: 4096 regular file Device: 00:1c Inode: 63962 Links: 1 Access: (0644/-rw-r--r--) Uid: 0 Gid: 0 Access: 2017-05-11 16:03:14.006857097-0700 Modify: 2017-05-11 16:03:14.006857097-0700 Change: 2017-05-11 16:03:14.006857097-0700 Birth: 2017-05-11 16:03:13.999856591-0700 Attributes: 0000000000000000 (........ ........ ........ ........ ........ ........ ........ .---.-..) [root@localhost ~]# Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Timofey Titovets authored
Fix copy paste typo in debug message for lzo.c, lzo is not deflate. Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Jeff Layton authored
Nothing checks its return value. Is it safe to skip checking return value of btrfs_wait_tree_block_writeback? Liu Bo: I think yes, it's used in walk_log_tree which is called in two places, free_log_tree and log replay. For free_log_tree, it waits for any running writeback of the extent buffer under freeing to finish in case we need to access the eb pointer from page->private, and it's OK to not check the return value, while for log replay, it's doesn't wait because wc->wait is not set. So neither cares about the writeback error. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> [ added more explanation to changelog, from Liu Bo ] Signed-off-by: David Sterba <dsterba@suse.com>
-
Nikolay Borisov authored
__BTRFS_LAF_DATA_SIZE is used only by BTRFS_LEAF_DATA_SIZE. Make the latter subsume the former. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Nikolay Borisov authored
Commit 5f39d397 ("Btrfs: Create extent_buffer interface for large blocksizes") refactored btrfs_leaf_data function to take extent_buffer rather than struct btrfs_leaf. However, as it turns out the parameter being passed is never used. Furthermore this function no longer returns the leaf data but rather the offset to it. So rename the function to BTRFS_LEAF_DATA_OFFSET to make it consistent with other BTRFS_LEAF_* helpers and turn it into a macro. Signed-off-by: Nikolay Borisov <nborisov@suse.com> [ removed () from the macro ] Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
struct compressed_bio pointer can be used instead. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Anand Jain authored
Instead of sending each argument of struct compressed_bio, send the compressed_bio itself. Also by having struct compressed_bio in btrfs_decompress_bio() it would help tracing. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Nikolay Borisov authored
Following the factoring out of the creation code udpate_space_info can only be called for already-existing space_info structs. As such it cannot fail. Remove superfluous error handling and make the function return void. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Nikolay Borisov authored
Currently the struct space_info creation code is intermixed in the udpate_space_info function. There are well-defined points at which the we actually want to create brand-new space_info structs (e.g. during mount of the filesystem as well as sometimes when adding/initialising new chunks). In such cases update_space_info is called with 0 as the bytes parameter. All of this makes for spaghetti code. Fix it by factoring out the creation code in a separate create_space_info structure. This also allows to simplify the internals. Also remove BUG_ON from do_alloc_chunk since the callers handle errors. Furthermore it will make the update_space_info function not fail, allowing us to remove error handling in callers. This will come in a follow up patch. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Liu Bo authored
This adds chunk_objectid and flags, with flags we can recognize whether the block group is about data or metadata. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Liu Bo authored
We commit transaction in order to reclaim space from pinned bytes because it could process delayed refs, and in may_commit_transaction(), we check first if pinned bytes are enough for the required space, we then check if that plus bytes reserved for delayed insert are enough for the required space. This changes the code to the above logic. Fixes: b150a4f1 ("Btrfs: use a percpu to keep track of possibly pinned bytes") Tested-by: Nikolay Borisov <nborisov@suse.com> Reported-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
We don't need to take the mutex and zero out wr_cur_bio, as this is called after the scrub finished. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
The helper scrub_free_wr_ctx is used only once and fits into scrub_free_ctx as it continues sctx shutdown, no need to keep it separate. Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
The helper scrub_setup_wr_ctx is used only once and fits into scrub_setup_ctx as it continues intialization, no need to keep it separate. Signed-off-by: David Sterba <dsterba@suse.com>
-
Jeff Mahoney authored
can_overcommit using the root to determine the allocation profile is the only use of a root in the call graph below reserve_metadata_bytes. It turns out that we only need to know whether the allocation is for the chunk root or not -- and we can pass that around as a bool instead. This allows us to pull root usage out of the reservation path all the way up to reserve_metadata_bytes itself, which uses it only to compare against fs_info->chunk_root to set the bool. In turn, this eliminates a bunch of races where we use a particular root too early in the mount process. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Jeff Mahoney authored
There are two places where we don't already know what kind of alloc profile we need before calling btrfs_get_alloc_profile, but we need access to a root everywhere we call it. This patch adds helpers for btrfs_{data,metadata,system}_alloc_profile() and relegates btrfs_system_alloc_profile to a static for use in those two cases. The next patch will eliminate one of those. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
We use only a simple bool indicator, int is not a problem here. Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
The end io work queue items have been tracked by the work queues since "Btrfs: Add async worker threads for pre and post IO checksumming" (8b712842) (2008). Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
The two members do not seem to be used since the initial commit. Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
The list used to track checksums in the early version (2.6.29), but I was able not pinpoint the commit that stopped using it. Everything apparently works without it for a long time. Signed-off-by: David Sterba <dsterba@suse.com>
-
David Sterba authored
Seems to be unused since the initial commit, we ignore readahead errors anyway, the full read will handle that if necessary. Signed-off-by: David Sterba <dsterba@suse.com>
-
Sahil Kang authored
Both btrfs_create_free_space_tree and btrfs_clear_free_space_tree contain: if (ret) return ret; return 0; The if statement is only false when ret equals zero, and since we return zero in such cases, we can safely remove the branching. Signed-off-by: Sahil Kang <sahil.kang@asilaycomputing.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Liu Bo authored
We only pass GFP_NOFS to btrfs_bio_clone_partial, so lets hardcode it. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Arnd Bergmann authored
A rewrite of btrfs_submit_direct_hook appears to have introduced a warning: fs/btrfs/inode.c: In function 'btrfs_submit_direct_hook': fs/btrfs/inode.c:8467:14: error: 'bio' may be used uninitialized in this function [-Werror=maybe-uninitialized] Where the 'bio' variable was previously initialized unconditionally, it is now set in the "while (submit_len > 0)" loop that would never execute if submit_len is zero. Assuming this cannot happen in practice, we can avoid the warning by simply replacing the while{} loop with a do{}while() loop so the compiler knows that it will always be entered at least once. Fixes changes introduced in "Btrfs: use bio_clone_bioset_partial to simplify DIO submit". Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Sterba <dsterba@suse.com>
-
Liu Bo authored
All dio endio functions are using io_bio for struct btrfs_io_bio, this makes btrfs_submit_direct to follow this convention. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Liu Bo authored
Some check-integrity code depends on bio->bi_vcnt, this changes it to use bio segments because some bios passing here may not have a reliable bi_vcnt. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Liu Bo authored
In the nocsum case of dio read endio, it returns immediately if an error gets returned when repairing, which leaves the rest blocks unrepaired. The behavior is different from how buffered read endio works in the same case. This changes it to record error only and go on repairing the rest blocks. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Liu Bo authored
Since dio submit has used bio_clone_fast, the submitted bio may not have a reliable bi_vcnt, for the bio vector iterations in checksum related functions, bio->bi_iter is not modified yet and it's safe to use bio_for_each_segment, while for those bio vector iterations in dio read's endio, we now save a copy of bvec_iter in struct btrfs_io_bio when cloning bios and use the helper __bio_for_each_segment with the saved bvec_iter to access each bvec. Also for dio reads which don't get split, we also need to save a copy of bio iterator in btrfs_bio_clone to let __bio_for_each_segments to access each bvec in dio read's endio. Note that it doesn't affect other calls of btrfs_bio_clone() because they don't need to use this iterator. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Liu Bo authored
Currently when mapping bio to limit bio to a single stripe length, we split bio by adding page to bio one by one, but later we don't modify the vector of bio at all, thus we can use bio_clone_fast to use the original bio vector directly. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-