Commits · 9c106405ddf893fcd04cd46555464417d2df8451 · Kirill Smelkov / linux

15 Jun, 2012 19 commits

Btrfs: update MAINTAINERS info for BTRFS FILE SYSTEM · 9c106405

Liu Bo authored Jun 14, 2012

Update the btrfs maintainer's mail address and git repo to the latest ones.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: destroy the items of the delayed inodes in error handling routine · 67cde344

Miao Xie authored Jun 14, 2012

The items of the delayed inodes were never freed in the error handling
routine; this patch fixes that.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: make sure that we've made everything in pinned tree clean · ed0eaa14

Liu Bo authored Jun 14, 2012

Since we have two trees for recording pinned extents, we need to go through
both of them to make sure that everything has been cleaned up.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: avoid memory leak of extent state in error handling routine · 6e841e32

Liu Bo authored Jun 14, 2012

We've forgotten to clear extent states in the pinned tree, which results in
a space counter mismatch and a memory leak:

WARNING: at fs/btrfs/extent-tree.c:7537 btrfs_free_block_groups+0x1f3/0x2e0 [btrfs]()
...
space_info 2 has 8380416 free, is not full
space_info total=12582912, used=4096, pinned=4096, reserved=0, may_use=0, readonly=4194304
btrfs state leak: start 29364224 end 29376511 state 1 in tree ffff880075f20090 refs 1
...
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: do not resize a seeding device · 4e42ae1b

Liu Bo authored Jun 14, 2012

Seeding devices are not supposed to change any more.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

Btrfs: fix missing inherited flag in rename · bc178237

Liu Bo authored Jun 14, 2012

When we move a file into a directory with the compression flag, we need to
inherit BTRFS_INODE_COMPRESS and clear BTRFS_INODE_NOCOMPRESS as well.
But if we move a file into a directory without the compression flag, we need
to clear both of them.

This is how our setflags deals with the compression flag, so keep
the same behaviour here.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
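
The rule described in this message can be sketched in a few lines of userspace C. This is only an illustration: `struct toy_inode`, `toy_inherit_compress()` and the two bit values are hypothetical stand-ins, not the kernel's definitions or the code of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values only; the real flags are defined in the btrfs headers. */
#define TOY_INODE_NOCOMPRESS (1u << 0)
#define TOY_INODE_COMPRESS   (1u << 1)

/* Hypothetical, heavily reduced inode: just a flags word. */
struct toy_inode {
	uint32_t flags;
};

/* Apply the destination directory's compression policy to a moved inode. */
static void toy_inherit_compress(struct toy_inode *inode,
				 const struct toy_inode *dir)
{
	assert(inode != NULL && dir != NULL);

	if (dir->flags & TOY_INODE_COMPRESS) {
		/* Directory compresses: inherit COMPRESS, drop NOCOMPRESS. */
		inode->flags &= ~TOY_INODE_NOCOMPRESS;
		inode->flags |= TOY_INODE_COMPRESS;
	} else {
		/* Directory has no compression flag: clear both bits. */
		inode->flags &= ~(TOY_INODE_NOCOMPRESS | TOY_INODE_COMPRESS);
	}
}
```

Unrelated flag bits are left untouched in both branches, so only the compression state follows the destination directory.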

Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into for-linus · acbcabd2

Chris Mason authored Jun 14, 2012

Btrfs: fix incompat flags setting · 69e380d1

Li Zefan authored Jun 11, 2012

It's a bug, but it happens to work, as BTRFS_COMPRESS_LZO == 2, which
has only one bit set.
Signed-off-by: Li Zefan <lizefan@huawei.com>
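
The message does not show the code involved. One plausible reading, offered here as an assumption, is that an enumerated compression type was tested with a bitwise AND instead of being compared for equality. The sketch below shows why such a test can go unnoticed: all `toy_` names are hypothetical, and only the value 2 for LZO comes from the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values; only LZO == 2 is stated in the commit message. */
enum toy_compress_type {
	TOY_COMPRESS_NONE = 0,
	TOY_COMPRESS_ZLIB = 1,	/* assumed value */
	TOY_COMPRESS_LZO  = 2,
	TOY_COMPRESS_NEXT = 3,	/* hypothetical future type */
};

/* Questionable form: treats an enumerated value as if it were a bit mask. */
static bool is_lzo_bitwise(enum toy_compress_type type)
{
	return (type & TOY_COMPRESS_LZO) != 0;
}

/* Intended form: compare the value itself. */
static bool is_lzo_equal(enum toy_compress_type type)
{
	return type == TOY_COMPRESS_LZO;
}

/* Count the type values in [0, limit) on which the two tests disagree. */
static int count_disagreements(int limit)
{
	int n = 0;

	assert(limit >= 0);
	for (int i = 0; i < limit; i++) {
		enum toy_compress_type type = (enum toy_compress_type)i;

		if (is_lzo_bitwise(type) != is_lzo_equal(type))
			n++;
	}
	return n;
}
```

With only the values 0, 1 and 2 in use the two tests agree everywhere, which is why the bitwise form "happens to work"; a value such as 3 would be misclassified as LZO.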

Btrfs: fix defrag regression · 6c282eb4

Li Zefan authored Jun 11, 2012

If a file has 3 small extents:

| ext1 | ext2 | ext3 |

Running "btrfs fi defrag" will only defrag the last two extents, if those
extent mappings haven't been read into memory from disk.

This bug was introduced by commit 17ce6ef8
("Btrfs: add a check to decide if we should defrag the range")

The cause is that the commit looked into the previous and next extents using
lookup_extent_mapping() only.

While at it, remove the code that checks the previous extent, since
it's sufficient to check the next extent.
Signed-off-by: Li Zefan <lizefan@huawei.com>

Btrfs: call filemap_fdatawrite twice for compression · 7ddf5a42

Josef Bacik authored Jun 08, 2012

I removed this in an earlier commit and I was wrong. Because compression
can return from filemap_fdatawrite() without having actually set any of its
pages as writeback() it can make filemap_fdatawait() do essentially nothing,
and then we won't find any ordered extents because they may not have been
created yet. So not only does this make fsync() completely useless, but it
will also screw up if you truncate on a non-page aligned offset since we
zero out the end and then wait on ordered extents and then call drop caches.
We can drop the cache before the io completes, and then when we try to unpin
the extent we just wrote we won't find it and everything goes sideways. So fix
this by putting it back and put a giant comment there to keep me from trying
to remove it in the future. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: keep inode pinned when compressing writes · 8180ef88

Josef Bacik authored Jun 08, 2012

A user reported lots of problems using compression on the new code and it
turns out part of the problem was that igrab() was failing when we added a
new ordered extent. This is because when writing out an inode under
compression we immediately return without actually doing anything to the
pages, and then in another thread at some point down the line actually do
the ordered dance. The problem is between the point that we start writeback
and we actually add the ordered extent we could be trying to reclaim the
inode, which makes igrab() return NULL. So we need to do an igrab() when we
create the async extent and then drop it when we are done with it. This
makes sure we stay pinned in memory until the ordered extent can get a
reference on it and we are good to go. With this patch we no longer panic
in btrfs_finish_ordered_io(). Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: implement ->show_devname · 9c5085c1

Josef Bacik authored Jun 05, 2012

Because btrfs can remove the device that was mounted we need to have a
->show_devname so that in this case we can print out some other device in
the file system to /proc/mounts.  So if there are multiple devices in a btrfs
file system we will just print the device with the lowest devid that we can
find.  This will make everything consistent and deal with device removal
properly.  The drawback is if you mount with a device that is higher than
the lowest devid it won't show up as the mounted device in /proc/mounts,
but this is a small price to pay. This was inspired by Miao Xie's patch.
Thanks,
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
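
The selection rule described above (print the device with the lowest devid that can be found) might look roughly like this in userspace C. `struct toy_device` and `toy_lowest_devid()` are hypothetical, and skipping devices without a name is an assumption of this sketch rather than something the message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced device record. */
struct toy_device {
	uint64_t devid;
	const char *name;	/* NULL models a device with no usable name */
};

/*
 * Return the named device with the lowest devid, or NULL if there is none.
 * The result does not depend on which device was passed to mount.
 */
static const struct toy_device *toy_lowest_devid(const struct toy_device *devs,
						 size_t ndevs)
{
	const struct toy_device *best = NULL;

	assert(devs != NULL || ndevs == 0);
	for (size_t i = 0; i < ndevs; i++) {
		if (devs[i].name == NULL)
			continue;
		if (best == NULL || devs[i].devid < best->devid)
			best = &devs[i];
	}
	return best;
}
```

Because the choice depends only on the current device list, removing the originally mounted device simply changes which entry wins.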

Btrfs: use rcu to protect device->name · 606686ee

Josef Bacik authored Jun 04, 2012

Al pointed out that we can just toss out the old name on a device and add a
new one arbitrarily, so anybody who uses device->name in printk could
possibly use free'd memory. Instead of adding locking around all of this he
suggested doing it with RCU, so I've introduced a struct rcu_string that
does just that and have gone through and protected all accesses to
device->name that aren't under the uuid_mutex with rcu_read_lock(). This
protects us and I will use it for dealing with removing the device that we
used to mount the file system in a later patch. Thanks,
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: unlock everything properly in the error case for nocow · 17ca04af

Josef Bacik authored May 31, 2012

I was getting hung on umount when a transaction was aborted because a range
of one of the free space inodes was still locked. This is because the nocow
stuff doesn't unlock anything on error. This fixed the problem and I
verified that is what was happening. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: fix btrfs_destroy_marked_extents · ee670f0a

Josef Bacik authored May 31, 2012

So we're forcing the eb's to have their ref count set to 1 so invalidatepage
works but this breaks lots of things, for example root nodes, and is just
plain wrong, we don't need to just evict all of this stuff. Also drop the
invalidatepage altogether and add a page_cache_release(). With this patch
we no longer hang when trying to access the root nodes after an aborted
transaction and we no longer leak memory. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: abort the transaction if the commit fails · 7b8b92af

Josef Bacik authored May 31, 2012

If a transaction commit fails we don't abort it so we don't set an error on
the file system. This patch fixes that by actually calling the abort stuff
and then adding a check for a fs error in the transaction start stuff to
make sure it is caught properly. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: wake up transaction waiters when aborting a transaction · d7096fc3

Josef Bacik authored May 31, 2012

I was getting lots of hung tasks and a NULL pointer dereference because we
are not cleaning up the transaction properly when it aborts. First we need
to reset the running_transaction to NULL so we don't get a bad dereference
for any start_transaction callers after this. Also we cannot rely on
waitqueue_active() since it's just a list_empty(), so just call wake_up()
directly since that will do the barrier for us and such. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: fix locking in btrfs_destroy_delayed_refs · b939d1ab

Josef Bacik authored May 31, 2012

The transaction abort stuff was throwing warnings from the list debugging
code because we do a list_del_init outside of the delayed_refs spin lock.
The delayed refs locking makes baby Jesus cry so it's not hard to get wrong,
but we need to take the ref head mutex to make sure it's not being processed
currently, and so if it is we need to drop the spin lock and then take and
drop the mutex and do the search again. If we can take the mutex then we
can safely remove the head from the list and carry on. Now when the
transaction aborts I don't get the list debugging warnings. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: pass locked_page into extent_clear_unlock_delalloc if theres an error · beb42dd7

Josef Bacik authored May 30, 2012

While doing my enospc work I got a transaction abortion that resulted in a
panic when we tried to unlock_page() an already unlocked page.  This is
because we aren't calling extent_clear_unlock_delalloc with the locked page
so it was unlocking all the pages in the range.  This is wrong since
__extent_writepage expects to have the page locked still unless we return
*page_started as 1.  This should keep us from panicking.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

14 Jun, 2012 5 commits

Btrfs: fix race in tree mod log addition · 3310c36e

Jan Schmidt authored Jun 11, 2012

When adding to the tree modification log, we grab two locks at different
stages. We must not drop the outer lock until we're done with the section
protected by the inner lock. This moves the unlock call for the outer lock
to the appropriate position.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

Btrfs: add btrfs_next_old_leaf · 3d7806ec

Jan Schmidt authored Jun 11, 2012

To make sense of the tree mod log, the backref walker not only needs
btrfs_search_old_slot, but it also called btrfs_next_leaf, which in turn was
calling btrfs_search_slot. This obviously didn't give the correct result.

This commit adds btrfs_next_old_leaf, a drop-in replacement for
btrfs_next_leaf with a time_seq parameter. If it is zero, it behaves exactly
like btrfs_next_leaf. If it is non-zero, it will use btrfs_search_old_slot
with this time_seq parameter.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

Btrfs: fix return value for __tree_mod_log_oldest_root · a95236d9

Jan Schmidt authored Jun 05, 2012

In __tree_mod_log_oldest_root() we must return the found operation even if
it's not a ROOT_REPLACE operation. Otherwise, the caller assumes that there
are no operations to be rewound and returns immediately.

The code in the caller is modified to improve readability.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

Btrfs: use btrfs_read_lock_root_node in get_old_root · 8ba97a15

Jan Schmidt authored Jun 04, 2012

get_old_root could race with root node updates because we weren't locking
the node early enough. Use btrfs_read_lock_root_node to grab the root locked
in the very beginning and release the lock as soon as possible (just like
btrfs_search_slot does).
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

Btrfs: remove obsolete btrfs_next_leaf call from __resolve_indirect_ref · f617e2fd

Jan Schmidt authored Jun 14, 2012

When resolving indirect refs, we used to call btrfs_next_leaf in case we
didn't find an exact match. While we should find exact matches most of the
time, in case we don't, we must continue searching. Treating those matches
differently depending on the level we're searching doesn't make sense.

Even worse, we might end up searching for a key larger than the largest, in
which case there is no next_leaf and subsequent jobs would fail. This commit
drops the bogus lines.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

04 Jun, 2012 1 commit

Btrfs: remove call to btrfs_header_nritems with no effect · 4d5a0565

Jan Schmidt authored Apr 30, 2012

This is a leftover from cleanup patch 559af821. Before the cleanup,
btrfs_header_nritems was called inside an if condition. As it has no side
effects we need to preserve here, it should simply be dropped.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

31 May, 2012 6 commits

Merge branch 'for-chris' of git://git.jan-o-sch.net/btrfs-unstable into for-linus · 1e20932a

Chris Mason authored May 31, 2012
```
Conflicts:
	fs/btrfs/ulist.h
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```

Btrfs: fix tree mod log rewinded level and rewinding of moved keys · c3193108

Jan Schmidt authored May 31, 2012

When we rewind REMOVE_WHILE_FREEING operations, there's code that allocates
a fresh buffer instead of cloning the old one. Setting that buffer's level
correctly was missing in this case.

When rewinding a MOVE_KEYS operation, btrfs_node_key_ptr_offset(slot) was
missing for memmove_extent_buffer()'s arguments.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

Btrfs: fix tree mod log del_ptr · f395694c

Jan Schmidt authored May 31, 2012

Logging for del_ptr when we're not deleting the last pointer was wrong. This
fixes both, duplicate log entries and log sequence.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

Btrfs: add tree_mod_dont_log helper · e9b7fd4d

Jan Schmidt authored May 31, 2012

Replace duplicate code by small inline helper function.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

Btrfs: add missing spin_lock for insertion into tree mod log · 926dd8a6

Jan Schmidt authored May 31, 2012

tree_mod_alloc calls __get_tree_mod_seq and must acquire a spinlock before
doing so.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

Btrfs: add inodes before dropping the extent lock in find_all_leafs · 3301958b

Jan Schmidt authored May 30, 2012

We must build up the inode list with the extent lock held after following
indirect refs.

This also requires an extension to ulists that allows modifying the stored
aux value in case a key already exists in the list.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
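
The ulist extension mentioned above can be pictured as an "add or hand back the existing entry" operation. The following is a minimal userspace sketch under that assumption; `struct toy_ulist`, `toy_ulist_add_merge()` and the fixed-size array are hypothetical and do not reproduce the kernel's ulist interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ULIST_MAX 16	/* arbitrary capacity for the sketch */

struct toy_ulist_node {
	uint64_t val;	/* key */
	uint64_t aux;	/* caller-defined payload */
};

struct toy_ulist {
	struct toy_ulist_node nodes[TOY_ULIST_MAX];
	size_t nnodes;
};

/*
 * Add val with aux unless val is already present.
 * Returns 1 if added, 0 if val already existed, -1 if the list is full.
 * When val already exists and old_aux is non-NULL, *old_aux is pointed at
 * the stored aux so the caller can update it in place.
 */
static int toy_ulist_add_merge(struct toy_ulist *ul, uint64_t val,
			       uint64_t aux, uint64_t **old_aux)
{
	assert(ul != NULL);

	for (size_t i = 0; i < ul->nnodes; i++) {
		if (ul->nodes[i].val == val) {
			if (old_aux != NULL)
				*old_aux = &ul->nodes[i].aux;
			return 0;
		}
	}
	if (ul->nnodes == TOY_ULIST_MAX)
		return -1;

	ul->nodes[ul->nnodes].val = val;
	ul->nodes[ul->nnodes].aux = aux;
	ul->nnodes++;
	return 1;
}
```

A caller that sees a return value of 0 can merge its new information into the existing aux value instead of losing it, which is the capability the message asks for.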

30 May, 2012 9 commits

Btrfs: use delayed ref sequence numbers for all fs-tree updates · 95a06077

Jan Schmidt authored May 29, 2012

The sequence number for delayed refs is needed to postpone certain delayed
refs for a very short period while walking backrefs. Before the tree
modification log, we thought we'd only have to hold back those references
that don't have a counter operation.

Now that we have the tree mod log, we're rewinding fs tree blocks to a
defined consistent state. We cannot know in advance for which tree block
we'll be doing rewind operations later. Therefore, we must postpone all the
delayed refs for fs-tree blocks, even those having a counter operation.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

Merge branch 'for-chris' of... · cfc442b6

Chris Mason authored May 30, 2012

Merge branch 'for-chris' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next into HEAD

Btrfs: fix false positive in check-integrity on unmount · 48235a68

Stefan Behrens authored May 23, 2012

During unmount, it could happen that the integrity checker printed a
warning message "attempt to free ... on umount which is not yet iodone"
which turned out to be a false positive.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>

Btrfs: fix runtime warning in check-integrity check data mode · 86ff7ffc

Stefan Behrens authored Apr 24, 2012

If a file_extent_item was located at the very end of a leaf and there was
not enough space to hold a full item, but there was enough space to hold
one of type BTRFS_FILE_EXTENT_INLINE or PREALLOC, and it was only such a
short item, a warning was printed anyway. This check is now fixed.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>

Btrfs: set ioprio of scrub readahead to idle · 3d136a11

Stefan Behrens authored Feb 03, 2012

Reduce ioprio class of scrub readahead threads to idle priority.
This setting is fixed. This priority has shown the best performance
during all measurements.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>

Btrfs: fix return code in drop_objectid_items · 5bdbeb21

Josef Bacik authored May 29, 2012

So dpkg fsync()'s the file and the directory containing the file whenever it
writes to a file which is really slow in btrfs. This is partly because
fsync()'ing a directory _always_ committed the transaction instead of just
going to the tree log. This is because drop_objectid_items() would return 1
since it does a btrfs_search_slot() which returns 1. In tree-log jargon
this means that we have to commit the transaction to be safe. So just check
if ret is greater than 0 and set it to 0 if it is. With this patch we now
use the tree-log instead of committing the entire transaction, which is
twice as fast on my box. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
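
The return-code convention at issue can be shown with a toy model. Here a search returns a negative value on error, 0 on an exact match and 1 when the key is absent, while the caller of the drop helper treats any positive value as "commit the transaction". `toy_search_slot()` and `toy_drop_items()` are hypothetical stand-ins, not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy search: <0 on error, 0 on exact match, 1 when the key is not present. */
static int toy_search_slot(const int *keys, size_t nkeys, int key)
{
	if (keys == NULL)
		return -1;
	for (size_t i = 0; i < nkeys; i++) {
		if (keys[i] == key)
			return 0;
	}
	return 1;
}

/*
 * Toy drop helper. Its caller reads a positive return value as
 * "fall back to a full transaction commit", so the search's
 * "not found" result must not leak out.
 */
static int toy_drop_items(const int *keys, size_t nkeys, int key)
{
	int ret = toy_search_slot(keys, nkeys, key);

	/* Having nothing left to drop is success, not a reason to commit. */
	if (ret > 0)
		ret = 0;

	/* Only errors (negative) or success (zero) reach the caller. */
	assert(ret <= 0);
	return ret;
}
```

Errors still propagate unchanged; only the search-specific value 1 is mapped to 0.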

Btrfs: check to see if the inode is in the log before fsyncing · 22ee6985

Josef Bacik authored May 29, 2012

We have this check down in the actual logging code, but this is after we
start a transaction and all that good stuff. So move the helper
inode_in_log() out so we can call it in fsync() and avoid starting a
transaction altogether and just exit if we've already fsync()'ed this file
recently. You would notice this issue if you fsync()'ed a file over and
over again until the transaction committed. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: return value of btrfs_read_buffer is checked correctly · 018642a1

Tsutomu Itoh authored May 29, 2012

btrfs_read_buffer() can return an error, so add code that checks its
return value.
Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>

Btrfs: read device stats on mount, write modified ones during commit · 733f4fbb

Stefan Behrens authored May 25, 2012

The device statistics are written into the device tree with each
transaction commit. Only modified statistics are written.
When a filesystem is mounted, the device statistics for each involved
device are read from the device tree and used to initialize the
counters.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
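
The load-on-mount, write-only-if-modified scheme described above can be sketched with a dirty flag per device. Everything here is an assumption made for illustration: `struct toy_dev_stats`, the `toy_stats_*()` helpers, the counter count, and the plain array that stands in for the device tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_STAT_COUNT 5	/* illustrative number of counters per device */

struct toy_dev_stats {
	uint64_t values[TOY_STAT_COUNT];
	bool dirty;		/* set when a counter changed since the last commit */
};

/* Mount: initialize the in-memory counters from the stored copy. */
static void toy_stats_load(struct toy_dev_stats *s, const uint64_t *stored)
{
	assert(s != NULL && stored != NULL);
	for (size_t i = 0; i < TOY_STAT_COUNT; i++)
		s->values[i] = stored[i];
	s->dirty = false;
}

/* Runtime: bump one counter and remember that it must be written back. */
static void toy_stats_inc(struct toy_dev_stats *s, size_t idx)
{
	assert(s != NULL && idx < TOY_STAT_COUNT);
	s->values[idx]++;
	s->dirty = true;
}

/* Commit: write back only modified devices; returns how many were written. */
static size_t toy_stats_commit(struct toy_dev_stats *devs, size_t ndevs,
			       uint64_t stored[][TOY_STAT_COUNT])
{
	size_t written = 0;

	assert(devs != NULL && stored != NULL);
	for (size_t d = 0; d < ndevs; d++) {
		if (!devs[d].dirty)
			continue;
		for (size_t i = 0; i < TOY_STAT_COUNT; i++)
			stored[d][i] = devs[d].values[i];
		devs[d].dirty = false;
		written++;
	}
	return written;
}
```

A commit with no changed counters writes nothing, so idle devices add no work to the transaction.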
