Commits · 7b13b7b119c932a5eca486db4113f4c1fe3b97a8 · nexedi / linux

25 Sep, 2008 40 commits

Btrfs: Don't drop extent_map cache during releasepage on the btree inode · 7b13b7b1

Chris Mason authored Apr 18, 2008

The btree inode should only have a single extent_map in the cache, so
it doesn't make sense to ever drop it.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Add support for labels in the super block · 7ae9c09d
Chris Mason authored Apr 18, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
Btrfs: Check device uuids along with devids · a443755f
Chris Mason authored Apr 18, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```

Btrfs: Remove bogus max_sector warnings from the extent_io code · 41471e83

Chris Mason authored Apr 17, 2008

It was testing the bio before doing logical->physical mapping, so the
test was always wrong.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Avoid 64 bit div for RAID10 · 7bf3b490
Chris Mason authored Apr 17, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```

Btrfs: Use the extent map cache to find the logical disk block during data retries · 3b951516

Chris Mason authored Apr 17, 2008

The data read retry code needs to find the logical disk block before it
can resubmit new bios. But, finding this block isn't allowed to take
the fs_mutex because that will deadlock with a number of different callers.

This changes the retry code to use the extent map cache instead, but
that requires the extent map cache to have the extent we're looking for.
This is a problem because btrfs_drop_extent_cache just drops the entire
extent instead of the little tiny part it is invalidating.

The bulk of the code in this patch changes btrfs_drop_extent_cache to
invalidate only a portion of the extent cache, and changes btrfs_get_extent
to deal with the results.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
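
The partial invalidation described above can be sketched in user space as follows. This is only an illustration under stated assumptions, not the code from the patch: `struct cached_extent` and `drop_range` are hypothetical names, and the real extent map cache tracks more state than a start and a length.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cached mapping covering [start, start + len). */
struct cached_extent {
    uint64_t start;
    uint64_t len;
};

/*
 * Drop [drop_start, drop_end) from one cached extent and store the
 * surviving pieces (0, 1 or 2 of them) in out[]. Returns the piece count.
 */
static int drop_range(const struct cached_extent *em,
                      uint64_t drop_start, uint64_t drop_end,
                      struct cached_extent out[2])
{
    uint64_t em_end = em->start + em->len;
    int n = 0;

    assert(drop_start < drop_end);

    /* No overlap: the cached extent survives untouched. */
    if (drop_end <= em->start || drop_start >= em_end) {
        out[n++] = *em;
        return n;
    }
    /* Keep the part in front of the invalidated range. */
    if (em->start < drop_start) {
        out[n].start = em->start;
        out[n].len = drop_start - em->start;
        n++;
    }
    /* Keep the part behind the invalidated range. */
    if (drop_end < em_end) {
        out[n].start = drop_end;
        out[n].len = em_end - drop_end;
        n++;
    }
    return n;
}
```

Invalidating one 4K page in the middle of a 1MB cached extent leaves two pieces in the cache, so a later retry can still find the mapping for every other page.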


Btrfs: Only do async bio submission for pdflush · 7b859fe7
Chris Mason authored Apr 16, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```

Btrfs: Don't wait on tree block writeback before freeing them anymore · 699122f5

Chris Mason authored Apr 16, 2008

This isn't required anymore because we don't reallocate blocks that
have already been written in this transaction.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Write bio checksumming outside the FS mutex · e015640f

Chris Mason authored Apr 16, 2008

This significantly improves streaming write performance by allowing
concurrency in the data checksumming.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Create a work queue for bio writes · 44b8bd7e

Chris Mason authored Apr 16, 2008

This allows checksumming to happen in parallel among many cpus, and
keeps us from bogging down pdflush with the checksumming code.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Add RAID10 support · 321aecc6
Chris Mason authored Apr 16, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```

Btrfs: Add chunk uuids and update multi-device back references · e17cade2

Chris Mason authored Apr 15, 2008

Block headers now store the chunk tree uuid

Chunk items record the device uuid for each stripe

Device extent items record better back refs to the chunk tree

Block groups record better back refs to the chunk tree

The chunk tree format has also changed.  The objectid of BTRFS_CHUNK_ITEM_KEY
used to be the logical offset of the chunk.  Now it is a chunk tree id,
with the logical offset being stored in the offset field of the key.

This allows a single chunk tree to record multiple logical address spaces,
upping the number of bytes indexed by a chunk tree from 2^64 to
2^128.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
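
A rough sketch of the new key layout, assuming a simplified key of (objectid, type, offset) compared in that order. `struct sketch_key`, `make_chunk_key`, `sketch_key_cmp` and the `SKETCH_CHUNK_ITEM_KEY` value are placeholders for illustration and are not taken from the Btrfs headers.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder type value for this sketch only. */
#define SKETCH_CHUNK_ITEM_KEY 1

/* Simplified stand-in for a Btrfs key: (objectid, type, offset). */
struct sketch_key {
    uint64_t objectid;
    uint8_t type;
    uint64_t offset;
};

/* New layout: objectid holds the chunk tree id, offset the logical offset. */
static struct sketch_key make_chunk_key(uint64_t chunk_tree_id, uint64_t logical)
{
    struct sketch_key k = { chunk_tree_id, SKETCH_CHUNK_ITEM_KEY, logical };
    return k;
}

/* Order keys by objectid, then type, then offset. */
static int sketch_key_cmp(const struct sketch_key *a, const struct sketch_key *b)
{
    if (a->objectid != b->objectid)
        return a->objectid < b->objectid ? -1 : 1;
    if (a->type != b->type)
        return a->type < b->type ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}
```

With a 64 bit id and a 64 bit offset, two address spaces can both contain a chunk at logical offset 0 without their keys colliding, which is where the 2^64 * 2^64 = 2^128 figure comes from.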


Btrfs: A few updates for 2.6.18 and versions older than 2.6.25 · b248a415

Chris Mason authored Apr 14, 2008

This includes fixing a missing spinlock init call that caused an oops on
mount for most kernels other than 2.6.25.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Add a min size parameter to btrfs_alloc_extent · 98d20f67

Chris Mason authored Apr 14, 2008

On huge machines, delayed allocation may try to allocate massive extents.
This change allows btrfs_alloc_extent to return something smaller than
the caller asked for, and the data allocation routines will loop over
the allocations until they fill the whole delayed alloc.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
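
The allocate-and-loop pattern can be sketched as below. This is a user-space illustration only: `struct sketch_allocator`, `sketch_alloc_extent` and `sketch_fill_delalloc` are hypothetical, and the real btrfs_alloc_extent takes different arguments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical allocator: hands out at most largest_free_run bytes per call. */
struct sketch_allocator {
    uint64_t largest_free_run;
    uint64_t next_start;
};

/*
 * May return fewer bytes than num_bytes, but never fewer than min_size.
 * Returns 0 when even min_size cannot be satisfied.
 */
static uint64_t sketch_alloc_extent(struct sketch_allocator *a,
                                    uint64_t num_bytes, uint64_t min_size,
                                    uint64_t *start)
{
    uint64_t got = num_bytes < a->largest_free_run ?
                   num_bytes : a->largest_free_run;

    assert(min_size > 0 && min_size <= num_bytes);
    if (got < min_size)
        return 0;
    *start = a->next_start;
    a->next_start += got;
    return got;
}

/*
 * Caller side: keep allocating until the whole delayed allocation is
 * covered. Returns the number of extents used, or -1 on failure.
 */
static int sketch_fill_delalloc(struct sketch_allocator *a,
                                uint64_t total, uint64_t min_size)
{
    int extents = 0;

    while (total > 0) {
        uint64_t start;
        /* The final piece may be smaller than min_size. */
        uint64_t want_min = total < min_size ? total : min_size;
        uint64_t got = sketch_alloc_extent(a, total, want_min, &start);

        if (got == 0)
            return -1;
        total -= got;
        extents++;
    }
    return extents;
}
```

In this sketch a 3.5MB delayed allocation against an allocator that can only hand out 1MB at a time ends up as four extents instead of failing outright.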


Btrfs: bio_endio support for linux 2.6.23 and older. · 73f61b2a

Miguel authored Apr 11, 2008

bio_endio() changed its prototype in linux 2.6.24; support older kernels
by using the older prototype.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: define write_cache_pages for linux kernel <= 2.6.20 instead · 594994aa

Miguel authored Apr 11, 2008

write_cache_pages doesn't exist in linux 2.6.20, so change the #if
condition to match that.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Endianess bug fix for v0.13 with kernels · a5eb62e3

Miguel authored Apr 11, 2008

Fix for an endianness BUG when using btrfs v0.13 with kernels older than 2.6.23

Problem:

As of v0.13, btrfs-progs uses a crc32c.c equivalent to the one found in
linux-2.6.23/lib/libcrc32c.c. Since crc32c_le() changed in linux-2.6.23,
running btrfs v0.13 with older kernels gives a mismatch between the versions
of crc32c_le() in btrfs-progs and in the kernel's libcrc32c.  This mismatch
causes a bug when using btrfs on big endian machines.

Solution:
Add a btrfs_crc32c() macro that, when compiling for kernels older than 2.6.23,
does endianness conversion on the parameters and return value of crc32c().
This endianness conversion nullifies the differences in the implementation
of crc32c_le().
For kernel 2.6.23 or newer, it calls crc32c() directly.
Signed-off-by: Miguel Sousa Filipe <miguel.filipe@gmail.com>
---
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Fixup a few u64<->pointer casts for 32 bit · 587f7704
Chris Mason authored Apr 11, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
Btrfs: Add extra checks to avoid removing extent_state from pages we can't free · 3dd39914
Chris Mason authored Apr 11, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
Btrfs: Write out all super blocks on commit, and bring back proper barrier support · f2984462
Chris Mason authored Apr 10, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```

Btrfs: Add O_DIRECT read and write (writes == buffered + cache flush) · 16432985

Chris Mason authored Apr 10, 2008

This adds basic O_DIRECT read and write support.  In the write case, we
just do a normal buffered write followed by a cache flush.  O_DIRECT +
O_SYNC are required to trigger metadata syncs.

In the read case, there is a basic btrfs_get_block call for use by
the generic O_DIRECT code.  This does honor multi-volume mapping rules
but it skips all checksumming.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Disable extra debugging checks on tree blocks · 85d824c4
Chris Mason authored Apr 10, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
Btrfs: Handle checksumming errors while reading data blocks · 7e38326f
Chris Mason authored Apr 09, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
Btrfs: Retry metadata reads in the face of checksum failures · f188591e
Chris Mason authored Apr 09, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```

Btrfs: Handle data block end_io through the async work queue · 22c59948

Chris Mason authored Apr 09, 2008

Before, it was done by the bio end_io routine. The work queue code is able
to scale much better with faster IO subsystems.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Do metadata checksums for reads via a workqueue · ce9adaa5

Chris Mason authored Apr 09, 2008

Before, metadata checksumming was done by the callers of read_tree_block,
which would set EXTENT_CSUM bits in the extent tree to show that a given
range of pages was already checksummed and didn't need to be verified
again.

But, those bits could go away via try_to_releasepage, and the end
result was bogus checksum failures on pages that never left the cache.

The new code validates checksums when the page is read.  It is a little
tricky because metadata blocks can span pages and a single read may
end up going via multiple bios.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Add additional debugging for metadata checksum failures · 728131d8
Chris Mason authored Apr 09, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
Change btrfs_map_block to return a structure with mappings for all stripes · cea9e445
Chris Mason authored Apr 09, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
Btrfs: Fix allocation profile init · d18a2c44
Chris Mason authored Apr 04, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```

Btrfs: Don't allow written blocks from this transaction to be reallocated · 6bc34676

Chris Mason authored Apr 04, 2008

When a block is freed, it can be immediately reused if it is from
the current transaction. But, an extra check is required to make sure
the block had not been written yet. If it were reused after being written,
the transid in the block header already on disk could match the transid
assigned the next time the block was allocated.

The parent node records the transaction ID of the block it is pointing to,
and this is used as part of validating the block on reads. So, there
can only be one version of a block per transaction.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
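
The reuse rule can be sketched as a small predicate. `struct sketch_block` and `can_reuse_freed_block` are hypothetical names for illustration; the patch itself works on tree blocks and transaction state rather than on a struct like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of a freed tree block. */
struct sketch_block {
    uint64_t transid;  /* transaction that allocated the block */
    bool written;      /* has this version already gone to disk? */
};

/*
 * A freed block may be reused right away only if it belongs to the
 * running transaction and has not been written yet. Otherwise two
 * versions of the block could carry the same transid.
 */
static bool can_reuse_freed_block(const struct sketch_block *b,
                                  uint64_t running_transid)
{
    if (b->transid != running_transid)
        return false;
    return !b->written;
}
```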


Btrfs: Add support for duplicate blocks on a single spindle · 611f0e00
Chris Mason authored Apr 03, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
Btrfs: Add support for mirroring across drives · 8790d502
Chris Mason authored Apr 03, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```
Btrfs: Properly dirty buffers in the split corner cases · 0ef8b242
Chris Mason authored Apr 03, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```

Btrfs: Verify checksums on tree blocks found without read_tree_block · 0999df54

Chris Mason authored Apr 01, 2008

Checksums were only verified by btrfs_read_tree_block, which meant the
functions to probe the page cache for blocks were not validating checksums.
Normally this is fine because the buffers will only be in cache if they
have already been validated.

But, there is a window while the buffer is being read from disk where
it could be up to date in the cache but not yet verified.  This patch
makes sure all buffers go through checksum verification before they
are used.

This is safer, and it prevents modification of buffers before they go
through the csum code.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Btrfs: Keep fs_mutex during reads done by snapshot deletion · ecbe2402

Chris Mason authored Apr 01, 2008

There was an optimization to drop the fs_mutex when doing snapshot deletion
reads, but this can lead to false positives on checksumming errors.  Keep
the lock for now.
Signed-off-by: Chris Mason <chris.mason@oracle.com>


btrfs-progs: Stop stomping on 'name' input parameter · 140dfd00

Alex Chiang authored Apr 01, 2008

In btrfs_name_hash, local variable 'buf' is declared as

	__u32 buf[2];

but we then try to do this:

	buf[0] = 0x67452301;
	buf[1] = 0xefcdab89;
	buf[2] = 0x98badcfe;
	buf[3] = 0x10325476;

Oops. Fix buf to be the proper size.
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
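
A minimal sketch of the corrected sizing, assuming the fix is simply to give the array four elements to match the four constants. `HASH_STATE_WORDS`, `hash_seed` and `init_hash_state` are illustrative names, and `uint32_t` stands in for `__u32`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Four 32 bit words, matching the four constants assigned in the message. */
#define HASH_STATE_WORDS 4

static const uint32_t hash_seed[HASH_STATE_WORDS] = {
    0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476,
};

/* Fill the caller's state buffer without writing past its end. */
static void init_hash_state(uint32_t buf[HASH_STATE_WORDS])
{
    size_t i;

    for (i = 0; i < HASH_STATE_WORDS; i++)
        buf[i] = hash_seed[i];
}
```

Sizing the buffer and the loop from one constant keeps the declaration and the stores from drifting apart again.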


Btrfs: Correct usage of IS_ERR() in extent_io.c · 2b114d1d

Peter authored Apr 01, 2008

Signed-off-by: Peter Teoh <htmldeveloper@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>


Fix btrfs_fill_super to return -EINVAL when no FS found · e58ca020
Yan authored Apr 01, 2008
```
Signed-off-by: Chris Mason <chris.mason@oracle.com>
```

Reorder the flags field in struct btrfs_header and record a flag on writeout · 63b10fc4

Chris Mason authored Apr 01, 2008

This allows detection of blocks that have already been written in the
running transaction so they can be recowed instead of modified again.
It is step one in trusting the transid field of the block pointers.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
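
One way to picture the written flag is as a bit in the header flags that is set on writeout and checked before a block is modified in place. The names below (`SKETCH_HEADER_FLAG_WRITTEN`, `struct sketch_header`, `mark_written`, `should_cow`) and the bit position are assumptions made for this sketch, not the definitions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag bit for this sketch only. */
#define SKETCH_HEADER_FLAG_WRITTEN (1ULL << 0)

/* Simplified stand-in for the relevant part of a block header. */
struct sketch_header {
    uint64_t generation;  /* transid that last allocated the block */
    uint64_t flags;
};

/* Record on writeout that this version of the block has reached disk. */
static void mark_written(struct sketch_header *h)
{
    h->flags |= SKETCH_HEADER_FLAG_WRITTEN;
}

/*
 * A block can be modified in place only if it was allocated in the
 * running transaction and has not been written yet. Anything else is
 * copied (recowed) first.
 */
static bool should_cow(const struct sketch_header *h, uint64_t running_transid)
{
    if (h->generation != running_transid)
        return true;
    return (h->flags & SKETCH_HEADER_FLAG_WRITTEN) != 0;
}
```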


Btrfs: Add leak debugging for extent_buffer and extent_state · 2d2ae547

Chris Mason authored Mar 26, 2008

This also fixes one leak around the super block when failing to mount the
FS.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
