Commits · e3804b55e4358cf5a235fa1ba32204af9f7046dd · Kirill Smelkov / linux

22 Oct, 2023 40 commits

bcachefs: bch2_version_to_text() · e3804b55

Kent Overstreet authored Jun 28, 2023

Add a new helper for printing out metadata versions in a standard
format.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e3804b55

bcachefs: Kill BTREE_INSERT_USE_RESERVE · f33c58fc

Kent Overstreet authored Jun 27, 2023

Now that we have journal watermarks and alloc watermarks unified,
BTREE_INSERT_USE_RESERVE is redundant and can be deleted.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f33c58fc

bcachefs: Fix a null ptr deref in bch2_fs_alloc() error path · 65db6049

Kent Overstreet authored Jun 28, 2023

This fixes a null ptr deref in bch2_free_pending_node_rewrites() when
the list head wasn't initialized.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

65db6049
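
The failure mode named in the bch2_fs_alloc() fix above can be illustrated with a minimal userspace sketch. It assumes a hand-rolled `list_head` and a hypothetical `struct fake_fs`; none of these names come from bcachefs, and the real fix may differ in detail. The point is only that a zero-filled list head has NULL links, so teardown code that walks it from an early error path will fault unless the head is initialized first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal userspace stand-in for a circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Hypothetical container: an object that owns a pending-work list. */
struct fake_fs {
	struct list_head pending;
};

/*
 * Teardown-style walk. With a zero-filled (never initialized) head,
 * h->next is NULL and an unguarded walk would dereference it; this
 * sketch reports that case instead of crashing.
 * Returns the number of entries, or -1 if the head was never initialized.
 */
static int count_pending(const struct fake_fs *fs)
{
	const struct list_head *h;
	const struct list_head *pos;
	int n = 0;

	assert(fs != NULL);
	h = &fs->pending;

	if (!h->next)
		return -1;

	for (pos = h->next; pos != h; pos = pos->next)
		n++;
	return n;
}

/* Initialize the head right after zeroing, before any step that can fail. */
static void fake_fs_init(struct fake_fs *fs)
{
	memset(fs, 0, sizeof(*fs));
	init_list_head(&fs->pending);
}
```

Calling `count_pending()` on a merely zeroed `struct fake_fs` returns -1, while calling it after `fake_fs_init()` returns 0 for the empty list.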

bcachefs: Fix a format string warning · 0b9fbce2

Kent Overstreet authored Jun 27, 2023

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

0b9fbce2

bcachefs: Kill JOURNAL_WATERMARK · ec14fc60

Kent Overstreet authored Jun 27, 2023

This unifies JOURNAL_WATERMARK with BCH_WATERMARK; we're working towards
specifying watermarks once in the transaction commit path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

ec14fc60

bcachefs: BCH_WATERMARK_reclaim · 494036d8

Kent Overstreet authored Jun 27, 2023

Add another watermark for journal reclaim - this is needed for the next
patches, which unify BCH_WATERMARK with JOURNAL_WATERMARK.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

494036d8

bcachefs: struct bch_extent_rebalance · 2766876d

Kent Overstreet authored Jun 27, 2023

This adds the extent entry for extents that rebalance needs to do
something with.

We're adding this ahead of the main rebalance_work patchset, because
adding new extent entries can't be done in a forwards-compatible way.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2766876d

bcachefs: Expand BTREE_NODE_ID · 4e1430a7

Kent Overstreet authored Jun 27, 2023

We now have 20 bits for the btree ID in the on disk format - sufficient
for 1 million distinct btrees.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

4e1430a7
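
The "1 million" figure in the BTREE_NODE_ID commit above follows directly from the field width. A small sketch of the arithmetic, where `BTREE_ID_BITS` is a name made up for this example and only the value 20 comes from the commit message:

```c
#include <assert.h>
#include <stdint.h>

/* Width of the btree ID field, as stated in the commit message. */
#define BTREE_ID_BITS 20

/* Number of distinct values an n-bit unsigned field can hold. */
static uint64_t field_capacity(unsigned bits)
{
	assert(bits < 64);
	return UINT64_C(1) << bits;
}

/* Largest ID that fits in the field. */
static uint64_t field_max(unsigned bits)
{
	return field_capacity(bits) - 1;
}
```

With 20 bits, `field_capacity()` gives 1048576 distinct IDs and `field_max()` gives 0xfffff.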

bcachefs: Fix btree node write error message · e4eb661d

Kent Overstreet authored Jun 27, 2023

Error messages should include the error code, when available.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e4eb661d

bcachefs: fsck: Break walk_inode() up into multiple functions · 06dcca51

Kent Overstreet authored Jun 25, 2023

Some refactoring, prep work for algorithm improvements related to
snapshots.

We need to add a bitmap to the list of inodes for "seen this snapshot";
for this bitmap to correctly be available, we'll need to gather the list
of inodes first, and later look up the inode for a given snapshot.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

06dcca51

bcachefs: Fix leak in backpointers fsck · 1fa3e87a

Kent Overstreet authored Jun 27, 2023

We were forgetting to exit a printbuf - whoops.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1fa3e87a

bcachefs: unregister_shrinker() now safe on not-registered shrinker · b3591acc

Kent Overstreet authored Jun 26, 2023

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

b3591acc

bcachefs: Add a missing rhashtable_destroy() call · 0ce4e0e7

Kent Overstreet authored Jun 26, 2023

Fixes https://lore.kernel.org/linux-bcachefs/784c3e6a-75bd-e6ca-535a-43b3e1daf643@kernel.dk/T/#mbf7caf005f960018eba23b58795d06c06c947411
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

0ce4e0e7

bcachefs: Improve bch2_bkey_make_mut() · 0fb3355d

Kent Overstreet authored Jun 26, 2023

bch2_bkey_make_mut() now takes the bkey_s_c by reference and points it
at the new, mutable key.

This helps in some fsck paths that may have multiple repair operations
on the same key.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

0fb3355d
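
The calling convention described in the bch2_bkey_make_mut() commit above can be sketched in plain C. This is an illustration under assumptions: `struct key` and `key_make_mut()` are hypothetical stand-ins, the real helper works on bkey types and transaction memory rather than `malloc()`, and its exact signature is not shown in the source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a key; not the real on-disk layout. */
struct key {
	int field_a;
	int field_b;
};

/*
 * Copy *k into freshly allocated, writable storage and repoint the
 * caller's read-only handle at the copy. Returns the mutable copy, or
 * NULL on allocation failure (in which case *k is left untouched).
 */
static struct key *key_make_mut(const struct key **k)
{
	struct key *mut;

	assert(k && *k);

	mut = malloc(sizeof(*mut));
	if (!mut)
		return NULL;
	memcpy(mut, *k, sizeof(*mut));
	*k = mut;
	return mut;
}
```

Because the caller's handle now refers to the mutable copy, a second repair step that reads through the same handle sees the result of the first one instead of the stale original.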

bcachefs: Reduce stack frame size of bch2_check_alloc_info() · 298ac24e

Kent Overstreet authored Jun 26, 2023

Excessive inlining may (on some versions of gcc?) cause excessive stack
usage; this turns off some inlining in bch2_check_alloc_info.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

298ac24e

bcachefs: fsck needs BTREE_UPDATE_INTERNAL_SNAPSHOT_NODE · 75da9764

Kent Overstreet authored Jun 25, 2023

A few fsck paths weren't using BTREE_UPDATE_INTERNAL_SNAPSHOT_NODE -
oops.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

75da9764

bcachefs: Improve error message for overlapping extents · 454377d8

Kent Overstreet authored Jun 24, 2023

We now print out the full previous extent we're overlapping with, to aid in
debugging and searching through the journal.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

454377d8

bcachefs: Fix check_pos_snapshot_overwritten() · 8f507f89

Kent Overstreet authored Jun 24, 2023

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

8f507f89

bcachefs: Rename enum alloc_reserve -> bch_watermark · e53a961c

Kent Overstreet authored Jun 24, 2023

This is prep work for consolidating with JOURNAL_WATERMARK.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e53a961c

bcachefs: BCH_ERR_fsck -> EINVAL · e9d01723

Kent Overstreet authored Jun 24, 2023

When we return errors outside of bcachefs, we need to return a standard
error code - fix this for BCH_ERR_fsck.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e9d01723
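
The idea in the BCH_ERR_fsck commit above, translating a private error code to a standard one at the subsystem boundary, might look roughly like the following. The enum names, the numeric base and the fallback to `-EIO` are assumptions made for this sketch; only the fsck to `EINVAL` mapping comes from the commit title.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical private error space, placed above the standard errno
 * range so the two cannot be confused. Names and numbers are made up.
 */
enum {
	PRIV_ERR_START = 2048,
	PRIV_ERR_fsck,
	PRIV_ERR_nomem_example,
};

/* Map a negative internal error to a negative standard errno. */
static int err_to_standard(int err)
{
	assert(err <= 0);

	switch (-err) {
	case PRIV_ERR_fsck:
		return -EINVAL;
	case PRIV_ERR_nomem_example:
		return -ENOMEM;
	}

	/* Unmapped private code: fall back to a generic I/O error. */
	if (-err > PRIV_ERR_START)
		return -EIO;

	/* Success or an already standard errno: pass through unchanged. */
	return err;
}
```

Callers inside the subsystem keep the detailed private code; only the outermost return path would call a helper like `err_to_standard()`.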

bcachefs: bch2_trans_mark_pointer() refactoring · 3a63b32f

Kent Overstreet authored Jun 24, 2023

bch2_bucket_backpointer_mod() doesn't need to update the alloc key, we
can exit the alloc iter earlier.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

3a63b32f

bcachefs: Fix more lockdep splats in debug.c · 9473cff9

Kent Overstreet authored Jun 21, 2023

Similar to previous fixes, we can't incur page faults while holding
btree locks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9473cff9

bcachefs: Fix lockdep splat in bch2_readdir · 462f494b

Kent Overstreet authored Jun 21, 2023

dir_emit() can fault (taking mmap_lock); thus we can't be holding btree
locks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

462f494b

bcachefs: Check for ERR_PTR() from filemap_lock_folio() · b6898917

Kent Overstreet authored Jun 21, 2023

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

b6898917

bcachefs: New error message helpers · 1bb3c2a9

Kent Overstreet authored Jun 20, 2023

Add two new helpers for printing error messages with __func__ and
bch2_err_str():
 - bch_err_fn
 - bch_err_msg

Also kill the old error strings in the recovery path, which were causing
us to incorrectly report memory allocation failures - they're not needed
anymore.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1bb3c2a9
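
A rough userspace sketch of the kind of helpers the commit above describes: macros that prefix a message with `__func__` and append a string form of the error. The macro names `fmt_err_fn` and `fmt_err_msg`, the use of `strerror()` in place of bch2_err_str(), and the exact message layout are all assumptions for illustration, not the bcachefs definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error-to-string helper, standing in for bch2_err_str(). */
static const char *err_str(int err)
{
	assert(err < 0);
	return strerror(-err);
}

/*
 * Format "<function>(): error <string>" into buf. __func__ is evaluated
 * where the macro is expanded, so the caller's name is filled in
 * automatically and stays right after a rename.
 */
#define fmt_err_fn(buf, len, err) \
	snprintf((buf), (len), "%s(): error %s", __func__, err_str(err))

/* Same, with a caller-supplied description of what failed. */
#define fmt_err_msg(buf, len, err, msg) \
	snprintf((buf), (len), "%s(): error %s: %s", __func__, (msg), err_str(err))
```

These have to be macros rather than functions: inside a function, `__func__` would always name the helper itself instead of the caller.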

bcachefs: fiemap: Fix a lockdep splat · a83e108f

Kent Overstreet authored Jun 19, 2023

As with the previous patch, we generally can't hold btree locks while
copying to userspace, as that may incur a page fault and require
mmap_lock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a83e108f

bcachefs: seqmutex; fix a lockdep splat · a5b696ee

Kent Overstreet authored Jun 19, 2023

We can't be holding btree_trans_lock while copying to user space, which
might incur a page fault. To fix this, convert it to a seqmutex so we
can unlock/relock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a5b696ee
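
The unlock/relock idea in the seqmutex commit above can be sketched in userspace with a mutex plus a generation counter. This is an assumption about the general technique, written with pthreads; the function names mirror the commit's term "seqmutex", but the bodies are not taken from the kernel source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* A mutex paired with a generation counter (userspace sketch). */
struct seqmutex {
	pthread_mutex_t lock;
	uint32_t seq;	/* only modified while lock is held */
};

#define SEQMUTEX_INIT { PTHREAD_MUTEX_INITIALIZER, 0 }

static void seqmutex_lock(struct seqmutex *m)
{
	pthread_mutex_lock(&m->lock);
}

/*
 * Unlock and return a token. Every unlock bumps the counter, so a token
 * stays valid only until the next unlock by anyone.
 */
static uint32_t seqmutex_unlock(struct seqmutex *m)
{
	uint32_t seq = ++m->seq;

	pthread_mutex_unlock(&m->lock);
	return seq;
}

/*
 * Retake the lock, succeeding only if no other lock/unlock cycle has
 * completed since the unlock that produced `seq`. On failure the lock
 * is not held and the caller must restart its traversal.
 */
static bool seqmutex_relock(struct seqmutex *m, uint32_t seq)
{
	pthread_mutex_lock(&m->lock);
	if (m->seq != seq) {
		pthread_mutex_unlock(&m->lock);
		return false;
	}
	return true;
}
```

A reader would take the lock, format some output into a kernel-side buffer, call `seqmutex_unlock()`, copy the buffer to user space (where a page fault is now harmless), and then try `seqmutex_relock()` to continue where it left off.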

bcachefs: Don't call lock_graph_descend() with wait lock held · 6547ebab

Kent Overstreet authored Jun 19, 2023

This fixes a deadlock:

01305 WARNING: possible circular locking dependency detected
01305 6.3.0-ktest-gf4de9bee61af #5305 Tainted: G        W
01305 ------------------------------------------------------
01305 cat/14658 is trying to acquire lock:
01305 ffffffc00982f460 (fs_reclaim){+.+.}-{0:0}, at: __kmem_cache_alloc_node+0x48/0x278
01305
01305 but task is already holding lock:
01305 ffffff8011aaf040 (&lock->wait_lock){+.+.}-{2:2}, at: bch2_check_for_deadlock+0x4b8/0xa58
01305
01305 which lock already depends on the new lock.
01305
01305
01305 the existing dependency chain (in reverse order) is:
01305
01305 -> #2 (&lock->wait_lock){+.+.}-{2:2}:
01305        _raw_spin_lock+0x54/0x70
01305        __six_lock_wakeup+0x40/0x1b0
01305        six_unlock_ip+0xe8/0x248
01305        bch2_btree_key_cache_scan+0x720/0x940
01305        shrink_slab.constprop.0+0x284/0x770
01305        shrink_node+0x390/0x828
01305        balance_pgdat+0x390/0x6d0
01305        kswapd+0x2e4/0x718
01305        kthread+0x184/0x1a8
01305        ret_from_fork+0x10/0x20
01305
01305 -> #1 (&c->lock#2){+.+.}-{3:3}:
01305        __mutex_lock+0x104/0x14a0
01305        mutex_lock_nested+0x30/0x40
01305        bch2_btree_key_cache_scan+0x5c/0x940
01305        shrink_slab.constprop.0+0x284/0x770
01305        shrink_node+0x390/0x828
01305        balance_pgdat+0x390/0x6d0
01305        kswapd+0x2e4/0x718
01305        kthread+0x184/0x1a8
01305        ret_from_fork+0x10/0x20
01305
01305 -> #0 (fs_reclaim){+.+.}-{0:0}:
01305        __lock_acquire+0x19d0/0x2930
01305        lock_acquire+0x1dc/0x458
01305        fs_reclaim_acquire+0x9c/0xe0
01305        __kmem_cache_alloc_node+0x48/0x278
01305        __kmalloc_node_track_caller+0x5c/0x278
01305        krealloc+0x94/0x180
01305        bch2_printbuf_make_room.part.0+0xac/0x118
01305        bch2_prt_printf+0x150/0x1e8
01305        bch2_btree_bkey_cached_common_to_text+0x170/0x298
01305        bch2_btree_trans_to_text+0x244/0x348
01305        print_cycle+0x7c/0xb0
01305        break_cycle+0x254/0x528
01305        bch2_check_for_deadlock+0x59c/0xa58
01305        bch2_btree_deadlock_read+0x174/0x200
01305        full_proxy_read+0x94/0xf0
01305        vfs_read+0x15c/0x3a8
01305        ksys_read+0xb8/0x148
01305        __arm64_sys_read+0x48/0x60
01305        invoke_syscall.constprop.0+0x64/0x138
01305        do_el0_svc+0x84/0x138
01305        el0_svc+0x34/0x80
01305        el0t_64_sync_handler+0xb0/0xb8
01305        el0t_64_sync+0x14c/0x150
01305
01305 other info that might help us debug this:
01305
01305 Chain exists of:
01305   fs_reclaim --> &c->lock#2 --> &lock->wait_lock
01305
01305  Possible unsafe locking scenario:
01305
01305        CPU0                    CPU1
01305        ----                    ----
01305   lock(&lock->wait_lock);
01305                                lock(&c->lock#2);
01305                                lock(&lock->wait_lock);
01305   lock(fs_reclaim);
01305
01305  *** DEADLOCK ***
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

6547ebab

bcachefs: Fix bch2_check_discard_freespace_key() · e96f5a61

Kent Overstreet authored Jun 18, 2023

We weren't correctly checking the freespace btree - it's an extents
btree, which means we need to iterate over each bucket in a freespace
extent.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e96f5a61
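
The fix described above, checking every bucket covered by a freespace extent rather than treating the key as a single position, amounts to a loop over the extent's range. A minimal sketch, assuming a hypothetical `struct bucket_extent` with a half-open range and a caller-supplied per-bucket check; the real code iterates btree keys and is more involved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical extent covering buckets in the half-open range [start, end). */
struct bucket_extent {
	uint64_t start;
	uint64_t end;
};

typedef int (*bucket_check_fn)(uint64_t bucket, void *ctx);

/*
 * Run `check` on every bucket the extent covers, not just the first.
 * Stops at the first nonzero return and propagates it.
 */
static int for_each_bucket_in_extent(const struct bucket_extent *e,
				     bucket_check_fn check, void *ctx)
{
	uint64_t b;
	int ret;

	assert(e->start <= e->end);

	for (b = e->start; b < e->end; b++) {
		ret = check(b, ctx);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count visited buckets. */
static int count_bucket(uint64_t bucket, void *ctx)
{
	(void)bucket;
	++*(unsigned *)ctx;
	return 0;
}
```

For an extent spanning buckets 10 through 13, the callback runs four times; a check that looked only at the key position would have covered one bucket and silently skipped the rest.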

bcachefs: bch2_trans_unlock_noassert() · 25aa8c21

Kent Overstreet authored Jun 18, 2023

This fixes a spurious assert in the btree node read path.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

25aa8c21

bcachefs: Fix bch2_btree_update_start() · 45a1ab57

Kent Overstreet authored Jun 16, 2023

The calculation for number of nodes to allocate in
bch2_btree_update_start() was incorrect - this fixes a BUG_ON() on the
small nodes test.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

45a1ab57

bcachefs: bch2_extent_ptr_desired_durability() · 91ecd41b

Kent Overstreet authored Jun 13, 2023

This adds a new helper for getting a pointer's durability irrespective
of the device state, and uses it in the data update path.

This fixes a bug where we do a data update but request 0 replicas to be
allocated, because the replica being rewritten is on a device marked as
failed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

91ecd41b
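
The distinction drawn in the commit above, between the durability a pointer provides right now and the durability it is meant to provide, can be sketched as two small helpers. The `struct dev` model and the function names here are hypothetical; the sketch only illustrates why sizing a rewrite by the current value yields 0 replicas on a failed device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; field names are made up for this sketch. */
struct dev {
	unsigned durability;	/* replicas this device counts for when healthy */
	bool failed;
};

/* Durability the pointer is meant to provide, ignoring device health. */
static unsigned ptr_desired_durability(const struct dev *d)
{
	return d->durability;
}

/* Durability the pointer provides right now: zero on a failed device. */
static unsigned ptr_durability(const struct dev *d)
{
	return d->failed ? 0 : ptr_desired_durability(d);
}

/*
 * Replicas to allocate when rewriting a pointer. Sizing by the desired
 * value means a rewrite off a failed device still requests replacement
 * replicas instead of zero.
 */
static unsigned replicas_for_rewrite(const struct dev *d)
{
	return ptr_desired_durability(d);
}
```

For a failed device with durability 1, `ptr_durability()` is 0 but `replicas_for_rewrite()` is 1, which is the behavior the data update path needs.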

bcachefs: snapshot_to_text() includes snapshot tree · 253748a2

Kent Overstreet authored Jun 13, 2023

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

253748a2

bcachefs: Fix try_decrease_writepoints() · 995f9128

Kent Overstreet authored Mar 16, 2023

 - We may need to drop btree locks before taking the writepoint_lock, as
   is done in other places.
 - We should be using open_bucket_free_unused(), so that we don't waste
   space.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

995f9128

bcachefs: Delete weird hacky transaction restart injection · 25c70097

Kent Overstreet authored Jun 11, 2023

Since we currently don't have a good fault injection library,
bch2_btree_insert_node() was randomly injecting faults based on
local_clock().

At the very least this should have been a debug mode only thing, but
this is a brittle method so let's just delete it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

25c70097

bcachefs: Write buffer flush needs BTREE_INSERT_NOCHECK_RW · 8e5b1115

Kent Overstreet authored Jun 11, 2023

btree write buffer flush is only invoked from contexts that already hold
a write ref, and checking if we're still RW could cause us to fail to
completely flush the write buffer when shutting down.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

8e5b1115

bcachefs: New assertions when marking filesystem clean · 7724664f

Kent Overstreet authored Jun 11, 2023

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

7724664f

bcachefs: ec: Fix a lost wakeup · 99a3d398

Kent Overstreet authored Jun 10, 2023

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

99a3d398

bcachefs: fix NULL pointer dereference in try_alloc_bucket · 954ed17e

Mikulas Patocka authored May 30, 2023

On Mon, 29 May 2023, Mikulas Patocka wrote:

> The oops happens in set_btree_iter_dontneed and it is caused by the fact
> that iter->path is NULL. The code in try_alloc_bucket is buggy because it
> sets "struct btree_iter iter = { NULL };" and then jumps to the "err"
> label that tries to dereference values in "iter".

Here I'm sending a patch for it.

From: Mikulas Patocka <mpatocka@redhat.com>

The function try_alloc_bucket sets the variable "iter" to NULL and then
(on various error conditions) jumps to the label "err". On the "err"
label, it calls "set_btree_iter_dontneed" that tries to dereference
"iter->trans" and "iter->path".

So, we get an oops on error condition.

This patch fixes the crash by testing that iter.trans and iter.path are
non-zero before calling set_btree_iter_dontneed.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

954ed17e
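
The pattern behind the try_alloc_bucket fix above, a shared error label whose cleanup touches an object that may never have been set up, can be reduced to a small standalone sketch. `struct iter`, `iter_set_dontneed()` and `try_alloc()` are simplified hypothetical stand-ins; only the guard on the `trans` and `path` members mirrors what the commit message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical iterator: zero-initialized until a lookup sets it up. */
struct iter {
	void *trans;
	void *path;
	int dontneed;
};

/* Cleanup hint that relies on the iterator's internals being valid. */
static void iter_set_dontneed(struct iter *it)
{
	assert(it->trans && it->path);
	it->dontneed = 1;
}

/*
 * Returns 0 on success, -1 on failure. When fail_early is set, control
 * jumps to the shared exit label before the iterator was initialized,
 * so the cleanup there must check before touching it.
 */
static int try_alloc(int fail_early, struct iter *out)
{
	static int dummy_trans, dummy_path;
	struct iter it = { NULL, NULL, 0 };
	int ret = 0;

	if (fail_early) {
		ret = -1;
		goto err;
	}

	it.trans = &dummy_trans;
	it.path = &dummy_path;
err:
	if (it.trans && it.path)
		iter_set_dontneed(&it);
	*out = it;
	return ret;
}
```

Without the `if` at the `err` label, the early-failure path would hand a zeroed iterator to `iter_set_dontneed()`, which is the oops reported in the quoted mail.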

bcachefs: Fix subvol deletion deadlock · b0e8c75e

Kent Overstreet authored Jun 09, 2023

d_prune_aliases() may call bch2_evict_inode(), which needs
c->vfs_inodes_lock.

Fix this by always calling igrab() before putting the inodes onto our
disposal list, and then calling d_prune_aliases() with
c->vfs_inodes_lock dropped.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

b0e8c75e