Commits · 58caa786f1c02fd84919fb6db9eaecb22e8f7983 · Kirill Smelkov / linux

14 Apr, 2024 1 commit

bcachefs: Fix UAFs of btree_insert_entry array · 58caa786

Kent Overstreet authored Apr 11, 2024

The btree paths array is now dynamically resizable, and so is the
btree_insert_entries array, since it needs to be the same size.

The merge path (and interior update path) allocates new btree paths and
can therefore trigger a resize, so we must not retain direct pointers
after invoking merge, or after running btree node triggers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

58caa786
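The hazard described in this commit can be illustrated with a minimal sketch. This is not the bcachefs code: `struct entry`, `struct entry_array`, `entry_array_push()` and `may_resize()` are hypothetical stand-ins for a resizable array and for an operation (such as a merge) that may grow it. The point is only that a pointer taken before a resize may dangle afterwards, while an index stays valid.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one element of a dynamically resizable array. */
struct entry { int level; };

struct entry_array {
	struct entry *data;
	size_t nr;	/* elements in use */
	size_t size;	/* elements allocated */
};

/*
 * Append an entry, growing the backing storage when it is full.
 * Returns the index of the new entry, or (size_t)-1 on allocation failure.
 */
static size_t entry_array_push(struct entry_array *a, int level)
{
	if (a->nr == a->size) {
		size_t new_size = a->size ? a->size * 2 : 4;
		struct entry *n = realloc(a->data, new_size * sizeof(*n));

		if (!n)
			return (size_t)-1;
		a->data = n;	/* pointers into the old storage are now stale */
		a->size = new_size;
	}
	a->data[a->nr].level = level;
	return a->nr++;
}

/* Stand-in for an operation that may allocate more entries (e.g. a merge). */
static int may_resize(struct entry_array *a)
{
	for (int i = 0; i < 64; i++)
		if (entry_array_push(a, i) == (size_t)-1)
			return -1;
	return 0;
}

/* Safe pattern: keep the index, re-derive the pointer after the call. */
static int level_after_resize(struct entry_array *a, size_t idx)
{
	if (may_resize(a))
		return -1;
	return a->data[idx].level;	/* looked up again, never cached */
}
```

Caching `&a->data[idx]` before calling `may_resize()` and dereferencing it afterwards would be the use-after-free pattern this sketch is meant to avoid.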

12 Apr, 2024 1 commit

bcachefs: Don't use bch2_btree_node_lock_write_nofail() in btree split path · 2b3e79fe

Kent Overstreet authored Apr 11, 2024

It turns out - btree splits happen with the rest of the transaction
still locked, to avoid unnecessary restarts, which means using nofail
doesn't work here - we can deadlock.

Fortunately, we now have the ability to return errors here.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2b3e79fe

11 Apr, 2024 3 commits

bcachefs: Fix __bch2_btree_and_journal_iter_init_node_iter() · 1189bdda

Kent Overstreet authored Apr 10, 2024

We weren't respecting trans->journal_replay_not_finished - we shouldn't
be searching the journal keys unless we have a ref on them.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1189bdda

bcachefs: Kill read lock dropping in bch2_btree_node_lock_write_nofail() · 517236cb

Kent Overstreet authored Apr 10, 2024

Dropping read locks in bch2_btree_node_lock_write_nofail() dates from
before we had the cycle detector; we can now tell the cycle detector
directly when taking a lock may not fail because we can't handle
transaction restarts.

This is needed for adding should_be_locked asserts.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

517236cb

bcachefs: Fix a race in btree_update_nodes_written() · beccf291

Kent Overstreet authored Apr 10, 2024

One btree update might have terminated in a node update, and then while
it is in flight another btree update might free that original node.

This race has to be handled in btree_update_nodes_written() - we were
missing a READ_ONCE().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

beccf291

09 Apr, 2024 4 commits

bcachefs: btree_node_scan: Respect member.data_allowed · 9b31152f

Kent Overstreet authored Apr 09, 2024

If a device wasn't used for btree nodes, no need to scan for them.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9b31152f

bcachefs: Don't scan for btree nodes when we can reconstruct · 5ab4beb7

Kent Overstreet authored Apr 09, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

5ab4beb7

bcachefs: Fix check_topology() when using node scan · 359571c3

Kent Overstreet authored Apr 09, 2024

Shoot down journal keys _before_ populating journal keys with pointers
to scanned nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

359571c3

bcachefs: fix eytzinger0_find_gt() · 9c432404

Kent Overstreet authored Apr 08, 2024

- fix return types: promoting from unsigned to ssize_t does not do what
  we want here, and was pointless since the rest of the eytzinger code
  is u32
- nr, not size
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9c432404
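The first point of this commit message, about widening an unsigned value into `ssize_t`, can be shown with a small sketch. It assumes a hypothetical lookup, `find_u32()`, that reports "not found" as `-1` stored in a 32-bit unsigned value; it is not the eytzinger code itself.

```c
#include <stdint.h>
#include <sys/types.h>	/* ssize_t (POSIX) */

/* Hypothetical lookup: returns an index, or (uint32_t)-1 for "not found". */
static uint32_t find_u32(int found)
{
	return found ? 3 : (uint32_t)-1;
}

/*
 * Widening the unsigned result zero-extends it: where ssize_t is 64 bits,
 * 0xffffffff becomes 4294967295 rather than -1, so a "< 0" check never fires.
 */
static int widened_is_negative(void)
{
	ssize_t idx = find_u32(0);

	return idx < 0;
}

/* Keeping one width throughout makes the sentinel comparison reliable. */
static int same_width_hits_sentinel(void)
{
	uint32_t idx = find_u32(0);

	return idx == (uint32_t)-1;
}
```

On a platform with a 64-bit `ssize_t`, `widened_is_negative()` returns 0 even though the lookup failed, which is why staying in `u32` is the simpler choice.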

07 Apr, 2024 3 commits

bcachefs: fix bch2_get_acl() transaction restart handling · b897b148

Kent Overstreet authored Apr 07, 2024

bch2_acl_from_disk() uses allocate_dropping_locks, and can thus return
a transaction restart - this wasn't handled.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

b897b148

bcachefs: fix the count of nr_freed_pcpu after changing bc->freed_nonpcpu list · 09e913f5

Hongbo Li authored Mar 26, 2024

When allocating a bkey_cached from the bc->freed_pcpu list, the code did
not decrement nr_freed_pcpu, leaving the counter out of sync with the
number of items on the list. The same problem exists when moving a new
bkey_cached to the bc->freed_pcpu list.
If this happens, the following assertions in
bch2_fs_btree_key_cache_exit may trigger:

BUG_ON(list_count_nodes(&bc->freed_pcpu) != bc->nr_freed_pcpu);
BUG_ON(list_count_nodes(&bc->freed_nonpcpu) != bc->nr_freed_nonpcpu);

Fixes: c65c13f0 ("bcachefs: Run btree key cache shrinker less aggressively")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

09e913f5
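The invariant behind those BUG_ON() checks is that a cached counter must change every time the list it describes changes. A minimal sketch, using a hypothetical singly linked `struct freelist` rather than the kernel's list_head API:

```c
#include <stddef.h>

struct node { struct node *next; };

struct freelist {
	struct node *head;
	size_t nr;	/* must always equal the number of nodes on the list */
};

static void freelist_push(struct freelist *l, struct node *n)
{
	n->next = l->head;
	l->head = n;
	l->nr++;	/* counter updated together with the list */
}

static struct node *freelist_pop(struct freelist *l)
{
	struct node *n = l->head;

	if (!n)
		return NULL;
	l->head = n->next;
	n->next = NULL;
	l->nr--;	/* forgetting this is the class of bug described above */
	return n;
}

/* Walk the list and count nodes, for comparison against the cached counter. */
static size_t freelist_count(const struct freelist *l)
{
	size_t c = 0;

	for (const struct node *n = l->head; n; n = n->next)
		c++;
	return c;
}
```

Comparing `freelist_count(&l)` with `l.nr` at teardown is the same kind of consistency check the commit message quotes.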

bcachefs: Fix gap buffer bug in bch2_journal_key_insert_take() · 30e615a2

Kent Overstreet authored Apr 06, 2024

Multiple bug fixes for journal iters:

 - When the journal keys gap buffer is resized, we have to adjust the
   iterators for moving the gap to the end
 - We don't want to rewind iterators to point to the key we just
   inserted if it's not for the correct btree/level

Also, add some new assertions.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

30e615a2

06 Apr, 2024 6 commits

bcachefs: Rename struct field swap to prevent macro naming collision · 2d793e93

Thorsten Blum authored Apr 06, 2024

The struct field swap can collide with the swap() macro defined in
linux/minmax.h. Rename the struct field to prevent such collisions.
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2d793e93
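A function-like macro is expanded wherever its name is followed by `(`, including after `->` or `.`, which is how a struct field can collide with it. The sketch below uses a simplified stand-in for the kernel's swap() macro (plain `int` only) and a hypothetical `struct sorter`; the field name `swap_func` is an assumption for illustration, since the message above does not state the new name.

```c
/* Simplified stand-in for the kernel's swap() macro; ints only. */
#define swap(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

typedef void (*swap_fn_t)(int *a, int *b);

struct sorter {
	/*
	 * If this field were named `swap`, a call such as s->swap(a, b)
	 * would be rewritten by the macro above. A distinct name is not.
	 */
	swap_fn_t swap_func;
};

/* Swap callback implemented with the macro itself. */
static void swap_ints(int *a, int *b)
{
	swap(*a, *b);
}

/* Calls through the renamed field; no macro expansion takes place here. */
static void sorter_swap(const struct sorter *s, int *a, int *b)
{
	s->swap_func(a, b);
}
```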

MAINTAINERS: Add entry for bcachefs documentation · 7d83cf53

Bagas Sanjaya authored Apr 05, 2024

Now that bcachefs docs exist in Documentation/filesystems/bcachefs/,
cover it in MAINTAINERS entry for the filesystem.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

7d83cf53

Documentation: filesystems: Add bcachefs toctree · aa98e70f

Bagas Sanjaya authored Apr 05, 2024

Commit eb386617 ("bcachefs: Errcode tracepoint, documentation")
adds initial bcachefs documentation (private error codes) but without
any table of contents tree for the filesystem docs, hence Sphinx warns:

Documentation/filesystems/bcachefs/errorcodes.rst: WARNING: document isn't included in any toctree

Add bcachefs toctree to fix above warning.

Fixes: eb386617 ("bcachefs: Errcode tracepoint, documentation")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

aa98e70f

bcachefs: JOURNAL_SPACE_LOW · 6088234c

Kent Overstreet authored Apr 05, 2024

"bcachefs: Fix deadlock in bch2_btree_update_start()" was a significant
performance regression (nearly 50%) on multithreaded random writes with
fio.

The reason is that the journal watermark checks multiple things,
including the state of the btree write buffer, and on multithreaded
update heavy workloads we're bottlenecked on write buffer flushing - we
don't want kicking off btree updates to depend on the state of the write
buffer.

This isn't strictly correct; the interior btree update path does do
write buffer updates, but it's a tiny fraction of total accounting
updates and we're more concerned with space in the journal itself.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

6088234c

bcachefs: Disable errors=panic for BCH_IOCTL_FSCK_OFFLINE · 05801b65

Kent Overstreet authored Apr 05, 2024

BCH_IOCTL_FSCK_OFFLINE allows the userspace fsck tool to use the kernel
implementation of fsck - primarily when the kernel version is a better
version match.

It should look and act exactly like the normal userspace fsck that the
user expected to be invoking, so errors should never result in a kernel
panic.

We may want to consider further restricting errors=panic - it's only
intended for debugging in controlled test environments, it should have
no purpose in normal usage.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

05801b65

bcachefs: Fix BCH_IOCTL_FSCK_OFFLINE for encrypted filesystems · 374b3d38

Kent Overstreet authored Apr 05, 2024

To open an encrypted filesystem, we use request_key() to get the
encryption key from the user's keyring - but request_key() needs to
happen in the context of the process that invoked the ioctl.

This is easily fixed by using bch2_fs_open() in nostart mode.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

374b3d38

05 Apr, 2024 3 commits

bcachefs: fix rand_delete unit test · cf979fca

Kent Overstreet authored Apr 05, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

cf979fca

bcachefs: fix ! vs ~ typo in __clear_bit_le64() · a6c4162d

Dan Carpenter authored Apr 05, 2024

The ! was obviously intended to be ~. As it is, this function does
the equivalent to: "addr[bit / 64] = 0;".

Fixes: 27fcec6c ("bcachefs: Clear recovery_passes_required as they complete without errors")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a6c4162d
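The difference between the two operators is easy to see in isolation. The sketch below is a simplified illustration, not the kernel function: it ignores the little-endian conversion, and the names `clear_bit_buggy()` and `clear_bit_fixed()` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Buggy form: logical NOT of any non-zero mask is 0, so the AND clears
 * every bit in the word, not just the requested one.
 */
static void clear_bit_buggy(size_t bit, uint64_t *addr)
{
	addr[bit / 64] &= !(UINT64_C(1) << (bit % 64));
}

/* Fixed form: bitwise NOT yields a mask with only the target bit cleared. */
static void clear_bit_fixed(size_t bit, uint64_t *addr)
{
	addr[bit / 64] &= ~(UINT64_C(1) << (bit % 64));
}
```

Starting from a word holding `0xff`, clearing bit 0 with the buggy form leaves `0`, while the fixed form leaves `0xfe`.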

bcachefs: Fix rebalance from durability=0 device · 5957e0a2

Kent Overstreet authored Apr 05, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

5957e0a2

04 Apr, 2024 6 commits

bcachefs: Print shutdown journal sequence number · 9802ff48

Kent Overstreet authored Feb 20, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9802ff48

bcachefs: Further improve btree_update_to_text() · d880a438

Kent Overstreet authored Apr 03, 2024

Print start and end level of the btree update; also a bit of cleanup.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

d880a438

bcachefs: Move btree_updates to debugfs · 9fb3036f

Kent Overstreet authored Apr 03, 2024

sysfs is limited to PAGE_SIZE, and when we're debugging strange
deadlocks/priority inversions we need to see the full list.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9fb3036f

bcachefs: Bump limit in btree_trans_too_many_iters() · be42e4a6

Kent Overstreet authored Apr 04, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

be42e4a6

bcachefs: Make snapshot_is_ancestor() safe · 01e5f4fc

Kent Overstreet authored Apr 04, 2024

Snapshot table accesses generally need to be checking for invalid
snapshot ID now, fix one that was missed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

01e5f4fc

bcachefs: create debugfs dir for each btree · e60aa472

Thomas Bertschinger authored Mar 14, 2024

This creates a subdirectory for each individual btree under the btrees/
debugfs directory.

Directory structure, before:

/sys/kernel/debug/bcachefs/$FS_ID/btrees/
├── alloc
├── alloc-bfloat-failed
├── alloc-formats
├── backpointers
├── backpointers-bfloat-failed
├── backpointers-formats
...

Directory structure, after:

/sys/kernel/debug/bcachefs/$FS_ID/btrees/
├── alloc
│   ├── bfloat-failed
│   ├── formats
│   └── keys
├── backpointers
│   ├── bfloat-failed
│   ├── formats
│   └── keys
...
Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e60aa472

03 Apr, 2024 13 commits

bcachefs: reconstruct_inode() · 09d4c2ac

Kent Overstreet authored Apr 01, 2024

If an inode is missing, but corresponding extents and dirent still
exist, it's well worth recreating it - this does so.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

09d4c2ac

bcachefs: Subvolume reconstruction · cc053290

Kent Overstreet authored Mar 31, 2024

We can now recreate missing subvolumes from dirents and/or inodes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

cc053290

bcachefs: Check for extents that point to same space · 4c02e63d

Kent Overstreet authored Mar 30, 2024

In backpointer repair, if we get a missing backpointer - but there's
already a backpointer that points to an existing extent - we've got
multiple extents that point to the same space and need to decide which
to keep.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

4c02e63d

bcachefs: Reconstruct missing snapshot nodes · a292be3b

Kent Overstreet authored Mar 27, 2024

When the snapshots btree is gone, we'll have to delete huge amounts of
data - unless we can reconstruct it by looking at the keys that refer to
it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a292be3b

bcachefs: Flag btrees with missing data · 55936afe

Kent Overstreet authored Mar 15, 2024

We need this to know when we should attempt to reconstruct the snapshots
btree.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

55936afe

bcachefs: Topology repair now uses nodes found by scanning to fill holes · 43f5ea46

Kent Overstreet authored Mar 16, 2024

With the new btree node scan code, we can now recover from corrupt btree
roots - simply create a new fake root at depth 1, and then insert all
the leaves we found.

If the root wasn't corrupt but there's corruption elsewhere in the
btree, we can fill in holes as needed with the newest version of a given
node(s) from the scan; we also check if a given btree node is older than
what we found from the scan.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

43f5ea46

bcachefs: Repair pass for scanning for btree nodes · 4409b808

Kent Overstreet authored Mar 11, 2024

If a btree root or interior btree node goes bad, we're going to lose a
lot of data, unless we can recover the nodes that it pointed to by
scanning.

Fortunately btree node headers are fully self describing, and
additionally the magic number is xored with the filesystem UUID, so we
can do so safely.

This implements the scanning - next patch will rework topology repair to
make use of the found nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

4409b808

bcachefs: Don't skip fake btree roots in fsck · b268aa4e

Kent Overstreet authored Mar 10, 2024

When a btree root is unreadable, we might still have keys from the
journal to walk and mark.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

b268aa4e

bcachefs: bch2_btree_root_alloc() -> bch2_btree_root_alloc_fake() · f2f61f41

Kent Overstreet authored Mar 14, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f2f61f41

bcachefs: Etyzinger cleanups · ca1e02f7

Kent Overstreet authored Mar 22, 2024

Pull out eytzinger.c and kill eytzinger_cmp_fn. We now provide
eytzinger0_sort and eytzinger0_sort_r, which use the standard cmp_func_t
and cmp_r_func_t callbacks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

ca1e02f7

bcachefs: bch2_shoot_down_journal_keys() · bdbf953b

Kent Overstreet authored Mar 19, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bdbf953b

bcachefs: Clear recovery_passes_required as they complete without errors · 27fcec6c

Kent Overstreet authored Mar 30, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

27fcec6c

bcachefs: ratelimit informational fsck errors · fa14b504

Kent Overstreet authored Apr 02, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

fa14b504