Commits · 64ee1431cc7d11e01a1007ead0afe737781cbbab · Kirill Smelkov / linux

25 Jun, 2024 3 commits

bcachefs: Discard, invalidate workers are now per device · 64ee1431

Kent Overstreet authored Jun 23, 2024

There's no reason for discards to be single threaded across all devices;
this will improve performance on multi device setups.

Additionally, making them per-device simplifies the refcounting on
bch_dev->io_ref; we now hold it for the duration that the discard path
is running, which fixes a race between the discard path and device
removal.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

64ee1431

bcachefs: Fix shift-out-of-bounds in bch2_blacklist_entries_gc · 472237b6

Pei Li authored Jun 25, 2024

This series fixes the shift-out-of-bounds issue in
bch2_blacklist_entries_gc().

Instead of passing 0 to eytzinger0_first() when iterating the entries,
we explicitly check 0 and initialize i to be 0.

syzbot has tested the proposed patch and the reproducer did not trigger
any issue:

Reported-and-tested-by: syzbot+835d255ad6bc7f29ee12@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=835d255ad6bc7f29ee12
Signed-off-by: Pei Li <peili.dev@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

472237b6

bcachefs: slab-use-after-free Read in bch2_sb_errors_from_cpu · 211c581d

Pei Li authored Jun 25, 2024

Acquire fsck_error_counts_lock before accessing the critical section
protected by this lock.

syzbot has tested the proposed patch and the reproducer did not trigger
any issue.

Reported-by: syzbot+a2bc0e838efd7663f4d9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a2bc0e838efd7663f4d9
Signed-off-by: Pei Li <peili.dev@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

211c581d

23 Jun, 2024 8 commits

bcachefs: Add missing bch2_journal_do_writes() call · 89d21b69

Kent Overstreet authored Jun 23, 2024

This fixes a rare deadlock when we're doing an emergency shutdown due to
failure to do a journal write.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

89d21b69

bcachefs: Fix null ptr deref in journal_pins_to_text() · d6b52f68

Kent Overstreet authored Jun 23, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

d6b52f68

bcachefs: Add missing recalc_capacity() call · 36da8e38

Kent Overstreet authored Jun 23, 2024

This fixes filesystem size not changing on device removal.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

36da8e38

bcachefs: Fix btree_trans list ordering · 1aaf5cb4

Kent Overstreet authored Jun 22, 2024

The debug code relies on btree_trans_list being ordered so that it can
resume on subsequent calls or lock restarts.

However, it was using trans->locking_wait.task.pid, which is incorrect
since btree_trans objects are cached and reused - typically by different
tasks.

Fix this by switching to pointer order, and also sort them lazily when
required - speeding up the btree_trans_get() fastpath.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1aaf5cb4

bcachefs: Fix race between trans_put() and btree_transactions_read() · de611ab6

Kent Overstreet authored Jun 22, 2024

debug.c was using closure_get() on a different thread's closure where
we don't know if the object being refcounted is alive.

We keep btree_trans objects on a list so they can be printed by debug
code, and because it is cost prohibitive to touch the btree_trans list
every time we allocate and free btree_trans objects, cached objects are
also on this list.

However, we do not want the debug code to see cached but not in use
btree_trans objects - critically because the btree_paths array will have
been freed (if it was reallocated).

closure_get() is also incorrect to use when that get may race with it
hitting zero, i.e. we must already have a ref on the object or know the
ref can't currently hit 0 for other reasons (as used in the cycle
detector).

To fix this, use the previously introduced closure_get_not_zero(),
closure_return_sync(), and closure_init_stack_release(); the debug code
now can only take a ref on a trans object if it's alive and in use.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

de611ab6

closures: closure_get_not_zero(), closure_return_sync() · 06efa5f3

Kent Overstreet authored Jun 22, 2024

Provide new primitives for solving a lifetime issue with bcachefs
btree_trans objects.

closure_return_sync(): like closure_sync(), wait synchronously for any
outstanding gets. Like closure_return(), the closure is considered
"finished" and the ref left at 0.

closure_get_not_zero(): get a ref on a closure if it's alive, i.e. the
ref is not zero.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

06efa5f3
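
The "get a ref only if it is not zero" idea described for closure_get_not_zero() can be sketched in userspace with C11 atomics. This is a minimal illustration, not the kernel implementation: `struct ref_obj`, `ref_get_not_zero()` and `ref_put()` are hypothetical names, and the real closure code keeps more state in its counter than a bare reference count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object; not the kernel's struct closure. */
struct ref_obj {
	atomic_uint refs;
};

/* Take a reference only if the count is currently nonzero. */
bool ref_get_not_zero(struct ref_obj *obj)
{
	unsigned int old;

	assert(obj != NULL);
	old = atomic_load(&obj->refs);
	do {
		if (old == 0)
			return false;	/* already finished: do not resurrect it */
		/* On failure, `old` is reloaded with the current value and we retry. */
	} while (!atomic_compare_exchange_weak(&obj->refs, &old, old + 1));

	return true;
}

/* Drop a reference; the caller must hold one. */
void ref_put(struct ref_obj *obj)
{
	unsigned int old = atomic_fetch_sub(&obj->refs, 1);

	assert(old > 0);
	(void)old;	/* silence unused-variable warnings when NDEBUG is set */
}
```

A plain increment would turn a count of 0 back into 1 and hand out a reference to an object that is already being torn down; the compare-and-exchange loop refuses instead.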

bcachefs: Make btree_deadlock_to_text() clearer · 18e92841

Kent Overstreet authored Jun 22, 2024

btree_deadlock_to_text() searches the list of btree transactions to find
a deadlock - when it finds one it's done; it's not like other *_read()
functions that print each object.

Factor out btree_deadlock_to_text() to make this clearer.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

18e92841

bcachefs: fix seqmutex_relock() · f44cc269

Kent Overstreet authored Jun 22, 2024

We were grabbing the sequence number before unlock incremented it - fix
this by moving the increment to seqmutex_lock() (so the seqmutex_relock()
failure path skips the mutex_trylock()), and returning the sequence
number from unlock(), to make the API simpler and safer.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f44cc269
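
The locking scheme this message describes can be modelled in userspace roughly as follows. This is a sketch under assumptions, not the bcachefs code: the `seqmutex_model_*` names are hypothetical, a pthread mutex stands in for the kernel mutex, and the sequence number is made atomic so the unlocked check is well defined in C11.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; names and layout are assumptions. */
struct seqmutex_model {
	pthread_mutex_t lock;
	atomic_uint seq;
};

void seqmutex_model_init(struct seqmutex_model *m)
{
	assert(m != NULL);
	pthread_mutex_init(&m->lock, NULL);
	atomic_init(&m->seq, 0);
}

/* The increment happens on lock, while the mutex is held. */
void seqmutex_model_lock(struct seqmutex_model *m)
{
	pthread_mutex_lock(&m->lock);
	atomic_fetch_add(&m->seq, 1);
}

/* Unlock returns the sequence number the caller held the lock under. */
unsigned int seqmutex_model_unlock(struct seqmutex_model *m)
{
	unsigned int seq = atomic_load(&m->seq);

	pthread_mutex_unlock(&m->lock);
	return seq;
}

/* Succeed only if nobody has taken the lock since `seq` was returned. */
bool seqmutex_model_relock(struct seqmutex_model *m, unsigned int seq)
{
	if (atomic_load(&m->seq) != seq)
		return false;	/* stale: skip the trylock entirely */
	if (pthread_mutex_trylock(&m->lock) != 0)
		return false;	/* currently held by someone else */
	if (atomic_load(&m->seq) != seq) {
		/* Someone locked and unlocked between the check and the trylock. */
		pthread_mutex_unlock(&m->lock);
		return false;
	}
	return true;
}
```

Because the caller gets the sequence number from unlock itself, there is no window in which it can sample a value that is about to change.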

22 Jun, 2024 1 commit

bcachefs: Fix freeing of error pointers · 9bd01500

Kent Overstreet authored Jun 22, 2024

This fixes incorrect/missing checking of strndup_user() returns.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9bd01500
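
For readers unfamiliar with the error-pointer convention, the class of bug named in this commit title can be illustrated with a userspace model. Everything below is an assumption made for illustration: `err_ptr()`, `is_err()` and `ptr_err()` imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers, and `dup_bounded()` is a hypothetical stand-in for a function that returns either an allocation or an encoded errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERRNO 4095

/* Userspace models of the kernel's error-pointer helpers. */
void *err_ptr(long err) { return (void *)(intptr_t)err; }
int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }
long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Hypothetical: copy at most max-1 bytes of src, or return an error pointer. */
char *dup_bounded(const char *src, size_t max)
{
	size_t len = 0;
	char *copy;

	assert(src != NULL);
	while (len < max && src[len] != '\0')
		len++;
	if (len == max)
		return err_ptr(-EINVAL);	/* no terminator within the bound */

	copy = malloc(len + 1);
	if (!copy)
		return err_ptr(-ENOMEM);
	memcpy(copy, src, len);
	copy[len] = '\0';
	return copy;
}

/* Free only real allocations; an error pointer must never reach free(). */
void free_unless_err(char *p)
{
	if (!is_err(p))
		free(p);
}
```

An error pointer is non-NULL, so a plain `if (p)` check is not enough before freeing or dereferencing it.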

21 Jun, 2024 7 commits

bcachefs: Move the ei_flags setting to after initialization · bd4da046

Youling Tang authored Jun 04, 2024

`inode->ei_flags` setting and cleaning should be done after initialization,
otherwise the operation is invalid.

Fixes: 9ca4853b ("bcachefs: Fix quota support for snapshots")
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

bd4da046

bcachefs: Fix a UAF after write_super() · 2fe79ce7

Kent Overstreet authored Jun 20, 2024

write_super() may reallocate the superblock buffer - but
bch_sb_field_ext was referencing it; don't use it after the write_super
call.

Reported-by: syzbot+8992fc10a192067b8d8a@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2fe79ce7

bcachefs: Use bch2_print_string_as_lines for long err · e6b3a655

Kent Overstreet authored Jun 20, 2024

printk strings get truncated to 1024 bytes; if we have a long error
message (journal debug info) we need to use a helper.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e6b3a655
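
The general technique of emitting a long message one line at a time, so that no single print call has to carry the whole string, can be sketched as below. This is not the bcachefs helper: `print_as_lines()` is a hypothetical userspace function that writes to a `FILE *` rather than to printk.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write `text` to `out` one line per call, splitting on '\n'.
 * Returns the number of lines written.
 */
int print_as_lines(FILE *out, const char *text)
{
	int lines = 0;

	assert(out != NULL && text != NULL);
	while (*text) {
		const char *nl = strchr(text, '\n');
		size_t len = nl ? (size_t)(nl - text) : strlen(text);

		fprintf(out, "%.*s\n", (int)len, text);
		lines++;

		text += len;
		if (nl)
			text++;	/* step over the newline itself */
	}
	return lines;
}
```

Each call then only has to fit one line of the message, which keeps individual records under whatever per-message limit the logging backend imposes.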

bcachefs: Fix I_NEW warning in race path in bch2_inode_insert() · dd908648

Kent Overstreet authored Jun 20, 2024

discard_new_inode() is the correct interface for tearing down an inode
that was fully created but not made visible to other threads, but it
expects I_NEW to be set, which we don't use.

Reported-by: https://github.com/koverstreet/bcachefs/issues/690
Fixes: bcachefs: Fix race path in bch2_inode_insert()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

dd908648

bcachefs: Replace bare EEXIST with private error codes · 50479406

Kent Overstreet authored May 26, 2024

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

50479406

bcachefs: Fix missing alloc_data_type_set() · f648b6c1

Kent Overstreet authored Jun 20, 2024

Incorrect bucket state transition in the discard path; when incrementing
a bucket's generation number that had already been discarded, we were
forgetting to check if it should be need_gc_gens, not free.

This was caught by the .invalid checks in the transaction commit path,
causing us to go emergency read only.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f648b6c1

closures: Change BUG_ON() to WARN_ON() · 339b84ab

Kent Overstreet authored Jun 20, 2024

If a BUG_ON() can be hit in the wild, it shouldn't be a BUG_ON()

For reference, this has popped up once in the CI, and we'll need more
info to debug it:

03240 ------------[ cut here ]------------
03240 kernel BUG at lib/closure.c:21!
03240 kernel BUG at lib/closure.c:21!
03240 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
03240 Modules linked in:
03240 CPU: 15 PID: 40534 Comm: kworker/u80:1 Not tainted 6.10.0-rc4-ktest-ga56da697 #25570
03240 Hardware name: linux,dummy-virt (DT)
03240 Workqueue: btree_update btree_interior_update_work
03240 pstate: 00001005 (nzcv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
03240 pc : closure_put+0x224/0x2a0
03240 lr : closure_put+0x24/0x2a0
03240 sp : ffff0000d12071c0
03240 x29: ffff0000d12071c0 x28: dfff800000000000 x27: ffff0000d1207360
03240 x26: 0000000000000040 x25: 0000000000000040 x24: 0000000000000040
03240 x23: ffff0000c1f20180 x22: 0000000000000000 x21: ffff0000c1f20168
03240 x20: 0000000040000000 x19: ffff0000c1f20140 x18: 0000000000000001
03240 x17: 0000000000003aa0 x16: 0000000000003ad0 x15: 1fffe0001c326974
03240 x14: 0000000000000a1e x13: 0000000000000000 x12: 1fffe000183e402d
03240 x11: ffff6000183e402d x10: dfff800000000000 x9 : ffff6000183e402e
03240 x8 : 0000000000000001 x7 : 00009fffe7c1bfd3 x6 : ffff0000c1f2016b
03240 x5 : ffff0000c1f20168 x4 : ffff6000183e402e x3 : ffff800081391954
03240 x2 : 0000000000000001 x1 : 0000000000000000 x0 : 00000000a8000000
03240 Call trace:
03240  closure_put+0x224/0x2a0
03240  bch2_check_for_deadlock+0x910/0x1028
03240  bch2_six_check_for_deadlock+0x1c/0x30
03240  six_lock_slowpath.isra.0+0x29c/0xed0
03240  six_lock_ip_waiter+0xa8/0xf8
03240  __bch2_btree_node_lock_write+0x14c/0x298
03240  bch2_trans_lock_write+0x6d4/0xb10
03240  __bch2_trans_commit+0x135c/0x5520
03240  btree_interior_update_work+0x1248/0x1c10
03240  process_scheduled_works+0x53c/0xd90
03240  worker_thread+0x370/0x8c8
03240  kthread+0x258/0x2e8
03240  ret_from_fork+0x10/0x20
03240 Code: aa1303e0 d63f0020 a94363f7 17ffff8c (d4210000)
03240 ---[ end trace 0000000000000000 ]---
03240 Kernel panic - not syncing: Oops - BUG: Fatal exception
03240 SMP: stopping secondary CPUs
03241 SMP: failed to stop secondary CPUs 13,15
03241 Kernel Offset: disabled
03241 CPU features: 0x00,00000003,80000008,4240500b
03241 Memory Limit: none
03241 ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]---
03246 ========= FAILED TIMEOUT copygc_torture_no_checksum in 7200s
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

339b84ab

20 Jun, 2024 2 commits

bcachefs: fix alignment of VMA for memory mapped files on THP · c6cab97c

Youling Tang authored Jun 20, 2024

With CONFIG_READ_ONLY_THP_FOR_FS, the Linux kernel supports using THPs
for read-only mmapped files, such as shared libraries. However, the
kernel makes no attempt to actually align those mappings on 2MB
boundaries, which makes it impossible to use those THPs most of the
time. This issue applies to general file mapping THP as well as
existing setups using CONFIG_READ_ONLY_THP_FOR_FS. This is easily
fixed by using thp_get_unmapped_area for the unmapped_area function
in bcachefs, which is what ext2, ext4, fuse, xfs and btrfs all use.

Similar to commit b0c58223 ("btrfs: fix alignment of VMA for
memory mapped files on THP").
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

c6cab97c

bcachefs: Fix safe errors by default · 33dfafa9

Kent Overstreet authored Jun 19, 2024

i.e. the start of automatic self healing:

If errors=continue or fix_safe, we now automatically fix simple errors
without user intervention.

New error action option: fix_safe

This replaces the existing errors=ro option, which gets a new slot, i.e.
existing errors=ro users now get errors=fix_safe.

This is currently only enabled for a limited set of errors - initially
just disk accounting; errors we would never not want to fix, and we
don't want to require user intervention (i.e. to make sure a bug report
gets filed).

Errors will still be counted in the superblock, so we (developers) will
still know they've been occurring if a bug report gets filed (as bug
reports typically include the errors superblock section).

Eventually we'll be enabling this for a much wider set of errors, after
we've done thorough error injection testing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

33dfafa9

19 Jun, 2024 13 commits

bcachefs: Fix bch2_trans_put() · a56da697

Kent Overstreet authored Jun 19, 2024

reference: https://github.com/koverstreet/bcachefs/issues/692

trans->ref is the reference used by the cycle detector, which walks
btree_trans objects of other threads to walk the graph of held locks and
issue wakeups when an abort is required.

We have to wait for the ref to go to 1 before freeing trans->paths or
clearing trans->locking_wait.task.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

a56da697

bcachefs: set_worker_desc() for delete_dead_snapshots · 0a2a507d

Kent Overstreet authored Jun 19, 2024

this is long running - help users see what's going on
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

0a2a507d

bcachefs: Fix bch2_sb_downgrade_update() · ddd118ab

Kent Overstreet authored Jun 17, 2024

Missing enum conversion
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

ddd118ab

bcachefs: Handle cached data LRU wraparound · 2e9940d4

Kent Overstreet authored Jun 17, 2024

We only have 48 bits for the LRU time field, which is insufficient to
prevent wraparound.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

2e9940d4

bcachefs: Guard against overflowing LRU_TIME_BITS · cff07e27

Kent Overstreet authored Jun 17, 2024

LRUs only have 48 bits for the time field (i.e. LRU order); thus we need
overflow checks and guards.

Reported-by: syzbot+df3bf3f088dcaa728857@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

cff07e27
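
A guard of the kind this message calls for might look like the following sketch. The 48 bit width comes from the message; the rest is assumed for illustration, in particular the layout (a 16 bit LRU id stored above the 48 bit time in one 64 bit word) and the helper names `lru_time_fits()` and `lru_pos_pack()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LRU_TIME_BITS	48
#define LRU_TIME_MAX	((UINT64_C(1) << LRU_TIME_BITS) - 1)

/* True if `time` can be stored in the 48 bit field without truncation. */
bool lru_time_fits(uint64_t time)
{
	return time <= LRU_TIME_MAX;
}

/*
 * Assumed layout: id in the top 16 bits, time in the low 48 bits.
 * Returns false instead of letting an oversized time spill into the id.
 */
bool lru_pos_pack(uint16_t lru_id, uint64_t time, uint64_t *out)
{
	assert(out != NULL);
	if (!lru_time_fits(time))
		return false;

	*out = ((uint64_t)lru_id << LRU_TIME_BITS) | time;
	return true;
}
```

Without the check, a time value of 2^48 or more would silently change which LRU the entry appears to belong to.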

bcachefs: delete_dead_snapshots() doesn't need to go RW · 1ba44217

Kent Overstreet authored Jun 17, 2024

We've been moving away from going RW lazily; if we want to go RW we do
that in set_may_go_rw(), and if we didn't go RW we don't need to delete
dead snapshots.

Reported-by: syzbot+4366624c0b5aac4906cf@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

1ba44217

bcachefs: Fix early init error path in journal code · dbf4d79b

Kent Overstreet authored Jun 17, 2024

We shouldn't be running the journal shutdown sequence if we never fully
initialized the journal.

Reported-by: syzbot+ffd2270f0bca3322ee00@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

dbf4d79b

bcachefs: Check for invalid btree IDs · 9e7cfb35

Kent Overstreet authored Jun 17, 2024

We can only handle btree IDs up to 62, since the btree id (plus the type
for interior btree nodes) has to fit into a 64 bit bitmask - check for
invalid ones to avoid invalid shifts later.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

9e7cfb35
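
The arithmetic behind the limit of 62 can be sketched as follows. The encoding is an assumption made for illustration (bit 0 for interior nodes, bit id + 1 for leaf nodes of a given btree), as are the names `btree_id_valid()`, `node_type_slot()` and `node_type_mask()`; only the limit and the 64 bit mask width come from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BTREE_ID_MAX	62	/* highest id whose slot still fits in 64 bits */

bool btree_id_valid(unsigned int id)
{
	return id <= BTREE_ID_MAX;
}

/* Assumed encoding: slot 0 is the interior node type, slot id + 1 is the leaf type. */
unsigned int node_type_slot(unsigned int id, bool interior)
{
	return interior ? 0 : id + 1;
}

/* Build the one-bit mask for a node type, rejecting ids that would shift out of range. */
bool node_type_mask(unsigned int id, bool interior, uint64_t *mask)
{
	assert(mask != NULL);
	if (!btree_id_valid(id))
		return false;

	/* The constant is 64 bits wide; shifting a 32 bit 1 by 32 or more is undefined. */
	*mask = UINT64_C(1) << node_type_slot(id, interior);
	return true;
}
```

Under this assumed encoding, id 62 lands on bit 63, the last bit of the mask, and any larger id is rejected before the shift happens.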

bcachefs: Fix btree ID bitmasks · e3fd3faa

Kent Overstreet authored Jun 17, 2024

these should be 64 bit bitmasks, not 32 bit.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

e3fd3faa

bcachefs: Fix shift overflow in read_one_super() · d4065456

Kent Overstreet authored Jun 17, 2024

Reported-by: syzbot+9f74cb4006b83e2a3df1@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

d4065456

bcachefs: Fix a locking bug in the do_discard_fast() path · 3727ca56

Kent Overstreet authored Jun 17, 2024

We can't discard a bucket while it's still open; this needs the
bucket_is_open_safe() version, which takes the open_buckets lock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

3727ca56

bcachefs: Fix array-index-out-of-bounds · d47df4f6

Kent Overstreet authored Jun 12, 2024

We use 0 size arrays as markers, but ubsan doesn't know that - cast them
to a pointer to fix the splat.

Also, make sure this code gets tested a bit more.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

d47df4f6

bcachefs: Fix initialization order for srcu barrier · f770a6e9

Kent Overstreet authored Jun 12, 2024

btree_iter_init() needs to happen before key_cache_init(), to initialize
btree_trans_barrier

Reported-by: syzbot+3cca837c2183f8f6fcaf@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

f770a6e9

16 Jun, 2024 6 commits

Linux 6.10-rc4 · 6ba59ff4

Linus Torvalds authored Jun 16, 2024

6ba59ff4

Merge tag 'parisc-for-6.10-rc4' of... · 6456c425

Linus Torvalds authored Jun 16, 2024

Merge tag 'parisc-for-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc fix from Helge Deller:
 "On parisc we have suffered since years from random segfaults which
  seem to have been triggered due to cache inconsistencies. Those
  segfaults happened more often on machines with PA8800 and PA8900 CPUs,
  which have much bigger caches than the earlier machines.

  Dave Anglin has worked over the last few weeks to fix this bug. His
  patch has been successfully tested by various people on various
  machines and with various kernels (6.6, 6.8 and 6.9), and the debian
  buildd servers haven't shown a single random segfault with this patch.

  Since the cache handling has been reworked, the patch is slightly
  bigger than I would like in this stage, but the greatly improved
  stability IMHO justifies the inclusion now"

* tag 'parisc-for-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Try to fix random segmentation faults in package builds

6456c425

Merge tag 'i2c-for-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 4301487e

Linus Torvalds authored Jun 16, 2024

Pull i2c fixes from Wolfram Sang:
 "Two fixes to correctly report i2c functionality, ensuring that
  I2C_FUNC_SLAVE is reported when a device operates solely as a slave
  interface"

* tag 'i2c-for-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: designware: Fix the functionality flags of the slave-only interface
  i2c: at91: Fix the functionality flags of the slave-only interface

4301487e

Merge tag 'usb-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · b5beaa44

Linus Torvalds authored Jun 16, 2024

Pull USB / Thunderbolt fixes from Greg KH:
 "Here are some small USB and Thunderbolt driver fixes for 6.10-rc4.
  Included in here are:

   - thunderbolt debugfs bugfix

   - USB typec bugfixes

   - kcov usb bugfix

   - xhci bugfixes

   - usb-storage bugfix

   - dt-bindings bugfix

   - cdc-wdm log message spam bugfix

  All of these, except for the last cdc-wdm log level change, have been
  in linux-next for a while with no reported problems. The cdc-wdm
  bugfix has been tested by syzbot and proved to fix the reported cpu
  lockup issues when the log is constantly spammed by a broken device"

* tag 'usb-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
  xhci: Handle TD clearing for multiple streams case
  xhci: Apply broken streams quirk to Etron EJ188 xHCI host
  xhci: Apply reset resume quirk to Etron EJ188 xHCI host
  xhci: Set correct transferred length for cancelled bulk transfers
  usb-storage: alauda: Check whether the media is initialized
  usb: typec: ucsi: Ack also failed Get Error commands
  kcov, usb: disable interrupts in kcov_remote_start_usb_softirq
  dt-bindings: usb: realtek,rts5411: Add missing "additionalProperties" on child nodes
  usb: typec: tcpm: Ignore received Hard Reset in TOGGLING state
  usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps
  USB: xen-hcd: Traverse host/ when CONFIG_USB_XEN_HCD is selected
  usb: typec: ucsi: glink: increase max ports for x1e80100
  Revert "usb: chipidea: move ci_ulpi_init after the phy initialization"
  thunderbolt: debugfs: Fix margin debugfs node creation condition

b5beaa44

Merge tag 'tty-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 6efc63a8

Linus Torvalds authored Jun 16, 2024

Pull tty/serial driver fixes from Greg KH:
 "Here are some small tty and serial driver fixes that resolve some
  reported problems. Included in here are:

   - n_tty lookahead buffer bugfix

   - WARN_ON() removal where it was not needed

   - 8250_dw driver bugfixes

   - 8250_pxa bugfix

   - sc16is7xx Kconfig fixes for reported build issues

  All of these have been in linux-next for over a week with no reported
  problems"

* tag 'tty-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: drop debugging WARN_ON_ONCE() from uart_write()
  serial: sc16is7xx: re-add Kconfig SPI or I2C dependency
  serial: sc16is7xx: rename Kconfig CONFIG_SERIAL_SC16IS7XX_CORE
  serial: port: Don't block system suspend even if bytes are left to xmit
  serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level
  serial: 8250_dw: Revert "Move definitions to the shared header"
  serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw
  tty: n_tty: Fix buffer offsets when lookahead is used

6efc63a8

Merge tag 'staging-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · d3e6dc4f

Linus Torvalds authored Jun 16, 2024

Pull staging driver fix from Greg KH:
 "Here is a single staging driver fix, for the vc04 driver. It resolves
  a reported problem that showed up in the merge window set of changes.

  It's been in linux-next for over a week with no reported problems"

* tag 'staging-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: vchiq_debugfs: Fix NPD in vchiq_dump_state

d3e6dc4f