Commits · f943ebe2ec26272d71f9c7643ec667c616419bb1 · Kirill Smelkov / linux

06 Mar, 2024 6 commits

RISC-V: KVM: Allow Ztso extension for Guest/VM · f943ebe2

Anup Patel authored Feb 13, 2024

Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Ztso extension for Guest/VM.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

RISC-V: KVM: Forward SEED CSR access to user space · d808f0b1

Anup Patel authored Feb 13, 2024

The SEED CSR access from VS/VU mode (guest) will always trap to
HS-mode (KVM) when Zkr extension is available to the Guest/VM.

Forward this CSR access to KVM user space so that it can be
emulated based on the method chosen by VMM.

Fixes: f370b4e6 ("RISC-V: KVM: Allow scalar crypto extensions for Guest/VM")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

KVM: riscv: selftests: Add sstc timer test · d0b94bcb

Haibo Xu authored Jan 22, 2024

Add a KVM selftest to validate the Sstc timer functionality.
The test is ported from the arm64 arch timer test.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

KVM: riscv: selftests: Change vcpu_has_ext to a common function · 812806bd

Haibo Xu authored Jan 22, 2024

Move vcpu_has_ext to processor.c and rename it to __vcpu_has_ext
so that other test cases can use it for vCPU extension checks.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

KVM: riscv: selftests: Add guest helper to get vcpu id · 1e979288

Haibo Xu authored Jan 22, 2024

Add guest_get_vcpuid() helper to simplify access to per-cpu
private data. The sscratch CSR is used to store the vcpu id.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

KVM: riscv: selftests: Add exception handling support · 38f680c2

Haibo Xu authored Jan 22, 2024

Add the infrastructure for guest exception handling in riscv selftests.
Customized handlers can be enabled by vm_install_exception_handler(vector)
or vm_install_interrupt_handler().

The code is inspired by that of x86/arm64.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

26 Feb, 2024 8 commits

KVM: riscv: selftests: Switch to use macro from csr.h · feb2c8fa

Haibo Xu authored Jan 22, 2024

Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

tools: riscv: Add header file vdso/processor.h · 1d50c772

Haibo Xu authored Jan 22, 2024

Borrow the cpu_relax() definitions from kernel's
arch/riscv/include/asm/vdso/processor.h to tools/ for riscv.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

tools: riscv: Add header file csr.h · a69459d5

Haibo Xu authored Jan 22, 2024

Borrow the csr definitions and operations from kernel's
arch/riscv/include/asm/csr.h to tools/ for riscv.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

KVM: selftests: Add CONFIG_64BIT definition for the build · b4b12469

Haibo Xu authored Jan 22, 2024

Since only 64-bit KVM selftests are supported on all architectures,
add the CONFIG_64BIT definition in kvm/Makefile to ensure that only
64-bit definitions are available in the corresponding included files.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

KVM: arm64: selftests: Split arch_timer test code · c20dd9e0

Haibo Xu authored Jan 22, 2024

Split the arch-neutral test code out of aarch64/arch_timer.c
and put it into a common arch_timer.c. This is in preparation
for sharing the timer test code with riscv.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

KVM: arm64: selftests: Enable tuning of error margin in arch_timer test · d1dafd06

Haibo Xu authored Jan 22, 2024

Intermittent failures occur when stressing the
arch-timer test in a QEMU VM:

 Guest assert failed,  vcpu 0; stage; 4; iter: 3
 ==== Test Assertion Failure ====
   aarch64/arch_timer.c:196: config_iter + 1 == irq_iter
   pid=4048 tid=4049 errno=4 - Interrupted system call
      1  0x000000000040253b: test_vcpu_run at arch_timer.c:248
      2  0x0000ffffb60dd5c7: ?? ??:0
      3  0x0000ffffb6145d1b: ?? ??:0
   0x3 != 0x2 (config_iter + 1 != irq_iter)

Further testing and debugging show that the time an interrupt takes
to arrive fluctuates randomly and widely, especially when
testing in a virtual environment.

To alleviate this issue, expose the timeout value as a user-configurable
option and print a hint to increase the value
when the failure is hit.
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Anup Patel <anup@brainfault.org>
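
The idea in the commit above (a user-tunable error margin instead of a fixed timeout) can be illustrated with a minimal user-space sketch. It is not the selftest's code: the names sketch_parse_margin_us() and sketch_within_margin(), the default value, and the microsecond unit are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical default margin in microseconds; not taken from the patch. */
#define SKETCH_DEFAULT_MARGIN_US 100u

/*
 * Parse a user-supplied margin string (decimal digits only).
 * Returns 0 and stores the value on success; returns -1 on malformed or
 * out-of-range input and leaves *out unchanged.
 */
static int sketch_parse_margin_us(const char *arg, unsigned int *out)
{
    char *end;
    unsigned long v;

    /* Require a leading digit so signs and whitespace are rejected. */
    if (arg == NULL || arg[0] < '0' || arg[0] > '9')
        return -1;

    errno = 0;
    v = strtoul(arg, &end, 10);
    if (errno != 0 || *end != '\0' || v > 0xffffffffUL)
        return -1;

    *out = (unsigned int)v;
    return 0;
}

/*
 * An interrupt counts as "on time" if it arrived within the programmed
 * period plus the configured margin.
 */
static bool sketch_within_margin(unsigned long waited_us,
                                 unsigned long period_us,
                                 unsigned int margin_us)
{
    return waited_us <= period_us + margin_us;
}
```

A test harness built this way would start from SKETCH_DEFAULT_MARGIN_US, overwrite it when the user passes a value, and suggest a larger margin when sketch_within_margin() fails.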

KVM: arm64: selftests: Data type cleanup for arch_timer test · f0617e4a

Haibo Xu authored Jan 22, 2024

Change signed types to unsigned in the test_args struct for fields
that only make sense as unsigned values.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Anup Patel <anup@brainfault.org>

selftests/kvm: Fix issues with $(SPLIT_TESTS) · 2c5af1c8

Paolo Bonzini authored Jan 22, 2024

The introduction of $(SPLIT_TESTS) also introduced a warning when
building selftests on architectures that include get-reg-lists:

    make: Entering directory '/root/kvm/tools/testing/selftests/kvm'
    Makefile:272: warning: overriding recipe for target '/root/kvm/tools/testing/selftests/kvm/get-reg-list'
    Makefile:267: warning: ignoring old recipe for target '/root/kvm/tools/testing/selftests/kvm/get-reg-list'
    make: Leaving directory '/root/kvm/tools/testing/selftests/kvm'

In addition, the rule for $(SPLIT_TESTS_TARGETS) includes _all_
the $(SPLIT_TESTS_OBJS), which only works because there is just one.
So fix both by adjusting the rules:

- remove $(SPLIT_TESTS_TARGETS) from the $(TEST_GEN_PROGS) rules,
  and rename it to $(SPLIT_TEST_GEN_PROGS)

- fix $(SPLIT_TESTS_OBJS) so that it plays well with $(OUTPUT),
  rename it to $(SPLIT_TEST_GEN_OBJ), and list the object file
  explicitly in the $(SPLIT_TEST_GEN_PROGS) link rule

Fixes: 17da79e0 ("KVM: arm64: selftests: Split get-reg-list test code", 2023-08-09)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Anup Patel <anup@brainfault.org>

25 Feb, 2024 26 commits

Linux 6.8-rc6 · d206a76d
Linus Torvalds authored Feb 25, 2024

Merge tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs · e231dbd4

Linus Torvalds authored Feb 25, 2024

Pull bcachefs fixes from Kent Overstreet:
 "Some more mostly boring fixes, but some not

  User reported ones:

   - the BTREE_ITER_FILTER_SNAPSHOTS one fixes a really nasty
     performance bug; user reported an untar initially taking two
     seconds and then ~2 minutes

   - kill a __GFP_NOFAIL in the buffered read path; this was a leftover
     from the trickier fix to kill __GFP_NOFAIL in readahead, where we
     can't return errors (and have to silently truncate the read
     ourselves).

     bcachefs can't use GFP_NOFAIL for folio state unlike iomap based
     filesystems because our folio state is just barely too big, 2MB
     hugepages cause us to exceed the 2 page threshold for GFP_NOFAIL.

     additionally, the flags argument was just buggy, we weren't
     supplying GFP_KERNEL previously (!)"

* tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: fix bch2_save_backtrace()
  bcachefs: Fix check_snapshot() memcpy
  bcachefs: Fix bch2_journal_flush_device_pins()
  bcachefs: fix iov_iter count underflow on sub-block dio read
  bcachefs: Fix BTREE_ITER_FILTER_SNAPSHOTS on inodes btree
  bcachefs: Kill __GFP_NOFAIL in buffered read path
  bcachefs: fix backpointer_to_text() when dev does not exist

bcachefs: fix bch2_save_backtrace() · 5197728f

Kent Overstreet authored Feb 25, 2024

Missed a call in the previous fix.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

Merge tag 'docs-6.8-fixes3' of git://git.lwn.net/linux · 70ff1fe6

Linus Torvalds authored Feb 25, 2024

Pull two documentation build fixes from Jonathan Corbet:

 - The XFS online fsck documentation uses incredibly deep subsection
   and list nesting; that broke the PDF docs build. Tweak a
   parameter to tell LaTeX to allow the deeper nesting.

 - Fix a 6.8 PDF-build regression

* tag 'docs-6.8-fixes3' of git://git.lwn.net/linux:
  docs: translations: use attribute to store current language
  docs: Instruct LaTeX to cope with deeper nesting

Merge tag 'usb-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · c46ac50e

Linus Torvalds authored Feb 25, 2024

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for 6.8-rc6 to resolve some reported
  problems. These include:

   - regression fixes with typec tcpm code as reported by many

   - cdnsp and cdns3 driver fixes

   - usb role setting code bugfixes

   - build fix for uhci driver

   - ncm gadget driver bugfix

   - MAINTAINERS entry update

  All of these have been in linux-next all week with no reported issues
  and there is at least one fix in here that is in Thorsten's regression
  list that is being tracked"

* tag 'usb-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: typec: tpcm: Fix issues with power being removed during reset
  MAINTAINERS: Drop myself as maintainer of TYPEC port controller drivers
  usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs
  Revert "usb: typec: tcpm: reset counter when enter into unattached state after try role"
  usb: gadget: omap_udc: fix USB gadget regression on Palm TE
  usb: dwc3: gadget: Don't disconnect if not started
  usb: cdns3: fix memory double free when handle zero packet
  usb: cdns3: fixed memory use after free at cdns3_gadget_ep_disable()
  usb: roles: don't get/set_role() when usb_role_switch is unregistered
  usb: roles: fix NULL pointer issue when put module's reference
  usb: cdnsp: fixed issue with incorrect detecting CDNSP family controllers
  usb: cdnsp: blocked some cdns3 specific code
  usb: uhci-grlib: Explicitly include linux/platform_device.h

Merge tag 'tty-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 1e592e95

Linus Torvalds authored Feb 25, 2024

Pull tty/serial driver fixes from Greg KH:
 "Here are three small serial/tty driver fixes for 6.8-rc6 that resolve
  the following reported errors:

   - riscv hvc console driver fix that was reported by many

   - amba-pl011 serial driver fix for RS485 mode

   - stm32 serial driver fix for RS485 mode

  All of these have been in linux-next all week with no reported
  problems"

* tag 'tty-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: amba-pl011: Fix DMA transmission in RS485 mode
  serial: stm32: do not always set SER_RS485_RX_DURING_TX if RS485 is enabled
  tty: hvc: Don't enable the RISC-V SBI console by default

Merge tag 'x86_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1eee4ef3

Linus Torvalds authored Feb 25, 2024

Pull x86 fixes from Borislav Petkov:

 - Make sure clearing CPU buffers using VERW happens at the latest
   possible point in the return-to-userspace path, otherwise memory
   accesses after the VERW execution could cause data to land in CPU
   buffers again

* tag 'x86_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  KVM/VMX: Move VERW closer to VMentry for MDS mitigation
  KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
  x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
  x86/entry_32: Add VERW just before userspace transition
  x86/entry_64: Add VERW just before userspace transition
  x86/bugs: Add asm helpers for executing VERW

Merge tag 'irq_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8c46ed37

Linus Torvalds authored Feb 25, 2024

Pull irq fixes from Borislav Petkov:

 - Make sure GICv4 always gets initialized to prevent a kexec-ed kernel
   from silently failing to set it up

 - Do not call bus_get_dev_root() for the mbigen irqchip as it always
   returns NULL - use NULL directly

 - Fix hardware interrupt number truncation when assigning MSI
   interrupts

 - Correct sending end-of-interrupt messages to disabled interrupt
   lines on RISC-V PLIC

* tag 'irq_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Do not assume vPE tables are preallocated
  irqchip/mbigen: Don't use bus_get_dev_root() to find the parent
  PCI/MSI: Prevent MSI hardware interrupt number truncation
  irqchip/sifive-plic: Enable interrupt if needed before EOI

Merge tag 'erofs-for-6.8-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 4ca0d989

Linus Torvalds authored Feb 25, 2024

Pull erofs fix from Gao Xiang:

 - Fix page refcount leak when looking up specific inodes
   introduced by metabuf reworking

* tag 'erofs-for-6.8-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix refcount on the metabuf used for inode lookup

Merge tag 'pull-fixes.pathwalk-rcu-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 66a97c2e

Linus Torvalds authored Feb 25, 2024

Pull RCU pathwalk fixes from Al Viro:
 "We still have some races in filesystem methods when exposed to RCU
  pathwalk. This series is a result of code audit (the second round of
  it) and it should deal with most of that stuff.

  Still pending: ntfs3 ->d_hash()/->d_compare() and ceph_d_revalidate().
  Up to maintainers (a note for NTFS folks - when documentation says
  that a method may not block, it *does* imply that blocking allocations
  are to be avoided. Really)"

[ More explanations for people who aren't familiar with the vagaries of
  RCU path walking: most of it is hidden from filesystems, but if a
  filesystem actively participates in the low-level path walking it
  needs to make sure the fields involved in that walk are RCU-safe.

  That "actively participate in low-level path walking" includes things
  like having its own ->d_hash()/->d_compare() routines, or by having
  its own directory permission function that doesn't just use the common
  helpers.  Having a ->d_revalidate() function will also have this issue.

  Note that instead of making everything RCU safe you can also choose to
  abort the RCU pathwalk if your operation cannot be done safely under
  RCU, but that obviously comes with a performance penalty. One common
  pattern is to allow the simple cases under RCU, and abort only if you
  need to do something more complicated.

  So not everything needs to be RCU-safe, and things like the inode etc
  that the VFS itself maintains obviously already are. But these fixes
  tend to be about properly RCU-delaying things like ->s_fs_info that
  are maintained by the filesystem and that got potentially released too
  early.   - Linus ]
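
The pattern described in the note above (handle the simple cases under RCU, abort the RCU walk only when something more complicated is needed) can be sketched in plain user-space C. This is an illustration, not kernel code: struct sketch_dentry, SKETCH_LOOKUP_RCU and sketch_d_revalidate() are hypothetical stand-ins, and the only convention borrowed from the kernel is returning -ECHILD to ask the caller to retry outside RCU mode.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the kernel's LOOKUP_RCU flag; the value is illustrative. */
#define SKETCH_LOOKUP_RCU 0x1u

/* Hypothetical per-dentry state for the sketch. */
struct sketch_dentry {
    bool cached_valid;  /* answer that is available without blocking */
    bool needs_server;  /* true if validation needs a blocking round trip */
};

/*
 * Returns 1 if the dentry is valid, 0 if it is not, and -ECHILD when the
 * question cannot be answered without blocking while in RCU mode.
 */
static int sketch_d_revalidate(const struct sketch_dentry *d,
                               unsigned int flags)
{
    /* Simple case: the cached answer is enough, so RCU mode is fine. */
    if (!d->needs_server)
        return d->cached_valid ? 1 : 0;

    /* Complicated case in RCU mode: bail out, caller retries non-RCU. */
    if (flags & SKETCH_LOOKUP_RCU)
        return -ECHILD;

    /*
     * Non-RCU mode: blocking would be allowed here. The sketch simply
     * pretends the server confirmed the dentry.
     */
    return 1;
}
```

The performance penalty mentioned in the note corresponds to the -ECHILD branch: every such return forces the walk to be redone in the slower reference-taking mode.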

* tag 'pull-fixes.pathwalk-rcu-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ext4_get_link(): fix breakage in RCU mode
  cifs_get_link(): bail out in unsafe case
  fuse: fix UAF in rcu pathwalks
  procfs: make freeing proc_fs_info rcu-delayed
  procfs: move dropping pde and pid from ->evict_inode() to ->free_inode()
  nfs: fix UAF on pathwalk running into umount
  nfs: make nfs_set_verifier() safe for use in RCU pathwalk
  afs: fix __afs_break_callback() / afs_drop_open_mmap() race
  hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info
  exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper
  affs: free affs_sb_info with kfree_rcu()
  rcu pathwalk: prevent bogus hard errors from may_lookup()
  fs/super.c: don't drop ->s_user_ns until we free struct super_block itself

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 9b243492

Linus Torvalds authored Feb 25, 2024

Pull vfs fixes from Al Viro:
 "A couple of fixes - revert of regression from this cycle and a fix for
  erofs failure exit breakage (had been there since way back)"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  erofs: fix handling kern_mount() failure
  Revert "get rid of DCACHE_GENOCIDE"

ext4_get_link(): fix breakage in RCU mode · 9fa8e282

Al Viro authored Feb 03, 2024

1) errors from ext4_getblk() should not be propagated to caller
unless we are really sure that we would've gotten the same error
in non-RCU pathwalk.
2) we leak buffer_heads if ext4_getblk() is successful, but bh is
not uptodate.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cifs_get_link(): bail out in unsafe case · 0511fdb4

Al Viro authored Sep 19, 2023

->d_revalidate() bails out there, anyway.  It's not enough
to prevent getting into ->get_link() in RCU mode, but that
could happen only in a very contrived setup.  Not worth
trying to do anything fancy here unless ->d_revalidate()
stops kicking out of RCU mode at least in some cases.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fuse: fix UAF in rcu pathwalks · 053fc4f7

Al Viro authored Sep 28, 2023

->permission(), ->get_link() and ->inode_get_acl() might dereference
->s_fs_info (and, in case of ->permission(), ->s_fs_info->fc->user_ns
as well) when called from rcu pathwalk.

Freeing ->s_fs_info->fc is rcu-delayed; we need to make freeing ->s_fs_info
and dropping ->user_ns rcu-delayed too.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

procfs: make freeing proc_fs_info rcu-delayed · e31f0a57

Al Viro authored Sep 20, 2023

makes proc_pid_ns() safe from rcu pathwalk (put_pid_ns()
is still synchronous, but that's not a problem - it does
rcu-delay everything that needs to be)
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

procfs: move dropping pde and pid from ->evict_inode() to ->free_inode() · 47458802

Al Viro authored Sep 19, 2023

that keeps both around until struct inode is freed, making access
to them safe from rcu-pathwalk
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

nfs: fix UAF on pathwalk running into umount · c1b967d0

Al Viro authored Sep 27, 2023

NFS ->d_revalidate(), ->permission() and ->get_link() need to access
some parts of nfs_server when called in RCU mode:
	server->flags
	server->caps
	*(server->io_stats)
and, worst of all, call
	server->nfs_client->rpc_ops->have_delegation
(the last one - as NFS_PROTO(inode)->have_delegation()).  We really
don't want to RCU-delay the entire nfs_free_server() (it would have
to be done with schedule_work() from RCU callback, since it can't
be made to run from interrupt context), but actual freeing of
nfs_server and ->io_stats can be done via call_rcu() just fine.
nfs_client part is handled simply by making nfs_free_client() use
kfree_rcu().
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

nfs: make nfs_set_verifier() safe for use in RCU pathwalk · 10a973fc

Al Viro authored Sep 27, 2023

nfs_set_verifier() relies upon dentry being pinned; if that's
the case, grabbing ->d_lock stabilizes ->d_parent and guarantees
that ->d_parent points to a positive dentry.  For something
we'd run into in RCU mode that is *not* true - dentry might've
been through dentry_kill() just as we grabbed ->d_lock, with
its parent going through the same just as we get into
nfs_set_verifier_locked().  It might get to detaching inode
(and zeroing ->d_inode) before nfs_set_verifier_locked() gets
to fetching that; we get an oops as the result.

That can happen in nfs{,4} ->d_revalidate(); the call chain in
question is nfs_set_verifier_locked() <- nfs_set_verifier() <-
nfs_lookup_revalidate_delegated() <- nfs{,4}_do_lookup_revalidate().
We have checked that the parent had been positive, but that's
done before we get to nfs_set_verifier() and it's possible for
memory pressure to pick our dentry as eviction candidate by that
time.  If that happens, back-to-back attempts to kill dentry and
its parent are quite normal.  Sure, in case of eviction we'll
fail the ->d_seq check in the caller, but we need to survive
until we return there...
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

afs: fix __afs_break_callback() / afs_drop_open_mmap() race · 275655d3

Al Viro authored Sep 29, 2023

In __afs_break_callback() we might check ->cb_nr_mmap and if it's non-zero
do queue_work(&vnode->cb_work).  In afs_drop_open_mmap() we decrement
->cb_nr_mmap and do flush_work(&vnode->cb_work) if it reaches zero.

The trouble is, there's nothing to prevent __afs_break_callback() from
seeing ->cb_nr_mmap before the decrement and doing queue_work() after both
the decrement and flush_work().  If that happens, we might be in trouble -
vnode might get freed before the queued work runs.

__afs_break_callback() is always done under ->cb_lock, so let's make
sure that ->cb_nr_mmap can change from non-zero to zero only while holding
->cb_lock (the spinlock component of it - it's a seqlock and we don't
need to mess with the counter).
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
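
The locking rule in the commit above can be sketched in user-space C11: decrements that cannot reach zero stay lock-free, while the final 1 -> 0 transition happens under the same lock the checking side holds. This is a sketch under stated assumptions, not the AFS patch: struct sketch_vnode, the atomic_flag spinlock, and the work_queued flag (standing in for queue_work()) are all hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_vnode {
    atomic_int nr_mmap;   /* stands in for ->cb_nr_mmap */
    atomic_flag cb_lock;  /* stands in for the spinlock part of ->cb_lock */
    bool work_queued;     /* stands in for queue_work() having been called */
};

static void sketch_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void sketch_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Decrement unless the value is exactly 1; true if a decrement happened. */
static bool sketch_dec_unless_one(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 1) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old - 1))
            return true;
    }
    return false;
}

/* Returns true if this call dropped the count to zero. */
static bool sketch_drop_open_mmap(struct sketch_vnode *vn)
{
    bool last;

    if (sketch_dec_unless_one(&vn->nr_mmap))
        return false; /* fast path: the count did not reach zero */

    sketch_lock(&vn->cb_lock);
    last = (atomic_fetch_sub(&vn->nr_mmap, 1) == 1);
    sketch_unlock(&vn->cb_lock);
    /* A real caller would flush pending work here when last is true. */
    return last;
}

/* The check and the queueing happen under the lock, as one step. */
static void sketch_break_callback(struct sketch_vnode *vn)
{
    sketch_lock(&vn->cb_lock);
    if (atomic_load(&vn->nr_mmap) != 0)
        vn->work_queued = true;
    sketch_unlock(&vn->cb_lock);
}
```

With this arrangement the checking side either queues work before the count hits zero, so the later flush sees it, or observes zero and queues nothing.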

hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info · af072cf6

Al Viro authored Sep 19, 2023

->d_hash() and ->d_compare() use those, so we need to delay freeing
them.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper · a13d1a4d

Al Viro authored Sep 19, 2023

That stuff can be accessed by ->d_hash()/->d_compare(); as it is, we have
a hard-to-hit UAF if rcu pathwalk manages to get into ->d_hash() on a filesystem
that is in process of getting shut down.

Besides, having nls and upcase table cleanup moved from ->put_super() towards
the place where sbi is freed makes for simpler failure exits.
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

affs: free affs_sb_info with kfree_rcu() · 529f89a9

Al Viro authored Sep 19, 2023

one of the flags in it is used by ->d_hash()/->d_compare()
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

rcu pathwalk: prevent bogus hard errors from may_lookup() · cdb67fde

Al Viro authored Sep 29, 2023

If lazy call of ->permission() returns a hard error, check that
try_to_unlazy() succeeds before returning it.  That both makes
life easier for ->permission() instances and closes the race
in ENOTDIR handling - it is possible that positive d_can_lookup()
seen in link_path_walk() applies to the state *after* unlink() +
mkdir(), while nd->inode matches the state prior to that.

Normally seeing e.g. EACCES from permission check in rcu pathwalk
means that with some timings non-rcu pathwalk would've run into
the same; however, running into a non-executable regular file
in the middle of a pathname would not get to permission check -
it would fail with ENOTDIR instead.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs/super.c: don't drop ->s_user_ns until we free struct super_block itself · 583340de

Al Viro authored Feb 01, 2024

Avoids fun races in RCU pathwalk...  Same goes for freeing LSM shite
hanging off super_block's arse.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

bcachefs: Fix check_snapshot() memcpy · c4333eb5

Kent Overstreet authored Feb 24, 2024

check_snapshot() copies the bch_snapshot to a temporary to easily handle
older versions that don't have all the fields of the current version,
but it lacked a min() to correctly handle keys newer and larger than the
current version.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
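
The role of the missing min() can be shown with a small stand-alone sketch. It is not the bcachefs code: struct sketch_snapshot, its fields, and sketch_load_snapshot() are hypothetical, and the "key" is just an array of integers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "current version" layout; field names are illustrative. */
struct sketch_snapshot {
    unsigned int flags;
    unsigned int parent;
    unsigned int depth;
};

#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Copy an on-disk value of key_bytes bytes into a temporary of the
 * current layout. Older, shorter values leave the trailing fields zeroed;
 * newer, longer values are clamped so the copy cannot overrun *dst.
 */
static void sketch_load_snapshot(struct sketch_snapshot *dst,
                                 const void *key_val, size_t key_bytes)
{
    memset(dst, 0, sizeof(*dst));
    memcpy(dst, key_val, SKETCH_MIN(sizeof(*dst), key_bytes));
}
```

Without the clamp, a value written by a newer version with extra fields would make memcpy() write past the end of the temporary.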

bcachefs: Fix bch2_journal_flush_device_pins() · 097471f9

Kent Overstreet authored Feb 17, 2024

If a journal write errored, the list of devices it was written to could
be empty - we're not supposed to mark an empty replicas list.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
