- 20 Feb, 2024 10 commits
-
-
Paul Durrant authored
A subsequent patch will allow shared_info to be initialized using either a GPA or a user-space (i.e. VMM) HVA. To make that patch cleaner, separate the initialization of the shared_info content from the activation of the pfncache. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-11-paul@xen.orgSigned-off-by: Sean Christopherson <seanjc@google.com>
-
Paul Durrant authored
Some pfncache pages may actually be overlays on guest memory that have a fixed HVA within the VMM. It's pointless to invalidate such cached mappings if the overlay is moved so allow a cache to be activated directly with the HVA to cater for such cases. A subsequent patch will make use of this facility. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-10-paul@xen.orgSigned-off-by: Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Rename kvm_is_error_gpa() to kvm_is_gpa_in_memslot() and invert the polarity accordingly in order to (a) free up kvm_is_error_gpa() to match with kvm_is_error_{hva,page}(), and (b) to make it more obvious that the helper is doing a memslot lookup, i.e. not simply checking for INVALID_GPA. No functional change intended. Link: https://lore.kernel.org/r/20240215152916.1158-9-paul@xen.orgSigned-off-by: Sean Christopherson <seanjc@google.com>
-
Paul Durrant authored
Currently the pfncache page offset is sometimes determined using the gpa and sometimes the khva, whilst the uhva is always page-aligned. After a subsequent patch is applied the gpa will not always be valid so adjust the code to include the page offset in the uhva and use it consistently as the source of truth. Also, where a page-aligned address is required, use PAGE_ALIGN_DOWN() for clarity. No functional change intended. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-8-paul@xen.orgSigned-off-by: Sean Christopherson <seanjc@google.com>
-
Paul Durrant authored
Some code in pfncache uses offset_in_page() but in other places it is open- coded. Use offset_in_page() consistently everywhere. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-7-paul@xen.orgSigned-off-by: Sean Christopherson <seanjc@google.com>
-
Paul Durrant authored
As noted in [1] the KVM_GUEST_USES_PFN usage flag is never set by any callers of kvm_gpc_init(), and for good reason: the implementation is incomplete/broken. And it's not clear that there will ever be a user of KVM_GUEST_USES_PFN, as coordinating vCPUs with mmu_notifier events is non-trivial. Remove KVM_GUEST_USES_PFN and all related code, e.g. dropping KVM_GUEST_USES_PFN also makes the 'vcpu' argument redundant, to avoid having to reason about broken code as __kvm_gpc_refresh() evolves. Moreover, all existing callers specify KVM_HOST_USES_PFN so the usage check in hva_to_pfn_retry() and hence the 'usage' argument to kvm_gpc_init() are also redundant. [1] https://lore.kernel.org/all/ZQiR8IpqOZrOpzHC@google.comSigned-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-6-paul@xen.org [sean: explicitly call out that guest usage is incomplete] Signed-off-by: Sean Christopherson <seanjc@google.com>
-
Paul Durrant authored
At the moment pages are marked dirty by open-coded calls to mark_page_dirty_in_slot(), directly deferefencing the gpa and memslot from the cache. After a subsequent patch these may not always be set so add a helper now so that caller will protected from the need to know about this detail. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-5-paul@xen.org [sean: decrease indentation, use gpa_to_gfn()] Signed-off-by: Sean Christopherson <seanjc@google.com>
-
Paul Durrant authored
Sampling gpa and memslot from an unlocked pfncache may yield inconsistent values so, since there is no problem with calling mark_page_dirty_in_slot() with the pfncache lock held, relocate the calls in kvm_xen_update_runstate_guest() and kvm_xen_inject_pending_events() accordingly. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-4-paul@xen.orgSigned-off-by: Sean Christopherson <seanjc@google.com>
-
Paul Durrant authored
There is no need for the existing kvm_gpc_XXX() functions to be exported. Clean up now before additional functions are added in subsequent patches. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-3-paul@xen.orgSigned-off-by: Sean Christopherson <seanjc@google.com>
-
Paul Durrant authored
There is a pfncache unmap helper but mapping is open-coded. Arguably this is fine because mapping is done in only one place, hva_to_pfn_retry(), but adding the helper does make that function more readable. No functional change intended. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20240215152916.1158-2-paul@xen.orgSigned-off-by: Sean Christopherson <seanjc@google.com>
-
- 08 Feb, 2024 13 commits
-
-
Paolo Bonzini authored
KVM_CAP_IRQ_ROUTING is always defined, so there is no need to check if it is. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Since all architectures (for historical reasons) have to define struct kvm_guest_debug_arch, and since userspace has to check KVM_CHECK_EXTENSION(KVM_CAP_SET_GUEST_DEBUG) anyway, there is no advantage in masking the capability #define itself. Remove the #define __KVM_HAVE_GUEST_DEBUG from architecture-specific headers. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
KVM uses __KVM_HAVE_* symbols in the architecture-dependent uapi/asm/kvm.h to mask unused definitions in include/uapi/linux/kvm.h. __KVM_HAVE_READONLY_MEM however was nothing but a misguided attempt to define KVM_CAP_READONLY_MEM only on architectures where KVM_CHECK_EXTENSION(KVM_CAP_READONLY_MEM) could possibly return nonzero. This however does not make sense, and it prevented userspace from supporting this architecture-independent feature without recompilation. Therefore, these days __KVM_HAVE_READONLY_MEM does not mask anything and is only used in virt/kvm/kvm_main.c. Userspace does not need to test it and there should be no need for it to exist. Remove it and replace it with a Kconfig symbol within Linux source code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
While this in principle breaks userspace code that mentions KVM_ARM_DEV_* on architectures other than aarch64, this seems unlikely to be a problem considering that run->s.regs.device_irq_level is only defined on that architecture. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
While this in principle breaks the appearance of KVM_S390_* ioctls on architectures other than s390, this seems unlikely to be a problem considering that there are already many "struct kvm_s390_*" definitions in arch/s390/include/uapi. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
While this in principle breaks the appearance of KVM_PPC_* ioctls on architectures other than powerpc, this seems unlikely to be a problem considering that there are already many "struct kvm_ppc_*" definitions in arch/powerpc/include/uapi. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Several capabilities that exist only on x86 nevertheless have their structs defined in include/uapi/linux/kvm.h. Move them to arch/x86/include/uapi/asm/kvm.h for cleanliness. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Change uapi header uses of GENMASK to instead use the uapi/linux/bits.h bit macros, since GENMASK is not defined in uapi headers. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Dionna Glaze authored
Change uapi header uses of BIT to instead use the uapi/linux/const.h bit macros, since BIT is not defined in uapi headers. The PMU mask uses _BITUL since it targets a 32 bit flag field, whereas the longmode definition is meant for a 64 bit flag field. Cc: Sean Christophersen <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dionna Glaze <dionnaglaze@google.com> Message-Id: <20231207001142.3617856-1-dionnaglaze@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Move __GENMASK and __GENMASK_ULL from include/ to include/uapi/ so that they can be used to define masks in userspace API headers. Compared to what is already in include/linux/bits.h, the definitions need to use the uglified versions of UL(), ULL(), BITS_PER_LONG and BITS_PER_LONG_LONG (which did not even exist), but otherwise expand to the same content. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds authored
Pull crypto fixes from Herbert Xu: "Fix regressions in cbc and algif_hash, as well as an older NULL-pointer dereference in ccp" * tag 'v6.8-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: algif_hash - Remove bogus SGL free on zero-length error path crypto: cbc - Ensure statesize is zero crypto: ccp - Fix null pointer dereference in __sev_platform_shutdown_locked
-
git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpuLinus Torvalds authored
Pull percpu fix from Dennis Zhou: - fix riscv wrong size passed to local_flush_tlb_range_asid() * tag 'percpu-for-6.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu: riscv: Fix wrong size passed to local_flush_tlb_range_asid()
-
- 07 Feb, 2024 4 commits
-
-
Linus Torvalds authored
Merge tag 'loongarch-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix acpi_core_pic[] array overflow, fix earlycon parameter if KASAN enabled, disable UBSAN instrumentation for vDSO build, and two Kconfig cleanups" * tag 'loongarch-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: vDSO: Disable UBSAN instrumentation LoongArch: Fix earlycon parameter if KASAN enabled LoongArch: Change acpi_core_pic[NR_CPUS] to acpi_core_pic[MAX_CORE_PIC] LoongArch: Select HAVE_ARCH_SECCOMP to use the common SECCOMP menu LoongArch: Select ARCH_ENABLE_THP_MIGRATION instead of redefining it
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull kvm fixes from Paolo Bonzini: "x86 guest: - Avoid false positive for check that only matters on AMD processors x86: - Give a hint when Win2016 might fail to boot due to XSAVES && !XSAVEC configuration - Do not allow creating an in-kernel PIT unless an IOAPIC already exists RISC-V: - Allow ISA extensions that were enabled for bare metal in 6.8 (Zbc, scalar and vector crypto, Zfh[min], Zihintntl, Zvfh[min], Zfa) S390: - fix CC for successful PQAP instruction - fix a race when creating a shadow page" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: x86/coco: Define cc_vendor without CONFIG_ARCH_HAS_CC_PLATFORM x86/kvm: Fix SEV check in sev_map_percpu_data() KVM: x86: Give a hint when Win2016 might fail to boot due to XSAVES erratum KVM: x86: Check irqchip mode before create PIT KVM: riscv: selftests: Add Zfa extension to get-reg-list test RISC-V: KVM: Allow Zfa extension for Guest/VM KVM: riscv: selftests: Add Zvfh[min] extensions to get-reg-list test RISC-V: KVM: Allow Zvfh[min] extensions for Guest/VM KVM: riscv: selftests: Add Zihintntl extension to get-reg-list test RISC-V: KVM: Allow Zihintntl extension for Guest/VM KVM: riscv: selftests: Add Zfh[min] extensions to get-reg-list test RISC-V: KVM: Allow Zfh[min] extensions for Guest/VM KVM: riscv: selftests: Add vector crypto extensions to get-reg-list test RISC-V: KVM: Allow vector crypto extensions for Guest/VM KVM: riscv: selftests: Add scaler crypto extensions to get-reg-list test RISC-V: KVM: Allow scalar crypto extensions for Guest/VM KVM: riscv: selftests: Add Zbc extension to get-reg-list test RISC-V: KVM: Allow Zbc extension for Guest/VM KVM: s390: fix cc for successful PQAP KVM: s390: vsie: fix race during shadow creation
-
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds authored
Pull nfsd fix from Chuck Lever: - Address a deadlock regression in RELEASE_LOCKOWNER * tag 'nfsd-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: don't take fi_lock in nfsd_break_deleg_cb()
-
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linuxLinus Torvalds authored
Pull btrfs fixes from David Sterba: - two fixes preventing deletion and manual creation of subvolume qgroup - unify error code returned for unknown send flags - fix assertion during subvolume creation when anonymous device could be allocated by other thread (e.g. due to backref walk) * tag 'for-6.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: do not ASSERT() if the newly created subvolume already got read btrfs: forbid deleting live subvol qgroup btrfs: forbid creating subvol qgroups btrfs: send: return EOPNOTSUPP on unknown flags
-
- 06 Feb, 2024 7 commits
-
-
Nathan Chancellor authored
After commit a9ef2774 ("x86/kvm: Fix SEV check in sev_map_percpu_data()"), there is a build error when building x86_64_defconfig with GCOV using LLVM: ld.lld: error: undefined symbol: cc_vendor >>> referenced by kvm.c >>> arch/x86/kernel/kvm.o:(kvm_smp_prepare_boot_cpu) in archive vmlinux.a which corresponds to if (cc_vendor != CC_VENDOR_AMD || !cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) return; Without GCOV, clang is able to eliminate the use of cc_vendor because cc_platform_has() evaluates to false when CONFIG_ARCH_HAS_CC_PLATFORM is not set, meaning that if statement will be true no matter what value cc_vendor has. With GCOV, the instrumentation keeps the use of cc_vendor around for code coverage purposes but cc_vendor is only declared, not defined, without CONFIG_ARCH_HAS_CC_PLATFORM, leading to the build error above. Provide a macro definition of cc_vendor when CONFIG_ARCH_HAS_CC_PLATFORM is not set with a value of CC_VENDOR_NONE, so that the first condition can always be evaluated/eliminated at compile time, avoiding the build error altogether. This is very similar to the situation prior to commit da86eb96 ("x86/coco: Get rid of accessor functions"). Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Message-Id: <20240202-provide-cc_vendor-without-arch_has_cc_platform-v1-1-09ad5f2a3099@kernel.org> Fixes: a9ef2774 ("x86/kvm: Fix SEV check in sev_map_percpu_data()", 2024-01-31) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
https://evilpiepirate.org/git/bcachefsLinus Torvalds authored
Pull bcachefs fixes from Kent Overstreet: "Two serious ones here that we'll want to backport to stable: a fix for a race in the thread_with_file code, and another locking fixup in the subvolume deletion path" * tag 'bcachefs-2024-02-05' of https://evilpiepirate.org/git/bcachefs: bcachefs: time_stats: Check for last_event == 0 when updating freq stats bcachefs: install fd later to avoid race with close bcachefs: unlock parent dir if entry is not found in subvolume deletion bcachefs: Fix build on parisc by avoiding __multi3()
-
Kees Cook authored
The vDSO executes in userspace, so the kernel's UBSAN should not instrument it. Solves these kind of build errors: loongarch64-linux-ld: arch/loongarch/vdso/vgettimeofday.o: in function `vdso_shift_ns': lib/vdso/gettimeofday.c:23:(.text+0x3f8): undefined reference to `__ubsan_handle_shift_out_of_bounds' Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202401310530.lZHCj1Zl-lkp@intel.com/ Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Fangrui Song <maskray@google.com> Cc: loongarch@lists.linux.dev Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
-
Huacai Chen authored
The earlycon parameter is based on fixmap, and fixmap addresses are not supposed to be shadowed by KASAN. So return the kasan_early_shadow_page in kasan_mem_to_shadow() if the input address is above FIXADDR_START. Otherwise earlycon cannot work after kasan_init(). Cc: stable@vger.kernel.org Fixes: 5aa4ac64 ("LoongArch: Add KASAN (Kernel Address Sanitizer) support") Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
-
Huacai Chen authored
With default config, the value of NR_CPUS is 64. When HW platform has more then 64 cpus, system will crash on these platforms. MAX_CORE_PIC is the maximum cpu number in MADT table (max physical number) which can exceed the supported maximum cpu number (NR_CPUS, max logical number), but kernel should not crash. Kernel should boot cpus with NR_CPUS, let the remainder cpus stay in BIOS. The potential crash reason is that the array acpi_core_pic[NR_CPUS] can be overflowed when parsing MADT table, and it is obvious that CORE_PIC should be corresponding to physical core rather than logical core, so it is better to define the array as acpi_core_pic[MAX_CORE_PIC]. With the patch, system can boot up 64 vcpus with qemu parameter -smp 128, otherwise system will crash with the following message. [ 0.000000] CPU 0 Unable to handle kernel paging request at virtual address 0000420000004259, era == 90000000037a5f0c, ra == 90000000037a46ec [ 0.000000] Oops[#1]: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc2+ #192 [ 0.000000] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 [ 0.000000] pc 90000000037a5f0c ra 90000000037a46ec tp 9000000003c90000 sp 9000000003c93d60 [ 0.000000] a0 0000000000000019 a1 9000000003d93bc0 a2 0000000000000000 a3 9000000003c93bd8 [ 0.000000] a4 9000000003c93a74 a5 9000000083c93a67 a6 9000000003c938f0 a7 0000000000000005 [ 0.000000] t0 0000420000004201 t1 0000000000000000 t2 0000000000000001 t3 0000000000000001 [ 0.000000] t4 0000000000000003 t5 0000000000000000 t6 0000000000000030 t7 0000000000000063 [ 0.000000] t8 0000000000000014 u0 ffffffffffffffff s9 0000000000000000 s0 9000000003caee98 [ 0.000000] s1 90000000041b0480 s2 9000000003c93da0 s3 9000000003c93d98 s4 9000000003c93d90 [ 0.000000] s5 9000000003caa000 s6 000000000a7fd000 s7 000000000f556b60 s8 000000000e0a4330 [ 0.000000] ra: 90000000037a46ec platform_init+0x214/0x250 [ 0.000000] ERA: 90000000037a5f0c efi_runtime_init+0x30/0x94 [ 0.000000] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) [ 0.000000] PRMD: 00000000 (PPLV0 -PIE -PWE) [ 0.000000] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) [ 0.000000] ECFG: 00070800 (LIE=11 VS=7) [ 0.000000] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) [ 0.000000] BADV: 0000420000004259 [ 0.000000] PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) [ 0.000000] Modules linked in: [ 0.000000] Process swapper (pid: 0, threadinfo=(____ptrval____), task=(____ptrval____)) [ 0.000000] Stack : 9000000003c93a14 9000000003800898 90000000041844f8 90000000037a46ec [ 0.000000] 000000000a7fd000 0000000008290000 0000000000000000 0000000000000000 [ 0.000000] 0000000000000000 0000000000000000 00000000019d8000 000000000f556b60 [ 0.000000] 000000000a7fd000 000000000f556b08 9000000003ca7700 9000000003800000 [ 0.000000] 9000000003c93e50 9000000003800898 9000000003800108 90000000037a484c [ 0.000000] 000000000e0a4330 000000000f556b60 000000000a7fd000 000000000f556b08 [ 0.000000] 9000000003ca7700 9000000004184000 0000000000200000 000000000e02b018 [ 0.000000] 000000000a7fd000 90000000037a0790 9000000003800108 0000000000000000 [ 0.000000] 0000000000000000 000000000e0a4330 000000000f556b60 000000000a7fd000 [ 0.000000] 000000000f556b08 000000000eaae298 000000000eaa5040 0000000000200000 [ 0.000000] ... [ 0.000000] Call Trace: [ 0.000000] [<90000000037a5f0c>] efi_runtime_init+0x30/0x94 [ 0.000000] [<90000000037a46ec>] platform_init+0x214/0x250 [ 0.000000] [<90000000037a484c>] setup_arch+0x124/0x45c [ 0.000000] [<90000000037a0790>] start_kernel+0x90/0x670 [ 0.000000] [<900000000378b0d8>] kernel_entry+0xd8/0xdc Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
-
Masahiro Yamada authored
LoongArch missed the refactoring made by commit 282a181b ("seccomp: Move config option SECCOMP to arch/Kconfig") because LoongArch was not mainlined at that time. The 'depends on PROC_FS' statement is stale as described in that commit. Select HAVE_ARCH_SECCOMP, and remove the duplicated config entry. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
-
Masahiro Yamada authored
ARCH_ENABLE_THP_MIGRATION is supposed to be selected by arch Kconfig. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
-
- 05 Feb, 2024 3 commits
-
-
NeilBrown authored
A recent change to check_for_locks() changed it to take ->flc_lock while holding ->fi_lock. This creates a lock inversion (reported by lockdep) because there is a case where ->fi_lock is taken while holding ->flc_lock. ->flc_lock is held across ->fl_lmops callbacks, and nfsd_break_deleg_cb() is one of those and does take ->fi_lock. However it doesn't need to. Prior to v4.17-rc1~110^2~22 ("nfsd: create a separate lease for each delegation") nfsd_break_deleg_cb() would walk the ->fi_delegations list and so needed the lock. Since then it doesn't walk the list and doesn't need the lock. Two actions are performed under the lock. One is to call nfsd_break_one_deleg which calls nfsd4_run_cb(). These doesn't act on the nfs4_file at all, so don't need the lock. The other is to set ->fi_had_conflict which is in the nfs4_file. This field is only ever set here (except when initialised to false) so there is no possible problem will multiple threads racing when setting it. The field is tested twice in nfs4_set_delegation(). The first test does not hold a lock and is documented as an opportunistic optimisation, so it doesn't impose any need to hold ->fi_lock while setting ->fi_had_conflict. The second test in nfs4_set_delegation() *is* make under ->fi_lock, so removing the locking when ->fi_had_conflict is set could make a change. The change could only be interesting if ->fi_had_conflict tested as false even though nfsd_break_one_deleg() ran before ->fi_lock was unlocked. i.e. while hash_delegation_locked() was running. As hash_delegation_lock() doesn't interact in any way with nfs4_run_cb() there can be no importance to this interaction. So this patch removes the locking from nfsd_break_one_deleg() and moves the final test on ->fi_had_conflict out of the locked region to make it clear that locking isn't important to the test. It is still tested *after* vfs_setlease() has succeeded. This might be significant and as vfs_setlease() takes ->flc_lock, and nfsd_break_one_deleg() is called under ->flc_lock this "after" is a true ordering provided by a spinlock. Fixes: edcf9725 ("nfsd: fix RELEASE_LOCKOWNER") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
-
Kent Overstreet authored
This fixes spurious outliers in the frequency stats. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Mathias Krause authored
Calling fd_install() makes a file reachable for userland, including the possibility to close the file descriptor, which leads to calling its 'release' hook. If that happens before the code had a chance to bump the reference of the newly created task struct, the release callback will call put_task_struct() too early, leading to the premature destruction of the kernel thread. Avoid that race by calling fd_install() later, after all the setup is done. Fixes: 1c6fdbd8 ("bcachefs: Initial commit") Signed-off-by: Mathias Krause <minipli@grsecurity.net> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
- 04 Feb, 2024 3 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds authored
Pull ext4 fixes from Ted Ts'o: "Miscellaneous bug fixes and cleanups in ext4's multi-block allocator and extent handling code" * tag 'for-linus-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits) ext4: make ext4_set_iomap() recognize IOMAP_DELALLOC map type ext4: make ext4_map_blocks() distinguish delalloc only extent ext4: add a hole extent entry in cache after punch ext4: correct the hole length returned by ext4_map_blocks() ext4: convert to exclusive lock while inserting delalloc extents ext4: refactor ext4_da_map_blocks() ext4: remove 'needed' in trace_ext4_discard_preallocations ext4: remove unnecessary parameter "needed" in ext4_discard_preallocations ext4: remove unused return value of ext4_mb_release_group_pa ext4: remove unused return value of ext4_mb_release_inode_pa ext4: remove unused return value of ext4_mb_release ext4: remove unused ext4_allocation_context::ac_groups_considered ext4: remove unneeded return value of ext4_mb_release_context ext4: remove unused parameter ngroup in ext4_mb_choose_next_group_*() ext4: remove unused return value of __mb_check_buddy ext4: mark the group block bitmap as corrupted before reporting an error ext4: avoid allocating blocks from corrupted group in ext4_mb_find_by_goal() ext4: avoid allocating blocks from corrupted group in ext4_mb_try_best_found() ext4: avoid dividing by 0 in mb_update_avg_fragment_size() when block bitmap corrupt ext4: avoid bb_free and bb_fragments inconsistency in mb_free_blocks() ...
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull smb client fixes from Steve French: "Five smb3 client fixes, mostly multichannel related: - four multichannel fixes including fix for channel allocation when multiple inactive channels, fix for unneeded race in channel deallocation, correct redundant channel scaling, and redundant multichannel disabling scenarios - add warning if max compound requests reached" * tag 'v6.8-rc3-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: increase number of PDUs allowed in a compound request cifs: failure to add channel on iface should bump up weight cifs: do not search for channel if server is terminating cifs: avoid redundant calls to disable multichannel cifs: make sure that channel scaling is done only once
-