1. 01 Oct, 2024 1 commit
    • Mark Brown's avatar
      KVM: selftests: Fix build on architectures other than x86_64 · 76f972c2
      Mark Brown authored
      The recent addition of support for testing with the x86 specific quirk
      KVM_X86_QUIRK_SLOT_ZAP_ALL disabled in the generic memslot tests broke the
      build of the KVM selftests for all other architectures:
      
      In file included from include/kvm_util.h:8,
                       from include/memstress.h:13,
                       from memslot_modification_stress_test.c:21:
      memslot_modification_stress_test.c: In function ‘main’:
      memslot_modification_stress_test.c:176:38: error: ‘KVM_X86_QUIRK_SLOT_ZAP_ALL’ undeclared (first use in this function)
        176 |                                      KVM_X86_QUIRK_SLOT_ZAP_ALL);
            |                                      ^~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Add __x86_64__ guard defines to avoid building the relevant code on other
      architectures.
      
      Fixes: 61de4c34 ("KVM: selftests: Test memslot move in memslot_perf_test with quirk disabled")
      Fixes: 218f6415 ("KVM: selftests: Allow slot modification stress test with quirk disabled")
      Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Message-ID: <20240930-kvm-build-breakage-v1-1-866fad3cc164@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      76f972c2
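The guard described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the selftest code itself: `DEMO_X86_QUIRK_SLOT_ZAP_ALL` and `demo_quirks_to_disable()` are hypothetical names, and the flag value is a stand-in for whatever the x86 UAPI headers define.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the real quirk flag, which only the x86 KVM
 * headers provide. The value is illustrative.
 */
#ifdef __x86_64__
#define DEMO_X86_QUIRK_SLOT_ZAP_ALL (UINT64_C(1) << 7)
#endif

/* Returns the quirk mask a test would ask KVM to disable, or 0 for none. */
static uint64_t demo_quirks_to_disable(bool disable_slot_zap_quirk)
{
#ifdef __x86_64__
	if (disable_slot_zap_quirk)
		return DEMO_X86_QUIRK_SLOT_ZAP_ALL;
#else
	/* The option does not exist on other architectures. */
	(void)disable_slot_zap_quirk;
#endif
	return 0;
}
```

Because every reference to the x86-only symbol sits inside the `#ifdef`, the same source file compiles on arm64, riscv, s390 and so on, where the symbol is never declared.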
  2. 27 Sep, 2024 1 commit
  3. 17 Sep, 2024 10 commits
    • Paolo Bonzini's avatar
      Merge tag 'kvm-x86-vmx-6.12' of https://github.com/kvm-x86/linux into HEAD · 3f8df628
      Paolo Bonzini authored
      KVM VMX changes for 6.12:
      
       - Set FINAL/PAGE in the page fault error code for EPT Violations if and only
         if the GVA is valid.  If the GVA is NOT valid, there is no guest-side page
         table walk and so stuffing paging related metadata is nonsensical.
      
       - Fix a bug where KVM would incorrectly synthesize a nested VM-Exit instead of
         emulating posted interrupt delivery to L2.
      
       - Add a lockdep assertion to detect unsafe accesses of vmcs12 structures.
      
       - Harden eVMCS loading against an impossible NULL pointer deref (really truly
         should be impossible).
      
       - Minor SGX fix and a cleanup.
      3f8df628
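The first item in the list above (FINAL/PAGE only when the GVA is valid) can be illustrated with a small sketch. The macro names are hypothetical and the bit positions are assumptions that should be checked against the Intel SDM and the KVM headers; this is not the actual KVM code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define DEMO_EPT_GVA_IS_VALID    (UINT64_C(1) << 7)
#define DEMO_EPT_GVA_TRANSLATED  (UINT64_C(1) << 8)
#define DEMO_PFERR_GUEST_FINAL   (UINT64_C(1) << 32)
#define DEMO_PFERR_GUEST_PAGE    (UINT64_C(1) << 33)

/*
 * Derive the paging-related error code bits from an EPT violation exit
 * qualification. Without a valid GVA there was no guest page table walk,
 * so neither FINAL nor PAGE is reported.
 */
static uint64_t demo_ept_fault_flags(uint64_t exit_qual)
{
	if (!(exit_qual & DEMO_EPT_GVA_IS_VALID))
		return 0;

	return (exit_qual & DEMO_EPT_GVA_TRANSLATED) ?
	       DEMO_PFERR_GUEST_FINAL : DEMO_PFERR_GUEST_PAGE;
}
```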
    • Paolo Bonzini's avatar
      Merge tag 'kvm-x86-svm-6.12' of https://github.com/kvm-x86/linux into HEAD · 55e6f8f2
      Paolo Bonzini authored
      KVM SVM changes for 6.12:
      
       - Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is enabled,
         i.e. when the CPU has already flushed the RSB.
      
       - Trace the per-CPU host save area as a VMCB pointer to improve readability
         and cleanup the retrieval of the SEV-ES host save area.
      
       - Remove unnecessary accounting of temporary nested VMCB related allocations.
      55e6f8f2
    • Paolo Bonzini's avatar
      Merge tag 'kvm-x86-pat_vmx_msrs-6.12' of https://github.com/kvm-x86/linux into HEAD · 43d97b2e
      Paolo Bonzini authored
      KVM VMX and x86 PAT MSR macro cleanup for 6.12:
      
       - Add common defines for the x86 architectural memory types, i.e. the types
         that are shared across PAT, MTRRs, VMCSes, and EPTPs.
      
       - Clean up the various VMX MSR macros to make the code self-documenting
         (inasmuch as possible), and to make it less painful to add new macros.
      43d97b2e
    • Paolo Bonzini's avatar
      Merge tag 'kvm-x86-mmu-6.12' of https://github.com/kvm-x86/linux into HEAD · 5d55a052
      Paolo Bonzini authored
      KVM x86 MMU changes for 6.12:
      
       - Overhaul the "unprotect and retry" logic to more precisely identify cases
         where retrying is actually helpful, and to harden all retry paths against
         putting the guest into an infinite retry loop.
      
       - Add support for yielding, e.g. to honor NEED_RESCHED, when zapping rmaps in
         the shadow MMU.
      
       - Refactor pieces of the shadow MMU related to aging SPTEs in preparation for
         adding MGLRU support in KVM.
      
       - Misc cleanups
      5d55a052
    • Paolo Bonzini's avatar
      Merge tag 'kvm-x86-selftests-6.12' of https://github.com/kvm-x86/linux into HEAD · c345344e
      Paolo Bonzini authored
      KVM selftests changes for 6.12:
      
       - Fix a goof that caused some Hyper-V tests to be skipped when run on bare
         metal, i.e. NOT in a VM.
      
       - Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES guest.
      
       - Explicitly include one-off assets in .gitignore.  Past Sean was completely
         wrong about not being able to detect missing .gitignore entries.
      
       - Verify userspace single-stepping works when KVM happens to handle a VM-Exit
         in its fastpath.
      
       - Misc cleanups
      c345344e
    • Paolo Bonzini's avatar
      Merge tag 'kvm-x86-misc-6.12' of https://github.com/kvm-x86/linux into HEAD · 41786cc5
      Paolo Bonzini authored
      KVM x86 misc changes for 6.12
      
       - Advertise AVX10.1 to userspace (effectively prep work for the "real" AVX10
         functionality that is on the horizon).
      
       - Rework common MSR handling code to suppress errors on userspace accesses to
         unsupported-but-advertised MSRs.  This will allow removing (almost?) all of
         KVM's exemptions for userspace access to MSRs that shouldn't exist based on
         the vCPU model (the actual cleanup is non-trivial future work).
      
       - Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC) splits the
         64-bit value into the legacy ICR and ICR2 storage, whereas Intel (APICv)
         stores the entire 64-bit value at the ICR offset.
      
       - Fix a bug where KVM would fail to exit to userspace if one was triggered by
         a fastpath exit handler.
      
       - Add fastpath handling of HLT VM-Exit to expedite re-entering the guest when
         there's already a pending wake event at the time of the exit.
      
       - Finally fix the RSM vs. nested VM-Enter WARN by forcing the vCPU out of
         guest mode prior to signalling SHUTDOWN (architecturally, the SHUTDOWN is
         supposed to hit L1, not L2).
      41786cc5
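The x2APIC ICR item in the list above describes two storage layouts for the same 64-bit value. A minimal sketch of the difference, operating on a plain byte array that stands in for the virtual APIC page; the `demo_` helpers are hypothetical and are not KVM functions:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_APIC_ICR   0x300	/* legacy ICR register offset */
#define DEMO_APIC_ICR2  0x310	/* legacy ICR2 register offset */

/* Intel (APICv) style: the whole 64-bit value lives at the ICR offset. */
static void demo_store_icr_contiguous(uint8_t *regs, uint64_t icr)
{
	memcpy(regs + DEMO_APIC_ICR, &icr, sizeof(icr));
}

static uint64_t demo_load_icr_contiguous(const uint8_t *regs)
{
	uint64_t icr;

	memcpy(&icr, regs + DEMO_APIC_ICR, sizeof(icr));
	return icr;
}

/* AMD (x2AVIC) style: low half in ICR, high half in ICR2. */
static void demo_store_icr_split(uint8_t *regs, uint64_t icr)
{
	uint32_t lo = (uint32_t)icr;
	uint32_t hi = (uint32_t)(icr >> 32);

	memcpy(regs + DEMO_APIC_ICR, &lo, sizeof(lo));
	memcpy(regs + DEMO_APIC_ICR2, &hi, sizeof(hi));
}

static uint64_t demo_load_icr_split(const uint8_t *regs)
{
	uint32_t lo, hi;

	memcpy(&lo, regs + DEMO_APIC_ICR, sizeof(lo));
	memcpy(&hi, regs + DEMO_APIC_ICR2, sizeof(hi));
	return ((uint64_t)hi << 32) | lo;
}
```

Code that reads the ICR with the wrong helper for the active layout sees a truncated or stale upper half, which is why the handling has to be vendor-aware.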
    • Paolo Bonzini's avatar
      Merge tag 'kvm-x86-generic-6.12' of https://github.com/kvm-x86/linux into HEAD · 7056c4e2
      Paolo Bonzini authored
      KVM generic changes for 6.12:
      
       - Fix a bug that results in KVM prematurely exiting to userspace for coalesced
         MMIO/PIO in many cases, clean up the related code, and add a testcase.
      
       - Fix a bug in kvm_clear_guest() where it would trigger a buffer overflow _if_
         the gpa+len crosses a page boundary, which thankfully is guaranteed to not
         happen in the current code base.  Add WARNs in more helpers that read/write
         guest memory to detect similar bugs.
      7056c4e2
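The page-boundary concern in the kvm_clear_guest() item above comes down to never letting a single write span two pages. A rough sketch of a page-bounded clear over a flat buffer that stands in for guest memory; `demo_clear_guest()` and `DEMO_PAGE_SIZE` are hypothetical and this is not the kernel's implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u

/*
 * Zero [gpa, gpa + len) one page-bounded chunk at a time.
 * Returns the number of chunks written.
 */
static unsigned int demo_clear_guest(uint8_t *mem, size_t mem_size,
				     uint64_t gpa, size_t len)
{
	unsigned int chunks = 0;

	/* Reject ranges that fall outside the backing buffer. */
	assert(gpa <= mem_size && len <= mem_size - gpa);

	while (len) {
		size_t offset = gpa % DEMO_PAGE_SIZE;
		size_t seg = DEMO_PAGE_SIZE - offset;

		/* Never write past the end of the current page. */
		if (seg > len)
			seg = len;

		memset(mem + gpa, 0, seg);
		gpa += seg;
		len -= seg;
		chunks++;
	}
	return chunks;
}
```

For example, clearing 200 bytes starting at offset 4000 is split into a 96-byte chunk that ends the first page and a 104-byte chunk that starts the second.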
    • Paolo Bonzini's avatar
      Merge branch 'kvm-redo-enable-virt' into HEAD · c09dd2bb
      Paolo Bonzini authored
      Register KVM's cpuhp and syscore callbacks when enabling virtualization in
      hardware, as the sole purpose of said callbacks is to disable and re-enable
      virtualization as needed.
      
      The primary motivation for this series is to simplify dealing with enabling
      virtualization for Intel's TDX, which needs to enable virtualization
      when kvm-intel.ko is loaded, i.e. long before the first VM is created.
      
      That said, this is a nice cleanup on its own.  By registering the callbacks
      on-demand, the callbacks themselves don't need to check kvm_usage_count,
      because their very existence implies a non-zero count.
      
      Patch 1 (re)adds a dedicated lock for kvm_usage_count.  This avoids a
      lock ordering issue between cpus_read_lock() and kvm_lock.  The lock
      ordering issue still exists in very rare cases, and will be fixed for
      good by switching vm_list to an (S)RCU-protected list.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c09dd2bb
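The on-demand registration idea in the merge message above can be sketched as a usage counter whose 0 to 1 and 1 to 0 transitions register and unregister the callbacks. All names are hypothetical, and the dedicated lock the series adds around the counter is deliberately left out of this single-threaded toy:

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int demo_usage_count;
static bool demo_callbacks_registered;

/* First user registers the callbacks and enables virtualization. */
static void demo_enable_virtualization(void)
{
	if (demo_usage_count++ == 0)
		demo_callbacks_registered = true;
}

/* Last user unregisters the callbacks and disables virtualization. */
static void demo_disable_virtualization(void)
{
	assert(demo_usage_count > 0);
	if (--demo_usage_count == 0)
		demo_callbacks_registered = false;
}

/*
 * A callback only exists while it is registered, so it never has to
 * inspect the usage count itself.
 */
static bool demo_callback_may_run(void)
{
	return demo_callbacks_registered;
}
```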
    • Paolo Bonzini's avatar
      Merge branch 'kvm-memslot-zap-quirk' into HEAD · 55f50b2f
      Paolo Bonzini authored
      Today whenever a memslot is moved or deleted, KVM invalidates the entire
      page tables and generates fresh ones based on the new memslot layout.
      
      This behavior traditionally was kept because of a bug which was never
      fully investigated and caused VM instability with assigned GeForce
      GPUs.  It generally does not have a huge overhead, because the old
      MMU is able to reuse cached page tables and the new one is more
      scalable and can resolve EPT violations/nested page faults in parallel,
      but it has worse performance if the guest frequently deletes and
      adds small memslots, and it's entirely not viable for TDX.  This is
      because TDX requires re-accepting of private pages after page dropping.
      
      For non-TDX VMs, this series therefore introduces the
      KVM_X86_QUIRK_SLOT_ZAP_ALL quirk, enabling users to control the behavior
      of memslot zapping when a memslot is moved/deleted.  The quirk is turned
      on by default, leading to the zapping of all SPTEs when a memslot is
      moved/deleted; users however have the option to turn off the quirk,
      which limits the zapping only to those SPTEs that lie within the range
      of the memslot being moved/deleted.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      55f50b2f
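The two behaviours described in the merge message above differ only in which SPTEs are selected for zapping. A toy model, assuming a flat array of entries keyed by guest frame number; `struct demo_spte` and `demo_zap_for_slot()` are hypothetical and bear no relation to KVM's real MMU data structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_spte {
	uint64_t gfn;		/* guest frame number mapped by this entry */
	bool present;
};

/*
 * Zap entries in response to a memslot covering [base_gfn, base_gfn + npages)
 * being moved or deleted. With the quirk enabled everything is zapped;
 * with it disabled only entries inside the slot are. Returns the number
 * of entries zapped.
 */
static size_t demo_zap_for_slot(struct demo_spte *sptes, size_t n,
				uint64_t base_gfn, uint64_t npages,
				bool quirk_zap_all)
{
	size_t zapped = 0;

	for (size_t i = 0; i < n; i++) {
		bool in_slot = sptes[i].gfn >= base_gfn &&
			       sptes[i].gfn - base_gfn < npages;

		if (sptes[i].present && (quirk_zap_all || in_slot)) {
			sptes[i].present = false;
			zapped++;
		}
	}
	return zapped;
}
```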
    • Paolo Bonzini's avatar
      Merge tag 'kvm-s390-next-6.12-1' of... · 356dab4e
      Paolo Bonzini authored
      Merge tag 'kvm-s390-next-6.12-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
      
      * New ucontrol selftest
      * Inline assembly touchups
      356dab4e
  4. 16 Sep, 2024 2 commits
  5. 15 Sep, 2024 3 commits
    • Paolo Bonzini's avatar
      Merge tag 'kvm-riscv-6.12-1' of https://github.com/kvm-riscv/linux into HEAD · 0cdcc99e
      Paolo Bonzini authored
      KVM/riscv changes for 6.12
      
      - Fix sbiret init before forwarding to userspace
      - Don't zero-out PMU snapshot area before freeing data
      - Allow legacy PMU access from guest
      - Fix to allow hpmcounter31 from the guest
      0cdcc99e
    • Paolo Bonzini's avatar
      Merge tag 'loongarch-kvm-6.12' of... · 1a371190
      Paolo Bonzini authored
      Merge tag 'loongarch-kvm-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
      
      LoongArch KVM changes for v6.12
      
      1. Revert qspinlock to test-and-set simple lock on VM.
      2. Add Loongson Binary Translation extension support.
      3. Add PMU support for guest.
      4. Enable paravirt feature control from VMM.
      5. Implement function kvm_para_has_feature().
      1a371190
    • Paolo Bonzini's avatar
      Merge tag 'kvmarm-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD · 091b2eca
      Paolo Bonzini authored
      KVM/arm64 updates for 6.12
      
      * New features:
      
        - Add a Stage-2 page table dumper, reusing the main ptdump
          infrastructure, and allowing easier debugging of our
          page-table infrastructure
      
        - Add FP8 support to the KVM/arm64 floating point handling.
      
        - Add NV support for the AT family of instructions, which mostly
          results in adding a page table walker that deals with most of the
          complexity of the architecture.
      
      * Improvements, fixes and cleanups:
      
        - Add selftest checks for a bunch of timer emulation corner cases
      
        - Fix the multiple cases where KVM/arm64 doesn't correctly handle
          the guest trying to use a GICv3 that isn't advertised
      
        - Remove REG_HIDDEN_USER from the sysreg infrastructure, making
          things a little more simple
      
        - Prevent MTE tags being restored by userspace if we are actively
          logging writes, as that's a recipe for disaster
      
        - Correct the refcount on a page that is not considered for MTE tag
          copying (such as a device)
      
        - Relax the synchronisation when walking a page table to split block
          mappings, moving it to the end of the walk, as there is no need to
          perform it on every store.
      
        - Fix boundary check when transferring memory using FFA
      
        - Fix pKVM TLB invalidation, only affecting currently out of tree
          code but worth addressing for peace of mind
      091b2eca
  6. 12 Sep, 2024 10 commits
    • Bibo Mao's avatar
      LoongArch: KVM: Implement function kvm_para_has_feature() · 3abb708e
      Bibo Mao authored
      Implement function kvm_para_has_feature() to detect supported paravirt
      features. Device drivers can use it to detect and enable paravirt
      features; for example, the EIOINTC irqchip driver can detect feature
      KVM_FEATURE_VIRT_EXTIOI and apply some optimizations.
      Signed-off-by: Bibo Mao <maobibo@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      3abb708e
    • Bibo Mao's avatar
      LoongArch: KVM: Enable paravirt feature control from VMM · cdc118f8
      Bibo Mao authored
      Export kernel paravirt features to user space, so that the VMM can control
      each paravirt feature individually. By default, the paravirt features are
      the same as the features KVM supports if the VMM does not set them.
      
      Also a new feature KVM_FEATURE_VIRT_EXTIOI is added which can be set
      from user space. This feature indicates that the virt EIOINTC can route
      interrupts to 256 vCPUs, rather than 4 vCPUs like with real HW.
      Signed-off-by: Bibo Mao <maobibo@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      cdc118f8
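The feature control described in the commit above, together with the kind of check kvm_para_has_feature() performs, might look roughly like this. The struct, helpers, and bit number are hypothetical; in particular, masking the VMM's request against the supported set is an assumption, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit number, chosen only for illustration. */
#define DEMO_FEATURE_VIRT_EXTIOI 2u

struct demo_pv_config {
	uint32_t supported;	/* features the hypervisor supports */
	uint32_t vmm_mask;	/* features requested by the VMM */
	bool vmm_set;		/* whether the VMM configured anything */
};

/* Default to the supported set when the VMM did not configure features. */
static uint32_t demo_effective_features(const struct demo_pv_config *cfg)
{
	if (!cfg->vmm_set)
		return cfg->supported;

	/* Assumption: unsupported bits in the request are dropped. */
	return cfg->vmm_mask & cfg->supported;
}

/* Guest-side style check for a single feature bit. */
static bool demo_para_has_feature(uint32_t features, unsigned int bit)
{
	return bit < 32 && (features & (UINT32_C(1) << bit)) != 0;
}
```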
    • Song Gao's avatar
      LoongArch: KVM: Add PMU support for guest · f4e40ea9
      Song Gao authored
      On LoongArch, the host and guest each have their own PMU CSRs, and
      they share the PMU hardware resources. A set of PMU CSRs consists of a CTRL
      register and a CNTR register. We can set which PMU CSRs are used by the
      guest by writing to bits [24:26] of the GCFG register.
      
      On KVM side:
      - Save the host PMU CSRs into structure kvm_context.
      - If the host supports the PMU feature:
        - When entering guest mode, save the host PMU CSRs and restore the guest PMU CSRs.
        - When exiting guest mode, save the guest PMU CSRs and restore the host PMU CSRs.
      Reviewed-by: Bibo Mao <maobibo@loongson.cn>
      Signed-off-by: Song Gao <gaosong@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      f4e40ea9
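The steps in the commit above can be sketched as a field update on a GCFG value plus a save/restore pair around guest entry and exit. Everything here is a toy model: the `demo_` names are hypothetical, the registers are plain struct members rather than CSR accesses, and the meaning of the 3-bit field value is not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_GCFG_PMU_SHIFT 24
#define DEMO_GCFG_PMU_MASK  (UINT32_C(0x7) << DEMO_GCFG_PMU_SHIFT)

/* Replace bits 24 to 26 of a GCFG value, leaving all other bits alone. */
static uint32_t demo_gcfg_set_guest_pmu(uint32_t gcfg, unsigned int val)
{
	assert(val <= 7);
	return (gcfg & ~DEMO_GCFG_PMU_MASK) |
	       ((uint32_t)val << DEMO_GCFG_PMU_SHIFT);
}

struct demo_pmu {
	uint64_t ctrl;
	uint64_t cntr;
};

struct demo_cpu {
	struct demo_pmu hw;		/* what the hardware currently holds */
	struct demo_pmu host_saved;	/* host state parked while the guest runs */
};

/* Entering the guest: save host PMU state, load guest PMU state. */
static void demo_enter_guest(struct demo_cpu *cpu, const struct demo_pmu *guest)
{
	cpu->host_saved = cpu->hw;
	cpu->hw = *guest;
}

/* Exiting the guest: save guest PMU state, restore host PMU state. */
static void demo_exit_guest(struct demo_cpu *cpu, struct demo_pmu *guest)
{
	*guest = cpu->hw;
	cpu->hw = cpu->host_saved;
}
```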
    • Marc Zyngier's avatar
      Merge branch kvm-arm64/visibility-cleanups into kvmarm-master/next · 17a00056
      Marc Zyngier authored
      * kvm-arm64/visibility-cleanups:
        : .
        : Remove REG_HIDDEN_USER from the sysreg infrastructure, making things
        : a little more simple. From the cover letter:
        :
        : "Since 4d4f5205 ("KVM: arm64: nv: Drop EL12 register traps that are
        : redirected to VNCR") and the admission that KVM would never be supporting
        : the original FEAT_NV, REG_HIDDEN_USER only had a few users, all of which
        : could either be replaced by a more ad-hoc mechanism, or removed altogether."
        : .
        KVM: arm64: Get rid of REG_HIDDEN_USER visibility qualifier
        KVM: arm64: Simplify visibility handling of AArch32 SPSR_*
        KVM: arm64: Simplify handling of CNTKCTL_EL12
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      17a00056
    • Marc Zyngier's avatar
      Merge branch kvm-arm64/s2-ptdump into kvmarm-master/next · f6254690
      Marc Zyngier authored
      * kvm-arm64/s2-ptdump:
        : .
        : Stage-2 page table dumper, reusing the main ptdump infrastructure,
        : courtesy of Sebastian Ene. From the cover letter:
        :
        : "This series extends the ptdump support to allow dumping the guest
        : stage-2 pagetables. When CONFIG_PTDUMP_STAGE2_DEBUGFS is enabled, ptdump
        : registers the following new files under debugfs:
        : - /sys/debug/kvm/<guest_id>/stage2_page_tables
        : - /sys/debug/kvm/<guest_id>/stage2_levels
        : - /sys/debug/kvm/<guest_id>/ipa_range
        :
        : This allows userspace tools (e.g. cat) to dump the stage-2 pagetables by
        : reading the 'stage2_page_tables' file.
        : [...]"
        : .
        KVM: arm64: Register ptdump with debugfs on guest creation
        arm64: ptdump: Don't override the level when operating on the stage-2 tables
        arm64: ptdump: Use the ptdump description from a local context
        arm64: ptdump: Expose the attribute parsing functionality
        KVM: arm64: Move pagetable definitions to common header
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      f6254690
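Since the dump is exposed as an ordinary debugfs file, reading it needs nothing more than what cat does. A minimal sketch of such a reader; `demo_dump_file()` is a hypothetical helper, and the caller supplies the full path (including the guest identifier and wherever debugfs is mounted on the system), which this sketch does not try to guess:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Copy the file at 'path' to 'out'. Returns the number of bytes copied,
 * or -1 if the file cannot be opened (for example when debugfs is not
 * mounted or the caller lacks permission).
 */
static long demo_dump_file(const char *path, FILE *out)
{
	char buf[4096];
	size_t n;
	long total = 0;
	FILE *in = fopen(path, "r");

	if (!in)
		return -1;

	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		fwrite(buf, 1, n, out);
		total += (long)n;
	}

	fclose(in);
	return total;
}
```

A caller would pass the path of the 'stage2_page_tables' file for the guest of interest and `stdout` as the output stream.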
    • Marc Zyngier's avatar
      Merge branch kvm-arm64/nv-at-pan into kvmarm-master/next · 2e0f2394
      Marc Zyngier authored
      * kvm-arm64/nv-at-pan:
        : .
        : Add NV support for the AT family of instructions, which mostly results
        : in adding a page table walker that deals with most of the complexity
        : of the architecture.
        :
        : From the cover letter:
        :
        : "Another task that a hypervisor supporting NV on arm64 has to deal with
        : is to emulate the AT instruction, because we multiplex all the S1
        : translations on a single set of registers, and the guest S2 is never
        : truly resident on the CPU.
        :
        : So given that we lie about page tables, we also have to lie about
        : translation instructions, hence the emulation. Things are made
        : complicated by the fact that guest S1 page tables can be swapped out,
        : and that our shadow S2 is likely to be incomplete. So while using AT
        : to emulate AT is tempting (and useful), it is not going to always
        : work, and we thus need a fallback in the shape of a SW S1 walker."
        : .
        KVM: arm64: nv: Add support for FEAT_ATS1A
        KVM: arm64: nv: Plumb handling of AT S1* traps from EL2
        KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3
        KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configuration
        KVM: arm64: nv: Add SW walker for AT S1 emulation
        KVM: arm64: nv: Make ps_to_output_size() generally available
        KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}
        KVM: arm64: nv: Add basic emulation of AT S1E2{R,W}
        KVM: arm64: nv: Add basic emulation of AT S1E1{R,W}P
        KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W}
        KVM: arm64: nv: Honor absence of FEAT_PAN2
        KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptor
        KVM: arm64: nv: Enforce S2 alignment when contiguous bit is set
        arm64: Add ESR_ELx_FSC_ADDRSZ_L() helper
        arm64: Add system register encoding for PSTATE.PAN
        arm64: Add PAR_EL1 field description
        arm64: Add missing APTable and TCR_ELx.HPD masks
        KVM: arm64: Make kvm_at() take an OP_AT_*
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      
      # Conflicts:
      #	arch/arm64/kvm/nested.c
      2e0f2394
    • Marc Zyngier's avatar
      Merge branch kvm-arm64/selftests-6.12 into kvmarm-master/next · f77e63e2
      Marc Zyngier authored
      * kvm-arm64/selftests-6.12:
        : .
        : KVM/arm64 selftest updates for 6.12
        :
        : - Check for a bunch of timer emulation corner cases (Colton Lewis)
        : .
        KVM: arm64: selftests: Add arch_timer_edge_cases selftest
        KVM: arm64: selftests: Ensure pending interrupts are handled in arch_timer test
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      f77e63e2
    • Marc Zyngier's avatar
      Merge branch kvm-arm64/vgic-sre-traps into kvmarm-master/next · acf2ab28
      Marc Zyngier authored
      * kvm-arm64/vgic-sre-traps:
        : .
        : Fix the multiple cases where KVM/arm64 doesn't correctly
        : handle the guest trying to use a GICv3 that isn't advertised.
        :
        : From the cover letter:
        :
        : "It recently appeared that, when running on a GICv3-equipped platform
        : (which is what non-ancient arm64 HW has), *not* configuring a GICv3
        : for the guest could result in less than desirable outcomes.
        :
        : We have multiple issues to fix:
        :
        : - for registers that *always* trap (the SGI registers) or that *may*
        :   trap (the SRE register), we need to check whether a GICv3 has been
        :   instantiated before acting upon the trap.
        :
        : - for registers that only conditionally trap, we must actively trap
        :   them even in the absence of a GICv3 being instantiated, and handle
        :   those traps accordingly.
        :
        : - finally, ID registers must reflect the absence of a GICv3, so that
        :   we are consistent.
        :
        : This series goes through all these requirements. The main complexity
        : here is to apply a GICv3 configuration on the host in the absence of a
        : GICv3 in the guest. This is pretty hackish, but I don't have a much
        : better solution so far.
        :
        : As part of making wider use of the trap bits, we fully define the
        : trap routing as per the architecture, something that we eventually
        : need for NV anyway."
        : .
        KVM: arm64: selftests: Cope with lack of GICv3 in set_id_regs
        KVM: arm64: Add selftest checking how the absence of GICv3 is handled
        KVM: arm64: Unify UNDEF injection helpers
        KVM: arm64: Make most GICv3 accesses UNDEF if they trap
        KVM: arm64: Honor guest requested traps in GICv3 emulation
        KVM: arm64: Add trap routing information for ICH_HCR_EL2
        KVM: arm64: Add ICH_HCR_EL2 to the vcpu state
        KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest
        KVM: arm64: Add helper for last ditch idreg adjustments
        KVM: arm64: Force GICv3 trap activation when no irqchip is configured on VHE
        KVM: arm64: Force SRE traps when SRE access is not enabled
        KVM: arm64: Move GICv3 trap configuration to kvm_calculate_traps()
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      acf2ab28
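The third requirement in the cover letter above (ID registers must reflect the absence of a GICv3) amounts to clearing one field of ID_AA64PFR0_EL1. A minimal sketch, assuming the GIC field occupies bits [27:24] as in the Arm architecture reference; the `demo_` names are hypothetical and this is not the KVM helper:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ID_AA64PFR0_GIC_SHIFT 24
#define DEMO_ID_AA64PFR0_GIC_MASK  (UINT64_C(0xf) << DEMO_ID_AA64PFR0_GIC_SHIFT)

/*
 * Return the ID_AA64PFR0_EL1 value the guest should see. When no GICv3
 * is presented to the guest, the GIC field is zeroed so the register is
 * consistent with the trapping behaviour.
 */
static uint64_t demo_sanitise_pfr0(uint64_t pfr0, bool guest_has_gicv3)
{
	if (!guest_has_gicv3)
		pfr0 &= ~DEMO_ID_AA64PFR0_GIC_MASK;

	return pfr0;
}
```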
    • Marc Zyngier's avatar
      Merge branch kvm-arm64/fpmr into kvmarm-master/next · 091258a0
      Marc Zyngier authored
      * kvm-arm64/fpmr:
        : .
        : Add FP8 support to the KVM/arm64 floating point handling.
        :
        : This includes new ID registers (ID_AA64PFR2_EL1, ID_AA64FPFR0_EL1)
        : being made visible to guests, as well as a new control register
        : (FPMR) which gets context-switched.
        : .
        KVM: arm64: Expose ID_AA64PFR2_EL1 to userspace and guests
        KVM: arm64: Enable FP8 support when available and configured
        KVM: arm64: Expose ID_AA64FPFR0_EL1 as a writable ID reg
        KVM: arm64: Honor trap routing for FPMR
        KVM: arm64: Add save/restore support for FPMR
        KVM: arm64: Move FPMR into the sysreg array
        KVM: arm64: Add predicate for FPMR support in a VM
        KVM: arm64: Move SVCR into the sysreg array
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      091258a0
    • Marc Zyngier's avatar
      Merge branch kvm-arm64/mmu-misc-6.12 into kvmarm-master/next · 8884fd12
      Marc Zyngier authored
      * kvm-arm64/mmu-misc-6.12:
        : .
        : Various minor MMU improvements and bug-fixes:
        :
        : - Prevent MTE tags being restored by userspace if we are actively
        :   logging writes, as that's a recipe for disaster
        :
        : - Correct the refcount on a page that is not considered for MTE
        :   tag copying (such as a device)
        :
        : - When walking a page table to split blocks, keep the DSB at the end
        :   of the walk, as there is no need to perform it on every store.
        :
        : - Fix boundary check when transferring memory using FFA
        : .
        KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer
        KVM: arm64: Disallow copying MTE to guest memory while KVM is dirty logging
        KVM: arm64: Release pfn, i.e. put page, if copying MTE tags hits ZONE_DEVICE
        KVM: arm64: Move data barrier to end of split walk
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      8884fd12
  7. 11 Sep, 2024 7 commits
  8. 10 Sep, 2024 6 commits