1. 09 May, 2024 8 commits
  2. 08 May, 2024 4 commits
    • Merge branch kvm-arm64/misc-6.10 into kvmarm-master/next · e2815706
      Marc Zyngier authored
      * kvm-arm64/misc-6.10:
        : .
        : Misc fixes and updates targeting 6.10
        :
        : - Improve boot-time diagnostics when the sysreg tables
        :   are not correctly sorted
        :
        : - Allow FFA_MSG_SEND_DIRECT_REQ in the FFA proxy
        :
        : - Fix duplicate XNX field in the ID_AA64MMFR1_EL1
        :   writeable mask
        :
        : - Allocate PPIs and SGIs outside of the vcpu structure, allowing
        :   for smaller EL2 mapping and some flexibility in implementing
        :   more or less than 32 private IRQs.
        :
        : - Use bitmap_gather() instead of its open-coded equivalent
        :
        : - Make protected mode use hVHE if available
        :
        : - Purge stale mpidr_data if a vcpu is created after the MPIDR
        :   map has been created
        : .
        KVM: arm64: Destroy mpidr_data for 'late' vCPU creation
        KVM: arm64: Use hVHE in pKVM by default on CPUs with VHE support
        KVM: arm64: Fix hvhe/nvhe early alias parsing
        KVM: arm64: Convert kvm_mpidr_index() to bitmap_gather()
        KVM: arm64: vgic: Allocate private interrupts on demand
        KVM: arm64: Remove duplicated AA64MMFR1_EL1 XNX
        KVM: arm64: Remove FFA_MSG_SEND_DIRECT_REQ from the denylist
        KVM: arm64: Improve out-of-order sysreg table diagnostics
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      e2815706
    • KVM: arm64: Destroy mpidr_data for 'late' vCPU creation · ce5d2448
      Oliver Upton authored
      A particularly annoying userspace could create a vCPU after KVM has
      computed mpidr_data for the VM, either by racing against VGIC
      initialization or having a userspace irqchip.
      
      In any case, this means mpidr_data no longer fully describes the VM, and
      attempts to find the new vCPU with kvm_mpidr_to_vcpu() will fail. The
      fix is to discard mpidr_data altogether, as it is only a performance
      optimization and not required for correctness. In all likelihood KVM
      will recompute the mappings when KVM_RUN is called on the new vCPU.
      
      Note that reads of mpidr_data are not guarded by a lock; promote to RCU
      to cope with the possibility of mpidr_data being invalidated at runtime.
      
      Fixes: 54a8006d ("KVM: arm64: Fast-track kvm_mpidr_to_vcpu() when mpidr_data is available")
      Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20240508071952.2035422-1-oliver.upton@linux.dev
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      ce5d2448
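The idea that the lookup table is only an optimization can be sketched in plain C. This is a single-threaded illustration, not the kernel code: `struct vm`, `vm_build_cache()`, `vm_add_vcpu()` and `vm_find_vcpu()` are hypothetical names, and the RCU protection mentioned in the commit message is deliberately not modelled.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_VCPUS 8
#define MAX_ID    16	/* hypothetical bound on vCPU identifiers */

struct vm {
	uint8_t ids[MAX_VCPUS];	/* identifier of the vCPU in each slot */
	int nr_vcpus;
	int8_t *cache;		/* cache[id] = slot or -1; NULL if not built */
};

/* Build the id -> slot table from the vCPUs that exist right now. */
static int vm_build_cache(struct vm *vm)
{
	int8_t *c = malloc(MAX_ID);

	if (!c)
		return -1;
	for (int i = 0; i < MAX_ID; i++)
		c[i] = -1;
	for (int i = 0; i < vm->nr_vcpus; i++)
		c[vm->ids[i]] = (int8_t)i;
	free(vm->cache);
	vm->cache = c;
	return 0;
}

/* A late vCPU makes the table stale, so throw it away. */
static int vm_add_vcpu(struct vm *vm, uint8_t id)
{
	if (vm->nr_vcpus >= MAX_VCPUS || id >= MAX_ID)
		return -1;
	vm->ids[vm->nr_vcpus++] = id;
	free(vm->cache);
	vm->cache = NULL;
	return 0;
}

/* Fast path through the table if present, linear scan otherwise. */
static int vm_find_vcpu(const struct vm *vm, uint8_t id)
{
	if (id >= MAX_ID)
		return -1;
	if (vm->cache)
		return vm->cache[id];
	for (int i = 0; i < vm->nr_vcpus; i++)
		if (vm->ids[i] == id)
			return i;
	return -1;
}
```

Because the slow path always works, dropping the table never affects correctness; it only costs a linear scan until the table is rebuilt.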
    • KVM: arm64: Use hVHE in pKVM by default on CPUs with VHE support · 5053c3f0
      Will Deacon authored
      The early command line parsing treats "kvm-arm.mode=protected" as an
      alias for "id_aa64mmfr1.vh=0", forcing the use of nVHE so that the host
      kernel runs at EL1 with the pKVM hypervisor at EL2.
      
      With the introduction of hVHE support in ad744e8c ("arm64: Allow
      arm64_sw.hvhe on command line"), the hypervisor can run using the EL2+0
      translation regime. This is interesting for unusual CPUs that have VH
      stuck to 1, but also because it opens the possibility of a hypervisor
      "userspace" in the distant future which could be used to isolate vCPU
      contexts in the hypervisor (see Marc's talk from KVM Forum 2022 [1]).
      
      Repaint the "kvm-arm.mode=protected" alias to map to "arm64_sw.hvhe=1",
      which will use hVHE on CPUs that support it and remain with nVHE
      otherwise.
      
      [1] https://www.youtube.com/watch?v=1F_Mf2j9eIo
      Signed-off-by: Will Deacon <will@kernel.org>
      Acked-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20240501163400.15838-3-will@kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      5053c3f0
    • KVM: arm64: Fix hvhe/nvhe early alias parsing · 3c142f9d
      Will Deacon authored
      Booting a kernel with "arm64_sw.hvhe=1 kvm-arm.mode=nvhe" on the
      command-line results in KVM initialising using hVHE, whereas one might
      expect the latter option to override the former.
      
      Fix this by adding "arm64_sw.hvhe=0" to the alias expansion for
      "kvm-arm.mode=nvhe".
      Signed-off-by: Will Deacon <will@kernel.org>
      Acked-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20240501163400.15838-2-will@kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      3c142f9d
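The effect of the two alias changes above can be illustrated with a small user-space sketch. It is not the kernel's early parser: `struct overrides`, `parse_cmdline()` and `apply_option()` are hypothetical, and the presence of "id_aa64mmfr1.vh=0" in the nvhe expansion is an assumption (the commit message only confirms that "arm64_sw.hvhe=0" is added). The sketch also ignores the hardware check that decides whether hVHE can actually be used.

```c
#include <string.h>

struct alias {
	const char *name;
	const char *expansion;
};

/* Alias table after both commits (nvhe's vh=0 part is an assumption). */
static const struct alias aliases[] = {
	{ "kvm-arm.mode=nvhe",      "arm64_sw.hvhe=0 id_aa64mmfr1.vh=0" },
	{ "kvm-arm.mode=protected", "arm64_sw.hvhe=1" },
};

/* -1 means "not set on the command line". */
struct overrides {
	int hvhe;
	int vh;
};

static int opt_is(const char *opt, size_t len, const char *s)
{
	return strlen(s) == len && !strncmp(opt, s, len);
}

static void parse_cmdline(struct overrides *o, const char *cmdline);

/* Expand an alias if there is one, otherwise record the override. */
static void apply_option(struct overrides *o, const char *opt, size_t len)
{
	for (size_t i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++) {
		if (opt_is(opt, len, aliases[i].name)) {
			parse_cmdline(o, aliases[i].expansion);
			return;
		}
	}
	if (opt_is(opt, len, "arm64_sw.hvhe=0"))
		o->hvhe = 0;
	else if (opt_is(opt, len, "arm64_sw.hvhe=1"))
		o->hvhe = 1;
	else if (opt_is(opt, len, "id_aa64mmfr1.vh=0"))
		o->vh = 0;
}

/* Walk space-separated options left to right; later options win. */
static void parse_cmdline(struct overrides *o, const char *cmdline)
{
	const char *p = cmdline;

	while (*p) {
		size_t len;

		p += strspn(p, " ");
		len = strcspn(p, " ");
		if (!len)
			break;
		apply_option(o, p, len);
		p += len;
	}
}
```

With this table, parsing "arm64_sw.hvhe=1 kvm-arm.mode=nvhe" leaves `hvhe` at 0, because the nvhe alias now carries an explicit "arm64_sw.hvhe=0" that overrides the earlier option.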
  3. 03 May, 2024 7 commits
    • Merge branch kvm-arm64/pkvm-6.10 into kvmarm-master/next · 8540bd1b
      Marc Zyngier authored
      * kvm-arm64/pkvm-6.10: (25 commits)
        : .
        : At last, a bunch of pKVM patches, courtesy of Fuad Tabba.
        : From the cover letter:
        :
        : "This series is a bit of a bombay-mix of patches we've been
        : carrying. There's no one overarching theme, but they do improve
        : the code by fixing existing bugs in pKVM, refactoring code to
        : make it more readable and easier to re-use for pKVM, or adding
        : functionality to the existing pKVM code upstream."
        : .
        KVM: arm64: Force injection of a data abort on NISV MMIO exit
        KVM: arm64: Restrict supported capabilities for protected VMs
        KVM: arm64: Refactor setting the return value in kvm_vm_ioctl_enable_cap()
        KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst
        KVM: arm64: Rename firmware pseudo-register documentation file
        KVM: arm64: Reformat/beautify PTP hypercall documentation
        KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit
        KVM: arm64: Introduce and use predicates that check for protected VMs
        KVM: arm64: Add is_pkvm_initialized() helper
        KVM: arm64: Simplify vgic-v3 hypercalls
        KVM: arm64: Move setting the page as dirty out of the critical section
        KVM: arm64: Change kvm_handle_mmio_return() return polarity
        KVM: arm64: Fix comment for __pkvm_vcpu_init_traps()
        KVM: arm64: Prevent kmemleak from accessing .hyp.data
        KVM: arm64: Do not map the host fpsimd state to hyp in pKVM
        KVM: arm64: Rename __tlb_switch_to_{guest,host}() in VHE
        KVM: arm64: Support TLB invalidation in guest context
        KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE
        KVM: arm64: Check for PTE validity when checking for executable/cacheable
        KVM: arm64: Avoid BUG-ing from the host abort path
        ...
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      8540bd1b
    • Merge branch kvm-arm64/lpi-xa-cache into kvmarm-master/next · 3d5689e0
      Marc Zyngier authored
      * kvm-arm64/lpi-xa-cache:
        : .
        : New and improved LPI translation cache from Oliver Upton.
        :
        : From the cover letter:
        :
        : "As discussed [*], here is the new take on the LPI translation cache,
        : migrating to an xarray indexed by (devid, eventid) per ITS.
        :
        : The end result is quite satisfying, as it becomes possible to rip out
        : other nasties such as the lpi_list_lock. To that end, patches 2-6 aren't
        : _directly_ related to the translation cache cleanup, but instead are
        : done to enable the cleanups at the end of the series.
        :
        : I changed out my test machine from the last time so the baseline has
        : moved a bit, but here are the results from the vgic_lpi_stress test:
        :
        : +----------------------------+------------+-------------------+
        : |       Configuration        |  v6.8-rc1  | v6.8-rc1 + series |
        : +----------------------------+------------+-------------------+
        : | -v 1 -d 1 -e 1 -i 1000000  | 2063296.81 |        1362602.35 |
        : | -v 16 -d 16 -e 16 -i 10000 |  610678.33 |        5200910.01 |
        : | -v 16 -d 16 -e 17 -i 10000 |  678361.53 |        5890675.51 |
        : | -v 32 -d 32 -e 1 -i 100000 |  580918.96 |        8304552.67 |
        : | -v 1 -d 1 -e 17 -i 1000    | 1512443.94 |         1425953.8 |
        : +----------------------------+------------+-------------------+
        :
        : Unlike last time, no dramatic regressions at any performance point. The
        : regression on a single interrupt stream is to be expected, as the
        : overheads of SRCU and two tree traversals (kvm_io_bus_get_dev(),
        : translation cache xarray) are likely greater than that of a linked-list
        : with a single node."
        : .
        KVM: selftests: Add stress test for LPI injection
        KVM: selftests: Use MPIDR_HWID_BITMASK from cputype.h
        KVM: selftests: Add helper for enabling LPIs on a redistributor
        KVM: selftests: Add a minimal library for interacting with an ITS
        KVM: selftests: Add quadword MMIO accessors
        KVM: selftests: Standardise layout of GIC frames
        KVM: selftests: Align with kernel's GIC definitions
        KVM: arm64: vgic-its: Get rid of the lpi_list_lock
        KVM: arm64: vgic-its: Rip out the global translation cache
        KVM: arm64: vgic-its: Use the per-ITS translation cache for injection
        KVM: arm64: vgic-its: Spin off helper for finding ITS by doorbell addr
        KVM: arm64: vgic-its: Maintain a translation cache per ITS
        KVM: arm64: vgic-its: Scope translation cache invalidations to an ITS
        KVM: arm64: vgic-its: Get rid of vgic_copy_lpi_list()
        KVM: arm64: vgic-debug: Use an xarray mark for debug iterator
        KVM: arm64: vgic-its: Walk LPI xarray in vgic_its_cmd_handle_movall()
        KVM: arm64: vgic-its: Walk LPI xarray in vgic_its_invall()
        KVM: arm64: vgic-its: Walk LPI xarray in its_sync_lpi_pending_table()
        KVM: Treat the device list as an rculist
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      3d5689e0
    • Merge branch kvm-arm64/nv-eret-pauth into kvmarm-master/next · 2d38f439
      Marc Zyngier authored
      * kvm-arm64/nv-eret-pauth:
        : .
        : Add NV support for the ERETAA/ERETAB instructions. From the cover letter:
        :
        : "Although the current upstream NV support has *some* support for
        : correctly emulating ERET, that support is only partial as it doesn't
        : support the ERETAA and ERETAB variants.
        :
        : Supporting these instructions was cast aside for a long time as it
        : involves implementing some form of PAuth emulation, something I wasn't
        : overly keen on. But I have reached a point where enough of the
        : infrastructure is there that it actually makes sense. So here it is!"
        : .
        KVM: arm64: nv: Work around lack of pauth support in old toolchains
        KVM: arm64: Drop trapping of PAuth instructions/keys
        KVM: arm64: nv: Advertise support for PAuth
        KVM: arm64: nv: Handle ERETA[AB] instructions
        KVM: arm64: nv: Add emulation for ERETAx instructions
        KVM: arm64: nv: Add kvm_has_pauth() helper
        KVM: arm64: nv: Reinject PAC exceptions caused by HCR_EL2.API==0
        KVM: arm64: nv: Handle HCR_EL2.{API,APK} independently
        KVM: arm64: nv: Honor HFGITR_EL2.ERET being set
        KVM: arm64: nv: Fast-track 'InHost' exception returns
        KVM: arm64: nv: Add trap forwarding for ERET and SMC
        KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2
        KVM: arm64: nv: Drop VCPU_HYP_CONTEXT flag
        KVM: arm64: Constraint PAuth support to consistent implementations
        KVM: arm64: Add helpers for ESR_ELx_ERET_ISS_ERET*
        KVM: arm64: Harden __ctxt_sys_reg() against out-of-range values
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      2d38f439
    • Merge branch kvm-arm64/host_data into kvmarm-master/next · 34c0d5a6
      Marc Zyngier authored
      * kvm-arm64/host_data:
        : .
        : Rationalise the host-specific data to live as part of the per-CPU state.
        :
        : From the cover letter:
        :
        : "It appears that over the years, we have accumulated a lot of cruft in
        : the kvm_vcpu_arch structure. Part of the gunk is data that is strictly
        : host CPU specific, and this results in two main problems:
        :
        : - the structure itself is stupidly large, over 8kB. With the
        :   arch-agnostic kvm_vcpu, we're above 10kB, which is insane. This has
        :   some ripple effects, as we need physically contiguous allocation to
        :   be able to map it at EL2 for !VHE. There is more to it though, as
        :   some data structures, although per-vcpu, could be allocated
        :   separately.
        :
        : - We lose track of the life-cycle of this data, because we're
        :   guaranteed that it will be around forever and we start relying on
        :   wrong assumptions. This is becoming a maintenance burden.
        :
        : This series rectifies some of these things, starting with the two main
        : offenders: debug and FP, a lot of which gets pushed out to the per-CPU
        : host structure. Indeed, their lifetime really isn't that of the vcpu,
        : but tied to the physical CPU the vcpu runs on.
        :
        : This results in a small reduction of the vcpu size, but mainly a much
        : clearer understanding of the life-cycle of these structures."
        : .
        KVM: arm64: Move management of __hyp_running_vcpu to load/put on VHE
        KVM: arm64: Exclude FP ownership from kvm_vcpu_arch
        KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_arch
        KVM: arm64: Exclude mdcr_el2_host from kvm_vcpu_arch
        KVM: arm64: Exclude host_debug_data from vcpu_arch
        KVM: arm64: Add accessor for per-CPU state
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      34c0d5a6
    • KVM: arm64: Move management of __hyp_running_vcpu to load/put on VHE · 9a393599
      Marc Zyngier authored
      The per-CPU host context structure contains a __hyp_running_vcpu that
      serves as a replacement for kvm_get_running_vcpu() in contexts where
      we cannot make direct use of it (such as in the nVHE hypervisor).
      Since there is a lot of common code between nVHE and VHE, the latter
      also populates this field even if kvm_get_running_vcpu() always works.
      
      We are currently pretty inconsistent when populating __hyp_running_vcpu
      to point to the currently running vcpu:
      
      - on {n,h}VHE, we set __hyp_running_vcpu on entry to __kvm_vcpu_run
        and clear it on exit.
      
      - on VHE, we set __hyp_running_vcpu on entry to __kvm_vcpu_run_vhe
        and never clear it, effectively leaving a dangling pointer...
      
      VHE is obviously the odd one here. Although we could make it behave
      just like nVHE, this wouldn't match the behaviour of KVM with VHE,
      where the load phase is where most of the context-switch gets done.
      
      So move all the __hyp_running_vcpu management to the VHE-specific
      load/put phases, giving us a bit more sanity and matching the
      behaviour of kvm_get_running_vcpu().
      Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20240502154030.3011995-1-maz@kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      9a393599
    • KVM: arm64: Convert kvm_mpidr_index() to bitmap_gather() · 838d992b
      Marc Zyngier authored
      Linux 6.9 has introduced new bitmap manipulation helpers, with
      bitmap_gather() being of special interest, as it does exactly
      what kvm_mpidr_index() is already doing.
      
      Make the latter a wrapper around the former.
      Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20240502154247.3012042-1-maz@kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      838d992b
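For readers unfamiliar with the "gather" operation, here is a minimal user-space sketch of the idea on a single 64-bit word. It is not the kernel helper (which works on `unsigned long` bitmaps) and not the KVM code; `gather_bits()` is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/*
 * Collect the bits of 'value' that are selected by 'mask' and pack them
 * into the low-order bits of the result, preserving their relative order.
 */
static uint64_t gather_bits(uint64_t value, uint64_t mask)
{
	uint64_t out = 0;
	unsigned int n = 0;	/* next free bit position in 'out' */

	for (unsigned int bit = 0; bit < 64; bit++) {
		if (!(mask & (UINT64_C(1) << bit)))
			continue;
		if (value & (UINT64_C(1) << bit))
			out |= UINT64_C(1) << n;
		n++;
	}
	return out;
}
```

For example, with a mask of 0x0101 (bits 0 and 8), the value 0x0102 gathers to 2: bit 0 of the input is clear and bit 8 is set, so the packed result is binary 10. Applied to an MPIDR value and a mask of the affinity bits that actually vary, the same operation yields a compact index.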
    • KVM: arm64: vgic: Allocate private interrupts on demand · 03b3d00a
      Marc Zyngier authored
      Private interrupts are currently part of the CPU interface structure
      that is part of each and every vcpu we create.
      
      Currently, we have 32 of them per vcpu, resulting in a per-vcpu array
      that is just shy of 4kB. On its own, that's no big deal, but it gets
      in the way of other things:
      
      - each vcpu gets mapped at EL2 on nVHE/hVHE configurations. This
        requires memory that is physically contiguous. However, the EL2
        code has no purpose looking at the interrupt structures and
        could do without them being mapped.
      
      - supporting features such as EPPIs, which extend the number of
        private interrupts past the current limit of 32, would make the
        array even larger, even for VMs that do not use the EPPI feature.
      
      Address these issues by moving the private interrupt array outside
      of the vcpu and replacing it with a simple pointer. We take this
      opportunity to make it obvious what gets initialised when, as
      that path was remarkably opaque, and to tighten the locking.
      Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
      Link: https://lore.kernel.org/r/20240502154545.3012089-1-maz@kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      03b3d00a
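The "pointer instead of embedded array" change can be sketched in user-space C as follows. This is only an illustration of the pattern under simplifying assumptions: `struct irq`, `struct vcpu`, `vcpu_alloc_private_irqs()` and `vcpu_free_private_irqs()` are hypothetical, and the locking the commit message mentions is left out.

```c
#include <errno.h>
#include <stdlib.h>

#define NR_PRIVATE_IRQS 32	/* the current per-vcpu count */

/* Stand-in for the per-interrupt state; the real structure is larger. */
struct irq {
	unsigned int intid;
	int enabled;
};

/* The vcpu only carries a pointer, so its own size no longer grows. */
struct vcpu {
	struct irq *private_irqs;	/* NULL until first needed */
};

/* Allocate the private interrupts once; later calls are no-ops. */
static int vcpu_alloc_private_irqs(struct vcpu *vcpu)
{
	if (vcpu->private_irqs)
		return 0;

	vcpu->private_irqs = calloc(NR_PRIVATE_IRQS,
				    sizeof(*vcpu->private_irqs));
	if (!vcpu->private_irqs)
		return -ENOMEM;

	for (unsigned int i = 0; i < NR_PRIVATE_IRQS; i++)
		vcpu->private_irqs[i].intid = i;
	return 0;
}

static void vcpu_free_private_irqs(struct vcpu *vcpu)
{
	free(vcpu->private_irqs);
	vcpu->private_irqs = NULL;
}
```

Since the array lives in its own allocation, only the small `struct vcpu` would need to be mapped contiguously, and the element count can change without touching the vcpu layout.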
  4. 01 May, 2024 21 commits