- 09 May, 2024 8 commits
-
-
Marc Zyngier authored
* kvm-arm64/mpidr-reset: : . : Fixes for CLIDR_EL1 and MPIDR_EL1 being accidentally mutable across : a vcpu reset, courtesy of Oliver. From the cover letter: : : "For VM-wide feature ID registers we ensure they get initialized once for : the lifetime of a VM. On the other hand, vCPU-local feature ID registers : get re-initialized on every vCPU reset, potentially clobbering the : values userspace set up. : : MPIDR_EL1 and CLIDR_EL1 are the only registers in this space that we : allow userspace to modify for now. Clobbering the value of MPIDR_EL1 has : some disastrous side effects as the compressed index used by the : MPIDR-to-vCPU lookup table assumes MPIDR_EL1 is immutable after KVM_RUN. : : Series + reproducer test case to address the problem of KVM wiping out : userspace changes to these registers. Note that there are still some : differences between VM and vCPU scoped feature ID registers from the : perspective of userspace. We do not allow the value of VM-scope : registers to change after KVM_RUN, but vCPU registers remain mutable." : . KVM: selftests: arm64: Test vCPU-scoped feature ID registers KVM: selftests: arm64: Test that feature ID regs survive a reset KVM: selftests: arm64: Store expected register value in set_id_regs KVM: selftests: arm64: Rename helper in set_id_regs to imply VM scope KVM: arm64: Only reset vCPU-scoped feature ID regs once KVM: arm64: Reset VM feature ID regs from kvm_reset_sys_regs() KVM: arm64: Rename is_id_reg() to imply VM scope Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Oliver Upton authored
Test that CLIDR_EL1 and MPIDR_EL1 are modifiable from userspace and that the values are preserved across a vCPU reset like the other feature ID registers. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-8-oliver.upton@linux.devSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Oliver Upton authored
One of the expectations with feature ID registers is that their values survive a vCPU reset. Start testing that. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-7-oliver.upton@linux.devSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Oliver Upton authored
Rather than comparing against what is returned by the ioctl, store expected values for the feature ID registers in a table and compare with that instead. This will prove useful for subsequent tests involving vCPU reset. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-6-oliver.upton@linux.devSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Oliver Upton authored
Prepare for a later change that'll cram in per-vCPU feature ID test cases by renaming the current test case. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-5-oliver.upton@linux.devSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Oliver Upton authored
The general expecation with feature ID registers is that they're 'reset' exactly once by KVM for the lifetime of a vCPU/VM, such that any userspace changes to the CPU features / identity are honored after a vCPU gets reset (e.g. PSCI_ON). KVM handles what it calls VM-scoped feature ID registers correctly, but feature ID registers local to a vCPU (CLIDR_EL1, MPIDR_EL1) get wiped after every reset. What's especially concerning is that a potentially-changing MPIDR_EL1 breaks MPIDR compression for indexing mpidr_data, as the mask of useful bits to build the index could change. This is absolutely no good. Avoid resetting vCPU feature ID registers more than once. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-4-oliver.upton@linux.devSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Oliver Upton authored
A subsequent change to KVM will expand the range of feature ID registers that get special treatment at reset. Fold the existing ones back in to kvm_reset_sys_regs() to avoid the need for an additional table walk. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-3-oliver.upton@linux.devSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Oliver Upton authored
The naming of some of the feature ID checks is ambiguous. Rephrase the is_id_reg() helper to make its purpose slightly clearer. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502233529.1958459-2-oliver.upton@linux.devSigned-off-by: Marc Zyngier <maz@kernel.org>
-
- 08 May, 2024 4 commits
-
-
Marc Zyngier authored
* kvm-arm64/misc-6.10: : . : Misc fixes and updates targeting 6.10 : : - Improve boot-time diagnostics when the sysreg tables : are not correctly sorted : : - Allow FFA_MSG_SEND_DIRECT_REQ in the FFA proxy : : - Fix duplicate XNX field in the ID_AA64MMFR1_EL1 : writeable mask : : - Allocate PPIs and SGIs outside of the vcpu structure, allowing : for smaller EL2 mapping and some flexibility in implementing : more or less than 32 private IRQs. : : - Use bitmap_gather() instead of its open-coded equivalent : : - Make protected mode use hVHE if available : : - Purge stale mpidr_data if a vcpu is created after the MPIDR : map has been created : . KVM: arm64: Destroy mpidr_data for 'late' vCPU creation KVM: arm64: Use hVHE in pKVM by default on CPUs with VHE support KVM: arm64: Fix hvhe/nvhe early alias parsing KVM: arm64: Convert kvm_mpidr_index() to bitmap_gather() KVM: arm64: vgic: Allocate private interrupts on demand KVM: arm64: Remove duplicated AA64MMFR1_EL1 XNX KVM: arm64: Remove FFA_MSG_SEND_DIRECT_REQ from the denylist KVM: arm64: Improve out-of-order sysreg table diagnostics Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Oliver Upton authored
A particularly annoying userspace could create a vCPU after KVM has computed mpidr_data for the VM, either by racing against VGIC initialization or having a userspace irqchip. In any case, this means mpidr_data no longer fully describes the VM, and attempts to find the new vCPU with kvm_mpidr_to_vcpu() will fail. The fix is to discard mpidr_data altogether, as it is only a performance optimization and not required for correctness. In all likelihood KVM will recompute the mappings when KVM_RUN is called on the new vCPU. Note that reads of mpidr_data are not guarded by a lock; promote to RCU to cope with the possibility of mpidr_data being invalidated at runtime. Fixes: 54a8006d ("KVM: arm64: Fast-track kvm_mpidr_to_vcpu() when mpidr_data is available") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240508071952.2035422-1-oliver.upton@linux.devSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Will Deacon authored
The early command line parsing treats "kvm-arm.mode=protected" as an alias for "id_aa64mmfr1.vh=0", forcing the use of nVHE so that the host kernel runs at EL1 with the pKVM hypervisor at EL2. With the introduction of hVHE support in ad744e8c ("arm64: Allow arm64_sw.hvhe on command line"), the hypervisor can run using the EL2+0 translation regime. This is interesting for unusual CPUs that have VH stuck to 1, but also because it opens the possibility of a hypervisor "userspace" in the distant future which could be used to isolate vCPU contexts in the hypervisor (see Marc's talk from KVM Forum 2022 [1]). Repaint the "kvm-arm.mode=protected" alias to map to "arm64_sw.hvhe=1", which will use hVHE on CPUs that support it and remain with nVHE otherwise. [1] https://www.youtube.com/watch?v=1F_Mf2j9eIoSigned-off-by: Will Deacon <will@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240501163400.15838-3-will@kernel.orgSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Will Deacon authored
Booting a kernel with "arm64_sw.hvhe=1 kvm-arm.mode=nvhe" on the command-line results in KVM initialising using hVHE, whereas one might expect the latter option to override the former. Fix this by adding "arm64_sw.hvhe=0" to the alias expansion for "kvm-arm.mode=nvhe". Signed-off-by: Will Deacon <will@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240501163400.15838-2-will@kernel.orgSigned-off-by: Marc Zyngier <maz@kernel.org>
-
- 03 May, 2024 7 commits
-
-
Marc Zyngier authored
* kvm-arm64/pkvm-6.10: (25 commits) : . : At last, a bunch of pKVM patches, courtesy of Fuad Tabba. : From the cover letter: : : "This series is a bit of a bombay-mix of patches we've been : carrying. There's no one overarching theme, but they do improve : the code by fixing existing bugs in pKVM, refactoring code to : make it more readable and easier to re-use for pKVM, or adding : functionality to the existing pKVM code upstream." : . KVM: arm64: Force injection of a data abort on NISV MMIO exit KVM: arm64: Restrict supported capabilities for protected VMs KVM: arm64: Refactor setting the return value in kvm_vm_ioctl_enable_cap() KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst KVM: arm64: Rename firmware pseudo-register documentation file KVM: arm64: Reformat/beautify PTP hypercall documentation KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit KVM: arm64: Introduce and use predicates that check for protected VMs KVM: arm64: Add is_pkvm_initialized() helper KVM: arm64: Simplify vgic-v3 hypercalls KVM: arm64: Move setting the page as dirty out of the critical section KVM: arm64: Change kvm_handle_mmio_return() return polarity KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() KVM: arm64: Prevent kmemleak from accessing .hyp.data KVM: arm64: Do not map the host fpsimd state to hyp in pKVM KVM: arm64: Rename __tlb_switch_to_{guest,host}() in VHE KVM: arm64: Support TLB invalidation in guest context KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE KVM: arm64: Check for PTE validity when checking for executable/cacheable KVM: arm64: Avoid BUG-ing from the host abort path ... Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
* kvm-arm64/lpi-xa-cache: : . : New and improved LPI translation cache from Oliver Upton. : : From the cover letter: : : "As discussed [*], here is the new take on the LPI translation cache, : migrating to an xarray indexed by (devid, eventid) per ITS. : : The end result is quite satisfying, as it becomes possible to rip out : other nasties such as the lpi_list_lock. To that end, patches 2-6 aren't : _directly_ related to the translation cache cleanup, but instead are : done to enable the cleanups at the end of the series. : : I changed out my test machine from the last time so the baseline has : moved a bit, but here are the results from the vgic_lpi_stress test: : : +----------------------------+------------+-------------------+ : | Configuration | v6.8-rc1 | v6.8-rc1 + series | : +----------------------------+------------+-------------------+ : | -v 1 -d 1 -e 1 -i 1000000 | 2063296.81 | 1362602.35 | : | -v 16 -d 16 -e 16 -i 10000 | 610678.33 | 5200910.01 | : | -v 16 -d 16 -e 17 -i 10000 | 678361.53 | 5890675.51 | : | -v 32 -d 32 -e 1 -i 100000 | 580918.96 | 8304552.67 | : | -v 1 -d 1 -e 17 -i 1000 | 1512443.94 | 1425953.8 | : +----------------------------+------------+-------------------+ : : Unlike last time, no dramatic regressions at any performance point. The : regression on a single interrupt stream is to be expected, as the : overheads of SRCU and two tree traversals (kvm_io_bus_get_dev(), : translation cache xarray) are likely greater than that of a linked-list : with a single node." : . KVM: selftests: Add stress test for LPI injection KVM: selftests: Use MPIDR_HWID_BITMASK from cputype.h KVM: selftests: Add helper for enabling LPIs on a redistributor KVM: selftests: Add a minimal library for interacting with an ITS KVM: selftests: Add quadword MMIO accessors KVM: selftests: Standardise layout of GIC frames KVM: selftests: Align with kernel's GIC definitions KVM: arm64: vgic-its: Get rid of the lpi_list_lock KVM: arm64: vgic-its: Rip out the global translation cache KVM: arm64: vgic-its: Use the per-ITS translation cache for injection KVM: arm64: vgic-its: Spin off helper for finding ITS by doorbell addr KVM: arm64: vgic-its: Maintain a translation cache per ITS KVM: arm64: vgic-its: Scope translation cache invalidations to an ITS KVM: arm64: vgic-its: Get rid of vgic_copy_lpi_list() KVM: arm64: vgic-debug: Use an xarray mark for debug iterator KVM: arm64: vgic-its: Walk LPI xarray in vgic_its_cmd_handle_movall() KVM: arm64: vgic-its: Walk LPI xarray in vgic_its_invall() KVM: arm64: vgic-its: Walk LPI xarray in its_sync_lpi_pending_table() KVM: Treat the device list as an rculist Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
* kvm-arm64/nv-eret-pauth: : . : Add NV support for the ERETAA/ERETAB instructions. From the cover letter: : : "Although the current upstream NV support has *some* support for : correctly emulating ERET, that support is only partial as it doesn't : support the ERETAA and ERETAB variants. : : Supporting these instructions was cast aside for a long time as it : involves implementing some form of PAuth emulation, something I wasn't : overly keen on. But I have reached a point where enough of the : infrastructure is there that it actually makes sense. So here it is!" : . KVM: arm64: nv: Work around lack of pauth support in old toolchains KVM: arm64: Drop trapping of PAuth instructions/keys KVM: arm64: nv: Advertise support for PAuth KVM: arm64: nv: Handle ERETA[AB] instructions KVM: arm64: nv: Add emulation for ERETAx instructions KVM: arm64: nv: Add kvm_has_pauth() helper KVM: arm64: nv: Reinject PAC exceptions caused by HCR_EL2.API==0 KVM: arm64: nv: Handle HCR_EL2.{API,APK} independently KVM: arm64: nv: Honor HFGITR_EL2.ERET being set KVM: arm64: nv: Fast-track 'InHost' exception returns KVM: arm64: nv: Add trap forwarding for ERET and SMC KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2 KVM: arm64: nv: Drop VCPU_HYP_CONTEXT flag KVM: arm64: Constraint PAuth support to consistent implementations KVM: arm64: Add helpers for ESR_ELx_ERET_ISS_ERET* KVM: arm64: Harden __ctxt_sys_reg() against out-of-range values Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
* kvm-arm64/host_data: : . : Rationalise the host-specific data to live as part of the per-CPU state. : : From the cover letter: : : "It appears that over the years, we have accumulated a lot of cruft in : the kvm_vcpu_arch structure. Part of the gunk is data that is strictly : host CPU specific, and this result in two main problems: : : - the structure itself is stupidly large, over 8kB. With the : arch-agnostic kvm_vcpu, we're above 10kB, which is insane. This has : some ripple effects, as we need physically contiguous allocation to : be able to map it at EL2 for !VHE. There is more to it though, as : some data structures, although per-vcpu, could be allocated : separately. : : - We lose track of the life-cycle of this data, because we're : guaranteed that it will be around forever and we start relying on : wrong assumptions. This is becoming a maintenance burden. : : This series rectifies some of these things, starting with the two main : offenders: debug and FP, a lot of which gets pushed out to the per-CPU : host structure. Indeed, their lifetime really isn't that of the vcpu, : but tied to the physical CPU the vpcu runs on. : : This results in a small reduction of the vcpu size, but mainly a much : clearer understanding of the life-cycle of these structures." : . KVM: arm64: Move management of __hyp_running_vcpu to load/put on VHE KVM: arm64: Exclude FP ownership from kvm_vcpu_arch KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_arch KVM: arm64: Exclude mdcr_el2_host from kvm_vcpu_arch KVM: arm64: Exclude host_debug_data from vcpu_arch KVM: arm64: Add accessor for per-CPU state Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
The per-CPU host context structure contains a __hyp_running_vcpu that serves as a replacement for kvm_get_current_vcpu() in contexts where we cannot make direct use of it (such as in the nVHE hypervisor). Since there is a lot of common code between nVHE and VHE, the latter also populates this field even if kvm_get_running_vcpu() always works. We currently pretty inconsistent when populating __hyp_running_vcpu to point to the currently running vcpu: - on {n,h}VHE, we set __hyp_running_vcpu on entry to __kvm_vcpu_run and clear it on exit. - on VHE, we set __hyp_running_vcpu on entry to __kvm_vcpu_run_vhe and never clear it, effectively leaving a dangling pointer... VHE is obviously the odd one here. Although we could make it behave just like nVHE, this wouldn't match the behaviour of KVM with VHE, where the load phase is where most of the context-switch gets done. So move all the __hyp_running_vcpu management to the VHE-specific load/put phases, giving us a bit more sanity and matching the behaviour of kvm_get_running_vcpu(). Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502154030.3011995-1-maz@kernel.orgSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
Linux 6.9 has introduced new bitmap manipulation helpers, with bitmap_gather() being of special interest, as it does exactly what kvm_mpidr_index() is already doing. Make the latter a wrapper around the former. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502154247.3012042-1-maz@kernel.orgSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
Private interrupts are currently part of the CPU interface structure that is part of each and every vcpu we create. Currently, we have 32 of them per vcpu, resulting in a per-vcpu array that is just shy of 4kB. On its own, that's no big deal, but it gets in the way of other things: - each vcpu gets mapped at EL2 on nVHE/hVHE configurations. This requires memory that is physically contiguous. However, the EL2 code has no purpose looking at the interrupt structures and could do without them being mapped. - supporting features such as EPPIs, which extend the number of private interrupts past the 32 limit would make the array even larger, even for VMs that do not use the EPPI feature. Address these issues by moving the private interrupt array outside of the vcpu, and replace it with a simple pointer. We take this opportunity to make it obvious what gets initialised when, as that path was remarkably opaque, and tighten the locking. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240502154545.3012089-1-maz@kernel.orgSigned-off-by: Marc Zyngier <maz@kernel.org>
-
- 01 May, 2024 21 commits
-
-
Marc Zyngier authored
If a vcpu exits for a data abort with an invalid syndrome, the expectations are that userspace has a chance to save the day if it has requested to see such exits. However, this is completely futile in the case of a protected VM, as none of the state is available. In this particular case, inject a data abort directly into the vcpu, consistent with what userspace could do. This also helps with pKVM, which discards all syndrome information when forwarding data aborts that are not known to be MMIO. Finally, document this tweak to the API. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-31-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Fuad Tabba authored
For practical reasons as well as security related ones, not all capabilities are supported for protected VMs in pKVM. Add a function that restricts the capabilities for protected VMs. This behaves as an allow-list to ensure that future capabilities are checked for compatibility and security before being allowed for protected VMs. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-30-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Fuad Tabba authored
Initialize r = -EINVAL to get rid of the error-path initializations in kvm_vm_ioctl_enable_cap(). No functional change intended. Suggested-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-29-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Will Deacon authored
KVM/arm64 makes use of the SMCCC "Vendor Specific Hypervisor Service Call Range" to expose KVM-specific hypercalls to guests in a discoverable and extensible fashion. Document the existence of this interface and the discovery hypercall. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-28-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Will Deacon authored
In preparation for describing the guest view of KVM/arm64 hypercalls in hypercalls.rst, move the existing contents of the file concerning the firmware pseudo-registers elsewhere. Cc: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-27-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Will Deacon authored
The PTP hypercall documentation doesn't produce the best-looking table when formatting in HTML as all of the return value definitions end up on the same line. Reformat the PTP hypercall documentation to follow the formatting used by hypercalls.rst. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-26-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Fuad Tabba authored
Expand comment clarifying why the host value representing SVE vector length being restored for ZCR_EL1 on guest exit isn't the same as it was on guest entry. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-21-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Fuad Tabba authored
In order to determine whether or not a VM or vcpu are protected, introduce helpers to query this state. While at it, use the vcpu helper to check vcpus protected state instead of the kvm one. Co-authored-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-19-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Quentin Perret authored
Add a helper allowing to check when the pkvm static key is enabled to ease the introduction of pkvm hooks in other parts of the code. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-18-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
Consolidate the GICv3 VMCR accessor hypercalls into the APR save/restore hypercalls so that all of the EL2 GICv3 state is covered by a single pair of hypercalls. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-17-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Fuad Tabba authored
Move the unlock earlier in user_mem_abort() to shorten the critical section. This also helps for future refactoring and reuse of similar code. This moves out marking the page as dirty outside of the critical section. That code does not interact with the stage-2 page tables, which the read lock in the critical section protects. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-16-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Fuad Tabba authored
Most exit handlers return <= 0 to indicate that the host needs to handle the exit. Make kvm_handle_mmio_return() consistent with the exit handlers in handle_exit(). This makes the code easier to reason about, and makes it easier to add other handlers in future patches. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-15-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Fuad Tabba authored
Fix the comment to clarify that __pkvm_vcpu_init_traps() initializes traps for all VMs in protected mode, and not only for protected VMs. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-14-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Quentin Perret authored
We've added a .data section for the hypervisor, which kmemleak is eager to parse. This clearly doesn't go well, so add the section to kmemleak's block list. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-13-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Fuad Tabba authored
pKVM maintains its own state at EL2 for tracking the host fpsimd state. Therefore, no need to map and share the host's view with it. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-12-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Fuad Tabba authored
Rename __tlb_switch_to_{guest,host}() to {enter,exit}_vmid_context() in VHE code to maintain symmetry between the nVHE and VHE TLB invalidations. No functional change intended. Suggested-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-11-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Will Deacon authored
Typically, TLB invalidation of guest stage-2 mappings using nVHE is performed by a hypercall originating from the host. For the invalidation instruction to be effective, therefore, __tlb_switch_to_{guest,host}() swizzle the active stage-2 context around the TLBI instruction. With guest-to-host memory sharing and unsharing hypercalls originating from the guest under pKVM, there is need to support both guest and host VMID invalidations issued from guest context. Replace the __tlb_switch_to_{guest,host}() functions with a more general {enter,exit}_vmid_context() implementation which supports being invoked from guest context and acts as a no-op if the target context matches the running context. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-10-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Will Deacon authored
Break-before-make (BBM) can be expensive, as transitioning via an invalid mapping (i.e. the "break" step) requires the completion of TLB invalidation and can also cause other agents to fault concurrently on the invalid mapping. Since BBM is not required when changing only the software bits of a PTE, avoid the sequence in this case and just update the PTE directly. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-9-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
Don't just assume that the PTE is valid when checking whether it describes an executable or cacheable mapping. This makes sure that we don't issue CMOs for invalid mappings. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-8-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Quentin Perret authored
Under certain circumstances __get_fault_info() may resolve the faulting address using the AT instruction. Given that this is being done outside of the host lock critical section, it is racy and the resolution via AT may fail. We currently BUG() in this situation, which is obviously less than ideal. Moving the address resolution to the critical section may have a performance impact, so let's keep it where it is, but bail out and return to the host to try a second time. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-7-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-
Quentin Perret authored
On the guest teardown path, pKVM will zero the pages used to back the guest data structures before returning them to the host as they may contain secrets (e.g. in the vCPU registers). However, the zeroing is done using a cacheable alias, and CMOs are missing, hence giving the host a potential opportunity to read the original content of the guest structs from memory. Fix this by issuing CMOs after zeroing the pages. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240423150538.2103045-6-tabba@google.comSigned-off-by: Marc Zyngier <maz@kernel.org>
-