- 26 Sep, 2022 40 commits
-
-
Sean Christopherson authored
Exclude General Detect #DBs, which have fault-like behavior but also have a non-zero payload (DR6.BD=1), from nVMX's handling of pending debug traps. Opportunistically rewrite the comment to better document what is being checked, i.e. "has a non-zero payload" vs. "has a payload", and to call out the many caveats surrounding #DBs that KVM dodges one way or another. Cc: Oliver Upton <oupton@google.com> Cc: Peter Shier <pshier@google.com> Fixes: 684c0422 ("KVM: nVMX: Handle pending #DB when injecting INIT VM-exit") Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-7-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Suppress code breakpoints if MOV/POP SS blocking is active and the guest CPU is Intel, i.e. if the guest thinks it's running on an Intel CPU. Intel CPUs inhibit code #DBs when MOV/POP SS blocking is active, whereas AMD (and its descendents) do not. Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830231614.3580124-6-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Extend force_emulation_prefix to an 'int' and use bit 1 as a flag to indicate that KVM should clear RFLAGS.RF before emulating, e.g. to allow tests to force emulation of code breakpoints in conjunction with MOV/POP SS blocking, which is impossible without KVM intervention as VMX unconditionally sets RFLAGS.RF on intercepted #UD. Make the behavior controllable so that tests can also test RFLAGS.RF=1 (again in conjunction with code #DBs). Note, clearing RFLAGS.RF won't create an infinite #DB loop as the guest's IRET from the #DB handler will return to the instruction and not the prefix, i.e. the restart won't force emulation. Opportunistically convert the permissions to the preferred octal format. Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830231614.3580124-5-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Don't check for code breakpoints during instruction emulation if the emulation was triggered by exception interception. Code breakpoints are the highest priority fault-like exception, and KVM only emulates on exceptions that are fault-like. Thus, if hardware signaled a different exception, then the vCPU is already passed the stage of checking for hardware breakpoints. This is likely a glorified nop in terms of functionality, and is more for clarification and is technically an optimization. Intel's SDM explicitly states vmcs.GUEST_RFLAGS.RF on exception interception is the same as the value that would have been saved on the stack had the exception not been intercepted, i.e. will be '1' due to all fault-like exceptions setting RF to '1'. AMD says "guest state saved ... is the processor state as of the moment the intercept triggers", but that begs the question, "when does the intercept trigger?". Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-4-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Deliberately truncate the exception error code when shoving it into the VMCS (VM-Entry field for vmcs01 and vmcs02, VM-Exit field for vmcs12). Intel CPUs are incapable of handling 32-bit error codes and will never generate an error code with bits 31:16, but userspace can provide an arbitrary error code via KVM_SET_VCPU_EVENTS. Failure to drop the bits on exception injection results in failed VM-Entry, as VMX disallows setting bits 31:16. Setting the bits on VM-Exit would at best confuse L1, and at worse induce a nested VM-Entry failure, e.g. if L1 decided to reinject the exception back into L2. Cc: stable@vger.kernel.org Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-3-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Drop pending exceptions and events queued for re-injection when leaving nested guest mode, even if the "exit" is due to VM-Fail, SMI, or forced by host userspace. Failure to purge events could result in an event belonging to L2 being injected into L1. This _should_ never happen for VM-Fail as all events should be blocked by nested_run_pending, but it's possible if KVM, not the L1 hypervisor, is the source of VM-Fail when running vmcs02. SMI is a nop (barring unknown bugs) as recognition of SMI and thus entry to SMM is blocked by pending exceptions and re-injected events. Forced exit is definitely buggy, but has likely gone unnoticed because userspace probably follows the forced exit with KVM_SET_VCPU_EVENTS (or some other ioctl() that purges the queue). Fixes: 4f350c6d ("kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly") Cc: stable@vger.kernel.org Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220830231614.3580124-2-seanjc@google.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Hou Wenlong authored
Since the RDMSR/WRMSR emulation uses a sepearte emualtor interface, the trace points for RDMSR/WRMSR can be added in emulator path like normal path. Signed-off-by:
Hou Wenlong <houwenlong.hwl@antgroup.com> Link: https://lore.kernel.org/r/39181a9f777a72d61a4d0bb9f6984ccbd1de2ea3.1661930557.git.houwenlong.hwl@antgroup.comSigned-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Hou Wenlong authored
The return value of emulator_{get|set}_mst_with_filter() is confused, since msr access error and emulator error are mixed. Although, KVM_MSR_RET_* doesn't conflict with X86EMUL_IO_NEEDED at present, it is better to convert msr access error to emulator error if error value is needed. So move "r < 0" handling for wrmsr emulation into the set helper function, then only X86EMUL_* is returned in the helper functions. Also add "r < 0" check in the get helper function, although KVM doesn't return -errno today, but assuming that will always hold true is unnecessarily risking. Suggested-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Hou Wenlong <houwenlong.hwl@antgroup.com> Link: https://lore.kernel.org/r/09b2847fc3bcb8937fb11738f0ccf7be7f61d9dd.1661930557.git.houwenlong.hwl@antgroup.com [sean: wrap changelog less aggressively] Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Jilin Yuan authored
Delete the redundant word 'to'. Signed-off-by:
Jilin Yuan <yuanjilin@cdjrlc.com> Link: https://lore.kernel.org/r/20220831125217.12313-1-yuanjilin@cdjrlc.comSigned-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
vmcs_config has cached host MSR_IA32_VMX_MISC value, use it for setting up nested MSR_IA32_VMX_MISC in nested_vmx_setup_ctls_msrs() and avoid the redundant rdmsr(). No (real) functional change intended. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-34-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Like other host VMX control MSRs, MSR_IA32_VMX_MISC can be cached in vmcs_config to avoid the need to re-read it later, e.g. from cpu_has_vmx_intel_pt() or cpu_has_vmx_shadow_vmcs(). No (real) functional change intended. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-33-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Using raw host MSR values for setting up nested VMX control MSRs is incorrect as some features need to disabled, e.g. when KVM runs as a nested hypervisor on Hyper-V and uses Enlightened VMCS or when a workaround for IA32_PERF_GLOBAL_CTRL is applied. For non-nested VMX, this is done in setup_vmcs_config() and the result is stored in vmcs_config. Use it for setting up allowed-1 bits in nested VMX MSRs too. Suggested-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-32-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Similar to exit_ctls_low, entry_ctls_low, and procbased_ctls_low, pinbased_ctls_low should be set to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR and not host's MSR_IA32_VMX_PINBASED_CTLS value |= PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR. The commit eabeaacc ("KVM: nVMX: Clean up and fix pin-based execution controls") which introduced '|=' doesn't mention anything about why this is needed, the change seems rather accidental. Note: normally, required-1 portion of MSR_IA32_VMX_PINBASED_CTLS should be equal to PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR so no behavioral change is expected, however, it is (in theory) possible to observe something different there when e.g. KVM is running as a nested hypervisor. Hope this doesn't happen in practice. Reported-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-31-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
As a preparation to reusing the result of setup_vmcs_config() for setting up nested VMX control MSRs, move LOAD_IA32_PERF_GLOBAL_CTRL errata handling to vmx_vmexit_ctrl()/vmx_vmentry_ctrl() and print the warning from hardware_setup(). While it seems reasonable to not expose LOAD_IA32_PERF_GLOBAL_CTRL controls to L1 hypervisor on buggy CPUs, such change would inevitably break live migration from older KVMs where the controls are exposed. Keep the status quo for now, L1 hypervisor itself is supposed to take care of the errata. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-30-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Jim Mattson authored
Intel processor code names are more familiar to many readers than their decimal model numbers. Signed-off-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-29-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Clear the CR3 and INVLPG interception controls at runtime based on whether or not EPT is being _used_, as opposed to clearing the bits at setup if EPT is _supported_ in hardware, and then restoring them when EPT is not used. Not mucking with the base config will allow using the base config as the starting point for emulating the VMX capability MSRs. Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-28-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
As a preparation to reusing the result of setup_vmcs_config() in nested VMX MSR setup, add the CPU based VM execution controls which KVM doesn't use but supports for nVMX to KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL and filter them out in vmx_exec_control(). No functional change intended. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-27-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
As a preparation to reusing the result of setup_vmcs_config() in nested VMX MSR setup, add the VMEXIT controls which KVM doesn't use but supports for nVMX to KVM_OPT_VMX_VM_EXIT_CONTROLS and filter them out in vmx_vmexit_ctrl(). No functional change intended. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-26-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
As a preparation to reusing the result of setup_vmcs_config() in nested VMX MSR setup, move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering to vmx_exec_control(). No functional change intended. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-25-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
When VMX controls macros are used to set or clear a control bit, make sure that this bit was checked in setup_vmcs_config() and thus is properly reflected in vmcs_config. Opportunistically drop pointless "< 0" check for adjust_vmx_controls()'s return value. No functional change intended. Suggested-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-24-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Don't toggle VM_ENTRY_IA32E_MODE in 32-bit kernels/KVM and instead bug the VM if KVM attempts to run the guest with EFER.LMA=1. KVM doesn't support running 64-bit guests with 32-bit hosts. Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-23-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
SECONDARY_EXEC_ENCLS_EXITING is the only control which is conditionally added to the 'optional' checklist in setup_vmcs_config() but the special case can be avoided by always checking for its presence first and filtering out the result later. Note: the situation when SECONDARY_EXEC_ENCLS_EXITING is present but cpu_has_sgx() is false is possible when SGX is "soft-disabled", e.g. if software writes MCE control MSRs or there's an uncorrectable #MC. Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-22-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by vmx_enable_{irq,nmi}_window, handle_interrupt_window(), handle_nmi_window() but setup_vmcs_config() doesn't check their existence. Add the check and filter the controls out in vmx_exec_control(). Note: KVM explicitly supports CPUs without VIRTUAL_NMIS and all these CPUs are supposedly lacking NMI_WINDOW_EXITING too. Adjust cpu_has_virtual_nmis() accordingly. No functional change intended. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-21-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
VM_ENTRY_IA32E_MODE control is toggled dynamically by vmx_set_efer() and setup_vmcs_config() doesn't check its existence. On the contrary, nested_vmx_setup_ctls_msrs() doesn set it on x86_64. Add the missing check and filter the bit out in vmx_vmentry_ctrl(). No (real) functional change intended as all existing CPUs supporting long mode and VMX are supposed to have it. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by:
Jim Mattson <jmattson@google.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-20-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Advertise VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL as being supported for nested VMs irrespective of hardware support. KVM fully emulates the controls, i.e. manually emulates MSR writes on entry/exit, and never propagates the guest settings directly to vmcs02. In addition to allowing L1 VMMs to use the controls on older hardware, unconditionally advertising the controls will also allow KVM to use its vmcs01 configuration as the basis for the nested VMX configuration without causing a regression (due the errata which causes KVM to "hide" the control from vmcs01 but not vmcs12). Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-19-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Don't propagate vmcs12's VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL to vmcs02. KVM doesn't disallow L1 from using VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL even when KVM itself doesn't use the control, e.g. due to the various CPU errata that where the MSR can be corrupted on VM-Exit. Preserve KVM's (vmcs01) setting to hopefully avoid having to toggle the bit in vmcs02 at a later point. E.g. if KVM is loading PERF_GLOBAL_CTRL when running L1, then odds are good KVM will also load the MSR when running L2. Fixes: 8bf00a52 ("KVM: VMX: add support for switching of PERF_GLOBAL_CTRL") Cc: stable@vger.kernel.org Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20220830133737.1539624-18-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
With the updated eVMCSv1 definition, there's no known 'problematic' controls which are exposed in VMX control MSRs but are not present in eVMCSv1: all known Hyper-V versions either don't expose the new fields by not setting bits in the VMX feature controls or support the new eVMCS revision. Get rid of VMX control MSRs filtering for KVM on Hyper-V. Note: VMX control MSRs filtering for Hyper-V on KVM (nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1 definition doesn't have all the features implemented by KVM and some fields are still missing. Moreover, nested_evmcs_filter_control_msr() has to support the original eVMCSv1 version when VMM wishes so. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-17-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Enlightened VMCS v1 got updated and now includes the required fields for loading PERF_GLOBAL_CTRL upon VMENTER/VMEXIT features. For KVM on Hyper-V enablement, KVM can just observe VMX control MSRs and use the features (with or without eVMCS) when possible. Hyper-V on KVM is messier as Windows 11 guests fail to boot if the controls are advertised and a new PV feature flag, CPUID.0x4000000A.EBX BIT(0), is not set. Honor the Hyper-V CPUID feature flag to play nice with Windows guests. Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20220830133737.1539624-16-vkuznets@redhat.comSigned-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
WARN and reject nested VM-Enter if KVM is using eVMCS and manages to allow a non-zero value in the upper 32 bits of VM-function controls. The eVMCS code assumes all inputs are 32-bit values and subtly drops the upper bits. WARN instead of adding proper "support", it's unlikely the upper bits will be defined/used in the next decade. Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-15-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Update Enlightened VMCS definition in selftests from KVM. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-14-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
The updated Enlightened VMCS definition has 'encls_exiting_bitmap' field which needs mapping to VMCS, add the missing encoding. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by:
Kai Huang <kai.huang@intel.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-13-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
KVM has to check guest visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID leaf to know which Enlightened VMCS definition to use (original or 2022 update). Cache the leaf along with other Hyper-V CPUID feature leaves to make the check quick. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-12-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Enlightened VMCS v1 definition was updated with new fields, add support for them for Hyper-V on KVM. Note: SSP, CET and Guest LBR features are not supported by KVM yet and 'struct vmcs12' has no corresponding fields. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-11-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Enlightened VMCS v1 definition was updated with new fields, support them in KVM by defining VMCS-to-EVMCS conversion. Note: SSP, CET and Guest LBR features are not supported by KVM yet and the corresponding fields are not defined in 'enum vmcs_field', leave them commented out for now. Reviewed-by:
Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-10-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Locally #define and use the nested virtualization Consistency Check (CC) macro to handle eVMCS unsupported controls checks. Using the macro loses the existing printing of the unsupported controls, but that's a feature and not a bug. The existing approach is flawed because the @err param to trace_kvm_nested_vmenter_failed() is the error code, not the error value. The eVMCS trickery mostly works as __print_symbolic() falls back to printing the raw hex value, but that subtly relies on not having a match between the unsupported value and VMX_VMENTER_INSTRUCTION_ERRORS. If it's really truly necessary to snapshot the bad value, then the tracepoint can be extended in the future. Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-9-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Refactor the handling of unsupported eVMCS to use a 2-d array to store the set of unsupported controls. KVM's handling of eVMCS is completely broken as there is no way for userspace to query which features are unsupported, nor does KVM prevent userspace from attempting to enable unsupported features. A future commit will remedy that by filtering and enforcing unsupported features when eVMCS, but that needs to be opt-in from userspace to avoid breakage, i.e. KVM needs to maintain its legacy behavior by snapshotting the exact set of controls that are currently (un)supported by eVMCS. No functional change intended. Suggested-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> [sean: split to standalone patch, write changelog] Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-8-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
When querying whether or not eVMCS is enabled on behalf of the guest, treat eVMCS as enable if and only if Hyper-V is enabled/exposed to the guest. Note, flows that come from the host, e.g. KVM_SET_NESTED_STATE, must NOT check for Hyper-V being enabled as KVM doesn't require guest CPUID to be set before most ioctls(). Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-7-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Return -ENOMEM back to userspace if allocating the Hyper-V vCPU struct fails when enabling Hyper-V in guest CPUID. Silently ignoring failure means that KVM will not have an up-to-date CPUID cache if allocating the struct succeeds later on, e.g. when activating SynIC. Rejecting the CPUID operation also guarantess that vcpu->arch.hyperv is non-NULL if hyperv_enabled is true, which will allow for additional cleanup, e.g. in the eVMCS code. Note, the initialization needs to be done before CPUID is set, and more subtly before kvm_check_cpuid(), which potentially enables dynamic XFEATURES. Sadly, there's no easy way to avoid exposing Hyper-V details to CPUID or vice versa. Expose kvm_hv_vcpu_init() and the Hyper-V CPUID signature to CPUID instead of exposing cpuid_entry2_find() outside of CPUID code. It's hard to envision kvm_hv_vcpu_init() being misused, whereas cpuid_entry2_find() absolutely shouldn't be used outside of core CPUID code. Fixes: 10d7bf1e ("KVM: x86: hyper-v: Cache guest CPUID leaves determining features availability") Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-6-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
When potentially allocating/initializing the Hyper-V vCPU struct, check for an existing instance in kvm_hv_vcpu_init() instead of requiring callers to perform the check. Relying on callers to do the check is risky as it's all too easy for KVM to overwrite vcpu->arch.hyperv and leak memory, and it adds additional burden on callers without much benefit. No functional change intended. Signed-off-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20220830133737.1539624-5-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
Wipe the whole 'hv_vcpu->cpuid_cache' with memset() instead of having to zero each particular member when the corresponding CPUID entry was not found. No functional change intended. Signed-off-by:
Vitaly Kuznetsov <vkuznets@redhat.com> [sean: split to separate patch] Signed-off-by:
Sean Christopherson <seanjc@google.com> Reviewed-by:
Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20220830133737.1539624-4-vkuznets@redhat.comSigned-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-