Revert "KVM: VMX: Always honor guest PAT on CPUs that support self-snoop"

This reverts commit 377b2f35.

This caused a regression with the bochsdrm driver, which used ioremap() instead of ioremap_wc() to map the video RAM. After the commit, the WB memory type is used without the IGNORE_PAT, resulting in the slower UC memory type. In fact, UC is slow enough to basically cause guests to not boot... but only on new processors such as Sapphire Rapids and Cascade Lake. Coffee Lake for example works properly, though that might also be an effect of being on a larger, more NUMA system.

The driver has been fixed but that does not help older guests. Until we figure out whether Cascade Lake and newer processors are working as intended, revert the commit. Long term we might add a quirk, but the details depend on whether the processors are working as intended: for example if they are, the quirk might reference bochs-compatible devices, e.g. in the name and documentation, so that userspace can disable the quirk by default and only leave it enabled if such a device is being exposed to the guest.

If instead this is actually a bug in CLX+, then the actions we need to take are different and depend on the actual cause of the bug.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
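
To make the failure mode concrete, here is a minimal sketch in C of how the memory type that takes effect for a guest mapping is resolved in the two cases the message describes. It is a toy model, not KVM code: the `memtype` enum, its values, and `effective_memtype()` are hypothetical names introduced for illustration, and the model assumes the EPT memory type is always WB, as in the code this commit touches.

```c
#include <stdbool.h>

/* Illustrative memory types; these values are NOT the hardware encodings. */
enum memtype { MT_UC, MT_WC, MT_WT, MT_WB };

/*
 * Hypothetical helper, assuming the EPT entry always requests WB.
 *
 * ignore_pat == true:  the EPT type (WB) is used and guest PAT is ignored.
 * ignore_pat == false: guest PAT is honored, so the guest's own choice
 *                      (for example UC from a plain ioremap()) takes effect.
 */
static enum memtype effective_memtype(bool ignore_pat, enum memtype guest_pat)
{
	return ignore_pat ? MT_WB : guest_pat;
}
```

A guest driver that maps video RAM with ioremap() requests UC, so `effective_memtype(false, MT_UC)` gives `MT_UC`, which is the slow case described above, while `effective_memtype(true, MT_UC)` gives `MT_WB`.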

Commit 9d70f3fe authored Sep 15, 2024 by Paolo Bonzini

Showing with 7 additions and 11 deletions

arch/x86/kvm/mmu/mmu.c +3 -5
arch/x86/kvm/vmx/vmx.c +4 -6

--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4674,16 +4674,14 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
 bool kvm_mmu_may_ignore_guest_pat(void)
 {
 	/*
-	 * When EPT is enabled (shadow_memtype_mask is non-zero), the CPU does
-	 * not support self-snoop (or is affected by an erratum), and the VM
+	 * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
 	 * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
 	 * honor the memtype from the guest's PAT so that guest accesses to
 	 * memory that is DMA'd aren't cached against the guest's wishes.  As a
 	 * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
-	 * KVM _always_ ignores or honors guest PAT, i.e. doesn't toggle SPTE
-	 * bits in response to non-coherent device (un)registration.
+	 * KVM _always_ ignores guest PAT (when EPT is enabled).
 	 */
-	return !static_cpu_has(X86_FEATURE_SELFSNOOP) && shadow_memtype_mask;
+	return shadow_memtype_mask;
 }
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)

--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7659,13 +7659,11 @@ u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 	/*
 	 * Force WB and ignore guest PAT if the VM does NOT have a non-coherent
-	 * device attached and the CPU doesn't support self-snoop.  Letting the
-	 * guest control memory types on Intel CPUs without self-snoop may
-	 * result in unexpected behavior, and so KVM's (historical) ABI is to
-	 * trust the guest to behave only as a last resort.
+	 * device attached.  Letting the guest control memory types on Intel
+	 * CPUs may result in unexpected behavior, and so KVM's ABI is to trust
+	 * the guest to behave only as a last resort.
 	 */
-	if (!static_cpu_has(X86_FEATURE_SELFSNOOP) &&
-	    !kvm_arch_has_noncoherent_dma(vcpu->kvm))
+	if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
 		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
 	return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT);
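
The behavioral difference in the last hunk can be sketched as a standalone C program fragment. This is an illustration only, not the kernel function: `mt_mask_before_revert()` and `mt_mask_after_revert()` are hypothetical names, the two boolean parameters stand in for `static_cpu_has(X86_FEATURE_SELFSNOOP)` and `kvm_arch_has_noncoherent_dma()`, the constants are local copies assumed to match the kernel definitions, and only the tail of `vmx_get_mt_mask()` shown in the diff is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies for illustration; assumed to match the kernel's values. */
#define MTRR_TYPE_WRBACK	6
#define VMX_EPT_MT_EPTE_SHIFT	3
#define VMX_EPT_IPAT_BIT	(1u << 6)

#define EPT_WB			(MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT)

/* Reverted behavior: CPUs with self-snoop never set the ignore-PAT bit. */
static uint8_t mt_mask_before_revert(bool has_selfsnoop, bool noncoherent_dma)
{
	if (!has_selfsnoop && !noncoherent_dma)
		return (uint8_t)(EPT_WB | VMX_EPT_IPAT_BIT);
	return (uint8_t)EPT_WB;
}

/* Restored behavior: ignore guest PAT unless non-coherent DMA is present. */
static uint8_t mt_mask_after_revert(bool noncoherent_dma)
{
	if (!noncoherent_dma)
		return (uint8_t)(EPT_WB | VMX_EPT_IPAT_BIT);
	return (uint8_t)EPT_WB;
}
```

On a CPU with self-snoop and a VM without non-coherent DMA, the first function returns WB without the ignore-PAT bit, so the guest's UC mapping is honored, whereas the second returns WB with the bit set. That is the case the bochsdrm regression falls into.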