    KVM: x86/mmu: Don't force emulation of L2 accesses to non-APIC internal slots · 5bd74f6e
    Sean Christopherson authored
    Allow mapping KVM's internal memslots used for EPT without unrestricted
    guest into L2, i.e. allow mapping the hidden TSS and the identity mapped
    page tables into L2.  Unlike the APIC access page, there is no correctness
    issue with letting L2 access the "hidden" memory.  Allowing these memslots
    to be mapped into L2 fixes a largely theoretical bug where KVM could
    incorrectly emulate subsequent _L1_ accesses as MMIO, and also ensures
    consistent KVM behavior for L2.
    
    If KVM is using TDP, but L1 is using shadow paging for L2, then routing
    through kvm_handle_noslot_fault() will incorrectly cache the gfn as MMIO,
    and create an MMIO SPTE.  Creating an MMIO SPTE is ok, but only because
    kvm_mmu_page_role.guest_mode ensures KVM uses different roots for L1 vs.
    L2.  But vcpu->arch.mmio_gfn will remain valid, and could cause KVM to
    incorrectly treat an L1 access to the hidden TSS or identity mapped page
    tables as MMIO.
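
    The stale-cache hazard above can be sketched as a small userspace model.
    This is only an illustration under simplifying assumptions, not KVM code:
    every toy_* name is hypothetical, and the model ignores details such as
    the memslot generation that the real MMIO cache also tracks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical stand-in for the per-vCPU MMIO cache (vcpu->arch.mmio_gfn). */
struct toy_vcpu {
	bool  in_guest_mode;	/* true while L2 is running */
	bool  mmio_gfn_valid;
	gfn_t mmio_gfn;
};

/* Model of the "no slot" path: remember the faulting gfn as MMIO. */
static void toy_handle_noslot_fault(struct toy_vcpu *vcpu, gfn_t gfn)
{
	assert(vcpu);
	vcpu->mmio_gfn = gfn;
	vcpu->mmio_gfn_valid = true;
}

/* Old behavior: any L2 access to an internal slot takes the no-slot path. */
static void toy_fault_old(struct toy_vcpu *vcpu, gfn_t gfn, bool internal_slot)
{
	if (internal_slot && vcpu->in_guest_mode)
		toy_handle_noslot_fault(vcpu, gfn);
}

/* The cache is keyed on the gfn only; it does not distinguish L1 from L2. */
static bool toy_cached_as_mmio(const struct toy_vcpu *vcpu, gfn_t gfn)
{
	return vcpu->mmio_gfn_valid && vcpu->mmio_gfn == gfn;
}
```

    In this model, an L2 access followed by a switch back to L1 leaves the
    gfn cached, so a later L1 lookup of the same gfn still reports MMIO.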
    
    Furthermore, forcing L2 accesses to be treated as "no slot" faults doesn't
    actually prevent exposing KVM's internal memslots to L2, it simply forces
    KVM to emulate the access.  In most cases, that will trigger MMIO,
    amusingly due to filling vcpu->arch.mmio_gfn, but also because
    vcpu_is_mmio_gpa() unconditionally treats APIC accesses as MMIO, i.e. APIC
    accesses are ok.  But the hidden TSS and identity mapped page tables could
    go either way (MMIO or access the private memslot's backing memory).
    
    Alternatively, the inconsistent emulator behavior could be addressed by
    forcing MMIO emulation for L2 access to all internal memslots, not just to
    the APIC.  But that's arguably less correct than letting L2 access the
    hidden TSS and identity mapped page tables, not to mention that it's
    *extremely* unlikely anyone cares what KVM does in this case.  From L1's
    perspective there is R/W memory at those memslots, the memory just happens
    to be initialized with non-zero data.  Making the memory disappear when it
    is accessed by L2 is far more magical and arbitrary than the memory
    existing in the first place.
    
    The APIC access page is special because KVM _must_ emulate the access to
    do the right thing (emulate an APIC access instead of reading/writing the
    APIC access page).  And despite what commit 3a2936de ("kvm: mmu: Don't
    expose private memslots to L2") said, it's not just necessary when L1 is
    accelerating L2's virtual APIC, it's just as important (likely *more*
    important) for correctness when L1 is passing through its own APIC to L2.
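
    The resulting policy for L2 faults on internal memslots can be summarized
    by the following sketch.  It is a simplified model with hypothetical
    names (toy_slot, toy_action, toy_l2_fault_policy), not the code in mmu.c,
    and it deliberately leaves out how L1 accesses to the APIC access page
    are handled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of the memslot backing a faulting gfn. */
enum toy_slot {
	TOY_SLOT_USER,		/* ordinary userspace-created memslot */
	TOY_SLOT_HIDDEN_TSS,	/* internal: hidden TSS */
	TOY_SLOT_IDENTITY_PT,	/* internal: identity mapped page tables */
	TOY_SLOT_APIC_ACCESS,	/* internal: APIC access page */
};

enum toy_action {
	TOY_MAP,	/* install a normal mapping */
	TOY_EMULATE,	/* force emulation of the access */
};

/*
 * Only the APIC access page forces emulation for L2; the other internal
 * slots are mapped like regular memory.
 */
static enum toy_action toy_l2_fault_policy(enum toy_slot slot, bool is_l2)
{
	assert(slot >= TOY_SLOT_USER && slot <= TOY_SLOT_APIC_ACCESS);

	if (is_l2 && slot == TOY_SLOT_APIC_ACCESS)
		return TOY_EMULATE;

	return TOY_MAP;
}
```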
    
    Fixes: 3a2936de ("kvm: mmu: Don't expose private memslots to L2")
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Reviewed-by: Kai Huang <kai.huang@intel.com>
    Message-ID: <20240228024147.41573-11-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>