commit 54275f74
Author: Sean Christopherson <seanjc@google.com>

    KVM: x86/mmu: Don't attempt fast page fault just because EPT is in use
    Check for A/D bits being disabled instead of the access tracking mask
    being non-zero when deciding whether or not to attempt to fix a page
    fault via the fast path.  Originally, the access tracking mask was
    non-zero if and only if A/D bits were disabled by _KVM_ (including not
    being supported by hardware), but that hasn't been true since nVMX was
    fixed to honor EPTP12's A/D enabling, i.e. since KVM allowed L1 to cause
    KVM to not use A/D bits while running L2 despite KVM using them while
    running L1.
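    
    For illustration, a condensed sketch of the resulting logic in
    page_fault_can_be_fast() (modeled on KVM's mmu.c; treat it as a sketch
    of the idea, not the literal diff):
    
        static bool page_fault_can_be_fast(struct kvm_page_fault *fault)
        {
                /* Faults on MMIO SPTEs (reserved bits set) must go slow. */
                if (fault->rsvd)
                        return false;
    
                /*
                 * A !PRESENT fault can be fixed out of mmu_lock only if
                 * the SPTE may be access-tracked, i.e. only if A/D bits
                 * are disabled by KVM itself.  Checking kvm_ad_enabled()
                 * instead of shadow_acc_track_mask avoids attempting the
                 * fast path merely because EPT supports access tracking.
                 */
                if (!fault->present)
                        return !kvm_ad_enabled();
    
                /* Write-protection faults are fixable by setting the W bit. */
                return fault->write;
        }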
    
    In other words, don't attempt the fast path just because EPT is enabled.
    
    Note, attempting the fast path for all !PRESENT faults can "fix" a very,
    _VERY_ tiny percentage of faults out of mmu_lock by detecting that the
    fault is spurious, i.e. has been fixed by a different vCPU, but again the
    odds of that happening are vanishingly small.  E.g. booting an 8-vCPU VM
    gets less than 10 successes out of 30k+ faults, and that's likely one of
    the more favorable scenarios.  Disabling dirty logging can likely lead to
    a rash of collisions between vCPUs for some workloads that operate on a
    common set of pages, but penalizing _all_ !PRESENT faults for that one
    case is unlikely to be a net positive, not to mention that that problem
    is best solved by not zapping in the first place.
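    
    The spurious-fault detection itself amounts to re-reading the SPTE and
    bailing if the faulting access would now succeed.  A condensed sketch
    of that check in fast_page_fault() (the is_access_allowed() and
    RET_PF_SPURIOUS names match KVM's MMU code, but this is a
    simplification, not the exact kernel code):
    
        /*
         * If the SPTE already grants the faulting access, another vCPU
         * (or a lazily flushed TLB) has fixed the fault; report the
         * fault as spurious without taking mmu_lock.
         */
        if (is_access_allowed(fault, spte))
                return RET_PF_SPURIOUS;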
    
    The number of spurious faults does scale with the number of vCPUs, e.g. a
    255-vCPU VM using TDP "jumps" to ~60 spurious faults detected in the fast
    path (again out of 30k), but that's all of 0.2% of faults.  Using legacy
    shadow paging does get more spurious faults, and a few more detected out
    of mmu_lock, but the percentage goes _down_ to 0.08% (and that's ignoring
    faults that are reflected into the guest), i.e. the extra detections are
    purely due to the sheer number of faults observed.
    
    On the other hand, getting a "negative" in the fast path takes in the
    neighborhood of 150-250 cycles.  So while it is tempting to keep/extend
    the current behavior, such a change needs to come with hard numbers
    showing that it's actually a win in the grand scheme, or any scheme for
    that matter.
    
    Fixes: 995f00a6 ("x86: kvm: mmu: use ept a/d in vmcs02 iff used in vmcs12")
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20220423034752.1161007-5-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>