Commit 58a20a94 authored by Sean Christopherson, committed by Paolo Bonzini

KVM: x86/mmu: Zap only SPs that shadow gPTEs when deleting memslot

When performing a targeted zap on memslot removal, zap only MMU pages that
shadow guest PTEs, as zapping all SPs that "match" the gfn is inexact and
unnecessary.  Furthermore, for_each_gfn_valid_sp() arguably shouldn't
exist, because it doesn't do what most people would expect it to do.
The "round gfn for level" adjustment that is done for direct SPs (no gPTE)
means that the exact gfn comparison will not get a match, even when a SP
does "cover" a gfn, or was even created specifically for a gfn.

For memslot deletion specifically, KVM's behavior will vary significantly
based on the size and alignment of a memslot, and in weird ways.  E.g. for
a 4KiB memslot, KVM will zap more SPs if the slot is 1GiB aligned than if
it's only 4KiB aligned.  And as described below, zapping SPs in the
aligned case overzaps for direct MMUs, as odds are good the upper-level
SPs are serving other memslots.
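
The alignment dependence falls out of the same rounding.  Dropping these
lines into the sketch's main() above (gfns again made up) shows it:

          gfn_t aligned   = 0x40000;  /* 4KiB slot, base gfn 1GiB-aligned */
          gfn_t unaligned = 0x40001;  /* 4KiB slot, only 4KiB-aligned */

          /*
           * Prints "1 1": rounding the aligned base is a no-op at every
           * level, so upper-level direct SPs can have sp->gfn == base
           * and get zapped along with the slot.
           */
          printf("%d %d\n", gfn_round_for_level(aligned, 2) == aligned,
                            gfn_round_for_level(aligned, 3) == aligned);

          /*
           * Prints "0 0": no upper-level direct SP's rounded gfn can
           * equal the unaligned base, so at most 4KiB mappings match.
           */
          printf("%d %d\n", gfn_round_for_level(unaligned, 2) == unaligned,
                            gfn_round_for_level(unaligned, 3) == unaligned);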

To iterate over all potentially-relevant gfns, KVM would need to make a
pass over the hash table for each level, with the gfn used for lookup
rounded for said level.  And then check that the SP is of the correct
level, too, e.g. to avoid over-zapping.
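
As a hypothetical sketch only (not code from this patch), that rejected
approach would look roughly like the following, reusing KVM's existing
names (gfn_round_for_level(), for_each_valid_sp(), PG_LEVEL_2M, etc.)
and assuming the same context as kvm_mmu_zap_memslot_pages_and_flush()
in the diff below:

          int level;

          for (level = PG_LEVEL_2M; level <= KVM_MAX_HUGEPAGE_LEVEL; level++) {
                  gfn_t base = gfn_round_for_level(gfn, level);

                  for_each_valid_sp(kvm, sp,
                          &kvm->arch.mmu_page_hash[kvm_page_table_hashfn(base)]) {
                          /*
                           * Check the level too: the bucket can hold SPs
                           * whose gfn collides with the rounded gfn but
                           * which span a different range (a direct SP at
                           * role.level L spans the level L + 1 region).
                           */
                          if (sp->gfn != base || sp->role.level != level - 1)
                                  continue;

                          kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
                  }
          }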

But even then, KVM would massively overzap, as processing every level is
all but guaranteed to zap SPs that serve other memslots, especially if the
memslot being removed is relatively small.  KVM could mitigate that issue
by processing only levels that can be possible guest huge pages, i.e. are
less likely to be re-used for other memslots, but while somewhat logical,
that's quite arbitrary and would be a bit of a mess to implement.

So, zap only SPs with gPTEs, as the resulting behavior is easy to describe,
is predictable, and is explicitly minimal, i.e. KVM only zaps SPs that
absolutely must be zapped.

Cc: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241009192345.1148353-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
parent 8e690b81
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1884,14 +1884,10 @@ static bool sp_has_gptes(struct kvm_mmu_page *sp)
 		if (is_obsolete_sp((_kvm), (_sp))) {			\
 		} else
 
-#define for_each_gfn_valid_sp(_kvm, _sp, _gfn)				\
+#define for_each_gfn_valid_sp_with_gptes(_kvm, _sp, _gfn)		\
 	for_each_valid_sp(_kvm, _sp,					\
 	  &(_kvm)->arch.mmu_page_hash[kvm_page_table_hashfn(_gfn)])	\
-		if ((_sp)->gfn != (_gfn)) {} else
-
-#define for_each_gfn_valid_sp_with_gptes(_kvm, _sp, _gfn)		\
-	for_each_gfn_valid_sp(_kvm, _sp, _gfn)				\
-		if (!sp_has_gptes(_sp)) {} else
+		if ((_sp)->gfn != (_gfn) || !sp_has_gptes(_sp)) {} else
 
 static bool kvm_sync_page_check(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 {
@@ -7063,15 +7059,15 @@ static void kvm_mmu_zap_memslot_pages_and_flush(struct kvm *kvm,
 
 	/*
 	 * Since accounting information is stored in struct kvm_arch_memory_slot,
-	 * shadow pages deletion (e.g. unaccount_shadowed()) requires that all
-	 * gfns with a shadow page have a corresponding memslot.  Do so before
-	 * the memslot goes away.
+	 * all MMU pages that are shadowing guest PTEs must be zapped before the
+	 * memslot is deleted, as freeing such pages after the memslot is freed
+	 * will result in use-after-free, e.g. in unaccount_shadowed().
 	 */
 	for (i = 0; i < slot->npages; i++) {
 		struct kvm_mmu_page *sp;
 		gfn_t gfn = slot->base_gfn + i;
 
-		for_each_gfn_valid_sp(kvm, sp, gfn)
+		for_each_gfn_valid_sp_with_gptes(kvm, sp, gfn)
 			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
 
 		if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {