    KVM: Cache the last used slot index per vCPU · fe22ed82
    David Matlack authored
    The memslot for a given gfn is looked up multiple times during page
    fault handling. Avoid binary searching for it multiple times by caching
    the most recently used slot. There is an existing VM-wide last_used_slot
    but that does not work well for cases where vCPUs are accessing memory
    in different slots (see performance data below).
    
    Another benefit of caching the most recently used slot (versus looking
    up the slot once and passing around a pointer) is speeding up memslot
    lookups *across* faults and during spte prefetching.
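    
    As a rough illustration of the idea (not the kernel code itself), the
    sketch below keeps a last-used slot index in a per-vCPU structure,
    checks it first, and only falls back to a binary search on a miss. All
    names here (struct memslot, struct vcpu, vcpu_gfn_to_memslot, the
    searches counter, MAX_SLOTS) are hypothetical simplifications, and the
    slot array is assumed to be sorted by ascending base_gfn with no
    overlaps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

#define MAX_SLOTS 64 /* assumption: small fixed capacity for the sketch */

/* Hypothetical, simplified stand-ins for the real KVM structures. */
struct memslot {
    gfn_t base_gfn;
    uint64_t npages;
};

struct memslots {
    int used_slots;
    struct memslot slots[MAX_SLOTS]; /* assumed sorted by ascending base_gfn */
};

struct vcpu {
    int last_used_slot;     /* per-vCPU cached index, -1 when empty */
    unsigned long searches; /* demo-only counter of binary searches */
};

static int slot_contains(const struct memslot *s, gfn_t gfn)
{
    return gfn >= s->base_gfn && gfn < s->base_gfn + s->npages;
}

/* Binary search over the sorted slots; returns an index or -1. */
static int search_slots(const struct memslots *ms, gfn_t gfn)
{
    int lo = 0, hi = ms->used_slots - 1;

    assert(ms->used_slots >= 0 && ms->used_slots <= MAX_SLOTS);
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        const struct memslot *s = &ms->slots[mid];

        if (gfn < s->base_gfn)
            hi = mid - 1;
        else if (gfn >= s->base_gfn + s->npages)
            lo = mid + 1;
        else
            return mid;
    }
    return -1;
}

/* Try the vCPU's cached slot first; search and refill the cache on a miss. */
static const struct memslot *vcpu_gfn_to_memslot(struct vcpu *vcpu,
                                                 const struct memslots *ms,
                                                 gfn_t gfn)
{
    int idx = vcpu->last_used_slot;

    /* The cached index may be stale, so bounds-check before using it. */
    if (idx >= 0 && idx < ms->used_slots &&
        slot_contains(&ms->slots[idx], gfn))
        return &ms->slots[idx];

    vcpu->searches++;
    idx = search_slots(ms, gfn);
    if (idx < 0)
        return NULL;

    vcpu->last_used_slot = idx;
    return &ms->slots[idx];
}
```

    In this sketch, repeated lookups that land in the same slot skip the
    search entirely, and because the cached index lives in the vCPU
    structure, vCPUs working in different slots do not evict each other's
    entry the way a single VM-wide cache would.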
    
    To measure the performance of this change I ran dirty_log_perf_test with
    64 vCPUs and 64 memslots and measured "Populate memory time" and
    "Iteration 2 dirty memory time".  Tests were run with eptad=N to force
    dirty logging to use fast_page_fault so its performance could be
    measured.
    
    Config     | Metric                        | Before | After
    ---------- | ----------------------------- | ------ | ------
    tdp_mmu=Y  | Populate memory time          | 6.76s  | 5.47s
    tdp_mmu=Y  | Iteration 2 dirty memory time | 2.83s  | 0.31s
    tdp_mmu=N  | Populate memory time          | 20.4s  | 18.7s
    tdp_mmu=N  | Iteration 2 dirty memory time | 2.65s  | 0.30s
    
    The "Iteration 2 dirty memory time" results are especially compelling
    because they are equivalent to running the same test with a single
    memslot. In other words, fast_page_fault performance no longer scales
    with the number of memslots.
    
    Signed-off-by: David Matlack <dmatlack@google.com>
    Message-Id: <20210804222844.1419481-4-dmatlack@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_main.c 137 KB