• Maciej S. Szmigiero's avatar
    KVM: x86/mmu: Make HVA handler retpoline-friendly · 8f5c44f9
    Maciej S. Szmigiero authored
    When retpolines are enabled they have high overhead in the inner loop
    inside kvm_handle_hva_range() that iterates over the provided memory area.
    
    Let's mark this function and its TDP MMU equivalent __always_inline so
    compiler will be able to change the call to the actual handler function
    inside each of them into a direct one.
    
    This significantly improves performance on the unmap test on the existing
    kernel memslot code (tested on a Xeon 8167M machine):
    30 slots in use:
    Test       Before   After     Improvement
    Unmap      0.0353s  0.0334s   5%
    Unmap 2M   0.00104s 0.000407s 61%
    
    509 slots in use:
    Test       Before   After     Improvement
    Unmap      0.0742s  0.0740s   None
    Unmap 2M   0.00221s 0.00159s  28%
    
    Looks like having an indirect call in these functions (and, so, a
    retpoline) might have interfered with unrolling of the whole loop in the
    CPU.
    Signed-off-by: default avatarMaciej S. Szmigiero <maciej.szmigiero@oracle.com>
    Message-Id: <732d3fe9eb68aa08402a638ab0309199fa89ae56.1612810129.git.maciej.szmigiero@oracle.com>
    Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
    8f5c44f9
tdp_mmu.c 39.9 KB