1. 18 Oct, 2021 1 commit
  2. 05 Oct, 2021 1 commit
    • Paolo Bonzini
      Merge tag 'kvm-riscv-5.16-1' of git://github.com/kvm-riscv/linux into HEAD · 542a2640
      Paolo Bonzini authored
      Initial KVM RISC-V support
      
      Following features are supported by the initial KVM RISC-V support:
      1. No RISC-V specific KVM IOCTL
      2. Loadable KVM RISC-V module
      3. Minimal possible KVM world-switch which touches only GPRs and a few CSRs
      4. Works on both RV64 and RV32 hosts
      5. Full Guest/VM switch via vcpu_get/vcpu_put infrastructure
      6. KVM ONE_REG interface for VCPU register access from KVM user-space
         (see the sketch after this list)
      7. Interrupt controller emulation in KVM user-space
      8. Timer and IPI emulation in kernel
      9. Both Sv39x4 and Sv48x4 supported for RV64 host
      10. MMU notifiers supported
      11. Generic dirty log supported
      12. FP lazy save/restore supported
      13. SBI v0.1 emulation for Guest/VM
      14. Forward unhandled SBI calls to KVM user-space
      15. Hugepage support for Guest/VM
      16. IOEVENTFD support for Vhost
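
      As a rough illustration of item 6 above (the ONE_REG interface), the
      sketch below reads the guest pc of an RV64 vCPU from user-space.  It
      assumes a vCPU fd obtained via KVM_CREATE_VCPU; the KVM_REG_RISCV_*
      macros come from the RISC-V <asm/kvm.h> (on an RV32 host,
      KVM_REG_SIZE_U32 would be used instead).

          #include <stdint.h>
          #include <sys/ioctl.h>
          #include <linux/kvm.h>

          /* Read the guest program counter via KVM_GET_ONE_REG (sketch only). */
          static int read_guest_pc(int vcpu_fd, uint64_t *pc)
          {
                  struct kvm_one_reg reg = {
                          .id   = KVM_REG_RISCV | KVM_REG_SIZE_U64 |
                                  KVM_REG_RISCV_CORE |
                                  KVM_REG_RISCV_CORE_REG(regs.pc),
                          .addr = (uint64_t)(unsigned long)pc,
                  };

                  return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
          }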
      542a2640
  3. 04 Oct, 2021 18 commits
  4. 01 Oct, 2021 20 commits
    • David Stevens
      KVM: x86: only allocate gfn_track when necessary · deae4a10
      David Stevens authored
      Avoid allocating the gfn_track arrays if nothing needs them. If there
      are no users of the API external to KVM (i.e. no GVT-g), then page
      tracking is only needed for shadow page tables. This means that when
      TDP is enabled and there are no external users, the gfn_track arrays
      can be lazily allocated when the shadow MMU is actually used. This
      avoids allocations equal to 0.05% of guest memory when nested
      virtualization is not used, if the kernel is compiled without GVT-g.
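
      A minimal sketch of the lazy-allocation idea; slot->arch.gfn_track and
      KVM_PAGE_TRACK_WRITE exist in the x86 KVM code, but the helper below is
      illustrative rather than the patch itself:

          /* Allocate write-tracking counts on first shadow-MMU use (sketch). */
          static int write_tracking_alloc_sketch(struct kvm_memory_slot *slot)
          {
                  unsigned short *counts;

                  if (slot->arch.gfn_track[KVM_PAGE_TRACK_WRITE])
                          return 0;       /* already allocated */

                  counts = kvcalloc(slot->npages, sizeof(*counts),
                                    GFP_KERNEL_ACCOUNT);
                  if (!counts)
                          return -ENOMEM;

                  slot->arch.gfn_track[KVM_PAGE_TRACK_WRITE] = counts;
                  return 0;
          }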
      Signed-off-by: David Stevens <stevensd@chromium.org>
      Message-Id: <20210922045859.2011227-3-stevensd@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      deae4a10
    • David Stevens
      KVM: x86: add config for non-kvm users of page tracking · e9d0c0c4
      David Stevens authored
      Add a config option that allows KVM to determine whether there are any
      external users of page tracking.
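
      A minimal sketch of how such a build-time switch can be consumed; the
      Kconfig symbol name is assumed here and the helper is illustrative:

          /*
           * True if a non-KVM user (e.g. GVT-g) may register page-track
           * notifiers, in which case write tracking must be set up eagerly.
           */
          static inline bool kvm_has_external_write_tracking(void)
          {
                  return IS_ENABLED(CONFIG_KVM_EXTERNAL_WRITE_TRACKING);
          }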
      Signed-off-by: David Stevens <stevensd@chromium.org>
      Message-Id: <20210922045859.2011227-2-stevensd@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e9d0c0c4
    • Krish Sadhukhan
      nSVM: Check for reserved encodings of TLB_CONTROL in nested VMCB · 174a921b
      Krish Sadhukhan authored
      According to section "TLB Flush" in APM vol 2,
      
          "Support for TLB_CONTROL commands other than the first two, is
           optional and is indicated by CPUID Fn8000_000A_EDX[FlushByAsid].
      
           All encodings of TLB_CONTROL not defined in the APM are reserved."
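
      The TLB_CONTROL encodings below are the architectural values from APM
      vol 2; the helper itself is only a sketch of the check described above,
      not the patch's code:

          /* Reject reserved TLB_CONTROL encodings in the nested VMCB (sketch). */
          static bool nested_tlb_ctl_valid(u8 tlb_ctl, bool has_flush_by_asid)
          {
                  switch (tlb_ctl) {
                  case 0:         /* do nothing */
                  case 1:         /* flush entire TLB */
                          return true;
                  case 3:         /* flush this guest's TLB entries */
                  case 7:         /* flush this guest's non-global TLB entries */
                          return has_flush_by_asid; /* Fn8000_000A_EDX[FlushByAsid] */
                  default:
                          return false;   /* reserved encoding */
                  }
          }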
      Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
      Message-Id: <20210920235134.101970-3-krish.sadhukhan@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      174a921b
    • Juergen Gross
      kvm: use kvfree() in kvm_arch_free_vm() · 78b497f2
      Juergen Gross authored
      By switching from kfree() to kvfree() in kvm_arch_free_vm(), Arm64 can
      use the common variant. This can be accomplished by adding another
      macro __KVM_HAVE_ARCH_VM_FREE, which will be used only by x86 for now.
      
      Further simplification can be achieved by adding __kvm_arch_free_vm()
      doing the common part.
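
      The resulting shape is roughly the following sketch of the common code
      (close to, but not necessarily identical to, the final
      include/linux/kvm_host.h):

          static inline void __kvm_arch_free_vm(struct kvm *kvm)
          {
                  kvfree(kvm);    /* handles both kmalloc'ed and vmalloc'ed VMs */
          }

          #ifndef __KVM_HAVE_ARCH_VM_FREE
          static inline void kvm_arch_free_vm(struct kvm *kvm)
          {
                  __kvm_arch_free_vm(kvm);
          }
          #endif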
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Message-Id: <20210903130808.30142-5-jgross@suse.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      78b497f2
    • Babu Moger
      KVM: x86: Expose Predictive Store Forwarding Disable · b73a5432
      Babu Moger authored
      Predictive Store Forwarding: AMD Zen3 processors feature a new
      technology called Predictive Store Forwarding (PSF).
      
      PSF is a hardware-based micro-architectural optimization designed
      to improve the performance of code execution by predicting address
      dependencies between loads and stores.
      
      How PSF works:
      
      It is very common for a CPU to execute a load instruction to an address
      that was recently written by a store. Modern CPUs implement a technique
      known as Store-To-Load-Forwarding (STLF) to improve performance in such
      cases. With STLF, data from the store is forwarded directly to the load
      without having to wait for it to be written to memory. In a typical CPU,
      STLF occurs after the address of both the load and store are calculated
      and determined to match.
      
      PSF expands on this by speculating on the relationship between loads and
      stores without waiting for the address calculation to complete. With PSF,
      the CPU learns over time the relationship between loads and stores. If
      STLF typically occurs between a particular store and load, the CPU will
      remember this.
      
      In typical code, PSF provides a performance benefit by speculating on
      the load result and allowing later instructions to begin execution
      sooner than they otherwise would be able to.
      
      The details of the security analysis of AMD Predictive Store Forwarding
      are documented here:
      https://www.amd.com/system/files/documents/security-analysis-predictive-store-forwarding.pdf
      
      Predictive Store Forwarding controls:
      There are two hardware control bits which influence the PSF feature:
      - MSR 48h bit 2 – Speculative Store Bypass Disable (SSBD)
      - MSR 48h bit 7 – Predictive Store Forwarding Disable (PSFD)
      
      The PSF feature is disabled if either of these bits is set.  These bits
      are controllable on a per-thread basis in an SMT system. By default, both
      SSBD and PSFD are 0, meaning that the speculation features are enabled.
      
      While the SSBD bit disables PSF and speculative store bypass, PSFD only
      disables PSF.
      
      PSFD may be desirable for software which is concerned with the
      speculative behavior of PSF but desires a smaller performance impact than
      setting SSBD.
      
      Support for PSFD is indicated in CPUID Fn8000_0008 EBX[28].
      All processors that support PSF will also support PSFD.
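
      Expressed as constants, the two controls above look as follows; the MSR
      index and bit positions are architectural, the helper is only an
      illustration:

          #define MSR_IA32_SPEC_CTRL      0x00000048
          #define SPEC_CTRL_SSBD          (1ULL << 2) /* Speculative Store Bypass Disable */
          #define SPEC_CTRL_PSFD          (1ULL << 7) /* Predictive Store Forwarding Disable */

          /* PSF is disabled when either SSBD or PSFD is set. */
          static inline int psf_is_disabled(unsigned long long spec_ctrl)
          {
                  return (spec_ctrl & (SPEC_CTRL_SSBD | SPEC_CTRL_PSFD)) != 0;
          }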
      
      The Linux kernel does not have an interface to enable/disable PSFD yet.
      The plan here is to expose PSFD to KVM so that guest kernels can make
      use of it if they wish to.
      Signed-off-by: Babu Moger <Babu.Moger@amd.com>
      Message-Id: <163244601049.30292.5855870305350227855.stgit@bmoger-ubuntu>
      [Keep feature private to KVM, as requested by Borislav Petkov. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b73a5432
    • David Matlack
      KVM: x86/mmu: Avoid memslot lookup in make_spte and mmu_try_to_unsync_pages · 53597858
      David Matlack authored
      mmu_try_to_unsync_pages checks if page tracking is active for the given
      gfn, which requires knowing the memslot. We can pass down the memslot
      via make_spte to avoid this lookup.
      
      The memslot is also handy for make_spte's marking of the gfn as dirty:
      we can test whether dirty page tracking is enabled, and if so ensure that
      pages are mapped as writable with 4K granularity.  Apart from the warning,
      no functional change is intended.
      Signed-off-by: David Matlack <dmatlack@google.com>
      Message-Id: <20210813203504.2742757-7-dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      53597858
    • David Matlack
      KVM: x86/mmu: Avoid memslot lookup in rmap_add · 8a9f566a
      David Matlack authored
      Avoid the memslot lookup in rmap_add, by passing it down from the fault
      handling code to mmu_set_spte and then to rmap_add.
      
      No functional change intended.
      Signed-off-by: David Matlack <dmatlack@google.com>
      Message-Id: <20210813203504.2742757-6-dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      8a9f566a
    • Paolo Bonzini
      KVM: MMU: pass struct kvm_page_fault to mmu_set_spte · a12f4381
      Paolo Bonzini authored
      mmu_set_spte is called for either PTE prefetching or page faults.  The
      three boolean arguments write_fault, speculative and host_writable are
      always false/true/true respectively for prefetching, and come from a
      struct kvm_page_fault for page faults.
      
      Let mmu_set_spte distinguish these two situations by accepting a
      possibly NULL struct kvm_page_fault argument.
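
      A minimal sketch of that calling convention; the prefetch-side values
      follow the text above, while the field names on the fault side are
      assumed for illustration:

          static void mmu_set_spte_args_sketch(const struct kvm_page_fault *fault,
                                               bool *write_fault, bool *speculative,
                                               bool *host_writable)
          {
                  if (!fault) {                   /* PTE prefetch */
                          *write_fault   = false;
                          *speculative   = true;
                          *host_writable = true;
                  } else {                        /* page fault */
                          *write_fault   = fault->write;          /* assumed field */
                          *speculative   = fault->prefault;       /* assumed field */
                          *host_writable = fault->map_writable;   /* assumed field */
                  }
          }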
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a12f4381
    • Paolo Bonzini
      KVM: MMU: pass kvm_mmu_page struct to make_spte · 7158bee4
      Paolo Bonzini authored
      The level and A/D bit support of the new SPTE can be found in the role,
      which is stored in the kvm_mmu_page struct.  This merges two arguments
      into one.
      
      For the TDP MMU, the kvm_mmu_page was not used (kvm_tdp_mmu_map does
      not use it if the SPTE is already present) so we fetch it just before
      calling make_spte.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7158bee4
    • Paolo Bonzini
      KVM: MMU: set ad_disabled in TDP MMU role · 87e888ea
      Paolo Bonzini authored
      Prepare for removing the ad_disabled argument of make_spte; instead it can
      be found in the role of a struct kvm_mmu_page.  First of all, the TDP MMU
      must set the role accurately.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      87e888ea
    • Paolo Bonzini
      KVM: MMU: remove unnecessary argument to mmu_set_spte · eb5cd7ff
      Paolo Bonzini authored
      The level of the new SPTE can be found in the kvm_mmu_page struct; there
      is no need to pass it down.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      eb5cd7ff
    • Paolo Bonzini
      KVM: MMU: clean up make_spte return value · ad67e480
      Paolo Bonzini authored
      Now that make_spte is called directly by the shadow MMU (rather than
      wrapped by set_spte), it only has to return one boolean value.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ad67e480
    • Paolo Bonzini
      KVM: MMU: inline set_spte in FNAME(sync_page) · 4758d47e
      Paolo Bonzini authored
      Since the two callers of set_spte do different things with the results,
      inlining it actually makes the code simpler to reason about.  For example,
      FNAME(sync_page) already has a struct kvm_mmu_page *, but set_spte had to
      fish it back out of sptep's private page data.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      4758d47e
    • Paolo Bonzini
      KVM: MMU: inline set_spte in mmu_set_spte · d786c778
      Paolo Bonzini authored
      Since the two callers of set_spte do different things with the results,
      inlining it actually makes the code simpler to reason about.  For example,
      mmu_set_spte looks quite like tdp_mmu_map_handle_target_level, but the
      similarity is hidden by set_spte.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d786c778
    • David Matlack
      KVM: x86/mmu: Avoid memslot lookup in page_fault_handle_page_track · 88810413
      David Matlack authored
      Now that kvm_page_fault has a pointer to the memslot it can be passed
      down to the page tracking code to avoid a redundant slot lookup.
      
      No functional change intended.
      Signed-off-by: David Matlack <dmatlack@google.com>
      Message-Id: <20210813203504.2742757-5-dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      88810413
    • David Matlack
      KVM: x86/mmu: Pass the memslot around via struct kvm_page_fault · e710c5f6
      David Matlack authored
      The memslot for the faulting gfn is used throughout the page fault
      handling code, so capture it in kvm_page_fault as soon as we know the
      gfn and use it in the page fault handling code that has direct access
      to the kvm_page_fault struct.  Replace various tests using is_noslot_pfn
      with more direct tests on fault->slot being NULL.
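
      Sketched, the test changes roughly as follows (illustrative, not the
      actual diff; kvm_handle_noslot_fault is a hypothetical helper):

          /* before: derive "no memslot" from the resolved pfn */
          if (unlikely(is_noslot_pfn(fault->pfn)))
                  return kvm_handle_noslot_fault(vcpu, fault);

          /* after: use the memslot captured in struct kvm_page_fault */
          if (unlikely(!fault->slot))
                  return kvm_handle_noslot_fault(vcpu, fault);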
      
      This, in combination with the subsequent patch, improves "Populate
      memory time" in dirty_log_perf_test by 5% when using the legacy MMU.
      There is no discernible improvement to the performance of the TDP MMU.
      
      No functional change intended.
      Suggested-by: Ben Gardon <bgardon@google.com>
      Signed-off-by: David Matlack <dmatlack@google.com>
      Message-Id: <20210813203504.2742757-4-dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e710c5f6
    • Paolo Bonzini
      KVM: MMU: unify tdp_mmu_map_set_spte_atomic and tdp_mmu_set_spte_atomic_no_dirty_log · 6ccf4438
      Paolo Bonzini authored
      tdp_mmu_map_set_spte_atomic is not taking care of dirty logging anymore;
      the only difference that remains is that it takes a vCPU instead of
      the struct kvm.  Merge the two functions.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6ccf4438
    • Paolo Bonzini
      KVM: MMU: mark page dirty in make_spte · bcc4f2bc
      Paolo Bonzini authored
      This simplifies set_spte, which we want to remove, and unifies code
      between the shadow MMU and the TDP MMU.  The warning will be added
      back later to make_spte as well.
      
      There is a small disadvantage in the TDP MMU; it may unnecessarily mark
      a page as dirty twice if two vCPUs end up mapping the same page twice.
      However, this is a very small cost for a case that is already rare.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      bcc4f2bc
    • David Matlack
      KVM: x86/mmu: Fold rmap_recycle into rmap_add · 68be1306
      David Matlack authored
      Consolidate rmap_recycle and rmap_add into a single function since they
      are only ever called together (and only from one place). This has a nice
      side effect of eliminating an extra kvm_vcpu_gfn_to_memslot(). In
      addition it makes mmu_set_spte(), which is a very long function, a
      little shorter.
      
      No functional change intended.
      Signed-off-by: David Matlack <dmatlack@google.com>
      Message-Id: <20210813203504.2742757-3-dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      68be1306
    • Sean Christopherson
      KVM: x86/mmu: Verify shadow walk doesn't terminate early in page faults · b1a429fb
      Sean Christopherson authored
      WARN and bail if the shadow walk for faulting in a SPTE terminates early,
      i.e. doesn't reach the expected level because the walk encountered a
      terminal SPTE.  The shadow walks for page faults are subtle in that they
      install non-leaf SPTEs (zapping leaf SPTEs if necessary!) in the loop
      body, and consume the newly created non-leaf SPTE in the loop control,
      e.g. __shadow_walk_next().  In other words, the walks guarantee that
      they will stop if and only if the target level is reached, installing
      non-leaf SPTEs along the way to keep the walk valid.
      
      Opportunistically use fault->goal_level instead of it.level in
      FNAME(fetch) to further clarify that KVM always installs the leaf SPTE at
      the target level.
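
      The added guard is roughly of this shape (a sketch, placed at the end of
      the walk loop in the fault handlers):

          if (WARN_ON_ONCE(it.level != fault->goal_level))
                  return -EFAULT;         /* the walk terminated early: bail */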
      Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
      Message-Id: <20210906122547.263316-1-jiangshanlai@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b1a429fb