1. 29 Apr, 2024 23 commits
  2. 19 Apr, 2024 1 commit
  3. 12 Apr, 2024 7 commits
    • Paolo Bonzini's avatar
      1ab157ce
    • KVM: VMX: Modify NMI and INTR handlers to take intr_info as function argument · 2325a21a
      Sean Christopherson authored
      TDX uses a different ABI to get information about a VM exit.  Pass intr_info to
      the NMI and INTR handlers instead of pulling it from vcpu_vmx in
      preparation for sharing the bulk of the handlers with TDX.
      
      When the guest TD exits to the VMM, RAX holds the status and exit reason, RCX
      holds the exit qualification, etc., rather than the VMCS fields, because the
      VMM doesn't have access to the VMCS.  The eventual code will be:
      
      VMX:
        - get exit reason, intr_info, exit_qualification, etc. from the VMCS
        - call NMI/INTR handlers (common code)
      
      TDX:
        - get exit reason, intr_info, exit_qualification, etc. from guest
          registers
        - call NMI/INTR handlers (common code)
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <0396a9ae70d293c9d0b060349dae385a8a4fbcec.1705965635.git.isaku.yamahata@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
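The refactoring described in this commit can be illustrated with a minimal sketch. It is not the kernel code: the `demo_*` types and functions are hypothetical, and only the bit layout of the interruption-information field (vector in bits 7:0, type in bits 10:8, valid in bit 31) is taken from the Intel SDM. The point is that the common handler depends only on `intr_info`, so it no longer matters whether the value was read from the VMCS or from guest registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Layout of the VM-exit interruption-information field (Intel SDM). */
#define INTR_INFO_VECTOR_MASK 0x000000ffu
#define INTR_INFO_TYPE_MASK   0x00000700u
#define INTR_INFO_VALID_MASK  0x80000000u
#define INTR_TYPE_EXT_INTR    (0u << 8)
#define INTR_TYPE_NMI         (2u << 8)

/* Hypothetical stand-ins for the two sources of exit information. */
struct demo_vmcs    { uint32_t exit_intr_info; }; /* VMX: read from the VMCS */
struct demo_td_exit { uint32_t intr_info; };      /* TDX: taken from registers */

/* Common code: depends only on intr_info, not on where it came from. */
static bool demo_is_nmi(uint32_t intr_info)
{
	return (intr_info & INTR_INFO_VALID_MASK) &&
	       (intr_info & INTR_INFO_TYPE_MASK) == INTR_TYPE_NMI;
}

/* Returns the vector of a valid external interrupt, or -1 otherwise. */
static int demo_handle_external_intr(uint32_t intr_info)
{
	if (!(intr_info & INTR_INFO_VALID_MASK) ||
	    (intr_info & INTR_INFO_TYPE_MASK) != INTR_TYPE_EXT_INTR)
		return -1;
	return (int)(intr_info & INTR_INFO_VECTOR_MASK);
}

/* Backend-specific glue: fetch intr_info, then call the common handler. */
static int demo_vmx_handle_exit(const struct demo_vmcs *vmcs)
{
	assert(vmcs);
	return demo_handle_external_intr(vmcs->exit_intr_info);
}

static int demo_tdx_handle_exit(const struct demo_td_exit *exit)
{
	assert(exit);
	return demo_handle_external_intr(exit->intr_info);
}
```

Both wrappers are one-liners, which is the intended outcome: the bulk of the logic lives in the shared helper.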
    • KVM: VMX: Move out vmx_x86_ops to 'main.c' to dispatch VMX and TDX · 5f18c642
      Paolo Bonzini authored
      KVM accesses the Virtual Machine Control Structure (VMCS) with VMX instructions
      to operate on a VM.  TDX doesn't allow the VMM to operate on the VMCS directly.
      Instead, TDX has its own data structures, and TDX SEAMCALL APIs for the VMM to
      indirectly operate on those data structures.  This means we must have a TDX
      version of kvm_x86_ops.
      
      The existing global struct kvm_x86_ops already defines an interface which
      can be adapted to TDX, but kvm_x86_ops is a system-wide, not per-VM
      structure.  To allow VMX to coexist with TDs, the kvm_x86_ops callbacks
      will have wrappers "if (tdx) tdx_op() else vmx_op()" to pick VMX or
      TDX at run time.
      
      To split the runtime switch, the VMX implementation, and the TDX
      implementation, add main.c, and move out the vmx_x86_ops hooks in
      preparation for adding TDX.  Use 'vt' for the naming scheme as a nod to
      VT-x and as a concatenation of VmxTdx.
      
      The eventually converted code will look like this:
      
      vmx.c:
        vmx_op() { ... }
        VMX initialization
      tdx.c:
        tdx_op() { ... }
        TDX initialization
      x86_ops.h:
        vmx_op();
        tdx_op();
      main.c:
        static vt_op() { if (tdx) tdx_op() else vmx_op() }
        static struct kvm_x86_ops vt_x86_ops = {
              .op = vt_op,
        };
        initialization functions call both VMX and TDX initialization
      
      Opportunistically, fix the name inconsistency from vmx_create_vcpu() and
      vmx_free_vcpu() to vmx_vcpu_create() and vmx_vcpu_free().
      Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
      Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
      Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Reviewed-by: Yuan Yao <yuan.yao@intel.com>
      Message-Id: <e603c317587f933a9d1bee8728c84e4935849c16.1705965634.git.isaku.yamahata@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
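The "if (tdx) tdx_op() else vmx_op()" wrapper scheme from this commit message can be sketched in compilable form as follows. This is an illustration under stated assumptions, not the kernel's main.c: `struct demo_vm`, `struct demo_x86_ops`, the single `vcpu_run` hook, and the integer return tags are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-VM state: the only thing the dispatcher looks at. */
struct demo_vm {
	bool is_td;
};

/* Hypothetical, single-hook stand-in for the system-wide ops table. */
struct demo_x86_ops {
	int (*vcpu_run)(struct demo_vm *vm);
};

/* "vmx.c": VMX implementation (the return value is just a tag). */
static int vmx_vcpu_run(struct demo_vm *vm)
{
	(void)vm;
	return 1;
}

/* "tdx.c": TDX implementation (the return value is just a tag). */
static int tdx_vcpu_run(struct demo_vm *vm)
{
	(void)vm;
	return 2;
}

/* "main.c": the runtime switch between the two implementations. */
static int vt_vcpu_run(struct demo_vm *vm)
{
	assert(vm);
	return vm->is_td ? tdx_vcpu_run(vm) : vmx_vcpu_run(vm);
}

/* One global table, as in the commit message; the per-VM choice is made
 * inside each vt_* wrapper rather than by swapping tables. */
static const struct demo_x86_ops vt_x86_ops = {
	.vcpu_run = vt_vcpu_run,
};
```

Because the ops table is system-wide rather than per-VM, the branch has to live inside each wrapper; that is the design constraint the commit message describes.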
    • KVM: x86: Split core of hypercall emulation to helper function · e913ef15
      Sean Christopherson authored
      By necessity, TDX will use a different register ABI for hypercalls.
      Break out the core functionality so that it may be reused for TDX.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
      Message-Id: <5134caa55ac3dec33fb2addb5545b52b3b52db02.1705965635.git.isaku.yamahata@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
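A rough sketch of the split described above: the core takes the hypercall number and arguments as plain values, and thin wrappers translate between a register ABI and that core. Everything named `demo_*` is hypothetical, including the single hypercall. The first wrapper follows the documented KVM convention (number in RAX, arguments starting with RBX and RCX, result in RAX); the register choice in the second wrapper is purely illustrative and should not be read as the TDX specification.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HC_ADD      1u              /* hypothetical hypercall number */
#define DEMO_UNSUPPORTED ((uint64_t)-1)  /* hypothetical error value */

/* Core: knows nothing about registers, only about numbers and arguments. */
static uint64_t demo_hypercall_core(uint64_t nr, uint64_t a0, uint64_t a1)
{
	switch (nr) {
	case DEMO_HC_ADD:
		return a0 + a1;
	default:
		return DEMO_UNSUPPORTED;
	}
}

/* Hypothetical register file holding only the registers used here. */
struct demo_regs {
	uint64_t rax, rbx, rcx, r10, r11, r12;
};

/* Wrapper for the existing ABI: nr in RAX, args in RBX/RCX, result in RAX. */
static void demo_emulate_hypercall(struct demo_regs *regs)
{
	assert(regs);
	regs->rax = demo_hypercall_core(regs->rax, regs->rbx, regs->rcx);
}

/* Wrapper for an alternative ABI (illustrative register assignment). */
static void demo_emulate_hypercall_alt(struct demo_regs *regs)
{
	assert(regs);
	regs->r10 = demo_hypercall_core(regs->r10, regs->r11, regs->r12);
}
```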
    • Merge branch 'kvm-sev-init2' into HEAD · f9cecb3c
      Paolo Bonzini authored
      The idea that no parameter would ever be necessary when enabling SEV or
      SEV-ES for a VM was decidedly optimistic.  The first source of variability
      that was encountered is the desired set of VMSA features, as that affects
      the measurement of the VM's initial state and cannot be changed
      arbitrarily by the hypervisor.
      
      This series adds all the APIs that are needed to customize the features,
      with room for future enhancements:
      
      - a new /dev/kvm device attribute to retrieve the set of supported
        features (right now, only debug swap)
      
      - a new sub-operation for KVM_MEMORY_ENCRYPT_OP that can take a struct,
        replacing the existing KVM_SEV_INIT and KVM_SEV_ES_INIT
      
      It then puts the new op to work by including the VMSA features as a field
      of the struct.  The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of
      supported VMSA features for backwards compatibility; but I am considering
      also making them use zero as the feature mask, and will gladly adjust the
      patches if so requested.
      
      In order to avoid creating *two* new KVM_MEMORY_ENCRYPT_OPs, I decided that
      I could as well make SEV and SEV-ES use VM types.  This allows SEV-SNP
      to reuse the KVM_SEV_INIT2 ioctl.
      
      And while at it, KVM_SEV_INIT2 also includes two bugfixes.  First of all,
      SEV-ES VMs, when created with the new VM type instead of KVM_SEV_ES_INIT,
      reject KVM_GET_REGS/KVM_SET_REGS and friends on the vCPU file descriptor
      once the VMSA has been encrypted...  which is how the API should have
      always behaved.  Second, they also synchronize the FPU and AVX state.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Merge branch 'mm-delete-change-gpte' into HEAD · 531f5200
      Paolo Bonzini authored
      The .change_pte() MMU notifier callback was intended as an optimization
      and for this reason it was initially called without a surrounding
      mmu_notifier_invalidate_range_{start,end}() pair.  It was only ever
      implemented by KVM (which was also the original user of MMU notifiers)
      and the rules on when to call set_pte_at_notify() rather than set_pte_at()
      have always been pretty obscure.
      
      It may seem a miracle that it has never caused any hard-to-trigger
      bugs, but there's a good reason for that: KVM's implementation has
      been nonfunctional for a good part of its existence.  Already in
      2012, commit 6bdb913f ("mm: wrap calls to set_pte_at_notify with
      invalidate_range_start and invalidate_range_end", 2012-10-09) changed the
      .change_pte() callback to occur within an invalidate_range_start/end()
      pair; and because KVM unmaps the sPTEs during .invalidate_range_start(),
      .change_pte() has no hope of finding a sPTE to change.
      
      Therefore, all the code for .change_pte() can be removed from both KVM
      and mm/, and set_pte_at_notify() can be replaced with just set_pte_at().
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • mm: replace set_pte_at_notify() with just set_pte_at() · f7842747
      Paolo Bonzini authored
      With the demise of the .change_pte() MMU notifier callback, there is no
      notification happening in set_pte_at_notify().  It is a synonym of
      set_pte_at() and can be replaced with it.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
      Message-ID: <20240405115815.3226315-5-pbonzini@redhat.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  4. 11 Apr, 2024 9 commits
    • mmu_notifier: remove the .change_pte() callback · 997308f9
      Paolo Bonzini authored
      The scope of set_pte_at_notify() has reduced more and more through the
      years.  Initially, it was meant for when the change to the PTE was
      not bracketed by mmu_notifier_invalidate_range_{start,end}().  However,
      that has not been so for over ten years.  During all this period
      the only implementation of .change_pte() was KVM and it
      had no actual functionality, because it was called after
      mmu_notifier_invalidate_range_start() zapped the secondary PTE.
      
      Now that this (nonfunctional) user of the .change_pte() callback is
      gone, the whole callback can be removed.  For now, leave in place
      set_pte_at_notify() even though it is just a synonym for set_pte_at().
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Message-ID: <20240405115815.3226315-4-pbonzini@redhat.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: remove unused argument of kvm_handle_hva_range() · 5257de95
      Paolo Bonzini authored
      The only user was kvm_mmu_notifier_change_pte(), which is now gone.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
      Message-ID: <20240405115815.3226315-3-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: delete .change_pte MMU notifier callback · f3b65bba
      Paolo Bonzini authored
      The .change_pte() MMU notifier callback was intended as an
      optimization. The original point of it was that KSM could tell KVM to flip
      its secondary PTE to a new location without having to first zap it. At
      the time there was also an .invalidate_page() callback; both of them were
      *not* bracketed by calls to mmu_notifier_invalidate_range_{start,end}(),
      and .invalidate_page() also doubled as a fallback implementation of
      .change_pte().
      
      Later on, however, both callbacks were changed to occur within an
      invalidate_range_start/end() block.
      
      In the case of .change_pte(), commit 6bdb913f ("mm: wrap calls to
      set_pte_at_notify with invalidate_range_start and invalidate_range_end",
      2012-10-09) did so to remove the fallback from .invalidate_page() to
      .change_pte() and allow sleepable .invalidate_page() hooks.
      
      This however made KVM's usage of the .change_pte() callback completely
      moot, because KVM unmaps the sPTEs during .invalidate_range_start()
      and therefore .change_pte() has no hope of finding a sPTE to change.
      Drop the generic KVM code that dispatches to kvm_set_spte_gfn(), as
      well as all the architecture specific implementations.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: Anup Patel <anup@brainfault.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Reviewed-by: Bibo Mao <maobibo@loongson.cn>
      Message-ID: <20240405115815.3226315-2-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
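Why .change_pte() became a no-op for KVM can be shown with a toy model. The sketch below is not kernel code: `struct demo_mmu` models a single secondary PTE, and all `demo_*` functions are hypothetical. It only reproduces the ordering described in the commit message: the secondary PTE is zapped in invalidate_range_start(), so the later change_pte() call has nothing left to update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a secondary MMU that tracks exactly one sPTE. */
struct demo_mmu {
	bool     spte_present;
	uint64_t spte_pfn;
};

/* KVM's behavior per the commit message: unmap the sPTE up front. */
static void demo_invalidate_range_start(struct demo_mmu *mmu)
{
	mmu->spte_present = false;
}

/* change_pte() can only retarget an sPTE that still exists. */
static bool demo_change_pte(struct demo_mmu *mmu, uint64_t new_pfn)
{
	if (!mmu->spte_present)
		return false; /* nothing to change */
	mmu->spte_pfn = new_pfn;
	return true;
}

/* Call ordering since the 2012 change: change_pte() runs inside the
 * invalidate_range_start/end() pair. Returns whether it had any effect. */
static bool demo_set_pte_at_notify(struct demo_mmu *mmu, uint64_t new_pfn)
{
	bool changed;

	assert(mmu);
	demo_invalidate_range_start(mmu);
	changed = demo_change_pte(mmu, new_pfn);
	/* invalidate_range_end() would run here; it does not restore the sPTE. */
	return changed;
}
```

In this model `demo_set_pte_at_notify()` never reports a change, regardless of the initial state, which mirrors the argument for deleting the callback.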
    • selftests: kvm: add test for transferring FPU state into VMSA · 8c53183d
      Paolo Bonzini authored
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-ID: <20240404121327.3107131-18-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • selftests: kvm: split "launch" phase of SEV VM creation · 4c180a57
      Paolo Bonzini authored
      Allow the caller to set the initial state of the VM.  Doing this
      before sev_vm_launch() matters for SEV-ES, since that is the
      place where the VMSA is updated and after which the guest state
      becomes sealed.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-ID: <20240404121327.3107131-17-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • selftests: kvm: switch to using KVM_X86_*_VM · d18c8648
      Paolo Bonzini authored
      This removes the concept of "subtypes", instead letting the tests use the proper
      VM types that were recently added.  While sev_init_vm() and sev_es_init_vm()
      are still able to operate with the legacy KVM_SEV_INIT and KVM_SEV_ES_INIT
      ioctls, this is limited to VMs that are created manually with
      vm_create_barebones().
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-ID: <20240404121327.3107131-16-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • selftests: kvm: add tests for KVM_SEV_INIT2 · dfc083a1
      Paolo Bonzini authored
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-ID: <20240404121327.3107131-15-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SEV: allow SEV-ES DebugSwap again · 4dd5ecac
      Paolo Bonzini authored
      The DebugSwap feature of SEV-ES provides a way for confidential guests
      to use data breakpoints.  Its status is recorded in the VMSA, and therefore
      attestation signatures depend on whether it is enabled or not.  In order
      to avoid invalidating the signatures depending on the host machine, it
      was disabled by default (see commit 5abf6dce, "SEV: disable SEV-ES
      DebugSwap by default", 2024-03-09).
      
      However, we now have a new API to create SEV VMs that allows enabling
      DebugSwap based on what the user tells KVM to do, and we also changed the
      legacy KVM_SEV_ES_INIT API to never enable DebugSwap.  It is therefore
      possible to re-enable the feature without breaking compatibility with
      kernels that pre-date the introduction of DebugSwap, so go ahead.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-ID: <20240404121327.3107131-14-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SEV: introduce KVM_SEV_INIT2 operation · 4f5defae
      Paolo Bonzini authored
      The idea that no parameter would ever be necessary when enabling SEV or
      SEV-ES for a VM was decidedly optimistic.  In fact, in some sense it's
      already a parameter whether SEV or SEV-ES is desired.  Another possible
      source of variability is the desired set of VMSA features, as that affects
      the measurement of the VM's initial state and cannot be changed
      arbitrarily by the hypervisor.
      
      Create a new sub-operation for KVM_MEMORY_ENCRYPT_OP that can take a struct,
      and put the new op to work by including the VMSA features as a field of the
      struct.  The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of
      supported VMSA features for backwards compatibility.
      
      The struct also includes the usual bells and whistles for future
      extensibility: a flags field that must be zero for now, and some padding
      at the end.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-ID: <20240404121327.3107131-13-pbonzini@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
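The extensibility pattern this commit message describes (a struct with a feature mask, a flags field that must be zero for now, and trailing padding) can be sketched as below. This is a self-contained illustration, not the uapi definition: `struct demo_sev_init`, `demo_sev_init2()`, the padding size, and the `DEMO_FEAT_DEBUG_SWAP` bit position are assumptions made for the example, and the checks shown are one plausible way to validate such a struct.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical feature bit used only for this example. */
#define DEMO_FEAT_DEBUG_SWAP (1ull << 5)

/* Hypothetical mirror of the argument struct described in the text. */
struct demo_sev_init {
	uint64_t vmsa_features; /* requested VMSA features */
	uint32_t flags;         /* must be zero for now */
	uint32_t pad[9];        /* room for future fields (size assumed) */
};

/*
 * Validate the request against the supported feature mask.
 * On success, stores the accepted features in *enabled and returns 0;
 * otherwise returns -EINVAL and leaves *enabled untouched.
 */
static int demo_sev_init2(const struct demo_sev_init *init,
			  uint64_t supported, uint64_t *enabled)
{
	assert(init && enabled);

	if (init->flags)
		return -EINVAL; /* no flags are defined yet */
	if (init->vmsa_features & ~supported)
		return -EINVAL; /* caller asked for an unsupported feature */

	*enabled = init->vmsa_features;
	return 0;
}
```

Rejecting non-zero flags today is what allows new flag bits to be given meaning later without silently changing the behavior of existing callers.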