Commits · 0f53a0245068e09b3b31e7f43ace0ab9edd066ef · Kirill Smelkov / linux

29 Apr, 2024 23 commits

KVM: selftests: Init x86's segments during VM creation · 0f53a024

Sean Christopherson authored Mar 14, 2024

Initialize x86's various segments in the GDT during creation of relevant
VMs instead of waiting until vCPUs come along. Re-installing the segments
for every vCPU is both wasteful and confusing, as is installing KERNEL_DS
multiple times; NOT installing KERNEL_DS for GS is icing on the cake.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-18-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

0f53a024

KVM: selftests: Add macro for TSS selector, rename up code/data macros · f18ef97f

Sean Christopherson authored Mar 14, 2024

Add a proper #define for the TSS selector instead of open coding 0x18 and
hoping future developers don't use that selector for something else.

Opportunistically rename the code and data selector macros to shorten the
names, align the naming with the kernel's scheme, and capture that they
are *kernel* segments.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-17-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

f18ef97f

KVM: selftests: Allocate x86's TSS at VM creation · a2834e6e

Sean Christopherson authored Mar 14, 2024

Allocate x86's per-VM TSS at creation of a non-barebones VM. Like the
GDT, the TSS is needed to actually run vCPUs, i.e. every non-barebones VM
is all but guaranteed to allocate the TSS sooner or later.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-16-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

a2834e6e

KVM: selftests: Fold x86's descriptor tables helpers into vcpu_init_sregs() · 23ef21f5

Sean Christopherson authored Mar 14, 2024

Now that the per-VM, on-demand allocation logic in kvm_setup_gdt() and
vcpu_init_descriptor_tables() is gone, fold them into vcpu_init_sregs().

Note, both kvm_setup_gdt() and vcpu_init_descriptor_tables() configured the
GDT, which is why it looks like kvm_setup_gdt() disappears.

Opportunistically delete the pointless zeroing of the IDT limit (it was
being unconditionally overwritten by vcpu_init_descriptor_tables()).
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-15-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

23ef21f5

KVM: selftests: Drop superfluous switch() on vm->mode in vcpu_init_sregs() · 1051e29c

Sean Christopherson authored Mar 14, 2024

Replace the switch statement on vm->mode in x86's vcpu_init_sregs()'s with
a simple assert that the VM has a 48-bit virtual address space. A switch
statement is both overkill and misleading, as the existing code incorrectly
implies that VMs with LA57 would need different to configuration for the
LDT, TSS, and flat segments. In all likelihood, the only difference that
would be needed for selftests is CR4.LA57 itself.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-14-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

1051e29c

KVM: selftests: Allocate x86's GDT during VM creation · 2a511ca9

Sean Christopherson authored Mar 14, 2024

Allocate the GDT during creation of non-barebones VMs instead of waiting
until the first vCPU is created, as the whole point of non-barebones VMs
is to be able to run vCPUs, i.e. the GDT is going to get allocated no
matter what.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-13-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

2a511ca9

KVM: selftests: Map x86's exception_handlers at VM creation, not vCPU setup · 44c93b27

Sean Christopherson authored Mar 14, 2024

Map x86's exception handlers at VM creation, not vCPU setup, as the
mapping is per-VM, i.e. doesn't need to be (re)done for every vCPU.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-12-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

44c93b27

KVM: selftests: Init IDT and exception handlers for all VMs/vCPUs on x86 · c1b9793b

Sean Christopherson authored Mar 14, 2024

Initialize the IDT and exception handlers for all non-barebones VMs and
vCPUs on x86.  Forcing tests to manually configure the IDT just to save
8KiB of memory is a terrible tradeoff, and also leads to weird tests
(multiple tests have deliberately relied on shutdown to indicate success),
and hard-to-debug failures, e.g. instead of a precise unexpected exception
failure, tests see only shutdown.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-11-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

c1b9793b

KVM: selftests: Rename x86's vcpu_setup() to vcpu_init_sregs() · d8c63805

Sean Christopherson authored Mar 14, 2024

Rename vcpu_setup() to be more descriptive and precise, there is a whole
lot of "setup" that is done for a vCPU that isn't in said helper.

No functional change intended.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-10-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

d8c63805

KVM: selftests: Move x86's descriptor table helpers "up" in processor.c · b62c32c5

Sean Christopherson authored Mar 14, 2024

Move x86's various descriptor table helpers in processor.c up above
kvm_arch_vm_post_create() and vcpu_setup() so that the helpers can be
made static and invoked from the aforementioned functions.

No functional change intended.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-9-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

b62c32c5

KVM: selftests: Explicitly clobber the IDT in the "delete memslot" testcase · 61c3cffd

Sean Christopherson authored Mar 14, 2024

Explicitly clobber the guest IDT in the "delete memslot" test, which
expects the deleted memslot to result in either a KVM emulation error, or
a triple fault shutdown. A future change to the core selftests library
will configuring the guest IDT and exception handlers by default, i.e.
will install a guest #PF handler and put the guest into an infinite #NPF
loop (the guest hits a !PRESENT SPTE when trying to vector a #PF, and KVM
reinjects the #PF without fixing the #NPF, because there is no memslot).

Note, it's not clear whether or not KVM's behavior is reasonable in this
case, e.g. arguably KVM should try (and fail) to emulate in response to
the #NPF. But barring a goofy/broken userspace, this scenario will likely
never happen in practice. Punt the KVM investigation to the future.

Link: https://lore.kernel.org/r/20240314232637.2538648-8-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

61c3cffd

KVM: selftests: Rework platform_info_test to actually verify #GP · dec79eab

Sean Christopherson authored Mar 14, 2024

Rework platform_info_test to actually handle and verify the expected #GP
on RDMSR when the associated KVM capability is disabled. Currently, the
test _deliberately_ doesn't handle the #GP, and instead lets it escalated
to a triple fault shutdown.

In addition to verifying that KVM generates the correct fault, handling
the #GP will be necessary (without even more shenanigans) when a future
change to the core KVM selftests library configures the IDT and exception
handlers by default (the test subtly relies on the IDT limit being '0').

Link: https://lore.kernel.org/r/20240314232637.2538648-7-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

dec79eab

KVM: selftests: Move platform_info_test's main assert into guest code · 53635ec2

Sean Christopherson authored Mar 14, 2024

As a first step toward gracefully handling the expected #GP on RDMSR in
platform_info_test, move the test's assert on the non-faulting RDMSR
result into the guest itself. This will allow using a unified flow for
the host userspace side of things.

Link: https://lore.kernel.org/r/20240314232637.2538648-6-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

53635ec2

KVM: selftests: Fix off-by-one initialization of GDT limit · 0d95817e

Ackerley Tng authored Mar 14, 2024

Fix an off-by-one bug in the initialization of the GDT limit, which as
defined in the SDM is inclusive, not exclusive.

Note, vcpu_init_descriptor_tables() gets the limit correct, it's only
vcpu_setup() that is broken, i.e. only tests that _don't_ invoke
vcpu_init_descriptor_tables() can have problems.  And the fact that KVM
effectively initializes the GDT twice will be cleaned up in the near
future.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
[sean: rewrite changelog]
Link: https://lore.kernel.org/r/20240314232637.2538648-5-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

0d95817e

KVM: selftests: Move GDT, IDT, and TSS fields to x86's kvm_vm_arch · 3a085fbf

Sean Christopherson authored Mar 14, 2024

Now that kvm_vm_arch exists, move the GDT, IDT, and TSS fields to x86's
implementation, as the structures are firmly x86-only.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-4-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

3a085fbf

KVM: sefltests: Add kvm_util_types.h to hold common types, e.g. vm_vaddr_t · f54884f9

Sean Christopherson authored Mar 14, 2024

Move the base types unique to KVM selftests out of kvm_util.h and into a
new header, kvm_util_types.h. This will allow kvm_util_arch.h, i.e. core
arch headers, to reference common types, e.g. vm_vaddr_t and vm_paddr_t.

No functional change intended.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-3-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

f54884f9

Revert "kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h" · 2b7deea3

Sean Christopherson authored Mar 14, 2024

Effectively revert the movement of code from kvm_util.h => kvm_util_base.h,
as the TL;DR of the justification for the move was to avoid #idefs and/or
circular dependencies between what ended up being ucall_common.h and what
was (and now again, is), kvm_util.h.

But avoiding #ifdef and circular includes is trivial: don't do that. The
cost of removing kvm_util_base.h is a few extra includes of ucall_common.h,
but that cost is practically nothing. On the other hand, having a "base"
version of a header that is really just the header itself is confusing,
and makes it weird/hard to choose names for headers that actually are
"base" headers, e.g. to hold core KVM selftests typedefs.

For all intents and purposes, this reverts commit
7d9a662e.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Link: https://lore.kernel.org/r/20240314232637.2538648-2-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

2b7deea3

KVM: selftests: Randomly force emulation on x86 writes from guest code · 87aa264c

Sean Christopherson authored Mar 14, 2024

Override vcpu_arch_put_guest() to randomly force emulation on supported
accesses. Force emulation of LOCK CMPXCHG as well as a regular MOV to
stress KVM's emulation of atomic accesses, which has a unique path in
KVM's emulator.

Arbitrarily give all the decisions 50/50 odds; absent much, much more
sophisticated infrastructure for generating random numbers, it's highly
unlikely that doing more than a coin flip with affect selftests' ability
to find KVM bugs.

This is effectively a regression test for commit 910c57df ("KVM: x86:
Mark target gfn of emulated atomic instruction as dirty").

Link: https://lore.kernel.org/r/20240314185459.2439072-6-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

87aa264c

KVM: selftests: Add vcpu_arch_put_guest() to do writes from guest code · 2f2bc6af

Sean Christopherson authored Mar 14, 2024

Introduce a macro, vcpu_arch_put_guest(), for "putting" values to memory
from guest code in "interesting" situations, e.g. when writing memory that
is being dirty logged.  Structure the macro so that arch code can provide
a custom implementation, e.g. x86 will use the macro to force emulation of
the access.

Use the helper in dirty_log_test, which is of particular interest (see
above), and in xen_shinfo_test, which isn't all that interesting, but
provides a second usage of the macro with a different size operand
(uint8_t versus uint64_t), i.e. to help verify that the macro works for
more than just 64-bit values.

Use "put" as the verb to align with the kernel's {get,put}_user()
terminology.

Link: https://lore.kernel.org/r/20240314185459.2439072-5-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

2f2bc6af

KVM: selftests: Add global snapshot of kvm_is_forced_emulation_enabled() · e1ff1152

Sean Christopherson authored Mar 14, 2024

Add a global snapshot of kvm_is_forced_emulation_enabled() and sync it to
all VMs by default so that core library code can force emulation, e.g. to
allow for easier testing of the intersections between emulation and other
features in KVM.

Link: https://lore.kernel.org/r/20240314185459.2439072-4-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

e1ff1152

KVM: selftests: Provide an API for getting a random bool from an RNG · 73369acd

Sean Christopherson authored Mar 14, 2024

Move memstress' random bool logic into common code to avoid reinventing
the wheel for basic yes/no decisions. Provide an outer wrapper to handle
the basic/common case of just wanting a 50/50 chance of something
happening.

Link: https://lore.kernel.org/r/20240314185459.2439072-3-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

73369acd

KVM: selftests: Provide a global pseudo-RNG instance for all tests · cb6c6914

Sean Christopherson authored Mar 14, 2024

Add a global guest_random_state instance, i.e. a pseudo-RNG, so that an
RNG is available for *all* tests. This will allow randomizing behavior
in core library code, e.g. x86 will utilize the pRNG to conditionally
force emulation of writes from within common guest code.

To allow for deterministic runs, and to be compatible with existing tests,
allow tests to override the seed used to initialize the pRNG.

Note, the seed *must* be overwritten before a VM is created in order for
the seed to take effect, though it's perfectly fine for a test to
initialize multiple VMs with different seeds.

And as evidenced by memstress_guest_code(), it's also a-ok to instantiate
more RNGs using the global seed (or a modified version of it). The goal
of the global RNG is purely to ensure that _a_ source of random numbers is
available, it doesn't have to be the _only_ RNG.

Link: https://lore.kernel.org/r/20240314185459.2439072-2-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

cb6c6914

KVM: selftests: Define _GNU_SOURCE for all selftests code · 730cfa45

Sean Christopherson authored Apr 23, 2024

Define _GNU_SOURCE is the base CFLAGS instead of relying on selftests to
manually #define _GNU_SOURCE, which is repetitive and error prone.  E.g.
kselftest_harness.h requires _GNU_SOURCE for asprintf(), but if a selftest
includes kvm_test_harness.h after stdio.h, the include guards result in
the effective version of stdio.h consumed by kvm_test_harness.h not
defining asprintf():

  In file included from x86_64/fix_hypercall_test.c:12:
  In file included from include/kvm_test_harness.h:11:
 ../kselftest_harness.h:1169:2: error: call to undeclared function
  'asprintf'; ISO C99 and later do not support implicit function declarations
  [-Wimplicit-function-declaration]
   1169 |         asprintf(&test_name, "%s%s%s.%s", f->name,
        |         ^

When including the rseq selftest's "library" code, #undef _GNU_SOURCE so
that rseq.c controls whether or not it wants to build with _GNU_SOURCE.
Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20240423190308.2883084-1-seanjc@google.comSigned-off-by: Sean Christopherson <seanjc@google.com>

730cfa45

19 Apr, 2024 1 commit

Merge x86 bugfixes from Linux 6.9-rc3 · a96cb3bf

Paolo Bonzini authored Apr 19, 2024

Pull fix for SEV-SNP late disable bugs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

a96cb3bf

12 Apr, 2024 7 commits

KVM: SEV: use u64_to_user_ptr throughout · 1ab157ce
Paolo Bonzini authored Feb 26, 2024
```
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
```
1ab157ce

KVM: VMX: Modify NMI and INTR handlers to take intr_info as function argument · 2325a21a

Sean Christopherson authored Jan 22, 2024

TDX uses different ABI to get information about VM exit.  Pass intr_info to
the NMI and INTR handlers instead of pulling it from vcpu_vmx in
preparation for sharing the bulk of the handlers with TDX.

When the guest TD exits to VMM, RAX holds status and exit reason, RCX holds
exit qualification etc rather than the VMCS fields because VMM doesn't have
access to the VMCS.  The eventual code will be

VMX:
  - get exit reason, intr_info, exit_qualification, and etc from VMCS
  - call NMI/INTR handlers (common code)

TDX:
  - get exit reason, intr_info, exit_qualification, and etc from guest
    registers
  - call NMI/INTR handlers (common code)
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <0396a9ae70d293c9d0b060349dae385a8a4fbcec.1705965635.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2325a21a

KVM: VMX: Move out vmx_x86_ops to 'main.c' to dispatch VMX and TDX · 5f18c642

Paolo Bonzini authored Mar 18, 2024

KVM accesses Virtual Machine Control Structure (VMCS) with VMX instructions
to operate on VM.  TDX doesn't allow VMM to operate VMCS directly.
Instead, TDX has its own data structures, and TDX SEAMCALL APIs for VMM to
indirectly operate those data structures.  This means we must have a TDX
version of kvm_x86_ops.

The existing global struct kvm_x86_ops already defines an interface which
can be adapted to TDX, but kvm_x86_ops is a system-wide, not per-VM
structure.  To allow VMX to coexist with TDs, the kvm_x86_ops callbacks
will have wrappers "if (tdx) tdx_op() else vmx_op()" to pick VMX or
TDX at run time.

To split the runtime switch, the VMX implementation, and the TDX
implementation, add main.c, and move out the vmx_x86_ops hooks in
preparation for adding TDX.  Use 'vt' for the naming scheme as a nod to
VT-x and as a concatenation of VmxTdx.

The eventually converted code will look like this:

vmx.c:
  vmx_op() { ... }
  VMX initialization
tdx.c:
  tdx_op() { ... }
  TDX initialization
x86_ops.h:
  vmx_op();
  tdx_op();
main.c:
  static vt_op() { if (tdx) tdx_op() else vmx_op() }
  static struct kvm_x86_ops vt_x86_ops = {
        .op = vt_op,
  initialization functions call both VMX and TDX initialization

Opportunistically, fix the name inconsistency from vmx_create_vcpu() and
vmx_free_vcpu() to vmx_vcpu_create() and vmx_vcpu_free().
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-Id: <e603c317587f933a9d1bee8728c84e4935849c16.1705965634.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5f18c642

KVM: x86: Split core of hypercall emulation to helper function · e913ef15

Sean Christopherson authored Jan 22, 2024

By necessity, TDX will use a different register ABI for hypercalls.
Break out the core functionality so that it may be reused for TDX.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-Id: <5134caa55ac3dec33fb2addb5545b52b3b52db02.1705965635.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

e913ef15

Merge branch 'kvm-sev-init2' into HEAD · f9cecb3c

Paolo Bonzini authored Apr 04, 2024

The idea that no parameter would ever be necessary when enabling SEV or
SEV-ES for a VM was decidedly optimistic.  The first source of variability
that was encountered is the desired set of VMSA features, as that affects
the measurement of the VM's initial state and cannot be changed
arbitrarily by the hypervisor.

This series adds all the APIs that are needed to customize the features,
with room for future enhancements:

- a new /dev/kvm device attribute to retrieve the set of supported
  features (right now, only debug swap)

- a new sub-operation for KVM_MEM_ENCRYPT_OP that can take a struct,
  replacing the existing KVM_SEV_INIT and KVM_SEV_ES_INIT

It then puts the new op to work by including the VMSA features as a field
of the The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of
supported VMSA features for backwards compatibility; but I am considering
also making them use zero as the feature mask, and will gladly adjust the
patches if so requested.

In order to avoid creating *two* new KVM_MEM_ENCRYPT_OPs, I decided that
I could as well make SEV and SEV-ES use VM types.  This allows SEV-SNP
to reuse the KVM_SEV_INIT2 ioctl.

And while at it, KVM_SEV_INIT2 also includes two bugfixes.  First of all,
SEV-ES VM, when created with the new VM type instead of KVM_SEV_ES_INIT,
reject KVM_GET_REGS/KVM_SET_REGS and friends on the vCPU file descriptor
once the VMSA has been encrypted...  which is how the API should have
always behaved.  Second, they also synchronize the FPU and AVX state.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f9cecb3c

Merge branch 'mm-delete-change-gpte' into HEAD · 531f5200

Paolo Bonzini authored Apr 11, 2024

The .change_pte() MMU notifier callback was intended as an optimization
and for this reason it was initially called without a surrounding
mmu_notifier_invalidate_range_{start,end}() pair.  It was only ever
implemented by KVM (which was also the original user of MMU notifiers)
and the rules on when to call set_pte_at_notify() rather than set_pte_at()
have always been pretty obscure.

It may seem a miracle that it has never caused any hard to trigger
bugs, but there's a good reason for that: KVM's implementation has
been nonfunctional for a good part of its existence.  Already in
2012, commit 6bdb913f ("mm: wrap calls to set_pte_at_notify with
invalidate_range_start and invalidate_range_end", 2012-10-09) changed the
.change_pte() callback to occur within an invalidate_range_start/end()
pair; and because KVM unmaps the sPTEs during .invalidate_range_start(),
.change_pte() has no hope of finding a sPTE to change.

Therefore, all the code for .change_pte() can be removed from both KVM
and mm/, and set_pte_at_notify() can be replaced with just set_pte_at().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

531f5200

mm: replace set_pte_at_notify() with just set_pte_at() · f7842747

Paolo Bonzini authored Apr 05, 2024

With the demise of the .change_pte() MMU notifier callback, there is no
notification happening in set_pte_at_notify().  It is a synonym of
set_pte_at() and can be replaced with it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240405115815.3226315-5-pbonzini@redhat.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f7842747

11 Apr, 2024 9 commits

mmu_notifier: remove the .change_pte() callback · 997308f9

Paolo Bonzini authored Apr 05, 2024

The scope of set_pte_at_notify() has reduced more and more through the
years.  Initially, it was meant for when the change to the PTE was
not bracketed by mmu_notifier_invalidate_range_{start,end}().  However,
that has not been so for over ten years.  During all this period
the only implementation of .change_pte() was KVM and it
had no actual functionality, because it was called after
mmu_notifier_invalidate_range_start() zapped the secondary PTE.

Now that this (nonfunctional) user of the .change_pte() callback is
gone, the whole callback can be removed.  For now, leave in place
set_pte_at_notify() even though it is just a synonym for set_pte_at().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240405115815.3226315-4-pbonzini@redhat.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

997308f9

KVM: remove unused argument of kvm_handle_hva_range() · 5257de95

Paolo Bonzini authored Apr 05, 2024

The only user was kvm_mmu_notifier_change_pte(), which is now gone.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240405115815.3226315-3-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5257de95

KVM: delete .change_pte MMU notifier callback · f3b65bba

Paolo Bonzini authored Apr 05, 2024

The .change_pte() MMU notifier callback was intended as an
optimization. The original point of it was that KSM could tell KVM to flip
its secondary PTE to a new location without having to first zap it. At
the time there was also an .invalidate_page() callback; both of them were
*not* bracketed by calls to mmu_notifier_invalidate_range_{start,end}(),
and .invalidate_page() also doubled as a fallback implementation of
.change_pte().

Later on, however, both callbacks were changed to occur within an
invalidate_range_start/end() block.

In the case of .change_pte(), commit 6bdb913f ("mm: wrap calls to
set_pte_at_notify with invalidate_range_start and invalidate_range_end",
2012-10-09) did so to remove the fallback from .invalidate_page() to
.change_pte() and allow sleepable .invalidate_page() hooks.

This however made KVM's usage of the .change_pte() callback completely
moot, because KVM unmaps the sPTEs during .invalidate_range_start()
and therefore .change_pte() has no hope of finding a sPTE to change.
Drop the generic KVM code that dispatches to kvm_set_spte_gfn(), as
well as all the architecture specific implementations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anup Patel <anup@brainfault.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Message-ID: <20240405115815.3226315-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f3b65bba

selftests: kvm: add test for transferring FPU state into VMSA · 8c53183d

Paolo Bonzini authored Apr 04, 2024

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240404121327.3107131-18-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

8c53183d

selftests: kvm: split "launch" phase of SEV VM creation · 4c180a57

Paolo Bonzini authored Apr 04, 2024

Allow the caller to set the initial state of the VM.  Doing this
before sev_vm_launch() matters for SEV-ES, since that is the
place where the VMSA is updated and after which the guest state
becomes sealed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240404121327.3107131-17-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

4c180a57

selftests: kvm: switch to using KVM_X86_*_VM · d18c8648

Paolo Bonzini authored Apr 04, 2024

This removes the concept of "subtypes", instead letting the tests use proper
VM types that were recently added.  While the sev_init_vm() and sev_es_init_vm()
are still able to operate with the legacy KVM_SEV_INIT and KVM_SEV_ES_INIT
ioctls, this is limited to VMs that are created manually with
vm_create_barebones().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240404121327.3107131-16-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d18c8648

selftests: kvm: add tests for KVM_SEV_INIT2 · dfc083a1

Paolo Bonzini authored Apr 04, 2024

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240404121327.3107131-15-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

dfc083a1

KVM: SEV: allow SEV-ES DebugSwap again · 4dd5ecac

Paolo Bonzini authored Apr 04, 2024

The DebugSwap feature of SEV-ES provides a way for confidential guests
to use data breakpoints. Its status is record in VMSA, and therefore
attestation signatures depend on whether it is enabled or not. In order
to avoid invalidating the signatures depending on the host machine, it
was disabled by default (see commit 5abf6dce, "SEV: disable SEV-ES
DebugSwap by default", 2024-03-09).

However, we now have a new API to create SEV VMs that allows enabling
DebugSwap based on what the user tells KVM to do, and we also changed the
legacy KVM_SEV_ES_INIT API to never enable DebugSwap. It is therefore
possible to re-enable the feature without breaking compatibility with
kernels that pre-date the introduction of DebugSwap, so go ahead.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240404121327.3107131-14-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

4dd5ecac

KVM: SEV: introduce KVM_SEV_INIT2 operation · 4f5defae

Paolo Bonzini authored Apr 04, 2024

The idea that no parameter would ever be necessary when enabling SEV or
SEV-ES for a VM was decidedly optimistic.  In fact, in some sense it's
already a parameter whether SEV or SEV-ES is desired.  Another possible
source of variability is the desired set of VMSA features, as that affects
the measurement of the VM's initial state and cannot be changed
arbitrarily by the hypervisor.

Create a new sub-operation for KVM_MEMORY_ENCRYPT_OP that can take a struct,
and put the new op to work by including the VMSA features as a field of the
struct.  The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of
supported VMSA features for backwards compatibility.

The struct also includes the usual bells and whistles for future
extensibility: a flags field that must be zero for now, and some padding
at the end.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240404121327.3107131-13-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

4f5defae