Commit 38b334fc authored by Linus Torvalds

Merge tag 'x86_sev_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 SEV updates from Borislav Petkov:

 - Add the x86 part of the SEV-SNP host support.

   This will allow the kernel to be used as a KVM hypervisor capable of
   running SNP (Secure Nested Paging) guests. Roughly speaking, SEV-SNP
   is the ultimate goal of the AMD confidential computing side,
   providing the most comprehensive confidential computing environment
   to date.

   This is the x86 part. There is also a KVM part, which was not ready
   in time for the merge window, so the latter will be forthcoming in
   the next cycle.

 - Rework the early code's position-dependent SEV variable references in
   order to allow building the kernel with clang and -fPIE/-fPIC and
   -mcmodel=kernel

 - The usual set of fixes, cleanups and improvements all over the place

* tag 'x86_sev_for_v6.9_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  x86/sev: Disable KMSAN for memory encryption TUs
  x86/sev: Dump SEV_STATUS
  crypto: ccp - Have it depend on AMD_IOMMU
  iommu/amd: Fix failure return from snp_lookup_rmpentry()
  x86/sev: Fix position dependent variable references in startup code
  crypto: ccp: Make snp_range_list static
  x86/Kconfig: Remove CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
  Documentation: virt: Fix up pre-formatted text block for SEV ioctls
  crypto: ccp: Add the SNP_SET_CONFIG command
  crypto: ccp: Add the SNP_COMMIT command
  crypto: ccp: Add the SNP_PLATFORM_STATUS command
  x86/cpufeatures: Enable/unmask SEV-SNP CPU feature
  KVM: SEV: Make AVIC backing, VMSA and VMCB memory allocation SNP safe
  crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump
  iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown
  crypto: ccp: Handle legacy SEV commands when SNP is enabled
  crypto: ccp: Handle non-volatile INIT_EX data when SNP is enabled
  crypto: ccp: Handle the legacy TMR allocation when SNP is enabled
  x86/sev: Introduce an SNP leaked pages list
  crypto: ccp: Provide an API to issue SEV and SNP commands
  ...
parents 2edfd104 c0935fca
......@@ -3318,9 +3318,7 @@
mem_encrypt= [X86-64] AMD Secure Memory Encryption (SME) control
Valid arguments: on, off
Default (depends on kernel configuration option):
on (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y)
off (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n)
Default: off
mem_encrypt=on: Activate SME
mem_encrypt=off: Do not activate SME
......
......@@ -87,14 +87,14 @@ The state of SME in the Linux kernel can be documented as follows:
kernel is non-zero).
SME can also be enabled and activated in the BIOS. If SME is enabled and
activated in the BIOS, then all memory accesses will be encrypted and it will
not be necessary to activate the Linux memory encryption support. If the BIOS
merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG), then Linux can activate
memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
by supplying mem_encrypt=on on the kernel command line. However, if BIOS does
not enable SME, then Linux will not be able to activate memory encryption, even
if configured to do so by default or the mem_encrypt=on command line parameter
is specified.
activated in the BIOS, then all memory accesses will be encrypted and it
will not be necessary to activate the Linux memory encryption support.
If the BIOS merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG),
then memory encryption can be enabled by supplying mem_encrypt=on on the
kernel command line. However, if BIOS does not enable SME, then Linux
will not be able to activate memory encryption, even if configured to do
so by default or the mem_encrypt=on command line parameter is specified.
Secure Nested Paging (SNP)
==========================
......
......@@ -67,6 +67,23 @@ counter (e.g. counter overflow), then -EIO will be returned.
};
};
The host ioctls are issued to a file descriptor of the /dev/sev device.
The ioctl accepts the command ID/input structure documented below.
::
struct sev_issue_cmd {
/* Command ID */
__u32 cmd;
/* Command request structure */
__u64 data;
/* Firmware error code on failure (see psp-sev.h) */
__u32 error;
};
2.1 SNP_GET_REPORT
------------------
......@@ -124,6 +141,41 @@ be updated with the expected value.
See GHCB specification for further detail on how to parse the certificate blob.
2.4 SNP_PLATFORM_STATUS
-----------------------
:Technology: sev-snp
:Type: hypervisor ioctl cmd
:Parameters (out): struct sev_user_data_snp_status
:Returns (out): 0 on success, -negative on error
The SNP_PLATFORM_STATUS command is used to query the SNP platform status. The
status includes API major, minor version and more. See the SEV-SNP
specification for further details.
2.5 SNP_COMMIT
--------------
:Technology: sev-snp
:Type: hypervisor ioctl cmd
:Returns (out): 0 on success, -negative on error
SNP_COMMIT is used to commit the currently installed firmware using the
SEV-SNP firmware SNP_COMMIT command. This prevents roll-back to a previously
committed firmware version. This will also update the reported TCB to match
that of the currently installed firmware.
2.6 SNP_SET_CONFIG
------------------
:Technology: sev-snp
:Type: hypervisor ioctl cmd
:Parameters (in): struct sev_user_data_snp_config
:Returns (out): 0 on success, -negative on error
SNP_SET_CONFIG is used to set the system-wide configuration such as
reported TCB version in the attestation report. The command is similar
to SNP_CONFIG command defined in the SEV-SNP spec. The current values of
the firmware parameters affected by this command can be queried via
SNP_PLATFORM_STATUS.
3. SEV-SNP CPUID Enforcement
============================
......
......@@ -28,5 +28,7 @@ obj-y += net/
obj-$(CONFIG_KEXEC_FILE) += purgatory/
obj-y += virt/svm/
# for cleaning
subdir- += boot tools
......@@ -1548,19 +1548,6 @@ config AMD_MEM_ENCRYPT
This requires an AMD processor that supports Secure Memory
Encryption (SME).
config AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
bool "Activate AMD Secure Memory Encryption (SME) by default"
depends on AMD_MEM_ENCRYPT
help
Say yes to have system memory encrypted by default if running on
an AMD processor that supports Secure Memory Encryption (SME).
If set to Y, then the encryption of system memory can be
deactivated with the mem_encrypt=off command line option.
If set to N, then the encryption of system memory can be
activated with the mem_encrypt=on command line option.
# Common NUMA Features
config NUMA
bool "NUMA Memory Allocation and Scheduler Support"
......
......@@ -304,6 +304,10 @@ void do_boot_stage2_vc(struct pt_regs *regs, unsigned long exit_code)
if (result != ES_OK)
goto finish;
result = vc_check_opcode_bytes(&ctxt, exit_code);
if (result != ES_OK)
goto finish;
switch (exit_code) {
case SVM_EXIT_RDTSC:
case SVM_EXIT_RDTSCP:
......@@ -365,7 +369,7 @@ static void enforce_vmpl0(void)
MSR_AMD64_SNP_VMPL_SSS | \
MSR_AMD64_SNP_SECURE_TSC | \
MSR_AMD64_SNP_VMGEXIT_PARAM | \
MSR_AMD64_SNP_VMSA_REG_PROTECTION | \
MSR_AMD64_SNP_VMSA_REG_PROT | \
MSR_AMD64_SNP_RESERVED_BIT13 | \
MSR_AMD64_SNP_RESERVED_BIT15 | \
MSR_AMD64_SNP_RESERVED_MASK)
......
......@@ -14,7 +14,7 @@
#include <asm/processor.h>
enum cc_vendor cc_vendor __ro_after_init = CC_VENDOR_NONE;
static u64 cc_mask __ro_after_init;
u64 cc_mask __ro_after_init;
static bool noinstr intel_cc_platform_has(enum cc_attr attr)
{
......@@ -148,8 +148,3 @@ u64 cc_mkdec(u64 val)
}
}
EXPORT_SYMBOL_GPL(cc_mkdec);
__init void cc_set_mask(u64 mask)
{
cc_mask = mask;
}
......@@ -113,6 +113,20 @@
#endif
#ifndef __ASSEMBLY__
#ifndef __pic__
static __always_inline __pure void *rip_rel_ptr(void *p)
{
asm("leaq %c1(%%rip), %0" : "=r"(p) : "i"(p));
return p;
}
#define RIP_REL_REF(var) (*(typeof(&(var)))rip_rel_ptr(&(var)))
#else
#define RIP_REL_REF(var) (var)
#endif
#endif
/*
* Macros to generate condition code outputs from inline assembly,
* The output operand must be type "bool".
......
......@@ -2,6 +2,7 @@
#ifndef _ASM_X86_COCO_H
#define _ASM_X86_COCO_H
#include <asm/asm.h>
#include <asm/types.h>
enum cc_vendor {
......@@ -12,7 +13,13 @@ enum cc_vendor {
#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
extern enum cc_vendor cc_vendor;
void cc_set_mask(u64 mask);
extern u64 cc_mask;
static inline void cc_set_mask(u64 mask)
{
RIP_REL_REF(cc_mask) = mask;
}
u64 cc_mkenc(u64 val);
u64 cc_mkdec(u64 val);
#else
......
......@@ -442,6 +442,7 @@
#define X86_FEATURE_SEV (19*32+ 1) /* AMD Secure Encrypted Virtualization */
#define X86_FEATURE_VM_PAGE_FLUSH (19*32+ 2) /* "" VM Page Flush MSR is supported */
#define X86_FEATURE_SEV_ES (19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */
#define X86_FEATURE_SEV_SNP (19*32+ 4) /* AMD Secure Encrypted Virtualization - Secure Nested Paging */
#define X86_FEATURE_V_TSC_AUX (19*32+ 9) /* "" Virtual TSC_AUX */
#define X86_FEATURE_SME_COHERENT (19*32+10) /* "" AMD hardware-enforced cache coherency */
#define X86_FEATURE_DEBUG_SWAP (19*32+14) /* AMD SEV-ES full debug state swap support */
......
......@@ -123,6 +123,12 @@
# define DISABLE_FRED (1 << (X86_FEATURE_FRED & 31))
#endif
#ifdef CONFIG_KVM_AMD_SEV
#define DISABLE_SEV_SNP 0
#else
#define DISABLE_SEV_SNP (1 << (X86_FEATURE_SEV_SNP & 31))
#endif
/*
* Make sure to add features to the correct mask
*/
......@@ -147,7 +153,7 @@
DISABLE_ENQCMD)
#define DISABLED_MASK17 0
#define DISABLED_MASK18 (DISABLE_IBT)
#define DISABLED_MASK19 0
#define DISABLED_MASK19 (DISABLE_SEV_SNP)
#define DISABLED_MASK20 0
#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 21)
......
......@@ -10,6 +10,7 @@ extern int force_iommu, no_iommu;
extern int iommu_detected;
extern int iommu_merge;
extern int panic_on_overflow;
extern bool amd_iommu_snp_en;
#ifdef CONFIG_SWIOTLB
extern bool x86_swiotlb_enable;
......
......@@ -138,6 +138,7 @@ KVM_X86_OP(complete_emulated_msr)
KVM_X86_OP(vcpu_deliver_sipi_vector)
KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
KVM_X86_OP_OPTIONAL(get_untagged_addr)
KVM_X86_OP_OPTIONAL(alloc_apic_backing_page)
#undef KVM_X86_OP
#undef KVM_X86_OP_OPTIONAL
......
......@@ -1796,6 +1796,7 @@ struct kvm_x86_ops {
unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
gva_t (*get_untagged_addr)(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags);
void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu);
};
struct kvm_x86_nested_ops {
......
......@@ -15,7 +15,8 @@
#include <linux/init.h>
#include <linux/cc_platform.h>
#include <asm/bootparam.h>
#include <asm/asm.h>
struct boot_params;
#ifdef CONFIG_X86_MEM_ENCRYPT
void __init mem_encrypt_init(void);
......@@ -58,6 +59,11 @@ void __init mem_encrypt_free_decrypted_mem(void);
void __init sev_es_init_vc_handling(void);
static inline u64 sme_get_me_mask(void)
{
return RIP_REL_REF(sme_me_mask);
}
#define __bss_decrypted __section(".bss..decrypted")
#else /* !CONFIG_AMD_MEM_ENCRYPT */
......@@ -89,6 +95,8 @@ early_set_mem_enc_dec_hypercall(unsigned long vaddr, unsigned long size, bool en
static inline void mem_encrypt_free_decrypted_mem(void) { }
static inline u64 sme_get_me_mask(void) { return 0; }
#define __bss_decrypted
#endif /* CONFIG_AMD_MEM_ENCRYPT */
......@@ -106,11 +114,6 @@ void add_encrypt_protection_map(void);
extern char __start_bss_decrypted[], __end_bss_decrypted[], __start_bss_decrypted_unused[];
static inline u64 sme_get_me_mask(void)
{
return sme_me_mask;
}
#endif /* __ASSEMBLY__ */
#endif /* __X86_MEM_ENCRYPT_H__ */
......@@ -605,34 +605,47 @@
#define MSR_AMD64_SEV_ES_GHCB 0xc0010130
#define MSR_AMD64_SEV 0xc0010131
#define MSR_AMD64_SEV_ENABLED_BIT 0
#define MSR_AMD64_SEV_ES_ENABLED_BIT 1
#define MSR_AMD64_SEV_SNP_ENABLED_BIT 2
#define MSR_AMD64_SEV_ENABLED BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
#define MSR_AMD64_SEV_ES_ENABLED_BIT 1
#define MSR_AMD64_SEV_ES_ENABLED BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
#define MSR_AMD64_SEV_SNP_ENABLED_BIT 2
#define MSR_AMD64_SEV_SNP_ENABLED BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
/* SNP feature bits enabled by the hypervisor */
#define MSR_AMD64_SNP_VTOM BIT_ULL(3)
#define MSR_AMD64_SNP_REFLECT_VC BIT_ULL(4)
#define MSR_AMD64_SNP_RESTRICTED_INJ BIT_ULL(5)
#define MSR_AMD64_SNP_ALT_INJ BIT_ULL(6)
#define MSR_AMD64_SNP_DEBUG_SWAP BIT_ULL(7)
#define MSR_AMD64_SNP_PREVENT_HOST_IBS BIT_ULL(8)
#define MSR_AMD64_SNP_BTB_ISOLATION BIT_ULL(9)
#define MSR_AMD64_SNP_VMPL_SSS BIT_ULL(10)
#define MSR_AMD64_SNP_SECURE_TSC BIT_ULL(11)
#define MSR_AMD64_SNP_VMGEXIT_PARAM BIT_ULL(12)
#define MSR_AMD64_SNP_IBS_VIRT BIT_ULL(14)
#define MSR_AMD64_SNP_VMSA_REG_PROTECTION BIT_ULL(16)
#define MSR_AMD64_SNP_SMT_PROTECTION BIT_ULL(17)
/* SNP feature bits reserved for future use. */
#define MSR_AMD64_SNP_RESERVED_BIT13 BIT_ULL(13)
#define MSR_AMD64_SNP_RESERVED_BIT15 BIT_ULL(15)
#define MSR_AMD64_SNP_RESERVED_MASK GENMASK_ULL(63, 18)
#define MSR_AMD64_SNP_VTOM_BIT 3
#define MSR_AMD64_SNP_VTOM BIT_ULL(MSR_AMD64_SNP_VTOM_BIT)
#define MSR_AMD64_SNP_REFLECT_VC_BIT 4
#define MSR_AMD64_SNP_REFLECT_VC BIT_ULL(MSR_AMD64_SNP_REFLECT_VC_BIT)
#define MSR_AMD64_SNP_RESTRICTED_INJ_BIT 5
#define MSR_AMD64_SNP_RESTRICTED_INJ BIT_ULL(MSR_AMD64_SNP_RESTRICTED_INJ_BIT)
#define MSR_AMD64_SNP_ALT_INJ_BIT 6
#define MSR_AMD64_SNP_ALT_INJ BIT_ULL(MSR_AMD64_SNP_ALT_INJ_BIT)
#define MSR_AMD64_SNP_DEBUG_SWAP_BIT 7
#define MSR_AMD64_SNP_DEBUG_SWAP BIT_ULL(MSR_AMD64_SNP_DEBUG_SWAP_BIT)
#define MSR_AMD64_SNP_PREVENT_HOST_IBS_BIT 8
#define MSR_AMD64_SNP_PREVENT_HOST_IBS BIT_ULL(MSR_AMD64_SNP_PREVENT_HOST_IBS_BIT)
#define MSR_AMD64_SNP_BTB_ISOLATION_BIT 9
#define MSR_AMD64_SNP_BTB_ISOLATION BIT_ULL(MSR_AMD64_SNP_BTB_ISOLATION_BIT)
#define MSR_AMD64_SNP_VMPL_SSS_BIT 10
#define MSR_AMD64_SNP_VMPL_SSS BIT_ULL(MSR_AMD64_SNP_VMPL_SSS_BIT)
#define MSR_AMD64_SNP_SECURE_TSC_BIT 11
#define MSR_AMD64_SNP_SECURE_TSC BIT_ULL(MSR_AMD64_SNP_SECURE_TSC_BIT)
#define MSR_AMD64_SNP_VMGEXIT_PARAM_BIT 12
#define MSR_AMD64_SNP_VMGEXIT_PARAM BIT_ULL(MSR_AMD64_SNP_VMGEXIT_PARAM_BIT)
#define MSR_AMD64_SNP_RESERVED_BIT13 BIT_ULL(13)
#define MSR_AMD64_SNP_IBS_VIRT_BIT 14
#define MSR_AMD64_SNP_IBS_VIRT BIT_ULL(MSR_AMD64_SNP_IBS_VIRT_BIT)
#define MSR_AMD64_SNP_RESERVED_BIT15 BIT_ULL(15)
#define MSR_AMD64_SNP_VMSA_REG_PROT_BIT 16
#define MSR_AMD64_SNP_VMSA_REG_PROT BIT_ULL(MSR_AMD64_SNP_VMSA_REG_PROT_BIT)
#define MSR_AMD64_SNP_SMT_PROT_BIT 17
#define MSR_AMD64_SNP_SMT_PROT BIT_ULL(MSR_AMD64_SNP_SMT_PROT_BIT)
#define MSR_AMD64_SNP_RESV_BIT 18
#define MSR_AMD64_SNP_RESERVED_MASK GENMASK_ULL(63, MSR_AMD64_SNP_RESV_BIT)
#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
#define MSR_AMD64_RMP_BASE 0xc0010132
#define MSR_AMD64_RMP_END 0xc0010133
/* AMD Collaborative Processor Performance Control MSRs */
#define MSR_AMD_CPPC_CAP1 0xc00102b0
#define MSR_AMD_CPPC_ENABLE 0xc00102b1
......@@ -719,8 +732,15 @@
#define MSR_K8_TOP_MEM1 0xc001001a
#define MSR_K8_TOP_MEM2 0xc001001d
#define MSR_AMD64_SYSCFG 0xc0010010
#define MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT 23
#define MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT 23
#define MSR_AMD64_SYSCFG_MEM_ENCRYPT BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
#define MSR_AMD64_SYSCFG_SNP_EN_BIT 24
#define MSR_AMD64_SYSCFG_SNP_EN BIT_ULL(MSR_AMD64_SYSCFG_SNP_EN_BIT)
#define MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT 25
#define MSR_AMD64_SYSCFG_SNP_VMPL_EN BIT_ULL(MSR_AMD64_SYSCFG_SNP_VMPL_EN_BIT)
#define MSR_AMD64_SYSCFG_MFDM_BIT 19
#define MSR_AMD64_SYSCFG_MFDM BIT_ULL(MSR_AMD64_SYSCFG_MFDM_BIT)
#define MSR_K8_INT_PENDING_MSG 0xc0010055
/* C1E active bits in int pending message */
#define K8_INTP_C1E_ACTIVE_MASK 0x18000000
......
......@@ -87,9 +87,23 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
/* Software defined (when rFlags.CF = 1) */
#define PVALIDATE_FAIL_NOUPDATE 255
/* RMPUPDATE detected 4K page and 2MB page overlap. */
#define RMPUPDATE_FAIL_OVERLAP 4
/* RMP page size */
#define RMP_PG_SIZE_4K 0
#define RMP_PG_SIZE_2M 1
#define RMP_TO_PG_LEVEL(level) (((level) == RMP_PG_SIZE_4K) ? PG_LEVEL_4K : PG_LEVEL_2M)
#define PG_LEVEL_TO_RMP(level) (((level) == PG_LEVEL_4K) ? RMP_PG_SIZE_4K : RMP_PG_SIZE_2M)
struct rmp_state {
u64 gpa;
u8 assigned;
u8 pagesize;
u8 immutable;
u8 rsvd;
u32 asid;
} __packed;
#define RMPADJUST_VMSA_PAGE_BIT BIT(16)
......@@ -213,6 +227,8 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct sn
void snp_accept_memory(phys_addr_t start, phys_addr_t end);
u64 snp_get_unsupported_features(u64 status);
u64 sev_get_status(void);
void kdump_sev_callback(void);
void sev_show_status(void);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
......@@ -241,6 +257,30 @@ static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *in
static inline void snp_accept_memory(phys_addr_t start, phys_addr_t end) { }
static inline u64 snp_get_unsupported_features(u64 status) { return 0; }
static inline u64 sev_get_status(void) { return 0; }
static inline void kdump_sev_callback(void) { }
static inline void sev_show_status(void) { }
#endif
#ifdef CONFIG_KVM_AMD_SEV
bool snp_probe_rmptable_info(void);
int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level);
void snp_dump_hva_rmpentry(unsigned long address);
int psmash(u64 pfn);
int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 asid, bool immutable);
int rmp_make_shared(u64 pfn, enum pg_level level);
void snp_leak_pages(u64 pfn, unsigned int npages);
#else
static inline bool snp_probe_rmptable_info(void) { return false; }
static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENODEV; }
static inline void snp_dump_hva_rmpentry(unsigned long address) {}
static inline int psmash(u64 pfn) { return -ENODEV; }
static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 asid,
bool immutable)
{
return -ENODEV;
}
static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
#endif
#endif
......@@ -2,6 +2,8 @@
#ifndef _ASM_X86_TRAP_PF_H
#define _ASM_X86_TRAP_PF_H
#include <linux/bits.h>
/*
* Page fault error code bits:
*
......@@ -13,16 +15,18 @@
* bit 5 == 1: protection keys block access
* bit 6 == 1: shadow stack access fault
* bit 15 == 1: SGX MMU page-fault
* bit 31 == 1: fault was due to RMP violation
*/
enum x86_pf_error_code {
X86_PF_PROT = 1 << 0,
X86_PF_WRITE = 1 << 1,
X86_PF_USER = 1 << 2,
X86_PF_RSVD = 1 << 3,
X86_PF_INSTR = 1 << 4,
X86_PF_PK = 1 << 5,
X86_PF_SHSTK = 1 << 6,
X86_PF_SGX = 1 << 15,
X86_PF_PROT = BIT(0),
X86_PF_WRITE = BIT(1),
X86_PF_USER = BIT(2),
X86_PF_RSVD = BIT(3),
X86_PF_INSTR = BIT(4),
X86_PF_PK = BIT(5),
X86_PF_SHSTK = BIT(6),
X86_PF_SGX = BIT(15),
X86_PF_RMP = BIT(31),
};
#endif /* _ASM_X86_TRAP_PF_H */
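
The error-code bits in this enum can be decoded in the same priority order that the show_fault_oops() hunk of this merge uses. The following is a minimal userspace sketch of that ordering, not kernel code: the PF_* macros and pf_reason() are hypothetical names introduced only for illustration, with bit positions copied from enum x86_pf_error_code.

```c
#include <stdint.h>

/* Bit positions copied from enum x86_pf_error_code; the PF_* names are
 * illustrative stand-ins, not kernel identifiers. */
#define PF_PROT  (UINT32_C(1) << 0)
#define PF_RSVD  (UINT32_C(1) << 3)
#define PF_PK    (UINT32_C(1) << 5)
#define PF_RMP   (UINT32_C(1) << 31)

/* Hypothetical helper mirroring the ternary chain in show_fault_oops():
 * the first matching condition wins, so an RMP violation is only
 * reported when the page is present and neither RSVD nor PK is set. */
static const char *pf_reason(uint32_t error_code)
{
    if (!(error_code & PF_PROT))
        return "not-present page";
    if (error_code & PF_RSVD)
        return "reserved bit violation";
    if (error_code & PF_PK)
        return "protection keys violation";
    if (error_code & PF_RMP)
        return "RMP violation";
    return "permissions violation";
}
```

Because the checks are ordered, a fault with both the protection bit and bit 31 set is described as an RMP violation, while a bare protection fault falls through to the generic message.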
......@@ -33,6 +33,7 @@ KASAN_SANITIZE_sev.o := n
KCSAN_SANITIZE := n
KMSAN_SANITIZE_head$(BITS).o := n
KMSAN_SANITIZE_nmi.o := n
KMSAN_SANITIZE_sev.o := n
# If instrumentation of the following files is enabled, boot hangs during
# first second.
......
......@@ -20,6 +20,7 @@
#include <asm/delay.h>
#include <asm/debugreg.h>
#include <asm/resctrl.h>
#include <asm/sev.h>
#ifdef CONFIG_X86_64
# include <asm/mmconfig.h>
......@@ -451,6 +452,21 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
break;
}
if (cpu_has(c, X86_FEATURE_SEV_SNP)) {
/*
* The RMP table entry format is not architectural: it can vary by
* processor and is defined by the per-processor PPR. Restrict SNP support
* to the known CPU models and families for which the RMP table entry
* format is currently defined.
*/
if (!boot_cpu_has(X86_FEATURE_ZEN3) &&
!boot_cpu_has(X86_FEATURE_ZEN4) &&
!boot_cpu_has(X86_FEATURE_ZEN5))
setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
else if (!snp_probe_rmptable_info())
setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
}
return;
warn:
......@@ -469,8 +485,8 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
* SME feature (set in scattered.c).
* If the kernel has not enabled SME via any means then
* don't advertise the SME feature.
* For SEV: If BIOS has not enabled SEV then don't advertise the
* SEV and SEV_ES feature (set in scattered.c).
* For SEV: If BIOS has not enabled SEV then don't advertise SEV and
* any additional functionality based on it.
*
* In all cases, since support for SME and SEV requires long mode,
* don't advertise the feature under CONFIG_X86_32.
......@@ -505,6 +521,7 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
clear_sev:
setup_clear_cpu_cap(X86_FEATURE_SEV);
setup_clear_cpu_cap(X86_FEATURE_SEV_ES);
setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
}
}
......
......@@ -1309,8 +1309,13 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
/*
* AMD's AutoIBRS is equivalent to Intel's eIBRS - use the Intel feature
* flag and protect from vendor-specific bugs via the whitelist.
*
* Don't use AutoIBRS when SNP is enabled because it degrades host
* userspace indirect branch performance.
*/
if ((ia32_cap & ARCH_CAP_IBRS_ALL) || cpu_has(c, X86_FEATURE_AUTOIBRS)) {
if ((ia32_cap & ARCH_CAP_IBRS_ALL) ||
(cpu_has(c, X86_FEATURE_AUTOIBRS) &&
!cpu_feature_enabled(X86_FEATURE_SEV_SNP))) {
setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
if (!cpu_matches(cpu_vuln_whitelist, NO_EIBRS_PBRSB) &&
!(ia32_cap & ARCH_CAP_PBRSB_NO))
......
......@@ -108,6 +108,9 @@ static inline void k8_check_syscfg_dram_mod_en(void)
(boot_cpu_data.x86 >= 0x0f)))
return;
if (cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return;
rdmsr(MSR_AMD64_SYSCFG, lo, hi);
if (lo & K8_MTRRFIXRANGE_DRAM_MODIFY) {
pr_err(FW_WARN "MTRR: CPU %u: SYSCFG[MtrrFixDramModEn]"
......
......@@ -40,6 +40,7 @@
#include <asm/intel_pt.h>
#include <asm/crash.h>
#include <asm/cmdline.h>
#include <asm/sev.h>
/* Used while preparing memory map entries for second kernel */
struct crash_memmap_data {
......@@ -59,6 +60,8 @@ static void kdump_nmi_callback(int cpu, struct pt_regs *regs)
*/
cpu_emergency_stop_pt();
kdump_sev_callback();
disable_local_APIC();
}
......
......@@ -10,11 +10,15 @@
*/
#ifndef __BOOT_COMPRESSED
#define error(v) pr_err(v)
#define has_cpuflag(f) boot_cpu_has(f)
#define error(v) pr_err(v)
#define has_cpuflag(f) boot_cpu_has(f)
#define sev_printk(fmt, ...) printk(fmt, ##__VA_ARGS__)
#define sev_printk_rtl(fmt, ...) printk_ratelimited(fmt, ##__VA_ARGS__)
#else
#undef WARN
#define WARN(condition, format...) (!!(condition))
#define sev_printk(fmt, ...)
#define sev_printk_rtl(fmt, ...)
#endif
/* I/O parameters for CPUID-related helpers */
......@@ -556,9 +560,9 @@ static int snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_le
leaf->eax = leaf->ebx = leaf->ecx = leaf->edx = 0;
/* Skip post-processing for out-of-range zero leafs. */
if (!(leaf->fn <= cpuid_std_range_max ||
(leaf->fn >= 0x40000000 && leaf->fn <= cpuid_hyp_range_max) ||
(leaf->fn >= 0x80000000 && leaf->fn <= cpuid_ext_range_max)))
if (!(leaf->fn <= RIP_REL_REF(cpuid_std_range_max) ||
(leaf->fn >= 0x40000000 && leaf->fn <= RIP_REL_REF(cpuid_hyp_range_max)) ||
(leaf->fn >= 0x80000000 && leaf->fn <= RIP_REL_REF(cpuid_ext_range_max))))
return 0;
}
......@@ -574,6 +578,7 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
{
unsigned int subfn = lower_bits(regs->cx, 32);
unsigned int fn = lower_bits(regs->ax, 32);
u16 opcode = *(unsigned short *)regs->ip;
struct cpuid_leaf leaf;
int ret;
......@@ -581,6 +586,10 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
if (exit_code != SVM_EXIT_CPUID)
goto fail;
/* Is it really a CPUID insn? */
if (opcode != 0xa20f)
goto fail;
leaf.fn = fn;
leaf.subfn = subfn;
......@@ -1063,11 +1072,11 @@ static void __init setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
const struct snp_cpuid_fn *fn = &cpuid_table->fn[i];
if (fn->eax_in == 0x0)
cpuid_std_range_max = fn->eax;
RIP_REL_REF(cpuid_std_range_max) = fn->eax;
else if (fn->eax_in == 0x40000000)
cpuid_hyp_range_max = fn->eax;
RIP_REL_REF(cpuid_hyp_range_max) = fn->eax;
else if (fn->eax_in == 0x80000000)
cpuid_ext_range_max = fn->eax;
RIP_REL_REF(cpuid_ext_range_max) = fn->eax;
}
}
......@@ -1170,3 +1179,92 @@ static int vmgexit_psc(struct ghcb *ghcb, struct snp_psc_desc *desc)
out:
return ret;
}
static enum es_result vc_check_opcode_bytes(struct es_em_ctxt *ctxt,
unsigned long exit_code)
{
unsigned int opcode = (unsigned int)ctxt->insn.opcode.value;
u8 modrm = ctxt->insn.modrm.value;
switch (exit_code) {
case SVM_EXIT_IOIO:
case SVM_EXIT_NPF:
/* handled separately */
return ES_OK;
case SVM_EXIT_CPUID:
if (opcode == 0xa20f)
return ES_OK;
break;
case SVM_EXIT_INVD:
if (opcode == 0x080f)
return ES_OK;
break;
case SVM_EXIT_MONITOR:
if (opcode == 0x010f && modrm == 0xc8)
return ES_OK;
break;
case SVM_EXIT_MWAIT:
if (opcode == 0x010f && modrm == 0xc9)
return ES_OK;
break;
case SVM_EXIT_MSR:
/* RDMSR */
if (opcode == 0x320f ||
/* WRMSR */
opcode == 0x300f)
return ES_OK;
break;
case SVM_EXIT_RDPMC:
if (opcode == 0x330f)
return ES_OK;
break;
case SVM_EXIT_RDTSC:
if (opcode == 0x310f)
return ES_OK;
break;
case SVM_EXIT_RDTSCP:
if (opcode == 0x010f && modrm == 0xf9)
return ES_OK;
break;
case SVM_EXIT_READ_DR7:
if (opcode == 0x210f &&
X86_MODRM_REG(ctxt->insn.modrm.value) == 7)
return ES_OK;
break;
case SVM_EXIT_VMMCALL:
if (opcode == 0x010f && modrm == 0xd9)
return ES_OK;
break;
case SVM_EXIT_WRITE_DR7:
if (opcode == 0x230f &&
X86_MODRM_REG(ctxt->insn.modrm.value) == 7)
return ES_OK;
break;
case SVM_EXIT_WBINVD:
if (opcode == 0x90f)
return ES_OK;
break;
default:
break;
}
sev_printk(KERN_ERR "Wrong/unhandled opcode bytes: 0x%x, exit_code: 0x%lx, rIP: 0x%lx\n",
opcode, exit_code, ctxt->regs->ip);
return ES_UNSUPPORTED;
}
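
The opcode constants checked in this function look byte-swapped relative to the instruction encoding because the opcode bytes are compared as a little-endian integer: CPUID is encoded as the bytes 0F A2, which load as 0xa20f. A small sketch of that relationship follows; load_opcode16() and is_cpuid() are hypothetical helpers written for this illustration and are not part of the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Combine two instruction bytes the way a little-endian 16-bit load
 * would: the first byte in memory becomes the low byte of the value. */
static uint16_t load_opcode16(const uint8_t *ip)
{
    return (uint16_t)(ip[0] | (ip[1] << 8));
}

/* CPUID is encoded as 0F A2, so the loaded value is 0xa20f. */
static bool is_cpuid(const uint8_t *ip)
{
    return load_opcode16(ip) == 0xa20f;
}
```

The same reasoning explains the other constants in the switch, for example 0x310f for RDTSC (bytes 0F 31).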
......@@ -59,6 +59,25 @@
#define AP_INIT_CR0_DEFAULT 0x60000010
#define AP_INIT_MXCSR_DEFAULT 0x1f80
static const char * const sev_status_feat_names[] = {
[MSR_AMD64_SEV_ENABLED_BIT] = "SEV",
[MSR_AMD64_SEV_ES_ENABLED_BIT] = "SEV-ES",
[MSR_AMD64_SEV_SNP_ENABLED_BIT] = "SEV-SNP",
[MSR_AMD64_SNP_VTOM_BIT] = "vTom",
[MSR_AMD64_SNP_REFLECT_VC_BIT] = "ReflectVC",
[MSR_AMD64_SNP_RESTRICTED_INJ_BIT] = "RI",
[MSR_AMD64_SNP_ALT_INJ_BIT] = "AI",
[MSR_AMD64_SNP_DEBUG_SWAP_BIT] = "DebugSwap",
[MSR_AMD64_SNP_PREVENT_HOST_IBS_BIT] = "NoHostIBS",
[MSR_AMD64_SNP_BTB_ISOLATION_BIT] = "BTBIsol",
[MSR_AMD64_SNP_VMPL_SSS_BIT] = "VmplSSS",
[MSR_AMD64_SNP_SECURE_TSC_BIT] = "SecureTSC",
[MSR_AMD64_SNP_VMGEXIT_PARAM_BIT] = "VMGExitParam",
[MSR_AMD64_SNP_IBS_VIRT_BIT] = "IBSVirt",
[MSR_AMD64_SNP_VMSA_REG_PROT_BIT] = "VMSARegProt",
[MSR_AMD64_SNP_SMT_PROT_BIT] = "SMTProt",
};
/* For early boot hypervisor communication in SEV-ES enabled guests */
static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
......@@ -748,7 +767,7 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
* This eliminates worries about jump tables or checking boot_cpu_data
* in the cc_platform_has() function.
*/
if (!(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
if (!(RIP_REL_REF(sev_status) & MSR_AMD64_SEV_SNP_ENABLED))
return;
/*
......@@ -767,7 +786,7 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
* This eliminates worries about jump tables or checking boot_cpu_data
* in the cc_platform_has() function.
*/
if (!(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
if (!(RIP_REL_REF(sev_status) & MSR_AMD64_SEV_SNP_ENABLED))
return;
/* Ask hypervisor to mark the memory pages shared in the RMP table. */
......@@ -1752,7 +1771,10 @@ static enum es_result vc_handle_exitcode(struct es_em_ctxt *ctxt,
struct ghcb *ghcb,
unsigned long exit_code)
{
enum es_result result;
enum es_result result = vc_check_opcode_bytes(ctxt, exit_code);
if (result != ES_OK)
return result;
switch (exit_code) {
case SVM_EXIT_READ_DR7:
......@@ -2262,3 +2284,29 @@ static int __init snp_init_platform_device(void)
return 0;
}
device_initcall(snp_init_platform_device);
void kdump_sev_callback(void)
{
/*
* Do wbinvd() on remote CPUs when SNP is enabled in order to
* safely do SNP_SHUTDOWN on the local CPU.
*/
if (cpu_feature_enabled(X86_FEATURE_SEV_SNP))
wbinvd();
}
void sev_show_status(void)
{
int i;
pr_info("Status: ");
for (i = 0; i < MSR_AMD64_SNP_RESV_BIT; i++) {
if (sev_status & BIT_ULL(i)) {
if (!sev_status_feat_names[i])
continue;
pr_cont("%s ", sev_status_feat_names[i]);
}
}
pr_cont("\n");
}
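
The loop in sev_show_status() walks the status bits below the reserved range and prints a name for each set bit that has one. A userspace sketch of the same idea is shown here, assuming only a subset of the name table; feat_names, NAMED_BITS, and format_sev_status() are hypothetical names for this illustration, and the output goes to a caller-supplied buffer instead of the kernel log.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 18 matches MSR_AMD64_SNP_RESV_BIT in the diff: bits 18..63 are reserved. */
#define NAMED_BITS 18

/* Subset of the bit-to-name table; indices match the *_BIT values above. */
static const char *const feat_names[NAMED_BITS] = {
    [0]  = "SEV",
    [1]  = "SEV-ES",
    [2]  = "SEV-SNP",
    [3]  = "vTom",
    [7]  = "DebugSwap",
    [11] = "SecureTSC",
};

/* Hypothetical helper: append the name of every set, named bit to buf,
 * each followed by a space. Stops early if buf would overflow. */
static void format_sev_status(uint64_t status, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return;
    buf[0] = '\0';

    for (unsigned int i = 0; i < NAMED_BITS; i++) {
        if (!(status & (UINT64_C(1) << i)) || !feat_names[i])
            continue;

        int n = snprintf(buf + used, len - used, "%s ", feat_names[i]);
        if (n < 0 || (size_t)n >= len - used)
            break;
        used += (size_t)n;
    }
}
```

Bits without a table entry, such as the reserved bits 13 and 15, are skipped silently, which matches the `continue` in the kernel loop.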
......@@ -2815,7 +2815,10 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
vcpu->arch.apic = apic;
apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
if (kvm_x86_ops.alloc_apic_backing_page)
apic->regs = static_call(kvm_x86_alloc_apic_backing_page)(vcpu);
else
apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
if (!apic->regs) {
printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
vcpu->vcpu_id);
......
......@@ -1181,7 +1181,7 @@ int svm_allocate_nested(struct vcpu_svm *svm)
if (svm->nested.initialized)
return 0;
vmcb02_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
vmcb02_page = snp_safe_alloc_page(&svm->vcpu);
if (!vmcb02_page)
return -ENOMEM;
svm->nested.vmcb02.ptr = page_address(vmcb02_page);
......
......@@ -246,6 +246,7 @@ static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
{
struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
struct sev_platform_init_args init_args = {0};
int asid, ret;
if (kvm->created_vcpus)
......@@ -262,7 +263,8 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
goto e_no_asid;
sev->asid = asid;
ret = sev_platform_init(&argp->error);
init_args.probe = false;
ret = sev_platform_init(&init_args);
if (ret)
goto e_free;
......@@ -274,6 +276,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
return 0;
e_free:
argp->error = init_args.error;
sev_asid_free(sev);
sev->asid = 0;
e_no_asid:
......@@ -3165,3 +3168,35 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
}
struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
{
unsigned long pfn;
struct page *p;
if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
/*
* Allocate an SNP-safe page to work around the SNP erratum where
* the CPU will incorrectly signal an RMP violation #PF if a
* hugepage (2MB or 1GB) collides with the RMP entry of a
* 2MB-aligned VMCB, VMSA, or AVIC backing page.
*
* Allocate one extra page, choose a page which is not
* 2MB-aligned, and free the other.
*/
p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
if (!p)
return NULL;
split_page(p, 1);
pfn = page_to_pfn(p);
if (IS_ALIGNED(pfn, PTRS_PER_PMD))
__free_page(p++);
else
__free_page(p + 1);
return p;
}
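
The selection step in snp_safe_alloc_page() relies on a simple property: of two consecutive page frames, at most one can be 2MB-aligned. The sketch below isolates that decision as plain arithmetic on page frame numbers. pick_snp_safe_pfn() is a hypothetical helper for illustration only, and the value 512 for PTRS_PER_PMD assumes 4 KiB pages on x86-64.

```c
#include <stdint.h>

/* Number of 4 KiB pages in one 2 MiB region (x86-64 assumption). */
#define PTRS_PER_PMD 512

/* Hypothetical helper: given the first PFN of a two-page allocation,
 * return the PFN that should be kept. If the first page is 2MB-aligned
 * the second one is kept; otherwise the first one is kept. */
static uint64_t pick_snp_safe_pfn(uint64_t first_pfn)
{
    if (first_pfn % PTRS_PER_PMD == 0)
        return first_pfn + 1;
    return first_pfn;
}
```

In the kernel function the page that is not chosen is freed immediately, so the cost of the workaround is one transient extra page per allocation.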
......@@ -703,7 +703,7 @@ static int svm_cpu_init(int cpu)
int ret = -ENOMEM;
memset(sd, 0, sizeof(struct svm_cpu_data));
sd->save_area = alloc_page(GFP_KERNEL | __GFP_ZERO);
sd->save_area = snp_safe_alloc_page(NULL);
if (!sd->save_area)
return ret;
......@@ -1421,7 +1421,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
svm = to_svm(vcpu);
err = -ENOMEM;
vmcb01_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
vmcb01_page = snp_safe_alloc_page(vcpu);
if (!vmcb01_page)
goto out;
......@@ -1430,7 +1430,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
* SEV-ES guests require a separate VMSA page used to contain
* the encrypted register state of the guest.
*/
vmsa_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
vmsa_page = snp_safe_alloc_page(vcpu);
if (!vmsa_page)
goto error_free_vmcb_page;
......@@ -4900,6 +4900,16 @@ static int svm_vm_init(struct kvm *kvm)
return 0;
}
static void *svm_alloc_apic_backing_page(struct kvm_vcpu *vcpu)
{
struct page *page = snp_safe_alloc_page(vcpu);
if (!page)
return NULL;
return page_address(page);
}
static struct kvm_x86_ops svm_x86_ops __initdata = {
.name = KBUILD_MODNAME,
......@@ -5031,6 +5041,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
.vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
.vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
.alloc_apic_backing_page = svm_alloc_apic_backing_page,
};
/*
......
......@@ -694,6 +694,7 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm);
void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
void sev_es_unmap_ghcb(struct vcpu_svm *svm);
struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu);
/* vmenter.S */
......
......@@ -16,6 +16,7 @@ KASAN_SANITIZE_pgprot.o := n
KCSAN_SANITIZE := n
# Avoid recursion by not calling KMSAN hooks for CEA code.
KMSAN_SANITIZE_cpu_entry_area.o := n
KMSAN_SANITIZE_mem_encrypt_identity.o := n
ifdef CONFIG_FUNCTION_TRACER
CFLAGS_REMOVE_mem_encrypt.o = -pg
......
......@@ -35,6 +35,7 @@
#include <asm/vdso.h> /* fixup_vdso_exception() */
#include <asm/irq_stack.h>
#include <asm/fred.h>
#include <asm/sev.h> /* snp_dump_hva_rmpentry() */
#define CREATE_TRACE_POINTS
#include <asm/trace/exceptions.h>
......@@ -548,6 +549,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
!(error_code & X86_PF_PROT) ? "not-present page" :
(error_code & X86_PF_RSVD) ? "reserved bit violation" :
(error_code & X86_PF_PK) ? "protection keys violation" :
(error_code & X86_PF_RMP) ? "RMP violation" :
"permissions violation");
if (!(error_code & X86_PF_USER) && user_mode(regs)) {
......@@ -580,6 +582,9 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
}
dump_pagetable(address);
if (error_code & X86_PF_RMP)
snp_dump_hva_rmpentry(address);
}
static noinline void
......
......@@ -14,6 +14,8 @@
#include <linux/mem_encrypt.h>
#include <linux/virtio_anchor.h>
#include <asm/sev.h>
/* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */
bool force_dma_unencrypted(struct device *dev)
{
......@@ -42,38 +44,45 @@ bool force_dma_unencrypted(struct device *dev)
static void print_mem_encrypt_feature_info(void)
{
pr_info("Memory Encryption Features active:");
if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
pr_cont(" Intel TDX\n");
return;
}
pr_info("Memory Encryption Features active: ");
pr_cont(" AMD");
switch (cc_vendor) {
case CC_VENDOR_INTEL:
pr_cont("Intel TDX\n");
break;
case CC_VENDOR_AMD:
pr_cont("AMD");
/* Secure Memory Encryption */
if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) {
/* Secure Memory Encryption */
if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) {
/*
* SME is mutually exclusive with any of the SEV
* features below.
*/
pr_cont(" SME\n");
return;
}
*/
pr_cont(" SME\n");
return;
}
/* Secure Encrypted Virtualization */
if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
pr_cont(" SEV");
/* Secure Encrypted Virtualization */
if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
pr_cont(" SEV");
/* Encrypted Register State */
if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
pr_cont(" SEV-ES");
/* Encrypted Register State */
if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
pr_cont(" SEV-ES");
/* Secure Nested Paging */
if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
pr_cont(" SEV-SNP");
/* Secure Nested Paging */
if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
pr_cont(" SEV-SNP");
pr_cont("\n");
pr_cont("\n");
sev_show_status();
break;
default:
pr_cont("Unknown\n");
}
}
/* Architecture __weak replacement functions */
......
......@@ -97,7 +97,6 @@ static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch");
static char sme_cmdline_arg[] __initdata = "mem_encrypt";
static char sme_cmdline_on[] __initdata = "on";
static char sme_cmdline_off[] __initdata = "off";
static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd)
{
......@@ -305,7 +304,8 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
* instrumentation or checking boot_cpu_data in the cc_platform_has()
* function.
*/
if (!sme_get_me_mask() || sev_status & MSR_AMD64_SEV_ENABLED)
if (!sme_get_me_mask() ||
RIP_REL_REF(sev_status) & MSR_AMD64_SEV_ENABLED)
return;
/*
......@@ -504,10 +504,9 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
void __init sme_enable(struct boot_params *bp)
{
const char *cmdline_ptr, *cmdline_arg, *cmdline_on, *cmdline_off;
const char *cmdline_ptr, *cmdline_arg, *cmdline_on;
unsigned int eax, ebx, ecx, edx;
unsigned long feature_mask;
bool active_by_default;
unsigned long me_mask;
char buffer[16];
bool snp;
......@@ -543,11 +542,11 @@ void __init sme_enable(struct boot_params *bp)
me_mask = 1UL << (ebx & 0x3f);
/* Check the SEV MSR whether SEV or SME is enabled */
sev_status = __rdmsr(MSR_AMD64_SEV);
feature_mask = (sev_status & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;
RIP_REL_REF(sev_status) = msr = __rdmsr(MSR_AMD64_SEV);
feature_mask = (msr & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;
/* The SEV-SNP CC blob should never be present unless SEV-SNP is enabled. */
if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
if (snp && !(msr & MSR_AMD64_SEV_SNP_ENABLED))
snp_abort();
/* Check if memory encryption is enabled */
......@@ -573,7 +572,6 @@ void __init sme_enable(struct boot_params *bp)
return;
} else {
/* SEV state cannot be controlled by a command line option */
sme_me_mask = me_mask;
goto out;
}
......@@ -588,31 +586,17 @@ void __init sme_enable(struct boot_params *bp)
asm ("lea sme_cmdline_on(%%rip), %0"
: "=r" (cmdline_on)
: "p" (sme_cmdline_on));
asm ("lea sme_cmdline_off(%%rip), %0"
: "=r" (cmdline_off)
: "p" (sme_cmdline_off));
if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT))
active_by_default = true;
else
active_by_default = false;
cmdline_ptr = (const char *)((u64)bp->hdr.cmd_line_ptr |
((u64)bp->ext_cmd_line_ptr << 32));
if (cmdline_find_option(cmdline_ptr, cmdline_arg, buffer, sizeof(buffer)) < 0)
if (cmdline_find_option(cmdline_ptr, cmdline_arg, buffer, sizeof(buffer)) < 0 ||
strncmp(buffer, cmdline_on, sizeof(buffer)))
return;
if (!strncmp(buffer, cmdline_on, sizeof(buffer)))
sme_me_mask = me_mask;
else if (!strncmp(buffer, cmdline_off, sizeof(buffer)))
sme_me_mask = 0;
else
sme_me_mask = active_by_default ? me_mask : 0;
out:
if (sme_me_mask) {
physical_mask &= ~sme_me_mask;
cc_vendor = CC_VENDOR_AMD;
cc_set_mask(sme_me_mask);
}
RIP_REL_REF(sme_me_mask) = me_mask;
physical_mask &= ~me_mask;
cc_vendor = CC_VENDOR_AMD;
cc_set_mask(me_mask);
}
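The mask arithmetic at the end of sme_enable() can be illustrated outside the kernel. This is a sketch under stated assumptions: `me_mask_from_ebx()` and `strip_me_bit()` are hypothetical helpers, and `ebx` stands for the CPUID register value read earlier in the function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the low six bits of ebx give the position of the
 * memory encryption bit, so the mask is a single bit at that position.
 */
static uint64_t me_mask_from_ebx(uint32_t ebx)
{
	return 1ULL << (ebx & 0x3f);
}

/*
 * Hypothetical helper: clear the encryption bit from a physical address
 * mask, mirroring "physical_mask &= ~me_mask" in the hunk above.
 */
static uint64_t strip_me_bit(uint64_t physical_mask, uint64_t me_mask)
{
	return physical_mask & ~me_mask;
}
```

For example, an ebx value whose low six bits are 47 yields a mask with only bit 47 set, and that bit is then no longer usable as a physical address bit.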
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_KVM_AMD_SEV) += sev.o
// SPDX-License-Identifier: GPL-2.0-only
/*
* AMD SVM-SEV Host Support.
*
* Copyright (C) 2023 Advanced Micro Devices, Inc.
*
* Author: Ashish Kalra <ashish.kalra@amd.com>
*
*/
#include <linux/cc_platform.h>
#include <linux/printk.h>
#include <linux/mm_types.h>
#include <linux/set_memory.h>
#include <linux/memblock.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/cpumask.h>
#include <linux/iommu.h>
#include <linux/amd-iommu.h>
#include <asm/sev.h>
#include <asm/processor.h>
#include <asm/setup.h>
#include <asm/svm.h>
#include <asm/smp.h>
#include <asm/cpu.h>
#include <asm/apic.h>
#include <asm/cpuid.h>
#include <asm/cmdline.h>
#include <asm/iommu.h>
/*
* The RMP entry format is not architectural. The format is defined in PPR
* Family 19h Model 01h, Rev B1 processor.
*/
struct rmpentry {
union {
struct {
u64 assigned : 1,
pagesize : 1,
immutable : 1,
rsvd1 : 9,
gpa : 39,
asid : 10,
vmsa : 1,
validated : 1,
rsvd2 : 1;
};
u64 lo;
};
u64 hi;
} __packed;
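For readers decoding a raw entry printed by dump_rmpentry(), the bitfield widths above translate into the shifts below. This is a sketch that assumes bitfields are allocated from the least significant bit, as GCC and clang do on x86-64; `struct rmp_fields`, `bits()` and `decode_rmp_lo()` are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical unpacked view of the low 64 bits of an RMP entry. */
struct rmp_fields {
	bool assigned;
	unsigned int pagesize;
	bool immutable;
	uint64_t gpa;
	uint32_t asid;
	bool vmsa;
	bool validated;
};

/* Extract 'width' bits of 'v' starting at bit 'shift'. */
static uint64_t bits(uint64_t v, unsigned int shift, unsigned int width)
{
	return (v >> shift) & ((1ULL << width) - 1);
}

/* Offsets follow the field widths 1, 1, 1, 9, 39, 10, 1, 1, 1. */
static struct rmp_fields decode_rmp_lo(uint64_t lo)
{
	struct rmp_fields f;

	f.assigned  = bits(lo, 0, 1);
	f.pagesize  = (unsigned int)bits(lo, 1, 1);
	f.immutable = bits(lo, 2, 1);
	/* Bits 3-11 are reserved. */
	f.gpa       = bits(lo, 12, 39);
	f.asid      = (uint32_t)bits(lo, 51, 10);
	f.vmsa      = bits(lo, 61, 1);
	f.validated = bits(lo, 62, 1);
	/* Bit 63 is reserved. */
	return f;
}
```

The widths sum to 64, so the anonymous struct exactly overlays `lo`.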
/*
* The first 16KB from the RMP_BASE is used by the processor for
* bookkeeping; this offset needs to be added during the RMP entry lookup.
*/
#define RMPTABLE_CPU_BOOKKEEPING_SZ 0x4000
/* Mask to apply to a PFN to get the first PFN of a 2MB page */
#define PFN_PMD_MASK GENMASK_ULL(63, PMD_SHIFT - PAGE_SHIFT)
static u64 probed_rmp_base, probed_rmp_size;
static struct rmpentry *rmptable __ro_after_init;
static u64 rmptable_max_pfn __ro_after_init;
static LIST_HEAD(snp_leaked_pages_list);
static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
static unsigned long snp_nr_leaked_pages;
#undef pr_fmt
#define pr_fmt(fmt) "SEV-SNP: " fmt
static int __mfd_enable(unsigned int cpu)
{
u64 val;
if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return 0;
rdmsrl(MSR_AMD64_SYSCFG, val);
val |= MSR_AMD64_SYSCFG_MFDM;
wrmsrl(MSR_AMD64_SYSCFG, val);
return 0;
}
static __init void mfd_enable(void *arg)
{
__mfd_enable(smp_processor_id());
}
static int __snp_enable(unsigned int cpu)
{
u64 val;
if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return 0;
rdmsrl(MSR_AMD64_SYSCFG, val);
val |= MSR_AMD64_SYSCFG_SNP_EN;
val |= MSR_AMD64_SYSCFG_SNP_VMPL_EN;
wrmsrl(MSR_AMD64_SYSCFG, val);
return 0;
}
static __init void snp_enable(void *arg)
{
__snp_enable(smp_processor_id());
}
#define RMP_ADDR_MASK GENMASK_ULL(51, 13)
bool snp_probe_rmptable_info(void)
{
u64 max_rmp_pfn, calc_rmp_sz, rmp_sz, rmp_base, rmp_end;
rdmsrl(MSR_AMD64_RMP_BASE, rmp_base);
rdmsrl(MSR_AMD64_RMP_END, rmp_end);
if (!(rmp_base & RMP_ADDR_MASK) || !(rmp_end & RMP_ADDR_MASK)) {
pr_err("Memory for the RMP table has not been reserved by BIOS\n");
return false;
}
if (rmp_base > rmp_end) {
pr_err("RMP configuration not valid: base=%#llx, end=%#llx\n", rmp_base, rmp_end);
return false;
}
rmp_sz = rmp_end - rmp_base + 1;
/*
* Calculate the amount of memory that must be reserved by the BIOS to
* address the whole RAM, including the bookkeeping area. The RMP itself
* must also be covered.
*/
max_rmp_pfn = max_pfn;
if (PHYS_PFN(rmp_end) > max_pfn)
max_rmp_pfn = PHYS_PFN(rmp_end);
calc_rmp_sz = (max_rmp_pfn << 4) + RMPTABLE_CPU_BOOKKEEPING_SZ;
if (calc_rmp_sz > rmp_sz) {
pr_err("Memory reserved for the RMP table does not cover full system RAM (expected 0x%llx got 0x%llx)\n",
calc_rmp_sz, rmp_sz);
return false;
}
probed_rmp_base = rmp_base;
probed_rmp_size = rmp_sz;
pr_info("RMP table physical range [0x%016llx - 0x%016llx]\n",
probed_rmp_base, probed_rmp_base + probed_rmp_size - 1);
return true;
}
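The size check in snp_probe_rmptable_info() amounts to 16 bytes of RMP per 4KB page plus the 16KB bookkeeping area. A minimal user-space sketch of that arithmetic follows; the helper names are hypothetical, and a 4KB page size (shift of 12) is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT     12        /* assumed 4KB pages */
#define SKETCH_BOOKKEEPING_SZ 0x4000ULL /* 16KB, as in the source */

/* Bytes of RMP needed to cover PFNs up to max_pfn: 16 bytes per page. */
static uint64_t rmp_required_size(uint64_t max_pfn)
{
	return (max_pfn << 4) + SKETCH_BOOKKEEPING_SZ;
}

/*
 * Hypothetical helper mirroring the checks above: the reserved range
 * [rmp_base, rmp_end] must be well formed and large enough to cover all
 * of RAM as well as the RMP itself.
 */
static bool rmp_range_covers(uint64_t rmp_base, uint64_t rmp_end, uint64_t max_pfn)
{
	uint64_t rmp_sz, max_rmp_pfn = max_pfn;

	if (rmp_base > rmp_end)
		return false;

	rmp_sz = rmp_end - rmp_base + 1;

	/* The table must also describe the pages it occupies. */
	if ((rmp_end >> SKETCH_PAGE_SHIFT) > max_rmp_pfn)
		max_rmp_pfn = rmp_end >> SKETCH_PAGE_SHIFT;

	return rmp_required_size(max_rmp_pfn) <= rmp_sz;
}
```

With 4GB of RAM (max_pfn of 0x100000) this works out to 16MB plus 16KB, so a BIOS reservation of exactly 16MB would be rejected.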
/*
* Do the necessary preparations which are verified by the firmware as
* described in the SNP_INIT_EX firmware command description in the SNP
* firmware ABI spec.
*/
static int __init snp_rmptable_init(void)
{
void *rmptable_start;
u64 rmptable_size;
u64 val;
if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return 0;
if (!amd_iommu_snp_en)
return 0;
if (!probed_rmp_size)
goto nosnp;
rmptable_start = memremap(probed_rmp_base, probed_rmp_size, MEMREMAP_WB);
if (!rmptable_start) {
pr_err("Failed to map RMP table\n");
return 1;
}
/*
* Check if SEV-SNP is already enabled; this can happen in case of a
* kexec boot.
*/
rdmsrl(MSR_AMD64_SYSCFG, val);
if (val & MSR_AMD64_SYSCFG_SNP_EN)
goto skip_enable;
memset(rmptable_start, 0, probed_rmp_size);
/* Flush the caches to ensure that data is written before SNP is enabled. */
wbinvd_on_all_cpus();
/* MtrrFixDramModEn must be enabled on all the CPUs prior to enabling SNP. */
on_each_cpu(mfd_enable, NULL, 1);
on_each_cpu(snp_enable, NULL, 1);
skip_enable:
rmptable_start += RMPTABLE_CPU_BOOKKEEPING_SZ;
rmptable_size = probed_rmp_size - RMPTABLE_CPU_BOOKKEEPING_SZ;
rmptable = (struct rmpentry *)rmptable_start;
rmptable_max_pfn = rmptable_size / sizeof(struct rmpentry) - 1;
cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/rmptable_init:online", __snp_enable, NULL);
/*
* Set crash_kexec_post_notifiers to 'true' to ensure that the SNP panic
* notifier is invoked to do SNP IOMMU shutdown before kdump.
*/
crash_kexec_post_notifiers = true;
return 0;
nosnp:
setup_clear_cpu_cap(X86_FEATURE_SEV_SNP);
return -ENOSYS;
}
/*
* This must be called after the IOMMU has been initialized.
*/
device_initcall(snp_rmptable_init);
static struct rmpentry *get_rmpentry(u64 pfn)
{
if (WARN_ON_ONCE(pfn > rmptable_max_pfn))
return ERR_PTR(-EFAULT);
return &rmptable[pfn];
}
static struct rmpentry *__snp_lookup_rmpentry(u64 pfn, int *level)
{
struct rmpentry *large_entry, *entry;
if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return ERR_PTR(-ENODEV);
entry = get_rmpentry(pfn);
if (IS_ERR(entry))
return entry;
/*
* Find the authoritative RMP entry for a PFN. This can be either a 4K
* RMP entry or a special large RMP entry that is authoritative for a
* whole 2M area.
*/
large_entry = get_rmpentry(pfn & PFN_PMD_MASK);
if (IS_ERR(large_entry))
return large_entry;
*level = RMP_TO_PG_LEVEL(large_entry->pagesize);
return entry;
}
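The large-entry lookup relies on PFN_PMD_MASK rounding a PFN down to the first PFN of its 2MB region. A small sketch of that rounding, assuming PMD_SHIFT - PAGE_SHIFT is 9 as on x86-64 (`SKETCH_PFN_PMD_MASK` and `large_entry_pfn()` are hypothetical names):

```c
#include <assert.h>
#include <stdint.h>

/* Equivalent of GENMASK_ULL(63, 9): clears the low nine PFN bits. */
#define SKETCH_PFN_PMD_MASK (~0x1ffULL)

/* Return the PFN whose RMP entry is authoritative for the 2MB region. */
static uint64_t large_entry_pfn(uint64_t pfn)
{
	return pfn & SKETCH_PFN_PMD_MASK;
}
```

The 4K entry at `pfn` is still what the function returns; only the page-size level is taken from the entry at the rounded-down PFN.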
int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level)
{
struct rmpentry *e;
e = __snp_lookup_rmpentry(pfn, level);
if (IS_ERR(e))
return PTR_ERR(e);
*assigned = !!e->assigned;
return 0;
}
EXPORT_SYMBOL_GPL(snp_lookup_rmpentry);
/*
* Dump the raw RMP entry for a particular PFN. These bits are documented in the
* PPR for a particular CPU model and provide useful information about how a
* particular PFN is being utilized by the kernel/firmware at the time certain
* unexpected events occur, such as RMP faults.
*/
static void dump_rmpentry(u64 pfn)
{
u64 pfn_i, pfn_end;
struct rmpentry *e;
int level;
e = __snp_lookup_rmpentry(pfn, &level);
if (IS_ERR(e)) {
pr_err("Failed to read RMP entry for PFN 0x%llx, error %ld\n",
pfn, PTR_ERR(e));
return;
}
if (e->assigned) {
pr_info("PFN 0x%llx, RMP entry: [0x%016llx - 0x%016llx]\n",
pfn, e->lo, e->hi);
return;
}
/*
* If the RMP entry for a particular PFN is not in an assigned state,
* then it is sometimes useful to get an idea of whether or not any RMP
* entries for other PFNs within the same 2MB region are assigned, since
* those too can affect the ability to access a particular PFN in
* certain situations, such as when the PFN is being accessed via a 2MB
* mapping in the host page table.
*/
pfn_i = ALIGN_DOWN(pfn, PTRS_PER_PMD);
pfn_end = pfn_i + PTRS_PER_PMD;
pr_info("PFN 0x%llx unassigned, dumping non-zero entries in 2M PFN region: [0x%llx - 0x%llx]\n",
pfn, pfn_i, pfn_end);
while (pfn_i < pfn_end) {
e = __snp_lookup_rmpentry(pfn_i, &level);
if (IS_ERR(e)) {
pr_err("Error %ld reading RMP entry for PFN 0x%llx\n",
PTR_ERR(e), pfn_i);
pfn_i++;
continue;
}
if (e->lo || e->hi)
pr_info("PFN: 0x%llx, [0x%016llx - 0x%016llx]\n", pfn_i, e->lo, e->hi);
pfn_i++;
}
}
void snp_dump_hva_rmpentry(unsigned long hva)
{
unsigned long paddr;
unsigned int level;
pgd_t *pgd;
pte_t *pte;
pgd = __va(read_cr3_pa());
pgd += pgd_index(hva);
pte = lookup_address_in_pgd(pgd, hva, &level);
if (!pte) {
pr_err("Can't dump RMP entry for HVA %lx: no PTE/PFN found\n", hva);
return;
}
paddr = PFN_PHYS(pte_pfn(*pte)) | (hva & ~page_level_mask(level));
dump_rmpentry(PHYS_PFN(paddr));
}
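The address computation in snp_dump_hva_rmpentry() combines the base frame of the mapping with the offset of the faulting address inside that mapping. The sketch below reproduces it in user space; `hva_to_pfn()` is a hypothetical helper, `map_size` stands in for the size implied by the page-table level, and 4KB base pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4KB base pages */

/*
 * mapping_pfn: PFN stored in the (possibly huge) page-table entry.
 * hva:         host virtual address that faulted.
 * map_size:    size of the mapping in bytes, a power of two.
 *
 * Returns the 4KB PFN that backs 'hva'.
 */
static uint64_t hva_to_pfn(uint64_t mapping_pfn, uint64_t hva, uint64_t map_size)
{
	uint64_t paddr = (mapping_pfn << SKETCH_PAGE_SHIFT) | (hva & (map_size - 1));

	return paddr >> SKETCH_PAGE_SHIFT;
}
```

For a 4KB mapping the result is simply the PFN in the PTE; for a 2MB mapping the offset within the huge page selects one of its 512 constituent frames.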
/*
* PSMASH a 2MB aligned page into 4K pages in the RMP table while preserving the
* Validated bit.
*/
int psmash(u64 pfn)
{
unsigned long paddr = pfn << PAGE_SHIFT;
int ret;
if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return -ENODEV;
if (!pfn_valid(pfn))
return -EINVAL;
/* Binutils version 2.36 supports the PSMASH mnemonic. */
asm volatile(".byte 0xF3, 0x0F, 0x01, 0xFF"
: "=a" (ret)
: "a" (paddr)
: "memory", "cc");
return ret;
}
EXPORT_SYMBOL_GPL(psmash);
/*
* If the kernel uses a 2MB or larger directmap mapping to write to an address,
* and that mapping contains any 4KB pages that are set to private in the RMP
* table, an RMP #PF will trigger and cause a host crash. Hypervisor code that
* owns the PFNs being transitioned will never attempt such a write, but other
* kernel tasks writing to other PFNs in the range may trigger these checks
* inadvertently due to a large directmap mapping that happens to overlap such a
* PFN.
*
* Prevent this by splitting any 2MB+ mappings that might end up containing a
* mix of private/shared PFNs as a result of a subsequent RMPUPDATE for the
* PFN/rmp_level passed in.
*
* Note that there is no attempt here to scan all the RMP entries for the 2MB
* physical range, since it would only be worthwhile in determining if a
* subsequent RMPUPDATE for a 4KB PFN would result in all the entries being of
* the same shared/private state, thus avoiding the need to split the mapping.
* But that would mean the entries are currently in a mixed state, and so the
* mapping would have already been split as a result of prior transitions.
* And since the 4K split is only done if the mapping is 2MB+, and there isn't
* currently a mechanism in place to restore 2MB+ mappings, such a check would
* not provide any usable benefit.
*
* More specifics on how these checks are carried out can be found in APM
* Volume 2, "RMP and VMPL Access Checks".
*/
static int adjust_direct_map(u64 pfn, int rmp_level)
{
unsigned long vaddr;
unsigned int level;
int npages, ret;
pte_t *pte;
/*
* pfn_to_kaddr() will return a vaddr only within the direct
* map range.
*/
vaddr = (unsigned long)pfn_to_kaddr(pfn);
/* Only 4KB/2MB RMP entries are supported by current hardware. */
if (WARN_ON_ONCE(rmp_level > PG_LEVEL_2M))
return -EINVAL;
if (!pfn_valid(pfn))
return -EINVAL;
if (rmp_level == PG_LEVEL_2M &&
(!IS_ALIGNED(pfn, PTRS_PER_PMD) || !pfn_valid(pfn + PTRS_PER_PMD - 1)))
return -EINVAL;
/*
* If an entire 2MB physical range is being transitioned, then there is
* no risk of RMP #PFs due to write accesses from overlapping mappings,
* since even accesses from 1GB mappings will be treated as 2MB accesses
* as far as RMP table checks are concerned.
*/
if (rmp_level == PG_LEVEL_2M)
return 0;
pte = lookup_address(vaddr, &level);
if (!pte || pte_none(*pte))
return 0;
if (level == PG_LEVEL_4K)
return 0;
npages = page_level_size(rmp_level) / PAGE_SIZE;
ret = set_memory_4k(vaddr, npages);
if (ret)
pr_warn("Failed to split direct map for PFN 0x%llx, ret: %d\n",
pfn, ret);
return ret;
}
/*
* It is expected that those operations are rare enough so that no mutual
* exclusion of updaters is needed and thus the overlap error condition below
* should happen very rarely and would get resolved relatively quickly by
* the firmware.
*
* If not, one could consider introducing a mutex or so here to sync concurrent
* RMP updates and thus reduce the number of cases where firmware needs to
* lock 2M ranges to protect against concurrent updates.
*
* The optimal solution would be range locking to avoid locking disjoint
* regions unnecessarily but there's no support for that yet.
*/
static int rmpupdate(u64 pfn, struct rmp_state *state)
{
unsigned long paddr = pfn << PAGE_SHIFT;
int ret, level;
if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return -ENODEV;
level = RMP_TO_PG_LEVEL(state->pagesize);
if (adjust_direct_map(pfn, level))
return -EFAULT;
do {
/* Binutils version 2.36 supports the RMPUPDATE mnemonic. */
asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFE"
: "=a" (ret)
: "a" (paddr), "c" ((unsigned long)state)
: "memory", "cc");
} while (ret == RMPUPDATE_FAIL_OVERLAP);
if (ret) {
pr_err("RMPUPDATE failed for PFN %llx, pg_level: %d, ret: %d\n",
pfn, level, ret);
dump_rmpentry(pfn);
dump_stack();
return -EFAULT;
}
return 0;
}
/* Transition a page to guest-owned/private state in the RMP table. */
int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, u32 asid, bool immutable)
{
struct rmp_state state;
memset(&state, 0, sizeof(state));
state.assigned = 1;
state.asid = asid;
state.immutable = immutable;
state.gpa = gpa;
state.pagesize = PG_LEVEL_TO_RMP(level);
return rmpupdate(pfn, &state);
}
EXPORT_SYMBOL_GPL(rmp_make_private);
/* Transition a page to hypervisor-owned/shared state in the RMP table. */
int rmp_make_shared(u64 pfn, enum pg_level level)
{
struct rmp_state state;
memset(&state, 0, sizeof(state));
state.pagesize = PG_LEVEL_TO_RMP(level);
return rmpupdate(pfn, &state);
}
EXPORT_SYMBOL_GPL(rmp_make_shared);
void snp_leak_pages(u64 pfn, unsigned int npages)
{
struct page *page = pfn_to_page(pfn);
pr_warn("Leaking PFN range 0x%llx-0x%llx\n", pfn, pfn + npages);
spin_lock(&snp_leaked_pages_list_lock);
while (npages--) {
/*
* Reuse the page's buddy list for chaining into the leaked
* pages list. This page should not be on a free list currently
* and is also unsafe to be added to a free list.
*/
if (likely(!PageCompound(page)) ||
/*
* Skip inserting tail pages of compound page as
* page->buddy_list of tail pages is not usable.
*/
(PageHead(page) && compound_nr(page) <= npages))
list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
dump_rmpentry(pfn);
snp_nr_leaked_pages++;
pfn++;
page++;
}
spin_unlock(&snp_leaked_pages_list_lock);
}
EXPORT_SYMBOL_GPL(snp_leak_pages);
......@@ -38,7 +38,7 @@ config CRYPTO_DEV_CCP_CRYPTO
config CRYPTO_DEV_SP_PSP
bool "Platform Security Processor (PSP) device"
default y
depends on CRYPTO_DEV_CCP_DD && X86_64
depends on CRYPTO_DEV_CCP_DD && X86_64 && AMD_IOMMU
help
Provide support for the AMD Platform Security Processor (PSP).
The PSP is a dedicated processor that provides support for key
......
......@@ -21,14 +21,18 @@
#include <linux/hw_random.h>
#include <linux/ccp.h>
#include <linux/firmware.h>
#include <linux/panic_notifier.h>
#include <linux/gfp.h>
#include <linux/cpufeature.h>
#include <linux/fs.h>
#include <linux/fs_struct.h>
#include <linux/psp.h>
#include <linux/amd-iommu.h>
#include <asm/smp.h>
#include <asm/cacheflush.h>
#include <asm/e820/types.h>
#include <asm/sev.h>
#include "psp-dev.h"
#include "sev-dev.h"
......@@ -37,6 +41,19 @@
#define SEV_FW_FILE "amd/sev.fw"
#define SEV_FW_NAME_SIZE 64
/* Minimum firmware version required for the SEV-SNP support */
#define SNP_MIN_API_MAJOR 1
#define SNP_MIN_API_MINOR 51
/*
* Maximum number of firmware-writable buffers that might be specified
* in the parameters of a legacy SEV command buffer.
*/
#define CMD_BUF_FW_WRITABLE_MAX 2
/* Leave room in the descriptor array for an end-of-list indicator. */
#define CMD_BUF_DESC_MAX (CMD_BUF_FW_WRITABLE_MAX + 1)
static DEFINE_MUTEX(sev_cmd_mutex);
static struct sev_misc_dev *misc_dev;
......@@ -68,9 +85,14 @@ static int psp_timeout;
* The TMR is a 1MB area that must be 1MB aligned. Use the page allocator
* to allocate the memory, which will return aligned memory for the specified
* allocation order.
*
* When SEV-SNP is enabled the TMR needs to be 2MB aligned and 2MB sized.
*/
#define SEV_ES_TMR_SIZE (1024 * 1024)
#define SEV_TMR_SIZE (1024 * 1024)
#define SNP_TMR_SIZE (2 * 1024 * 1024)
static void *sev_es_tmr;
static size_t sev_es_tmr_size = SEV_TMR_SIZE;
/* INIT_EX NV Storage:
* The NV Storage is a 32KB area and must be 4KB page aligned. Use the page
......@@ -80,6 +102,13 @@ static void *sev_es_tmr;
#define NV_LENGTH (32 * 1024)
static void *sev_init_ex_buffer;
/*
* SEV_DATA_RANGE_LIST:
* Array containing range of pages that firmware transitions to HV-fixed
* page state.
*/
static struct sev_data_range_list *snp_range_list;
static inline bool sev_version_greater_or_equal(u8 maj, u8 min)
{
struct sev_device *sev = psp_master->sev_data;
......@@ -115,6 +144,25 @@ static int sev_wait_cmd_ioc(struct sev_device *sev,
{
int ret;
/*
* If invoked during panic handling, local interrupts are disabled,
* so the PSP command completion interrupt can't be used. Poll for
* PSP command completion instead.
*/
if (irqs_disabled()) {
unsigned long timeout_usecs = (timeout * USEC_PER_SEC) / 10;
/* Poll for SEV command completion: */
while (timeout_usecs--) {
*reg = ioread32(sev->io_regs + sev->vdata->cmdresp_reg);
if (*reg & PSP_CMDRESP_RESP)
return 0;
udelay(10);
}
return -ETIMEDOUT;
}
ret = wait_event_timeout(sev->int_queue,
sev->int_rcvd, timeout * HZ);
if (!ret)
......@@ -130,6 +178,8 @@ static int sev_cmd_buffer_len(int cmd)
switch (cmd) {
case SEV_CMD_INIT: return sizeof(struct sev_data_init);
case SEV_CMD_INIT_EX: return sizeof(struct sev_data_init_ex);
case SEV_CMD_SNP_SHUTDOWN_EX: return sizeof(struct sev_data_snp_shutdown_ex);
case SEV_CMD_SNP_INIT_EX: return sizeof(struct sev_data_snp_init_ex);
case SEV_CMD_PLATFORM_STATUS: return sizeof(struct sev_user_data_status);
case SEV_CMD_PEK_CSR: return sizeof(struct sev_data_pek_csr);
case SEV_CMD_PEK_CERT_IMPORT: return sizeof(struct sev_data_pek_cert_import);
......@@ -158,23 +208,27 @@ static int sev_cmd_buffer_len(int cmd)
case SEV_CMD_GET_ID: return sizeof(struct sev_data_get_id);
case SEV_CMD_ATTESTATION_REPORT: return sizeof(struct sev_data_attestation_report);
case SEV_CMD_SEND_CANCEL: return sizeof(struct sev_data_send_cancel);
case SEV_CMD_SNP_GCTX_CREATE: return sizeof(struct sev_data_snp_addr);
case SEV_CMD_SNP_LAUNCH_START: return sizeof(struct sev_data_snp_launch_start);
case SEV_CMD_SNP_LAUNCH_UPDATE: return sizeof(struct sev_data_snp_launch_update);
case SEV_CMD_SNP_ACTIVATE: return sizeof(struct sev_data_snp_activate);
case SEV_CMD_SNP_DECOMMISSION: return sizeof(struct sev_data_snp_addr);
case SEV_CMD_SNP_PAGE_RECLAIM: return sizeof(struct sev_data_snp_page_reclaim);
case SEV_CMD_SNP_GUEST_STATUS: return sizeof(struct sev_data_snp_guest_status);
case SEV_CMD_SNP_LAUNCH_FINISH: return sizeof(struct sev_data_snp_launch_finish);
case SEV_CMD_SNP_DBG_DECRYPT: return sizeof(struct sev_data_snp_dbg);
case SEV_CMD_SNP_DBG_ENCRYPT: return sizeof(struct sev_data_snp_dbg);
case SEV_CMD_SNP_PAGE_UNSMASH: return sizeof(struct sev_data_snp_page_unsmash);
case SEV_CMD_SNP_PLATFORM_STATUS: return sizeof(struct sev_data_snp_addr);
case SEV_CMD_SNP_GUEST_REQUEST: return sizeof(struct sev_data_snp_guest_request);
case SEV_CMD_SNP_CONFIG: return sizeof(struct sev_user_data_snp_config);
case SEV_CMD_SNP_COMMIT: return sizeof(struct sev_data_snp_commit);
default: return 0;
}
return 0;
}
static void *sev_fw_alloc(unsigned long len)
{
struct page *page;
page = alloc_pages(GFP_KERNEL, get_order(len));
if (!page)
return NULL;
return page_address(page);
}
static struct file *open_file_as_root(const char *filename, int flags, umode_t mode)
{
struct file *fp;
......@@ -305,13 +359,485 @@ static int sev_write_init_ex_file_if_required(int cmd_id)
return sev_write_init_ex_file();
}
/*
* snp_reclaim_pages() needs __sev_do_cmd_locked(), and __sev_do_cmd_locked()
* needs snp_reclaim_pages(), so a forward declaration is needed.
*/
static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret);
static int snp_reclaim_pages(unsigned long paddr, unsigned int npages, bool locked)
{
int ret, err, i;
paddr = __sme_clr(ALIGN_DOWN(paddr, PAGE_SIZE));
for (i = 0; i < npages; i++, paddr += PAGE_SIZE) {
struct sev_data_snp_page_reclaim data = {0};
data.paddr = paddr;
if (locked)
ret = __sev_do_cmd_locked(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
else
ret = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
if (ret)
goto cleanup;
ret = rmp_make_shared(__phys_to_pfn(paddr), PG_LEVEL_4K);
if (ret)
goto cleanup;
}
return 0;
cleanup:
/*
* If there was a failure reclaiming the page then it is no longer safe
* to release it back to the system; leak it instead.
*/
snp_leak_pages(__phys_to_pfn(paddr), npages - i);
return ret;
}
static int rmp_mark_pages_firmware(unsigned long paddr, unsigned int npages, bool locked)
{
unsigned long pfn = __sme_clr(paddr) >> PAGE_SHIFT;
int rc, i;
for (i = 0; i < npages; i++, pfn++) {
rc = rmp_make_private(pfn, 0, PG_LEVEL_4K, 0, true);
if (rc)
goto cleanup;
}
return 0;
cleanup:
/*
* Try unrolling the firmware state changes by
* reclaiming the pages which were already changed to the
* firmware state.
*/
snp_reclaim_pages(paddr, i, locked);
return rc;
}
static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order)
{
unsigned long npages = 1ul << order, paddr;
struct sev_device *sev;
struct page *page;
if (!psp_master || !psp_master->sev_data)
return NULL;
page = alloc_pages(gfp_mask, order);
if (!page)
return NULL;
/* If SEV-SNP is initialized then add the page in RMP table. */
sev = psp_master->sev_data;
if (!sev->snp_initialized)
return page;
paddr = __pa((unsigned long)page_address(page));
if (rmp_mark_pages_firmware(paddr, npages, false))
return NULL;
return page;
}
void *snp_alloc_firmware_page(gfp_t gfp_mask)
{
struct page *page;
page = __snp_alloc_firmware_pages(gfp_mask, 0);
return page ? page_address(page) : NULL;
}
EXPORT_SYMBOL_GPL(snp_alloc_firmware_page);
static void __snp_free_firmware_pages(struct page *page, int order, bool locked)
{
struct sev_device *sev = psp_master->sev_data;
unsigned long paddr, npages = 1ul << order;
if (!page)
return;
paddr = __pa((unsigned long)page_address(page));
if (sev->snp_initialized &&
snp_reclaim_pages(paddr, npages, locked))
return;
__free_pages(page, order);
}
void snp_free_firmware_page(void *addr)
{
if (!addr)
return;
__snp_free_firmware_pages(virt_to_page(addr), 0, false);
}
EXPORT_SYMBOL_GPL(snp_free_firmware_page);
static void *sev_fw_alloc(unsigned long len)
{
struct page *page;
page = __snp_alloc_firmware_pages(GFP_KERNEL, get_order(len));
if (!page)
return NULL;
return page_address(page);
}
/**
* struct cmd_buf_desc - descriptors for managing legacy SEV command address
* parameters corresponding to buffers that may be written to by firmware.
*
* @paddr_ptr: pointer to the address parameter in the command buffer which may
* need to be saved/restored depending on whether a bounce buffer
* is used. In the case of a bounce buffer, the command buffer
* needs to be updated with the address of the new bounce buffer
* snp_map_cmd_buf_desc() has allocated specifically for it. Must
* be NULL if this descriptor is only an end-of-list indicator.
*
* @paddr_orig: storage for the original address parameter, which can be used to
* restore the original value in @paddr_ptr in cases where it is
* replaced with the address of a bounce buffer.
*
* @len: length of buffer located at the address originally stored at @paddr_ptr
*
* @guest_owned: true if the address corresponds to guest-owned pages, in which
* case bounce buffers are not needed.
*/
struct cmd_buf_desc {
u64 *paddr_ptr;
u64 paddr_orig;
u32 len;
bool guest_owned;
};
/*
* If a legacy SEV command parameter is a memory address, those pages in
* turn need to be transitioned to/from firmware-owned before/after
* executing the firmware command.
*
* Additionally, in cases where those pages are not guest-owned, a bounce
* buffer is needed in place of the original memory address parameter.
*
* A set of descriptors is used to keep track of this handling, and is
* initialized here based on the specific command being executed.
*/
static void snp_populate_cmd_buf_desc_list(int cmd, void *cmd_buf,
struct cmd_buf_desc *desc_list)
{
switch (cmd) {
case SEV_CMD_PDH_CERT_EXPORT: {
struct sev_data_pdh_cert_export *data = cmd_buf;
desc_list[0].paddr_ptr = &data->pdh_cert_address;
desc_list[0].len = data->pdh_cert_len;
desc_list[1].paddr_ptr = &data->cert_chain_address;
desc_list[1].len = data->cert_chain_len;
break;
}
case SEV_CMD_GET_ID: {
struct sev_data_get_id *data = cmd_buf;
desc_list[0].paddr_ptr = &data->address;
desc_list[0].len = data->len;
break;
}
case SEV_CMD_PEK_CSR: {
struct sev_data_pek_csr *data = cmd_buf;
desc_list[0].paddr_ptr = &data->address;
desc_list[0].len = data->len;
break;
}
case SEV_CMD_LAUNCH_UPDATE_DATA: {
struct sev_data_launch_update_data *data = cmd_buf;
desc_list[0].paddr_ptr = &data->address;
desc_list[0].len = data->len;
desc_list[0].guest_owned = true;
break;
}
case SEV_CMD_LAUNCH_UPDATE_VMSA: {
struct sev_data_launch_update_vmsa *data = cmd_buf;
desc_list[0].paddr_ptr = &data->address;
desc_list[0].len = data->len;
desc_list[0].guest_owned = true;
break;
}
case SEV_CMD_LAUNCH_MEASURE: {
struct sev_data_launch_measure *data = cmd_buf;
desc_list[0].paddr_ptr = &data->address;
desc_list[0].len = data->len;
break;
}
case SEV_CMD_LAUNCH_UPDATE_SECRET: {
struct sev_data_launch_secret *data = cmd_buf;
desc_list[0].paddr_ptr = &data->guest_address;
desc_list[0].len = data->guest_len;
desc_list[0].guest_owned = true;
break;
}
case SEV_CMD_DBG_DECRYPT: {
struct sev_data_dbg *data = cmd_buf;
desc_list[0].paddr_ptr = &data->dst_addr;
desc_list[0].len = data->len;
desc_list[0].guest_owned = true;
break;
}
case SEV_CMD_DBG_ENCRYPT: {
struct sev_data_dbg *data = cmd_buf;
desc_list[0].paddr_ptr = &data->dst_addr;
desc_list[0].len = data->len;
desc_list[0].guest_owned = true;
break;
}
case SEV_CMD_ATTESTATION_REPORT: {
struct sev_data_attestation_report *data = cmd_buf;
desc_list[0].paddr_ptr = &data->address;
desc_list[0].len = data->len;
break;
}
case SEV_CMD_SEND_START: {
struct sev_data_send_start *data = cmd_buf;
desc_list[0].paddr_ptr = &data->session_address;
desc_list[0].len = data->session_len;
break;
}
case SEV_CMD_SEND_UPDATE_DATA: {
struct sev_data_send_update_data *data = cmd_buf;
desc_list[0].paddr_ptr = &data->hdr_address;
desc_list[0].len = data->hdr_len;
desc_list[1].paddr_ptr = &data->trans_address;
desc_list[1].len = data->trans_len;
break;
}
case SEV_CMD_SEND_UPDATE_VMSA: {
struct sev_data_send_update_vmsa *data = cmd_buf;
desc_list[0].paddr_ptr = &data->hdr_address;
desc_list[0].len = data->hdr_len;
desc_list[1].paddr_ptr = &data->trans_address;
desc_list[1].len = data->trans_len;
break;
}
case SEV_CMD_RECEIVE_UPDATE_DATA: {
struct sev_data_receive_update_data *data = cmd_buf;
desc_list[0].paddr_ptr = &data->guest_address;
desc_list[0].len = data->guest_len;
desc_list[0].guest_owned = true;
break;
}
case SEV_CMD_RECEIVE_UPDATE_VMSA: {
struct sev_data_receive_update_vmsa *data = cmd_buf;
desc_list[0].paddr_ptr = &data->guest_address;
desc_list[0].len = data->guest_len;
desc_list[0].guest_owned = true;
break;
}
default:
break;
}
}
static int snp_map_cmd_buf_desc(struct cmd_buf_desc *desc)
{
unsigned int npages;
if (!desc->len)
return 0;
/* Allocate a bounce buffer if this isn't a guest owned page. */
if (!desc->guest_owned) {
struct page *page;
page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(desc->len));
if (!page) {
pr_warn("Failed to allocate bounce buffer for SEV legacy command.\n");
return -ENOMEM;
}
desc->paddr_orig = *desc->paddr_ptr;
*desc->paddr_ptr = __psp_pa(page_to_virt(page));
}
npages = PAGE_ALIGN(desc->len) >> PAGE_SHIFT;
/* Transition the buffer to firmware-owned. */
if (rmp_mark_pages_firmware(*desc->paddr_ptr, npages, true)) {
pr_warn("Error moving pages to firmware-owned state for SEV legacy command.\n");
return -EFAULT;
}
return 0;
}
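Note on the sizing in snp_map_cmd_buf_desc(): the bounce buffer is allocated with get_order(desc->len), which rounds up to a power-of-two number of pages, while only PAGE_ALIGN(desc->len) >> PAGE_SHIFT pages are passed to rmp_mark_pages_firmware(). The following is a minimal userspace sketch of the two calculations, assuming 4 KiB pages. The sketch_* names are hypothetical stand-ins for illustration, not kernel APIs.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins for PAGE_SHIFT/PAGE_SIZE; 4 KiB pages assumed. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Pages needed to cover len bytes, mirroring PAGE_ALIGN(len) >> PAGE_SHIFT. */
static size_t sketch_npages(size_t len)
{
	return (len + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/*
 * Smallest order such that (1 << order) pages cover len bytes.
 * Intended to behave like get_order() for len > 0 (assumption of this sketch).
 */
static unsigned int sketch_get_order(size_t len)
{
	unsigned int order = 0;

	while ((SKETCH_PAGE_SIZE << order) < len)
		order++;
	return order;
}
```

For a three-page buffer, sketch_npages() yields 3 while sketch_get_order() yields 2, i.e. a four-page allocation of which only the first three pages would change RMP state.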
static int snp_unmap_cmd_buf_desc(struct cmd_buf_desc *desc)
{
unsigned int npages;
if (!desc->len)
return 0;
npages = PAGE_ALIGN(desc->len) >> PAGE_SHIFT;
/* Transition the buffers back to hypervisor-owned. */
if (snp_reclaim_pages(*desc->paddr_ptr, npages, true)) {
pr_warn("Failed to reclaim firmware-owned pages while issuing SEV legacy command.\n");
return -EFAULT;
}
/* Copy data from bounce buffer and then free it. */
if (!desc->guest_owned) {
void *bounce_buf = __va(__sme_clr(*desc->paddr_ptr));
void *dst_buf = __va(__sme_clr(desc->paddr_orig));
memcpy(dst_buf, bounce_buf, desc->len);
__free_pages(virt_to_page(bounce_buf), get_order(desc->len));
/* Restore the original address in the command buffer. */
*desc->paddr_ptr = desc->paddr_orig;
}
return 0;
}
static int snp_map_cmd_buf_desc_list(int cmd, void *cmd_buf, struct cmd_buf_desc *desc_list)
{
int i;
snp_populate_cmd_buf_desc_list(cmd, cmd_buf, desc_list);
for (i = 0; i < CMD_BUF_DESC_MAX; i++) {
struct cmd_buf_desc *desc = &desc_list[i];
if (!desc->paddr_ptr)
break;
if (snp_map_cmd_buf_desc(desc))
goto err_unmap;
}
return 0;
err_unmap:
for (i--; i >= 0; i--)
snp_unmap_cmd_buf_desc(&desc_list[i]);
return -EFAULT;
}
static int snp_unmap_cmd_buf_desc_list(struct cmd_buf_desc *desc_list)
{
int i, ret = 0;
for (i = 0; i < CMD_BUF_DESC_MAX; i++) {
struct cmd_buf_desc *desc = &desc_list[i];
if (!desc->paddr_ptr)
break;
if (snp_unmap_cmd_buf_desc(&desc_list[i]))
ret = -EFAULT;
}
return ret;
}
static bool sev_cmd_buf_writable(int cmd)
{
switch (cmd) {
case SEV_CMD_PLATFORM_STATUS:
case SEV_CMD_GUEST_STATUS:
case SEV_CMD_LAUNCH_START:
case SEV_CMD_RECEIVE_START:
case SEV_CMD_LAUNCH_MEASURE:
case SEV_CMD_SEND_START:
case SEV_CMD_SEND_UPDATE_DATA:
case SEV_CMD_SEND_UPDATE_VMSA:
case SEV_CMD_PEK_CSR:
case SEV_CMD_PDH_CERT_EXPORT:
case SEV_CMD_GET_ID:
case SEV_CMD_ATTESTATION_REPORT:
return true;
default:
return false;
}
}
/* After SNP is INIT'ed, the behavior of legacy SEV commands is changed. */
static bool snp_legacy_handling_needed(int cmd)
{
struct sev_device *sev = psp_master->sev_data;
return cmd < SEV_CMD_SNP_INIT && sev->snp_initialized;
}
static int snp_prep_cmd_buf(int cmd, void *cmd_buf, struct cmd_buf_desc *desc_list)
{
if (!snp_legacy_handling_needed(cmd))
return 0;
if (snp_map_cmd_buf_desc_list(cmd, cmd_buf, desc_list))
return -EFAULT;
/*
* Before command execution, the command buffer needs to be put into
* the firmware-owned state.
*/
if (sev_cmd_buf_writable(cmd)) {
if (rmp_mark_pages_firmware(__pa(cmd_buf), 1, true))
return -EFAULT;
}
return 0;
}
static int snp_reclaim_cmd_buf(int cmd, void *cmd_buf)
{
if (!snp_legacy_handling_needed(cmd))
return 0;
/*
* After command completion, the command buffer needs to be put back
* into the hypervisor-owned state.
*/
if (sev_cmd_buf_writable(cmd))
if (snp_reclaim_pages(__pa(cmd_buf), 1, true))
return -EFAULT;
return 0;
}
static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret)
{
struct cmd_buf_desc desc_list[CMD_BUF_DESC_MAX] = {0};
struct psp_device *psp = psp_master;
struct sev_device *sev;
unsigned int cmdbuff_hi, cmdbuff_lo;
unsigned int phys_lsb, phys_msb;
unsigned int reg, ret = 0;
void *cmd_buf;
int buf_len;
if (!psp || !psp->sev_data)
@@ -331,12 +857,47 @@ static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret)
* work for some memory, e.g. vmalloc'd addresses, and @data may not be
* physically contiguous.
*/
if (data)
memcpy(sev->cmd_buf, data, buf_len);
if (data) {
/*
* Commands are generally issued one at a time and require the
* sev_cmd_mutex, but there could be recursive firmware requests
* due to SEV_CMD_SNP_PAGE_RECLAIM needing to be issued while
* preparing buffers for another command. This is the only known
* case of nesting in the current code, so exactly one
* additional command buffer is available for that purpose.
*/
if (!sev->cmd_buf_active) {
cmd_buf = sev->cmd_buf;
sev->cmd_buf_active = true;
} else if (!sev->cmd_buf_backup_active) {
cmd_buf = sev->cmd_buf_backup;
sev->cmd_buf_backup_active = true;
} else {
dev_err(sev->dev,
"SEV: too many firmware commands in progress, no command buffers available.\n");
return -EBUSY;
}
memcpy(cmd_buf, data, buf_len);
/*
* The behavior of the SEV-legacy commands is altered when the
* SNP firmware is in the INIT state.
*/
ret = snp_prep_cmd_buf(cmd, cmd_buf, desc_list);
if (ret) {
dev_err(sev->dev,
"SEV: failed to prepare buffer for legacy command 0x%x. Error: %d\n",
cmd, ret);
return ret;
}
} else {
cmd_buf = sev->cmd_buf;
}
/* Get the physical address of the command buffer */
phys_lsb = data ? lower_32_bits(__psp_pa(sev->cmd_buf)) : 0;
phys_msb = data ? upper_32_bits(__psp_pa(sev->cmd_buf)) : 0;
phys_lsb = data ? lower_32_bits(__psp_pa(cmd_buf)) : 0;
phys_msb = data ? upper_32_bits(__psp_pa(cmd_buf)) : 0;
dev_dbg(sev->dev, "sev command id %#x buffer 0x%08x%08x timeout %us\n",
cmd, phys_msb, phys_lsb, psp_timeout);
@@ -374,115 +935,329 @@ static int __sev_do_cmd_locked(int cmd, void *data, int *psp_ret)
cmd, FIELD_GET(PSP_CMDRESP_STS, reg));
/*
* PSP firmware may report additional error information in the
* command buffer registers on error. Print contents of command
* buffer registers if they changed.
* PSP firmware may report additional error information in the
* command buffer registers on error. Print contents of command
* buffer registers if they changed.
*/
cmdbuff_hi = ioread32(sev->io_regs + sev->vdata->cmdbuff_addr_hi_reg);
cmdbuff_lo = ioread32(sev->io_regs + sev->vdata->cmdbuff_addr_lo_reg);
if (cmdbuff_hi != phys_msb || cmdbuff_lo != phys_lsb) {
dev_dbg(sev->dev, "Additional error information reported in cmdbuff:");
dev_dbg(sev->dev, " cmdbuff hi: %#010x\n", cmdbuff_hi);
dev_dbg(sev->dev, " cmdbuff lo: %#010x\n", cmdbuff_lo);
}
ret = -EIO;
} else {
ret = sev_write_init_ex_file_if_required(cmd);
}
/*
* Copy potential output from the PSP back to data. Do this even on
* failure in case the caller wants to glean something from the error.
*/
if (data) {
int ret_reclaim;
/*
* Restore the page state after the command completes.
*/
ret_reclaim = snp_reclaim_cmd_buf(cmd, cmd_buf);
if (ret_reclaim) {
dev_err(sev->dev,
"SEV: failed to reclaim buffer for legacy command %#x. Error: %d\n",
cmd, ret_reclaim);
return ret_reclaim;
}
memcpy(data, cmd_buf, buf_len);
if (sev->cmd_buf_backup_active)
sev->cmd_buf_backup_active = false;
else
sev->cmd_buf_active = false;
if (snp_unmap_cmd_buf_desc_list(desc_list))
return -EFAULT;
}
print_hex_dump_debug("(out): ", DUMP_PREFIX_OFFSET, 16, 2, data,
buf_len, false);
return ret;
}
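Note on the command-buffer selection in __sev_do_cmd_locked(): the primary buffer is taken first, the backup buffer serves one nested command, and release happens in reverse order (backup before primary). The following is a minimal userspace sketch of that two-slot scheme. struct sketch_dev and the sketch_* helpers are hypothetical and only model the cmd_buf_active / cmd_buf_backup_active flags; no RMP handling or firmware interaction is included.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two command buffers and their "active" flags. */
struct sketch_dev {
	char primary[64];
	char backup[64];
	bool primary_active;
	bool backup_active;
};

/* Pick the first free slot; NULL models the -EBUSY path. */
static char *sketch_acquire(struct sketch_dev *d)
{
	if (!d->primary_active) {
		d->primary_active = true;
		return d->primary;
	}
	if (!d->backup_active) {
		d->backup_active = true;
		return d->backup;
	}
	return NULL;
}

/* Release in LIFO order: a nested (backup) user always finishes first. */
static void sketch_release(struct sketch_dev *d)
{
	if (d->backup_active)
		d->backup_active = false;
	else
		d->primary_active = false;
}
```

The release helper relies on strict nesting: it cannot tell which buffer the caller held, so it is only correct when the backup user completes before the primary user does.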
int sev_do_cmd(int cmd, void *data, int *psp_ret)
{
int rc;
mutex_lock(&sev_cmd_mutex);
rc = __sev_do_cmd_locked(cmd, data, psp_ret);
mutex_unlock(&sev_cmd_mutex);
return rc;
}
EXPORT_SYMBOL_GPL(sev_do_cmd);
static int __sev_init_locked(int *error)
{
struct sev_data_init data;
memset(&data, 0, sizeof(data));
if (sev_es_tmr) {
/*
* Do not include the encryption mask on the physical
* address of the TMR (firmware should clear it anyway).
*/
data.tmr_address = __pa(sev_es_tmr);
data.flags |= SEV_INIT_FLAGS_SEV_ES;
data.tmr_len = sev_es_tmr_size;
}
return __sev_do_cmd_locked(SEV_CMD_INIT, &data, error);
}
static int __sev_init_ex_locked(int *error)
{
struct sev_data_init_ex data;
memset(&data, 0, sizeof(data));
data.length = sizeof(data);
data.nv_address = __psp_pa(sev_init_ex_buffer);
data.nv_len = NV_LENGTH;
if (sev_es_tmr) {
/*
* Do not include the encryption mask on the physical
* address of the TMR (firmware should clear it anyway).
*/
data.tmr_address = __pa(sev_es_tmr);
data.flags |= SEV_INIT_FLAGS_SEV_ES;
data.tmr_len = sev_es_tmr_size;
}
return __sev_do_cmd_locked(SEV_CMD_INIT_EX, &data, error);
}
static inline int __sev_do_init_locked(int *psp_ret)
{
if (sev_init_ex_buffer)
return __sev_init_ex_locked(psp_ret);
else
return __sev_init_locked(psp_ret);
}
static void snp_set_hsave_pa(void *arg)
{
wrmsrl(MSR_VM_HSAVE_PA, 0);
}
static int snp_filter_reserved_mem_regions(struct resource *rs, void *arg)
{
struct sev_data_range_list *range_list = arg;
struct sev_data_range *range = &range_list->ranges[range_list->num_elements];
size_t size;
/*
* Ensure the list of HV_FIXED pages that will be passed to firmware
* does not exceed the page-sized argument buffer.
*/
if ((range_list->num_elements * sizeof(struct sev_data_range) +
sizeof(struct sev_data_range_list)) > PAGE_SIZE)
return -E2BIG;
switch (rs->desc) {
case E820_TYPE_RESERVED:
case E820_TYPE_PMEM:
case E820_TYPE_ACPI:
range->base = rs->start & PAGE_MASK;
size = PAGE_ALIGN((rs->end + 1) - rs->start);
range->page_count = size >> PAGE_SHIFT;
range_list->num_elements++;
break;
default:
break;
}
return 0;
}
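Note on the range arithmetic in snp_filter_reserved_mem_regions(): each matching resource is converted to a page-aligned base and a page count derived from the inclusive [start, end] span. The following is a minimal userspace sketch of that arithmetic, assuming 4 KiB pages. struct sketch_range and sketch_range_from_resource() are hypothetical names for illustration and do not reproduce the layout of struct sev_data_range.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel page macros; 4 KiB pages assumed. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Illustrative result type only; not the firmware ABI structure. */
struct sketch_range {
	uint64_t base;
	uint32_t page_count;
};

/* start and end are inclusive, as in struct resource. */
static struct sketch_range sketch_range_from_resource(uint64_t start, uint64_t end)
{
	struct sketch_range r;
	/* Equivalent of PAGE_ALIGN((end + 1) - start). */
	uint64_t size = ((end + 1 - start) + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;

	r.base = start & SKETCH_PAGE_MASK;
	r.page_count = (uint32_t)(size >> SKETCH_PAGE_SHIFT);
	return r;
}
```

For example, a resource spanning 0x100000 to 0x101fff maps to base 0x100000 with a page count of 2.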
static int __sev_snp_init_locked(int *error)
{
struct psp_device *psp = psp_master;
struct sev_data_snp_init_ex data;
struct sev_device *sev;
void *arg = &data;
int cmd, rc = 0;
if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return -ENODEV;
sev = psp->sev_data;
if (sev->snp_initialized)
return 0;
if (!sev_version_greater_or_equal(SNP_MIN_API_MAJOR, SNP_MIN_API_MINOR)) {
dev_dbg(sev->dev, "SEV-SNP support requires firmware version >= %d:%d\n",
SNP_MIN_API_MAJOR, SNP_MIN_API_MINOR);
return 0;
}
/* SNP_INIT requires MSR_VM_HSAVE_PA to be cleared on all CPUs. */
on_each_cpu(snp_set_hsave_pa, NULL, 1);
/*
* Starting in SNP firmware v1.52, the SNP_INIT_EX command takes a list
* of system physical address ranges to convert into HV-fixed page
* states during the RMP initialization. For instance, the memory that
* UEFI reserves should be included in that list. This allows system
* components that occasionally write to memory (e.g. logging to UEFI
* reserved regions) to not fail due to RMP initialization and SNP
* enablement.
*
*/
if (sev_version_greater_or_equal(SNP_MIN_API_MAJOR, 52)) {
/*
* Firmware checks that the pages containing the ranges enumerated
* in the RANGES structure are either in the default page state or in the
* firmware page state.
*/
snp_range_list = kzalloc(PAGE_SIZE, GFP_KERNEL);
if (!snp_range_list) {
dev_err(sev->dev,
"SEV: SNP_INIT_EX range list memory allocation failed\n");
return -ENOMEM;
}
/*
* Retrieve all reserved memory regions from the e820 memory map
* to be setup as HV-fixed pages.
*/
cmdbuff_hi = ioread32(sev->io_regs + sev->vdata->cmdbuff_addr_hi_reg);
cmdbuff_lo = ioread32(sev->io_regs + sev->vdata->cmdbuff_addr_lo_reg);
if (cmdbuff_hi != phys_msb || cmdbuff_lo != phys_lsb) {
dev_dbg(sev->dev, "Additional error information reported in cmdbuff:");
dev_dbg(sev->dev, " cmdbuff hi: %#010x\n", cmdbuff_hi);
dev_dbg(sev->dev, " cmdbuff lo: %#010x\n", cmdbuff_lo);
rc = walk_iomem_res_desc(IORES_DESC_NONE, IORESOURCE_MEM, 0, ~0,
snp_range_list, snp_filter_reserved_mem_regions);
if (rc) {
dev_err(sev->dev,
"SEV: SNP_INIT_EX walk_iomem_res_desc failed rc = %d\n", rc);
return rc;
}
ret = -EIO;
memset(&data, 0, sizeof(data));
data.init_rmp = 1;
data.list_paddr_en = 1;
data.list_paddr = __psp_pa(snp_range_list);
cmd = SEV_CMD_SNP_INIT_EX;
} else {
ret = sev_write_init_ex_file_if_required(cmd);
cmd = SEV_CMD_SNP_INIT;
arg = NULL;
}
print_hex_dump_debug("(out): ", DUMP_PREFIX_OFFSET, 16, 2, data,
buf_len, false);
/*
* Copy potential output from the PSP back to data. Do this even on
* failure in case the caller wants to glean something from the error.
* The following sequence must be issued before launching the first SNP
* guest to ensure all dirty cache lines are flushed, including from
* updates to the RMP table itself via the RMPUPDATE instruction:
*
* - WBINVD on all running CPUs
* - SEV_CMD_SNP_INIT[_EX] firmware command
* - WBINVD on all running CPUs
* - SEV_CMD_SNP_DF_FLUSH firmware command
*/
if (data)
memcpy(data, sev->cmd_buf, buf_len);
wbinvd_on_all_cpus();
return ret;
}
rc = __sev_do_cmd_locked(cmd, arg, error);
if (rc)
return rc;
static int sev_do_cmd(int cmd, void *data, int *psp_ret)
{
int rc;
/* Prepare for first SNP guest launch after INIT. */
wbinvd_on_all_cpus();
rc = __sev_do_cmd_locked(SEV_CMD_SNP_DF_FLUSH, NULL, error);
if (rc)
return rc;
mutex_lock(&sev_cmd_mutex);
rc = __sev_do_cmd_locked(cmd, data, psp_ret);
mutex_unlock(&sev_cmd_mutex);
sev->snp_initialized = true;
dev_dbg(sev->dev, "SEV-SNP firmware initialized\n");
sev_es_tmr_size = SNP_TMR_SIZE;
return rc;
}
static int __sev_init_locked(int *error)
static void __sev_platform_init_handle_tmr(struct sev_device *sev)
{
struct sev_data_init data;
if (sev_es_tmr)
return;
memset(&data, 0, sizeof(data));
/* Obtain the TMR memory area for SEV-ES use */
sev_es_tmr = sev_fw_alloc(sev_es_tmr_size);
if (sev_es_tmr) {
/*
* Do not include the encryption mask on the physical
* address of the TMR (firmware should clear it anyway).
*/
data.tmr_address = __pa(sev_es_tmr);
data.flags |= SEV_INIT_FLAGS_SEV_ES;
data.tmr_len = SEV_ES_TMR_SIZE;
/* Must flush the cache before giving it to the firmware */
if (!sev->snp_initialized)
clflush_cache_range(sev_es_tmr, sev_es_tmr_size);
} else {
dev_warn(sev->dev, "SEV: TMR allocation failed, SEV-ES support unavailable\n");
}
return __sev_do_cmd_locked(SEV_CMD_INIT, &data, error);
}
static int __sev_init_ex_locked(int *error)
/*
* If an init_ex_path is provided allocate a buffer for the file and
* read in the contents. Additionally, if SNP is initialized, convert
* the buffer pages to firmware pages.
*/
static int __sev_platform_init_handle_init_ex_path(struct sev_device *sev)
{
struct sev_data_init_ex data;
struct page *page;
int rc;
memset(&data, 0, sizeof(data));
data.length = sizeof(data);
data.nv_address = __psp_pa(sev_init_ex_buffer);
data.nv_len = NV_LENGTH;
if (!init_ex_path)
return 0;
if (sev_es_tmr) {
/*
* Do not include the encryption mask on the physical
* address of the TMR (firmware should clear it anyway).
*/
data.tmr_address = __pa(sev_es_tmr);
if (sev_init_ex_buffer)
return 0;
data.flags |= SEV_INIT_FLAGS_SEV_ES;
data.tmr_len = SEV_ES_TMR_SIZE;
page = alloc_pages(GFP_KERNEL, get_order(NV_LENGTH));
if (!page) {
dev_err(sev->dev, "SEV: INIT_EX NV memory allocation failed\n");
return -ENOMEM;
}
return __sev_do_cmd_locked(SEV_CMD_INIT_EX, &data, error);
}
sev_init_ex_buffer = page_address(page);
static inline int __sev_do_init_locked(int *psp_ret)
{
if (sev_init_ex_buffer)
return __sev_init_ex_locked(psp_ret);
else
return __sev_init_locked(psp_ret);
rc = sev_read_init_ex_file();
if (rc)
return rc;
/* If SEV-SNP is initialized, transition to firmware page. */
if (sev->snp_initialized) {
unsigned long npages;
npages = 1UL << get_order(NV_LENGTH);
if (rmp_mark_pages_firmware(__pa(sev_init_ex_buffer), npages, false)) {
dev_err(sev->dev, "SEV: INIT_EX NV memory page state change failed.\n");
return -ENOMEM;
}
}
return 0;
}
static int __sev_platform_init_locked(int *error)
{
int rc = 0, psp_ret = SEV_RET_NO_FW_CALL;
struct psp_device *psp = psp_master;
int rc, psp_ret = SEV_RET_NO_FW_CALL;
struct sev_device *sev;
if (!psp || !psp->sev_data)
if (!psp_master || !psp_master->sev_data)
return -ENODEV;
sev = psp->sev_data;
sev = psp_master->sev_data;
if (sev->state == SEV_STATE_INIT)
return 0;
if (sev_init_ex_buffer) {
rc = sev_read_init_ex_file();
if (rc)
return rc;
}
__sev_platform_init_handle_tmr(sev);
rc = __sev_platform_init_handle_init_ex_path(sev);
if (rc)
return rc;
rc = __sev_do_init_locked(&psp_ret);
if (rc && psp_ret == SEV_RET_SECURE_DATA_INVALID) {
@@ -520,12 +1295,46 @@ static int __sev_platform_init_locked(int *error)
return 0;
}
int sev_platform_init(int *error)
static int _sev_platform_init_locked(struct sev_platform_init_args *args)
{
struct sev_device *sev;
int rc;
if (!psp_master || !psp_master->sev_data)
return -ENODEV;
sev = psp_master->sev_data;
if (sev->state == SEV_STATE_INIT)
return 0;
/*
* Legacy guests cannot be running while SNP_INIT(_EX) is executing,
* so perform SEV-SNP initialization at probe time.
*/
rc = __sev_snp_init_locked(&args->error);
if (rc && rc != -ENODEV) {
/*
* Don't abort the probe if SNP INIT failed,
* continue to initialize the legacy SEV firmware.
*/
dev_err(sev->dev, "SEV-SNP: failed to INIT rc %d, error %#x\n",
rc, args->error);
}
/* Defer legacy SEV/SEV-ES support if allowed by caller/module. */
if (args->probe && !psp_init_on_probe)
return 0;
return __sev_platform_init_locked(&args->error);
}
int sev_platform_init(struct sev_platform_init_args *args)
{
int rc;
mutex_lock(&sev_cmd_mutex);
rc = __sev_platform_init_locked(error);
rc = _sev_platform_init_locked(args);
mutex_unlock(&sev_cmd_mutex);
return rc;
@@ -556,17 +1365,6 @@ static int __sev_platform_shutdown_locked(int *error)
return ret;
}
static int sev_platform_shutdown(int *error)
{
int rc;
mutex_lock(&sev_cmd_mutex);
rc = __sev_platform_shutdown_locked(NULL);
mutex_unlock(&sev_cmd_mutex);
return rc;
}
static int sev_get_platform_state(int *state, int *error)
{
struct sev_user_data_status data;
@@ -842,6 +1640,72 @@ static int sev_update_firmware(struct device *dev)
return ret;
}
static int __sev_snp_shutdown_locked(int *error, bool panic)
{
struct sev_device *sev = psp_master->sev_data;
struct sev_data_snp_shutdown_ex data;
int ret;
if (!sev->snp_initialized)
return 0;
memset(&data, 0, sizeof(data));
data.len = sizeof(data);
data.iommu_snp_shutdown = 1;
/*
* If invoked during panic handling, local interrupts are disabled
* and all CPUs are stopped, so wbinvd_on_all_cpus() can't be called.
* In that case, a wbinvd() is done on remote CPUs via the NMI
* callback, so only a local wbinvd() is needed here.
*/
if (!panic)
wbinvd_on_all_cpus();
else
wbinvd();
ret = __sev_do_cmd_locked(SEV_CMD_SNP_SHUTDOWN_EX, &data, error);
/* SHUTDOWN may require DF_FLUSH */
if (*error == SEV_RET_DFFLUSH_REQUIRED) {
ret = __sev_do_cmd_locked(SEV_CMD_SNP_DF_FLUSH, NULL, NULL);
if (ret) {
dev_err(sev->dev, "SEV-SNP DF_FLUSH failed\n");
return ret;
}
/* reissue the shutdown command */
ret = __sev_do_cmd_locked(SEV_CMD_SNP_SHUTDOWN_EX, &data,
error);
}
if (ret) {
dev_err(sev->dev, "SEV-SNP firmware shutdown failed\n");
return ret;
}
/*
* SNP_SHUTDOWN_EX with IOMMU_SNP_SHUTDOWN set to 1 disables SNP
* enforcement by the IOMMU and also transitions all pages
* associated with the IOMMU to the Reclaim state.
* Firmware was transitioning the IOMMU pages to Hypervisor state
* before version 1.53. But, accounting for the number of assigned
* 4kB pages in a 2M page was done incorrectly by not transitioning
* to the Reclaim state. This resulted in RMP #PF when later accessing
* the 2M page containing those pages during kexec boot. Hence, the
* firmware now transitions these pages to Reclaim state and hypervisor
* needs to transition these pages to shared state. SNP Firmware
* version 1.53 and above are needed for kexec boot.
*/
ret = amd_iommu_snp_disable();
if (ret) {
dev_err(sev->dev, "SNP IOMMU shutdown failed\n");
return ret;
}
sev->snp_initialized = false;
dev_dbg(sev->dev, "SEV-SNP firmware shutdown\n");
return ret;
}
static int sev_ioctl_do_pek_import(struct sev_issue_cmd *argp, bool writable)
{
struct sev_device *sev = psp_master->sev_data;
@@ -1084,6 +1948,85 @@ static int sev_ioctl_do_pdh_export(struct sev_issue_cmd *argp, bool writable)
return ret;
}
static int sev_ioctl_do_snp_platform_status(struct sev_issue_cmd *argp)
{
struct sev_device *sev = psp_master->sev_data;
struct sev_data_snp_addr buf;
struct page *status_page;
void *data;
int ret;
if (!sev->snp_initialized || !argp->data)
return -EINVAL;
status_page = alloc_page(GFP_KERNEL_ACCOUNT);
if (!status_page)
return -ENOMEM;
data = page_address(status_page);
/*
* Firmware expects status page to be in firmware-owned state, otherwise
* it will report firmware error code INVALID_PAGE_STATE (0x1A).
*/
if (rmp_mark_pages_firmware(__pa(data), 1, true)) {
ret = -EFAULT;
goto cleanup;
}
buf.address = __psp_pa(data);
ret = __sev_do_cmd_locked(SEV_CMD_SNP_PLATFORM_STATUS, &buf, &argp->error);
/*
* Status page will be transitioned to Reclaim state upon success, or
* left in Firmware state on failure. Use snp_reclaim_pages() to
* transition either case back to Hypervisor-owned state.
*/
if (snp_reclaim_pages(__pa(data), 1, true))
return -EFAULT;
if (ret)
goto cleanup;
if (copy_to_user((void __user *)argp->data, data,
sizeof(struct sev_user_data_snp_status)))
ret = -EFAULT;
cleanup:
__free_pages(status_page, 0);
return ret;
}
static int sev_ioctl_do_snp_commit(struct sev_issue_cmd *argp)
{
struct sev_device *sev = psp_master->sev_data;
struct sev_data_snp_commit buf;
if (!sev->snp_initialized)
return -EINVAL;
buf.len = sizeof(buf);
return __sev_do_cmd_locked(SEV_CMD_SNP_COMMIT, &buf, &argp->error);
}
static int sev_ioctl_do_snp_set_config(struct sev_issue_cmd *argp, bool writable)
{
struct sev_device *sev = psp_master->sev_data;
struct sev_user_data_snp_config config;
if (!sev->snp_initialized || !argp->data)
return -EINVAL;
if (!writable)
return -EPERM;
if (copy_from_user(&config, (void __user *)argp->data, sizeof(config)))
return -EFAULT;
return __sev_do_cmd_locked(SEV_CMD_SNP_CONFIG, &config, &argp->error);
}
static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
{
void __user *argp = (void __user *)arg;
@@ -1135,6 +2078,15 @@ static long sev_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
case SEV_GET_ID2:
ret = sev_ioctl_do_get_id2(&input);
break;
case SNP_PLATFORM_STATUS:
ret = sev_ioctl_do_snp_platform_status(&input);
break;
case SNP_COMMIT:
ret = sev_ioctl_do_snp_commit(&input);
break;
case SNP_SET_CONFIG:
ret = sev_ioctl_do_snp_set_config(&input, writable);
break;
default:
ret = -EINVAL;
goto out;
@@ -1245,10 +2197,12 @@ int sev_dev_init(struct psp_device *psp)
if (!sev)
goto e_err;
sev->cmd_buf = (void *)devm_get_free_pages(dev, GFP_KERNEL, 0);
sev->cmd_buf = (void *)devm_get_free_pages(dev, GFP_KERNEL, 1);
if (!sev->cmd_buf)
goto e_sev;
sev->cmd_buf_backup = (uint8_t *)sev->cmd_buf + PAGE_SIZE;
psp->sev_data = sev;
sev->dev = dev;
@@ -1287,24 +2241,51 @@ int sev_dev_init(struct psp_device *psp)
return ret;
}
static void sev_firmware_shutdown(struct sev_device *sev)
static void __sev_firmware_shutdown(struct sev_device *sev, bool panic)
{
sev_platform_shutdown(NULL);
int error;
__sev_platform_shutdown_locked(NULL);
if (sev_es_tmr) {
/* The TMR area was encrypted, flush it from the cache */
wbinvd_on_all_cpus();
/*
* The TMR area was encrypted, flush it from the cache.
*
* If invoked during panic handling, local interrupts are
* disabled and all CPUs are stopped, so wbinvd_on_all_cpus()
* can't be used. In that case, wbinvd() is done on remote CPUs
* via the NMI callback, and done for this CPU later during
* SNP shutdown, so wbinvd_on_all_cpus() can be skipped.
*/
if (!panic)
wbinvd_on_all_cpus();
free_pages((unsigned long)sev_es_tmr,
get_order(SEV_ES_TMR_SIZE));
__snp_free_firmware_pages(virt_to_page(sev_es_tmr),
get_order(sev_es_tmr_size),
true);
sev_es_tmr = NULL;
}
if (sev_init_ex_buffer) {
free_pages((unsigned long)sev_init_ex_buffer,
get_order(NV_LENGTH));
__snp_free_firmware_pages(virt_to_page(sev_init_ex_buffer),
get_order(NV_LENGTH),
true);
sev_init_ex_buffer = NULL;
}
if (snp_range_list) {
kfree(snp_range_list);
snp_range_list = NULL;
}
__sev_snp_shutdown_locked(&error, panic);
}
static void sev_firmware_shutdown(struct sev_device *sev)
{
mutex_lock(&sev_cmd_mutex);
__sev_firmware_shutdown(sev, false);
mutex_unlock(&sev_cmd_mutex);
}
void sev_dev_destroy(struct psp_device *psp)
@@ -1322,6 +2303,29 @@ void sev_dev_destroy(struct psp_device *psp)
psp_clear_sev_irq_handler(psp);
}
static int snp_shutdown_on_panic(struct notifier_block *nb,
unsigned long reason, void *arg)
{
struct sev_device *sev = psp_master->sev_data;
/*
* If sev_cmd_mutex is already acquired, then it's likely
* another PSP command is in flight and issuing a shutdown
* would fail in unexpected ways. Rather than create even
* more confusion during a panic, just bail out here.
*/
if (mutex_is_locked(&sev_cmd_mutex))
return NOTIFY_DONE;
__sev_firmware_shutdown(sev, true);
return NOTIFY_DONE;
}
static struct notifier_block snp_panic_notifier = {
.notifier_call = snp_shutdown_on_panic,
};
int sev_issue_cmd_external_user(struct file *filep, unsigned int cmd,
void *data, int *error)
{
@@ -1335,7 +2339,8 @@ EXPORT_SYMBOL_GPL(sev_issue_cmd_external_user);
void sev_pci_init(void)
{
struct sev_device *sev = psp_master->sev_data;
int error, rc;
struct sev_platform_init_args args = {0};
int rc;
if (!sev)
return;
@@ -1348,36 +2353,18 @@ void sev_pci_init(void)
if (sev_update_firmware(sev->dev) == 0)
sev_get_api_version();
/* If an init_ex_path is provided rely on INIT_EX for PSP initialization
* instead of INIT.
*/
if (init_ex_path) {
sev_init_ex_buffer = sev_fw_alloc(NV_LENGTH);
if (!sev_init_ex_buffer) {
dev_err(sev->dev,
"SEV: INIT_EX NV memory allocation failed\n");
goto err;
}
}
/* Obtain the TMR memory area for SEV-ES use */
sev_es_tmr = sev_fw_alloc(SEV_ES_TMR_SIZE);
if (sev_es_tmr)
/* Must flush the cache before giving it to the firmware */
clflush_cache_range(sev_es_tmr, SEV_ES_TMR_SIZE);
else
dev_warn(sev->dev,
"SEV: TMR allocation failed, SEV-ES support unavailable\n");
if (!psp_init_on_probe)
return;
/* Initialize the platform */
rc = sev_platform_init(&error);
args.probe = true;
rc = sev_platform_init(&args);
if (rc)
dev_err(sev->dev, "SEV: failed to INIT error %#x, rc %d\n",
error, rc);
args.error, rc);
dev_info(sev->dev, "SEV%s API:%d.%d build:%d\n", sev->snp_initialized ?
"-SNP" : "", sev->api_major, sev->api_minor, sev->build);
atomic_notifier_chain_register(&panic_notifier_list,
&snp_panic_notifier);
return;
err:
@@ -1392,4 +2379,7 @@ void sev_pci_exit(void)
return;
sev_firmware_shutdown(sev);
atomic_notifier_chain_unregister(&panic_notifier_list,
&snp_panic_notifier);
}
@@ -52,6 +52,11 @@ struct sev_device {
u8 build;
void *cmd_buf;
void *cmd_buf_backup;
bool cmd_buf_active;
bool cmd_buf_backup_active;
bool snp_initialized;
};
int sev_dev_init(struct psp_device *psp);
@@ -164,5 +164,4 @@ void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
u64 *root, int mode);
struct dev_table_entry *get_dev_table(struct amd_iommu *iommu);
extern bool amd_iommu_snp_en;
#endif
@@ -30,6 +30,7 @@
#include <asm/io_apic.h>
#include <asm/irq_remapping.h>
#include <asm/set_memory.h>
#include <asm/sev.h>
#include <linux/crash_dump.h>
@@ -3221,6 +3222,36 @@ static bool __init detect_ivrs(void)
return true;
}
static void iommu_snp_enable(void)
{
#ifdef CONFIG_KVM_AMD_SEV
if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
return;
/*
* SNP support requires that the IOMMU be enabled and not
* configured in passthrough mode.
*/
if (no_iommu || iommu_default_passthrough()) {
pr_err("SNP: IOMMU disabled or configured in passthrough mode, SNP cannot be supported.\n");
return;
}
amd_iommu_snp_en = check_feature(FEATURE_SNP);
if (!amd_iommu_snp_en) {
pr_err("SNP: IOMMU SNP feature not enabled, SNP cannot be supported.\n");
return;
}
pr_info("IOMMU SNP support enabled.\n");
/* Enforce IOMMU v1 pagetable when SNP is enabled. */
if (amd_iommu_pgtable != AMD_IOMMU_V1) {
pr_warn("Forcing use of AMD IOMMU v1 page table due to SNP.\n");
amd_iommu_pgtable = AMD_IOMMU_V1;
}
#endif
}
/****************************************************************************
*
* AMD IOMMU Initialization State Machine
@@ -3256,6 +3287,7 @@ static int __init state_next(void)
break;
case IOMMU_ENABLED:
register_syscore_ops(&amd_iommu_syscore_ops);
iommu_snp_enable();
ret = amd_iommu_init_pci();
init_state = ret ? IOMMU_INIT_ERROR : IOMMU_PCI_INIT;
break;
@@ -3767,40 +3799,85 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64
return iommu_pc_get_set_reg(iommu, bank, cntr, fxn, value, true);
}
#ifdef CONFIG_AMD_MEM_ENCRYPT
int amd_iommu_snp_enable(void)
#ifdef CONFIG_KVM_AMD_SEV
static int iommu_page_make_shared(void *page)
{
/*
* The SNP support requires that IOMMU must be enabled, and is
* not configured in the passthrough mode.
*/
if (no_iommu || iommu_default_passthrough()) {
pr_err("SNP: IOMMU is disabled or configured in passthrough mode, SNP cannot be supported");
return -EINVAL;
unsigned long paddr, pfn;
paddr = iommu_virt_to_phys(page);
/* The C-bit may be set in the paddr */
pfn = __sme_clr(paddr) >> PAGE_SHIFT;
if (!(pfn % PTRS_PER_PMD)) {
int ret, level;
bool assigned;
ret = snp_lookup_rmpentry(pfn, &assigned, &level);
if (ret) {
pr_warn("IOMMU PFN %lx RMP lookup failed, ret %d\n", pfn, ret);
return ret;
}
if (!assigned) {
pr_warn("IOMMU PFN %lx not assigned in RMP table\n", pfn);
return -EINVAL;
}
if (level > PG_LEVEL_4K) {
ret = psmash(pfn);
if (!ret)
goto done;
pr_warn("PSMASH failed for IOMMU PFN %lx huge RMP entry, ret: %d, level: %d\n",
pfn, ret, level);
return ret;
}
}
/*
* Prevent enabling SNP after IOMMU_ENABLED state because this process
* affect how IOMMU driver sets up data structures and configures
* IOMMU hardware.
*/
if (init_state > IOMMU_ENABLED) {
pr_err("SNP: Too late to enable SNP for IOMMU.\n");
return -EINVAL;
done:
return rmp_make_shared(pfn, PG_LEVEL_4K);
}
static int iommu_make_shared(void *va, size_t size)
{
void *page;
int ret;
if (!va)
return 0;
for (page = va; page < (va + size); page += PAGE_SIZE) {
ret = iommu_page_make_shared(page);
if (ret)
return ret;
}
amd_iommu_snp_en = check_feature(FEATURE_SNP);
return 0;
}
int amd_iommu_snp_disable(void)
{
struct amd_iommu *iommu;
int ret;
if (!amd_iommu_snp_en)
return -EINVAL;
return 0;
for_each_iommu(iommu) {
ret = iommu_make_shared(iommu->evt_buf, EVT_BUFFER_SIZE);
if (ret)
return ret;
pr_info("SNP enabled\n");
ret = iommu_make_shared(iommu->ppr_log, PPR_LOG_SIZE);
if (ret)
return ret;
/* Enforce IOMMU v1 pagetable when SNP is enabled. */
if (amd_iommu_pgtable != AMD_IOMMU_V1) {
pr_warn("Force to using AMD IOMMU v1 page table due to SNP\n");
amd_iommu_pgtable = AMD_IOMMU_V1;
ret = iommu_make_shared((void *)iommu->cmd_sem, PAGE_SIZE);
if (ret)
return ret;
}
return 0;
}
EXPORT_SYMBOL_GPL(amd_iommu_snp_disable);
#endif
@@ -85,8 +85,10 @@ int amd_iommu_pc_get_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn,
u64 *value);
struct amd_iommu *get_amd_iommu(unsigned int idx);
#ifdef CONFIG_AMD_MEM_ENCRYPT
int amd_iommu_snp_enable(void);
#ifdef CONFIG_KVM_AMD_SEV
int amd_iommu_snp_disable(void);
#else
static inline int amd_iommu_snp_disable(void) { return 0; }
#endif
#endif /* _ASM_X86_AMD_IOMMU_H */
@@ -78,6 +78,36 @@ enum sev_cmd {
SEV_CMD_DBG_DECRYPT = 0x060,
SEV_CMD_DBG_ENCRYPT = 0x061,
/* SNP specific commands */
SEV_CMD_SNP_INIT = 0x081,
SEV_CMD_SNP_SHUTDOWN = 0x082,
SEV_CMD_SNP_PLATFORM_STATUS = 0x083,
SEV_CMD_SNP_DF_FLUSH = 0x084,
SEV_CMD_SNP_INIT_EX = 0x085,
SEV_CMD_SNP_SHUTDOWN_EX = 0x086,
SEV_CMD_SNP_DECOMMISSION = 0x090,
SEV_CMD_SNP_ACTIVATE = 0x091,
SEV_CMD_SNP_GUEST_STATUS = 0x092,
SEV_CMD_SNP_GCTX_CREATE = 0x093,
SEV_CMD_SNP_GUEST_REQUEST = 0x094,
SEV_CMD_SNP_ACTIVATE_EX = 0x095,
SEV_CMD_SNP_LAUNCH_START = 0x0A0,
SEV_CMD_SNP_LAUNCH_UPDATE = 0x0A1,
SEV_CMD_SNP_LAUNCH_FINISH = 0x0A2,
SEV_CMD_SNP_DBG_DECRYPT = 0x0B0,
SEV_CMD_SNP_DBG_ENCRYPT = 0x0B1,
SEV_CMD_SNP_PAGE_SWAP_OUT = 0x0C0,
SEV_CMD_SNP_PAGE_SWAP_IN = 0x0C1,
SEV_CMD_SNP_PAGE_MOVE = 0x0C2,
SEV_CMD_SNP_PAGE_MD_INIT = 0x0C3,
SEV_CMD_SNP_PAGE_SET_STATE = 0x0C6,
SEV_CMD_SNP_PAGE_RECLAIM = 0x0C7,
SEV_CMD_SNP_PAGE_UNSMASH = 0x0C8,
SEV_CMD_SNP_CONFIG = 0x0C9,
SEV_CMD_SNP_DOWNLOAD_FIRMWARE_EX = 0x0CA,
SEV_CMD_SNP_COMMIT = 0x0CB,
SEV_CMD_SNP_VLEK_LOAD = 0x0CD,
SEV_CMD_MAX,
};
......@@ -523,12 +553,269 @@ struct sev_data_attestation_report {
u32 len; /* In/Out */
} __packed;
/**
* struct sev_data_snp_download_firmware - SNP_DOWNLOAD_FIRMWARE command params
*
* @address: physical address of firmware image
* @len: length of the firmware image
*/
struct sev_data_snp_download_firmware {
u64 address; /* In */
u32 len; /* In */
} __packed;
/**
* struct sev_data_snp_activate - SNP_ACTIVATE command params
*
* @gctx_paddr: system physical address guest context page
* @asid: ASID to bind to the guest
*/
struct sev_data_snp_activate {
u64 gctx_paddr; /* In */
u32 asid; /* In */
} __packed;
/**
* struct sev_data_snp_addr - generic SNP command params
*
* @address: physical address of generic data param
*/
struct sev_data_snp_addr {
u64 address; /* In/Out */
} __packed;
/**
* struct sev_data_snp_launch_start - SNP_LAUNCH_START command params
*
* @gctx_paddr: system physical address of guest context page
* @policy: guest policy
* @ma_gctx_paddr: system physical address of migration agent
* @ma_en: the guest is associated with a migration agent
* @imi_en: launch flow is launching an IMI (Incoming Migration Image) for the
* purpose of guest-assisted migration.
* @rsvd: reserved
* @gosvw: guest OS-visible workarounds, as defined by hypervisor
*/
struct sev_data_snp_launch_start {
u64 gctx_paddr; /* In */
u64 policy; /* In */
u64 ma_gctx_paddr; /* In */
u32 ma_en:1; /* In */
u32 imi_en:1; /* In */
u32 rsvd:30;
u8 gosvw[16]; /* In */
} __packed;
/* SNP support page type */
enum {
SNP_PAGE_TYPE_NORMAL = 0x1,
SNP_PAGE_TYPE_VMSA = 0x2,
SNP_PAGE_TYPE_ZERO = 0x3,
SNP_PAGE_TYPE_UNMEASURED = 0x4,
SNP_PAGE_TYPE_SECRET = 0x5,
SNP_PAGE_TYPE_CPUID = 0x6,
SNP_PAGE_TYPE_MAX
};
/**
* struct sev_data_snp_launch_update - SNP_LAUNCH_UPDATE command params
*
* @gctx_paddr: system physical address of guest context page
* @page_size: page size 0 indicates 4K and 1 indicates 2MB page
* @page_type: encoded page type
* @imi_page: indicates that this page is part of the IMI (Incoming Migration
* Image) of the guest
* @rsvd: reserved
* @rsvd2: reserved
* @address: system physical address of destination page to encrypt
* @rsvd3: reserved
* @vmpl1_perms: VMPL permission mask for VMPL1
* @vmpl2_perms: VMPL permission mask for VMPL2
* @vmpl3_perms: VMPL permission mask for VMPL3
* @rsvd4: reserved
*/
struct sev_data_snp_launch_update {
u64 gctx_paddr; /* In */
u32 page_size:1; /* In */
u32 page_type:3; /* In */
u32 imi_page:1; /* In */
u32 rsvd:27;
u32 rsvd2;
u64 address; /* In */
u32 rsvd3:8;
u32 vmpl1_perms:8; /* In */
u32 vmpl2_perms:8; /* In */
u32 vmpl3_perms:8; /* In */
u32 rsvd4;
} __packed;
/**
* struct sev_data_snp_launch_finish - SNP_LAUNCH_FINISH command params
*
* @gctx_paddr: system physical address of guest context page
* @id_block_paddr: system physical address of ID block
* @id_auth_paddr: system physical address of ID block authentication structure
* @id_block_en: indicates whether ID block is present
* @auth_key_en: indicates whether author key is present in authentication structure
* @rsvd: reserved
* @host_data: host-supplied data for guest, not interpreted by firmware
*/
struct sev_data_snp_launch_finish {
u64 gctx_paddr;
u64 id_block_paddr;
u64 id_auth_paddr;
u8 id_block_en:1;
u8 auth_key_en:1;
u64 rsvd:62;
u8 host_data[32];
} __packed;
/**
* struct sev_data_snp_guest_status - SNP_GUEST_STATUS command params
*
* @gctx_paddr: system physical address of guest context page
* @address: system physical address of guest status page
*/
struct sev_data_snp_guest_status {
u64 gctx_paddr;
u64 address;
} __packed;
/**
* struct sev_data_snp_page_reclaim - SNP_PAGE_RECLAIM command params
*
* @paddr: system physical address of page to be claimed. The 0th bit in the
* address indicates the page size. 0h indicates 4KB and 1h indicates
* 2MB page.
*/
struct sev_data_snp_page_reclaim {
u64 paddr;
} __packed;
/**
* struct sev_data_snp_page_unsmash - SNP_PAGE_UNSMASH command params
*
* @paddr: system physical address of page to be unsmashed. The 0th bit in the
* address indicates the page size. 0h indicates 4 KB and 1h indicates
* 2 MB page.
*/
struct sev_data_snp_page_unsmash {
u64 paddr;
} __packed;
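The kernel-doc for both sev_data_snp_page_reclaim and sev_data_snp_page_unsmash says that bit 0 of paddr carries the page size. A minimal sketch of that encoding follows. The helper names and the 12-bit page shift are assumptions made for illustration and are not part of this header.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Hypothetical helper: build a paddr value from a page frame number.
 * Bit 0 selects the page size: 0 = 4 KB, 1 = 2 MB. */
static uint64_t sketch_snp_paddr(uint64_t pfn, bool is_2mb)
{
	uint64_t paddr = pfn << SKETCH_PAGE_SHIFT;

	return is_2mb ? (paddr | 1u) : paddr;
}

/* Hypothetical helper: read the size flag back out of a paddr value. */
static bool sketch_snp_paddr_is_2mb(uint64_t paddr)
{
	return (paddr & 1u) != 0;
}

/* Hypothetical helper: recover the page frame number. */
static uint64_t sketch_snp_paddr_to_pfn(uint64_t paddr)
{
	return paddr >> SKETCH_PAGE_SHIFT;
}
```

Because a page-aligned address always has its low 12 bits clear, using bit 0 as a flag loses no address information.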
/**
* struct sev_data_snp_dbg - DBG_ENCRYPT/DBG_DECRYPT command parameters
*
* @gctx_paddr: system physical address of guest context page
* @src_addr: source address of data to operate on
* @dst_addr: destination address of data to operate on
*/
struct sev_data_snp_dbg {
u64 gctx_paddr; /* In */
u64 src_addr; /* In */
u64 dst_addr; /* In */
} __packed;
/**
* struct sev_data_snp_guest_request - SNP_GUEST_REQUEST command params
*
* @gctx_paddr: system physical address of guest context page
* @req_paddr: system physical address of request page
* @res_paddr: system physical address of response page
*/
struct sev_data_snp_guest_request {
u64 gctx_paddr; /* In */
u64 req_paddr; /* In */
u64 res_paddr; /* In */
} __packed;
/**
* struct sev_data_snp_init_ex - SNP_INIT_EX structure
*
* @init_rmp: indicate that the RMP should be initialized.
* @list_paddr_en: indicate that list_paddr is valid
* @rsvd: reserved
* @rsvd1: reserved
* @list_paddr: system physical address of range list
* @rsvd2: reserved
*/
struct sev_data_snp_init_ex {
u32 init_rmp:1;
u32 list_paddr_en:1;
u32 rsvd:30;
u32 rsvd1;
u64 list_paddr;
u8 rsvd2[48];
} __packed;
/**
* struct sev_data_range - RANGE structure
*
* @base: system physical address of first byte of range
* @page_count: number of 4KB pages in this range
* @rsvd: reserved
*/
struct sev_data_range {
u64 base;
u32 page_count;
u32 rsvd;
} __packed;
/**
* struct sev_data_range_list - RANGE_LIST structure
*
* @num_elements: number of elements in RANGE_ARRAY
* @rsvd: reserved
* @ranges: array of num_elements of type RANGE
*/
struct sev_data_range_list {
u32 num_elements;
u32 rsvd;
struct sev_data_range ranges[];
} __packed;
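sev_data_range_list ends in a flexible array of sev_data_range entries, so its size depends on num_elements. The sketch below mirrors the two layouts in user space, using the GCC/Clang packed attribute, to show how such a list could be sized and filled. The sketch_* names, the calloc() allocation, and the capacity check are assumptions for illustration and say nothing about how the driver allocates its list.

```c
#include <stdint.h>
#include <stdlib.h>

/* User-space mirrors of sev_data_range and sev_data_range_list. */
struct sketch_range {
	uint64_t base;		/* system physical address of first byte */
	uint32_t page_count;	/* number of 4KB pages in the range */
	uint32_t rsvd;
} __attribute__((packed));

struct sketch_range_list {
	uint32_t num_elements;
	uint32_t rsvd;
	struct sketch_range ranges[];
} __attribute__((packed));

/* Bytes needed for a list holding n ranges: 8-byte header + 16 per range. */
static size_t sketch_range_list_size(uint32_t n)
{
	return sizeof(struct sketch_range_list) +
	       (size_t)n * sizeof(struct sketch_range);
}

/* Allocate a zeroed list with room for 'capacity' ranges. */
static struct sketch_range_list *sketch_range_list_alloc(uint32_t capacity)
{
	return calloc(1, sketch_range_list_size(capacity));
}

/* Append one range; returns -1 when the list is already full. */
static int sketch_range_list_add(struct sketch_range_list *list,
				 uint32_t capacity,
				 uint64_t base, uint32_t page_count)
{
	uint32_t n = list->num_elements;

	if (n >= capacity)
		return -1;

	list->ranges[n].base = base;
	list->ranges[n].page_count = page_count;
	list->num_elements = n + 1;
	return 0;
}
```

The caller tracks the capacity separately because the structure itself records only how many entries are in use.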
/**
* struct sev_data_snp_shutdown_ex - SNP_SHUTDOWN_EX structure
*
* @len: length of the command buffer read by the PSP
* @iommu_snp_shutdown: Disable enforcement of SNP in the IOMMU
* @rsvd1: reserved
*/
struct sev_data_snp_shutdown_ex {
u32 len;
u32 iommu_snp_shutdown:1;
u32 rsvd1:31;
} __packed;
/**
* struct sev_platform_init_args
*
* @error: SEV firmware error code
* @probe: True if this is being called as part of CCP module probe, which
* will defer SEV_INIT/SEV_INIT_EX firmware initialization until needed
* unless psp_init_on_probe module param is set
*/
struct sev_platform_init_args {
int error;
bool probe;
};
/**
* struct sev_data_snp_commit - SNP_COMMIT structure
*
* @len: length of the command buffer read by the PSP
*/
struct sev_data_snp_commit {
u32 len;
} __packed;
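Both sev_data_snp_shutdown_ex and sev_data_snp_commit begin with a len field documented as the length of the command buffer read by the PSP. One plausible way to fill it is to store the size of the structure itself before issuing the command. The sketch below illustrates this with user-space mirrors of the two structs. sketch_do_cmd() is a stub that only records its arguments, so nothing here talks to real firmware, and all sketch_* names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_CMD_SNP_SHUTDOWN_EX	0x086	/* value from enum sev_cmd */
#define SKETCH_CMD_SNP_COMMIT		0x0CB	/* value from enum sev_cmd */

struct sketch_snp_shutdown_ex {
	uint32_t len;
	uint32_t iommu_snp_shutdown:1;
	uint32_t rsvd1:31;
} __attribute__((packed));

struct sketch_snp_commit {
	uint32_t len;
} __attribute__((packed));

/* Stub standing in for sev_do_cmd(): remembers the command id and the
 * leading len field instead of sending anything to the PSP. */
static int sketch_last_cmd;
static uint32_t sketch_last_len;

static int sketch_do_cmd(int cmd, void *data, int *psp_ret)
{
	memcpy(&sketch_last_len, data, sizeof(sketch_last_len));
	sketch_last_cmd = cmd;
	*psp_ret = 0;
	return 0;
}

static int sketch_snp_commit_cmd(int *psp_ret)
{
	struct sketch_snp_commit buf;

	memset(&buf, 0, sizeof(buf));
	buf.len = sizeof(buf);
	return sketch_do_cmd(SKETCH_CMD_SNP_COMMIT, &buf, psp_ret);
}

static int sketch_snp_shutdown_cmd(int *psp_ret)
{
	struct sketch_snp_shutdown_ex buf;

	memset(&buf, 0, sizeof(buf));
	buf.len = sizeof(buf);
	buf.iommu_snp_shutdown = 1;	/* also drop SNP enforcement in the IOMMU */
	return sketch_do_cmd(SKETCH_CMD_SNP_SHUTDOWN_EX, &buf, psp_ret);
}
```

Zeroing the buffer first keeps the reserved bits clear.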
#ifdef CONFIG_CRYPTO_DEV_SP_PSP
/**
* sev_platform_init - perform SEV INIT command
*
- * @error: SEV command return code
+ * @args: struct sev_platform_init_args to pass in arguments
*
* Returns:
* 0 if the SEV successfully processed the command
......@@ -537,7 +824,7 @@ struct sev_data_attestation_report {
* -%ETIMEDOUT if the SEV command timed out
* -%EIO if the SEV returned a non-zero return code
*/
-int sev_platform_init(int *error);
+int sev_platform_init(struct sev_platform_init_args *args);
/**
* sev_platform_status - perform SEV PLATFORM_STATUS command
......@@ -637,14 +924,32 @@ int sev_guest_df_flush(int *error);
*/
int sev_guest_decommission(struct sev_data_decommission *data, int *error);
/**
* sev_do_cmd - issue an SEV or an SEV-SNP command
*
* @cmd: SEV or SEV-SNP firmware command to issue
* @data: arguments for firmware command
* @psp_ret: SEV command return code
*
* Returns:
* 0 if the SEV device successfully processed the command
* -%ENODEV if the PSP device is not available
* -%ENOTSUPP if PSP device does not support SEV
* -%ETIMEDOUT if the SEV command timed out
* -%EIO if PSP device returned a non-zero return code
*/
int sev_do_cmd(int cmd, void *data, int *psp_ret);
void *psp_copy_user_blob(u64 uaddr, u32 len);
void *snp_alloc_firmware_page(gfp_t mask);
void snp_free_firmware_page(void *addr);
#else /* !CONFIG_CRYPTO_DEV_SP_PSP */
static inline int
sev_platform_status(struct sev_user_data_status *status, int *error) { return -ENODEV; }
-static inline int sev_platform_init(int *error) { return -ENODEV; }
+static inline int sev_platform_init(struct sev_platform_init_args *args) { return -ENODEV; }
static inline int
sev_guest_deactivate(struct sev_data_deactivate *data, int *error) { return -ENODEV; }
......@@ -652,6 +957,9 @@ sev_guest_deactivate(struct sev_data_deactivate *data, int *error) { return -ENO
static inline int
sev_guest_decommission(struct sev_data_decommission *data, int *error) { return -ENODEV; }
static inline int
sev_do_cmd(int cmd, void *data, int *psp_ret) { return -ENODEV; }
static inline int
sev_guest_activate(struct sev_data_activate *data, int *error) { return -ENODEV; }
......@@ -662,6 +970,13 @@ sev_issue_cmd_external_user(struct file *filep, unsigned int id, void *data, int
static inline void *psp_copy_user_blob(u64 __user uaddr, u32 len) { return ERR_PTR(-EINVAL); }
static inline void *snp_alloc_firmware_page(gfp_t mask)
{
return NULL;
}
static inline void snp_free_firmware_page(void *addr) { }
#endif /* CONFIG_CRYPTO_DEV_SP_PSP */
#endif /* __PSP_SEV_H__ */
......@@ -28,6 +28,9 @@ enum {
SEV_PEK_CERT_IMPORT,
SEV_GET_ID, /* This command is deprecated, use SEV_GET_ID2 */
SEV_GET_ID2,
SNP_PLATFORM_STATUS,
SNP_COMMIT,
SNP_SET_CONFIG,
SEV_MAX,
};
......@@ -69,6 +72,12 @@ typedef enum {
SEV_RET_RESOURCE_LIMIT,
SEV_RET_SECURE_DATA_INVALID,
SEV_RET_INVALID_KEY = 0x27,
SEV_RET_INVALID_PAGE_SIZE,
SEV_RET_INVALID_PAGE_STATE,
SEV_RET_INVALID_MDATA_ENTRY,
SEV_RET_INVALID_PAGE_OWNER,
SEV_RET_INVALID_PAGE_AEAD_OFLOW,
SEV_RET_RMP_INIT_REQUIRED,
SEV_RET_MAX,
} sev_ret_code;
......@@ -155,6 +164,56 @@ struct sev_user_data_get_id2 {
__u32 length; /* In/Out */
} __packed;
/**
* struct sev_user_data_snp_status - SNP status
*
* @api_major: API major version
* @api_minor: API minor version
* @state: current platform state
* @is_rmp_initialized: whether RMP is initialized or not
* @rsvd: reserved
* @build_id: firmware build id for the API version
* @mask_chip_id: whether chip id is present in attestation reports or not
* @mask_chip_key: whether attestation reports are signed or not
* @vlek_en: VLEK (Version Loaded Endorsement Key) hashstick is loaded
* @rsvd1: reserved
* @guest_count: the number of guest currently managed by the firmware
* @current_tcb_version: current TCB version
* @reported_tcb_version: reported TCB version
*/
struct sev_user_data_snp_status {
__u8 api_major; /* Out */
__u8 api_minor; /* Out */
__u8 state; /* Out */
__u8 is_rmp_initialized:1; /* Out */
__u8 rsvd:7;
__u32 build_id; /* Out */
__u32 mask_chip_id:1; /* Out */
__u32 mask_chip_key:1; /* Out */
__u32 vlek_en:1; /* Out */
__u32 rsvd1:29;
__u32 guest_count; /* Out */
__u64 current_tcb_version; /* Out */
__u64 reported_tcb_version; /* Out */
} __packed;
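The packed layout of sev_user_data_snp_status adds up to 32 bytes. The sketch below mirrors the structure in user space and copies a raw buffer into it, which is one way to inspect a status blob once it has been obtained. It assumes a little-endian target and the GCC/Clang bit-field layout, where the first declared bit-field occupies the least significant bit. The sketch_* names are hypothetical, and the code does not issue any ioctl.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* User-space mirror of struct sev_user_data_snp_status. */
struct sketch_snp_status {
	uint8_t  api_major;
	uint8_t  api_minor;
	uint8_t  state;
	uint8_t  is_rmp_initialized:1;
	uint8_t  rsvd:7;
	uint32_t build_id;
	uint32_t mask_chip_id:1;
	uint32_t mask_chip_key:1;
	uint32_t vlek_en:1;
	uint32_t rsvd1:29;
	uint32_t guest_count;
	uint64_t current_tcb_version;
	uint64_t reported_tcb_version;
} __attribute__((packed));

/* Copy a raw status buffer into the mirror struct.
 * Returns -1 if the buffer is shorter than the structure. */
static int sketch_snp_status_from_bytes(struct sketch_snp_status *st,
					const uint8_t *raw, size_t len)
{
	if (len < sizeof(*st))
		return -1;

	memcpy(st, raw, sizeof(*st));
	return 0;
}
```

Using memcpy() rather than a pointer cast avoids alignment and aliasing problems when the raw buffer comes from an arbitrary byte array.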
/**
* struct sev_user_data_snp_config - system wide configuration value for SNP.
*
* @reported_tcb: the TCB version to report in the guest attestation report.
* @mask_chip_id: whether chip id is present in attestation reports or not
* @mask_chip_key: whether attestation reports are signed or not
* @rsvd: reserved
* @rsvd1: reserved
*/
struct sev_user_data_snp_config {
__u64 reported_tcb ; /* In */
__u32 mask_chip_id:1; /* In */
__u32 mask_chip_key:1; /* In */
__u32 rsvd:30; /* In */
__u8 rsvd1[52];
} __packed;
/**
* struct sev_issue_cmd - SEV ioctl parameters
*
......
......@@ -444,6 +444,7 @@
#define X86_FEATURE_SEV (19*32+ 1) /* AMD Secure Encrypted Virtualization */
#define X86_FEATURE_VM_PAGE_FLUSH (19*32+ 2) /* "" VM Page Flush MSR is supported */
#define X86_FEATURE_SEV_ES (19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */
#define X86_FEATURE_SEV_SNP (19*32+ 4) /* AMD Secure Encrypted Virtualization - Secure Nested Paging */
#define X86_FEATURE_V_TSC_AUX (19*32+ 9) /* "" Virtual TSC_AUX */
#define X86_FEATURE_SME_COHERENT (19*32+10) /* "" AMD hardware-enforced cache coherency */
#define X86_FEATURE_DEBUG_SWAP (19*32+14) /* AMD SEV-ES full debug state swap support */
......
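Each X86_FEATURE_* value above is written as word*32 + bit, so X86_FEATURE_SEV_SNP is bit 4 of capability word 19. The sketch below shows the arithmetic for splitting such a value and testing it against an array of capability words. The sketch_* names and the plain uint32_t array are assumptions for illustration and do not reproduce the kernel's own cpufeature helpers.

```c
#include <stdint.h>

#define SKETCH_X86_FEATURE_SEV_SNP	(19*32 + 4)	/* value from the hunk above */

/* Index of the 32-bit capability word that holds the feature. */
static unsigned int sketch_feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the feature inside its capability word. */
static unsigned int sketch_feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Test a feature against a caller-supplied array of capability words. */
static int sketch_has_feature(const uint32_t *words, unsigned int feature)
{
	return (words[sketch_feature_word(feature)] >>
		sketch_feature_bit(feature)) & 1u;
}
```

With this encoding, enabling a feature flag is a single bit set in the matching word, for example `words[19] |= 1u << 4` for SEV-SNP.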