Commit 138bcddb authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'x86_bugs_srso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86/srso fixes from Borislav Petkov:
 "Add a mitigation for the speculative RAS (Return Address Stack)
  overflow vulnerability on AMD processors.

  In short, this is yet another issue where userspace poisons a
  microarchitectural structure which can then be used to leak privileged
  information through a side channel"

* tag 'x86_bugs_srso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/srso: Tie SBPB bit setting to microcode patch detection
  x86/srso: Add a forgotten NOENDBR annotation
  x86/srso: Fix return thunks in generated code
  x86/srso: Add IBPB on VMEXIT
  x86/srso: Add IBPB
  x86/srso: Add SRSO_NO support
  x86/srso: Add IBPB_BRTYPE support
  x86/srso: Add a Speculative RAS Overflow mitigation
  x86/bugs: Increase the x86 bugs vector size to two u32s
parents 14f9643d 5a15d834
......@@ -19,3 +19,4 @@ are configurable at compile, boot or run time.
l1d_flush.rst
processor_mmio_stale_data.rst
cross-thread-rsb.rst
srso
.. SPDX-License-Identifier: GPL-2.0
Speculative Return Stack Overflow (SRSO)
========================================
This is a mitigation for the speculative return stack overflow (SRSO)
vulnerability found on AMD processors. The mechanism is by now the well
known scenario of poisoning CPU functional units - the Branch Target
Buffer (BTB) and Return Address Predictor (RAP) in this case - and then
tricking the elevated privilege domain (the kernel) into leaking
sensitive data.
AMD CPUs predict RET instructions using a Return Address Predictor (aka
Return Address Stack/Return Stack Buffer). In some cases, a non-architectural
CALL instruction (i.e., an instruction predicted to be a CALL but is
not actually a CALL) can create an entry in the RAP which may be used
to predict the target of a subsequent RET instruction.
The specific circumstances that lead to this varies by microarchitecture
but the concern is that an attacker can mis-train the CPU BTB to predict
non-architectural CALL instructions in kernel space and use this to
control the speculative target of a subsequent kernel RET, potentially
leading to information disclosure via a speculative side-channel.
The issue is tracked under CVE-2023-20569.
Affected processors
-------------------
AMD Zen, generations 1-4. That is, all families 0x17 and 0x19. Older
processors have not been investigated.
System information and options
------------------------------
First of all, it is required that the latest microcode be loaded for
mitigations to be effective.
The sysfs file showing SRSO mitigation status is:
/sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
The possible values in this file are:
- 'Not affected' The processor is not vulnerable
- 'Vulnerable: no microcode' The processor is vulnerable, no
microcode extending IBPB functionality
to address the vulnerability has been
applied.
- 'Mitigation: microcode' Extended IBPB functionality microcode
patch has been applied. It does not
address User->Kernel and Guest->Host
transitions protection but it does
address User->User and VM->VM attack
vectors.
(spec_rstack_overflow=microcode)
- 'Mitigation: safe RET' Software-only mitigation. It complements
the extended IBPB microcode patch
functionality by addressing User->Kernel
and Guest->Host transitions protection.
Selected by default or by
spec_rstack_overflow=safe-ret
- 'Mitigation: IBPB' Similar protection as "safe RET" above
but employs an IBPB barrier on privilege
domain crossings (User->Kernel,
Guest->Host).
(spec_rstack_overflow=ibpb)
- 'Mitigation: IBPB on VMEXIT' Mitigation addressing the cloud provider
scenario - the Guest->Host transitions
only.
(spec_rstack_overflow=ibpb-vmexit)
In order to exploit vulnerability, an attacker needs to:
- gain local access on the machine
- break kASLR
- find gadgets in the running kernel in order to use them in the exploit
- potentially create and pin an additional workload on the sibling
thread, depending on the microarchitecture (not necessary on fam 0x19)
- run the exploit
Considering the performance implications of each mitigation type, the
default one is 'Mitigation: safe RET' which should take care of most
attack vectors, including the local User->Kernel one.
As always, the user is advised to keep her/his system up-to-date by
applying software updates regularly.
The default setting will be reevaluated when needed and especially when
new attack vectors appear.
As one can surmise, 'Mitigation: safe RET' does come at the cost of some
performance depending on the workload. If one trusts her/his userspace
and does not want to suffer the performance impact, one can always
disable the mitigation with spec_rstack_overflow=off.
Similarly, 'Mitigation: IBPB' is another full mitigation type employing
an indrect branch prediction barrier after having applied the required
microcode patch for one's system. This mitigation comes also at
a performance cost.
Mitigation: safe RET
--------------------
The mitigation works by ensuring all RET instructions speculate to
a controlled location, similar to how speculation is controlled in the
retpoline sequence. To accomplish this, the __x86_return_thunk forces
the CPU to mispredict every function return using a 'safe return'
sequence.
To ensure the safety of this mitigation, the kernel must ensure that the
safe return sequence is itself free from attacker interference. In Zen3
and Zen4, this is accomplished by creating a BTB alias between the
untraining function srso_untrain_ret_alias() and the safe return
function srso_safe_ret_alias() which results in evicting a potentially
poisoned BTB entry and using that safe one for all function returns.
In older Zen1 and Zen2, this is accomplished using a reinterpretation
technique similar to Retbleed one: srso_untrain_ret() and
srso_safe_ret().
......@@ -5875,6 +5875,17 @@
Not specifying this option is equivalent to
spectre_v2_user=auto.
spec_rstack_overflow=
[X86] Control RAS overflow mitigation on AMD Zen CPUs
off - Disable mitigation
microcode - Enable microcode mitigation only
safe-ret - Enable sw-only safe RET mitigation (default)
ibpb - Enable mitigation by issuing IBPB on
kernel entry
ibpb-vmexit - Issue IBPB only on VMEXIT
(cloud-specific mitigation)
spec_store_bypass_disable=
[HW] Control Speculative Store Bypass (SSB) Disable mitigation
(Speculative Store Bypass vulnerability)
......
......@@ -2593,6 +2593,13 @@ config CPU_IBRS_ENTRY
This mitigates both spectre_v2 and retbleed at great cost to
performance.
config CPU_SRSO
bool "Mitigate speculative RAS overflow on AMD"
depends on CPU_SUP_AMD && X86_64 && RETHUNK
default y
help
Enable the SRSO mitigation needed on AMD Zen1-4 machines.
config SLS
bool "Mitigate Straight-Line-Speculation"
depends on CC_HAS_SLS && X86_64
......
......@@ -14,7 +14,7 @@
* Defines x86 CPU feature bits
*/
#define NCAPINTS 21 /* N 32-bit words worth of info */
#define NBUGINTS 1 /* N 32-bit bug flags */
#define NBUGINTS 2 /* N 32-bit bug flags */
/*
* Note: If the comment begins with a quoted string, that string is used
......@@ -309,6 +309,10 @@
#define X86_FEATURE_SMBA (11*32+21) /* "" Slow Memory Bandwidth Allocation */
#define X86_FEATURE_BMEC (11*32+22) /* "" Bandwidth Monitoring Event Configuration */
#define X86_FEATURE_SRSO (11*32+24) /* "" AMD BTB untrain RETs */
#define X86_FEATURE_SRSO_ALIAS (11*32+25) /* "" AMD BTB untrain RETs through aliasing */
#define X86_FEATURE_IBPB_ON_VMEXIT (11*32+26) /* "" Issue an IBPB only on VMEXIT */
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
#define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */
......@@ -442,6 +446,10 @@
#define X86_FEATURE_AUTOIBRS (20*32+ 8) /* "" Automatic IBRS */
#define X86_FEATURE_NO_SMM_CTL_MSR (20*32+ 9) /* "" SMM_CTL MSR is not present */
#define X86_FEATURE_SBPB (20*32+27) /* "" Selective Branch Prediction Barrier */
#define X86_FEATURE_IBPB_BRTYPE (20*32+28) /* "" MSR_PRED_CMD[IBPB] flushes all branch type predictions */
#define X86_FEATURE_SRSO_NO (20*32+29) /* "" CPU is not affected by SRSO */
/*
* BUG word(s)
*/
......@@ -484,4 +492,6 @@
#define X86_BUG_EIBRS_PBRSB X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
#define X86_BUG_SMT_RSB X86_BUG(29) /* CPU is vulnerable to Cross-Thread Return Address Predictions */
/* BUG word 2 */
#define X86_BUG_SRSO X86_BUG(1*32 + 0) /* AMD SRSO bug */
#endif /* _ASM_X86_CPUFEATURES_H */
......@@ -57,6 +57,7 @@
#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
#define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */
#define PRED_CMD_SBPB BIT(7) /* Selective Branch Prediction Barrier */
#define MSR_PPIN_CTL 0x0000004e
#define MSR_PPIN 0x0000004f
......
......@@ -211,7 +211,8 @@
* eventually turn into it's own annotation.
*/
.macro VALIDATE_UNRET_END
#if defined(CONFIG_NOINSTR_VALIDATION) && defined(CONFIG_CPU_UNRET_ENTRY)
#if defined(CONFIG_NOINSTR_VALIDATION) && \
(defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_SRSO))
ANNOTATE_RETPOLINE_SAFE
nop
#endif
......@@ -289,13 +290,18 @@
*/
.macro UNTRAIN_RET
#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
defined(CONFIG_CALL_DEPTH_TRACKING)
defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
VALIDATE_UNRET_END
ALTERNATIVE_3 "", \
CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET, \
"call entry_ibpb", X86_FEATURE_ENTRY_IBPB, \
__stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
#endif
#ifdef CONFIG_CPU_SRSO
ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
"call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
#endif
.endm
.macro UNTRAIN_RET_FROM_CALL
......@@ -307,6 +313,11 @@
"call entry_ibpb", X86_FEATURE_ENTRY_IBPB, \
__stringify(RESET_CALL_DEPTH_FROM_CALL), X86_FEATURE_CALL_DEPTH
#endif
#ifdef CONFIG_CPU_SRSO
ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
"call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
#endif
.endm
......@@ -332,6 +343,8 @@ extern retpoline_thunk_t __x86_indirect_jump_thunk_array[];
extern void __x86_return_thunk(void);
extern void zen_untrain_ret(void);
extern void srso_untrain_ret(void);
extern void srso_untrain_ret_alias(void);
extern void entry_ibpb(void);
#ifdef CONFIG_CALL_THUNKS
......@@ -479,11 +492,11 @@ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature)
: "memory");
}
extern u64 x86_pred_cmd;
static inline void indirect_branch_prediction_barrier(void)
{
u64 val = PRED_CMD_IBPB;
alternative_msr_write(MSR_IA32_PRED_CMD, val, X86_FEATURE_USE_IBPB);
alternative_msr_write(MSR_IA32_PRED_CMD, x86_pred_cmd, X86_FEATURE_USE_IBPB);
}
/* The Intel SPEC CTRL MSR base value cache */
......
......@@ -682,9 +682,11 @@ extern u16 get_llc_id(unsigned int cpu);
#ifdef CONFIG_CPU_SUP_AMD
extern u32 amd_get_nodes_per_socket(void);
extern u32 amd_get_highest_perf(void);
extern bool cpu_has_ibpb_brtype_microcode(void);
#else
static inline u32 amd_get_nodes_per_socket(void) { return 0; }
static inline u32 amd_get_highest_perf(void) { return 0; }
static inline bool cpu_has_ibpb_brtype_microcode(void) { return false; }
#endif
extern unsigned long arch_align_stack(unsigned long sp);
......
......@@ -1290,3 +1290,22 @@ void amd_check_microcode(void)
{
on_each_cpu(zenbleed_check_cpu, NULL, 1);
}
bool cpu_has_ibpb_brtype_microcode(void)
{
switch (boot_cpu_data.x86) {
/* Zen1/2 IBPB flushes branch type predictions too. */
case 0x17:
return boot_cpu_has(X86_FEATURE_AMD_IBPB);
case 0x19:
/* Poke the MSR bit on Zen3/4 to check its presence. */
if (!wrmsrl_safe(MSR_IA32_PRED_CMD, PRED_CMD_SBPB)) {
setup_force_cpu_cap(X86_FEATURE_SBPB);
return true;
} else {
return false;
}
default:
return false;
}
}
......@@ -47,6 +47,7 @@ static void __init taa_select_mitigation(void);
static void __init mmio_select_mitigation(void);
static void __init srbds_select_mitigation(void);
static void __init l1d_flush_select_mitigation(void);
static void __init srso_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR without task-specific bits set */
u64 x86_spec_ctrl_base;
......@@ -56,6 +57,9 @@ EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
EXPORT_SYMBOL_GPL(x86_spec_ctrl_current);
u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
EXPORT_SYMBOL_GPL(x86_pred_cmd);
static DEFINE_MUTEX(spec_ctrl_mutex);
/* Update SPEC_CTRL MSR and its cached copy unconditionally */
......@@ -160,6 +164,7 @@ void __init cpu_select_mitigations(void)
md_clear_select_mitigation();
srbds_select_mitigation();
l1d_flush_select_mitigation();
srso_select_mitigation();
}
/*
......@@ -2187,6 +2192,165 @@ static int __init l1tf_cmdline(char *str)
}
early_param("l1tf", l1tf_cmdline);
#undef pr_fmt
#define pr_fmt(fmt) "Speculative Return Stack Overflow: " fmt
enum srso_mitigation {
SRSO_MITIGATION_NONE,
SRSO_MITIGATION_MICROCODE,
SRSO_MITIGATION_SAFE_RET,
SRSO_MITIGATION_IBPB,
SRSO_MITIGATION_IBPB_ON_VMEXIT,
};
enum srso_mitigation_cmd {
SRSO_CMD_OFF,
SRSO_CMD_MICROCODE,
SRSO_CMD_SAFE_RET,
SRSO_CMD_IBPB,
SRSO_CMD_IBPB_ON_VMEXIT,
};
static const char * const srso_strings[] = {
[SRSO_MITIGATION_NONE] = "Vulnerable",
[SRSO_MITIGATION_MICROCODE] = "Mitigation: microcode",
[SRSO_MITIGATION_SAFE_RET] = "Mitigation: safe RET",
[SRSO_MITIGATION_IBPB] = "Mitigation: IBPB",
[SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
};
static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
static enum srso_mitigation_cmd srso_cmd __ro_after_init = SRSO_CMD_SAFE_RET;
static int __init srso_parse_cmdline(char *str)
{
if (!str)
return -EINVAL;
if (!strcmp(str, "off"))
srso_cmd = SRSO_CMD_OFF;
else if (!strcmp(str, "microcode"))
srso_cmd = SRSO_CMD_MICROCODE;
else if (!strcmp(str, "safe-ret"))
srso_cmd = SRSO_CMD_SAFE_RET;
else if (!strcmp(str, "ibpb"))
srso_cmd = SRSO_CMD_IBPB;
else if (!strcmp(str, "ibpb-vmexit"))
srso_cmd = SRSO_CMD_IBPB_ON_VMEXIT;
else
pr_err("Ignoring unknown SRSO option (%s).", str);
return 0;
}
early_param("spec_rstack_overflow", srso_parse_cmdline);
#define SRSO_NOTICE "WARNING: See https://kernel.org/doc/html/latest/admin-guide/hw-vuln/srso.html for mitigation options."
static void __init srso_select_mitigation(void)
{
bool has_microcode;
if (!boot_cpu_has_bug(X86_BUG_SRSO) || cpu_mitigations_off())
goto pred_cmd;
/*
* The first check is for the kernel running as a guest in order
* for guests to verify whether IBPB is a viable mitigation.
*/
has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
if (!has_microcode) {
pr_warn("IBPB-extending microcode not applied!\n");
pr_warn(SRSO_NOTICE);
} else {
/*
* Enable the synthetic (even if in a real CPUID leaf)
* flags for guests.
*/
setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
/*
* Zen1/2 with SMT off aren't vulnerable after the right
* IBPB microcode has been applied.
*/
if ((boot_cpu_data.x86 < 0x19) &&
(!cpu_smt_possible() || (cpu_smt_control == CPU_SMT_DISABLED)))
setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
}
if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
if (has_microcode) {
pr_err("Retbleed IBPB mitigation enabled, using same for SRSO\n");
srso_mitigation = SRSO_MITIGATION_IBPB;
goto pred_cmd;
}
}
switch (srso_cmd) {
case SRSO_CMD_OFF:
return;
case SRSO_CMD_MICROCODE:
if (has_microcode) {
srso_mitigation = SRSO_MITIGATION_MICROCODE;
pr_warn(SRSO_NOTICE);
}
break;
case SRSO_CMD_SAFE_RET:
if (IS_ENABLED(CONFIG_CPU_SRSO)) {
/*
* Enable the return thunk for generated code
* like ftrace, static_call, etc.
*/
setup_force_cpu_cap(X86_FEATURE_RETHUNK);
if (boot_cpu_data.x86 == 0x19)
setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
else
setup_force_cpu_cap(X86_FEATURE_SRSO);
srso_mitigation = SRSO_MITIGATION_SAFE_RET;
} else {
pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
goto pred_cmd;
}
break;
case SRSO_CMD_IBPB:
if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY)) {
if (has_microcode) {
setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
srso_mitigation = SRSO_MITIGATION_IBPB;
}
} else {
pr_err("WARNING: kernel not compiled with CPU_IBPB_ENTRY.\n");
goto pred_cmd;
}
break;
case SRSO_CMD_IBPB_ON_VMEXIT:
if (IS_ENABLED(CONFIG_CPU_SRSO)) {
if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
srso_mitigation = SRSO_MITIGATION_IBPB_ON_VMEXIT;
}
} else {
pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
goto pred_cmd;
}
break;
default:
break;
}
pr_info("%s%s\n", srso_strings[srso_mitigation], (has_microcode ? "" : ", no microcode"));
pred_cmd:
if ((boot_cpu_has(X86_FEATURE_SRSO_NO) || srso_cmd == SRSO_CMD_OFF) &&
boot_cpu_has(X86_FEATURE_SBPB))
x86_pred_cmd = PRED_CMD_SBPB;
}
#undef pr_fmt
#define pr_fmt(fmt) fmt
......@@ -2385,6 +2549,13 @@ static ssize_t retbleed_show_state(char *buf)
return sysfs_emit(buf, "%s\n", retbleed_strings[retbleed_mitigation]);
}
static ssize_t srso_show_state(char *buf)
{
return sysfs_emit(buf, "%s%s\n",
srso_strings[srso_mitigation],
(cpu_has_ibpb_brtype_microcode() ? "" : ", no microcode"));
}
static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
char *buf, unsigned int bug)
{
......@@ -2434,6 +2605,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
case X86_BUG_RETBLEED:
return retbleed_show_state(buf);
case X86_BUG_SRSO:
return srso_show_state(buf);
default:
break;
}
......@@ -2498,4 +2672,9 @@ ssize_t cpu_show_retbleed(struct device *dev, struct device_attribute *attr, cha
{
return cpu_show_common(dev, attr, buf, X86_BUG_RETBLEED);
}
ssize_t cpu_show_spec_rstack_overflow(struct device *dev, struct device_attribute *attr, char *buf)
{
return cpu_show_common(dev, attr, buf, X86_BUG_SRSO);
}
#endif
......@@ -1250,6 +1250,8 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
#define RETBLEED BIT(3)
/* CPU is affected by SMT (cross-thread) return predictions */
#define SMT_RSB BIT(4)
/* CPU is affected by SRSO */
#define SRSO BIT(5)
static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS),
......@@ -1281,8 +1283,9 @@ static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_AMD(0x15, RETBLEED),
VULNBL_AMD(0x16, RETBLEED),
VULNBL_AMD(0x17, RETBLEED | SMT_RSB),
VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO),
VULNBL_HYGON(0x18, RETBLEED | SMT_RSB),
VULNBL_AMD(0x19, SRSO),
{}
};
......@@ -1406,6 +1409,11 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
if (cpu_matches(cpu_vuln_blacklist, SMT_RSB))
setup_force_cpu_bug(X86_BUG_SMT_RSB);
if (!cpu_has(c, X86_FEATURE_SRSO_NO)) {
if (cpu_matches(cpu_vuln_blacklist, SRSO))
setup_force_cpu_bug(X86_BUG_SRSO);
}
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
return;
......
......@@ -134,13 +134,27 @@ SECTIONS
SOFTIRQENTRY_TEXT
#ifdef CONFIG_RETPOLINE
__indirect_thunk_start = .;
*(.text.__x86.*)
*(.text.__x86.indirect_thunk)
*(.text.__x86.return_thunk)
__indirect_thunk_end = .;
#endif
STATIC_CALL_TEXT
ALIGN_ENTRY_TEXT_BEGIN
#ifdef CONFIG_CPU_SRSO
*(.text.__x86.rethunk_untrain)
#endif
ENTRY_TEXT
#ifdef CONFIG_CPU_SRSO
/*
* See the comment above srso_untrain_ret_alias()'s
* definition.
*/
. = srso_untrain_ret_alias | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
*(.text.__x86.rethunk_safe)
#endif
ALIGN_ENTRY_TEXT_END
*(.gnu.warning)
......@@ -509,7 +523,18 @@ INIT_PER_CPU(irq_stack_backing_store);
#endif
#ifdef CONFIG_RETHUNK
. = ASSERT((__x86_return_thunk & 0x3f) == 0, "__x86_return_thunk not cacheline-aligned");
. = ASSERT((__ret & 0x3f) == 0, "__ret not cacheline-aligned");
. = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
#endif
#ifdef CONFIG_CPU_SRSO
/*
* GNU ld cannot do XOR so do: (A | B) - (A & B) in order to compute the XOR
* of the two function addresses:
*/
. = ASSERT(((srso_untrain_ret_alias | srso_safe_ret_alias) -
(srso_untrain_ret_alias & srso_safe_ret_alias)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
"SRSO function pair won't alias");
#endif
#endif /* CONFIG_X86_64 */
......@@ -729,6 +729,9 @@ void kvm_set_cpu_caps(void)
F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */
);
if (cpu_feature_enabled(X86_FEATURE_SRSO_NO))
kvm_cpu_cap_set(X86_FEATURE_SRSO_NO);
kvm_cpu_cap_init_kvm_defined(CPUID_8000_0022_EAX,
F(PERFMON_V2)
);
......
......@@ -1498,6 +1498,8 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (sd->current_vmcb != svm->vmcb) {
sd->current_vmcb = svm->vmcb;
if (!cpu_feature_enabled(X86_FEATURE_IBPB_ON_VMEXIT))
indirect_branch_prediction_barrier();
}
if (kvm_vcpu_apicv_active(vcpu))
......
......@@ -224,6 +224,9 @@ SYM_FUNC_START(__svm_vcpu_run)
*/
UNTRAIN_RET
/* SRSO */
ALTERNATIVE "", "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT
/*
* Clear all general purpose registers except RSP and RAX to prevent
* speculative use of the guest's values, even those that are reloaded
......
......@@ -11,6 +11,7 @@
#include <asm/unwind_hints.h>
#include <asm/percpu.h>
#include <asm/frame.h>
#include <asm/nops.h>
.section .text.__x86.indirect_thunk
......@@ -131,6 +132,46 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
*/
#ifdef CONFIG_RETHUNK
/*
* srso_untrain_ret_alias() and srso_safe_ret_alias() are placed at
* special addresses:
*
* - srso_untrain_ret_alias() is 2M aligned
* - srso_safe_ret_alias() is also in the same 2M page but bits 2, 8, 14
* and 20 in its virtual address are set (while those bits in the
* srso_untrain_ret_alias() function are cleared).
*
* This guarantees that those two addresses will alias in the branch
* target buffer of Zen3/4 generations, leading to any potential
* poisoned entries at that BTB slot to get evicted.
*
* As a result, srso_safe_ret_alias() becomes a safe return.
*/
#ifdef CONFIG_CPU_SRSO
.section .text.__x86.rethunk_untrain
SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
ANNOTATE_NOENDBR
ASM_NOP2
lfence
jmp __x86_return_thunk
SYM_FUNC_END(srso_untrain_ret_alias)
__EXPORT_THUNK(srso_untrain_ret_alias)
.section .text.__x86.rethunk_safe
#endif
/* Needs a definition for the __x86_return_thunk alternative below. */
SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
#ifdef CONFIG_CPU_SRSO
add $8, %_ASM_SP
UNWIND_HINT_FUNC
#endif
ANNOTATE_UNRET_SAFE
ret
int3
SYM_FUNC_END(srso_safe_ret_alias)
.section .text.__x86.return_thunk
/*
......@@ -143,7 +184,7 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
* from re-poisioning the BTB prediction.
*/
.align 64
.skip 64 - (__x86_return_thunk - zen_untrain_ret), 0xcc
.skip 64 - (__ret - zen_untrain_ret), 0xcc
SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
ANNOTATE_NOENDBR
/*
......@@ -175,10 +216,10 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
* evicted, __x86_return_thunk will suffer Straight Line Speculation
* which will be contained safely by the INT3.
*/
SYM_INNER_LABEL(__x86_return_thunk, SYM_L_GLOBAL)
SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
ret
int3
SYM_CODE_END(__x86_return_thunk)
SYM_CODE_END(__ret)
/*
* Ensure the TEST decoding / BTB invalidation is complete.
......@@ -189,11 +230,45 @@ SYM_CODE_END(__x86_return_thunk)
* Jump back and execute the RET in the middle of the TEST instruction.
* INT3 is for SLS protection.
*/
jmp __x86_return_thunk
jmp __ret
int3
SYM_FUNC_END(zen_untrain_ret)
__EXPORT_THUNK(zen_untrain_ret)
/*
* SRSO untraining sequence for Zen1/2, similar to zen_untrain_ret()
* above. On kernel entry, srso_untrain_ret() is executed which is a
*
* movabs $0xccccccc308c48348,%rax
*
* and when the return thunk executes the inner label srso_safe_ret()
* later, it is a stack manipulation and a RET which is mispredicted and
* thus a "safe" one to use.
*/
.align 64
.skip 64 - (srso_safe_ret - srso_untrain_ret), 0xcc
SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
ANNOTATE_NOENDBR
.byte 0x48, 0xb8
SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
add $8, %_ASM_SP
ret
int3
int3
int3
lfence
call srso_safe_ret
int3
SYM_CODE_END(srso_safe_ret)
SYM_FUNC_END(srso_untrain_ret)
__EXPORT_THUNK(srso_untrain_ret)
SYM_FUNC_START(__x86_return_thunk)
ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
int3
SYM_CODE_END(__x86_return_thunk)
EXPORT_SYMBOL(__x86_return_thunk)
#endif /* CONFIG_RETHUNK */
......
......@@ -577,6 +577,12 @@ ssize_t __weak cpu_show_retbleed(struct device *dev,
return sysfs_emit(buf, "Not affected\n");
}
ssize_t __weak cpu_show_spec_rstack_overflow(struct device *dev,
struct device_attribute *attr, char *buf)
{
return sysfs_emit(buf, "Not affected\n");
}
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
......@@ -588,6 +594,7 @@ static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL);
static DEVICE_ATTR(srbds, 0444, cpu_show_srbds, NULL);
static DEVICE_ATTR(mmio_stale_data, 0444, cpu_show_mmio_stale_data, NULL);
static DEVICE_ATTR(retbleed, 0444, cpu_show_retbleed, NULL);
static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_meltdown.attr,
......@@ -601,6 +608,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_srbds.attr,
&dev_attr_mmio_stale_data.attr,
&dev_attr_retbleed.attr,
&dev_attr_spec_rstack_overflow.attr,
NULL
};
......
......@@ -70,6 +70,8 @@ extern ssize_t cpu_show_mmio_stale_data(struct device *dev,
char *buf);
extern ssize_t cpu_show_retbleed(struct device *dev,
struct device_attribute *attr, char *buf);
extern ssize_t cpu_show_spec_rstack_overflow(struct device *dev,
struct device_attribute *attr, char *buf);
extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,
......
......@@ -14,7 +14,7 @@
* Defines x86 CPU feature bits
*/
#define NCAPINTS 21 /* N 32-bit words worth of info */
#define NBUGINTS 1 /* N 32-bit bug flags */
#define NBUGINTS 2 /* N 32-bit bug flags */
/*
* Note: If the comment begins with a quoted string, that string is used
......
......@@ -824,5 +824,8 @@ bool arch_is_retpoline(struct symbol *sym)
bool arch_is_rethunk(struct symbol *sym)
{
return !strcmp(sym->name, "__x86_return_thunk");
return !strcmp(sym->name, "__x86_return_thunk") ||
!strcmp(sym->name, "srso_untrain_ret") ||
!strcmp(sym->name, "srso_safe_ret") ||
!strcmp(sym->name, "__ret");
}
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment