- 30 Sep, 2020 12 commits
-
-
Marc Zyngier authored
Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
Signed-off-by: Marc Zyngier <maz@kernel.org>
-
David Brazdil authored
With all nVHE per-CPU variables being part of the hyp per-CPU region, mapping them individual is not necessary any longer. They are mapped to hyp as part of the overall per-CPU region. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Andrew Scull <ascull@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-11-dbrazdil@google.com
-
David Brazdil authored
Add hyp percpu section to linker script and rename the corresponding ELF sections of hyp/nvhe object files. This moves all nVHE-specific percpu variables to the new hyp percpu section. Allocate sufficient amount of memory for all percpu hyp regions at global KVM init time and create corresponding hyp mappings. The base addresses of hyp percpu regions are kept in a dynamically allocated array in the kernel. Add NULL checks in PMU event-reset code as it may run before KVM memory is initialized. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-10-dbrazdil@google.com
-
David Brazdil authored
Host CPU context is stored in a global per-cpu variable `kvm_host_data`. In preparation for introducing independent per-CPU region for nVHE hyp, create two separate instances of `kvm_host_data`, one for VHE and one for nVHE. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-9-dbrazdil@google.com
-
David Brazdil authored
Hyp keeps track of which cores require SSBD callback by accessing a kernel-proper global variable. Create an nVHE symbol of the same name and copy the value from kernel proper to nVHE as KVM is being enabled on a core. Done in preparation for separating percpu memory owned by kernel proper and nVHE. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-8-dbrazdil@google.com
-
David Brazdil authored
Defining a per-CPU variable in hyp/nvhe will result in its name being prefixed with __kvm_nvhe_. Add helpers for declaring these variables in kernel proper and accessing them with this_cpu_ptr and per_cpu_ptr. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-7-dbrazdil@google.com
-
David Brazdil authored
The hyp_adr/ldr_this_cpu helpers were introduced for use in hyp code because they always needed to use TPIDR_EL2 for base, while adr/ldr_this_cpu from kernel proper would select between TPIDR_EL2 and _EL1 based on VHE/nVHE. Simplify this now that the hyp mode case can be handled using the __KVM_VHE/NVHE_HYPERVISOR__ macros. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Andrew Scull <ascull@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-6-dbrazdil@google.com
-
David Brazdil authored
this_cpu_ptr is meant for use in kernel proper because it selects between TPIDR_EL1/2 based on nVHE/VHE. __hyp_this_cpu_ptr was used in hyp to always select TPIDR_EL2. Unify all users behind this_cpu_ptr and friends by selecting _EL2 register under __KVM_NVHE_HYPERVISOR__. VHE continues selecting the register using alternatives. Under CONFIG_DEBUG_PREEMPT, the kernel helpers perform a preemption check which is omitted by the hyp helpers. Preserve the behavior for nVHE by overriding the corresponding macros under __KVM_NVHE_HYPERVISOR__. Extend the checks into VHE hyp code. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Andrew Scull <ascull@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-5-dbrazdil@google.com
-
David Brazdil authored
Minor cleanup that only creates __kvm_ex_table ELF section and related symbols if CONFIG_KVM is enabled. Also useful as more hyp-specific sections will be added. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-4-dbrazdil@google.com
-
David Brazdil authored
Minor cleanup to move all macros related to prefixing nVHE hyp section and symbol names into one place: hyp_image.h. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-3-dbrazdil@google.com
-
David Brazdil authored
Relying on objcopy to prefix the ELF section names of the nVHE hyp code is brittle and prevents us from using wildcards to match specific section names. Improve the build rules by partially linking all '.nvhe.o' files and prefixing their ELF section names using a linker script. Continue using objcopy for prefixing ELF symbol names. One immediate advantage of this approach is that all subsections matching a pattern can be merged into a single prefixed section, eg. .text and .text.* can be linked into a single '.hyp.text'. This removes the need for -fno-reorder-functions on GCC and will be useful in the future too: LTO builds use .text subsections, compilers routinely generate .rodata subsections, etc. Partially linking all hyp code into a single object file also makes it easier to analyze. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-2-dbrazdil@google.com
-
- 29 Sep, 2020 28 commits
-
-
Will Deacon authored
The PR_SPEC_DISABLE_NOEXEC option to the PR_SPEC_STORE_BYPASS prctl() allows the SSB mitigation to be enabled only until the next execve(), at which point the state will revert back to PR_SPEC_ENABLE and the mitigation will be disabled. Add support for PR_SPEC_DISABLE_NOEXEC on arm64. Reported-by: Anthony Steinhauser <asteinhauser@google.com> Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
The kbuild robot reports that we're relying on an implicit inclusion to get a definition of task_stack_page() in the Spectre-v4 mitigation code, which is not always in place for some configurations: | arch/arm64/kernel/proton-pack.c:329:2: error: implicit declaration of function 'task_stack_page' [-Werror,-Wimplicit-function-declaration] | task_pt_regs(task)->pstate |= val; | ^ | arch/arm64/include/asm/processor.h:268:36: note: expanded from macro 'task_pt_regs' | ((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1) | ^ | arch/arm64/kernel/proton-pack.c:329:2: note: did you mean 'task_spread_page'? Add the missing include to fix the build error. Fixes: a44acf477220 ("arm64: Move SSBD prctl() handler alongside other spectre mitigation code") Reported-by: Anthony Steinhauser <asteinhauser@google.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/202009260013.Ul7AD29w%lkp@intel.comSigned-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
Patching the EL2 exception vectors is integral to the Spectre-v2 workaround, where it can be necessary to execute CPU-specific sequences to nobble the branch predictor before running the hypervisor text proper. Remove the dependency on CONFIG_RANDOMIZE_BASE and allow the EL2 vectors to be patched even when KASLR is not enabled. Fixes: 7a132017e7a5 ("KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE") Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/202009221053.Jv1XsQUZ%lkp@intel.comSigned-off-by: Will Deacon <will@kernel.org>
-
Marc Zyngier authored
Out with the old ghost, in with the new... Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
-
Marc Zyngier authored
Convert the KVM WA2 code to using the Spectre infrastructure, making the code much more readable. It also allows us to take SSBS into account for the mitigation. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
-
Marc Zyngier authored
kvm_arm_have_ssbd() is now completely unused, get rid of it. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
-
Marc Zyngier authored
Owing to the fact that the host kernel is always mitigated, we can drastically simplify the WA2 handling by keeping the mitigation state ON when entering the guest. This means the guest is either unaffected or not mitigated. This results in a nice simplification of the mitigation space, and the removal of a lot of code that was never really used anyway. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
Rewrite the Spectre-v4 mitigation handling code to follow the same approach as that taken by Spectre-v2. For now, report to KVM that the system is vulnerable (by forcing 'ssbd_state' to ARM64_SSBD_UNKNOWN), as this will be cleared up in subsequent steps. Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
As part of the spectre consolidation effort to shift all of the ghosts into their own proton pack, move all of the horrible SSBD prctl() code out of its own 'ssbd.c' file. Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
In a similar manner to the renaming of ARM64_HARDEN_BRANCH_PREDICTOR to ARM64_SPECTRE_V2, rename ARM64_SSBD to ARM64_SPECTRE_V4. This isn't _entirely_ accurate, as we also need to take into account the interaction with SSBS, but that will be taken care of in subsequent patches. Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
If all CPUs discovered during boot have SSBS, then spectre-v4 will be considered to be "mitigated". However, we still allow late CPUs without SSBS to be onlined, albeit with a "SANITY CHECK" warning. This is problematic for userspace because it means that the system can quietly transition to "Vulnerable" at runtime. Avoid this by treating SSBS as a non-strict system feature: if all of the CPUs discovered during boot have SSBS, then late arriving secondaries better have it as well. Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
The is_ttbrX_addr() functions have somehow ended up in the middle of the start_thread() functions, so move them out of the way to keep the code readable. Signed-off-by: Will Deacon <will@kernel.org>
-
Marc Zyngier authored
If the system is not affected by Spectre-v2, then advertise to the KVM guest that it is not affected, without the need for a safelist in the guest. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
The Spectre-v2 mitigation code is pretty unwieldy and hard to maintain. This is largely due to it being written hastily, without much clue as to how things would pan out, and also because it ends up mixing policy and state in such a way that it is very difficult to figure out what's going on. Rewrite the Spectre-v2 mitigation so that it clearly separates state from policy and follows a more structured approach to handling the mitigation. Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
The spectre mitigation code is spread over a few different files, which makes it both hard to follow, but also hard to remove it should we want to do that in future. Introduce a new file for housing the spectre mitigations, and populate it with the spectre-v1 reporting code to start with. Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
For better or worse, the world knows about "Spectre" and not about "Branch predictor hardening". Rename ARM64_HARDEN_BRANCH_PREDICTOR to ARM64_SPECTRE_V2 as part of moving all of the Spectre mitigations into their own little corner. Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
Use is_hyp_mode_available() to detect whether or not we need to patch the KVM vectors for branch hardening, which avoids the need to take the vector pointers as parameters. Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
The removal of CONFIG_HARDEN_BRANCH_PREDICTOR means that CONFIG_KVM_INDIRECT_VECTORS is synonymous with CONFIG_RANDOMIZE_BASE, so replace it. Signed-off-by: Will Deacon <will@kernel.org>
-
Will Deacon authored
The spectre mitigations are too configurable for their own good, leading to confusing logic trying to figure out when we should mitigate and when we shouldn't. Although the plethora of command-line options need to stick around for backwards compatibility, the default-on CONFIG options that depend on EXPERT can be dropped, as the mitigations only do anything if the system is vulnerable, a mitigation is available and the command-line hasn't disabled it. Remove CONFIG_HARDEN_BRANCH_PREDICTOR and CONFIG_ARM64_SSBD in favour of enabling this code unconditionally. Signed-off-by: Will Deacon <will@kernel.org>
-
Marc Zyngier authored
Commit 606f8e7b ("arm64: capabilities: Use linear array for detection and verification") changed the way we deal with per-CPU errata by only calling the .matches() callback until one CPU is found to be affected. At this point, .matches() stop being called, and .cpu_enable() will be called on all CPUs. This breaks the ARCH_WORKAROUND_2 handling, as only a single CPU will be mitigated. In order to address this, forcefully call the .matches() callback from a .cpu_enable() callback, which brings us back to the original behaviour. Fixes: 606f8e7b ("arm64: capabilities: Use linear array for detection and verification") Cc: <stable@vger.kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
-
Marc Zyngier authored
Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Alexandru Elisei authored
Update the description of the PMU KVM_{GET, SET}_DEVICE_ATTR error codes to be a better match for the code that returns them. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Link: https://lore.kernel.org/r/20200924123731.268177-3-alexandru.elisei@arm.com
-
Alexandru Elisei authored
KVM_ARM_VCPU_PMU_V3_IRQ returns -EFAULT if get_user() fails when reading the interrupt number from kvm_device_attr.addr. KVM_ARM_VCPU_PMU_V3_INIT returns the error value from kvm_vgic_set_owner(). kvm_arm_pmu_v3_init() checks that the vgic has been initialized and the interrupt number is valid, but kvm_vgic_set_owner() can still return the error code -EEXIST if another device has already claimed the interrupt. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Link: https://lore.kernel.org/r/20200924123731.268177-2-alexandru.elisei@arm.com
-
Marc Zyngier authored
Add a small blurb describing how the event filtering API gets used. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
As we can now hide events from the guest, let's also adjust its view of PCMEID{0,1}_EL1 so that it can figure out why some common events are not counting as they should. The astute user can still look into the TRM for their CPU and find out they've been cheated, though. Nobody's perfect. Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
It can be desirable to expose a PMU to a guest, and yet not want the guest to be able to count some of the implemented events (because this would give information on shared resources, for example. For this, let's extend the PMUv3 device API, and offer a way to setup a bitmap of the allowed events (the default being no bitmap, and thus no filtering). Userspace can thus allow/deny ranges of event. The default policy depends on the "polarity" of the first filter setup (default deny if the filter allows events, and default allow if the filter denies events). This allows to setup exactly what is allowed for a given guest. Note that although the ioctl is per-vcpu, the map of allowed events is global to the VM (it can be setup from any vcpu until the vcpu PMU is initialized). Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
The PMU code suffers from a small defect where we assume that the event number provided by the guest is always 16 bit wide, even if the CPU only implements the ARMv8.0 architecture. This isn't really problematic in the sense that the event number ends up in a system register, cropping it to the right width, but still this needs fixing. In order to make it work, let's probe the version of the PMU that the guest is going to use. This is done by temporarily creating a kernel event and looking at the PMUVer field that has been saved at probe time in the associated arm_pmu structure. This in turn gets saved in the kvm structure, and subsequently used to compute the event mask that gets used throughout the PMU code. Signed-off-by: Marc Zyngier <maz@kernel.org>
-
Marc Zyngier authored
The PMU emulation error handling is pretty messy when dealing with attributes. Let's refactor it so that we have less duplication, and that it is easy to extend later on. A functional change is that kvm_arm_pmu_v3_init() used to return -ENXIO when the PMU feature wasn't set. The error is now reported as -ENODEV, matching the documentation. -ENXIO is still returned when the interrupt isn't properly configured. Signed-off-by: Marc Zyngier <maz@kernel.org>
-