- 06 Nov, 2019 40 commits
-
Paolo Bonzini authored
VMX already does so if the host has SMEP, in order to support the combination of CR0.WP=1 and CR4.SMEP=1. However, it is perfectly safe to always do so, and in fact VMX already ends up running with EFER.NXE=1 on old processors that lack the "load EFER" controls, because it may help avoid a slow MSR write. Removing all the conditionals simplifies the code.

SVM does not have similar code, but it should, since recent AMD processors do support SMEP. So this patch also makes the code for the two vendors more similar while fixing NPT=0, CR0.WP=1 and CR4.SMEP=1 on AMD processors.

Cc: stable@vger.kernel.org
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>

CVE-2018-12207

[tyhicks: Backport to 4.15
 - vmx.c is up one directory level]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
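For reference, the shape of the change on the SVM side amounts to unconditionally advertising NX in the EFER the guest actually runs with whenever shadow paging (NPT=0) is in use. This is only a sketch, assuming the usual 4.x svm_set_efer() body; the exact hunk in the backport may differ:

    static void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
    {
            vcpu->arch.efer = efer;

            if (!npt_enabled) {
                    /* Shadow paging assumes NX to be available. */
                    efer |= EFER_NX;

                    if (!(efer & EFER_LMA))
                            efer &= ~EFER_LME;
            }

            to_svm(vcpu)->vmcb->save.efer = efer | EFER_SVME;
            mark_dirty(to_svm(vcpu)->vmcb, VMCB_CR);
    }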
-
Paolo Bonzini authored
These are useful in debugging shadow paging.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(backported from commit 335e192a)
[tyhicks: Backport to 4.4
 - Continue to use pfn_t instead of kvm_pfn_t
 - Remove the use of shadow_present_mask in the kvm_mmu_set_spte trace point since we don't have commit ffb128c8 ("kvm: mmu: don't set the present bit unconditionally")
 - Open code is_executable_pte() in the kvm_mmu_set_spte() trace point since that function doesn't exist]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Paolo Bonzini authored
Note that in such a case it is quite likely that KVM will BUG_ON in __pte_list_remove when the VM is closed. However, there is no immediate risk of memory corruption in the host, so a WARN_ON is enough and it lets you gather traces for debugging.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit e9f2a760)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Paolo Bonzini authored
After the previous patch, the low bits of the gfn are masked in both FNAME(fetch) and __direct_map, so we do not need to clear them in transparent_hugepage_adjust.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(backported from commit d679b326)
[tyhicks: Backport to 4.4
 - Continue to use a pointer to pfn_t for the type of the pfnp parameter]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Paolo Bonzini authored
These two functions are basically doing the same thing through kvm_mmu_get_page, link_shadow_page and mmu_set_spte; yet, for historical reasons, their code looks very different. This patch tries to take the best of each and make them very similar, so that it is easy to understand changes that apply to both of them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(backported from commit 3fcf2d1b)
[tyhicks: Backport to 4.4
 - Minor context change due to mmu not being a pointer in the kvm_vcpu_arch struct
 - Continue to use pfn_t for the type of the pfn parameter of __direct_map()]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Junaid Shahid authored
Release the page at the call site where it was originally acquired. This makes the exit code cleaner for most call sites, since they do not need to duplicate code between success and the failure label.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(backported from commit 43fdcda9)
[tyhicks: Backport to 4.4
 - Adjust for differences in call sites of __direct_map() in mmu.c]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Junaid Shahid authored
It doesn't seem as if there is any particular need for kvm_lock to be a spinlock, so convert the lock to a mutex so that sleepable functions (in particular cond_resched()) can be called while holding it.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(backported from commit 0d9ce162)
[tyhicks: Backport to 4.4
 - kvm_hyperv_tsc_notifier() does not exist
 - Adjust for surrounding code changes in kvm-s390.c and kvm_main.c]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Peter Xu authored
It's never used. Drop it.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(backported from commit 42522d08)
[tyhicks: Backport to 4.4
 - Considerable context differences due to code changes but nothing too complex]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Paolo Bonzini authored
The x86 MMU is full of code that returns 0 and 1 for retry/emulate. Use the existing RET_MMIO_PF_RETRY/RET_MMIO_PF_EMULATE enum, renaming it to drop the MMIO part.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(backported from commit 9b8ebbdb)
[tyhicks: Backport to 4.4
 - Some conditionals that checked for tracked pages did not exist due to the lack of commit 3d0c27ad ("KVM: MMU: let page fault handler be aware tracked page")
 - lower_32_bits(error_code) is not used in kvm_mmu_page_fault() due to the lack of commit 14727754 ("kvm: svm: Add support for additional SVM NPF error codes")]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
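For readers unfamiliar with that enum, the end result is roughly the following set of shared return codes for the page fault handlers. This is a sketch of the renamed values only; the exact member list in the backport may differ:

    /* Return codes shared by the MMU page fault paths after the rename. */
    enum {
            RET_PF_RETRY = 0,       /* let the CPU fault again on the address */
            RET_PF_EMULATE = 1,     /* emulate the faulting instruction instead */
    };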
-
Paolo Bonzini authored
Calling handle_mmio_page_fault() has been unnecessary since commit e9ee956e ("KVM: x86: MMU: Move handle_mmio_page_fault() call to kvm_mmu_page_fault()", 2016-02-22). handle_mmio_page_fault() can now be made static.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

CVE-2018-12207
(backported from commit e08d26f0)
[tyhicks: Backport to 4.4
 - Minor context change in handle_ept_misconfig() due to missing commit db1c056c ("kvm: vmx: Use the hardware provided GPA instead of page walk")]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Takuya Yoshikawa authored
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit bb11c6c9)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Takuya Yoshikawa authored
Every time kvm_mmu_get_page() is called with a non-NULL parent_pte argument, link_shadow_page() follows that to set the parent entry so that the new mapping will point to the returned page table.

Moving parent_pte handling there allows us to clean up the code, because parent_pte is passed to kvm_mmu_get_page() just for mark_unsync() and mmu_page_add_parent_pte().

In addition, the patch avoids calling mark_unsync() for other parents in the sp->parent_ptes chain than the newly added parent_pte, because they have been there since before the current page fault handling started.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit 98bba238)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Paolo Bonzini authored
Commit 7a1638ce ("nEPT: Redefine EPT-specific link_shadow_page()", 2013-08-05) says:

    Since nEPT doesn't support A/D bit, we should not set those bit
    when building the shadow page table.

but this is not necessary. Even though nEPT doesn't support A/D bits, and hence the vmcs12 EPT pointer will never enable them, we always use them for shadow page tables if available (see construct_eptp in vmx.c). So we can set the A/D bits freely in the shadow page table.

This patch hence basically reverts commit 7a1638ce.

Cc: Yang Zhang <yang.z.zhang@Intel.com>
Cc: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit 0e3d0648)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Takuya Yoshikawa authored
Make kvm_mmu_alloc_page() do just what its name says, and remove the extra allocation error check and zero-initialization of parent_ptes: shadow page headers allocated by kmem_cache_zalloc() are always in the per-VCPU pools.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit 47005792)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Takuya Yoshikawa authored
mmu_set_spte()'s code is based on the assumption that the emulate parameter has a valid pointer value if set_spte() returns true and write_fault is not zero. In other cases, emulate may be NULL, so a NULL check is needed.

Stop passing the emulate pointer and make mmu_set_spte() return the emulate value instead, to clean up this complex interface. Prefetch functions can just throw away the return value.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit 029499b4)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Xiao Guangrong authored
Abstract the common operations from account_shadowed() and unaccount_shadowed(), then introduce kvm_mmu_gfn_disallow_lpage() and kvm_mmu_gfn_allow_lpage(). These two functions will be used by page tracking in a later patch.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit 547ffaed)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Xiao Guangrong authored
kvm_lpage_info->write_count is used to detect whether the large page mapping for the gfn on the specified level is allowed; rename it to disallow_lpage to reflect its purpose. We also rename has_wrprotected_page() to mmu_gfn_lpage_is_disallowed() to make the code clearer.

Later we will extend this mechanism for page tracking: if the gfn is tracked, then a large mapping for that gfn on any level is not allowed. The new name is more straightforward.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit 92f94f1e)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
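In data structure terms, the rename boils down to something like the following. This is a simplified sketch; the real struct lives in arch/x86/include/asm/kvm_host.h and may carry additional fields:

    /*
     * Formerly "int write_count"; a non-zero value now explicitly means
     * "a large page mapping at this level is disallowed for this gfn".
     */
    struct kvm_lpage_info {
            int disallow_lpage;
    };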
-
Takuya Yoshikawa authored
Rather than placing a handle_mmio_page_fault() call in each vcpu->arch.mmu.page_fault() handler, moving it up to kvm_mmu_page_fault() makes the code better:

 - avoids code duplication
 - for kvm_arch_async_page_ready(), which is the other caller of vcpu->arch.mmu.page_fault(), removes an extra error_code check
 - avoids returning both RET_MMIO_PF_* values and raw integer values from vcpu->arch.mmu.page_fault()

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit e9ee956e)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Takuya Yoshikawa authored
These two have only slight differences:

 - whether 'addr' is of type u64 or of type gva_t
 - whether they have a 'direct' parameter or not

Concerning the former, quickly_check_mmio_pf()'s u64 is better because 'addr' needs to be able to hold both a guest physical address and a guest virtual address. The latter is just a stylistic issue, as we can always calculate the mode from the 'vcpu' as is_mmio_page_fault() does. This patch keeps the parameter to make the following patch cleaner.

In addition, the patch renames the function to mmio_info_in_cache() to make it clear what it actually checks for.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit ded58749)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Takuya Yoshikawa authored
New struct kvm_rmap_head makes the code type-safe to some extent.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-12207
(cherry picked from commit 018aabb5)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
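The point of the change is that the per-gfn reverse-map slot stops being a bare unsigned long and gets a named wrapper type, roughly as below. This is a sketch of the idea rather than the exact upstream definition:

    /*
     * Wrapper for the per-gfn reverse-map slot. The low bit of 'val' still
     * encodes whether it points at a single spte or at a pte_list_desc
     * chain, but callers now pass a typed pointer instead of a raw
     * unsigned long *.
     */
    struct kvm_rmap_head {
            unsigned long val;
    };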
-
Tyler Hicks authored
Turn on CONFIG_X86_INTEL_TSX_MODE_OFF to disable Intel's Transactional Synchronization Extensions (TSX) feature by default. TSX can only be disabled on certain, newer processors that support the IA32_TSX_CTRL MSR via a microcode update. Intel says that future processors will also support the MSR. On processors that support the MSR, TSX will be disabled unless the system administrator overrides the configuration with the "tsx" kernel command line option.

CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Tyler Hicks authored
The linux-4.14.y backport of commit 286836a7 ("x86/cpu: Add a helper function x86_read_arch_cap_msr()") added a dependency on cpu.h from bugs.c, so include the header file from bugs.c.

CVE-2019-11135

Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Tyler Hicks authored
The linux-4.14.y backport of upstream commit 95c5824f ("x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default") incorrectly dropped the call to tsx_init(). Add the function call back to identify_boot_cpu().

CVE-2019-11135

Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Michal Hocko authored
commit db616173 upstream

There is a general consensus that TSX usage is not widespread, while history shows there is a non-trivial space for possible side channel attacks. Therefore TSX is disabled by default even on platforms that might have a safe implementation of TSX according to the current knowledge. This is a fair trade-off to make.

There are, however, workloads that really do benefit from using TSX, and updating to a newer kernel with TSX disabled might introduce a noticeable regression. This would especially be a problem for Linux distributions which will provide TAA mitigations.

Introduce config options X86_INTEL_TSX_MODE_OFF, X86_INTEL_TSX_MODE_ON and X86_INTEL_TSX_MODE_AUTO to control the TSX feature. The config setting can be overridden by the tsx= cmdline option.

[ bp: Text cleanups from Josh. ]

Suggested-by: Borislav Petkov <bpetkov@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

CVE-2019-11135
[tyhicks: Backport to 4.4
 - Minor context adjustment in arch/x86/Kconfig due to different surrounding Kconfig options]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
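When no tsx= option is given on the command line, the chosen Kconfig mode supplies the default. A minimal sketch of how that fallback can be expressed in tsx_init() follows; it assumes the tsx_ctrl_state variable from tsx.c, and x86_get_tsx_auto_mode() is the auto-policy helper sketched under the tsx=auto entry below. The exact upstream code differs slightly:

    ret = cmdline_find_option(boot_command_line, "tsx", arg, sizeof(arg));
    if (ret < 0) {
            /* No "tsx=" given: fall back to the Kconfig default. */
            if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_ON))
                    tsx_ctrl_state = TSX_CTRL_ENABLE;
            else if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_AUTO))
                    tsx_ctrl_state = x86_get_tsx_auto_mode();
            else /* CONFIG_X86_INTEL_TSX_MODE_OFF */
                    tsx_ctrl_state = TSX_CTRL_DISABLE;
    }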
-
Pawan Gupta authored
commit a7a248c5 upstream

Add the documentation for TSX Async Abort. Include the description of the issue, how to check the mitigation state, how to control the mitigation, and guidance for system administrators.

[ bp: Add proper SPDX tags, touch ups by Josh and me. ]

Co-developed-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

CVE-2019-11135
[tyhicks: Backport to 4.4
 - kernel-parameters.txt is up one directory level]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Pawan Gupta authored
commit 7531a359 upstream

Platforms which are not affected by X86_BUG_TAA may want the TSX feature enabled. Add an "auto" option to the TSX cmdline parameter. With tsx=auto, disable TSX when X86_BUG_TAA is present, otherwise enable TSX.

More details on X86_BUG_TAA can be found here:
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html

[ bp: Extend the arg buffer to accommodate "auto\0". ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

CVE-2019-11135
[tyhicks: Backport to 4.4
 - kernel-parameters.txt is up one directory level]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
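The auto policy itself is tiny; conceptually it is just the following. This is a hedged sketch of the helper, which upstream calls x86_get_tsx_auto_mode(), assuming the tsx_ctrl_states enum from tsx.c:

    static enum tsx_ctrl_states x86_get_tsx_auto_mode(void)
    {
            /* tsx=auto: keep TSX only on parts that are not TAA-affected. */
            if (boot_cpu_has_bug(X86_BUG_TAA))
                    return TSX_CTRL_DISABLE;

            return TSX_CTRL_ENABLE;
    }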
-
Pawan Gupta authored
commit e1d38b63 upstream

Export the IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0 to guests on TSX Async Abort (TAA) affected hosts that have TSX enabled and updated microcode. This is required so that the guests don't complain, "Vulnerable: Clear CPU buffers attempted, no microcode", when the host has the updated microcode to clear CPU buffers.

A microcode update also adds support for MSR_IA32_TSX_CTRL, which is enumerated by the ARCH_CAP_TSX_CTRL bit in the IA32_ARCH_CAPABILITIES MSR. Guests can't do this check themselves when the ARCH_CAP_TSX_CTRL bit is not exported to the guests. In this case, export MDS_NO=0 to the guests. When guests have CPUID.MD_CLEAR=1, they deploy the MDS mitigation, which also mitigates TAA.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Pawan Gupta authored
commit 6608b45a upstream

Add the sysfs reporting file for TSX Async Abort. It exposes the vulnerability and the mitigation state similar to the existing files for the other hardware vulnerabilities.

Sysfs file path is:
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
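Wiring such a file up follows the same pattern as the existing vulnerability entries in drivers/base/cpu.c. The following is a sketch of that pattern, not the literal patch: a weak default that architectures without a specific report fall back to, overridden by the real reporting code in arch/x86/kernel/cpu/bugs.c:

    ssize_t __weak cpu_show_tsx_async_abort(struct device *dev,
                                            struct device_attribute *attr,
                                            char *buf)
    {
            /* Generic fallback; x86 overrides this with the real state. */
            return sprintf(buf, "Not affected\n");
    }
    static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL);

Userspace then simply reads the file, e.g. with cat /sys/devices/system/cpu/vulnerabilities/tsx_async_abort.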
-
Pawan Gupta authored
commit 1b42f017 upstream

TSX Async Abort (TAA) is a side channel vulnerability of the internal buffers in some Intel processors, similar to Microarchitectural Data Sampling (MDS). In this case, certain loads may speculatively pass invalid data to dependent operations when an asynchronous abort condition is pending in a TSX transaction. This includes loads with no fault or assist condition. Such loads may speculatively expose stale data from the uarch data structures as in MDS. Scope of exposure is within the same thread and across threads. This issue affects all current processors that support TSX but do not have ARCH_CAP_TAA_NO (bit 8) set in MSR_IA32_ARCH_CAPABILITIES.

On CPUs which have their IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0, CPUID.MD_CLEAR=1 and the MDS mitigation is clearing the CPU buffers using VERW or L1D_FLUSH, there is no additional mitigation needed for TAA. On affected CPUs with MDS_NO=1, this issue can be mitigated by disabling the Transactional Synchronization Extensions (TSX) feature.

A new MSR, IA32_TSX_CTRL, in future processors and in current processors after a microcode update, can be used to control the TSX feature. There are two bits in that MSR:

 * TSX_CTRL_RTM_DISABLE disables the TSX sub-feature Restricted Transactional Memory (RTM).
 * TSX_CTRL_CPUID_CLEAR clears the RTM enumeration in CPUID.

The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally disabled with updated microcode but still enumerated as present by CPUID(EAX=7).EBX{bit4}.

The second mitigation approach is similar to MDS, which is clearing the affected CPU buffers on return to user space and when entering a guest. The relevant microcode update is required for the mitigation to work. More details on this approach can be found here:
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html

The TSX feature can be controlled by the "tsx" command line parameter. If it is force-enabled, then "Clear CPU buffers" (the MDS mitigation) is deployed. The effective mitigation state can be read from sysfs.

[ bp:
  - massage + comments cleanup
  - s/TAA_MITIGATION_TSX_DISABLE/TAA_MITIGATION_TSX_DISABLED/g - Josh.
  - remove partial TAA mitigation in update_mds_branch_idle() - Josh.
  - s/tsx_async_abort_cmdline/tsx_async_abort_parse_cmdline/g ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Pawan Gupta authored
commit 95c5824f upstream

Add a kernel cmdline parameter "tsx" to control the Transactional Synchronization Extensions (TSX) feature. On CPUs that support TSX control, use "tsx=on|off" to enable or disable TSX. Not specifying this option is equivalent to "tsx=off". This is because on certain processors TSX may be used as a part of a speculative side channel attack.

Carve out the TSX controlling functionality into a separate compilation unit because TSX is a CPU feature while the TSX async abort control machinery will go to cpu/bugs.c.

[ bp:
  - Massage, shorten and clear the arg buffer.
  - Clarifications of the tsx= possible options - Josh.
  - Expand on TSX_CTRL availability - Pawan. ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

CVE-2019-11135
[tyhicks: Backport to 4.4
 - kernel-parameters.txt is up one directory level
 - Minor context adjustment in init_intel() because init_intel_misc_features() is not called]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
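The early parsing in the new arch/x86/kernel/cpu/tsx.c boils down to matching the option string from the raw boot command line. The sketch below assumes the tsx_ctrl_state, tsx_ctrl_is_supported(), tsx_disable() and tsx_enable() helpers from that file; the real code also handles "auto" and the Kconfig defaults shown earlier:

    void __init tsx_init(void)
    {
            char arg[5] = {};
            int ret;

            if (!tsx_ctrl_is_supported())
                    return;

            ret = cmdline_find_option(boot_command_line, "tsx", arg, sizeof(arg));
            if (ret >= 0) {
                    if (!strcmp(arg, "on"))
                            tsx_ctrl_state = TSX_CTRL_ENABLE;
                    else if (!strcmp(arg, "off"))
                            tsx_ctrl_state = TSX_CTRL_DISABLE;
                    else
                            pr_err("tsx: invalid option, defaulting to off\n");
            }

            if (tsx_ctrl_state == TSX_CTRL_DISABLE)
                    tsx_disable();
            else if (tsx_ctrl_state == TSX_CTRL_ENABLE)
                    tsx_enable();
    }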
-
Pawan Gupta authored
commit 286836a7 upstream

Add a helper function to read the IA32_ARCH_CAPABILITIES MSR.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
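The helper is small; it reads the MSR only when the CPU actually enumerates it, roughly as follows. This is a sketch of the common pattern, not necessarily the exact backported hunk:

    u64 x86_read_arch_cap_msr(void)
    {
            u64 ia32_cap = 0;

            /* The MSR exists only if the CPU advertises ARCH_CAPABILITIES. */
            if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
                    rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);

            return ia32_cap;
    }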
-
Pawan Gupta authored
commit c2955f27 upstream

Transactional Synchronization Extensions (TSX) may be used on certain processors as part of a speculative side channel attack. A microcode update for existing processors that are vulnerable to this attack will add a new MSR - IA32_TSX_CTRL - to allow the system administrator the option to disable TSX as one of the possible mitigations.

The CPUs which get this new MSR after a microcode upgrade are the ones which do not set MSR_IA32_ARCH_CAPABILITIES.MDS_NO (bit 5) because those CPUs have CPUID.MD_CLEAR, i.e., the VERW implementation which clears all CPU buffers takes care of the TAA case as well.

[ Note that future processors that are not vulnerable will also support the IA32_TSX_CTRL MSR. ]

Add defines for the new IA32_TSX_CTRL MSR and its bits.

TSX has two sub-features:

 1. Restricted Transactional Memory (RTM) is an explicitly-used feature where new instructions begin and end TSX transactions.
 2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of "old" style locks are used by software.

Bit 7 of the IA32_ARCH_CAPABILITIES indicates the presence of the IA32_TSX_CTRL MSR.

There are two control bits in IA32_TSX_CTRL MSR:

 Bit 0: When set, it disables the Restricted Transactional Memory (RTM) sub-feature of TSX (will force all transactions to abort on the XBEGIN instruction).
 Bit 1: When set, it disables the enumeration of the RTM and HLE feature (i.e. it will make CPUID(EAX=7).EBX{bit4} and CPUID(EAX=7).EBX{bit11} read as 0).

The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally disabled by the new microcode but still enumerated as present by CPUID(EAX=7).EBX{bit4}, unless disabled by IA32_TSX_CTRL_MSR[1] - TSX_CTRL_CPUID_CLEAR.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
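In header terms, the new definitions look roughly like the following sketch for arch/x86/include/asm/msr-index.h. The names follow the commit text; the numeric values are the architectural ones and should be checked against the real patch:

    /* IA32_ARCH_CAPABILITIES bit 7: the IA32_TSX_CTRL MSR is present. */
    #define ARCH_CAP_TSX_CTRL_MSR           BIT(7)

    #define MSR_IA32_TSX_CTRL               0x00000122
    #define TSX_CTRL_RTM_DISABLE            BIT(0)  /* force XBEGIN to abort */
    #define TSX_CTRL_CPUID_CLEAR            BIT(1)  /* hide RTM/HLE in CPUID */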
-
Paolo Bonzini authored
Similar to the AMD bits, set the Intel bits from the vendor-independent feature and bug flags, because KVM_GET_SUPPORTED_CPUID does not care about the vendor and they should be set on AMD processors as well.

Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2019-11135
(backported from commit 0c54914d)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Sean Christopherson authored
The CPUID flag ARCH_CAPABILITIES is unconditionally exposed to host userspace for all x86 hosts, i.e. KVM advertises ARCH_CAPABILITIES regardless of hardware support, under the pretense that KVM fully emulates MSR_IA32_ARCH_CAPABILITIES. Unfortunately, only VMX hosts handle accesses to MSR_IA32_ARCH_CAPABILITIES (despite KVM_GET_MSRS also reporting MSR_IA32_ARCH_CAPABILITIES for all hosts).

Move the MSR_IA32_ARCH_CAPABILITIES handling to common x86 code so that it's emulated on AMD hosts.

Fixes: 1eaafe91 ("kvm: x86: IA32_ARCH_CAPABILITIES is always supported")
Cc: stable@vger.kernel.org
Reported-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2019-11135
(backported from commit 0cf9135b)
[tyhicks: Backport to 4.4
 - vmx.c and vmx.h are up one directory level
 - Minor context adjustments in x86.c due to different surrounding MSR case statements and stack variable differences in kvm_arch_vcpu_setup()
 - Call guest_cpuid_has_arch_capabilities() instead of the non-existent guest_cpuid_has()]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
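After the move, the read side in kvm_get_msr_common() looks roughly like the sketch below, consistent with the backport note about guest_cpuid_has_arch_capabilities(); the exact surrounding context in x86.c differs:

    case MSR_IA32_ARCH_CAPABILITIES:
            if (!msr_info->host_initiated &&
                !guest_cpuid_has_arch_capabilities(vcpu))
                    return 1;
            msr_info->data = vcpu->arch.arch_capabilities;
            break;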
-
Imre Deak authored
In some circumstances the RC6 context can get corrupted. We can detect this and take the required action, that is disable RC6 and runtime PM. The HW recovers from the corrupted state after a system suspend/resume cycle, so detect the recovery and re-enable RC6 and runtime PM.

v2: rebase (Mika)
v3:
 - Move intel_suspend_gt_powersave() to the end of the GEM suspend sequence.
 - Add commit message.
v4:
 - Rebased on intel_uncore_forcewake_put(i915->uncore, ...) API change.
v5:
 - Rebased on latest upstream gt_pm refactoring.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

CVE-2019-0154

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Imre Deak authored
In some circumstances the RC6 context can get corrupted. We can detect this and take the required action, that is disable RC6 and runtime PM. The HW recovers from the corrupted state after a system suspend/resume cycle, so detect the recovery and re-enable RC6 and runtime PM.

v2: rebase (Mika)
v3:
 - Move intel_suspend_gt_powersave() to the end of the GEM suspend sequence.
 - Add commit message.
v4:
 - Rebased on intel_uncore_forcewake_put(i915->uncore, ...) API change.
v5:
 - Rebased on latest upstream gt_pm refactoring.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

CVE-2019-0154
[tjaalton: backport to i915_bpo
 - Use type safe register definitions.
 - Modify NEEDS_WaRsDisableCoarsePowerGating to match all of GEN9]
Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Uma Shankar authored
In BXT/APL, device 2 MMIO reads from the MIPI controller require its PLL to be turned ON. When the MIPI PLL is turned off (the MIPI display is not active or connected) and someone (host or GT engine) tries to read MIPI registers, it causes a hard hang. This is a hardware restriction or limitation.

The driver by itself doesn't read MIPI registers when the MIPI display is off. But any userspace application can submit an unprivileged batch buffer for execution, and that batch buffer can contain mmio reads. These reads are allowed even for unprivileged applications. If such register reads target the MIPI DSI controller while the MIPI display is not active, the MMIO read operation causes a system hard hang and the only way to recover is a hard reboot. A genuine process/application won't submit a batch buffer like this and doesn't cause any issue. But on a compromised system, a malign userspace process/app can generate such a batch buffer and trigger a system hard hang (denial of service attack).

The fix is to lower the internal MMIO timeout value to an optimum value of 950us, as recommended by the hardware team. If the timeout is beyond 1ms (which will be hit for any value we choose if an MMIO READ of a DSI-specific register is performed without the PLL ON), it causes the system hang. But if the timeout value is lower than that, it stays below the threshold (even if the timeout happens) and the system will not get into a hung state. This avoids a system hang without losing any programming or GT interrupts, taking the worst case of the lowest CDCLK frequency and an early DC5 abort into account.

Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>

CVE-2019-0154
[tjaalton: backport to i915_bpo
 - Use type safe register definitions.]
Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Jon Bloomfield authored
Some of the gen instruction macros (e.g. MI_DISPLAY_FLIP) have the length directly encoded in them. Since these are used directly in the tables, the length becomes part of the comparison used for matching during parsing. Thus, if the cmd being parsed has a different length from that in the table, it is not matched and the cmd is accepted via the default variable-length path.

Fix by masking out everything except the opcode in the cmd tables.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>

CVE-2019-0155

Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-
Jon Bloomfield authored
To keep things manageable, the pre-gen9 cmdparser does not attempt to track any form of nested BB_START's. This did not prevent usermode from using nested starts, or even chained batches, because the cmdparser is not strictly enforced pre gen9. Instead, the existence of a nested BB_START would cause the batch to be emitted in insecure mode, and any privileged capabilities would not be available.

For Gen9, the cmdparser becomes mandatory (for BCS at least), and so not providing any form of nested BB_START support becomes overly restrictive. Any such batch will simply not run.

We make heavy use of backward jumps in igt, and it is much easier to add support for this restricted subset of nested jumps than to rewrite the whole of our test suite to avoid them.

Add the required logic to support limited backward jumps, to instructions that have already been validated by the parser.

Note that it's not sufficient to simply approve any BB_START that jumps backwards in the buffer, because this would allow an attacker to embed a rogue instruction sequence within the operand words of a harmless instruction (say LRI) and jump to that. We introduce a bit array to track every instr offset successfully validated, and test the target of BB_START against this. If the target offset hits, it is re-written to the same offset in the shadow buffer and the BB_START cmd is allowed (see the sketch after this entry).

Note: This patch deliberately ignores checkpatch issues in the cmdtables, in order to match the style of the surrounding code. We'll correct the entire file in one go in a later patch.

v2: set dispatch secure late (Mika)
v3: rebase (Mika)
v4: Clear whitelist on each parse
    Minor review updates (Chris)
v5: Correct backward jump batching
v6: fix compilation error due to struct eb shuffle (Mika)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

CVE-2019-0155
[tjaalton: backport to i915_bpo
 - intel_engine_cs struct members, variables got renamed s/ring/engine/, follow the same renaming here.]
Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
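The whitelist idea can be sketched with the kernel's generic bitmap helpers. The names below (jump_whitelist, cmd_offset, jump_target) and the dword granularity are illustrative rather than the exact i915 code:

    /* One bit per dword of the batch; a bit is set once that dword has
     * been validated as the start of a parsed command. */
    unsigned long *jump_whitelist =
            kcalloc(BITS_TO_LONGS(batch_len / sizeof(u32)),
                    sizeof(unsigned long), GFP_KERNEL);

    /* After each command passes validation: */
    set_bit(cmd_offset / sizeof(u32), jump_whitelist);

    /* When a MI_BATCH_BUFFER_START jumping backwards is found: */
    if (!test_bit(jump_target / sizeof(u32), jump_whitelist))
            return -EINVAL;    /* target was never validated: reject batch */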
-
Jon Bloomfield authored
For gen9 we enable cmdparsing on the BCS ring, specifically to catch inadvertent accesses to sensitive registers.

Unlike gen7/hsw, we use the parser only to block certain registers. We can rely on h/w to block restricted commands, so the command tables only provide enough info to allow the parser to delineate each command, and identify commands that access registers.

Note: This patch deliberately ignores checkpatch issues in favour of matching the style of the surrounding code. We'll correct the entire file in one go in a later patch.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>

CVE-2019-0155
[tjaalton: backport to i915_bpo
 - intel_engine_cs struct members, variables got renamed s/ring/engine/, follow the same renaming here.
 - i915_cmd_parser_init_ring has changed since 4.4, so add gen9_blt_reg_tables and use it as on the patch for 4.15.
 - Use type safe register definitions.]
Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
-