Commits · b0e5094307615e07f2134ce95893b92657f7da2c · Kirill Smelkov / linux

14 Aug, 2018 6 commits

USB: serial: cp210x: use tcflag_t to fix incompatible pointer type · b0e50943

Geert Uytterhoeven authored Nov 21, 2016

BugLink: https://bugs.launchpad.net/bugs/1776177

commit 009615ab upstream.

On sparc32, tcflag_t is unsigned long, unlike all other architectures:

    drivers/usb/serial/cp210x.c: In function 'cp210x_get_termios':
    drivers/usb/serial/cp210x.c:717:3: warning: passing argument 2 of 'cp210x_get_termios_port' from incompatible pointer type
       cp210x_get_termios_port(tty->driver_data,
       ^
    drivers/usb/serial/cp210x.c:35:13: note: expected 'unsigned int *' but argument is of type 'tcflag_t *'
     static void cp210x_get_termios_port(struct usb_serial_port *port,
		 ^

Consistently use tcflag_t to fix this.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

b0e50943

powerpc/64s: Clear PCR on boot · 68de1a47

Michael Neuling authored May 18, 2018

BugLink: https://bugs.launchpad.net/bugs/1776177

commit faf37c44 upstream.

Clear the PCR (Processor Compatibility Register) on boot to ensure we
are not running in a compatibility mode.

We've seen this cause problems when a crash (and kdump) occurs while
running compat mode guests. The kdump kernel then runs with the PCR
set and causes problems. The symptom in the kdump kernel (also seen in
petitboot after fast-reboot) is early userspace programs taking
sigills on newer instructions (seen in libc).
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

68de1a47

arm64: lse: Add early clobbers to some input/output asm operands · 686690de

Will Deacon authored May 21, 2018

BugLink: https://bugs.launchpad.net/bugs/1776177

commit 32c3fa7c upstream.

For LSE atomics that read and write a register operand, we need to
ensure that these operands are annotated as "early clobber" if the
register is written before all of the input operands have been consumed.
Failure to do so can result in the compiler allocating the same register
to both operands, leading to splats such as:

 Unable to handle kernel paging request at virtual address 11111122222221
 [...]
 x1 : 1111111122222222 x0 : 1111111122222221
 Process swapper/0 (pid: 1, stack limit = 0x000000008209f908)
 Call trace:
  test_atomic64+0x1360/0x155c

where x0 has been allocated as both the value to be stored and also the
atomic_t pointer.

This patch adds the missing clobbers.

Cc: <stable@vger.kernel.org>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Reported-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

686690de

Linux 4.4.135 · 7285003e

Greg Kroah-Hartman authored May 30, 2018

BugLink: https://bugs.launchpad.net/bugs/1776158Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

7285003e

Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU" · 15499c15

Greg Kroah-Hartman authored May 30, 2018

BugLink: https://bugs.launchpad.net/bugs/1776158

This reverts commit 33cebc97 which is
03080e5e ("vti4: Don't override MTU passed on link creation via
IFLA_MTU") upstream as it causes test failures.

This commit should not have been backported to anything older than 4.16,
despite what the changelog said as the mtu must be set in older kernels,
unlike is needed in 4.16 and newer.

Thanks to Alistair Strachan for the debugging help figuring this out,
and for 'git bisect' for making my life a whole lot easier.

Cc: Alistair Strachan <astrachan@google.com>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>

15499c15

UBUNTU: Start new release · 71350a12
Stefan Bader authored Aug 14, 2018
```
Ignore: yes
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
```
71350a12

09 Aug, 2018 10 commits

UBUNTU: Ubuntu-4.4.0-133.159 · a80364c7
Stefan Bader authored Aug 09, 2018
```
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
```
a80364c7

tcp: detect malicious patterns in tcp_collapse_ofo_queue() · d42da7c9

Eric Dumazet authored Jul 23, 2018

In case an attacker feeds tiny packets completely out of order,
tcp_collapse_ofo_queue() might scan the whole rb-tree, performing
expensive copies, but not changing socket memory usage at all.

1) Do not attempt to collapse tiny skbs.
2) Add logic to exit early when too many tiny skbs are detected.

We prefer not doing aggressive collapsing (which copies packets)
for pathological flows, and revert to tcp_prune_ofo_queue() which
will be less expensive.

In the future, we might add the possibility of terminating flows
that are proven to be malicious.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

CVE-2018-5390

(backported from commit 3d4bf93a)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

d42da7c9

tcp: avoid collapses in tcp_prune_queue() if possible · 60102d86

Eric Dumazet authored Jul 23, 2018

Right after a TCP flow is created, receiving tiny out of order
packets allways hit the condition :

if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf)
	tcp_clamp_window(sk);

tcp_clamp_window() increases sk_rcvbuf to match sk_rmem_alloc
(guarded by tcp_rmem[2])

Calling tcp_collapse_ofo_queue() in this case is not useful,
and offers a O(N^2) surface attack to malicious peers.

Better not attempt anything before full queue capacity is reached,
forcing attacker to spend lots of resource and allow us to more
easily detect the abuse.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

CVE-2018-5390

(cherry picked from commit f4a3313d)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

60102d86

Revert "net: increase fragment memory usage limits" · f9942591

Stefan Bader authored Aug 08, 2018

This reverts commit c2a93660. It
made denial of service attacks on the IP fragment handling easier to
carry out.

CVE-2018-5391
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

f9942591

x86/mm/pat: Make set_memory_np() L1TF safe · e0ab8464

Andi Kleen authored Aug 07, 2018

set_memory_np() is used to mark kernel mappings not present, but it has
it's own open coded mechanism which does not have the L1TF protection of
inverting the address bits.

Replace the open coded PTE manipulation with the L1TF protecting low level
PTE routines.

Passes the CPA self test.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[smb: Context adjustments]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

e0ab8464

UBUNTU: SAUCE: Add pfn_pud() and pud_mkhuge() · ffdd1bb5

Stefan Bader authored Aug 09, 2018

Both were introduced when pud size transparent hugepage support was
added and that would be too complex and dangerous to backport. The
pfn_pud() function was extended in "x86/speculation/l1tf: Protect
PROT_NONE PTEs against speculation" but not backported then since it
was not present, yet (so no users).
For the following patch, though, we will need both.

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

ffdd1bb5

x86/mm/pat: Ensure cpa->pfn only contains page frame numbers · 319f01bb

Matt Fleming authored Nov 27, 2015

The x86 pageattr code is confused about the data that is stored
in cpa->pfn, sometimes it's treated as a page frame number,
sometimes it's treated as an unshifted physical address, and in
one place it's treated as a pte.

The result of this is that the mapping functions do not map the
intended physical address.

This isn't a problem in practice because most of the addresses
we're mapping in the EFI code paths are already mapped in
'trampoline_pgd' and so the pageattr mapping functions don't
actually do anything in this case. But when we move to using a
separate page table for the EFI runtime this will be an issue.
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1448658575-17029-3-git-send-email-matt@codeblueprint.co.ukSigned-off-by: Ingo Molnar <mingo@kernel.org>

CVE-2018-3620
CVE-2018-3646

(cherry picked from commit edc3b912)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

319f01bb

x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert · efb357f0

Andi Kleen authored Aug 07, 2018

Some cases in THP like:
  - MADV_FREE
  - mprotect
  - split

mark the PMD non present for temporarily to prevent races. The window for
an L1TF attack in these contexts is very small, but it wants to be fixed
for correctness sake.

Use the proper low level functions for pmd/pud_mknotpresent() to address
this.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[smb: Drop pud_mknotpresent() changes as it does not exist]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

efb357f0

x86/speculation/l1tf: Invert all not present mappings · e847687e

Andi Kleen authored Aug 07, 2018

For kernel mappings PAGE_PROTNONE is not necessarily set for a non present
mapping, but the inversion logic explicitely checks for !PRESENT and
PROT_NONE.

Remove the PROT_NONE check and make the inversion unconditional for all not
present mappings.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

e847687e

cpu/hotplug: Fix SMT supported evaluation · 2319dd9d

Thomas Gleixner authored Aug 07, 2018

Josh reported that the late SMT evaluation in cpu_smt_state_init() sets
cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied
on the kernel command line as it cannot differentiate between SMT disabled
by BIOS and SMT soft disable via 'nosmt'. That wreckages the state and
makes the sysfs interface unusable.

Rework this so that during bringup of the non boot CPUs the availability of
SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a
'primary' thread then set the local cpu_smt_available marker and evaluate
this explicitely right after the initial SMP bringup has finished.

SMT evaulation on x86 is a trainwreck as the firmware has all the
information _before_ booting the kernel, but there is no interface to query
it.

Fixes: 73d5e2b4 ("cpu/hotplug: detect SMT disabled by BIOS")
Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[smb: Context and also adjust to alternative booted_once scheme,
      including a move of the smt check into _cpu_up()]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

2319dd9d

08 Aug, 2018 24 commits

KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry · 0502ecb7

Paolo Bonzini authored Aug 05, 2018

When nested virtualization is in use, VMENTER operations from the nested
hypervisor into the nested guest will always be processed by the bare metal
hypervisor, and KVM's "conditional cache flushes" mode in particular does a
flush on nested vmentry.  Therefore, include the "skip L1D flush on
vmentry" bit in KVM's suggested ARCH_CAPABILITIES setting.

Add the relevant Documentation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[tyhicks: Adjust for the missing MSR_F10H_DECFG and MSR_IA32_UCODE_REV
 feature MSRs which do not exist in 4.15]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
[smb: Minor context and adjusted documentation path]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

0502ecb7

KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR · e5559836

Paolo Bonzini authored Jun 25, 2018

This lets userspace read the MSR_IA32_ARCH_CAPABILITIES and check that all
requested features are available on the host.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-3620
CVE-2018-3646

(backported from commit cd283252)
[tyhicks: Adjust for the missing MSR_F10H_DECFG and MSR_IA32_UCODE_REV
 feature MSRs which do not exist in 4.15]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

e5559836

KVM: X86: Introduce kvm_get_msr_feature() · d59679db

Wanpeng Li authored Feb 28, 2018

Introduce kvm_get_msr_feature() to handle the msrs which are supported
by different vendors and sharing the same emulation logic.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

CVE-2018-3620
CVE-2018-3646

(cherry picked from commit 66421c1e)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

d59679db

KVM: x86: Add a framework for supporting MSR-based features · b8d3ddd8

Tom Lendacky authored Feb 21, 2018

Provide a new KVM capability that allows bits within MSRs to be recognized
as features.  Two new ioctls are added to the /dev/kvm ioctl routine to
retrieve the list of these MSRs and then retrieve their values. A kvm_x86_ops
callback is used to determine support for the listed MSR-based features.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Tweaked documentation. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

CVE-2018-3620
CVE-2018-3646

(backported from commit 801e459a)
[tyhicks: Adjust context for missing mem_enc_* kvm_x86_ops]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
[smb: Further context adjustments for two hunks]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

b8d3ddd8

KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES · 36566c92

KarimAllah Ahmed authored Feb 01, 2018

commit 28c1c9fa

Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO
(bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the
contents will come directly from the hardware, but user-space can still
override it.

[dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional]
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: kvm@vger.kernel.org
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lkml.kernel.org/r/1517522386-18410-4-git-send-email-karahmed@amazon.deSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>

CVE-2018-3620
CVE-2018-3646

(backported from commit a6005a792e24c30cb5fa8525b91af67ca0bcc1e7 bionic)
[smb: Context adjustments to work around already applied spectre
      patches. Also introduced specific guest_cpuid_has_arch_capabilities
      as replacement for non-existing guest_cpuid_has function.]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

36566c92

x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry · 34d382bd

Paolo Bonzini authored Aug 05, 2018

Bit 3 of ARCH_CAPABILITIES tells a hypervisor that L1D flush on vmentry is
not needed.  Add a new value to enum vmx_l1d_flush_state, which is used
either if there is no L1TF bug at all, or if bit 3 is set in ARCH_CAPABILITIES.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

34d382bd

x86/speculation: Simplify sysfs report of VMX L1TF vulnerability · b2fea1af

Paolo Bonzini authored Aug 05, 2018

Three changes to the content of the sysfs file:

 - If EPT is disabled, L1TF cannot be exploited even across threads on the
   same core, and SMT is irrelevant.

 - If mitigation is completely disabled, and SMT is enabled, print "vulnerable"
   instead of "vulnerable, SMT vulnerable"

 - Reorder the two parts so that the main vulnerability state comes first
   and the detail on SMT is second.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

b2fea1af

Documentation/l1tf: Remove Yonah processors from not vulnerable list · 07d4efea

Thomas Gleixner authored Aug 05, 2018

Dave reported, that it's not confirmed that Yonah processors are
unaffected. Remove them from the list.
Reported-by: ave Hansen <dave.hansen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[smb: Adjusted document path]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

07d4efea

x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr() · 3e50244e

Nicolai Stange authored Jul 22, 2018

For VMEXITs caused by external interrupts, vmx_handle_external_intr()
indirectly calls into the interrupt handlers through the host's IDT.

It follows that these interrupts get accounted for in the
kvm_cpu_l1tf_flush_l1d per-cpu flag.

The subsequently executed vmx_l1d_flush() will thus be aware that some
interrupts have happened and conduct a L1d flush anyway.

Setting l1tf_flush_l1d from vmx_handle_external_intr() isn't needed
anymore. Drop it.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[smb: Minor context adjustment]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

3e50244e

x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d · 609f820f

Nicolai Stange authored Jul 29, 2018

The last missing piece to having vmx_l1d_flush() take interrupts after
VMEXIT into account is to set the kvm_cpu_l1tf_flush_l1d per-cpu flag on
irq entry.

Issue calls to kvm_set_cpu_l1tf_flush_l1d() from entering_irq(),
ipi_entering_ack_irq(), smp_reschedule_interrupt() and
uv_bau_message_interrupt().
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[smb: Minor context adjustments]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

609f820f

x86/apic: Order irq_enter/exit() calls correctly vs. ack_APIC_irq() · a56fcf6a

Wanpeng Li authored Sep 18, 2016

===============================
[ INFO: suspicious RCU usage. ]
4.8.0-rc6+ #5 Not tainted
-------------------------------
./arch/x86/include/asm/msr-trace.h:47 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
no locks held by swapper/2/0.

stack backtrace:
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.8.0-rc6+ #5
Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015
 0000000000000000 ffff8d1bd6003f10 ffffffff94446949 ffff8d1bd4a68000
 0000000000000001 ffff8d1bd6003f40 ffffffff940e9247 ffff8d1bbdfcf3d0
 000000000000080b 0000000000000000 0000000000000000 ffff8d1bd6003f70
Call Trace:
 <IRQ>  [<ffffffff94446949>] dump_stack+0x99/0xd0
 [<ffffffff940e9247>] lockdep_rcu_suspicious+0xe7/0x120
 [<ffffffff9448e0d5>] do_trace_write_msr+0x135/0x140
 [<ffffffff9406e750>] native_write_msr+0x20/0x30
 [<ffffffff9406503d>] native_apic_msr_eoi_write+0x1d/0x30
 [<ffffffff9405b17e>] smp_trace_call_function_interrupt+0x1e/0x270
 [<ffffffff948cb1d6>] trace_call_function_interrupt+0x96/0xa0
 <EOI>  [<ffffffff947200f4>] ? cpuidle_enter_state+0xe4/0x360
 [<ffffffff947200df>] ? cpuidle_enter_state+0xcf/0x360
 [<ffffffff947203a7>] cpuidle_enter+0x17/0x20
 [<ffffffff940df008>] cpu_startup_entry+0x338/0x4d0
 [<ffffffff9405bfc4>] start_secondary+0x154/0x180

This can be reproduced readily by running ftrace test case of kselftest.

Move the irq_enter() call before ack_APIC_irq(), because irq_enter() tells
the RCU susbstems to end the extended quiescent state, so that the
following trace call in ack_APIC_irq() works correctly. The same applies to
exiting_ack_irq() which calls ack_APIC_irq() after irq_exit().

[ tglx: Massaged changelog ]
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1474198491-3738-1-git-send-email-wanpeng.li@hotmail.comSigned-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

(cherry picked from commit b0f48706)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

a56fcf6a

x86: Don't include linux/irq.h from asm/hardirq.h · 47a15da0

Nicolai Stange authored Jul 29, 2018

The next patch in this series will have to make the definition of
irq_cpustat_t available to entering_irq().

Inclusion of asm/hardirq.h into asm/apic.h would cause circular header
dependencies like

  asm/smp.h
    asm/apic.h
      asm/hardirq.h
        linux/irq.h
          linux/topology.h
            linux/smp.h
              asm/smp.h

or

  linux/gfp.h
    linux/mmzone.h
      asm/mmzone.h
        asm/mmzone_64.h
          asm/smp.h
            asm/apic.h
              asm/hardirq.h
                linux/irq.h
                  linux/irqdesc.h
                    linux/kobject.h
                      linux/sysfs.h
                        linux/kernfs.h
                          linux/idr.h
                            linux/gfp.h

and others.

This causes compilation errors because of the header guards becoming
effective in the second inclusion: symbols/macros that had been defined
before wouldn't be available to intermediate headers in the #include chain
anymore.

A possible workaround would be to move the definition of irq_cpustat_t
into its own header and include that from both, asm/hardirq.h and
asm/apic.h.

However, this wouldn't solve the real problem, namely asm/harirq.h
unnecessarily pulling in all the linux/irq.h cruft: nothing in
asm/hardirq.h itself requires it. Also, note that there are some other
archs, like e.g. arm64, which don't have that #include in their
asm/hardirq.h.

Remove the linux/irq.h #include from x86' asm/hardirq.h.

Fix resulting compilation errors by adding appropriate #includes to *.c
files as needed.

Note that some of these *.c files could be cleaned up a bit wrt. to their
set of #includes, but that should better be done from separate patches, if
at all.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[smb: Heavily modified by cycles of compile-and-fix]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

47a15da0

x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d · 70a63a1f

Nicolai Stange authored Jul 27, 2018

Part of the L1TF mitigation for vmx includes flushing the L1D cache upon
VMENTRY.

L1D flushes are costly and two modes of operations are provided to users:
"always" and the more selective "conditional" mode.

If operating in the latter, the cache would get flushed only if a host side
code path considered unconfined had been traversed. "Unconfined" in this
context means that it might have pulled in sensitive data like user data
or kernel crypto keys.

The need for L1D flushes is tracked by means of the per-vcpu flag
l1tf_flush_l1d. KVM exit handlers considered unconfined set it. A
vmx_l1d_flush() subsequently invoked before the next VMENTER will conduct a
L1d flush based on its value and reset that flag again.

Currently, interrupts delivered "normally" while in root operation between
VMEXIT and VMENTER are not taken into account. Part of the reason is that
these don't leave any traces and thus, the vmx code is unable to tell if
any such has happened.

As proposed by Paolo Bonzini, prepare for tracking all interrupts by
introducing a new per-cpu flag, "kvm_cpu_l1tf_flush_l1d". It will be in
strong analogy to the per-vcpu ->l1tf_flush_l1d.

A later patch will make interrupt handlers set it.

For the sake of cache locality, group kvm_cpu_l1tf_flush_l1d into x86'
per-cpu irq_cpustat_t as suggested by Peter Zijlstra.

Provide the helpers kvm_set_cpu_l1tf_flush_l1d(),
kvm_clear_cpu_l1tf_flush_l1d() and kvm_get_cpu_l1tf_flush_l1d(). Make them
trivial resp. non-existent for !CONFIG_KVM_INTEL as appropriate.

Let vmx_l1d_flush() handle kvm_cpu_l1tf_flush_l1d in the same way as
l1tf_flush_l1d.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

70a63a1f

x86/irq: Demote irq_cpustat_t::__softirq_pending to u16 · 2176bba5

Nicolai Stange authored Jul 27, 2018

An upcoming patch will extend KVM's L1TF mitigation in conditional mode
to also cover interrupts after VMEXITs. For tracking those, stores to a
new per-cpu flag from interrupt handlers will become necessary.

In order to improve cache locality, this new flag will be added to x86's
irq_cpustat_t.

Make some space available there by shrinking the ->softirq_pending bitfield
from 32 to 16 bits: the number of bits actually used is only NR_SOFTIRQS,
i.e. 10.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

2176bba5

x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush() · f7329c08

Nicolai Stange authored Jul 21, 2018

Currently, vmx_vcpu_run() checks if l1tf_flush_l1d is set and invokes
vmx_l1d_flush() if so.

This test is unncessary for the "always flush L1D" mode.

Move the check to vmx_l1d_flush()'s conditional mode code path.

Notes:
- vmx_l1d_flush() is likely to get inlined anyway and thus, there's no
  extra function call.

- This inverts the (static) branch prediction, but there hadn't been any
  explicit likely()/unlikely() annotations before and so it stays as is.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[smb: Some minor context adjustments in second hunk]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

f7329c08

x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond' · f9a1bc63

Nicolai Stange authored Jul 21, 2018

The vmx_l1d_flush_always static key is only ever evaluated if
vmx_l1d_should_flush is enabled. In that case however, there are only two
L1d flushing modes possible: "always" and "conditional".

The "conditional" mode's implementation tends to require more sophisticated
logic than the "always" mode.

Avoid inverted logic by replacing the 'vmx_l1d_flush_always' static key
with a 'vmx_l1d_flush_cond' one.

There is no change in functionality.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

f9a1bc63

x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush() · 0661b923

Nicolai Stange authored Jul 21, 2018

vmx_l1d_flush() gets invoked only if l1tf_flush_l1d is true. There's no
point in setting l1tf_flush_l1d to true from there again.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

0661b923

cpu/hotplug: detect SMT disabled by BIOS · 9541fd1b

Josh Poimboeuf authored Jul 24, 2018

If SMT is disabled in BIOS, the CPU code doesn't properly detect it.
The /sys/devices/system/cpu/smt/control file shows 'on', and the 'l1tf'
vulnerabilities file shows SMT as vulnerable.

Fix it by forcing 'cpu_smt_control' to CPU_SMT_NOT_SUPPORTED in such a
case.  Unfortunately the detection can only be done after bringing all
the CPUs online, so we have to overwrite any previous writes to the
variable.
Reported-by: Joe Mario <jmario@redhat.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Fixes: f048c399 ("x86/topology: Provide topology_smt_supported()")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

9541fd1b

Documentation/l1tf: Fix typos · 3f1a1526

Tony Luck authored Jul 19, 2018

Fix spelling and other typos
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[smb: Location changed to Documentation/security/l1tf.txt]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

3f1a1526

x86/KVM/VMX: Initialize the vmx_l1d_flush_pages' content · bbd637c6

Nicolai Stange authored Jul 18, 2018

The slow path in vmx_l1d_flush() reads from vmx_l1d_flush_pages in order
to evict the L1d cache.

However, these pages are never cleared and, in theory, their data could be
leaked.

More importantly, KSM could merge a nested hypervisor's vmx_l1d_flush_pages
to fewer than 1 << L1D_CACHE_ORDER host physical pages and this would break
the L1d flushing algorithm: L1D on x86_64 is tagged by physical addresses.

Fix this by initializing the individual vmx_l1d_flush_pages with a
different pattern each.

Rename the "empty_zp" asm constraint identifier in vmx_l1d_flush() to
"flush_pages" to reflect this change.

Fixes: a47dd5f0 ("x86/KVM/VMX: Add L1D flush algorithm")
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

bbd637c6

x86/speculation/l1tf: Unbreak !__HAVE_ARCH_PFN_MODIFY_ALLOWED architectures · fa535113

Jiri Kosina authored Jul 14, 2018

pfn_modify_allowed() and arch_has_pfn_modify_check() are outside of the
!__ASSEMBLY__ section in include/asm-generic/pgtable.h, which confuses
assembler on archs that don't have __HAVE_ARCH_PFN_MODIFY_ALLOWED (e.g.
ia64) and breaks build:

    include/asm-generic/pgtable.h: Assembler messages:
    include/asm-generic/pgtable.h:538: Error: Unknown opcode `static inline bool pfn_modify_allowed(unsigned long pfn,pgprot_t prot)'
    include/asm-generic/pgtable.h:540: Error: Unknown opcode `return true'
    include/asm-generic/pgtable.h:543: Error: Unknown opcode `static inline bool arch_has_pfn_modify_check(void)'
    include/asm-generic/pgtable.h:545: Error: Unknown opcode `return false'
    arch/ia64/kernel/entry.S:69: Error: `mov' does not fit into bundle

Move those two static inlines into the !__ASSEMBLY__ section so that they
don't confuse the asm build pass.

Fixes: 42e4089c ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

CVE-2018-3620
CVE-2018-3646

[smb: Reviewed and accepted two fuzz hunks]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

fa535113

Documentation: Add section about CPU vulnerabilities · 30a9dc08

Thomas Gleixner authored Jul 13, 2018

Add documentation for the L1TF vulnerability and the mitigation mechanisms:

  - Explain the problem and risks
  - Document the mitigation mechanisms
  - Document the command line controls
  - Document the sysfs files
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20180713142323.287429944@linutronix.de

CVE-2018-3620
CVE-2018-3646

[smb: Added document as Documentation/security/l1tf.txt instead.]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

30a9dc08

x86/bugs, kvm: Introduce boot-time control of L1TF mitigations · 7700d76e

Jiri Kosina authored Jul 13, 2018

Introduce the 'l1tf=' kernel command line option to allow for boot-time
switching of mitigation that is used on processors affected by L1TF.

The possible values are:

  full
	Provides all available mitigations for the L1TF vulnerability. Disables
	SMT and enables all mitigations in the hypervisors. SMT control via
	/sys/devices/system/cpu/smt/control is still possible after boot.
	Hypervisors will issue a warning when the first VM is started in
	a potentially insecure configuration, i.e. SMT enabled or L1D flush
	disabled.

  full,force
	Same as 'full', but disables SMT control. Implies the 'nosmt=force'
	command line option. sysfs control of SMT and the hypervisor flush
	control is disabled.

  flush
	Leaves SMT enabled and enables the conditional hypervisor mitigation.
	Hypervisors will issue a warning when the first VM is started in a
	potentially insecure configuration, i.e. SMT enabled or L1D flush
	disabled.

  flush,nosmt
	Disables SMT and enables the conditional hypervisor mitigation. SMT
	control via /sys/devices/system/cpu/smt/control is still possible
	after boot. If SMT is reenabled or flushing disabled at runtime
	hypervisors will issue a warning.

  flush,nowarn
	Same as 'flush', but hypervisors will not warn when
	a VM is started in a potentially insecure configuration.

  off
	Disables hypervisor mitigations and doesn't emit any warnings.

Default is 'flush'.

Let KVM adhere to these semantics, which means:

  - 'lt1f=full,force'	: Performe L1D flushes. No runtime control
    			  possible.

  - 'l1tf=full'
  - 'l1tf-flush'
  - 'l1tf=flush,nosmt'	: Perform L1D flushes and warn on VM start if
			  SMT has been runtime enabled or L1D flushing
			  has been run-time enabled

  - 'l1tf=flush,nowarn'	: Perform L1D flushes and no warnings are emitted.

  - 'l1tf=off'		: L1D flushes are not performed and no warnings
			  are emitted.

KVM can always override the L1D flushing behavior using its 'vmentry_l1d_flush'
module parameter except when lt1f=full,force is set.

This makes KVM's private 'nosmt' option redundant, and as it is a bit
non-systematic anyway (this is something to control globally, not on
hypervisor level), remove that option.

Add the missing Documentation entry for the l1tf vulnerability sysfs file
while at it.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20180713142323.202758176@linutronix.de

CVE-2018-3620
CVE-2018-3646

[smb: Minor context adjustments and adapt location of l1tf doc.]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

7700d76e

cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early · 94353874

Thomas Gleixner authored Jul 13, 2018

The CPU_SMT_NOT_SUPPORTED state is set (if the processor does not support
SMT) when the sysfs SMT control file is initialized.

That was fine so far as this was only required to make the output of the
control file correct and to prevent writes in that case.

With the upcoming l1tf command line parameter, this needs to be set up
before the L1TF mitigation selection and command line parsing happens.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20180713142323.121795971@linutronix.de

CVE-2018-3620
CVE-2018-3646
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

94353874