  1. 05 Aug, 2018 11 commits
    • x86/speculation: Simplify sysfs report of VMX L1TF vulnerability · ea156d19
      Paolo Bonzini authored
      Three changes to the content of the sysfs file:
      
       - If EPT is disabled, L1TF cannot be exploited even across threads on the
         same core, and SMT is irrelevant.
      
       - If mitigation is completely disabled, and SMT is enabled, print "vulnerable"
         instead of "vulnerable, SMT vulnerable"
      
       - Reorder the two parts so that the main vulnerability state comes first
         and the detail on SMT is second.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
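
The three reporting rules above can be sketched in user-space C. This is a minimal model, not the kernel code: the enum, the function name `l1tf_vmx_report()` and the exact strings are assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified model of the VMX mitigation states. */
enum vmx_l1tf_state {
    VMX_FLUSH_NEVER,   /* mitigation completely disabled */
    VMX_FLUSH_COND,    /* conditional L1D flushes */
    VMX_FLUSH_ALWAYS,  /* L1D flush on every VMENTER */
    VMX_EPT_DISABLED,  /* EPT off: SMT is irrelevant */
};

/* Assumed wording; the real sysfs strings may differ. */
static const char *const vmx_state_str[] = {
    [VMX_FLUSH_NEVER]  = "vulnerable",
    [VMX_FLUSH_COND]   = "conditional cache flushes",
    [VMX_FLUSH_ALWAYS] = "cache flushes",
    [VMX_EPT_DISABLED] = "EPT disabled",
};

/* Main vulnerability state first, SMT detail second (when it matters). */
int l1tf_vmx_report(char *buf, size_t len, enum vmx_l1tf_state state,
                    bool smt_enabled)
{
    /* No SMT detail if EPT is disabled, or if the mitigation is off
     * and SMT is enabled (plain "vulnerable" says it all). */
    if (state == VMX_EPT_DISABLED ||
        (state == VMX_FLUSH_NEVER && smt_enabled))
        return snprintf(buf, len, "VMX: %s", vmx_state_str[state]);

    return snprintf(buf, len, "VMX: %s, SMT %s", vmx_state_str[state],
                    smt_enabled ? "vulnerable" : "disabled");
}
```

With these assumed strings, a host running conditional flushes with SMT enabled would report the flush mode first and "SMT vulnerable" second.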
    • Documentation/l1tf: Remove Yonah processors from not vulnerable list · 58331136
      Thomas Gleixner authored
      Dave reported that it is not confirmed that Yonah processors are
      unaffected. Remove them from the list.
      Reported-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr() · 18b57ce2
      Nicolai Stange authored
      For VMEXITs caused by external interrupts, vmx_handle_external_intr()
      indirectly calls into the interrupt handlers through the host's IDT.
      
      It follows that these interrupts get accounted for in the
      kvm_cpu_l1tf_flush_l1d per-cpu flag.
      
      The subsequently executed vmx_l1d_flush() will thus be aware that some
      interrupts have happened and conduct an L1D flush anyway.
      
      Setting l1tf_flush_l1d from vmx_handle_external_intr() isn't needed
      anymore. Drop it.
      Signed-off-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d · ffcba43f
      Nicolai Stange authored
      The last missing piece to having vmx_l1d_flush() take interrupts after
      VMEXIT into account is to set the kvm_cpu_l1tf_flush_l1d per-cpu flag on
      irq entry.
      
      Issue calls to kvm_set_cpu_l1tf_flush_l1d() from entering_irq(),
      ipi_entering_ack_irq(), smp_reschedule_interrupt() and
      uv_bau_message_interrupt().
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: Don't include linux/irq.h from asm/hardirq.h · 447ae316
      Nicolai Stange authored
      The next patch in this series will have to make the definition of
      irq_cpustat_t available to entering_irq().
      
      Inclusion of asm/hardirq.h into asm/apic.h would cause circular header
      dependencies like
      
        asm/smp.h
          asm/apic.h
            asm/hardirq.h
              linux/irq.h
                linux/topology.h
                  linux/smp.h
                    asm/smp.h
      
      or
      
        linux/gfp.h
          linux/mmzone.h
            asm/mmzone.h
              asm/mmzone_64.h
                asm/smp.h
                  asm/apic.h
                    asm/hardirq.h
                      linux/irq.h
                        linux/irqdesc.h
                          linux/kobject.h
                            linux/sysfs.h
                              linux/kernfs.h
                                linux/idr.h
                                  linux/gfp.h
      
      and others.
      
      This causes compilation errors because of the header guards becoming
      effective in the second inclusion: symbols/macros that had been defined
      before wouldn't be available to intermediate headers in the #include chain
      anymore.
      
      A possible workaround would be to move the definition of irq_cpustat_t
      into its own header and include that from both asm/hardirq.h and
      asm/apic.h.
      
      However, this wouldn't solve the real problem, namely asm/hardirq.h
      unnecessarily pulling in all the linux/irq.h cruft: nothing in
      asm/hardirq.h itself requires it. Also, note that some other
      archs, e.g. arm64, don't have that #include in their
      asm/hardirq.h.
      
      Remove the linux/irq.h #include from x86's asm/hardirq.h.
      
      Fix resulting compilation errors by adding appropriate #includes to *.c
      files as needed.
      
      Note that some of these *.c files could be cleaned up a bit with respect
      to their set of #includes, but that is better done in separate patches, if
      at all.
      Signed-off-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d · 45b575c0
      Nicolai Stange authored
      Part of the L1TF mitigation for vmx includes flushing the L1D cache upon
      VMENTRY.
      
      L1D flushes are costly and two modes of operations are provided to users:
      "always" and the more selective "conditional" mode.
      
      If operating in the latter, the cache would get flushed only if a host side
      code path considered unconfined had been traversed. "Unconfined" in this
      context means that it might have pulled in sensitive data like user data
      or kernel crypto keys.
      
      The need for L1D flushes is tracked by means of the per-vcpu flag
      l1tf_flush_l1d. KVM exit handlers considered unconfined set it. A
      vmx_l1d_flush() subsequently invoked before the next VMENTER will conduct an
      L1D flush based on its value and reset that flag again.
      
      Currently, interrupts delivered "normally" while in root operation between
      VMEXIT and VMENTER are not taken into account. Part of the reason is that
      these don't leave any traces and thus, the vmx code is unable to tell if
      any such has happened.
      
      As proposed by Paolo Bonzini, prepare for tracking all interrupts by
      introducing a new per-cpu flag, "kvm_cpu_l1tf_flush_l1d". It will be in
      strong analogy to the per-vcpu ->l1tf_flush_l1d.
      
      A later patch will make interrupt handlers set it.
      
      For the sake of cache locality, group kvm_cpu_l1tf_flush_l1d into x86's
      per-cpu irq_cpustat_t as suggested by Peter Zijlstra.
      
      Provide the helpers kvm_set_cpu_l1tf_flush_l1d(),
      kvm_clear_cpu_l1tf_flush_l1d() and kvm_get_cpu_l1tf_flush_l1d(). Make them
      trivial or non-existent, respectively, for !CONFIG_KVM_INTEL as appropriate.
      
      Let vmx_l1d_flush() handle kvm_cpu_l1tf_flush_l1d in the same way as
      l1tf_flush_l1d.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
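
The interplay of the per-vcpu and per-cpu flags in conditional mode can be modeled in plain C. This is a rough sketch under simplifying assumptions: a single CPU, ordinary globals instead of per-cpu data, and a counter in place of the real cache flush. The helper names follow the commit message; `struct vcpu_model`, `model_irq_entry()`, `vmx_l1d_flush_model()` and `flush_cond_mode` are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-vcpu state. */
struct vcpu_model {
    bool l1tf_flush_l1d;  /* set by KVM exit handlers considered unconfined */
};

/* Per-cpu flag, modeled as a plain global for a single CPU. */
static bool kvm_cpu_l1tf_flush_l1d;

bool flush_cond_mode = true;   /* true: "conditional", false: "always" */
unsigned int l1d_flush_count;  /* counts modeled L1D flushes */

static void kvm_set_cpu_l1tf_flush_l1d(void)   { kvm_cpu_l1tf_flush_l1d = true; }
static void kvm_clear_cpu_l1tf_flush_l1d(void) { kvm_cpu_l1tf_flush_l1d = false; }
static bool kvm_get_cpu_l1tf_flush_l1d(void)   { return kvm_cpu_l1tf_flush_l1d; }

/* An interrupt taken between VMEXIT and VMENTER marks the CPU dirty. */
void model_irq_entry(void)
{
    kvm_set_cpu_l1tf_flush_l1d();
}

/* Called before every VMENTER in this model. */
void vmx_l1d_flush_model(struct vcpu_model *vcpu)
{
    if (flush_cond_mode) {
        /* Read and reset both flags; flush only if either was set. */
        bool flush = vcpu->l1tf_flush_l1d;

        vcpu->l1tf_flush_l1d = false;
        flush |= kvm_get_cpu_l1tf_flush_l1d();
        kvm_clear_cpu_l1tf_flush_l1d();

        if (!flush)
            return;
    }
    l1d_flush_count++;  /* stands in for the actual L1D flush */
}
```

In this model an interrupt alone is enough to trigger a flush on the next entry, which is the gap the series sets out to close.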
    • x86/irq: Demote irq_cpustat_t::__softirq_pending to u16 · 9aee5f8a
      Nicolai Stange authored
      An upcoming patch will extend KVM's L1TF mitigation in conditional mode
      to also cover interrupts after VMEXITs. For tracking those, stores to a
      new per-cpu flag from interrupt handlers will become necessary.
      
      In order to improve cache locality, this new flag will be added to x86's
      irq_cpustat_t.
      
      Make some space available there by shrinking the ->softirq_pending bitfield
      from 32 to 16 bits: the number of bits actually used is only NR_SOFTIRQS,
      i.e. 10.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
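
The space argument can be illustrated with a reduced struct. This is only a sketch: the real irq_cpustat_t has more members, and the struct names, the `nmi_count` neighbour and `raise_softirq_model()` are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_SOFTIRQS_MODEL 10  /* value quoted in the commit message */

/* Layout before the change (reduced, hypothetical). */
struct irq_cpustat_old_model {
    uint32_t softirq_pending;
    uint32_t nmi_count;
};

/* Layout after the change: 16 bits are enough for 10 softirqs,
 * and the freed bytes can hold the new flag. */
struct irq_cpustat_model {
    uint16_t softirq_pending;
    uint8_t  kvm_cpu_l1tf_flush_l1d;
    uint32_t nmi_count;
};

/* Every softirq number must still map to a bit in the narrower field. */
_Static_assert(NR_SOFTIRQS_MODEL <= 16, "softirq bits must fit in u16");

/* Mark softirq 'nr' pending; rejects out-of-range numbers. */
bool raise_softirq_model(struct irq_cpustat_model *s, unsigned int nr)
{
    if (nr >= NR_SOFTIRQS_MODEL)
        return false;
    s->softirq_pending |= (uint16_t)(1u << nr);
    return true;
}
```

On common ABIs both model structs have the same size, so in this sketch the new flag costs no additional space.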
    • x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush() · 5b6ccc6c
      Nicolai Stange authored
      Currently, vmx_vcpu_run() checks if l1tf_flush_l1d is set and invokes
      vmx_l1d_flush() if so.
      
      This test is unnecessary for the "always flush L1D" mode.
      
      Move the check to vmx_l1d_flush()'s conditional mode code path.
      
      Notes:
      - vmx_l1d_flush() is likely to get inlined anyway and thus, there's no
        extra function call.
        
      - This inverts the (static) branch prediction, but there hadn't been any
        explicit likely()/unlikely() annotations before and so it stays as is.
      Signed-off-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond' · 427362a1
      Nicolai Stange authored
      The vmx_l1d_flush_always static key is only ever evaluated if
      vmx_l1d_should_flush is enabled. In that case however, there are only two
      L1d flushing modes possible: "always" and "conditional".
      
      The "conditional" mode's implementation tends to require more sophisticated
      logic than the "always" mode.
      
      Avoid inverted logic by replacing the 'vmx_l1d_flush_always' static key
      with a 'vmx_l1d_flush_cond' one.
      
      There is no change in functionality.
      Signed-off-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush() · 379fd0c7
      Nicolai Stange authored
      vmx_l1d_flush() gets invoked only if l1tf_flush_l1d is true. There's no
      point in setting l1tf_flush_l1d to true from there again.
      Signed-off-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  2. 29 Jul, 2018 5 commits
    • Linux 4.18-rc7 · acb18725
      Linus Torvalds authored
    • Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 3cfb6772
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Some miscellaneous ext4 fixes for 4.18; one fix is for a regression
        introduced in 4.18-rc4.
      
        Sorry for the late-breaking pull. I was originally going to wait for
        the next merge window, but Eric Whitney found a regression introduced
        in 4.18-rc4, so I decided to push out the regression plus the other
        fixes now. (The other commits have been baking in linux-next since
        early July)"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: fix check to prevent initializing reserved inodes
        ext4: check for allocation block validity with block group locked
        ext4: fix inline data updates with checksums enabled
        ext4: clear mmp sequence number when remounting read-only
        ext4: fix false negatives *and* false positives in ext4_check_descriptors()
    • squashfs: be more careful about metadata corruption · 01cfb793
      Linus Torvalds authored
      Anatoly Trosinenko reports that a corrupted squashfs image can cause a
      kernel oops.  It turns out that squashfs can end up being confused about
      negative fragment lengths.
      
      The regular squashfs_read_data() does check for negative lengths, but
      squashfs_read_metadata() did not, and the fragment size code just
      blindly trusted the on-disk value.  Fix both the fragment parsing and
      the metadata reading code.
      Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Phillip Lougher <phillip@squashfs.org.uk>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
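
The general idea of not trusting an on-disk length can be sketched as follows. This is not the squashfs fix itself: `checked_on_disk_size()` is a hypothetical helper, and the upper bound is an extra assumption that the commit message does not mention.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate a length field read from an untrusted
 * filesystem image before it is used as a size.
 * Returns the size on success or -EIO if the value is not plausible.
 */
int32_t checked_on_disk_size(int32_t on_disk_size, int32_t max_size)
{
    /* A negative length can only come from a corrupted image. */
    if (on_disk_size < 0)
        return -EIO;

    /* Assumed extra sanity check: reject lengths beyond the caller's limit. */
    if (on_disk_size > max_size)
        return -EIO;

    return on_disk_size;
}
```

A caller would check for a negative return value and fail the read instead of passing the raw field on to the parsing code.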
    • ext4: fix check to prevent initializing reserved inodes · 50122847
      Theodore Ts'o authored
      Commit 8844618d: "ext4: only look at the bg_flags field if it is
      valid" will complain if block group zero does not have the
      EXT4_BG_INODE_ZEROED flag set.  Unfortunately, this is not correct,
      since a freshly created file system has this flag cleared.  It gets set
      almost immediately after the file system is mounted read-write --- but
      the following somewhat unlikely sequence will end up triggering a
      false positive report of a corrupted file system:
      
         mkfs.ext4 /dev/vdc
         mount -o ro /dev/vdc /vdc
         mount -o remount,rw /dev/vdc
      
      Instead, when initializing the inode table for block group zero, test
      to make sure that itable_unused count is not too large, since that is
      the case that will result in some or all of the reserved inodes
      getting cleared.
      
      This fixes the failures reported by Eric Whitney when running
      generic/230 and generic/231 in the nojournal test case.
      
      Fixes: 8844618d ("ext4: only look at the bg_flags field if it is valid")
      Reported-by: Eric Whitney <enwlinux@gmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
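
The replacement test on itable_unused can be sketched like this. The function name and the exact threshold are assumptions for illustration; the commit message only says that the count must not be "too large".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check for block group zero: would zeroing the "unused"
 * tail of the inode table wipe some or all of the reserved inodes?
 *
 * inodes_per_group: inode slots in the group
 * itable_unused:    on-disk count of unused slots at the end of the table
 * first_ino:        first non-reserved inode number (assumed threshold)
 */
bool itable_unused_too_large(uint32_t inodes_per_group,
                             uint32_t itable_unused,
                             uint32_t first_ino)
{
    /* More unused slots than slots in the group is plainly corrupt. */
    if (itable_unused > inodes_per_group)
        return true;

    /* The in-use part at the front must still cover the reserved inodes. */
    return inodes_per_group - itable_unused < first_ino;
}
```

Unlike a test on the EXT4_BG_INODE_ZEROED flag, a check of this kind does not trip on a freshly created file system whose unused count is consistent.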
    • Merge tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random · a26fb01c
      Linus Torvalds authored
      Pull random fixes from Ted Ts'o:
       "In reaction to the fixes to address CVE-2018-1108, some Linux
        distributions that have certain systemd versions in some cases
        combined with patches to libcrypt for FIPS/FEDRAMP compliance, have
        led to boot-time stalls for some hardware.
      
        The reaction by some distros and Linux sysadmins has been to install
        packages that try to do complicated things with the CPU and hope that
        leads to randomness.
      
        To mitigate this, if RDRAND is available, mix it into entropy provided
        by userspace. It won't hurt, and it will probably help"
      
      * tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
        random: mix rdrand with entropy sent in from userspace
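
Mixing a hardware random value into entropy supplied by userspace can be sketched with a simple XOR. This is an illustration, not the kernel's implementation: the callback type, `mix_hw_random()` and the fixed-value stub are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for a hardware RNG such as RDRAND.
 * Returns false if no hardware value is available. */
typedef bool (*hw_rand_fn)(uint32_t *out);

/* XOR one hardware word into each word of the user-supplied buffer.
 * If the hardware source is missing or fails, the rest is left as is. */
void mix_hw_random(uint32_t *buf, size_t nwords, hw_rand_fn hw_rand)
{
    for (size_t i = 0; i < nwords; i++) {
        uint32_t t;

        if (hw_rand == NULL || !hw_rand(&t))
            break;
        buf[i] ^= t;
    }
}

/* Deterministic stand-in for demonstration only; not a random source. */
bool fixed_hw_rand(uint32_t *out)
{
    *out = 0xA5A5A5A5u;
    return true;
}
```

XOR is a natural choice for such a sketch because combining the input with an independent value cannot make it more predictable, which matches the "won't hurt" reasoning in the pull message.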
  3. 28 Jul, 2018 3 commits
  4. 27 Jul, 2018 21 commits