1. 15 Aug, 2018 19 commits
    • Masami Hiramatsu's avatar
      kprobes/x86: Fix %p uses in error messages · 866234c3
      Masami Hiramatsu authored
      commit 0ea06330 upstream.
      
      Remove all %p uses in error messages in kprobes/x86.
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David S . Miller <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jon Medhurst <tixy@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Tobin C . Harding <me@tobin.cc>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: acme@kernel.org
      Cc: akpm@linux-foundation.org
      Cc: brueckner@linux.vnet.ibm.com
      Cc: linux-arch@vger.kernel.org
      Cc: rostedt@goodmis.org
      Cc: schwidefsky@de.ibm.com
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/lkml/152491902310.9916.13355297638917767319.stgit@devboxSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      866234c3
    • Jiri Kosina's avatar
      x86/speculation: Protect against userspace-userspace spectreRSB · 7744abbe
      Jiri Kosina authored
      commit fdf82a78 upstream.
      
      The article "Spectre Returns! Speculation Attacks using the Return Stack
      Buffer" [1] describes two new (sub-)variants of spectrev2-like attacks,
      making use solely of the RSB contents even on CPUs that don't fallback to
      BTB on RSB underflow (Skylake+).
      
      Mitigate userspace-userspace attacks by always unconditionally filling RSB on
      context switch when the generic spectrev2 mitigation has been enabled.
      
      [1] https://arxiv.org/pdf/1807.07940.pdfSigned-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: default avatarTim Chen <tim.c.chen@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1807261308190.997@cbobk.fhfr.pmSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7744abbe
    • Peter Zijlstra's avatar
      x86/paravirt: Fix spectre-v2 mitigations for paravirt guests · 8dbce8a2
      Peter Zijlstra authored
      commit 5800dc5c upstream.
      
      Nadav reported that on guests we're failing to rewrite the indirect
      calls to CALLEE_SAVE paravirt functions. In particular the
      pv_queued_spin_unlock() call is left unpatched and that is all over the
      place. This obviously wrecks Spectre-v2 mitigation (for paravirt
      guests) which relies on not actually having indirect calls around.
      
      The reason is an incorrect clobber test in paravirt_patch_call(); this
      function rewrites an indirect call with a direct call to the _SAME_
      function, there is no possible way the clobbers can be different
      because of this.
      
      Therefore remove this clobber check. Also put WARNs on the other patch
      failure case (not enough room for the instruction) which I've not seen
      trigger in my (limited) testing.
      
      Three live kernel image disassemblies for lock_sock_nested (as a small
      function that illustrates the problem nicely). PRE is the current
      situation for guests, POST is with this patch applied and NATIVE is with
      or without the patch for !guests.
      
      PRE:
      
      (gdb) disassemble lock_sock_nested
      Dump of assembler code for function lock_sock_nested:
         0xffffffff817be970 <+0>:     push   %rbp
         0xffffffff817be971 <+1>:     mov    %rdi,%rbp
         0xffffffff817be974 <+4>:     push   %rbx
         0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
         0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
         0xffffffff817be981 <+17>:    mov    %rbx,%rdi
         0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
         0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
         0xffffffff817be98f <+31>:    test   %eax,%eax
         0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
         0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
         0xffffffff817be99d <+45>:    mov    %rbx,%rdi
         0xffffffff817be9a0 <+48>:    callq  *0xffffffff822299e8
         0xffffffff817be9a7 <+55>:    pop    %rbx
         0xffffffff817be9a8 <+56>:    pop    %rbp
         0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
         0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
         0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063ae0 <__local_bh_enable_ip>
         0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
         0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
         0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
      End of assembler dump.
      
      POST:
      
      (gdb) disassemble lock_sock_nested
      Dump of assembler code for function lock_sock_nested:
         0xffffffff817be970 <+0>:     push   %rbp
         0xffffffff817be971 <+1>:     mov    %rdi,%rbp
         0xffffffff817be974 <+4>:     push   %rbx
         0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
         0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
         0xffffffff817be981 <+17>:    mov    %rbx,%rdi
         0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
         0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
         0xffffffff817be98f <+31>:    test   %eax,%eax
         0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
         0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
         0xffffffff817be99d <+45>:    mov    %rbx,%rdi
         0xffffffff817be9a0 <+48>:    callq  0xffffffff810a0c20 <__raw_callee_save___pv_queued_spin_unlock>
         0xffffffff817be9a5 <+53>:    xchg   %ax,%ax
         0xffffffff817be9a7 <+55>:    pop    %rbx
         0xffffffff817be9a8 <+56>:    pop    %rbp
         0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
         0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
         0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063aa0 <__local_bh_enable_ip>
         0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
         0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
         0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
      End of assembler dump.
      
      NATIVE:
      
      (gdb) disassemble lock_sock_nested
      Dump of assembler code for function lock_sock_nested:
         0xffffffff817be970 <+0>:     push   %rbp
         0xffffffff817be971 <+1>:     mov    %rdi,%rbp
         0xffffffff817be974 <+4>:     push   %rbx
         0xffffffff817be975 <+5>:     lea    0x88(%rbp),%rbx
         0xffffffff817be97c <+12>:    callq  0xffffffff819f7160 <_cond_resched>
         0xffffffff817be981 <+17>:    mov    %rbx,%rdi
         0xffffffff817be984 <+20>:    callq  0xffffffff819fbb00 <_raw_spin_lock_bh>
         0xffffffff817be989 <+25>:    mov    0x8c(%rbp),%eax
         0xffffffff817be98f <+31>:    test   %eax,%eax
         0xffffffff817be991 <+33>:    jne    0xffffffff817be9ba <lock_sock_nested+74>
         0xffffffff817be993 <+35>:    movl   $0x1,0x8c(%rbp)
         0xffffffff817be99d <+45>:    mov    %rbx,%rdi
         0xffffffff817be9a0 <+48>:    movb   $0x0,(%rdi)
         0xffffffff817be9a3 <+51>:    nopl   0x0(%rax)
         0xffffffff817be9a7 <+55>:    pop    %rbx
         0xffffffff817be9a8 <+56>:    pop    %rbp
         0xffffffff817be9a9 <+57>:    mov    $0x200,%esi
         0xffffffff817be9ae <+62>:    mov    $0xffffffff817be993,%rdi
         0xffffffff817be9b5 <+69>:    jmpq   0xffffffff81063ae0 <__local_bh_enable_ip>
         0xffffffff817be9ba <+74>:    mov    %rbp,%rdi
         0xffffffff817be9bd <+77>:    callq  0xffffffff817be8c0 <__lock_sock>
         0xffffffff817be9c2 <+82>:    jmp    0xffffffff817be993 <lock_sock_nested+35>
      End of assembler dump.
      
      
      Fixes: 63f70270 ("[PATCH] i386: PARAVIRT: add common patching machinery")
      Fixes: 3010a066 ("x86/paravirt, objtool: Annotate indirect calls")
      Reported-by: default avatarNadav Amit <namit@vmware.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarJuergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8dbce8a2
    • Oleksij Rempel's avatar
      ARM: dts: imx6sx: fix irq for pcie bridge · 916a5789
      Oleksij Rempel authored
      commit 1bcfe056 upstream.
      
      Use the correct IRQ line for the MSI controller in the PCIe host
      controller. Apparently a different IRQ line is used compared to other
      i.MX6 variants. Without this change MSI IRQs aren't properly propagated
      to the upstream interrupt controller.
      Signed-off-by: default avatarOleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: default avatarLucas Stach <l.stach@pengutronix.de>
      Fixes: b1d17f68 ("ARM: dts: imx: add initial imx6sx device tree source")
      Signed-off-by: default avatarShawn Guo <shawnguo@kernel.org>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      916a5789
    • Michael Mera's avatar
      IB/ocrdma: fix out of bounds access to local buffer · 45c679be
      Michael Mera authored
      commit 062d0f22 upstream.
      
      In write to debugfs file 'resource_stats' the local buffer 'tmp_str' is
      written at index 'count-1' where 'count' is the size of the write, so
      potentially 0.
      
      This patch filters odd values for the write size/position to avoid this
      type of problem.
      Signed-off-by: default avatarMichael Mera <dev@michaelmera.com>
      Reviewed-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarAmit Pundir <amit.pundir@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      45c679be
    • Jack Morgenstein's avatar
      IB/mlx4: Mark user MR as writable if actual virtual memory is writable · d803aa2f
      Jack Morgenstein authored
      commit d8f9cc32 upstream.
      
      To allow rereg_user_mr to modify the MR from read-only to writable without
      using get_user_pages again, we needed to define the initial MR as writable.
      However, this was originally done unconditionally, without taking into
      account the writability of the underlying virtual memory.
      
      As a result, any attempt to register a read-only MR over read-only
      virtual memory failed.
      
      To fix this, do not add the writable flag bit when the user virtual memory
      is not writable (e.g. const memory).
      
      However, when the underlying memory is NOT writable (and we therefore
      do not define the initial MR as writable), the IB core adds a
      "force writable" flag to its user-pages request. If this succeeds,
      the reg_user_mr caller gets a writable copy of the original pages.
      
      If the user-space caller then does a rereg_user_mr operation to enable
      writability, this will succeed. This should not be allowed, since
      the original virtual memory was not writable.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 9376932d ("IB/mlx4_ib: Add support for user MR re-registration")
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d803aa2f
    • Jack Morgenstein's avatar
      IB/core: Make testing MR flags for writability a static inline function · 01b377d3
      Jack Morgenstein authored
      commit 08bb558a upstream.
      
      Make the MR writability flags check, which is performed in umem.c,
      a static inline function in file ib_verbs.h
      
      This allows the function to be used by low-level infiniband drivers.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      01b377d3
    • Al Viro's avatar
      fix __legitimize_mnt()/mntput() race · b9341f5a
      Al Viro authored
      commit 119e1ef8 upstream.
      
      __legitimize_mnt() has two problems - one is that in case of success
      the check of mount_lock is not ordered wrt preceding increment of
      refcount, making it possible to have successful __legitimize_mnt()
      on one CPU just before the otherwise final mntpu() on another,
      with __legitimize_mnt() not seeing mntput() taking the lock and
      mntput() not seeing the increment done by __legitimize_mnt().
      Solved by a pair of barriers.
      
      Another is that failure of __legitimize_mnt() on the second
      read_seqretry() leaves us with reference that'll need to be
      dropped by caller; however, if that races with final mntput()
      we can end up with caller dropping rcu_read_lock() and doing
      mntput() to release that reference - with the first mntput()
      having freed the damn thing just as rcu_read_lock() had been
      dropped.  Solution: in "do mntput() yourself" failure case
      grab mount_lock, check if MNT_DOOMED has been set by racing
      final mntput() that has missed our increment and if it has -
      undo the increment and treat that as "failure, caller doesn't
      need to drop anything" case.
      
      It's not easy to hit - the final mntput() has to come right
      after the first read_seqretry() in __legitimize_mnt() *and*
      manage to miss the increment done by __legitimize_mnt() before
      the second read_seqretry() in there.  The things that are almost
      impossible to hit on bare hardware are not impossible on SMP
      KVM, though...
      Reported-by: default avatarOleg Nesterov <oleg@redhat.com>
      Fixes: 48a066e7 ("RCU'd vsfmounts")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b9341f5a
    • Al Viro's avatar
      fix mntput/mntput race · a3ababd5
      Al Viro authored
      commit 9ea0a46c upstream.
      
      mntput_no_expire() does the calculation of total refcount under mount_lock;
      unfortunately, the decrement (as well as all increments) are done outside
      of it, leading to false positives in the "are we dropping the last reference"
      test.  Consider the following situation:
      	* mnt is a lazy-umounted mount, kept alive by two opened files.  One
      of those files gets closed.  Total refcount of mnt is 2.  On CPU 42
      mntput(mnt) (called from __fput()) drops one reference, decrementing component
      	* After it has looked at component #0, the process on CPU 0 does
      mntget(), incrementing component #0, gets preempted and gets to run again -
      on CPU 69.  There it does mntput(), which drops the reference (component #69)
      and proceeds to spin on mount_lock.
      	* On CPU 42 our first mntput() finishes counting.  It observes the
      decrement of component #69, but not the increment of component #0.  As the
      result, the total it gets is not 1 as it should've been - it's 0.  At which
      point we decide that vfsmount needs to be killed and proceed to free it and
      shut the filesystem down.  However, there's still another opened file
      on that filesystem, with reference to (now freed) vfsmount, etc. and we are
      screwed.
      
      It's not a wide race, but it can be reproduced with artificial slowdown of
      the mnt_get_count() loop, and it should be easier to hit on SMP KVM setups.
      
      Fix consists of moving the refcount decrement under mount_lock; the tricky
      part is that we want (and can) keep the fast case (i.e. mount that still
      has non-NULL ->mnt_ns) entirely out of mount_lock.  All places that zero
      mnt->mnt_ns are dropping some reference to mnt and they call synchronize_rcu()
      before that mntput().  IOW, if mntput() observes (under rcu_read_lock())
      a non-NULL ->mnt_ns, it is guaranteed that there is another reference yet to
      be dropped.
      Reported-by: default avatarJann Horn <jannh@google.com>
      Tested-by: default avatarJann Horn <jannh@google.com>
      Fixes: 48a066e7 ("RCU'd vsfmounts")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a3ababd5
    • Al Viro's avatar
      root dentries need RCU-delayed freeing · ba744147
      Al Viro authored
      commit 90bad5e0 upstream.
      
      Since mountpoint crossing can happen without leaving lazy mode,
      root dentries do need the same protection against having their
      memory freed without RCU delay as everything else in the tree.
      
      It's partially hidden by RCU delay between detaching from the
      mount tree and dropping the vfsmount reference, but the starting
      point of pathwalk can be on an already detached mount, in which
      case umount-caused RCU delay has already passed by the time the
      lazy pathwalk grabs rcu_read_lock().  If the starting point
      happens to be at the root of that vfsmount *and* that vfsmount
      covers the entire filesystem, we get trouble.
      
      Fixes: 48a066e7 ("RCU'd vsfmounts")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ba744147
    • Bart Van Assche's avatar
      scsi: sr: Avoid that opening a CD-ROM hangs with runtime power management enabled · 6aef4c4a
      Bart Van Assche authored
      commit 1214fd7b upstream.
      
      Surround scsi_execute() calls with scsi_autopm_get_device() and
      scsi_autopm_put_device(). Note: removing sr_mutex protection from the
      scsi_cd_get() and scsi_cd_put() calls is safe because the purpose of
      sr_mutex is to serialize cdrom_*() calls.
      
      This patch avoids that complaints similar to the following appear in the
      kernel log if runtime power management is enabled:
      
      INFO: task systemd-udevd:650 blocked for more than 120 seconds.
           Not tainted 4.18.0-rc7-dbg+ #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      systemd-udevd   D28176   650    513 0x00000104
      Call Trace:
      __schedule+0x444/0xfe0
      schedule+0x4e/0xe0
      schedule_preempt_disabled+0x18/0x30
      __mutex_lock+0x41c/0xc70
      mutex_lock_nested+0x1b/0x20
      __blkdev_get+0x106/0x970
      blkdev_get+0x22c/0x5a0
      blkdev_open+0xe9/0x100
      do_dentry_open.isra.19+0x33e/0x570
      vfs_open+0x7c/0xd0
      path_openat+0x6e3/0x1120
      do_filp_open+0x11c/0x1c0
      do_sys_open+0x208/0x2d0
      __x64_sys_openat+0x59/0x70
      do_syscall_64+0x77/0x230
      entry_SYSCALL_64_after_hwframe+0x49/0xbe
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@wdc.com>
      Cc: Maurizio Lombardi <mlombard@redhat.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: <stable@vger.kernel.org>
      Tested-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6aef4c4a
    • Hans de Goede's avatar
      ACPI / LPSS: Add missing prv_offset setting for byt/cht PWM devices · 277131ba
      Hans de Goede authored
      commit fdcb613d upstream.
      
      The LPSS PWM device on on Bay Trail and Cherry Trail devices has a set
      of private registers at offset 0x800, the current lpss_device_desc for
      them already sets the LPSS_SAVE_CTX flag to have these saved/restored
      over device-suspend, but the current lpss_device_desc was not setting
      the prv_offset field, leading to the regular device registers getting
      saved/restored instead.
      
      This is causing the PWM controller to no longer work, resulting in a black
      screen,  after a suspend/resume on systems where the firmware clears the
      APB clock and reset bits at offset 0x804.
      
      This commit fixes this by properly setting prv_offset to 0x800 for
      the PWM devices.
      
      Cc: stable@vger.kernel.org
      Fixes: e1c74817 ("ACPI / LPSS: Add Intel BayTrail ACPI mode PWM")
      Fixes: 1bfbd8eb ("ACPI / LPSS: Add ACPI IDs for Intel Braswell")
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Acked-by: default avatarRafael J . Wysocki <rjw@rjwysocki.net>
      Signed-off-by: default avatarThierry Reding <thierry.reding@gmail.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      277131ba
    • Juergen Gross's avatar
      xen/netfront: don't cache skb_shinfo() · 6b1f6243
      Juergen Gross authored
      commit d472b3a6 upstream.
      
      skb_shinfo() can change when calling __pskb_pull_tail(): Don't cache
      its return value.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Reviewed-by: default avatarWei Liu <wei.liu2@citrix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b1f6243
    • John David Anglin's avatar
      parisc: Define mb() and add memory barriers to assembler unlock sequences · 277b161b
      John David Anglin authored
      commit fedb8da9 upstream.
      
      For years I thought all parisc machines executed loads and stores in
      order. However, Jeff Law recently indicated on gcc-patches that this is
      not correct. There are various degrees of out-of-order execution all the
      way back to the PA7xxx processor series (hit-under-miss). The PA8xxx
      series has full out-of-order execution for both integer operations, and
      loads and stores.
      
      This is described in the following article:
      http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml
      
      For this reason, we need to define mb() and to insert a memory barrier
      before the store unlocking spinlocks. This ensures that all memory
      accesses are complete prior to unlocking. The ldcw instruction performs
      the same function on entry.
      Signed-off-by: default avatarJohn David Anglin <dave.anglin@bell.net>
      Cc: stable@vger.kernel.org # 4.0+
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      277b161b
    • Helge Deller's avatar
      parisc: Enable CONFIG_MLONGCALLS by default · a9252a70
      Helge Deller authored
      commit 66509a27 upstream.
      
      Enable the -mlong-calls compiler option by default, because otherwise in most
      cases linking the vmlinux binary fails due to truncations of R_PARISC_PCREL22F
      relocations. This fixes building the 64-bit defconfig.
      
      Cc: stable@vger.kernel.org # 4.0+
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a9252a70
    • Kees Cook's avatar
      fork: unconditionally clear stack on fork · 1e400642
      Kees Cook authored
      commit e01e8063 upstream.
      
      One of the classes of kernel stack content leaks[1] is exposing the
      contents of prior heap or stack contents when a new process stack is
      allocated.  Normally, those stacks are not zeroed, and the old contents
      remain in place.  In the face of stack content exposure flaws, those
      contents can leak to userspace.
      
      Fixing this will make the kernel no longer vulnerable to these flaws, as
      the stack will be wiped each time a stack is assigned to a new process.
      There's not a meaningful change in runtime performance; it almost looks
      like it provides a benefit.
      
      Performing back-to-back kernel builds before:
      	Run times: 157.86 157.09 158.90 160.94 160.80
      	Mean: 159.12
      	Std Dev: 1.54
      
      and after:
      	Run times: 159.31 157.34 156.71 158.15 160.81
      	Mean: 158.46
      	Std Dev: 1.46
      
      Instead of making this a build or runtime config, Andy Lutomirski
      recommended this just be enabled by default.
      
      [1] A noisy search for many kinds of stack content leaks can be seen here:
      https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux+kernel+stack+leak
      
      I did some more with perf and cycle counts on running 100,000 execs of
      /bin/true.
      
      before:
      Cycles: 218858861551 218853036130 214727610969 227656844122 224980542841
      Mean:  221015379122.60
      Std Dev: 4662486552.47
      
      after:
      Cycles: 213868945060 213119275204 211820169456 224426673259 225489986348
      Mean:  217745009865.40
      Std Dev: 5935559279.99
      
      It continues to look like it's faster, though the deviation is rather
      wide, but I'm not sure what I could do that would be less noisy.  I'm
      open to ideas!
      
      Link: http://lkml.kernel.org/r/20180221021659.GA37073@beastSigned-off-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [ Srivatsa: Backported to 4.4.y ]
      Signed-off-by: default avatarSrivatsa S. Bhat <srivatsa@csail.mit.edu>
      Reviewed-by: default avatarSrinidhi Rao <srinidhir@vmware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e400642
    • Thomas Egerer's avatar
      ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV · e424bee2
      Thomas Egerer authored
      commit 32b6170c upstream.
      
      The ESP algorithms using CBC mode require echainiv. Hence INET*_ESP have
      to select CRYPTO_ECHAINIV in order to work properly. This solves the
      issues caused by a misconfiguration as described in [1].
      The original approach, patching crypto/Kconfig was turned down by
      Herbert Xu [2].
      
      [1] https://lists.strongswan.org/pipermail/users/2015-December/009074.html
      [2] http://marc.info/?l=linux-crypto-vger&m=145224655809562&w=2Signed-off-by: default avatarThomas Egerer <hakke_007@gmx.de>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Yongqin Liu <yongqin.liu@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e424bee2
    • Tadeusz Struk's avatar
      tpm: fix race condition in tpm_common_write() · 215f36e1
      Tadeusz Struk authored
      commit 3ab2011e upstream.
      
      There is a race condition in tpm_common_write function allowing
      two threads on the same /dev/tpm<N>, or two different applications
      on the same /dev/tpmrm<N> to overwrite each other commands/responses.
      Fixed this by taking the priv->buffer_mutex early in the function.
      
      Also converted the priv->data_pending from atomic to a regular size_t
      type. There is no need for it to be atomic since it is only touched
      under the protection of the priv->buffer_mutex.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarTadeusz Struk <tadeusz.struk@intel.com>
      Reviewed-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      215f36e1
    • Theodore Ts'o's avatar
      ext4: fix check to prevent initializing reserved inodes · 7736fced
      Theodore Ts'o authored
      commit 50122847 upstream.
      
      Commit 8844618d: "ext4: only look at the bg_flags field if it is
      valid" will complain if block group zero does not have the
      EXT4_BG_INODE_ZEROED flag set.  Unfortunately, this is not correct,
      since a freshly created file system has this flag cleared.  It gets
      almost immediately after the file system is mounted read-write --- but
      the following somewhat unlikely sequence will end up triggering a
      false positive report of a corrupted file system:
      
         mkfs.ext4 /dev/vdc
         mount -o ro /dev/vdc /vdc
         mount -o remount,rw /dev/vdc
      
      Instead, when initializing the inode table for block group zero, test
      to make sure that itable_unused count is not too large, since that is
      the case that will result in some or all of the reserved inodes
      getting cleared.
      
      This fixes the failures reported by Eric Whiteney when running
      generic/230 and generic/231 in the the nojournal test case.
      
      Fixes: 8844618d ("ext4: only look at the bg_flags field if it is valid")
      Reported-by: default avatarEric Whitney <enwlinux@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7736fced
  2. 09 Aug, 2018 13 commits
  3. 06 Aug, 2018 8 commits