1. 11 Sep, 2018 1 commit
  2. 10 Sep, 2018 8 commits
    • Waiman Long's avatar
      locking/rwsem: Make owner store task pointer of last owning reader · 925b9cd1
      Waiman Long authored
      Currently, when a reader acquires a lock, it only sets the
      RWSEM_READER_OWNED bit in the owner field. The other bits are simply
      not used. When debugging hanging cases involving rwsems and readers,
      the owner value does not provide much useful information at all.
      
      This patch modifies the current behavior to always store the task_struct
      pointer of the last rwsem-acquiring reader in a reader-owned rwsem. This
      may be useful in debugging rwsem hanging cases especially if only one
      reader is involved. However, the task in the owner field may not the
      real owner or one of the real owners at all when the owner value is
      examined, for example, in a crash dump. So it is just an additional
      hint about the past history.
      
      If CONFIG_DEBUG_RWSEMS=y is enabled, the owner field will be checked at
      unlock time too to make sure the task pointer value is valid. That does
      have a slight performance cost and so is only enabled as part of that
      debug option.
      
      From the performance point of view, it is expected that the changes
      shouldn't have any noticeable performance impact. A rwsem microbenchmark
      (with 48 worker threads and 1:1 reader/writer ratio) was ran on a
      2-socket 24-core 48-thread Haswell system.  The locking rates on a
      4.19-rc1 based kernel were as follows:
      
        1) Unpatched kernel:				543.3 kops/s
        2) Patched kernel:				549.2 kops/s
        3) Patched kernel (CONFIG_DEBUG_RWSEMS on):	546.6 kops/s
      
      There was actually a slight increase in performance (1.1%) in this
      particular case. Maybe it was caused by the elimination of a branch or
      just a testing noise. Turning on the CONFIG_DEBUG_RWSEMS option also
      had less than the expected impact on performance.
      
      The least significant 2 bits of the owner value are now used to designate
      the rwsem is readers owned and the owners are anonymous.
      Signed-off-by: default avatarWaiman Long <longman@redhat.com>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1536265114-10842-1-git-send-email-longman@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      925b9cd1
    • Waiman Long's avatar
      locking/rwsem: Exit read lock slowpath if queue empty & no writer · 4b486b53
      Waiman Long authored
      It was discovered that a constant stream of readers with occassional
      writers pounding on a rwsem may cause many of the readers to enter the
      slowpath unnecessarily thus increasing latency and lowering performance.
      
      In the current code, a reader entering the slowpath critical section
      will unconditionally set the WAITING_BIAS, if not set yet, and clear
      its active count even if no one is in the wait queue and no writer
      is present. This causes some incoming readers to observe the presence
      of waiters in the wait queue and hence have to go into the slowpath
      themselves.
      
      With sufficient numbers of readers and a relatively short lock hold time,
      the WAITING_BIAS may be repeatedly turned on and off and a substantial
      portion of the readers will go into the slowpath sustaining a rather
      long queue in the wait queue spinlock and repeated WAITING_BIAS on/off
      cycle until the logjam is broken opportunistically.
      
      To avoid this situation from happening, an additional check is added to
      detect the special case that the reader in the critical section is the
      only one in the wait queue and no writer is present. When that happens,
      it can just exit the slowpath and return immediately as its active count
      has already been set in the lock.  Other incoming readers won't observe
      the presence of waiters and so will not be forced into the slowpath.
      
      The issue was found in a customer site where they had an application
      that pounded on the pread64 syscalls heavily on an XFS filesystem. The
      application was run in a recent 4-socket boxes with a lot of CPUs. They
      saw significant spinlock contention in the rwsem_down_read_failed() call.
      With this patch applied, the system CPU usage went down from 85% to 57%,
      and the spinlock contention in the pread64 syscalls was gone.
      Signed-off-by: default avatarWaiman Long <longman@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarDavidlohr Bueso <dbueso@suse.de>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1532459425-19204-1-git-send-email-longman@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      4b486b53
    • Peter Zijlstra's avatar
      jump_label/lockdep: Assert we hold the hotplug lock for _cpuslocked() operations · cb538267
      Peter Zijlstra authored
      Weirdly we seem to have forgotten this...
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      cb538267
    • Ingo Molnar's avatar
    • Borislav Petkov's avatar
      jump_label: Fix typo in warning message · da260fe1
      Borislav Petkov authored
      There's no 'allocatote' - use the next best thing: 'allocate' :-)
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180907103521.31344-1-bp@alien8.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      da260fe1
    • Thomas Hellstrom's avatar
      locking/mutex: Fix mutex debug call and ww_mutex documentation · e13e2366
      Thomas Hellstrom authored
      The following commit:
      
        08295b3b ("Implement an algorithm choice for Wound-Wait mutexes")
      
      introduced a reference in the documentation to a function that was
      removed in an earlier commit.
      
      It also forgot to remove a call to debug_mutex_add_waiter() which is now
      unconditionally called by __mutex_add_waiter().
      
      Fix those bugs.
      Signed-off-by: default avatarThomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dri-devel@lists.freedesktop.org
      Fixes: 08295b3b ("Implement an algorithm choice for Wound-Wait mutexes")
      Link: http://lkml.kernel.org/r/20180903140708.2401-1-thellstrom@vmware.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e13e2366
    • Borislav Petkov's avatar
      jump_label: Use static_key_linked() accessor · 34e12b86
      Borislav Petkov authored
      ... instead of open-coding it, in static_key_mod().
      
      No functional changes.
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180909114252.17575-1-bp@alien8.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      34e12b86
    • Linus Torvalds's avatar
      Linux 4.19-rc3 · 11da3a7f
      Linus Torvalds authored
      11da3a7f
  3. 09 Sep, 2018 7 commits
  4. 08 Sep, 2018 6 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · f8f65382
      Linus Torvalds authored
      Pull KVM fixes from Radim Krčmář:
       "ARM:
         - Fix a VFP corruption in 32-bit guest
         - Add missing cache invalidation for CoW pages
         - Two small cleanups
      
        s390:
         - Fallout from the hugetlbfs support: pfmf interpretion and locking
         - VSIE: fix keywrapping for nested guests
      
        PPC:
         - Fix a bug where pages might not get marked dirty, causing guest
           memory corruption on migration
         - Fix a bug causing reads from guest memory to use the wrong guest
           real address for very large HPT guests (>256G of memory), leading
           to failures in instruction emulation.
      
        x86:
         - Fix out of bound access from malicious pv ipi hypercalls
           (introduced in rc1)
         - Fix delivery of pending interrupts when entering a nested guest,
           preventing arbitrarily late injection
         - Sanitize kvm_stat output after destroying a guest
         - Fix infinite loop when emulating a nested guest page fault and
           improve the surrounding emulation code
         - Two minor cleanups"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
        KVM: LAPIC: Fix pv ipis out-of-bounds access
        KVM: nVMX: Fix loss of pending IRQ/NMI before entering L2
        arm64: KVM: Remove pgd_lock
        KVM: Remove obsolete kvm_unmap_hva notifier backend
        arm64: KVM: Only force FPEXC32_EL2.EN if trapping FPSIMD
        KVM: arm/arm64: Clean dcache to PoC when changing PTE due to CoW
        KVM: s390: Properly lock mm context allow_gmap_hpage_1m setting
        KVM: s390: vsie: copy wrapping keys to right place
        KVM: s390: Fix pfmf and conditional skey emulation
        tools/kvm_stat: re-animate display of dead guests
        tools/kvm_stat: indicate dead guests as such
        tools/kvm_stat: handle guest removals more gracefully
        tools/kvm_stat: don't reset stats when setting PID filter for debugfs
        tools/kvm_stat: fix updates for dead guests
        tools/kvm_stat: fix handling of invalid paths in debugfs provider
        tools/kvm_stat: fix python3 issues
        KVM: x86: Unexport x86_emulate_instruction()
        KVM: x86: Rename emulate_instruction() to kvm_emulate_instruction()
        KVM: x86: Do not re-{try,execute} after failed emulation in L2
        KVM: x86: Default to not allowing emulation retry in kvm_mmu_page_fault
        ...
      f8f65382
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 0f3aa48a
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A few more fixes who have trickled in:
      
         - MMC bus width fixup for some Allwinner platforms
      
         - Fix for NULL deref in ti-aemif when no platform data is passed in
      
         - Fix div by 0 in SCMI code
      
         - Add a missing module alias in a new RPi driver"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        memory: ti-aemif: fix a potential NULL-pointer dereference
        firmware: arm_scmi: fix divide by zero when sustained_perf_level is zero
        hwmon: rpi: add module alias to raspberrypi-hwmon
        arm64: allwinner: dts: h6: fix Pine H64 MMC bus width
      0f3aa48a
    • Olof Johansson's avatar
      Merge tag 'sunxi-fixes-for-4.19' of... · a132bb90
      Olof Johansson authored
      Merge tag 'sunxi-fixes-for-4.19' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
      
      Allwinner fixes for 4.19
      
      Just one fix for H6 mmc on the Pine H64: the mmc bus width was missing
      from the device tree. This was added in 4.19-rc1.
      
      * tag 'sunxi-fixes-for-4.19' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
        arm64: allwinner: dts: h6: fix Pine H64 MMC bus width
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      a132bb90
    • Nadav Amit's avatar
      x86/mm: Use WRITE_ONCE() when setting PTEs · 9bc4f28a
      Nadav Amit authored
      When page-table entries are set, the compiler might optimize their
      assignment by using multiple instructions to set the PTE. This might
      turn into a security hazard if the user somehow manages to use the
      interim PTE. L1TF does not make our lives easier, making even an interim
      non-present PTE a security hazard.
      
      Using WRITE_ONCE() to set PTEs and friends should prevent this potential
      security hazard.
      
      I skimmed the differences in the binary with and without this patch. The
      differences are (obviously) greater when CONFIG_PARAVIRT=n as more
      code optimizations are possible. For better and worse, the impact on the
      binary with this patch is pretty small. Skimming the code did not cause
      anything to jump out as a security hazard, but it seems that at least
      move_soft_dirty_pte() caused set_pte_at() to use multiple writes.
      Signed-off-by: default avatarNadav Amit <namit@vmware.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180902181451.80520-1-namit@vmware.com
      9bc4f28a
    • Thomas Gleixner's avatar
      x86/apic/vector: Make error return value negative · 47b7360c
      Thomas Gleixner authored
      activate_managed() returns EINVAL instead of -EINVAL in case of
      error. While this is unlikely to happen, the positive return value would
      cause further malfunction at the call site.
      
      Fixes: 2db1f959 ("x86/vector: Handle managed interrupts proper")
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      47b7360c
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · d7b686eb
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
      
       - bugfixes for uniphier, i801, and xiic drivers
      
       - ID removal (never produced) for imx
      
       - one MAINTAINER addition
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: xiic: Record xilinx i2c with Zynq fragment
        i2c: xiic: Make the start and the byte count write atomic
        i2c: i801: fix DNV's SMBCTRL register offset
        i2c: imx-lpi2c: Remove mx8dv compatible entry
        dt-bindings: imx-lpi2c: Remove mx8dv compatible entry
        i2c: uniphier-f: issue STOP only for last message or I2C_M_STOP
        i2c: uniphier: issue STOP only for last message or I2C_M_STOP
      d7b686eb
  5. 07 Sep, 2018 18 commits