1. 29 Aug, 2017 3 commits
    • Ying Huang's avatar
      smp: Avoid using two cache lines for struct call_single_data · 966a9671
      Ying Huang authored
      struct call_single_data is used in IPIs to transfer information between
      CPUs.  Its size is bigger than sizeof(unsigned long) and less than
      cache line size.  Currently it is not allocated with any explicit alignment
      requirements.  This makes it possible for allocated call_single_data to
      cross two cache lines, which results in double the number of the cache lines
      that need to be transferred among CPUs.
      
      This can be fixed by requiring call_single_data to be aligned with the
      size of call_single_data. Currently the size of call_single_data is the
      power of 2.  If we add new fields to call_single_data, we may need to
      add padding to make sure the size of new definition is the power of 2
      as well.
      
      Fortunately, this is enforced by GCC, which will report bad sizes.
      
      To set alignment requirements of call_single_data to the size of
      call_single_data, a struct definition and a typedef is used.
      
      To test the effect of the patch, I used the vm-scalability multiple
      thread swap test case (swap-w-seq-mt).  The test will create multiple
      threads and each thread will eat memory until all RAM and part of swap
      is used, so that huge number of IPIs are triggered when unmapping
      memory.  In the test, the throughput of memory writing improves ~5%
      compared with misaligned call_single_data, because of faster IPIs.
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarHuang, Ying <ying.huang@intel.com>
      [ Add call_single_data_t and align with size of call_single_data. ]
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/87bmnqd6lz.fsf@yhuang-mobile.sh.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      966a9671
    • Peter Zijlstra's avatar
      locking/lockdep: Untangle xhlock history save/restore from task independence · f52be570
      Peter Zijlstra authored
      Where XHLOCK_{SOFT,HARD} are save/restore points in the xhlocks[] to
      ensure the temporal IRQ events don't interact with task state, the
      XHLOCK_PROC is a fundament different beast that just happens to share
      the interface.
      
      The purpose of XHLOCK_PROC is to annotate independent execution inside
      one task. For example workqueues, each work should appear to run in its
      own 'pristine' 'task'.
      
      Remove XHLOCK_PROC in favour of its own interface to avoid confusion.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boqun.feng@gmail.com
      Cc: david@fromorbit.com
      Cc: johannes@sipsolutions.net
      Cc: kernel-team@lge.com
      Cc: oleg@redhat.com
      Cc: tj@kernel.org
      Link: http://lkml.kernel.org/r/20170829085939.ggmb6xiohw67micb@hirez.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f52be570
    • Ingo Molnar's avatar
      locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being · 7b3d61cc
      Ingo Molnar authored
      Mike Galbraith bisected a boot crash back to the following commit:
      
        7a46ec0e ("locking/refcounts, x86/asm: Implement fast refcount overflow protection")
      
      The crash/hang pattern is:
      
       > Symptom is a few splats as below, with box finally hanging.  Network
       > comes up, but neither ssh nor console login is possible.
       >
       >  ------------[ cut here ]------------
       >  WARNING: CPU: 4 PID: 0 at net/netlink/af_netlink.c:374 netlink_sock_destruct+0x82/0xa0
       >  ...
       >  __sk_destruct()
       >  rcu_process_callbacks()
       >  __do_softirq()
       >  irq_exit()
       >  smp_apic_timer_interrupt()
       >  apic_timer_interrupt()
      
      We are at -rc7 already, and the code has grown some dependencies, so
      instead of a plain revert disable the config temporarily, in the hope
      of getting real fixes.
      Reported-by: default avatarMike Galbraith <efault@gmx.de>
      Tested-by: default avatarMike Galbraith <efault@gmx.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/tip-7a46ec0e2f4850407de5e1d19a44edee6efa58ec@git.kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7b3d61cc
  2. 25 Aug, 2017 7 commits
    • Jiri Slaby's avatar
      futex: Remove duplicated code and fix undefined behaviour · 30d6e0a4
      Jiri Slaby authored
      There is code duplicated over all architecture's headers for
      futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
      and comparison of the result.
      
      Remove this duplication and leave up to the arches only the needed
      assembly which is now in arch_futex_atomic_op_inuser.
      
      This effectively distributes the Will Deacon's arm64 fix for undefined
      behaviour reported by UBSAN to all architectures. The fix was done in
      commit 5f16a046 (arm64: futex: Fix undefined behaviour with
      FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.
      
      And as suggested by Thomas, check for negative oparg too, because it was
      also reported to cause undefined behaviour report.
      
      Note that s390 removed access_ok check in d12a2970 ("s390/uaccess:
      remove pointless access_ok() checks") as access_ok there returns true.
      We introduce it back to the helper for the sake of simplicity (it gets
      optimized away anyway).
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
      Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
      Reviewed-by: default avatarDarren Hart (VMware) <dvhart@infradead.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64]
      Cc: linux-mips@linux-mips.org
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: peterz@infradead.org
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: sparclinux@vger.kernel.org
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-hexagon@vger.kernel.org
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-xtensa@linux-xtensa.org
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: openrisc@lists.librecores.org
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-parisc@vger.kernel.org
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: linux-alpha@vger.kernel.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
      30d6e0a4
    • Peter Zijlstra's avatar
      Documentation/locking/atomic: Finish the document... · ca110694
      Peter Zijlstra authored
      Julia reported that the document looked unfinished, and it is. I
      forgot to include the example cooked up by Paul here:
      
        https://lkml.kernel.org/r/20170731174345.GL3730@linux.vnet.ibm.com
      
      and I added an explicit example showing how, while it is an ACQUIRE
      pattern, it really does provide an MB.
      Reported-by: default avatarJulia Cartwright <julia@ni.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ca110694
    • Peter Zijlstra's avatar
      locking/lockdep: Fix workqueue crossrelease annotation · e6f3faa7
      Peter Zijlstra authored
      The new completion/crossrelease annotations interact unfavourable with
      the extant flush_work()/flush_workqueue() annotations.
      
      The problem is that when a single work class does:
      
        wait_for_completion(&C)
      
      and
      
        complete(&C)
      
      in different executions, we'll build dependencies like:
      
        lock_map_acquire(W)
        complete_acquire(C)
      
      and
      
        lock_map_acquire(W)
        complete_release(C)
      
      which results in the dependency chain: W->C->W, which lockdep thinks
      spells deadlock, even though there is no deadlock potential since
      works are ran concurrently.
      
      One possibility would be to change the work 'lock' to recursive-read,
      but that would mean hitting a lockdep limitation on recursive locks.
      Also, unconditinoally switching to recursive-read here would fail to
      detect the actual deadlock on single-threaded workqueues, which do
      have a problem with this.
      
      For now, forcefully disregard these locks for crossrelease.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boqun.feng@gmail.com
      Cc: byungchul.park@lge.com
      Cc: david@fromorbit.com
      Cc: johannes@sipsolutions.net
      Cc: oleg@redhat.com
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e6f3faa7
    • Peter Zijlstra's avatar
      workqueue/lockdep: 'Fix' flush_work() annotation · a1d14934
      Peter Zijlstra authored
      The flush_work() annotation as introduced by commit:
      
        e159489b ("workqueue: relax lockdep annotation on flush_work()")
      
      hits on the lockdep problem with recursive read locks.
      
      The situation as described is:
      
      Work W1:                Work W2:        Task:
      
      ARR(Q)                  ARR(Q)		flush_workqueue(Q)
      A(W1)                   A(W2)             A(Q)
        flush_work(W2)			  R(Q)
          A(W2)
          R(W2)
          if (special)
            A(Q)
          else
            ARR(Q)
          R(Q)
      
      where: A - acquire, ARR - acquire-read-recursive, R - release.
      
      Where under 'special' conditions we want to trigger a lock recursion
      deadlock, but otherwise allow the flush_work(). The allowing is done
      by using recursive read locks (ARR), but lockdep is broken for
      recursive stuff.
      
      However, there appears to be no need to acquire the lock if we're not
      'special', so if we remove the 'else' clause things become much
      simpler and no longer need the recursion thing at all.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boqun.feng@gmail.com
      Cc: byungchul.park@lge.com
      Cc: david@fromorbit.com
      Cc: johannes@sipsolutions.net
      Cc: oleg@redhat.com
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      a1d14934
    • Peter Zijlstra's avatar
      locking/lockdep/selftests: Add mixed read-write ABBA tests · e9149858
      Peter Zijlstra authored
      Currently lockdep has limited support for recursive readers, add a few
      mixed read-write ABBA selftests to show the extend of these
      limitations.
      
        [    0.000000] ----------------------------------------------------------------------------
        [    0.000000]                                  | spin |wlock |rlock |mutex | wsem | rsem |
        [    0.000000]   --------------------------------------------------------------------------
      
        [    0.000000]   mixed read-lock/lock-write ABBA:             |FAILED|             |  ok  |
        [    0.000000]    mixed read-lock/lock-read ABBA:             |  ok  |             |  ok  |
        [    0.000000]  mixed write-lock/lock-write ABBA:             |  ok  |             |  ok  |
      
      This clearly illustrates the case where lockdep fails to find a
      deadlock.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boqun.feng@gmail.com
      Cc: byungchul.park@lge.com
      Cc: david@fromorbit.com
      Cc: johannes@sipsolutions.net
      Cc: oleg@redhat.com
      Cc: tj@kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e9149858
    • Peter Zijlstra's avatar
      mm, locking/barriers: Clarify tlb_flush_pending() barriers · 0e709703
      Peter Zijlstra authored
      Better document the ordering around tlb_flush_pending().
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      0e709703
    • Ingo Molnar's avatar
      10c9850c
  3. 24 Aug, 2017 17 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · 90a6cd50
      Linus Torvalds authored
      Pull more rdma fixes from Doug Ledford:
       "Well, I thought we were going to be done for this -rc cycle. I should
        have known better than to say so though.
      
        We have four additional items that trickled in.
      
        One was a simple mistake on my part. I took a patch into my for-next
        thinking that the issue was less severe than it was. I was then
        notified that it needed to be in my -rc area instead.
      
        The other three were just found late in testing.
      
        Summary:
      
         - One core fix accidentally applied first to for-next and then cherry
           picked back because it needed to be in the -rc cycles instead
      
         - Another core fix
      
         - Two mlx5 fixes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
        IB/mlx5: Always return success for RoCE modify port
        IB/mlx5: Fix Raw Packet QP event handler assignment
        IB/core: Avoid accessing non-allocated memory when inferring port type
        RDMA/uverbs: Initialize cq_context appropriately
      90a6cd50
    • Linus Torvalds's avatar
      Merge tag 'acpi-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 4898b99c
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These fix two recent regressions (in ACPICA and in the ACPI EC driver)
        and one bug in code introduced during the 4.12 cycle (ACPI device
        properties library routine).
      
        Specifics:
      
         - Fix a regression in the ACPI EC driver causing a kernel to crash
           during initialization on some systems due to a code ordering issue
           exposed by a recent change (Lv Zheng).
      
         - Fix a recent regression in ACPICA due to a change of the behavior
           of a library function in a way that is not backwards compatible
           with some existing callers of it (Rafael Wysocki).
      
         - Fix a coding mistake in a library function related to the handling
           of ACPI device properties introduced during the 4.12 cycle (Sakari
           Ailus)"
      
      * tag 'acpi-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: device property: Fix node lookup in acpi_graph_get_child_prop_value()
        ACPICA: Fix acpi_evaluate_object_typed()
        ACPI: EC: Fix regression related to wrong ECDT initialization order
      4898b99c
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v4.13' of... · f7bbf075
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - fix linker script regression caused by dead code elimination support
      
       - fix typos and outdated comments
      
       - specify kselftest-clean as a PHONY target
      
       - fix "make dtbs_install" when $(srctree) includes shell special
         characters like '~'
      
       - Move -fshort-wchar to the global option list because defining it
         partially emits warnings
      
      * tag 'kbuild-fixes-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: update comments of Makefile.asm-generic
        kbuild: Do not use hyphen in exported variable name
        Makefile: add kselftest-clean to PHONY target list
        Kbuild: use -fshort-wchar globally
        fixdep: trivial: typo fix and correction
        kbuild: trivial cleanups on the comments
        kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured
      f7bbf075
    • Linus Torvalds's avatar
      Merge branch 'for-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · b71a5e3f
      Linus Torvalds authored
      Pull btrfs fix from David Sterba:
       "We have one more fixup that stems from the blk_status_t conversion
        that did not quite cover everything.
      
        The normal cases were not affected because the code is 0, but any
        error and retries could mix up new and old values"
      
      * 'for-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        Btrfs: fix blk_status_t/errno confusion
      b71a5e3f
    • Linus Torvalds's avatar
      Merge tag 'trace-v4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 415be6c2
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "Various bug fixes:
      
         - Two small memory leaks in error paths.
      
         - A missed return error code on an error path.
      
         - A fix to check the tracing ring buffer CPU when it doesn't exist
           (caused by setting maxcpus on the command line that is less than
           the actual number of CPUs, and then onlining them manually).
      
         - A fix to have the reset of boot tracers called by lateinit_sync()
           instead of just lateinit(). As some of the tracers register via
           lateinit(), and if the clear happens before the tracer is
           registered, it will never start even though it was told to via the
           kernel command line"
      
      * tag 'trace-v4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Fix freeing of filter in create_filter() when set_str is false
        tracing: Fix kmemleak in tracing_map_array_free()
        ftrace: Check for null ret_stack on profile function graph entry function
        ring-buffer: Have ring_buffer_alloc_read_page() return error on offline CPU
        tracing: Missing error code in tracer_alloc_buffers()
        tracing: Call clear_boot_tracer() at lateinit_sync
      415be6c2
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 1cffe595
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "A small number of bugfixes, again nothing serious.
      
         - Alexander Dahl found multiple bugs in the Atmel memory interface
           driver
      
         - A randconfig build fix for at91 was incomplete, the second attempt
           fixes the remaining corner case
      
         - One fix for the TI Keystone queue handler
      
         - The Odroid XU4 HDMI port (added in 4.13) needs a small DT fix"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: dts: exynos: add needs-hpd for Odroid-XU3/4
        ARM: at91: don't select CONFIG_ARM_CPU_SUSPEND for old platforms
        soc: ti: knav: Add a NULL pointer check for kdev in knav_pool_create
        memory: atmel-ebi: Fix smc cycle xlate converter
        memory: atmel-ebi: Allow t_DF timings of zero ns
        memory: atmel-ebi: Fix smc timing return value evaluation
      1cffe595
    • Eric W. Biederman's avatar
      pty: Repair TIOCGPTPEER · 311fc65c
      Eric W. Biederman authored
      The implementation of TIOCGPTPEER has two issues.
      
      When /dev/ptmx (as opposed to /dev/pts/ptmx) is opened the wrong
      vfsmount is passed to dentry_open.  Which results in the kernel displaying
      the wrong pathname for the peer.
      
      The second is simply by caching the vfsmount and dentry of the peer it leaves
      them open, in a way they were not previously Which because of the inreased
      reference counts can cause unnecessary behaviour differences resulting in
      regressions.
      
      To fix these move the ioctl into tty_io.c at a generic level allowing
      the ioctl to have access to the struct file on which the ioctl is
      being called.  This allows the path of the slave to be derived when
      opening the slave through TIOCGPTPEER instead of requiring the path to
      the slave be cached.  Thus removing the need for caching the path.
      
      A new function devpts_ptmx_path is factored out of devpts_acquire and
      used to implement a function devpts_mntget.   The new function devpts_mntget
      takes a filp to perform the lookup on and fsi so that it can confirm
      that the superblock that is found by devpts_ptmx_path is the proper superblock.
      
      v2: Lots of fixes to make the code actually work
      v3: Suggestions by Linus
          - Removed the unnecessary initialization of filp in ptm_open_peer
          - Simplified devpts_ptmx_path as gotos are no longer required
      
      [ This is the fix for the issue that was reverted in commit
        143c97cc, but this time without breaking 'pbuilder' due to
        increased reference counts   - Linus ]
      
      Fixes: 54ebbfb1 ("tty: add TIOCGPTPEER ioctl")
      Reported-by: default avatarChristian Brauner <christian.brauner@canonical.com>
      Reported-and-tested-by: default avatarStefan Lippers-Hollmann <s.l-h@gmx.de>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      311fc65c
    • Rafael J. Wysocki's avatar
      Merge branches 'acpica-fix', 'acpi-ec-fix' and 'acpi-properties-fix' · d5d6c1dd
      Rafael J. Wysocki authored
      * acpica-fix:
        ACPICA: Fix acpi_evaluate_object_typed()
      
      * acpi-ec-fix:
        ACPI: EC: Fix regression related to wrong ECDT initialization order
      
      * acpi-properties-fix:
        ACPI: device property: Fix node lookup in acpi_graph_get_child_prop_value()
      d5d6c1dd
    • Majd Dibbiny's avatar
      IB/mlx5: Always return success for RoCE modify port · ec255879
      Majd Dibbiny authored
      CM layer calls ib_modify_port() regardless of the link layer.
      
      For the Ethernet ports, qkey violation and Port capabilities
      are meaningless. Therefore, always return success for ib_modify_port
      calls on the Ethernet ports.
      
      Cc: Selvin Xavier <selvin.xavier@broadcom.com>
      Signed-off-by: default avatarMajd Dibbiny <majd@mellanox.com>
      Reviewed-by: default avatarMoni Shoua <monis@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      ec255879
    • Majd Dibbiny's avatar
      IB/mlx5: Fix Raw Packet QP event handler assignment · 1d31e9c0
      Majd Dibbiny authored
      In case we have SQ and RQ for Raw Packet QP, the SQ's event handler
      wasn't assigned.
      
      Fixing this by assigning event handler for each WQ after creation.
      
      [ 1877.145243] Call Trace:
      [ 1877.148644] <IRQ>
      [ 1877.150580] [<ffffffffa07987c5>] ? mlx5_rsc_event+0x105/0x210 [mlx5_core]
      [ 1877.159581] [<ffffffffa0795bd7>] ? mlx5_cq_event+0x57/0xd0 [mlx5_core]
      [ 1877.167137] [<ffffffffa079208e>] mlx5_eq_int+0x53e/0x6c0 [mlx5_core]
      [ 1877.174526] [<ffffffff8101a679>] ? sched_clock+0x9/0x10
      [ 1877.180753] [<ffffffff810f717e>] handle_irq_event_percpu+0x3e/0x1e0
      [ 1877.188014] [<ffffffff810f735d>] handle_irq_event+0x3d/0x60
      [ 1877.194567] [<ffffffff810f9fe7>] handle_edge_irq+0x77/0x130
      [ 1877.201129] [<ffffffff81014c3f>] handle_irq+0xbf/0x150
      [ 1877.207244] [<ffffffff815ed78a>] ? atomic_notifier_call_chain+0x1a/0x20
      [ 1877.214829] [<ffffffff815f434f>] do_IRQ+0x4f/0xf0
      [ 1877.220498] [<ffffffff815e94ad>] common_interrupt+0x6d/0x6d
      [ 1877.227025] <EOI>
      [ 1877.228967] [<ffffffff814834e2>] ? cpuidle_enter_state+0x52/0xc0
      [ 1877.236990] [<ffffffff81483615>] cpuidle_idle_call+0xc5/0x200
      [ 1877.243676] [<ffffffff8101bc7e>] arch_cpu_idle+0xe/0x30
      [ 1877.249831] [<ffffffff810b4725>] cpu_startup_entry+0xf5/0x290
      [ 1877.256513] [<ffffffff815cfee1>] start_secondary+0x265/0x27b
      [ 1877.263111] Code: Bad RIP value.
      [ 1877.267296] RIP [< (null)>] (null)
      [ 1877.273264] RSP <ffff88046fd63df8>
      [ 1877.277531] CR2: 0000000000000000
      
      Fixes: 19098df2 ("IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types")
      Signed-off-by: default avatarMajd Dibbiny <majd@mellanox.com>
      Reviewed-by: default avatarYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      1d31e9c0
    • Noa Osherovich's avatar
      IB/core: Avoid accessing non-allocated memory when inferring port type · 498ca3c8
      Noa Osherovich authored
      Commit 44c58487 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
      introduced the concept of type in ah_attr:
       * During ib_register_device, each port is checked for its type which
         is stored in ib_device's port_immutable array.
       * During uverbs' modify_qp, the type is inferred using the port number
         in ib_uverbs_qp_dest struct (address vector) by accessing the
         relevant port_immutable array and the type is passed on to
         providers.
      
      IB spec (version 1.3) enforces a valid port value only in Reset to
      Init. During Init to RTR, the address vector must be valid but port
      number is not mentioned as a field in the address vector, so its
      value is not validated, which leads to accesses to a non-allocated
      memory when inferring the port type.
      
      Save the real port number in ib_qp during modify to Init (when the
      comp_mask indicates that the port number is valid) and use this value
      to infer the port type.
      
      Avoid copying the address vector fields if the matching bit is not set
      in the attr_mask. Address vector can't be modified before the port, so
      no valid flow is affected.
      
      Fixes: 44c58487 ('IB/core: Define 'ib' and 'roce' rdma_ah_attr types')
      Signed-off-by: default avatarNoa Osherovich <noaos@mellanox.com>
      Reviewed-by: default avatarYishai Hadas <yishaih@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      498ca3c8
    • Omar Sandoval's avatar
      Btrfs: fix blk_status_t/errno confusion · 58efbc9f
      Omar Sandoval authored
      This fixes several instances of blk_status_t and bare errno ints being
      mixed up, some of which are real bugs.
      
      In the normal case, 0 matches BLK_STS_OK, so we don't observe any
      effects of the missing conversion, but in case of errors or passes
      through the repair/retry paths, the errors get mixed up.
      
      The changes were identified using 'sparse', we don't have reports of the
      buggy behaviour.
      
      Fixes: 4e4cbee9 ("block: switch bios to blk_status_t")
      Signed-off-by: default avatarOmar Sandoval <osandov@fb.com>
      Reviewed-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      58efbc9f
    • Cao jin's avatar
    • Steven Rostedt (VMware)'s avatar
      tracing: Fix freeing of filter in create_filter() when set_str is false · 8b0db1a5
      Steven Rostedt (VMware) authored
      Performing the following task with kmemleak enabled:
      
       # cd /sys/kernel/tracing/events/irq/irq_handler_entry/
       # echo 'enable_event:kmem:kmalloc:3 if irq >' > trigger
       # echo 'enable_event:kmem:kmalloc:3 if irq > 31' > trigger
       # echo scan > /sys/kernel/debug/kmemleak
       # cat /sys/kernel/debug/kmemleak
      unreferenced object 0xffff8800b9290308 (size 32):
        comm "bash", pid 1114, jiffies 4294848451 (age 141.139s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff81cef5aa>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff81357938>] kmem_cache_alloc_trace+0x158/0x290
          [<ffffffff81261c09>] create_filter_start.constprop.28+0x99/0x940
          [<ffffffff812639c9>] create_filter+0xa9/0x160
          [<ffffffff81263bdc>] create_event_filter+0xc/0x10
          [<ffffffff812655e5>] set_trigger_filter+0xe5/0x210
          [<ffffffff812660c4>] event_enable_trigger_func+0x324/0x490
          [<ffffffff812652e2>] event_trigger_write+0x1a2/0x260
          [<ffffffff8138cf87>] __vfs_write+0xd7/0x380
          [<ffffffff8138f421>] vfs_write+0x101/0x260
          [<ffffffff8139187b>] SyS_write+0xab/0x130
          [<ffffffff81cfd501>] entry_SYSCALL_64_fastpath+0x1f/0xbe
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      The function create_filter() is passed a 'filterp' pointer that gets
      allocated, and if "set_str" is true, it is up to the caller to free it, even
      on error. The problem is that the pointer is not freed by create_filter()
      when set_str is false. This is a bug, and it is not up to the caller to free
      the filter on error if it doesn't care about the string.
      
      Link: http://lkml.kernel.org/r/1502705898-27571-2-git-send-email-chuhu@redhat.com
      
      Cc: stable@vger.kernel.org
      Fixes: 38b78eb8 ("tracing: Factorize filter creation")
      Reported-by: default avatarChunyu Hu <chuhu@redhat.com>
      Tested-by: default avatarChunyu Hu <chuhu@redhat.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      8b0db1a5
    • Chunyu Hu's avatar
      tracing: Fix kmemleak in tracing_map_array_free() · 475bb3c6
      Chunyu Hu authored
      kmemleak reported the below leak when I was doing clear of the hist
      trigger. With this patch, the kmeamleak is gone.
      
      unreferenced object 0xffff94322b63d760 (size 32):
        comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
        hex dump (first 32 bytes):
          00 01 00 00 04 00 00 00 08 00 00 00 ff 00 00 00  ................
          10 00 00 00 00 00 00 00 80 a8 7a f2 31 94 ff ff  ..........z.1...
        backtrace:
          [<ffffffff9e96c27a>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff9e424cba>] kmem_cache_alloc_trace+0xca/0x1d0
          [<ffffffff9e377736>] tracing_map_array_alloc+0x26/0x140
          [<ffffffff9e261be0>] kretprobe_trampoline+0x0/0x50
          [<ffffffff9e38b935>] create_hist_data+0x535/0x750
          [<ffffffff9e38bd47>] event_hist_trigger_func+0x1f7/0x420
          [<ffffffff9e38893d>] event_trigger_write+0xfd/0x1a0
          [<ffffffff9e44dfc7>] __vfs_write+0x37/0x170
          [<ffffffff9e44f552>] vfs_write+0xb2/0x1b0
          [<ffffffff9e450b85>] SyS_write+0x55/0xc0
          [<ffffffff9e203857>] do_syscall_64+0x67/0x150
          [<ffffffff9e977ce7>] return_from_SYSCALL_64+0x0/0x6a
          [<ffffffffffffffff>] 0xffffffffffffffff
      unreferenced object 0xffff9431f27aa880 (size 128):
        comm "bash", pid 1522, jiffies 4403687962 (age 2442.311s)
        hex dump (first 32 bytes):
          00 00 8c 2a 32 94 ff ff 00 f0 8b 2a 32 94 ff ff  ...*2......*2...
          00 e0 8b 2a 32 94 ff ff 00 d0 8b 2a 32 94 ff ff  ...*2......*2...
        backtrace:
          [<ffffffff9e96c27a>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff9e425348>] __kmalloc+0xe8/0x220
          [<ffffffff9e3777c1>] tracing_map_array_alloc+0xb1/0x140
          [<ffffffff9e261be0>] kretprobe_trampoline+0x0/0x50
          [<ffffffff9e38b935>] create_hist_data+0x535/0x750
          [<ffffffff9e38bd47>] event_hist_trigger_func+0x1f7/0x420
          [<ffffffff9e38893d>] event_trigger_write+0xfd/0x1a0
          [<ffffffff9e44dfc7>] __vfs_write+0x37/0x170
          [<ffffffff9e44f552>] vfs_write+0xb2/0x1b0
          [<ffffffff9e450b85>] SyS_write+0x55/0xc0
          [<ffffffff9e203857>] do_syscall_64+0x67/0x150
          [<ffffffff9e977ce7>] return_from_SYSCALL_64+0x0/0x6a
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      Link: http://lkml.kernel.org/r/1502705898-27571-1-git-send-email-chuhu@redhat.com
      
      Cc: stable@vger.kernel.org
      Fixes: 08d43a5f ("tracing: Add lock-free tracing_map")
      Signed-off-by: default avatarChunyu Hu <chuhu@redhat.com>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      475bb3c6
    • Steven Rostedt (VMware)'s avatar
      ftrace: Check for null ret_stack on profile function graph entry function · a8f0f9e4
      Steven Rostedt (VMware) authored
      There's a small race when function graph shutsdown and the calling of the
      registered function graph entry callback. The callback must not reference
      the task's ret_stack without first checking that it is not NULL. Note, when
      a ret_stack is allocated for a task, it stays allocated until the task exits.
      The problem here, is that function_graph is shutdown, and a new task was
      created, which doesn't have its ret_stack allocated. But since some of the
      functions are still being traced, the callbacks can still be called.
      
      The normal function_graph code handles this, but starting with commit
      8861dd30 ("ftrace: Access ret_stack->subtime only in the function
      profiler") the profiler code references the ret_stack on function entry, but
      doesn't check if it is NULL first.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=196611
      
      Cc: stable@vger.kernel.org
      Fixes: 8861dd30 ("ftrace: Access ret_stack->subtime only in the function profiler")
      Reported-by: lilydjwg@gmail.com
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      a8f0f9e4
    • Linus Torvalds's avatar
      Revert "pty: fix the cached path of the pty slave file descriptor in the master" · 143c97cc
      Linus Torvalds authored
      This reverts commit c8c03f18.
      
      It turns out that while fixing the ptmx file descriptor to have the
      correct 'struct path' to the associated slave pty is a really good
      thing, it breaks some user space tools for a very annoying reason.
      
      The problem is that /dev/ptmx and its associated slave pty (/dev/pts/X)
      are on different mounts.  That was what caused us to have the wrong path
      in the first place (we would mix up the vfsmount of the 'ptmx' node,
      with the dentry of the pty slave node), but it also means that now while
      we use the right vfsmount, having the pty master open also keeps the pts
      mount busy.
      
      And it turn sout that that makes 'pbuilder' very unhappy, as noted by
      Stefan Lippers-Hollmann:
      
       "This patch introduces a regression for me when using pbuilder
        0.228.7[2] (a helper to build Debian packages in a chroot and to
        create and update its chroots) when trying to umount /dev/ptmx (inside
        the chroot) on Debian/ unstable (full log and pbuilder configuration
        file[3] attached).
      
        [...]
        Setting up build-essential (12.3) ...
        Processing triggers for libc-bin (2.24-15) ...
        I: unmounting dev/ptmx filesystem
        W: Could not unmount dev/ptmx: umount: /var/cache/pbuilder/build/1340/dev/ptmx: target is busy
                (In some cases useful info about processes that
                 use the device is found by lsof(8) or fuser(1).)"
      
      apparently pbuilder tries to unmount the /dev/pts filesystem while still
      holding at least one master node open, which is arguably not very nice,
      but we don't break user space even when fixing other bugs.
      
      So this commit has to be reverted.
      
      I'll try to figure out a way to avoid caching the path to the slave pty
      in the master pty.  The only thing that actually wants that slave pty
      path is the "TIOCGPTPEER" ioctl, and I think we could just recreate the
      path at that time.
      Reported-by: default avatarStefan Lippers-Hollmann <s.l-h@gmx.de>
      Cc: Eric W Biederman <ebiederm@xmission.com>
      Cc: Christian Brauner <christian.brauner@canonical.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      143c97cc
  4. 23 Aug, 2017 6 commits
    • Hans Verkuil's avatar
      ARM: dts: exynos: add needs-hpd for Odroid-XU3/4 · 93a4c835
      Hans Verkuil authored
      CEC support was added for Exynos5 in 4.13, but for the Odroids we need to set
      'needs-hpd' as well since CEC is disabled when there is no HDMI hotplug signal,
      just as for the exynos4 Odroid-U3.
      
      This is due to the level-shifter that is disabled when there is no HPD, thus
      blocking the CEC signal as well. Same close-but-no-cigar board design as the
      Odroid-U3.
      
      Tested with my Odroid XU4.
      Signed-off-by: default avatarHans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      93a4c835
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 2acf097f
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Late arm64 fixes.
      
        They fix very early boot failures with KASLR where the early mapping
        of the kernel is incorrect, so the failure mode looks like a hang with
        no output. There's also a signal-handling fix when a uaccess routine
        faults with a fatal signal pending, which could be used to create
        unkillable user tasks using userfaultfd and finally a state leak fix
        for the floating pointer registers across a call to exec().
      
        We're still seeing some random issues crop up (inode memory corruption
        and spinlock recursion) but we've not managed to reproduce things
        reliably enough to debug or bisect them yet.
      
        Summary:
      
         - Fix very early boot failures with KASLR enabled
      
         - Fix fatal signal handling on userspace access from kernel
      
         - Fix leakage of floating point register state across exec()"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: kaslr: Adjust the offset to avoid Image across alignment boundary
        arm64: kaslr: ignore modulo offset when validating virtual displacement
        arm64: mm: abort uaccess retries upon fatal signal
        arm64: fpsimd: Prevent registers leaking across exec
      2acf097f
    • Linus Torvalds's avatar
      Merge tag 'gpio-v4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · a67ca1e9
      Linus Torvalds authored
      Pull GPIO fixes from Linus Walleij:
       "Here are the (hopefully) last GPIO fixes for v4.13:
      
         - an important core fix to reject invalid GPIOs *before* trying to
           obtain a GPIO descriptor for it.
      
         - a driver fix for the mvebu driver IRQ handling"
      
      * tag 'gpio-v4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        gpio: mvebu: Fix cause computation in irq handler
        gpio: reject invalid gpio before getting gpio_desc
      a67ca1e9
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 55652400
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Six minor and error leg fixes, plus one major change: the reversion of
        scsi-mq as the default.
      
        We're doing the latter temporarily (with a backport to stable) to give
        us time to fix all the issues that turned up with this default before
        trying again"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: cxgb4i: call neigh_event_send() to update MAC address
        Revert "scsi: default to scsi-mq"
        scsi: sd_zbc: Write unlock zone from sd_uninit_cmnd()
        scsi: aacraid: Fix out of bounds in aac_get_name_resp
        scsi: csiostor: fail probe if fw does not support FCoE
        scsi: megaraid_sas: fix error handle in megasas_probe_one
      55652400
    • Arnd Bergmann's avatar
      ARM: at91: don't select CONFIG_ARM_CPU_SUSPEND for old platforms · dbeb0c8e
      Arnd Bergmann authored
      My previous patch fixed a link error for all at91 platforms when
      CONFIG_ARM_CPU_SUSPEND was not set, however this caused another
      problem on a configuration that enabled CONFIG_ARCH_AT91 but none
      of the individual SoCs, and that also enabled CPU_ARM720 as
      the only CPU:
      
      warning: (ARCH_AT91 && SOC_IMX23 && SOC_IMX28 && ARCH_PXA && MACH_MVEBU_V7 && SOC_IMX6 && ARCH_OMAP3 && ARCH_OMAP4 && SOC_OMAP5 && SOC_AM33XX && SOC_DRA7XX && ARCH_EXYNOS3 && ARCH_EXYNOS4 && EXYNOS5420_MCPM && EXYNOS_CPU_SUSPEND && ARCH_VEXPRESS_TC2_PM && ARM_BIG_LITTLE_CPUIDLE && ARM_HIGHBANK_CPUIDLE && QCOM_PM) selects ARM_CPU_SUSPEND which has unmet direct dependencies (ARCH_SUSPEND_POSSIBLE)
      arch/arm/kernel/sleep.o: In function `cpu_resume':
      (.text+0xf0): undefined reference to `cpu_arm720_suspend_size'
      arch/arm/kernel/suspend.o: In function `__cpu_suspend_save':
      suspend.c:(.text+0x134): undefined reference to `cpu_arm720_do_suspend'
      
      This improves the hack some more by only selecting ARM_CPU_SUSPEND
      for the part that requires it, and changing pm.c to drop the
      contents of unused init functions so we no longer refer to
      cpu_resume on at91 platforms that don't need it.
      
      Fixes: cc7a938f ("ARM: at91: select CONFIG_ARM_CPU_SUSPEND")
      Acked-by: default avatarAlexandre Belloni <alexandre.belloni@free-electrons.com>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      dbeb0c8e
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 98b9f8a4
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Fix a clang build regression and an potential xattr corruption bug"
      
      * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: add missing xattr hash update
        ext4: fix clang build regression
      98b9f8a4
  5. 22 Aug, 2017 7 commits