1. 06 Dec, 2016 1 commit
    • Dmitry Vyukov's avatar
      lockdep: Fix report formatting · f943fe0f
      Dmitry Vyukov authored
      Since commit:
      
        4bcc595c ("printk: reinstate KERN_CONT for printing continuation lines")
      
      printk() requires KERN_CONT to continue log messages. Lots of printk()
      in lockdep.c and print_ip_sym() don't have it. As the result lockdep
      reports are completely messed up.
      
      Add missing KERN_CONT and inline print_ip_sym() where necessary.
      
      Example of a messed up report:
      
        0-rc5+ #41 Not tainted
        -------------------------------------------------------
        syz-executor0/5036 is trying to acquire lock:
         (
        rtnl_mutex
        ){+.+.+.}
        , at:
        [<ffffffff86b3d6ac>] rtnl_lock+0x1c/0x20
        but task is already holding lock:
         (
        &net->packet.sklist_lock
        ){+.+...}
        , at:
        [<ffffffff873541a6>] packet_diag_dump+0x1a6/0x1920
        which lock already depends on the new lock.
        the existing dependency chain (in reverse order) is:
        -> #3
         (
        &net->packet.sklist_lock
        +.+...}
        ...
      
      Without this patch all scripts that parse kernel bug reports are broken.
      Signed-off-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: andreyknvl@google.com
      Cc: aryabinin@virtuozzo.com
      Cc: joe@perches.com
      Cc: syzkaller@googlegroups.com
      Link: http://lkml.kernel.org/r/1480343083-48731-1-git-send-email-dvyukov@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f943fe0f
  2. 02 Dec, 2016 2 commits
    • Thomas Gleixner's avatar
      locking/rtmutex: Use READ_ONCE() in rt_mutex_owner() · 1be5d4fa
      Thomas Gleixner authored
      While debugging the rtmutex unlock vs. dequeue race Will suggested to use
      READ_ONCE() in rt_mutex_owner() as it might race against the
      cmpxchg_release() in unlock_rt_mutex_safe().
      
      Will: "It's a minor thing which will most likely not matter in practice"
      
      Careful search did not unearth an actual problem in todays code, but it's
      better to be safe than surprised.
      Suggested-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: David Daney <ddaney@caviumnetworks.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20161130210030.431379999@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1be5d4fa
    • Thomas Gleixner's avatar
      locking/rtmutex: Prevent dequeue vs. unlock race · dbb26055
      Thomas Gleixner authored
      David reported a futex/rtmutex state corruption. It's caused by the
      following problem:
      
      CPU0		CPU1		CPU2
      
      l->owner=T1
      		rt_mutex_lock(l)
      		lock(l->wait_lock)
      		l->owner = T1 | HAS_WAITERS;
      		enqueue(T2)
      		boost()
      		  unlock(l->wait_lock)
      		schedule()
      
      				rt_mutex_lock(l)
      				lock(l->wait_lock)
      				l->owner = T1 | HAS_WAITERS;
      				enqueue(T3)
      				boost()
      				  unlock(l->wait_lock)
      				schedule()
      		signal(->T2)	signal(->T3)
      		lock(l->wait_lock)
      		dequeue(T2)
      		deboost()
      		  unlock(l->wait_lock)
      				lock(l->wait_lock)
      				dequeue(T3)
      				  ===> wait list is now empty
      				deboost()
      				 unlock(l->wait_lock)
      		lock(l->wait_lock)
      		fixup_rt_mutex_waiters()
      		  if (wait_list_empty(l)) {
      		    owner = l->owner & ~HAS_WAITERS;
      		    l->owner = owner
      		     ==> l->owner = T1
      		  }
      
      				lock(l->wait_lock)
      rt_mutex_unlock(l)		fixup_rt_mutex_waiters()
      				  if (wait_list_empty(l)) {
      				    owner = l->owner & ~HAS_WAITERS;
      cmpxchg(l->owner, T1, NULL)
       ===> Success (l->owner = NULL)
      				    l->owner = owner
      				     ==> l->owner = T1
      				  }
      
      That means the problem is caused by fixup_rt_mutex_waiters() which does the
      RMW to clear the waiters bit unconditionally when there are no waiters in
      the rtmutexes rbtree.
      
      This can be fatal: A concurrent unlock can release the rtmutex in the
      fastpath because the waiters bit is not set. If the cmpxchg() gets in the
      middle of the RMW operation then the previous owner, which just unlocked
      the rtmutex is set as the owner again when the write takes place after the
      successfull cmpxchg().
      
      The solution is rather trivial: verify that the owner member of the rtmutex
      has the waiters bit set before clearing it. This does not require a
      cmpxchg() or other atomic operations because the waiters bit can only be
      set and cleared with the rtmutex wait_lock held. It's also safe against the
      fast path unlock attempt. The unlock attempt via cmpxchg() will either see
      the bit set and take the slowpath or see the bit cleared and release it
      atomically in the fastpath.
      
      It's remarkable that the test program provided by David triggers on ARM64
      and MIPS64 really quick, but it refuses to reproduce on x86-64, while the
      problem exists there as well. That refusal might explain that this got not
      discovered earlier despite the bug existing from day one of the rtmutex
      implementation more than 10 years ago.
      
      Thanks to David for meticulously instrumenting the code and providing the
      information which allowed to decode this subtle problem.
      Reported-by: default avatarDavid Daney <ddaney@caviumnetworks.com>
      Tested-by: default avatarDavid Daney <david.daney@cavium.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org
      Fixes: 23f78d4a ("[PATCH] pi-futex: rt mutex core")
      Link: http://lkml.kernel.org/r/20161130210030.351136722@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      dbb26055
  3. 25 Nov, 2016 1 commit
  4. 23 Nov, 2016 5 commits
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-4.9-4' of git://git.linux-nfs.org/projects/anna/linux-nfs · 10b9dd56
      Linus Torvalds authored
      Pull NFS client bugfixes from Anna Schumaker:
       "Most of these fix regressions or races, but there is one patch for
        stable that Arnd sent me
      
        Stable bugfix:
         - Hide array-bounds warning
      
        Bugfixes:
         - Keep a reference on lock states while checking
         - Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state
         - Don't call close if the open stateid has already been cleared
         - Fix CLOSE rases with OPEN
         - Fix a regression in DELEGRETURN"
      
      * tag 'nfs-for-4.9-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        NFSv4.x: hide array-bounds warning
        NFSv4.1: Keep a reference on lock states while checking
        NFSv4.1: Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state
        NFSv4: Don't call close if the open stateid has already been cleared
        NFSv4: Fix CLOSE races with OPEN
        NFSv4.1: Fix a regression in DELEGRETURN
      10b9dd56
    • Linus Torvalds's avatar
      Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile · 4d92c8d0
      Linus Torvalds authored
      Pull arch/tile bugfix from Chris Metcalf:
       "This fixes a bug that causes reboots after 208 days of uptime :-)"
      
      * 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
        tile: avoid using clocksource_cyc2ns with absolute cycle count
      4d92c8d0
    • Chris Metcalf's avatar
      tile: avoid using clocksource_cyc2ns with absolute cycle count · e658a6f1
      Chris Metcalf authored
      For large values of "mult" and long uptimes, the intermediate
      result of "cycles * mult" can overflow 64 bits.  For example,
      the tile platform calls clocksource_cyc2ns with a 1.2 GHz clock;
      we have mult = 853, and after 208.5 days, we overflow 64 bits.
      
      Since clocksource_cyc2ns() is intended to be used for relative
      cycle counts, not absolute cycle counts, performance is more
      importance than accepting a wider range of cycle values.  So,
      just use mult_frac() directly in tile's sched_clock().
      
      Commit 4cecf6d4 ("sched, x86: Avoid unnecessary overflow
      in sched_clock") by Salman Qazi results in essentially the same
      generated code for x86 as this change does for tile.  In fact,
      a follow-on change by Salman introduced mult_frac() and switched
      to using it, so the C code was largely identical at that point too.
      
      Peter Zijlstra then added mul_u64_u32_shr() and switched x86
      to use it.  This is, in principle, better; by optimizing the
      64x64->64 multiplies to be 32x32->64 multiplies we can potentially
      save some time.  However, the compiler piplines the 64x64->64
      multiplies pretty well, and the conditional branch in the generic
      mul_u64_u32_shr() causes some bubbles in execution, with the
      result that it's pretty much a wash.  If tilegx provided its own
      implementation of mul_u64_u32_shr() without the conditional branch,
      we could potentially save 3 cycles, but that seems like small gain
      for a fair amount of additional build scaffolding; no other platform
      currently provides a mul_u64_u32_shr() override, and tile doesn't
      currently have an <asm/div64.h> header to put the override in.
      
      Additionally, gcc currently has an optimization bug that prevents
      it from recognizing the opportunity to use a 32x32->64 multiply,
      and so the result would be no better than the existing mult_frac()
      until such time as the compiler is fixed.
      
      For now, just using mult_frac() seems like the right answer.
      
      Cc: stable@kernel.org [v3.4+]
      Signed-off-by: default avatarChris Metcalf <cmetcalf@mellanox.com>
      e658a6f1
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ded9b5dd
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Six fixes for bugs that were found via fuzzing, and a trivial
        hw-enablement patch for AMD Family-17h CPU PMUs"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel/uncore: Allow only a single PMU/box within an events group
        perf/x86/intel: Cure bogus unwind from PEBS entries
        perf/x86: Restore TASK_SIZE check on frame pointer
        perf/core: Fix address filter parser
        perf/x86: Add perf support for AMD family-17h processors
        perf/x86/uncore: Fix crash by removing bogus event_list[] handling for SNB client uncore IMC
        perf/core: Do not set cpuctx->cgrp for unscheduled cgroups
      ded9b5dd
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 23aabe73
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "The last push broke algif_hash for all shash implementations, so this
        is a follow-up to fix that.
      
        This also fixes a problem in the crypto scatterwalk that triggers a
        BUG_ON with certain debugging options due to the new vmalloced-stack
        code"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: scatterwalk - Remove unnecessary aliasing check in map_and_copy
        crypto: algif_hash - Fix result clobbering in recvmsg
      23aabe73
  5. 22 Nov, 2016 13 commits
    • Linus Torvalds's avatar
      Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · 23400ac9
      Linus Torvalds authored
      Pull thermal management fix from Zhang Rui:
       "We only have one urgent fix this time.
      
        Commit 3105f234 ("thermal/powerclamp: correct cpu support check"),
        which is shipped in 4.9-rc3, fixed a problem introduced by commit
        b721ca0d ("thermal/powerclamp: remove cpu whitelist").
      
        But unfortunately, it broke intel_powerclamp driver module auto-
        loading at the same time. Thus we need this change to add back module
        auto-loading for 4.9"
      
      * 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
        thermal/powerclamp: add back module device table
      23400ac9
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · b66c08ba
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Two small fixes.
      
        One prevents timeouts on mpt3sas when trying to use the secure erase
        protocol which causes the erase protocol to be aborted. The second is
        a regression in a prior fix which causes all commands to abort during
        PCI extended error recovery, which is incorrect because PCI EEH is
        independent from what's happening on the FC transport"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qla2xxx: do not abort all commands in the adapter during EEH recovery
        scsi: mpt3sas: Fix secure erase premature termination
      b66c08ba
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 57527ed1
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "A handful of driver fixes.
      
        The sunxi fixes are for an incorrect clk tree configuration and a bad
        frequency calculation. The other two are fixes for passing the wrong
        pointer in drivers recently converted to clk_hw style registration"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: efm32gg: Pass correct type to hw provider registration
        clk: berlin: Pass correct type to hw provider registration
        clk: sunxi: Fix M factor computation for APB1
        clk: sunxi-ng: sun6i-a31: Force AHB1 clock to use PLL6 as parent
      57527ed1
    • Arnd Bergmann's avatar
      NFSv4.x: hide array-bounds warning · d55b352b
      Arnd Bergmann authored
      A correct bugfix introduced a harmless warning that shows up with gcc-7:
      
      fs/nfs/callback.c: In function 'nfs_callback_up':
      fs/nfs/callback.c:214:14: error: array subscript is outside array bounds [-Werror=array-bounds]
      
      What happens here is that the 'minorversion == 0' check tells the
      compiler that we assume minorversion can be something other than 0,
      but when CONFIG_NFS_V4_1 is disabled that would be invalid and
      result in an out-of-bounds access.
      
      The added check for IS_ENABLED(CONFIG_NFS_V4_1) tells gcc that this
      really can't happen, which makes the code slightly smaller and also
      avoids the warning.
      
      The bugfix that introduced the warning is marked for stable backports,
      we want this one backported to the same releases.
      
      Fixes: 98b0f80c ("NFSv4.x: Fix a refcount leak in nfs_callback_up_net")
      Cc: stable@vger.kernel.org # v3.7+
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      d55b352b
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 000b8949
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "Two fixes for autogroup scheduling, for races when turning the feature
        on/off via /proc/sys/kernel/sched_autogroup_enabled"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/autogroup: Do not use autogroup->tg in zombie threads
        sched/autogroup: Fix autogroup_move_group() to never skip sched_move_task()
      000b8949
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7cfc4317
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes:
         - two fixes to make (very) old Intel CPUs boot reliably
         - fix the intel-mid driver and rename it
         - two KASAN false positive fixes
         - an FPU fix
         - two sysfb fixes
         - two build fixes related to new toolchain versions"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/platform/intel-mid: Rename platform_wdt to platform_mrfld_wdt
        x86/build: Build compressed x86 kernels as PIE when !CONFIG_RELOCATABLE as well
        x86/platform/intel-mid: Register watchdog device after SCU
        x86/fpu: Fix invalid FPU ptrace state after execve()
        x86/boot: Fail the boot if !M486 and CPUID is missing
        x86/traps: Ignore high word of regs->cs in early_fixup_exception()
        x86/dumpstack: Prevent KASAN false positive warnings
        x86/unwind: Prevent KASAN false positive warnings in guess unwinder
        x86/boot: Avoid warning for zero-filling .bss
        x86/sysfb: Fix lfb_size calculation
        x86/sysfb: Add support for 64bit EFI lfb_base
      7cfc4317
    • Peter Zijlstra's avatar
      perf/x86/intel/uncore: Allow only a single PMU/box within an events group · 033ac60c
      Peter Zijlstra authored
      Group validation expects all events to be of the same PMU; however
      is_uncore_pmu() is too wide, it matches _all_ uncore events, even
      across PMUs.
      
      This triggers failure when we group different events from different
      uncore PMUs, like:
      
        perf stat -vv -e '{uncore_cbox_0/config=0x0334/,uncore_qpi_0/event=1/}' -a sleep 1
      
      Fix is_uncore_pmu() by only matching events to the box at hand.
      
      Note that generic code; ran after this step; will disallow this
      mixture of PMU events.
      Reported-by: default avatarJiri Olsa <jolsa@redhat.com>
      Tested-by: default avatarJiri Olsa <jolsa@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vince@deater.net>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20161118125354.GQ3117@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      033ac60c
    • Peter Zijlstra's avatar
      perf/x86/intel: Cure bogus unwind from PEBS entries · b8000586
      Peter Zijlstra authored
      Vince Weaver reported that perf_fuzzer + KASAN detects that PEBS event
      unwinds sometimes do 'weird' things. In particular, we seemed to be
      ending up unwinding from random places on the NMI stack.
      
      While it was somewhat expected that the event record BP,SP would not
      match the interrupt BP,SP in that the interrupt is strictly later than
      the record event, it was overlooked that it could be on an already
      overwritten stack.
      
      Therefore, don't copy the recorded BP,SP over the interrupted BP,SP
      when we need stack unwinds.
      
      Note that its still possible the unwind doesn't full match the actual
      event, as its entirely possible to have done an (I)RET between record
      and interrupt, but on average it should still point in the general
      direction of where the event came from. Also, it's the best we can do,
      considering.
      
      The particular scenario that triggered the bogus NMI stack unwind was
      a PEBS event with very short period, upon enabling the event at the
      tail of the PMI handler (FREEZE_ON_PMI is not used), it instantly
      triggers a record (while still on the NMI stack) which in turn
      triggers the next PMI. This then causes back-to-back NMIs and we'll
      try and unwind the stack-frame from the last NMI, which obviously is
      now overwritten by our own.
      Analyzed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Reported-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davej@codemonkey.org.uk <davej@codemonkey.org.uk>
      Cc: dvyukov@google.com <dvyukov@google.com>
      Cc: stable@vger.kernel.org
      Fixes: ca037701 ("perf, x86: Add PEBS infrastructure")
      Link: http://lkml.kernel.org/r/20161117171731.GV3157@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b8000586
    • Johannes Weiner's avatar
      perf/x86: Restore TASK_SIZE check on frame pointer · ae31fe51
      Johannes Weiner authored
      The following commit:
      
        75925e1a ("perf/x86: Optimize stack walk user accesses")
      
      ... switched from copy_from_user_nmi() to __copy_from_user_nmi() with a manual
      access_ok() check.
      
      Unfortunately, copy_from_user_nmi() does an explicit check against TASK_SIZE,
      whereas the access_ok() uses whatever the current address limit of the task is.
      
      We are getting NMIs when __probe_kernel_read() has switched to KERNEL_DS, and
      then see vmalloc faults when we access what looks like pointers into vmalloc
      space:
      
        [] WARNING: CPU: 3 PID: 3685731 at arch/x86/mm/fault.c:435 vmalloc_fault+0x289/0x290
        [] CPU: 3 PID: 3685731 Comm: sh Tainted: G        W       4.6.0-5_fbk1_223_gdbf0f40 #1
        [] Call Trace:
        []  <NMI>  [<ffffffff814717d1>] dump_stack+0x4d/0x6c
        []  [<ffffffff81076e43>] __warn+0xd3/0xf0
        []  [<ffffffff81076f2d>] warn_slowpath_null+0x1d/0x20
        []  [<ffffffff8104a899>] vmalloc_fault+0x289/0x290
        []  [<ffffffff8104b5a0>] __do_page_fault+0x330/0x490
        []  [<ffffffff8104b70c>] do_page_fault+0xc/0x10
        []  [<ffffffff81794e82>] page_fault+0x22/0x30
        []  [<ffffffff81006280>] ? perf_callchain_user+0x100/0x2a0
        []  [<ffffffff8115124f>] get_perf_callchain+0x17f/0x190
        []  [<ffffffff811512c7>] perf_callchain+0x67/0x80
        []  [<ffffffff8114e750>] perf_prepare_sample+0x2a0/0x370
        []  [<ffffffff8114e840>] perf_event_output+0x20/0x60
        []  [<ffffffff8114aee7>] ? perf_event_update_userpage+0xc7/0x130
        []  [<ffffffff8114ea01>] __perf_event_overflow+0x181/0x1d0
        []  [<ffffffff8114f484>] perf_event_overflow+0x14/0x20
        []  [<ffffffff8100a6e3>] intel_pmu_handle_irq+0x1d3/0x490
        []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
        []  [<ffffffff81197191>] ? vunmap_page_range+0x1a1/0x2f0
        []  [<ffffffff811972f1>] ? unmap_kernel_range_noflush+0x11/0x20
        []  [<ffffffff814f2056>] ? ghes_copy_tofrom_phys+0x116/0x1f0
        []  [<ffffffff81040d1d>] ? x2apic_send_IPI_self+0x1d/0x20
        []  [<ffffffff8100411d>] perf_event_nmi_handler+0x2d/0x50
        []  [<ffffffff8101ea31>] nmi_handle+0x61/0x110
        []  [<ffffffff8101ef94>] default_do_nmi+0x44/0x110
        []  [<ffffffff8101f13b>] do_nmi+0xdb/0x150
        []  [<ffffffff81795187>] end_repeat_nmi+0x1a/0x1e
        []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
        []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
        []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
        []  <<EOE>>  <IRQ>  [<ffffffff8115d05e>] ? __probe_kernel_read+0x3e/0xa0
      
      Fix this by moving the valid_user_frame() check to before the uaccess
      that loads the return address and the pointer to the next frame.
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Fixes: 75925e1a ("perf/x86: Optimize stack walk user accesses")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ae31fe51
    • Oleg Nesterov's avatar
      sched/autogroup: Do not use autogroup->tg in zombie threads · 8e5bfa8c
      Oleg Nesterov authored
      Exactly because for_each_thread() in autogroup_move_group() can't see it
      and update its ->sched_task_group before _put() and possibly free().
      
      So the exiting task needs another sched_move_task() before exit_notify()
      and we need to re-introduce the PF_EXITING (or similar) check removed by
      the previous change for another reason.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hartsjc@redhat.com
      Cc: vbendel@redhat.com
      Cc: vlovejoy@redhat.com
      Link: http://lkml.kernel.org/r/20161114184612.GA15968@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      8e5bfa8c
    • Oleg Nesterov's avatar
      sched/autogroup: Fix autogroup_move_group() to never skip sched_move_task() · 18f649ef
      Oleg Nesterov authored
      The PF_EXITING check in task_wants_autogroup() is no longer needed. Remove
      it, but see the next patch.
      
      However the comment is correct in that autogroup_move_group() must always
      change task_group() for every thread so the sysctl_ check is very wrong;
      we can race with cgroups and even sys_setsid() is not safe because a task
      running with task_group() == ag->tg must participate in refcounting:
      
      	int main(void)
      	{
      		int sctl = open("/proc/sys/kernel/sched_autogroup_enabled", O_WRONLY);
      
      		assert(sctl > 0);
      		if (fork()) {
      			wait(NULL); // destroy the child's ag/tg
      			pause();
      		}
      
      		assert(pwrite(sctl, "1\n", 2, 0) == 2);
      		assert(setsid() > 0);
      		if (fork())
      			pause();
      
      		kill(getppid(), SIGKILL);
      		sleep(1);
      
      		// The child has gone, the grandchild runs with kref == 1
      		assert(pwrite(sctl, "0\n", 2, 0) == 2);
      		assert(setsid() > 0);
      
      		// runs with the freed ag/tg
      		for (;;)
      			sleep(1);
      
      		return 0;
      	}
      
      crashes the kernel. It doesn't really need sleep(1), it doesn't matter if
      autogroup_move_group() actually frees the task_group or this happens later.
      Reported-by: default avatarVern Lovejoy <vlovejoy@redhat.com>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hartsjc@redhat.com
      Cc: vbendel@redhat.com
      Link: http://lkml.kernel.org/r/20161114184609.GA15965@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      18f649ef
    • Herbert Xu's avatar
      crypto: scatterwalk - Remove unnecessary aliasing check in map_and_copy · c8467f7a
      Herbert Xu authored
      The aliasing check in map_and_copy is no longer necessary because
      the IPsec ESP code no longer provides an IV that points into the
      actual request data.  As this check is now triggering BUG checks
      due to the vmalloced stack code, I'm removing it.
      Reported-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      c8467f7a
    • Herbert Xu's avatar
      crypto: algif_hash - Fix result clobbering in recvmsg · 8acf7a10
      Herbert Xu authored
      Recently an init call was added to hash_recvmsg so as to reset
      the hash state in case a sendmsg call was never made.
      
      Unfortunately this ended up clobbering the result if the previous
      sendmsg was done with a MSG_MORE flag.  This patch fixes it by
      excluding that case when we make the init call.
      
      Fixes: a8348bca ("algif_hash - Fix NULL hash crash with shash")
      Reported-by: default avatarPatrick Steinhardt <ps@pks.im>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      8acf7a10
  6. 21 Nov, 2016 16 commits
  7. 20 Nov, 2016 2 commits
    • Linus Torvalds's avatar
      Linux 4.9-rc6 · 9c763584
      Linus Torvalds authored
      9c763584
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm · 697ed8d0
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "A few more ARM fixes:
      
         - the assembly backtrace code suffers problems with the new printk()
           implementation which assumes that kernel messages without KERN_CONT
           should have newlines inserted between them. Fix this.
         - fix a section naming error - ".init.text" rather than ".text.init"
         - preallocate DMA debug memory at core_initcall() time rather than
           fs_initcall(), as we have some core drivers that need to use DMA
           mapping - and that triggers a kernel warning from the DMA debug
           code.
         - fix XIP kernels after the ro_after_init changes made this data
           permanently read-only"
      
      * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: Fix XIP kernels
        ARM: 8628/1: dma-mapping: preallocate DMA-debug hash tables in core_initcall
        ARM: 8624/1: proc-v7m.S: fix init section name
        ARM: fix backtrace
      697ed8d0