1. 24 Nov, 2016 1 commit
    • Dmitry Safonov's avatar
      x86/coredump: Always use user_regs_struct for compat_elf_gregset_t · 7b2dd368
      Dmitry Safonov authored
      Commit:
      
        90954e7b ("x86/coredump: Use pr_reg size, rather that TIF_IA32 flag")
      
      changed the coredumping code to construct the elf coredump file according
      to register set size - and that's good: if binary crashes with 32-bit code
      selector, generate 32-bit ELF core, otherwise - 64-bit core.
      
      That was made for restoring 32-bit applications on x86_64: we want
      32-bit application after restore to generate 32-bit ELF dump on crash.
      
      All was quite good and recently I started reworking 32-bit applications
      dumping part of CRIU: now it has two parasites (32 and 64) for seizing
      compat/native tasks, after rework it'll have one parasite, working in
      64-bit mode, to which 32-bit prologue long-jumps during infection.
      
      And while it has worked for my work machine, in VM with
      !CONFIG_X86_X32_ABI during reworking I faced that segfault in 32-bit
      binary, that has long-jumped to 64-bit mode results in dereference
      of garbage:
      
       32-victim[19266]: segfault at f775ef65 ip 00000000f775ef65 sp 00000000f776aa50 error 14
       BUG: unable to handle kernel paging request at ffffffffffffffff
       IP: [<ffffffff81332ce0>] strlen+0x0/0x20
       [...]
       Call Trace:
        [] elf_core_dump+0x11a9/0x1480
        [] do_coredump+0xa6b/0xe60
        [] get_signal+0x1a8/0x5c0
        [] do_signal+0x23/0x660
        [] exit_to_usermode_loop+0x34/0x65
        [] prepare_exit_to_usermode+0x2f/0x40
        [] retint_user+0x8/0x10
      
      That's because we have 64-bit registers set (with according total size)
      and we're writing it to elf_thread_core_info which has smaller size
      on !CONFIG_X86_X32_ABI. That lead to overwriting ELF notes part.
      
      Tested on 32-, 64-bit ELF crashes and on 32-bit binaries that have
      jumped with 64-bit code selector - all is readable with gdb.
      Signed-off-by: default avatarDmitry Safonov <dsafonov@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Fixes: 90954e7b ("x86/coredump: Use pr_reg size, rather that TIF_IA32 flag")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7b2dd368
  2. 23 Nov, 2016 5 commits
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-4.9-4' of git://git.linux-nfs.org/projects/anna/linux-nfs · 10b9dd56
      Linus Torvalds authored
      Pull NFS client bugfixes from Anna Schumaker:
       "Most of these fix regressions or races, but there is one patch for
        stable that Arnd sent me
      
        Stable bugfix:
         - Hide array-bounds warning
      
        Bugfixes:
         - Keep a reference on lock states while checking
         - Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state
         - Don't call close if the open stateid has already been cleared
         - Fix CLOSE rases with OPEN
         - Fix a regression in DELEGRETURN"
      
      * tag 'nfs-for-4.9-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        NFSv4.x: hide array-bounds warning
        NFSv4.1: Keep a reference on lock states while checking
        NFSv4.1: Handle NFS4ERR_OLD_STATEID in nfs4_reclaim_open_state
        NFSv4: Don't call close if the open stateid has already been cleared
        NFSv4: Fix CLOSE races with OPEN
        NFSv4.1: Fix a regression in DELEGRETURN
      10b9dd56
    • Linus Torvalds's avatar
      Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile · 4d92c8d0
      Linus Torvalds authored
      Pull arch/tile bugfix from Chris Metcalf:
       "This fixes a bug that causes reboots after 208 days of uptime :-)"
      
      * 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
        tile: avoid using clocksource_cyc2ns with absolute cycle count
      4d92c8d0
    • Chris Metcalf's avatar
      tile: avoid using clocksource_cyc2ns with absolute cycle count · e658a6f1
      Chris Metcalf authored
      For large values of "mult" and long uptimes, the intermediate
      result of "cycles * mult" can overflow 64 bits.  For example,
      the tile platform calls clocksource_cyc2ns with a 1.2 GHz clock;
      we have mult = 853, and after 208.5 days, we overflow 64 bits.
      
      Since clocksource_cyc2ns() is intended to be used for relative
      cycle counts, not absolute cycle counts, performance is more
      importance than accepting a wider range of cycle values.  So,
      just use mult_frac() directly in tile's sched_clock().
      
      Commit 4cecf6d4 ("sched, x86: Avoid unnecessary overflow
      in sched_clock") by Salman Qazi results in essentially the same
      generated code for x86 as this change does for tile.  In fact,
      a follow-on change by Salman introduced mult_frac() and switched
      to using it, so the C code was largely identical at that point too.
      
      Peter Zijlstra then added mul_u64_u32_shr() and switched x86
      to use it.  This is, in principle, better; by optimizing the
      64x64->64 multiplies to be 32x32->64 multiplies we can potentially
      save some time.  However, the compiler piplines the 64x64->64
      multiplies pretty well, and the conditional branch in the generic
      mul_u64_u32_shr() causes some bubbles in execution, with the
      result that it's pretty much a wash.  If tilegx provided its own
      implementation of mul_u64_u32_shr() without the conditional branch,
      we could potentially save 3 cycles, but that seems like small gain
      for a fair amount of additional build scaffolding; no other platform
      currently provides a mul_u64_u32_shr() override, and tile doesn't
      currently have an <asm/div64.h> header to put the override in.
      
      Additionally, gcc currently has an optimization bug that prevents
      it from recognizing the opportunity to use a 32x32->64 multiply,
      and so the result would be no better than the existing mult_frac()
      until such time as the compiler is fixed.
      
      For now, just using mult_frac() seems like the right answer.
      
      Cc: stable@kernel.org [v3.4+]
      Signed-off-by: default avatarChris Metcalf <cmetcalf@mellanox.com>
      e658a6f1
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ded9b5dd
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Six fixes for bugs that were found via fuzzing, and a trivial
        hw-enablement patch for AMD Family-17h CPU PMUs"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel/uncore: Allow only a single PMU/box within an events group
        perf/x86/intel: Cure bogus unwind from PEBS entries
        perf/x86: Restore TASK_SIZE check on frame pointer
        perf/core: Fix address filter parser
        perf/x86: Add perf support for AMD family-17h processors
        perf/x86/uncore: Fix crash by removing bogus event_list[] handling for SNB client uncore IMC
        perf/core: Do not set cpuctx->cgrp for unscheduled cgroups
      ded9b5dd
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 23aabe73
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "The last push broke algif_hash for all shash implementations, so this
        is a follow-up to fix that.
      
        This also fixes a problem in the crypto scatterwalk that triggers a
        BUG_ON with certain debugging options due to the new vmalloced-stack
        code"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: scatterwalk - Remove unnecessary aliasing check in map_and_copy
        crypto: algif_hash - Fix result clobbering in recvmsg
      23aabe73
  3. 22 Nov, 2016 13 commits
    • Linus Torvalds's avatar
      Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · 23400ac9
      Linus Torvalds authored
      Pull thermal management fix from Zhang Rui:
       "We only have one urgent fix this time.
      
        Commit 3105f234 ("thermal/powerclamp: correct cpu support check"),
        which is shipped in 4.9-rc3, fixed a problem introduced by commit
        b721ca0d ("thermal/powerclamp: remove cpu whitelist").
      
        But unfortunately, it broke intel_powerclamp driver module auto-
        loading at the same time. Thus we need this change to add back module
        auto-loading for 4.9"
      
      * 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
        thermal/powerclamp: add back module device table
      23400ac9
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · b66c08ba
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Two small fixes.
      
        One prevents timeouts on mpt3sas when trying to use the secure erase
        protocol which causes the erase protocol to be aborted. The second is
        a regression in a prior fix which causes all commands to abort during
        PCI extended error recovery, which is incorrect because PCI EEH is
        independent from what's happening on the FC transport"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qla2xxx: do not abort all commands in the adapter during EEH recovery
        scsi: mpt3sas: Fix secure erase premature termination
      b66c08ba
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 57527ed1
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "A handful of driver fixes.
      
        The sunxi fixes are for an incorrect clk tree configuration and a bad
        frequency calculation. The other two are fixes for passing the wrong
        pointer in drivers recently converted to clk_hw style registration"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: efm32gg: Pass correct type to hw provider registration
        clk: berlin: Pass correct type to hw provider registration
        clk: sunxi: Fix M factor computation for APB1
        clk: sunxi-ng: sun6i-a31: Force AHB1 clock to use PLL6 as parent
      57527ed1
    • Arnd Bergmann's avatar
      NFSv4.x: hide array-bounds warning · d55b352b
      Arnd Bergmann authored
      A correct bugfix introduced a harmless warning that shows up with gcc-7:
      
      fs/nfs/callback.c: In function 'nfs_callback_up':
      fs/nfs/callback.c:214:14: error: array subscript is outside array bounds [-Werror=array-bounds]
      
      What happens here is that the 'minorversion == 0' check tells the
      compiler that we assume minorversion can be something other than 0,
      but when CONFIG_NFS_V4_1 is disabled that would be invalid and
      result in an out-of-bounds access.
      
      The added check for IS_ENABLED(CONFIG_NFS_V4_1) tells gcc that this
      really can't happen, which makes the code slightly smaller and also
      avoids the warning.
      
      The bugfix that introduced the warning is marked for stable backports,
      we want this one backported to the same releases.
      
      Fixes: 98b0f80c ("NFSv4.x: Fix a refcount leak in nfs_callback_up_net")
      Cc: stable@vger.kernel.org # v3.7+
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      d55b352b
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 000b8949
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "Two fixes for autogroup scheduling, for races when turning the feature
        on/off via /proc/sys/kernel/sched_autogroup_enabled"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/autogroup: Do not use autogroup->tg in zombie threads
        sched/autogroup: Fix autogroup_move_group() to never skip sched_move_task()
      000b8949
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7cfc4317
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes:
         - two fixes to make (very) old Intel CPUs boot reliably
         - fix the intel-mid driver and rename it
         - two KASAN false positive fixes
         - an FPU fix
         - two sysfb fixes
         - two build fixes related to new toolchain versions"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/platform/intel-mid: Rename platform_wdt to platform_mrfld_wdt
        x86/build: Build compressed x86 kernels as PIE when !CONFIG_RELOCATABLE as well
        x86/platform/intel-mid: Register watchdog device after SCU
        x86/fpu: Fix invalid FPU ptrace state after execve()
        x86/boot: Fail the boot if !M486 and CPUID is missing
        x86/traps: Ignore high word of regs->cs in early_fixup_exception()
        x86/dumpstack: Prevent KASAN false positive warnings
        x86/unwind: Prevent KASAN false positive warnings in guess unwinder
        x86/boot: Avoid warning for zero-filling .bss
        x86/sysfb: Fix lfb_size calculation
        x86/sysfb: Add support for 64bit EFI lfb_base
      7cfc4317
    • Peter Zijlstra's avatar
      perf/x86/intel/uncore: Allow only a single PMU/box within an events group · 033ac60c
      Peter Zijlstra authored
      Group validation expects all events to be of the same PMU; however
      is_uncore_pmu() is too wide, it matches _all_ uncore events, even
      across PMUs.
      
      This triggers failure when we group different events from different
      uncore PMUs, like:
      
        perf stat -vv -e '{uncore_cbox_0/config=0x0334/,uncore_qpi_0/event=1/}' -a sleep 1
      
      Fix is_uncore_pmu() by only matching events to the box at hand.
      
      Note that generic code; ran after this step; will disallow this
      mixture of PMU events.
      Reported-by: default avatarJiri Olsa <jolsa@redhat.com>
      Tested-by: default avatarJiri Olsa <jolsa@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vince@deater.net>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20161118125354.GQ3117@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      033ac60c
    • Peter Zijlstra's avatar
      perf/x86/intel: Cure bogus unwind from PEBS entries · b8000586
      Peter Zijlstra authored
      Vince Weaver reported that perf_fuzzer + KASAN detects that PEBS event
      unwinds sometimes do 'weird' things. In particular, we seemed to be
      ending up unwinding from random places on the NMI stack.
      
      While it was somewhat expected that the event record BP,SP would not
      match the interrupt BP,SP in that the interrupt is strictly later than
      the record event, it was overlooked that it could be on an already
      overwritten stack.
      
      Therefore, don't copy the recorded BP,SP over the interrupted BP,SP
      when we need stack unwinds.
      
      Note that its still possible the unwind doesn't full match the actual
      event, as its entirely possible to have done an (I)RET between record
      and interrupt, but on average it should still point in the general
      direction of where the event came from. Also, it's the best we can do,
      considering.
      
      The particular scenario that triggered the bogus NMI stack unwind was
      a PEBS event with very short period, upon enabling the event at the
      tail of the PMI handler (FREEZE_ON_PMI is not used), it instantly
      triggers a record (while still on the NMI stack) which in turn
      triggers the next PMI. This then causes back-to-back NMIs and we'll
      try and unwind the stack-frame from the last NMI, which obviously is
      now overwritten by our own.
      Analyzed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Reported-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davej@codemonkey.org.uk <davej@codemonkey.org.uk>
      Cc: dvyukov@google.com <dvyukov@google.com>
      Cc: stable@vger.kernel.org
      Fixes: ca037701 ("perf, x86: Add PEBS infrastructure")
      Link: http://lkml.kernel.org/r/20161117171731.GV3157@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b8000586
    • Johannes Weiner's avatar
      perf/x86: Restore TASK_SIZE check on frame pointer · ae31fe51
      Johannes Weiner authored
      The following commit:
      
        75925e1a ("perf/x86: Optimize stack walk user accesses")
      
      ... switched from copy_from_user_nmi() to __copy_from_user_nmi() with a manual
      access_ok() check.
      
      Unfortunately, copy_from_user_nmi() does an explicit check against TASK_SIZE,
      whereas the access_ok() uses whatever the current address limit of the task is.
      
      We are getting NMIs when __probe_kernel_read() has switched to KERNEL_DS, and
      then see vmalloc faults when we access what looks like pointers into vmalloc
      space:
      
        [] WARNING: CPU: 3 PID: 3685731 at arch/x86/mm/fault.c:435 vmalloc_fault+0x289/0x290
        [] CPU: 3 PID: 3685731 Comm: sh Tainted: G        W       4.6.0-5_fbk1_223_gdbf0f40 #1
        [] Call Trace:
        []  <NMI>  [<ffffffff814717d1>] dump_stack+0x4d/0x6c
        []  [<ffffffff81076e43>] __warn+0xd3/0xf0
        []  [<ffffffff81076f2d>] warn_slowpath_null+0x1d/0x20
        []  [<ffffffff8104a899>] vmalloc_fault+0x289/0x290
        []  [<ffffffff8104b5a0>] __do_page_fault+0x330/0x490
        []  [<ffffffff8104b70c>] do_page_fault+0xc/0x10
        []  [<ffffffff81794e82>] page_fault+0x22/0x30
        []  [<ffffffff81006280>] ? perf_callchain_user+0x100/0x2a0
        []  [<ffffffff8115124f>] get_perf_callchain+0x17f/0x190
        []  [<ffffffff811512c7>] perf_callchain+0x67/0x80
        []  [<ffffffff8114e750>] perf_prepare_sample+0x2a0/0x370
        []  [<ffffffff8114e840>] perf_event_output+0x20/0x60
        []  [<ffffffff8114aee7>] ? perf_event_update_userpage+0xc7/0x130
        []  [<ffffffff8114ea01>] __perf_event_overflow+0x181/0x1d0
        []  [<ffffffff8114f484>] perf_event_overflow+0x14/0x20
        []  [<ffffffff8100a6e3>] intel_pmu_handle_irq+0x1d3/0x490
        []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
        []  [<ffffffff81197191>] ? vunmap_page_range+0x1a1/0x2f0
        []  [<ffffffff811972f1>] ? unmap_kernel_range_noflush+0x11/0x20
        []  [<ffffffff814f2056>] ? ghes_copy_tofrom_phys+0x116/0x1f0
        []  [<ffffffff81040d1d>] ? x2apic_send_IPI_self+0x1d/0x20
        []  [<ffffffff8100411d>] perf_event_nmi_handler+0x2d/0x50
        []  [<ffffffff8101ea31>] nmi_handle+0x61/0x110
        []  [<ffffffff8101ef94>] default_do_nmi+0x44/0x110
        []  [<ffffffff8101f13b>] do_nmi+0xdb/0x150
        []  [<ffffffff81795187>] end_repeat_nmi+0x1a/0x1e
        []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
        []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
        []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
        []  <<EOE>>  <IRQ>  [<ffffffff8115d05e>] ? __probe_kernel_read+0x3e/0xa0
      
      Fix this by moving the valid_user_frame() check to before the uaccess
      that loads the return address and the pointer to the next frame.
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Fixes: 75925e1a ("perf/x86: Optimize stack walk user accesses")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ae31fe51
    • Oleg Nesterov's avatar
      sched/autogroup: Do not use autogroup->tg in zombie threads · 8e5bfa8c
      Oleg Nesterov authored
      Exactly because for_each_thread() in autogroup_move_group() can't see it
      and update its ->sched_task_group before _put() and possibly free().
      
      So the exiting task needs another sched_move_task() before exit_notify()
      and we need to re-introduce the PF_EXITING (or similar) check removed by
      the previous change for another reason.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hartsjc@redhat.com
      Cc: vbendel@redhat.com
      Cc: vlovejoy@redhat.com
      Link: http://lkml.kernel.org/r/20161114184612.GA15968@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      8e5bfa8c
    • Oleg Nesterov's avatar
      sched/autogroup: Fix autogroup_move_group() to never skip sched_move_task() · 18f649ef
      Oleg Nesterov authored
      The PF_EXITING check in task_wants_autogroup() is no longer needed. Remove
      it, but see the next patch.
      
      However the comment is correct in that autogroup_move_group() must always
      change task_group() for every thread so the sysctl_ check is very wrong;
      we can race with cgroups and even sys_setsid() is not safe because a task
      running with task_group() == ag->tg must participate in refcounting:
      
      	int main(void)
      	{
      		int sctl = open("/proc/sys/kernel/sched_autogroup_enabled", O_WRONLY);
      
      		assert(sctl > 0);
      		if (fork()) {
      			wait(NULL); // destroy the child's ag/tg
      			pause();
      		}
      
      		assert(pwrite(sctl, "1\n", 2, 0) == 2);
      		assert(setsid() > 0);
      		if (fork())
      			pause();
      
      		kill(getppid(), SIGKILL);
      		sleep(1);
      
      		// The child has gone, the grandchild runs with kref == 1
      		assert(pwrite(sctl, "0\n", 2, 0) == 2);
      		assert(setsid() > 0);
      
      		// runs with the freed ag/tg
      		for (;;)
      			sleep(1);
      
      		return 0;
      	}
      
      crashes the kernel. It doesn't really need sleep(1), it doesn't matter if
      autogroup_move_group() actually frees the task_group or this happens later.
      Reported-by: default avatarVern Lovejoy <vlovejoy@redhat.com>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hartsjc@redhat.com
      Cc: vbendel@redhat.com
      Link: http://lkml.kernel.org/r/20161114184609.GA15965@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      18f649ef
    • Herbert Xu's avatar
      crypto: scatterwalk - Remove unnecessary aliasing check in map_and_copy · c8467f7a
      Herbert Xu authored
      The aliasing check in map_and_copy is no longer necessary because
      the IPsec ESP code no longer provides an IV that points into the
      actual request data.  As this check is now triggering BUG checks
      due to the vmalloced stack code, I'm removing it.
      Reported-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      c8467f7a
    • Herbert Xu's avatar
      crypto: algif_hash - Fix result clobbering in recvmsg · 8acf7a10
      Herbert Xu authored
      Recently an init call was added to hash_recvmsg so as to reset
      the hash state in case a sendmsg call was never made.
      
      Unfortunately this ended up clobbering the result if the previous
      sendmsg was done with a MSG_MORE flag.  This patch fixes it by
      excluding that case when we make the init call.
      
      Fixes: a8348bca ("algif_hash - Fix NULL hash crash with shash")
      Reported-by: default avatarPatrick Steinhardt <ps@pks.im>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      8acf7a10
  4. 21 Nov, 2016 16 commits
  5. 20 Nov, 2016 5 commits
    • Linus Torvalds's avatar
      Linux 4.9-rc6 · 9c763584
      Linus Torvalds authored
      9c763584
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm · 697ed8d0
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "A few more ARM fixes:
      
         - the assembly backtrace code suffers problems with the new printk()
           implementation which assumes that kernel messages without KERN_CONT
           should have newlines inserted between them. Fix this.
         - fix a section naming error - ".init.text" rather than ".text.init"
         - preallocate DMA debug memory at core_initcall() time rather than
           fs_initcall(), as we have some core drivers that need to use DMA
           mapping - and that triggers a kernel warning from the DMA debug
           code.
         - fix XIP kernels after the ro_after_init changes made this data
           permanently read-only"
      
      * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: Fix XIP kernels
        ARM: 8628/1: dma-mapping: preallocate DMA-debug hash tables in core_initcall
        ARM: 8624/1: proc-v7m.S: fix init section name
        ARM: fix backtrace
      697ed8d0
    • Jon Paul Maloy's avatar
      tipc: eliminate obsolete socket locking policy description · 51b9a31c
      Jon Paul Maloy authored
      The comment block in socket.c describing the locking policy is
      obsolete, and does not reflect current reality. We remove it in this
      commit.
      
      Since the current locking policy is much simpler and follows a
      mainstream approach, we see no need to add a new description.
      Signed-off-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      51b9a31c
    • Zhang Shengju's avatar
      rtnl: fix the loop index update error in rtnl_dump_ifinfo() · 3f0ae05d
      Zhang Shengju authored
      If the link is filtered out, loop index should also be updated. If not,
      loop index will not be correct.
      
      Fixes: dc599f76 ("net: Add support for filtering link dump by master device and kind")
      Signed-off-by: default avatarZhang Shengju <zhangshengju@cmss.chinamobile.com>
      Acked-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3f0ae05d
    • Guillaume Nault's avatar
      l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind() · 32c23116
      Guillaume Nault authored
      Lock socket before checking the SOCK_ZAPPED flag in l2tp_ip6_bind().
      Without lock, a concurrent call could modify the socket flags between
      the sock_flag(sk, SOCK_ZAPPED) test and the lock_sock() call. This way,
      a socket could be inserted twice in l2tp_ip6_bind_table. Releasing it
      would then leave a stale pointer there, generating use-after-free
      errors when walking through the list or modifying adjacent entries.
      
      BUG: KASAN: use-after-free in l2tp_ip6_close+0x22e/0x290 at addr ffff8800081b0ed8
      Write of size 8 by task syz-executor/10987
      CPU: 0 PID: 10987 Comm: syz-executor Not tainted 4.8.0+ #39
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
       ffff880031d97838 ffffffff829f835b ffff88001b5a1640 ffff8800081b0ec0
       ffff8800081b15a0 ffff8800081b6d20 ffff880031d97860 ffffffff8174d3cc
       ffff880031d978f0 ffff8800081b0e80 ffff88001b5a1640 ffff880031d978e0
      Call Trace:
       [<ffffffff829f835b>] dump_stack+0xb3/0x118 lib/dump_stack.c:15
       [<ffffffff8174d3cc>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:156
       [<     inline     >] print_address_description mm/kasan/report.c:194
       [<ffffffff8174d666>] kasan_report_error+0x1f6/0x4d0 mm/kasan/report.c:283
       [<     inline     >] kasan_report mm/kasan/report.c:303
       [<ffffffff8174db7e>] __asan_report_store8_noabort+0x3e/0x40 mm/kasan/report.c:329
       [<     inline     >] __write_once_size ./include/linux/compiler.h:249
       [<     inline     >] __hlist_del ./include/linux/list.h:622
       [<     inline     >] hlist_del_init ./include/linux/list.h:637
       [<ffffffff8579047e>] l2tp_ip6_close+0x22e/0x290 net/l2tp/l2tp_ip6.c:239
       [<ffffffff850b2dfd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415
       [<ffffffff851dc5a0>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422
       [<ffffffff84c4581d>] sock_release+0x8d/0x1d0 net/socket.c:570
       [<ffffffff84c45976>] sock_close+0x16/0x20 net/socket.c:1017
       [<ffffffff817a108c>] __fput+0x28c/0x780 fs/file_table.c:208
       [<ffffffff817a1605>] ____fput+0x15/0x20 fs/file_table.c:244
       [<ffffffff813774f9>] task_work_run+0xf9/0x170
       [<ffffffff81324aae>] do_exit+0x85e/0x2a00
       [<ffffffff81326dc8>] do_group_exit+0x108/0x330
       [<ffffffff81348cf7>] get_signal+0x617/0x17a0 kernel/signal.c:2307
       [<ffffffff811b49af>] do_signal+0x7f/0x18f0
       [<ffffffff810039bf>] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156
       [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:190
       [<ffffffff81006060>] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259
       [<ffffffff85e4d726>] entry_SYSCALL_64_fastpath+0xc4/0xc6
      Object at ffff8800081b0ec0, in cache L2TP/IPv6 size: 1448
      Allocated:
      PID = 10987
       [ 1116.897025] [<ffffffff811ddcb6>] save_stack_trace+0x16/0x20
       [ 1116.897025] [<ffffffff8174c736>] save_stack+0x46/0xd0
       [ 1116.897025] [<ffffffff8174c9ad>] kasan_kmalloc+0xad/0xe0
       [ 1116.897025] [<ffffffff8174cee2>] kasan_slab_alloc+0x12/0x20
       [ 1116.897025] [<     inline     >] slab_post_alloc_hook mm/slab.h:417
       [ 1116.897025] [<     inline     >] slab_alloc_node mm/slub.c:2708
       [ 1116.897025] [<     inline     >] slab_alloc mm/slub.c:2716
       [ 1116.897025] [<ffffffff817476a8>] kmem_cache_alloc+0xc8/0x2b0 mm/slub.c:2721
       [ 1116.897025] [<ffffffff84c4f6a9>] sk_prot_alloc+0x69/0x2b0 net/core/sock.c:1326
       [ 1116.897025] [<ffffffff84c58ac8>] sk_alloc+0x38/0xae0 net/core/sock.c:1388
       [ 1116.897025] [<ffffffff851ddf67>] inet6_create+0x2d7/0x1000 net/ipv6/af_inet6.c:182
       [ 1116.897025] [<ffffffff84c4af7b>] __sock_create+0x37b/0x640 net/socket.c:1153
       [ 1116.897025] [<     inline     >] sock_create net/socket.c:1193
       [ 1116.897025] [<     inline     >] SYSC_socket net/socket.c:1223
       [ 1116.897025] [<ffffffff84c4b46f>] SyS_socket+0xef/0x1b0 net/socket.c:1203
       [ 1116.897025] [<ffffffff85e4d685>] entry_SYSCALL_64_fastpath+0x23/0xc6
      Freed:
      PID = 10987
       [ 1116.897025] [<ffffffff811ddcb6>] save_stack_trace+0x16/0x20
       [ 1116.897025] [<ffffffff8174c736>] save_stack+0x46/0xd0
       [ 1116.897025] [<ffffffff8174cf61>] kasan_slab_free+0x71/0xb0
       [ 1116.897025] [<     inline     >] slab_free_hook mm/slub.c:1352
       [ 1116.897025] [<     inline     >] slab_free_freelist_hook mm/slub.c:1374
       [ 1116.897025] [<     inline     >] slab_free mm/slub.c:2951
       [ 1116.897025] [<ffffffff81748b28>] kmem_cache_free+0xc8/0x330 mm/slub.c:2973
       [ 1116.897025] [<     inline     >] sk_prot_free net/core/sock.c:1369
       [ 1116.897025] [<ffffffff84c541eb>] __sk_destruct+0x32b/0x4f0 net/core/sock.c:1444
       [ 1116.897025] [<ffffffff84c5aca4>] sk_destruct+0x44/0x80 net/core/sock.c:1452
       [ 1116.897025] [<ffffffff84c5ad33>] __sk_free+0x53/0x220 net/core/sock.c:1460
       [ 1116.897025] [<ffffffff84c5af23>] sk_free+0x23/0x30 net/core/sock.c:1471
       [ 1116.897025] [<ffffffff84c5cb6c>] sk_common_release+0x28c/0x3e0 ./include/net/sock.h:1589
       [ 1116.897025] [<ffffffff8579044e>] l2tp_ip6_close+0x1fe/0x290 net/l2tp/l2tp_ip6.c:243
       [ 1116.897025] [<ffffffff850b2dfd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415
       [ 1116.897025] [<ffffffff851dc5a0>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422
       [ 1116.897025] [<ffffffff84c4581d>] sock_release+0x8d/0x1d0 net/socket.c:570
       [ 1116.897025] [<ffffffff84c45976>] sock_close+0x16/0x20 net/socket.c:1017
       [ 1116.897025] [<ffffffff817a108c>] __fput+0x28c/0x780 fs/file_table.c:208
       [ 1116.897025] [<ffffffff817a1605>] ____fput+0x15/0x20 fs/file_table.c:244
       [ 1116.897025] [<ffffffff813774f9>] task_work_run+0xf9/0x170
       [ 1116.897025] [<ffffffff81324aae>] do_exit+0x85e/0x2a00
       [ 1116.897025] [<ffffffff81326dc8>] do_group_exit+0x108/0x330
       [ 1116.897025] [<ffffffff81348cf7>] get_signal+0x617/0x17a0 kernel/signal.c:2307
       [ 1116.897025] [<ffffffff811b49af>] do_signal+0x7f/0x18f0
       [ 1116.897025] [<ffffffff810039bf>] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156
       [ 1116.897025] [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:190
       [ 1116.897025] [<ffffffff81006060>] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259
       [ 1116.897025] [<ffffffff85e4d726>] entry_SYSCALL_64_fastpath+0xc4/0xc6
      Memory state around the buggy address:
       ffff8800081b0d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff8800081b0e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff8800081b0e80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
                                                          ^
       ffff8800081b0f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8800081b0f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      
      ==================================================================
      
      The same issue exists with l2tp_ip_bind() and l2tp_ip_bind_table.
      
      Fixes: c51ce497 ("l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case")
      Reported-by: default avatarBaozeng Ding <sploving1@gmail.com>
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarBaozeng Ding <sploving1@gmail.com>
      Signed-off-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      32c23116