1. 31 Jan, 2017 8 commits
  2. 28 Jan, 2017 2 commits
  3. 17 Jan, 2017 7 commits
  4. 16 Jan, 2017 2 commits
    • Linux 4.10-rc4 · 49def185
      Linus Torvalds authored
      49def185
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · 99421c1c
      Linus Torvalds authored
      Pull namespace fixes from Eric Biederman:
       "This tree contains 4 fixes.
      
        The first is a fix for a race that can cause oopses under the right
        circumstances, and that someone just recently encountered.
      
        Past that are several small trivial correct fixes. A real issue that
        was blocking development of an out of tree driver, but does not appear
        to have caused any actual problems for in-tree code. A potential
        deadlock that was reported by lockdep. And a deadlock people have
        experienced and took the time to track down caused by a cleanup that
        removed the code to drop a reference count"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        sysctl: Drop reference added by grab_header in proc_sys_readdir
        pid: fix lockdep deadlock warning due to ucount_lock
        libfs: Modify mount_pseudo_xattr to be clear it is not a userspace mount
        mnt: Protect the mountpoint hashtable with mount_lock
      99421c1c
  5. 15 Jan, 2017 14 commits
    • Merge tag 'char-misc-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · c9281627
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are some small char/misc driver fixes for 4.10-rc4 that resolve
        some reported issues.
      
        The MEI driver issue resolves a lot of problems that people have been
        having, as does the mem driver fix. The other minor fixes resolve
        other reported issues.
      
        All of these have been in linux-next for a while"
      
      * tag 'char-misc-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        vme: Fix wrong pointer utilization in ca91cx42_slave_get
        auxdisplay: fix new ht16k33 build errors
        ppdev: don't print a free'd string
        extcon: return error code on failure
        drivers: char: mem: Fix thinkos in kmem address checks
        mei: bus: enable OS version only for SPT and newer
      c9281627
    • Merge tag 'driver-core-4.10-rc4' of... · 2d5a7101
      Linus Torvalds authored
      Merge tag 'driver-core-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fix from Greg KH:
       "Here is a single patch being reverted to remove a feature that was
        added in 4.10-rc1 that isn't quite ready for release.
      
        It will be redone as a debugfs file instead of a sysfs file in the
        future"
      
      * tag 'driver-core-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        Revert "driver core: Add deferred_probe attribute to devices in sysfs"
      2d5a7101
    • Merge tag 'tty-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 7f138b97
      Linus Torvalds authored
      Pull tty/serial fixes from Greg KH:
       "Here are some small tty/serial driver fixes for 4.10-rc4 to resolve a
        number of reported issues.
      
        Nothing major here at all, one revert of a problematic patch, and some
        other tiny bugfixes. Full details are in the shortlog below.
      
        All have been in linux-next with no reported issues"
      
      * tag 'tty-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        sysrq: attach sysrq handler correctly for 32-bit kernel
        Revert "tty: serial: 8250: add CON_CONSDEV to flags"
        Clearing FIFOs in RS485 emulation mode causes subsequent transmits to break
        8250_pci: Fix potential use-after-free in error path
        tty/serial: atmel: RS485 half duplex w/DMA: enable RX after TX is done
        tty/serial: atmel_serial: BUG: stop DMA from transmitting in stop_tx
      7f138b97
    • Merge tag 'usb-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 793e039e
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are a few small USB driver fixes for 4.10-rc4 to resolve some
        reported issues.
      
        The "largest" here is a number of bugs being fixed in the ch341
        usb-serial driver, to hopefully resolve the mess of different devices
        floating around that use this driver that have been having problems
        with the 4.10-rc1 release.
      
        There's also a tiny musb fix that I missed in the last pull request,
        as well as the traditional xhci fix rounding out the batch.
      
        All have been in linux-next with no reported issues"
      
      * tag 'usb-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        xhci: fix deadlock at host remove by running watchdog correctly
        USB: serial: ch341: fix control-message error handling
        usb: musb: fix runtime PM in debugfs
        wusbcore: Fix one more crypto-on-the-stack bug
        USB: serial: kl5kusb105: fix line-state error handling
        USB: serial: ch341: fix baud rate and line-control handling
        USB: serial: ch341: fix line settings after reset-resume
        USB: serial: ch341: fix resume after reset
        USB: serial: ch341: fix open error handling
        USB: serial: ch341: fix modem-control and B0 handling
        USB: serial: ch341: fix open and resume after B0
        USB: serial: ch341: fix initial modem-control state
      793e039e
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · aa2797b3
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Bugfixes for I2C. Mostly core this time which is a bit unusual but
        nothing really scary in there"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: piix4: Avoid race conditions with IMC
        i2c: fix spelling mistake: "insufficent" -> "insufficient"
        i2c: print correct device invalid address
        i2c: do not enable fall back to Host Notify by default
        i2c: fix kernel memory disclosure in dev interface
      aa2797b3
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 83346fbc
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes:
      
         - unwinder fixes
         - AMD CPU topology enumeration fixes
         - microcode loader fixes
         - x86 embedded platform fixes
         - fix for a bootup crash that may trigger when clearcpuid= is used
           with invalid values"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mpx: Use compatible types in comparison to fix sparse error
        x86/tsc: Add the Intel Denverton Processor to native_calibrate_tsc()
        x86/entry: Fix the end of the stack for newly forked tasks
        x86/unwind: Include __schedule() in stack traces
        x86/unwind: Disable KASAN checks for non-current tasks
        x86/unwind: Silence warnings for non-current tasks
        x86/microcode/intel: Use correct buffer size for saving microcode data
        x86/microcode/intel: Fix allocation size of struct ucode_patch
        x86/microcode/intel: Add a helper which gives the microcode revision
        x86/microcode: Use native CPUID to tickle out microcode revision
        x86/CPU: Add native CPUID variants returning a single datum
        x86/boot: Add missing declaration of string functions
        x86/CPU/AMD: Fix Bulldozer topology
        x86/platform/intel-mid: Rename 'spidev' to 'mrfld_spidev'
        x86/cpu: Fix typo in the comment for Anniedale
        x86/cpu: Fix bootup crashes by sanitizing the argument of the 'clearcpuid=' command-line option
      83346fbc
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a11ce3a4
      Linus Torvalds authored
      Pull NOHZ fix from Ingo Molnar:
       "This fixes an old NOHZ race where we incorrectly calculate the next
        timer interrupt in certain circumstances where hrtimers are pending,
        that can cause hard to reproduce stalled-values artifacts in
        /proc/stat"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        nohz: Fix collision between tick and other hrtimers
      a11ce3a4
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 79078c53
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Misc race fixes uncovered by fuzzing efforts, a Sparse fix, two PMU
        driver fixes, plus miscellaneous tooling fixes"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86: Reject non sampling events with precise_ip
        perf/x86/intel: Account interrupts for PEBS errors
        perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race
        perf/core: Fix sys_perf_event_open() vs. hotplug
        perf/x86/intel: Use ULL constant to prevent undefined shift behaviour
        perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code
        perf/x86: Set pmu->module in Intel PMU modules
        perf probe: Fix to probe on gcc generated symbols for offline kernel
        perf probe: Fix --funcs to show correct symbols for offline module
        perf symbols: Robustify reading of build-id from sysfs
        perf tools: Install tools/lib/traceevent plugins with install-bin
        tools lib traceevent: Fix prev/next_prio for deadline tasks
        perf record: Fix --switch-output documentation and comment
        perf record: Make __record_options static
        tools lib subcmd: Add OPT_STRING_OPTARG_SET option
        perf probe: Fix to get correct modname from elf header
        samples/bpf trace_output_user: Remove duplicate sys/ioctl.h include
        samples/bpf sock_example: Avoid getting ethhdr from two includes
        perf sched timehist: Show total scheduling time
      79078c53
    • Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 255e6140
      Linus Torvalds authored
      Pull EFI fixes from Ingo Molnar:
       "A number of regression fixes:
      
         - Fix a boot hang on machines that have somewhat unusual memory map
           entries of phys_addr=0x0 num_pages=0, which broke due to a recent
           commit. This commit got cherry-picked from the v4.11 queue because
           the bug is affecting real machines.
      
         - Fix a boot hang also reported by KASAN, caused by incorrect init
           ordering introduced by a recent optimization.
      
         - Fix a recent robustification fix to allocate_new_fdt_and_exit_boot()
           that introduced an invalid assumption. Neither bugs were seen in
           the wild AFAIK"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi/x86: Prune invalid memory map entries and fix boot regression
        x86/efi: Don't allocate memmap through memblock after mm_init()
        efi/libstub/arm*: Pass latest memory map to the kernel
      255e6140
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · f4d3935e
      Linus Torvalds authored
      Pull vfs fixes from Al Viro.
      
      The most notable fix here is probably the fix for a splice regression
      ("fix a fencepost error in pipe_advance()") noticed by Alan Wylie.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fix a fencepost error in pipe_advance()
        coredump: Ensure proper size of sparse core files
        aio: fix lock dep warning
        tmpfs: clear S_ISGID when setting posix ACLs
      f4d3935e
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 34241af7
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - the virtio_blk stack DMA corruption fix from Christoph, fixing an
         issue with VMAP stacks.
      
       - O_DIRECT blkbits calculation fix from Chandan.
      
       - discard regression fix from Christoph.
      
       - queue init error handling fixes for nbd and virtio_blk, from Omar and
         Jeff.
      
       - two small nvme fixes, from Christoph and Guilherme.
      
       - rename of blk_queue_zone_size and bdev_zone_size to _sectors instead,
         to more closely follow what we do in other places in the block layer.
         This interface is new for this series, so let's get the naming right
         before releasing a kernel with this feature. From Damien.
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        block: don't try to discard from __blkdev_issue_zeroout
        sd: remove __data_len hack for WRITE SAME
        nvme: use blk_rq_payload_bytes
        scsi: use blk_rq_payload_bytes
        block: add blk_rq_payload_bytes
        block: Rename blk_queue_zone_size and bdev_zone_size
        nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too
        nvme-rdma: fix nvme_rdma_queue_is_ready
        virtio_blk: fix panic in initialization error path
        nbd: blk_mq_init_queue returns an error code on failure, not NULL
        virtio_blk: avoid DMA to stack for the sense buffer
        do_direct_IO: Use inode->i_blkbits to compute block count to be cleaned
      34241af7
    • fix a fencepost error in pipe_advance() · b9dc6f65
      Al Viro authored
      The logic in pipe_advance() used to release all buffers past the new
      position failed in cases when the number of buffers to release was equal
      to pipe->buffers.  If that happened, none of them had been released,
      leaving pipe full.  Worse, it was trivial to trigger and we end up with
      pipe full of uninitialized pages.  IOW, it's an infoleak.
      
      Cc: stable@vger.kernel.org # v4.9
      Reported-by: "Alan J. Wylie" <alan@wylie.me.uk>
      Tested-by: "Alan J. Wylie" <alan@wylie.me.uk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      b9dc6f65
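The fencepost described in the pipe_advance() commit above can be illustrated with a small ring buffer. This is only a sketch under stated assumptions, not the kernel's code: `struct ring`, `RING_SLOTS`, `release_by_index()` and `release_by_count()` are hypothetical names invented here. It shows why a loop that stops when an index meets the end index releases nothing when every slot has to be released, and how counting slots avoids that.

```c
#include <assert.h>

#define RING_SLOTS 4u   /* hypothetical size; a power of two so masking wraps */

struct ring {
    int in_use[RING_SLOTS];
    unsigned head;    /* index of the oldest occupied slot */
    unsigned count;   /* number of occupied slots */
};

/*
 * Index-based release: free slots from idx up to the end of the occupied
 * range. When the ring is full and idx == head, the end index wraps around
 * to idx itself, so the loop body never runs and nothing is released.
 */
unsigned release_by_index(struct ring *r, unsigned idx)
{
    unsigned end = (r->head + r->count) & (RING_SLOTS - 1);
    unsigned released = 0;

    while (idx != end) {
        r->in_use[idx] = 0;
        idx = (idx + 1) & (RING_SLOTS - 1);
        released++;
    }
    r->count -= released;
    return released;
}

/*
 * Count-based release: work out how many slots must stay occupied and
 * release until only that many remain. A full ring is handled correctly
 * because the loop condition compares counts, not wrapped indices.
 */
unsigned release_by_count(struct ring *r, unsigned idx)
{
    unsigned keep = (idx - r->head) & (RING_SLOTS - 1);
    unsigned released = 0;

    while (r->count > keep) {
        r->in_use[idx] = 0;
        idx = (idx + 1) & (RING_SLOTS - 1);
        r->count--;
        released++;
    }
    return released;
}
```

With a full ring and `idx == head`, `release_by_index()` returns 0 and leaves the ring full, which mirrors the "none of them had been released" case in the commit message, while `release_by_count()` frees all four slots.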
    • coredump: Ensure proper size of sparse core files · 4d22c75d
      Dave Kleikamp authored
      If the last section of a core file ends with an unmapped or zero page,
      the size of the file does not correspond with the last dump_skip() call.
      gdb then complains that the file is truncated, which can confuse users.
      
      After all of the vma sections are written, make sure that the file size
      is no smaller than the current file position.
      
      This problem can be demonstrated with gdb's bigcore testcase on the
      sparc architecture.

      Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      4d22c75d
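The idea in the coredump commit above, making sure the file size is no smaller than the current file position after skipping over holes, has a simple userspace analogue. The sketch below uses only POSIX calls (`lseek`, `fstat`, `ftruncate`); the function name `extend_to_position()` is hypothetical, and this is not the kernel's implementation, which works on the core dump's own file handle.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * If the file offset has been moved past the end of the file (for example
 * by seeking over a hole instead of writing zeroes), grow the file so that
 * its size matches the offset. Returns 0 on success, -1 on error.
 */
int extend_to_position(int fd)
{
    struct stat st;
    off_t pos = lseek(fd, 0, SEEK_CUR);   /* current offset, unchanged */

    if (pos == (off_t)-1 || fstat(fd, &st) != 0)
        return -1;
    if (st.st_size < pos)
        return ftruncate(fd, pos);        /* extends with a trailing hole */
    return 0;
}
```

Without the final extension, a file whose last region was skipped rather than written ends at the last byte actually written, which is the situation the commit message says gdb reports as a truncated core file.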
    • aio: fix lock dep warning · a12f1ae6
      Shaohua Li authored
      lockdep reports a warning. file_start_write/file_end_write only
      acquire/release the lock for regular files, so check the file type on
      the aio side too.
      
      [  453.532141] ------------[ cut here ]------------
      [  453.533011] WARNING: CPU: 1 PID: 1298 at ../kernel/locking/lockdep.c:3514 lock_release+0x434/0x670
      [  453.533011] DEBUG_LOCKS_WARN_ON(depth <= 0)
      [  453.533011] Modules linked in:
      [  453.533011] CPU: 1 PID: 1298 Comm: fio Not tainted 4.9.0+ #964
      [  453.533011] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.0-1.fc24 04/01/2014
      [  453.533011]  ffff8803a24b7a70 ffffffff8196cffb ffff8803a24b7ae8 0000000000000000
      [  453.533011]  ffff8803a24b7ab8 ffffffff81091ee1 ffff8803a5dba700 00000dba00000008
      [  453.533011]  ffffed0074496f59 ffff8803a5dbaf54 ffff8803ae0f8488 fffffffffffffdef
      [  453.533011] Call Trace:
      [  453.533011]  [<ffffffff8196cffb>] dump_stack+0x67/0x9c
      [  453.533011]  [<ffffffff81091ee1>] __warn+0x111/0x130
      [  453.533011]  [<ffffffff81091f97>] warn_slowpath_fmt+0x97/0xb0
      [  453.533011]  [<ffffffff81091f00>] ? __warn+0x130/0x130
      [  453.533011]  [<ffffffff8191b789>] ? blk_finish_plug+0x29/0x60
      [  453.533011]  [<ffffffff811205d4>] lock_release+0x434/0x670
      [  453.533011]  [<ffffffff8198af94>] ? import_single_range+0xd4/0x110
      [  453.533011]  [<ffffffff81322195>] ? rw_verify_area+0x65/0x140
      [  453.533011]  [<ffffffff813aa696>] ? aio_write+0x1f6/0x280
      [  453.533011]  [<ffffffff813aa6c9>] aio_write+0x229/0x280
      [  453.533011]  [<ffffffff813aa4a0>] ? aio_complete+0x640/0x640
      [  453.533011]  [<ffffffff8111df20>] ? debug_check_no_locks_freed+0x1a0/0x1a0
      [  453.533011]  [<ffffffff8114793a>] ? debug_lockdep_rcu_enabled.part.2+0x1a/0x30
      [  453.533011]  [<ffffffff81147985>] ? debug_lockdep_rcu_enabled+0x35/0x40
      [  453.533011]  [<ffffffff812a92be>] ? __might_fault+0x7e/0xf0
      [  453.533011]  [<ffffffff813ac9bc>] do_io_submit+0x94c/0xb10
      [  453.533011]  [<ffffffff813ac2ae>] ? do_io_submit+0x23e/0xb10
      [  453.533011]  [<ffffffff813ac070>] ? SyS_io_destroy+0x270/0x270
      [  453.533011]  [<ffffffff8111d7b3>] ? mark_held_locks+0x23/0xc0
      [  453.533011]  [<ffffffff8100201a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
      [  453.533011]  [<ffffffff813acb90>] SyS_io_submit+0x10/0x20
      [  453.533011]  [<ffffffff824f96aa>] entry_SYSCALL_64_fastpath+0x18/0xad
      [  453.533011]  [<ffffffff81119190>] ? trace_hardirqs_off_caller+0xc0/0x110
      [  453.533011] ---[ end trace b2fbe664d1cc0082 ]---
      
      Cc: Dmitry Monakhov <dmonakhov@openvz.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      a12f1ae6
  6. 14 Jan, 2017 7 commits
    • Merge tag 'dmaengine-fix-4.10-rc4' of git://git.infradead.org/users/vkoul/slave-dma · f0ad1771
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
       "The fixes this time around are spread over drivers, pretty normal
        update:
      
         - PCI ID for SKL ioatdma, workaround for SKX and
           ioat_alloc_chan_resources sleepy allocation fix
      
         - dw kconfig typo fix
      
         - null pointer deref for stm32
      
         - MAINTAINERS Update for at_hdmac
      
         - pl330 runtime pm fixes
      
         - omap-dma port window fix
      
         - rcar-dmac unmap slave resource fix"
      
      * tag 'dmaengine-fix-4.10-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: rcar-dmac: unmap slave resource when channel is freed
        dmaengine: omap-dma: Fix the port_window support
        dmaengine: iota: ioat_alloc_chan_resources should not perform sleeping allocations.
        dmaengine: pl330: Fix runtime PM support for terminated transfers
        MAINTAINERS: dmaengine: Update + Hand over the at_hdmac driver to Ludovic
        dmaengine: omap-dma: Fix dynamic lch_map allocation
        dmaengine: ti-dma-crossbar: Add some 'of_node_put()' in error path.
        dmaengine: stm32-dma: Fix null pointer dereference in stm32_dma_tx_status
        dmaengine: stm32-dma: Set correct args number for DMA request from DT
        dmaengine: dw: fix typo in Kconfig
        dmaengine: ioatdma: workaround SKX ioatdma version
        dmaengine: ioatdma: Add Skylake PCI Dev ID
      f0ad1771
    • efi/x86: Prune invalid memory map entries and fix boot regression · 0100a3e6
      Peter Jones authored
      Some machines, such as the Lenovo ThinkPad W541 with firmware GNET80WW
      (2.28), include memory map entries with phys_addr=0x0 and num_pages=0.
      
      These machines fail to boot after the following commit,
      
        commit 8e80632f ("efi/esrt: Use efi_mem_reserve() and avoid a kmalloc()")
      
      Fix this by removing such bogus entries from the memory map.
      
      Furthermore, currently the log output for this case (with efi=debug)
      looks like:
      
       [    0.000000] efi: mem45: [Reserved           |   |  |  |  |  |  |  |  |  |  |  |  ] range=[0x0000000000000000-0xffffffffffffffff] (0MB)
      
      This is clearly wrong, and also not as informative as it could be.  This
      patch changes it so that if we find obviously invalid memory map
      entries, we print an error and skip those entries.  It also detects the
      display of the address range calculation overflow, so the new output is:
      
       [    0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries:
       [    0.000000] efi: mem45: [Reserved           |   |  |  |  |  |  |  |   |  |  |  |  ] range=[0x0000000000000000-0x0000000000000000] (invalid)
      
      It also detects memory map sizes that would overflow the physical
      address, for example phys_addr=0xfffffffffffff000 and
      num_pages=0x0200000000000001, and prints:
      
       [    0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries:
       [    0.000000] efi: mem45: [Reserved           |   |  |  |  |  |  |  |   |  |  |  |  ] range=[phys_addr=0xfffffffffffff000-0x20ffffffffffffffff] (invalid)
      
      It then removes these entries from the memory map.

      Signed-off-by: Peter Jones <pjones@redhat.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      [ardb: refactor for clarity with no functional changes, avoid PAGE_SHIFT]
      Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
      [Matt: Include bugzilla info in commit log]
      Cc: <stable@vger.kernel.org> # v4.9+
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=191121
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0100a3e6
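The two kinds of bogus entry described in the EFI commit above (zero pages, and a page count that overflows the physical address range) can be checked with plain 64-bit arithmetic. The following is a minimal sketch, assuming 4 KiB EFI pages; `memmap_entry_valid()` and `PAGE_SHIFT_4K` are hypothetical names, and the kernel's actual check may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12   /* EFI memory map pages are 4 KiB */

/*
 * Returns false for entries that cannot describe a real memory range:
 * - num_pages == 0 (an empty range), or
 * - a page count whose byte size does not fit in 64 bits, or
 * - a range whose last byte would wrap past the top of the address space.
 */
bool memmap_entry_valid(uint64_t phys_addr, uint64_t num_pages)
{
    uint64_t size;

    if (num_pages == 0)
        return false;
    if (num_pages > (UINT64_MAX >> PAGE_SHIFT_4K))
        return false;                      /* size in bytes would overflow */
    size = num_pages << PAGE_SHIFT_4K;
    if (phys_addr > UINT64_MAX - (size - 1))
        return false;                      /* end address would wrap */
    return true;
}
```

Both examples from the commit message are rejected by this check: `phys_addr=0x0, num_pages=0` fails the first test, and `phys_addr=0xfffffffffffff000, num_pages=0x0200000000000001` fails the second.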
    • Revert "driver core: Add deferred_probe attribute to devices in sysfs" · c7334ce8
      Greg Kroah-Hartman authored
      This reverts commit 6751667a.
      
      Rob Herring objected to it, and a replacement for it will be added using
      debugfs in the future.
      
      Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Reported-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7334ce8
    • perf/x86: Reject non sampling events with precise_ip · 18e7a45a
      Jiri Olsa authored
      As Peter suggested [1], reject non-sampling PEBS events,
      because they don't make any sense and could cause bugs
      in the NMI handler [2].
      
        [1] http://lkml.kernel.org/r/20170103094059.GC3093@worktop
        [2] http://lkml.kernel.org/r/1482931866-6018-3-git-send-email-jolsa@kernel.org

      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vince@deater.net>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20170103142454.GA26251@krava
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      18e7a45a
    • perf/x86/intel: Account interrupts for PEBS errors · 475113d9
      Jiri Olsa authored
      It's possible to set up PEBS events to get only errors and not
      any data, like on SNB-X (model 45) and IVB-EP (model 62)
      via 2 perf commands running simultaneously:
      
          taskset -c 1 ./perf record -c 4 -e branches:pp -j any -C 10
      
      This leads to a soft lock up, because the error path of the
      intel_pmu_drain_pebs_nhm() does not account event->hw.interrupt
      for error PEBS interrupts, so in case you're getting ONLY
      errors you don't have a way to stop the event when it's over
      the max_samples_per_tick limit:
      
        NMI watchdog: BUG: soft lockup - CPU#22 stuck for 22s! [perf_fuzzer:5816]
        ...
        RIP: 0010:[<ffffffff81159232>]  [<ffffffff81159232>] smp_call_function_single+0xe2/0x140
        ...
        Call Trace:
         ? trace_hardirqs_on_caller+0xf5/0x1b0
         ? perf_cgroup_attach+0x70/0x70
         perf_install_in_context+0x199/0x1b0
         ? ctx_resched+0x90/0x90
         SYSC_perf_event_open+0x641/0xf90
         SyS_perf_event_open+0x9/0x10
         do_syscall_64+0x6c/0x1f0
         entry_SYSCALL64_slow_path+0x25/0x25
      
      Add perf_event_account_interrupt() which does the interrupt
      and frequency checks and call it from intel_pmu_drain_pebs_nhm()'s
      error path.
      
      We keep the pending_kill and pending_wakeup logic only in the
      __perf_event_overflow() path, because they make sense only if
      there's any data to deliver.

      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vince@deater.net>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1482931866-6018-2-git-send-email-jolsa@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      475113d9
    • perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race · 321027c1
      Peter Zijlstra authored
      Di Shen reported a race between two concurrent sys_perf_event_open()
      calls where both try and move the same pre-existing software group
      into a hardware context.
      
      The problem is exactly that described in commit:
      
        f63a8daa ("perf: Fix event->ctx locking")
      
      ... where, while we wait for a ctx->mutex acquisition, the event->ctx
      relation can have changed under us.
      
      That very same commit failed to recognise sys_perf_event_open() as an
      external access vector to the events and thereby didn't apply the
      established locking rules correctly.
      
      So while one sys_perf_event_open() call is stuck waiting on
      mutex_lock_double(), the other (which owns said locks) moves the group
      about. So by the time the former sys_perf_event_open() acquires the
      locks, the context we've acquired is stale (and possibly dead).
      
      Apply the established locking rules as per perf_event_ctx_lock_nested()
      to the mutex_lock_double() for the 'move_group' case. This obviously means
      we need to validate state after we acquire the locks.
      
      Reported-by: Di Shen (Keen Lab)
      Tested-by: John Dias <joaodias@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Min Chong <mchong@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: f63a8daa ("perf: Fix event->ctx locking")
      Link: http://lkml.kernel.org/r/20170106131444.GZ3174@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      321027c1
    • perf/core: Fix sys_perf_event_open() vs. hotplug · 63cae12b
      Peter Zijlstra authored
      There is a problem with installing an event in a task that is 'stuck' on
      an offline CPU.

      Blocked tasks are not disassociated from offlined CPUs; after all, a
      blocked task doesn't run and doesn't require a CPU etc. Only on
      wakeup do we amend the situation and place the task on an available
      CPU.
      
      If we hit such a task with perf_install_in_context() we'll loop until
      either that task wakes up or the CPU comes back online, if the task
      waking depends on the event being installed, we're stuck.
      
      While looking into this issue, I also spotted another problem, if we
      hit a task with perf_install_in_context() that is in the middle of
      being migrated, that is we observe the old CPU before sending the IPI,
      but run the IPI (on the old CPU) while the task is already running on
      the new CPU, things also go sideways.
      
      Rework things to rely on task_curr() -- outside of rq->lock -- which
      is rather tricky. Imagine the following scenario where we're trying to
      install the first event into our task 't':
      
      CPU0            CPU1            CPU2
      
                      (current == t)
      
      t->perf_event_ctxp[] = ctx;
      smp_mb();
      cpu = task_cpu(t);
      
                      switch(t, n);
                                      migrate(t, 2);
                                      switch(p, t);
      
                                      ctx = t->perf_event_ctxp[]; // must not be NULL
      
      smp_function_call(cpu, ..);
      
                      generic_exec_single()
                        func();
                          spin_lock(ctx->lock);
                          if (task_curr(t)) // false
      
                          add_event_to_ctx();
                          spin_unlock(ctx->lock);
      
                                      perf_event_context_sched_in();
                                        spin_lock(ctx->lock);
                                        // sees event
      
      So it's CPU0's store of t->perf_event_ctxp[] that must not go 'missing'.
      Because if CPU2's load of that variable were to observe NULL, it would
      not try to schedule the ctx and we'd have a task running without its
      counter, which would be 'bad'.
      
      As long as we observe !NULL, we'll acquire ctx->lock. If we acquire it
      first and not see the event yet, then CPU0 must observe task_curr()
      and retry. If the install happens first, then we must see the event on
      sched-in and all is well.
      
      I think we can translate the first part (until the 'must not be NULL')
      of the scenario to a litmus test like:
      
        C C-peterz
      
        {
        }
      
        P0(int *x, int *y)
        {
                int r1;
      
                WRITE_ONCE(*x, 1);
                smp_mb();
                r1 = READ_ONCE(*y);
        }
      
        P1(int *y, int *z)
        {
                WRITE_ONCE(*y, 1);
                smp_store_release(z, 1);
        }
      
        P2(int *x, int *z)
        {
                int r1;
                int r2;
      
                r1 = smp_load_acquire(z);
                smp_mb();
                r2 = READ_ONCE(*x);
        }
      
        exists
        (0:r1=0 /\ 2:r1=1 /\ 2:r2=0)
      
      Where:
        x is perf_event_ctxp[],
        y is our tasks's CPU, and
        z is our task being placed on the rq of CPU2.
      
      The P0 smp_mb() is the one added by this patch, ordering the store to
      perf_event_ctxp[] from find_get_context() and the load of task_cpu()
      in task_function_call().
      
      The smp_store_release/smp_load_acquire model the RCpc locking of the
      rq->lock and the smp_mb() of P2 is the context switch switching from
      whatever CPU2 was running to our task 't'.
      
      This litmus test evaluates into:
      
        Test C-peterz Allowed
        States 7
        0:r1=0; 2:r1=0; 2:r2=0;
        0:r1=0; 2:r1=0; 2:r2=1;
        0:r1=0; 2:r1=1; 2:r2=1;
        0:r1=1; 2:r1=0; 2:r2=0;
        0:r1=1; 2:r1=0; 2:r2=1;
        0:r1=1; 2:r1=1; 2:r2=0;
        0:r1=1; 2:r1=1; 2:r2=1;
        No
        Witnesses
        Positive: 0 Negative: 7
        Condition exists (0:r1=0 /\ 2:r1=1 /\ 2:r2=0)
        Observation C-peterz Never 0 7
        Hash=e427f41d9146b2a5445101d3e2fcaa34
      
      And the strong and weak model agree.

      Reported-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: jeremy.linton@arm.com
      Link: http://lkml.kernel.org/r/20161209135900.GU3174@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      63cae12b
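As a rough sanity check of the C-peterz litmus test in the commit above, the sketch below enumerates every interleaving of the three threads under sequential consistency and records which final register values are reachable. This is a deliberately simplified model and an assumption of this example, not the tool that produced the output quoted in the commit: barriers and acquire/release are treated as no-ops, because sequential consistency is already stronger than any of them, so it cannot show what a weaker memory model would allow. All names (`struct sc_state`, `run_op()`, `explore()`, `sc_outcomes()`, `outcome_seen()`) are hypothetical.

```c
#include <assert.h>

struct sc_state {
    int x, y, z;              /* shared variables, all initially 0 */
    int pc[3];                /* index of the next operation per thread */
    int p0_r1, p2_r1, p2_r2;  /* registers named as in the litmus test */
};

/* Execute the next memory operation of thread t (two operations each). */
static void run_op(struct sc_state *s, int t)
{
    int op = s->pc[t]++;

    if (t == 0) {             /* P0: x = 1; r1 = y */
        if (op == 0) s->x = 1; else s->p0_r1 = s->y;
    } else if (t == 1) {      /* P1: y = 1; z = 1 */
        if (op == 0) s->y = 1; else s->z = 1;
    } else {                  /* P2: r1 = z; r2 = x */
        if (op == 0) s->p2_r1 = s->z; else s->p2_r2 = s->x;
    }
}

/* Depth-first walk over all interleavings; one bit per final outcome. */
static void explore(struct sc_state s, unsigned *seen)
{
    int done = 1;

    for (int t = 0; t < 3; t++) {
        if (s.pc[t] < 2) {
            struct sc_state next = s;

            done = 0;
            run_op(&next, t);
            explore(next, seen);
        }
    }
    if (done)
        *seen |= 1u << (s.p0_r1 * 4 + s.p2_r1 * 2 + s.p2_r2);
}

/* Bitmask of reachable (0:r1, 2:r1, 2:r2) outcomes. */
unsigned sc_outcomes(void)
{
    struct sc_state init = {0};
    unsigned seen = 0;

    explore(init, &seen);
    return seen;
}

/* Non-zero if the given outcome is present in the bitmask. */
int outcome_seen(unsigned seen, int p0_r1, int p2_r1, int p2_r2)
{
    return (seen >> (p0_r1 * 4 + p2_r1 * 2 + p2_r2)) & 1u;
}
```

In this simplified model, `outcome_seen(sc_outcomes(), 0, 1, 0)` is 0 and the other seven outcomes are reachable, which matches the seven states and the "Never" observation listed in the commit message.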