1. 20 Apr, 2017 1 commit
    • Steven Rostedt (VMware)'s avatar
      ring-buffer: Have ring_buffer_iter_empty() return true when empty · 78f7a45d
      Steven Rostedt (VMware) authored
      I noticed that reading the snapshot file when it is empty no longer gives a
      status. It suppose to show the status of the snapshot buffer as well as how
      to allocate and use it. For example:
      
       ># cat snapshot
       # tracer: nop
       #
       #
       # * Snapshot is allocated *
       #
       # Snapshot commands:
       # echo 0 > snapshot : Clears and frees snapshot buffer
       # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
       #                      Takes a snapshot of the main buffer.
       # echo 2 > snapshot : Clears snapshot buffer (but does not allocate or free)
       #                      (Doesn't have to be '2' works with any number that
       #                       is not a '0' or '1')
      
      But instead it just showed an empty buffer:
      
       ># cat snapshot
       # tracer: nop
       #
       # entries-in-buffer/entries-written: 0/0   #P:4
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
      
      What happened was that it was using the ring_buffer_iter_empty() function to
      see if it was empty, and if it was, it showed the status. But that function
      was returning false when it was empty. The reason was that the iter header
      page was on the reader page, and the reader page was empty, but so was the
      buffer itself. The check only tested to see if the iter was on the commit
      page, but the commit page was no longer pointing to the reader page, but as
      all pages were empty, the buffer is also.
      
      Cc: stable@vger.kernel.org
      Fixes: 651e22f2 ("ring-buffer: Always reset iterator to reader page")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      78f7a45d
  2. 19 Apr, 2017 1 commit
    • Steven Rostedt (VMware)'s avatar
      tracing: Allocate the snapshot buffer before enabling probe · df62db5b
      Steven Rostedt (VMware) authored
      Currently the snapshot trigger enables the probe and then allocates the
      snapshot. If the probe triggers before the allocation, it could cause the
      snapshot to fail and turn tracing off. It's best to allocate the snapshot
      buffer first, and then enable the trigger. If something goes wrong in the
      enabling of the trigger, the snapshot buffer is still allocated, but it can
      also be freed by the user by writting zero into the snapshot buffer file.
      
      Also add a check of the return status of alloc_snapshot().
      
      Cc: stable@vger.kernel.org
      Fixes: 77fd5c15 ("tracing: Add snapshot trigger to function probes")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      df62db5b
  3. 18 Apr, 2017 2 commits
  4. 17 Apr, 2017 1 commit
    • Namhyung Kim's avatar
      ftrace: Fix function pid filter on instances · d879d0b8
      Namhyung Kim authored
      When function tracer has a pid filter, it adds a probe to sched_switch
      to track if current task can be ignored.  The probe checks the
      ftrace_ignore_pid from current tr to filter tasks.  But it misses to
      delete the probe when removing an instance so that it can cause a crash
      due to the invalid tr pointer (use-after-free).
      
      This is easily reproducible with the following:
      
        # cd /sys/kernel/debug/tracing
        # mkdir instances/buggy
        # echo $$ > instances/buggy/set_ftrace_pid
        # rmdir instances/buggy
      
        ============================================================================
        BUG: KASAN: use-after-free in ftrace_filter_pid_sched_switch_probe+0x3d/0x90
        Read of size 8 by task kworker/0:1/17
        CPU: 0 PID: 17 Comm: kworker/0:1 Tainted: G    B           4.11.0-rc3  #198
        Call Trace:
         dump_stack+0x68/0x9f
         kasan_object_err+0x21/0x70
         kasan_report.part.1+0x22b/0x500
         ? ftrace_filter_pid_sched_switch_probe+0x3d/0x90
         kasan_report+0x25/0x30
         __asan_load8+0x5e/0x70
         ftrace_filter_pid_sched_switch_probe+0x3d/0x90
         ? fpid_start+0x130/0x130
         __schedule+0x571/0xce0
         ...
      
      To fix it, use ftrace_clear_pids() to unregister the probe.  As
      instance_rmdir() already updated ftrace codes, it can just free the
      filter safely.
      
      Link: http://lkml.kernel.org/r/20170417024430.21194-2-namhyung@kernel.org
      
      Fixes: 0c8916c3 ("tracing: Add rmdir to remove multibuffer instances")
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      d879d0b8
  5. 14 Apr, 2017 1 commit
    • Steven Rostedt (VMware)'s avatar
      ftrace: Fix removing of second function probe · 82cc4fc2
      Steven Rostedt (VMware) authored
      When two function probes are added to set_ftrace_filter, and then one of
      them is removed, the update to the function locations is not performed, and
      the record keeping of the function states are corrupted, and causes an
      ftrace_bug() to occur.
      
      This is easily reproducable by adding two probes, removing one, and then
      adding it back again.
      
       # cd /sys/kernel/debug/tracing
       # echo schedule:traceoff > set_ftrace_filter
       # echo do_IRQ:traceoff > set_ftrace_filter
       # echo \!do_IRQ:traceoff > /debug/tracing/set_ftrace_filter
       # echo do_IRQ:traceoff > set_ftrace_filter
      
      Causes:
       ------------[ cut here ]------------
       WARNING: CPU: 2 PID: 1098 at kernel/trace/ftrace.c:2369 ftrace_get_addr_curr+0x143/0x220
       Modules linked in: [...]
       CPU: 2 PID: 1098 Comm: bash Not tainted 4.10.0-test+ #405
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
       Call Trace:
        dump_stack+0x68/0x9f
        __warn+0x111/0x130
        ? trace_irq_work_interrupt+0xa0/0xa0
        warn_slowpath_null+0x1d/0x20
        ftrace_get_addr_curr+0x143/0x220
        ? __fentry__+0x10/0x10
        ftrace_replace_code+0xe3/0x4f0
        ? ftrace_int3_handler+0x90/0x90
        ? printk+0x99/0xb5
        ? 0xffffffff81000000
        ftrace_modify_all_code+0x97/0x110
        arch_ftrace_update_code+0x10/0x20
        ftrace_run_update_code+0x1c/0x60
        ftrace_run_modify_code.isra.48.constprop.62+0x8e/0xd0
        register_ftrace_function_probe+0x4b6/0x590
        ? ftrace_startup+0x310/0x310
        ? debug_lockdep_rcu_enabled.part.4+0x1a/0x30
        ? update_stack_state+0x88/0x110
        ? ftrace_regex_write.isra.43.part.44+0x1d3/0x320
        ? preempt_count_sub+0x18/0xd0
        ? mutex_lock_nested+0x104/0x800
        ? ftrace_regex_write.isra.43.part.44+0x1d3/0x320
        ? __unwind_start+0x1c0/0x1c0
        ? _mutex_lock_nest_lock+0x800/0x800
        ftrace_trace_probe_callback.isra.3+0xc0/0x130
        ? func_set_flag+0xe0/0xe0
        ? __lock_acquire+0x642/0x1790
        ? __might_fault+0x1e/0x20
        ? trace_get_user+0x398/0x470
        ? strcmp+0x35/0x60
        ftrace_trace_onoff_callback+0x48/0x70
        ftrace_regex_write.isra.43.part.44+0x251/0x320
        ? match_records+0x420/0x420
        ftrace_filter_write+0x2b/0x30
        __vfs_write+0xd7/0x330
        ? do_loop_readv_writev+0x120/0x120
        ? locks_remove_posix+0x90/0x2f0
        ? do_lock_file_wait+0x160/0x160
        ? __lock_is_held+0x93/0x100
        ? rcu_read_lock_sched_held+0x5c/0xb0
        ? preempt_count_sub+0x18/0xd0
        ? __sb_start_write+0x10a/0x230
        ? vfs_write+0x222/0x240
        vfs_write+0xef/0x240
        SyS_write+0xab/0x130
        ? SyS_read+0x130/0x130
        ? trace_hardirqs_on_caller+0x182/0x280
        ? trace_hardirqs_on_thunk+0x1a/0x1c
        entry_SYSCALL_64_fastpath+0x18/0xad
       RIP: 0033:0x7fe61c157c30
       RSP: 002b:00007ffe87890258 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
       RAX: ffffffffffffffda RBX: ffffffff8114a410 RCX: 00007fe61c157c30
       RDX: 0000000000000010 RSI: 000055814798f5e0 RDI: 0000000000000001
       RBP: ffff8800c9027f98 R08: 00007fe61c422740 R09: 00007fe61ca53700
       R10: 0000000000000073 R11: 0000000000000246 R12: 0000558147a36400
       R13: 00007ffe8788f160 R14: 0000000000000024 R15: 00007ffe8788f15c
        ? trace_hardirqs_off_caller+0xc0/0x110
       ---[ end trace 99fa09b3d9869c2c ]---
       Bad trampoline accounting at: ffffffff81cc3b00 (do_IRQ+0x0/0x150)
      
      Cc: stable@vger.kernel.org
      Fixes: 59df055f ("ftrace: trace different functions with a different tracer")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      82cc4fc2
  6. 05 Apr, 2017 1 commit
  7. 03 Apr, 2017 1 commit
  8. 02 Apr, 2017 10 commits
    • Linus Torvalds's avatar
      Merge tag 'dmaengine-fix-4.11-rc5' of git://git.infradead.org/users/vkoul/slave-dma · f49237bf
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
       "A couple of minor fixes for 4.11:
      
         - array bound fix for __get_unmap_pool()
      
         - cyclic period splitting for bcm2835"
      
      * tag 'dmaengine-fix-4.11-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: Fix array index out of bounds warning in __get_unmap_pool()
        dmaengine: bcm2835: Fix cyclic DMA period splitting
      f49237bf
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 496dcc50
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "This update provides:
      
         - prevent KASLR from randomizing EFI regions
      
         - restrict the usage of -maccumulate-outgoing-args and document when
           and why it is required.
      
         - make the Global Physical Address calculation for UV4 systems work
           correctly.
      
         - address a copy->paste->forgot-edit problem in the MCE exception
           table entries.
      
         - assign a name to AMD MCA bank 3, so the sysfs file registration
           works.
      
         - add a missing include in the boot code"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot: Include missing header file
        x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs
        x86/build: Mostly disable '-maccumulate-outgoing-args'
        x86/mm/KASLR: Exclude EFI region from KASLR VA space randomization
        x86/mce: Fix copy/paste error in exception table entries
        x86/platform/uv: Fix calculation of Global Physical Address
      496dcc50
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 128c434a
      Linus Torvalds authored
      Pull scheduler fixes from Thomas Gleixner:
       "This update provides:
      
         - make the scheduler clock switch to unstable mode smooth so the
           timestamps stay at microseconds granularity instead of switching to
           tick granularity.
      
         - unbreak perf test tsc by taking the new offset into account which
           was added in order to proveide better sched clock continuity
      
         - switching sched clock to unstable mode runs all clock related
           computations which affect the sched clock output itself from a work
           queue. In case of preemption sched clock uses half updated data and
           provides wrong timestamps. Keep the math in the protected context
           and delegate only the static key switch to workqueue context.
      
         - remove a duplicate header include"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/headers: Remove duplicate #include <linux/sched/debug.h> line
        sched/clock: Fix broken stable to unstable transfer
        sched/clock, x86/perf: Fix "perf test tsc"
        sched/clock: Fix clear_sched_clock_stable() preempt wobbly
      128c434a
    • Linus Torvalds's avatar
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0a89b5eb
      Linus Torvalds authored
      Pull EFI fix from Thomas Gleixner:
       "Downgrade the missing ESRT header printk to warning level and remove a
        useless error printk which just generates noise for no value"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi/esrt: Cleanup bad memory map log messages
      0a89b5eb
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4a6808f3
      Linus Torvalds authored
      Pull timer fixes from Thomas Gleixner:
       "Two small fixes for the new CLKEVT_OF infrastructure"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        vmlinux.lds: Add __clkevt_of_table to kernel
        clockevents: Fix syntax error in clkevt-of macro
      4a6808f3
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 907977b2
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "Two small fixlets:
      
         - select a required Kconfig to make the MVEBU driver compile
      
         - add the missing MIPS local GIC interrupts which prevent drivers to
           probe successfully"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/mips-gic: Fix Local compare interrupt
        irqchip/mvebu-odmi: Select GENERIC_MSI_IRQ_DOMAIN
      907977b2
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ada63c61
      Linus Torvalds authored
      Pull core fix from Thomas Gleixner:
       "Prevent leaking kernel memory via /proc/$pid/syscall when the queried
        task is not in a syscall"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        lib/syscall: Clear return values when no stack
      ada63c61
    • Linus Torvalds's avatar
      Merge branch 'parisc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 346ce1d7
      Linus Torvalds authored
      Pull parisc fixes from Helge Deller:
       "Al Viro reported that - in case of read faults - our copy_from_user()
        implementation may claim to have copied more bytes than it actually
        did. In order to fix this bug and because of the way how gcc optimizes
        register usage for inline assembly in C code, we had to replace our
        pa_memcpy() function with a pure assembler implementation.
      
        While fixing the memcpy bug we noticed some other issues with our
        get_user() and put_user() functions, e.g. nested faults may return
        wrong data. This is now fixed by a common fixup handler for
        get_user/put_user in the exception handler which additionally makes
        generated code smaller and faster.
      
        The third patch is a trivial one-line fix for a patch which went in
        during 4.11-rc and which avoids stalled CPU warnings after power
        shutdown (for parisc machines which can't plug power off themselves).
      
        Due to the rewrite of pa_memcpy() into assembly this patch got bigger
        than what I wanted to have sent at this stage.
      
        Those patches have been running in production during the last few days
        on our debian build servers without any further issues"
      
      * 'parisc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Avoid stalled CPU warnings after system shutdown
        parisc: Clean up fixup routines for get_user()/put_user()
        parisc: Fix access fault handling in pa_memcpy()
      346ce1d7
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 7d34ddbe
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Thirteen small fixes: The hopefully final effort to get the lpfc nvme
        kconfig problems sorted, there's one important sg fix (user can induce
        read after end of buffer) and one minor enhancement (adding an extra
        PCI ID to qedi). The rest are a set of minor fixes, which mostly occur
        as user visible in error legs or on specific devices"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufs: remove the duplicated checking for supporting clkscaling
        scsi: lpfc: fix building without debugfs support
        scsi: lpfc: Fix PT2PT PRLI reject
        scsi: hpsa: fix volume offline state
        scsi: libsas: fix ata xfer length
        scsi: scsi_dh_alua: Warn if the first argument of alua_rtpg_queue() is NULL
        scsi: scsi_dh_alua: Ensure that alua_activate() calls the completion function
        scsi: scsi_dh_alua: Check scsi_device_get() return value
        scsi: sg: check length passed to SG_NEXT_CMD_LEN
        scsi: ufshcd-platform: remove the useless cast in ERR_PTR/IS_ERR
        scsi: qedi: Add PCI device-ID for QL41xxx adapters.
        scsi: aacraid: Fix potential null access
        scsi: qla2xxx: Fix crash in qla2xxx_eh_abort on bad ptr
      7d34ddbe
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 978e0f92
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "11 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        kasan: do not sanitize kexec purgatory
        drivers/rapidio/devices/tsi721.c: make module parameter variable name unique
        mm/hugetlb.c: don't call region_abort if region_chg fails
        kasan: report only the first error by default
        hugetlbfs: initialize shared policy as part of inode allocation
        mm: fix section name for .data..ro_after_init
        mm, hugetlb: use pte_present() instead of pmd_present() in follow_huge_pmd()
        mm: workingset: fix premature shadow node shrinking with cgroups
        mm: rmap: fix huge file mmap accounting in the memcg stats
        mm: move mm_percpu_wq initialization earlier
        mm: migrate: fix remove_migration_pte() for ksm pages
      978e0f92
  9. 01 Apr, 2017 20 commits
    • Linus Torvalds's avatar
      Merge tag 'usb-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · a9f6b6b8
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some small USB fixes for 4.11-rc5.
      
        The usual xhci fixes are here, as well as a fix for yet-another-bug-
        found-by-KASAN, those developers are doing great stuff here.
      
        And there's a phy build warning fix that showed up in 4.11-rc1.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: phy: isp1301: Fix build warning when CONFIG_OF is disabled
        xhci: Manually give back cancelled URB if we can't queue it for cancel
        xhci: Set URB actual length for stopped control transfers
        xhci: plat: Register shutdown for xhci_plat
        USB: fix linked-list corruption in rh_call_control()
      a9f6b6b8
    • Linus Torvalds's avatar
      Merge tag 'tty-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · b3ff4fac
      Linus Torvalds authored
      Pull tty/serial fixes from Greg KH:
       "Here are some small fixes for some serial drivers and Kconfig help
        text for 4.11-rc5. Nothing major here at all, a few things resolving
        reported bugs in some random serial drivers.
      
        I don't think these made the last linux-next due to me getting to them
        yesterday, but I am not sure, they might have snuck in. The patches
        only affect drivers that the maintainers of sent me these patches for,
        so we should be safe here :)"
      
      * tag 'tty-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        tty: pl011: fix earlycon work-around for QDF2400 erratum 44
        serial: 8250_EXAR: fix duplicate Kconfig text and add missing help text
        tty/serial: atmel: fix TX path in atmel_console_write()
        tty/serial: atmel: fix race condition (TX+DMA)
        serial: mxs-auart: Fix baudrate calculation
      b3ff4fac
    • Linus Torvalds's avatar
      Merge tag 'acpi-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 7ece03b0
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These fix two issues related to IOAPIC hotplug, an overzealous build
        optimization that prevents the function graph tracer from working with
        the ACPI subsystem correctly and an RCU synchronization issue in the
        ACPI APEI code.
      
        Specifics:
      
         - drop the unconditional setting of the '-Os' gcc flag from the ACPI
           Makefile to make the function graph tracer work correctly with the
           ACPI subsystem (Josh Poimboeuf).
      
         - add missing synchronize_rcu() to ghes_remove() which removes an
           element from an RCU-protected list, but fails to synchronize it
           properly afterward (James Morse).
      
         - fix two problems related to IOAPIC hotplug, a local variable
           initialization in setup_res() and the creation of platform device
           objects for IO(x)APICs which are (a) unused and (b) leaked on
           hot-removal (Joerg Roedel)"
      
      * tag 'acpi-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: Fix incompatibility with mcount-based function graph tracing
        ACPI / APEI: Add missing synchronize_rcu() on NOTIFY_SCI removal
        ACPI: Do not create a platform_device for IOAPIC/IOxAPIC
        ACPI: ioapic: Clear on-stack resource before using it
      7ece03b0
    • Linus Torvalds's avatar
      Merge tag 'pm-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 0d2ceec6
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix a cpufreq core issue with the initialization of the cpufreq
        sysfs interface and a cpuidle powernv driver initialization issue.
      
        Specifics:
      
         - symbolic links from CPU directories to the corresponding cpufreq
           policy directories in sysfs are not created during initialization
           in some cases which confuses user space, so prevent that from
           happening (Rafael Wysocki).
      
         - the powernv cpuidle driver fails to pass a correct cpumaks to the
           cpuidle core in some cases which causes subsequent failures to
           occur, so fix it (Vaidyanathan Srinivasan)"
      
      * tag 'pm-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpuidle: powernv: Pass correct drv->cpumask for registration
        cpufreq: Fix creation of symbolic links to policy directories
      0d2ceec6
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 1300dc68
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Two bugfixes from I2C, specifically the I2C mux section. Thanks to
        peda for collecting them"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: mux: pca954x: Add missing pca9546 definition to chip_desc
        Revert "i2c: mux: pca954x: Add ACPI support for pca954x"
      1300dc68
    • Linus Torvalds's avatar
      Merge tag 'arc-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · dcbcb491
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
       "Accumulated fixes for ARC which I've been been sitting on for a while:
      
         - reading clk from driver vs device tree [Vlad]
      
         - fix support for UIO in VDK platform [Alexey]
      
         - SLC busy bit reading workaround
      
         - build warning with kprobes header reorg"
      
      * tag 'arc-4.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: fix build warnings with !CONFIG_KPROBES
        ARCv2: SLC: Make sure busy bit is set properly on SLC flushing
        ARC: vdk: Fix support of UIO
        ARCv2: make unimplemented vectors as no-ops rather than halt core
        ARC: get rate from clk driver instead of reading device tree
        ARC: [dts] add cpu nodes to ARCHS SMP device tree
        ARC: [dts] add input clocks for cpu nodes
      dcbcb491
    • Linus Torvalds's avatar
      Merge tag 'nfsd-4.11-1' of git://linux-nfs.org/~bfields/linux · 09c8b3d1
      Linus Torvalds authored
      Pull nfsd fixes from Bruce Fields:
       "The restriction of NFSv4 to TCP went overboard and also broke the
        backchannel; fix.
      
        Also some minor refinements to the nfsd version-setting interface that
        we'd like to get fixed before release"
      
      * tag 'nfsd-4.11-1' of git://linux-nfs.org/~bfields/linux:
        svcrdma: set XPT_CONG_CTRL flag for bc xprt
        NFSD: fix nfsd_reset_versions for NFSv4.
        NFSD: fix nfsd_minorversion(.., NFSD_AVAIL)
        NFSD: further refinement of content of /proc/fs/nfsd/versions
        nfsd: map the ENOKEY to nfserr_perm for avoiding warning
        SUNRPC/backchanel: set XPT_CONG_CTRL flag for bc xprt
      09c8b3d1
    • Timur Tabi's avatar
      tty: pl011: fix earlycon work-around for QDF2400 erratum 44 · e53e597f
      Timur Tabi authored
      The work-around for the Qualcomm Datacenter Technologies QDF2400
      erratum 44 sets the "qdf2400_e44_present" global variable if the
      work-around is needed.  However, this check does not happen until after
      earlycon is initialized, which means the work-around is not
      used, and the console hangs as soon as it displays one character.
      
      Fixes: d8a4995b ("tty: pl011: Work around QDF2400 E44 stuck BUSY bit")
      Signed-off-by: default avatarTimur Tabi <timur@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e53e597f
    • Linus Torvalds's avatar
      Merge branch 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · fe8e12b5
      Linus Torvalds authored
      Pull btrfs fixes from Chris Mason:
       "We have three small fixes queued up in my for-linus-4.11 branch"
      
      * 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        Btrfs: fix an integer overflow check
        btrfs: Change qgroup_meta_rsv to 64bit
        Btrfs: bring back repair during read
      fe8e12b5
    • Mike Galbraith's avatar
      kasan: do not sanitize kexec purgatory · 13a6798e
      Mike Galbraith authored
      Fixes this:
      
        kexec: Undefined symbol: __asan_load8_noabort
        kexec-bzImage64: Loading purgatory failed
      
      Link: http://lkml.kernel.org/r/1489672155.4458.7.camel@gmx.deSigned-off-by: default avatarMike Galbraith <efault@gmx.de>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      13a6798e
    • Randy Dunlap's avatar
      drivers/rapidio/devices/tsi721.c: make module parameter variable name unique · 4785603b
      Randy Dunlap authored
      kbuild test robot reported a non-static variable name collision between
      a staging driver and a RapidIO driver, with a generic variable name of
      'dbg_level'.
      
      Both drivers should be changed so that they don't use this generic
      public variable name.  This patch fixes the RapidIO driver but does not
      change the user interface (name) for the module parameter.
      
        drivers/staging/built-in.o:(.bss+0x109d0): multiple definition of `dbg_level'
        drivers/rapidio/built-in.o:(.bss+0x16c): first defined here
      
      Link: http://lkml.kernel.org/r/ab527fc5-aa3c-4b07-5d48-eef5de703192@infradead.orgSigned-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Reported-by: default avatarkbuild test robot <fengguang.wu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4785603b
    • Mike Kravetz's avatar
      mm/hugetlb.c: don't call region_abort if region_chg fails · ff8c0c53
      Mike Kravetz authored
      Changes to hugetlbfs reservation maps is a two step process.  The first
      step is a call to region_chg to determine what needs to be changed, and
      prepare that change.  This should be followed by a call to call to
      region_add to commit the change, or region_abort to abort the change.
      
      The error path in hugetlb_reserve_pages called region_abort after a
      failed call to region_chg.  As a result, the adds_in_progress counter in
      the reservation map is off by 1.  This is caught by a VM_BUG_ON in
      resv_map_release when the reservation map is freed.
      
      syzkaller fuzzer (when using an injected kmalloc failure) found this
      bug, that resulted in the following:
      
       kernel BUG at mm/hugetlb.c:742!
       Call Trace:
        hugetlbfs_evict_inode+0x7b/0xa0 fs/hugetlbfs/inode.c:493
        evict+0x481/0x920 fs/inode.c:553
        iput_final fs/inode.c:1515 [inline]
        iput+0x62b/0xa20 fs/inode.c:1542
        hugetlb_file_setup+0x593/0x9f0 fs/hugetlbfs/inode.c:1306
        newseg+0x422/0xd30 ipc/shm.c:575
        ipcget_new ipc/util.c:285 [inline]
        ipcget+0x21e/0x580 ipc/util.c:639
        SYSC_shmget ipc/shm.c:673 [inline]
        SyS_shmget+0x158/0x230 ipc/shm.c:657
        entry_SYSCALL_64_fastpath+0x1f/0xc2
       RIP: resv_map_release+0x265/0x330 mm/hugetlb.c:742
      
      Link: http://lkml.kernel.org/r/1490821682-23228-1-git-send-email-mike.kravetz@oracle.comSigned-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ff8c0c53
    • Mark Rutland's avatar
      kasan: report only the first error by default · b0845ce5
      Mark Rutland authored
      Disable kasan after the first report.  There are several reasons for
      this:
      
       - Single bug quite often has multiple invalid memory accesses causing
         storm in the dmesg.
      
       - Write OOB access might corrupt metadata so the next report will print
         bogus alloc/free stacktraces.
      
       - Reports after the first easily could be not bugs by itself but just
         side effects of the first one.
      
      Given that multiple reports usually only do harm, it makes sense to
      disable kasan after the first one.  If user wants to see all the
      reports, the boot-time parameter kasan_multi_shot must be used.
      
      [aryabinin@virtuozzo.com: wrote changelog and doc, added missing include]
      Link: http://lkml.kernel.org/r/20170323154416.30257-1-aryabinin@virtuozzo.comSigned-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b0845ce5
    • Mike Kravetz's avatar
      hugetlbfs: initialize shared policy as part of inode allocation · 4742a35d
      Mike Kravetz authored
      Any time after inode allocation, destroy_inode can be called.  The
      hugetlbfs inode contains a shared_policy structure, and
      mpol_free_shared_policy is unconditionally called as part of
      hugetlbfs_destroy_inode.  Initialize the policy as part of inode
      allocation so that any quick (error path) calls to destroy_inode will be
      handed an initialized policy.
      
      syzkaller fuzzer found this bug, that resulted in the following:
      
          BUG: KASAN: user-memory-access in atomic_inc
          include/asm-generic/atomic-instrumented.h:87 [inline] at addr
          000000131730bd7a
          BUG: KASAN: user-memory-access in __lock_acquire+0x21a/0x3a80
          kernel/locking/lockdep.c:3239 at addr 000000131730bd7a
          Write of size 4 by task syz-executor6/14086
          CPU: 3 PID: 14086 Comm: syz-executor6 Not tainted 4.11.0-rc3+ #364
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
          Call Trace:
           atomic_inc include/asm-generic/atomic-instrumented.h:87 [inline]
           __lock_acquire+0x21a/0x3a80 kernel/locking/lockdep.c:3239
           lock_acquire+0x1ee/0x590 kernel/locking/lockdep.c:3762
           __raw_write_lock include/linux/rwlock_api_smp.h:210 [inline]
           _raw_write_lock+0x33/0x50 kernel/locking/spinlock.c:295
           mpol_free_shared_policy+0x43/0xb0 mm/mempolicy.c:2536
           hugetlbfs_destroy_inode+0xca/0x120 fs/hugetlbfs/inode.c:952
           alloc_inode+0x10d/0x180 fs/inode.c:216
           new_inode_pseudo+0x69/0x190 fs/inode.c:889
           new_inode+0x1c/0x40 fs/inode.c:918
           hugetlbfs_get_inode+0x40/0x420 fs/hugetlbfs/inode.c:734
           hugetlb_file_setup+0x329/0x9f0 fs/hugetlbfs/inode.c:1282
           newseg+0x422/0xd30 ipc/shm.c:575
           ipcget_new ipc/util.c:285 [inline]
           ipcget+0x21e/0x580 ipc/util.c:639
           SYSC_shmget ipc/shm.c:673 [inline]
           SyS_shmget+0x158/0x230 ipc/shm.c:657
           entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      Analysis provided by Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      
      Link: http://lkml.kernel.org/r/1490477850-7944-1-git-send-email-mike.kravetz@oracle.comSigned-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4742a35d
    • Kees Cook's avatar
      mm: fix section name for .data..ro_after_init · 906f2a51
      Kees Cook authored
      A section name for .data..ro_after_init was added by both:
      
          commit d07a980c ("s390: add proper __ro_after_init support")
      
      and
      
          commit d7c19b06 ("mm: kmemleak: scan .data.ro_after_init")
      
      The latter adds incorrect wrapping around the existing s390 section, and
      came later.  I'd prefer the s390 naming, so this moves the s390-specific
      name up to the asm-generic/sections.h and renames the section as used by
      kmemleak (and in the future, kernel/extable.c).
      
      Link: http://lkml.kernel.org/r/20170327192213.GA129375@beastSigned-off-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>	[s390 parts]
      Acked-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Cc: Eddie Kovsky <ewk@edkovsky.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      906f2a51
    • Naoya Horiguchi's avatar
      mm, hugetlb: use pte_present() instead of pmd_present() in follow_huge_pmd() · c9d398fa
      Naoya Horiguchi authored
      I found the race condition which triggers the following bug when
      move_pages() and soft offline are called on a single hugetlb page
      concurrently.
      
          Soft offlining page 0x119400 at 0x700000000000
          BUG: unable to handle kernel paging request at ffffea0011943820
          IP: follow_huge_pmd+0x143/0x190
          PGD 7ffd2067
          PUD 7ffd1067
          PMD 0
              [61163.582052] Oops: 0000 [#1] SMP
          Modules linked in: binfmt_misc ppdev virtio_balloon parport_pc pcspkr i2c_piix4 parport i2c_core acpi_cpufreq ip_tables xfs libcrc32c ata_generic pata_acpi virtio_blk 8139too crc32c_intel ata_piix serio_raw libata virtio_pci 8139cp virtio_ring virtio mii floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: cap_check]
          CPU: 0 PID: 22573 Comm: iterate_numa_mo Tainted: P           OE   4.11.0-rc2-mm1+ #2
          Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
          RIP: 0010:follow_huge_pmd+0x143/0x190
          RSP: 0018:ffffc90004bdbcd0 EFLAGS: 00010202
          RAX: 0000000465003e80 RBX: ffffea0004e34d30 RCX: 00003ffffffff000
          RDX: 0000000011943800 RSI: 0000000000080001 RDI: 0000000465003e80
          RBP: ffffc90004bdbd18 R08: 0000000000000000 R09: ffff880138d34000
          R10: ffffea0004650000 R11: 0000000000c363b0 R12: ffffea0011943800
          R13: ffff8801b8d34000 R14: ffffea0000000000 R15: 000077ff80000000
          FS:  00007fc977710740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: ffffea0011943820 CR3: 000000007a746000 CR4: 00000000001406f0
          Call Trace:
           follow_page_mask+0x270/0x550
           SYSC_move_pages+0x4ea/0x8f0
           SyS_move_pages+0xe/0x10
           do_syscall_64+0x67/0x180
           entry_SYSCALL64_slow_path+0x25/0x25
          RIP: 0033:0x7fc976e03949
          RSP: 002b:00007ffe72221d88 EFLAGS: 00000246 ORIG_RAX: 0000000000000117
          RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc976e03949
          RDX: 0000000000c22390 RSI: 0000000000001400 RDI: 0000000000005827
          RBP: 00007ffe72221e00 R08: 0000000000c2c3a0 R09: 0000000000000004
          R10: 0000000000c363b0 R11: 0000000000000246 R12: 0000000000400650
          R13: 00007ffe72221ee0 R14: 0000000000000000 R15: 0000000000000000
          Code: 81 e4 ff ff 1f 00 48 21 c2 49 c1 ec 0c 48 c1 ea 0c 4c 01 e2 49 bc 00 00 00 00 00 ea ff ff 48 c1 e2 06 49 01 d4 f6 45 bc 04 74 90 <49> 8b 7c 24 20 40 f6 c7 01 75 2b 4c 89 e7 8b 47 1c 85 c0 7e 2a
          RIP: follow_huge_pmd+0x143/0x190 RSP: ffffc90004bdbcd0
          CR2: ffffea0011943820
          ---[ end trace e4f81353a2d23232 ]---
          Kernel panic - not syncing: Fatal exception
          Kernel Offset: disabled
      
      This bug is triggered when pmd_present() returns true for non-present
      hugetlb, so fixing the present check in follow_huge_pmd() prevents it.
      Using pmd_present() to determine present/non-present for hugetlb is not
      correct, because pmd_present() checks multiple bits (not only
      _PAGE_PRESENT) for historical reason and it can misjudge hugetlb state.
      
      Fixes: e66f17ff ("mm/hugetlb: take page table lock in follow_huge_pmd()")
      Link: http://lkml.kernel.org/r/1490149898-20231-1-git-send-email-n-horiguchi@ah.jp.nec.comSigned-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: default avatarHillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: <stable@vger.kernel.org>        [4.0+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c9d398fa
    • Johannes Weiner's avatar
      mm: workingset: fix premature shadow node shrinking with cgroups · 0cefabda
      Johannes Weiner authored
      Commit 0a6b76dd ("mm: workingset: make shadow node shrinker memcg
      aware") enabled cgroup-awareness in the shadow node shrinker, but forgot
      to also enable cgroup-awareness in the list_lru the shadow nodes sit on.
      
      Consequently, all shadow nodes are sitting on a global (per-NUMA node)
      list, while the shrinker applies the limits according to the amount of
      cache in the cgroup its shrinking.  The result is excessive pressure on
      the shadow nodes from cgroups that have very little cache.
      
      Enable memcg-mode on the shadow node LRUs, such that per-cgroup limits
      are applied to per-cgroup lists.
      
      Fixes: 0a6b76dd ("mm: workingset: make shadow node shrinker memcg aware")
      Link: http://lkml.kernel.org/r/20170322005320.8165-1-hannes@cmpxchg.orgSigned-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarVladimir Davydov <vdavydov@tarantool.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>	[4.6+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0cefabda
    • Johannes Weiner's avatar
      mm: rmap: fix huge file mmap accounting in the memcg stats · 553af430
      Johannes Weiner authored
      Huge pages are accounted as single units in the memcg's "file_mapped"
      counter.  Account the correct number of base pages, like we do in the
      corresponding node counter.
      
      Link: http://lkml.kernel.org/r/20170322005111.3156-1-hannes@cmpxchg.orgSigned-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: <stable@vger.kernel.org>	[4.8+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      553af430
    • Michal Hocko's avatar
      mm: move mm_percpu_wq initialization earlier · 597b7305
      Michal Hocko authored
      Yang Li has reported that drain_all_pages triggers a WARN_ON which means
      that this function is called earlier than the mm_percpu_wq is
      initialized on arm64 with CMA configured:
      
        WARNING: CPU: 2 PID: 1 at mm/page_alloc.c:2423 drain_all_pages+0x244/0x25c
        Modules linked in:
        CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc1-next-20170310-00027-g64dfbc5 #127
        Hardware name: Freescale Layerscape 2088A RDB Board (DT)
        task: ffffffc07c4a6d00 task.stack: ffffffc07c4a8000
        PC is at drain_all_pages+0x244/0x25c
        LR is at start_isolate_page_range+0x14c/0x1f0
        [...]
         drain_all_pages+0x244/0x25c
         start_isolate_page_range+0x14c/0x1f0
         alloc_contig_range+0xec/0x354
         cma_alloc+0x100/0x1fc
         dma_alloc_from_contiguous+0x3c/0x44
         atomic_pool_init+0x7c/0x208
         arm64_dma_init+0x44/0x4c
         do_one_initcall+0x38/0x128
         kernel_init_freeable+0x1a0/0x240
         kernel_init+0x10/0xfc
         ret_from_fork+0x10/0x20
      
      Fix this by moving the whole setup_vmstat which is an initcall right now
      to init_mm_internals which will be called right after the WQ subsystem
      is initialized.
      
      Link: http://lkml.kernel.org/r/20170315164021.28532-1-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reported-by: default avatarYang Li <pku.leo@gmail.com>
      Tested-by: default avatarYang Li <pku.leo@gmail.com>
      Tested-by: default avatarXiaolong Ye <xiaolong.ye@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      597b7305
    • Naoya Horiguchi's avatar
      mm: migrate: fix remove_migration_pte() for ksm pages · 4b0ece6f
      Naoya Horiguchi authored
      I found that calling page migration for ksm pages causes the following
      bug:
      
          page:ffffea0004d51180 count:2 mapcount:2 mapping:ffff88013c785141 index:0x913
          flags: 0x57ffffc0040068(uptodate|lru|active|swapbacked)
          raw: 0057ffffc0040068 ffff88013c785141 0000000000000913 0000000200000001
          raw: ffffea0004d5f9e0 ffffea0004d53f60 0000000000000000 ffff88007d81b800
          page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
          page->mem_cgroup:ffff88007d81b800
          ------------[ cut here ]------------
          kernel BUG at /src/linux-dev/mm/rmap.c:1086!
          invalid opcode: 0000 [#1] SMP
          Modules linked in: ppdev parport_pc virtio_balloon i2c_piix4 pcspkr parport i2c_core acpi_cpufreq ip_tables xfs libcrc32c ata_generic pata_acpi ata_piix 8139too libata virtio_blk 8139cp crc32c_intel mii virtio_pci virtio_ring serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod
          CPU: 0 PID: 3162 Comm: bash Not tainted 4.11.0-rc2-mm1+ #1
          Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
          RIP: 0010:do_page_add_anon_rmap+0x1ba/0x260
          RSP: 0018:ffffc90002473b30 EFLAGS: 00010282
          RAX: 0000000000000021 RBX: ffffea0004d51180 RCX: 0000000000000006
          RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff88007dc0dfe0
          RBP: ffffc90002473b58 R08: 00000000fffffffe R09: 00000000000001c1
          R10: 0000000000000005 R11: 00000000000001c0 R12: ffff880139ab3d80
          R13: 0000000000000000 R14: 0000700000000200 R15: 0000160000000000
          FS:  00007f5195f50740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007fd450287000 CR3: 000000007a08e000 CR4: 00000000001406f0
          Call Trace:
           page_add_anon_rmap+0x18/0x20
           remove_migration_pte+0x220/0x2c0
           rmap_walk_ksm+0x143/0x220
           rmap_walk+0x55/0x60
           remove_migration_ptes+0x53/0x80
           migrate_pages+0x8ed/0xb60
           soft_offline_page+0x309/0x8d0
           store_soft_offline_page+0xaf/0xf0
           dev_attr_store+0x18/0x30
           sysfs_kf_write+0x3a/0x50
           kernfs_fop_write+0xff/0x180
           __vfs_write+0x37/0x160
           vfs_write+0xb2/0x1b0
           SyS_write+0x55/0xc0
           do_syscall_64+0x67/0x180
           entry_SYSCALL64_slow_path+0x25/0x25
          RIP: 0033:0x7f51956339e0
          RSP: 002b:00007ffcfa0dffc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
          RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f51956339e0
          RDX: 000000000000000c RSI: 00007f5195f53000 RDI: 0000000000000001
          RBP: 00007f5195f53000 R08: 000000000000000a R09: 00007f5195f50740
          R10: 000000000000000b R11: 0000000000000246 R12: 00007f5195907400
          R13: 000000000000000c R14: 0000000000000001 R15: 0000000000000000
          Code: fe ff ff 48 81 c2 00 02 00 00 48 89 55 d8 e8 2e c3 fd ff 48 8b 55 d8 e9 42 ff ff ff 48 c7 c6 e0 52 a1 81 48 89 df e8 46 ad fe ff <0f> 0b 48 83 e8 01 e9 7f fe ff ff 48 83 e8 01 e9 96 fe ff ff 48
          RIP: do_page_add_anon_rmap+0x1ba/0x260 RSP: ffffc90002473b30
          ---[ end trace a679d00f4af2df48 ]---
          Kernel panic - not syncing: Fatal exception
          Kernel Offset: disabled
          ---[ end Kernel panic - not syncing: Fatal exception
      
      The problem is in the following lines:
      
          new = page - pvmw.page->index +
              linear_page_index(vma, pvmw.address);
      
      The 'new' is calculated with 'page' which is given by the caller as a
      destination page and some offset adjustment for thp.  But this doesn't
      properly work for ksm pages because pvmw.page->index doesn't change for
      each address but linear_page_index() changes, which means that 'new'
      points to different pages for each addresses backed by the ksm page.  As
      a result, we try to set totally unrelated pages as destination pages,
      and that causes kernel crash.
      
      This patch fixes the miscalculation and makes ksm page migration work
      fine.
      
      Fixes: 3fe87967 ("mm: convert remove_migration_pte() to use page_vma_mapped_walk()")
      Link: http://lkml.kernel.org/r/1489717683-29905-1-git-send-email-n-horiguchi@ah.jp.nec.comSigned-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4b0ece6f
  10. 31 Mar, 2017 2 commits