1. 23 Apr, 2014 11 commits
  2. 22 Apr, 2014 6 commits
  3. 20 Apr, 2014 5 commits
  4. 19 Apr, 2014 12 commits
    • perf tools: Improve error reporting · ffa91880
      Adrien BAK authored
      In the current version, when using perf record, if something goes
      wrong in tools/perf/builtin-record.c:375
        session = perf_session__new(file, false, NULL);
      
      The error message:
      "Not enough memory for reading per file header"
      
      is issued. This error message seems to be outdated and is not very
      helpful. This patch proposes to replace this error message by
      "Perf session creation failed"
      
      I believe this issue has been brought to lkml:
      https://lkml.org/lkml/2014/2/24/458
      although this patch only tackles a (small) part of the issue.
      
      Additionally, this patch improves error reporting in
      tools/perf/util/data.c open_file_write.
      
      Currently, if the call to open fails, the user is unaware of it.
      This patch logs the error, before returning the error code to
      the caller.
      Reported-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Adrien BAK <adrien.bak@metascale.org>
      Link: http://lkml.kernel.org/r/1397786443.3093.4.camel@beast
      [ Reorganize the changelog into paragraphs ]
      [ Added empty line after fd declaration in open_file_write ]
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
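The second half of the "Improve error reporting" change (log a failed open() before returning the error to the caller) can be sketched in userspace C as follows. This is a minimal illustration, not the code from tools/perf/util/data.c: the function name `open_for_write_logged`, the message wording, and the use of `fprintf(stderr, ...)` in place of perf's own logging helpers are all assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not perf's open_file_write()): open a file for
 * writing and report the reason on failure instead of failing silently.
 * Returns the file descriptor, or a negative value with errno preserved.
 */
int open_for_write_logged(const char *path)
{
    int fd;

    assert(path != NULL);

    fd = open(path, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0) {
        int saved = errno; /* fprintf() may overwrite errno */

        fprintf(stderr, "failed to open %s: %s\n", path, strerror(saved));
        errno = saved;
    }

    return fd;
}
```

The caller still receives the failing return value, so existing error handling keeps working; the only difference is that the user now sees why the open failed.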
    • perf tools: Adjust symbols in VDSO · 922d0e4d
      Vladimir Nikulichev authored
      perf report doesn't resolve function names in VDSO:
      
      $ perf report --stdio -g flat,0.0,15,callee --sort pid
      ...
                  8.76%
                     0x7fff6b1fe861
                     __gettimeofday
                     ACE_OS::gettimeofday()
      ...
      
      In this case symbol values should be adjusted the same way as for executables,
      relocatable objects and prelinked libraries.
      
      After fix:
      
      $ perf report --stdio -g flat,0.0,15,callee --sort pid
      ...
                  8.76%
                     __vdso_gettimeofday
                     __gettimeofday
                     ACE_OS::gettimeofday()
      Signed-off-by: Vladimir Nikulichev <nvs@tbricks.com>
      Tested-by: Namhyung Kim <namhyung@kernel.org>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Link: http://lkml.kernel.org/r/969812.163009436-sendEmail@nvs
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
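The VDSO commit above says symbol values "should be adjusted the same way as for executables" but does not show the arithmetic. One common form of such an adjustment converts a symbol's virtual address into a file-relative offset by subtracting the difference between the containing section's address and its file offset. The sketch below shows only that arithmetic. The helper name `demo_adjust_symbol` and the numbers in the usage note are hypothetical, and whether the patch uses exactly this formula is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a symbol's virtual address to a file-relative
 * offset, given the address and file offset of the section containing it.
 */
uint64_t demo_adjust_symbol(uint64_t st_value, uint64_t sh_addr,
                            uint64_t sh_offset)
{
    /* Assumption for this sketch: the symbol lies at or after the section start. */
    assert(st_value >= sh_addr);

    return st_value - (sh_addr - sh_offset);
}
```

For example, with a made-up section at address 0xffffffffff700700 stored at file offset 0x700, a symbol at 0xffffffffff700861 maps to offset 0x861, which is the kind of value a symbol lookup keyed on file offsets can match.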
    • perf kvm: Fix 'Min time' counting in report command · acb61fc8
      Alexander Yarygin authored
      Every event in the perf-kvm has a 'stats' structure, which contains
      max/min/average/etc times of handling this event.
      The problem is that the 'perf-kvm stat report' command always shows
      that 'min time' is 0us for every event. Example:
      
       # perf kvm stat report
      
       Analyze events for all VCPUs:
      
          VM-EXIT    Samples  Samples%     Time%   Min Time   Max Time Avg time
        [..]
        0xB2 MSCH         12     0.07%     0.00%        0us        8us 7.31us ( +-   2.11% )
        0xB2 CHSC         12     0.07%     0.00%        0us       18us 9.39us ( +-   9.49% )
        0xB2 STPX          8     0.05%     0.00%        0us        2us 1.88us ( +-   7.18% )
        0xB2 STSI          7     0.04%     0.00%        0us       44us 16.49us ( +-  38.20% )
        [..]
      
      This happens because the 'stats' structure is not initialized and
      stats->min equals 0. Let's initialize the structure for every
      event after its allocation using the init_stats() function. This initializes
      stats->min to -1 and makes 'Min time' statistics counting work:
      
       # perf kvm stat report
      
       Analyze events for all VCPUs:
      
          VM-EXIT    Samples  Samples%     Time%   Min Time   Max Time Avg time
        [..]
        0xB2 MSCH         12     0.07%     0.00%        6us        8us 7.31us ( +-   2.11% )
        0xB2 CHSC         12     0.07%     0.00%        7us       18us 9.39us ( +-   9.49% )
        0xB2 STPX          8     0.05%     0.00%        1us        2us 1.88us ( +-   7.18% )
        0xB2 STSI          7     0.04%     0.00%        1us       44us 16.49us ( +-  38.20% )
        [..]
      Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Link: http://lkml.kernel.org/r/1397053319-2130-3-git-send-email-borntraeger@de.ibm.com
      [ Fixing the perf examples changelog output ]
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
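The 'Min time' bug described above comes down to how a running minimum is seeded. The following is a simplified sketch, not perf's actual `struct stats` or `init_stats()`: the `demo_` names and the reduced field set are assumptions. It shows why a zero-filled structure always reports a minimum of 0, and why seeding `min` with -1 (the largest unsigned value) fixes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for a per-event statistics record. */
struct demo_stats {
    uint64_t n;
    uint64_t min;
    uint64_t max;
};

/* Seed min with the largest u64 so the first sample always replaces it. */
void demo_init_stats(struct demo_stats *s)
{
    assert(s != NULL);
    s->n = 0;
    s->min = (uint64_t)-1;
    s->max = 0;
}

/* Fold one sample into the running min/max. */
void demo_update_stats(struct demo_stats *s, uint64_t val)
{
    assert(s != NULL);
    s->n++;
    if (val < s->min)
        s->min = val;
    if (val > s->max)
        s->max = val;
}
```

With a zero-filled record, no unsigned sample is ever smaller than 0, so `min` never changes. After `demo_init_stats()`, samples of 6 and 8 yield a minimum of 6 and a maximum of 8.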
    • coredump: fix va_list corruption · 404ca80e
      Eric Dumazet authored
      A va_list needs to be copied in case it needs to be used twice.
      
      Thanks to Hugh for debugging this issue, leading to various panics.
      
      Tested:
      
        lpq84:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
      
      'produce_core' is simply : main() { *(int *)0 = 1;}
      
        lpq84:~# ./produce_core
        Segmentation fault (core dumped)
        lpq84:~# dmesg | tail -1
        [  614.352947] Core dump to |/foobar12345 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 (null) pipe failed
      
      Notice the last argument was replaced by a NULL (we were lucky enough to
      not crash, but do not try this on your production machine !)
      
      After fix :
      
        lpq83:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
        lpq83:~# ./produce_core
        Segmentation fault
        lpq83:~# dmesg | tail -1
        [  740.800441] Core dump to |/foobar12345 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 pipe failed
      
      Fixes: 5fe9d8ca ("coredump: cn_vprintf() has no reason to call vsnprintf() twice")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Diagnosed-by: Hugh Dickins <hughd@google.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: stable@vger.kernel.org # 3.11+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
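The rule stated in the coredump commit ("a va_list needs to be copied in case it needs to be used twice") can be illustrated with a userspace sketch. This is not the kernel's cn_vprintf(): the helper names `demo_vformat_alloc` and `demo_format_alloc` are hypothetical, and the two-pass "measure, then format" structure is chosen only because it needs the argument list twice.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: format into a freshly allocated buffer.
 * The argument list is consumed twice, so the first pass works on a copy.
 */
char *demo_vformat_alloc(const char *fmt, va_list ap)
{
    va_list probe;
    int need;
    char *buf;

    assert(fmt != NULL);

    /* First pass: measure the output using a copy of the argument list. */
    va_copy(probe, ap);
    need = vsnprintf(NULL, 0, fmt, probe);
    va_end(probe);
    if (need < 0)
        return NULL;

    buf = malloc((size_t)need + 1);
    if (buf == NULL)
        return NULL;

    /* Second pass: the original list is still untouched and safe to use. */
    vsnprintf(buf, (size_t)need + 1, fmt, ap);
    return buf;
}

/* Variadic front end for the helper above; the caller frees the result. */
char *demo_format_alloc(const char *fmt, ...)
{
    va_list ap;
    char *out;

    va_start(ap, fmt);
    out = demo_vformat_alloc(fmt, ap);
    va_end(ap);
    return out;
}
```

Passing `ap` to both vsnprintf() calls without the va_copy() would leave the second call reading from an indeterminate position, which is consistent with the "(null)" final argument seen in the dmesg output above.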
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6d459690
      Linus Torvalds authored
      Pull x86 fix from Ingo Molnar:
       "This fixes the preemption-count imbalance crash reported by Owen
        Kibel"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce: Fix CMCI preemption bugs
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8f98f6f5
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "Two fixes:
      
         - a SCHED_DEADLINE task selection fix
         - a sched/numa related lockdep splat fix"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Check for stop task appearance when balancing happens
        sched/numa: Fix task_numa_free() lockdep splat
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8de3f7a7
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Two kernel side fixes:
      
         - an Intel uncore PMU driver potential crash fix
         - a kprobes/perf-call-graph interaction fix"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel: Use rdmsrl_safe() when initializing RAPL PMU
        kprobes/x86: Fix page-fault handling logic
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · b9312420
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Unfortunately this contains no easter eggs, it's a bit larger than I'd
        like, but I included a patch that just moves code from one file to
        another and I'd like to avoid merge conflicts with that later, so it
        makes it seem worse than it is,
      
        Otherwise:
         - radeon: fixes to use new microcode to stabilise some cards, use
           some common displayport code, some runtime pm fixes, pll regression
           fixes
         - i915: fix for some context oopses, a warn in a used path, backlight
           fixes
         - nouveau: regression fix
         - omap: a bunch of fixes"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (51 commits)
        drm: bochs: drop unused struct fields
        drm: bochs: add power management support
        drm: cirrus: add power management support
        drm: Split out drm_probe_helper.c from drm_crtc_helper.c
        drm/plane-helper: Don't fake-implement primary plane disabling
        drm/ast: fix value check in cbr_scan2
        drm/nouveau/bios: fix a bit shift error introduced by 457e77b2
        drm/radeon/ci: make sure mc ucode is loaded before checking the size
        drm/radeon/si: make sure mc ucode is loaded before checking the size
        drm/radeon: improve PLL params if we don't match exactly v2
        drm/radeon: memory leak on bo reservation failure. v2
        drm/radeon: fix VCE fence command
        drm/radeon: re-enable mclk dpm on R7 260X asics
        drm/radeon: add support for newer mc ucode on CI (v2)
        drm/radeon: add support for newer mc ucode on SI (v2)
        drm/radeon: apply more strict limits for PLL params v2
        drm/radeon: update CI DPM powertune settings
        drm/radeon: fix runpm handling on APUs (v4)
        drm/radeon: disable mclk dpm on R7 260X
        drm/tegra: Remove gratuitous pad field
        ...
    • Merge branch 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux into drm-next · a42892ed
      Dave Airlie authored
      Some i2c fixes over DisplayPort.
      
      * 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux:
        drm/radeon: Improve vramlimit module param documentation
        drm/radeon: fix audio pin counts for DCE6+ (v2)
        drm/radeon/dp: switch to the common i2c over aux code
        drm/dp/i2c: Update comments about common i2c over dp assumptions (v3)
        drm/dp/i2c: send bare addresses to properly reset i2c connections (v4)
        drm/radeon/dp: handle zero sized i2c over aux transactions (v2)
        drm/i915: support address only i2c-over-aux transactions
        drm/tegra: dp: Support address-only I2C-over-AUX transactions
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ebfc45ee
      Linus Torvalds authored
      Pull more networking fixes from David Miller:
      
       1) Fix mlx4_en_netpoll implementation, it needs to schedule a NAPI
          context, not synchronize it.  From Chris Mason.
      
       2) Ipv4 flow input interface should never be zero, it should be
          LOOPBACK_IFINDEX instead.  From Cong Wang and Julian Anastasov.
      
       3) Properly configure MAC to PHY connection in mvneta devices, from
          Thomas Petazzoni.
      
       4) sys_recv should use SYSCALL_DEFINE.  From Jan Glauber.
      
       5) Tunnel driver ioctls do not use the correct namespace, fix from
          Nicolas Dichtel.
      
       6) Fix memory leak on seccomp filter attach, from Kees Cook.
      
       7) Fix lockdep warning for nested vlans, from Ding Tianhong.
      
       8) Crashes can happen in SCTP due to how the auth_enable value is
          managed, fix from Vlad Yasevich.
      
       9) Wireless fixes from John W Linville and co.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
        net: sctp: cache auth_enable per endpoint
        tg3: update rx_jumbo_pending ring param only when jumbo frames are enabled
        vlan: Fix lockdep warning when vlan dev handle notification
        seccomp: fix memory leak on filter attach
        isdn: icn: buffer overflow in icn_command()
        ip6_tunnel: use the right netns in ioctl handler
        sit: use the right netns in ioctl handler
        ip_tunnel: use the right netns in ioctl handler
        net: use SYSCALL_DEFINEx for sys_recv
        net: mdio-gpio: Add support for separate MDI and MDO gpio pins
        net: mdio-gpio: Add support for active low gpio pins
        net: mdio-gpio: Use devm_ functions where possible
        ipv4, route: pass 0 instead of LOOPBACK_IFINDEX to fib_validate_source()
        ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
        mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll
        net: mvneta: properly configure the MAC <-> PHY connection in all situations
        net: phy: add minimal support for QSGMII PHY
        sfc:On MCDI timeout, issue an FLR (and mark MCDI to fail-fast)
        mwifiex: fix hung task on command timeout
        mwifiex: process event before command response
        ...
    • Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 · 6e66d5da
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "A set of 5 small cifs fixes"
      
      * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
        cif: fix dead code
        cifs: fix error handling cifs_user_readv
        fs: cifs: remove unused variable.
        Return correct error on query of xattr on file with empty xattrs
        cifs: Wait for writebacks to complete before attempting write.
    • Merge tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 25bfe4f5
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are a few driver fixes for char/misc drivers that resolve
        reported issues.
      
        All have been in linux-next successfully for a few days"
      
      * tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hosts
        Tools: hv: Handle the case when the target file exists correctly
        vme_tsi148: Utilize to_pci_dev() macro
        vme_tsi148: Fix PCI address mapping assumption
        vme_tsi148: Fix typo in tsi148_slave_get()
        w1: avoid recursive device_add
        w1: fix netlink refcnt leak on error path
        misc: Grammar s/addition/additional/
        drivers: mcb: fix memory leak in chameleon_parse_cells() error path
        mei: ignore client writing state during cb completion
        mei: me: do not load the driver if the FW doesn't support MEI interface
        GenWQE: Increase driver version number
        GenWQE: Fix multithreading problems
        GenWQE: Ensure rc is not returning an uninitialized value
        GenWQE: Add wmb before DDCB is started
        GenWQE: Enable access to VPD flash area
  5. 18 Apr, 2014 6 commits
    • Merge tag 'driver-core-3.15-rc2' of... · 60fbf2bd
      Linus Torvalds authored
      Merge tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fixes from Greg KH:
       "Here are some driver core fixes for 3.15-rc2.  Also in here are some
        documentation updates, as well as an API removal that had to wait for
        after -rc1 due to the cleanups coming into you from multiple developer
        trees (this one and the PPC tree.)
      
        All have been in linux next successfully"
      
      * tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        drivers/base/dd.c incorrect pr_debug() parameters
        Documentation: Update stable address in Chinese and Japanese translations
        topology: Fix compilation warning when not in SMP
        Chinese: add translation of io_ordering.txt
        stable_kernel_rules: spelling/word usage
        sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()
        kernfs: protect lazy kernfs_iattrs allocation with mutex
        fs: Don't return 0 from get_anon_bdev
    • Merge tag 'staging-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 8cb652bb
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are a few staging driver fixes for issues that have been reported
        for 3.15-rc2.
      
        Also dominating the diffstat for the pull request is the removal of
        the rtl8187se driver.  It's no longer needed in staging as a "real"
        driver for this hardware is now merged in the tree in the "correct"
        location in drivers/net/
      
        All of these patches have been tested in linux-next"
      
      * tag 'staging-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: r8188eu: Fix case where ethtype was never obtained and always be checked against 0
        staging: r8712u: Fix case where ethtype was never obtained and always be checked against 0
        staging: r8188eu: Calling rtw_get_stainfo() with a NULL sta_addr will return NULL
        staging: comedi: fix circular locking dependency in comedi_mmap()
        staging: r8723au: Add missing initialization of change_inx in sort algorithm
        Staging: unisys: use after free in list_for_each()
        staging: unisys: use after free in error messages
        staging: speakup: fix misuse of kstrtol() in handle_goto()
        staging: goldfish: Call free_irq in error path
        staging: delete rtl8187se wireless driver
        staging: rtl8723au: Fix buffer overflow in rtw_get_wfd_ie()
        staging: gs_fpgaboot: remove __TIMESTAMP__ macro
        staging: vme: fix memory leak in vme_user_probe()
        staging: fpgaboot: clean up Makefile
        staging/usbip: fix store_attach() sscanf return value check
        staging/usbip: userspace - fix usbipd SIGSEGV from refresh_exported_devices()
        staging: rtl8188eu: remove spaces, correct counts to unbreak P2P ioctls
        staging/rtl8821ae: Fix OOM handling in _rtl_init_deferred_work()
    • Merge tag 'tty-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 575a2929
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are a number of small tty/serial driver fixes for 3.15-rc2.  Also
        in here are some Documentation file removals for drivers that we
        removed a long time ago, no need to keep it around any longer.
      
        All of these have been in linux-next for a bit"
      
      * tag 'tty-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        Revert "serial: 8250, disable "too much work" messages"
        serial: amba-pl011: fix regression, causing an Oops on rmmod
        tty: Fix help text of SYNCLINK_CS
        tty: fix memleak in alloc_pid
        ttyprintk: Allow built as a module
        ttyprintk: Fix wrong tty_unregister_driver() call in the error path
        serial: 8250, disable "too much work" messages
        Documentation/serial: Delete obsolete driver documentation
        serial: omap: Fix missing pm_runtime_resume handling by simplifying code
        serial_core: Fix pm imbalance on unbind
        serial: pl011: change Rx burst size to half of trigger level
        serial: timberdale: Depend on X86_32
        serial: st-asc: Fix SysRq char handling
        Revert "serial: clps711x: Give a chance to perform useful tasks during wait loop"
        serial_core: Fix conditional start_tx on ring buffer not empty
        serial: efm32: use $vendor,$device scheme for compatible string
        serial: omap: free the wakeup settings in remove
    • Merge tag 'usb-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 7e55f81e
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are a number of tiny USB fixes and new device ids for 3.15-rc2.
        Nothing major, just issues some people have reported.
      
        All of these have been in linux-next"
      
      * tag 'usb-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        uas: fix deadlocky memory allocations
        uas: fix error handling during scsi_scan()
        uas: fix GFP_NOIO under spinlock
        uwb: adds missing error handling
        USB: cdc-acm: Remove Motorola/Telit H24 serial interfaces from ACM driver
        USB: ohci-jz4740: FEAT_POWER is a port feature, not a hub feature
        USB: ohci-jz4740: Fix uninitialized variable warning
        USB: EHCI: tegra: set txfill_tuning
        usb: ehci-platform: Return immediately from suspend if ehci_suspend fails
        usb: ehci-exynos: Return immediately from suspend if ehci_suspend fails
        USB: fix crash during hotplug of PCI USB controller card
        USB: cdc-acm: fix double usb_autopm_put_interface() in acm_port_activate()
        usb: usb-common: fix typo for usb_state_string
        USB: usb_wwan: fix handling of missing bulk endpoints
        USB: pl2303: add ids for Hewlett-Packard HP POS pole displays
        USB: cp210x: Add 8281 (Nanotec Plug & Drive)
        usb: option driver, add support for Telit UE910v2
        Revert "USB: serial: add usbid for dell wwan card to sierra.c"
        USB: serial: ftdi_sio: add id for Brainboxes serial cards
    • Merge branch 'akpm' (incoming from Andrew) · ea2388f2
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "13 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        thp: close race between split and zap huge pages
        mm: fix new kernel-doc warning in filemap.c
        mm: fix CONFIG_DEBUG_VM_RB description
        mm: use paravirt friendly ops for NUMA hinting ptes
        mips: export flush_icache_range
        mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()
        wait: explain the shadowing and type inconsistencies
        Shiraz has moved
        Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt
        powerpc/mm: fix ".__node_distance" undefined
        kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write()
        init/Kconfig: move the trusted keyring config option to general setup
        vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()
    • thp: close race between split and zap huge pages · b5a8cad3
      Kirill A. Shutemov authored
      Sasha Levin has reported two THP BUGs[1][2].  I believe both of them
      have the same root cause.  Let's look at them one by one.
      
      The first bug[1] is "kernel BUG at mm/huge_memory.c:1829!".  It's
      BUG_ON(mapcount != page_mapcount(page)) in __split_huge_page().  From my
      testing I see that page_mapcount() is higher than mapcount here.
      
      I think it happens due to race between zap_huge_pmd() and
      page_check_address_pmd().  page_check_address_pmd() misses PMD which is
      under zap:
      
      	CPU0						CPU1
      						zap_huge_pmd()
      						  pmdp_get_and_clear()
      __split_huge_page()
        anon_vma_interval_tree_foreach()
          __split_huge_page_splitting()
            page_check_address_pmd()
              mm_find_pmd()
      	  /*
      	   * We check if PMD present without taking ptl: no
      	   * serialization against zap_huge_pmd(). We miss this PMD,
      	   * it's not accounted to 'mapcount' in __split_huge_page().
      	   */
      	  pmd_present(pmd) == 0
      
        BUG_ON(mapcount != page_mapcount(page)) // CRASH!!!
      
      						  page_remove_rmap(page)
      						    atomic_add_negative(-1, &page->_mapcount)
      
      The second bug[2] is "kernel BUG at mm/huge_memory.c:1371!".
      It's VM_BUG_ON_PAGE(!PageHead(page), page) in zap_huge_pmd().
      
      This happens in similar way:
      
      	CPU0						CPU1
      						zap_huge_pmd()
      						  pmdp_get_and_clear()
      						  page_remove_rmap(page)
      						    atomic_add_negative(-1, &page->_mapcount)
      __split_huge_page()
        anon_vma_interval_tree_foreach()
          __split_huge_page_splitting()
            page_check_address_pmd()
              mm_find_pmd()
      	  pmd_present(pmd) == 0	/* The same comment as above */
        /*
         * No crash this time since we already decremented page->_mapcount in
         * zap_huge_pmd().
         */
        BUG_ON(mapcount != page_mapcount(page))
      
        /*
         * We split the compound page here into small pages without
         * serialization against zap_huge_pmd()
         */
        __split_huge_page_refcount()
      						VM_BUG_ON_PAGE(!PageHead(page), page); // CRASH!!!
      
      So my understanding is that the problem is the pmd_present() check in
      mm_find_pmd() without taking the page table lock.

      The bug was introduced by my commit 117b0791. Sorry for
      that. :(

      Let's open code mm_find_pmd() in page_check_address_pmd() and do the
      check under page table lock.

      Note that __page_check_address() does the same for PTE entries
      if sync != 0.

      I've stress tested split and zap code paths for 36+ hours by now and
      don't see crashes with the patch applied. Before it took <20 min to
      trigger the first bug and a few hours for the second one (if we ignore
      the first).
      
      [1] https://lkml.kernel.org/g/<53440991.9090001@oracle.com>
      [2] https://lkml.kernel.org/g/<5310C56C.60709@oracle.com>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Tested-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>	[3.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
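The fix described in the THP commit above (do the "present" check while holding the page table lock, rather than before taking it) follows a general check-under-lock pattern. The sketch below is a userspace analogy only, using a pthread mutex in place of the page table lock. The `demo_` names and the single `present` flag are hypothetical and do not correspond to kernel structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry that another thread may clear ("zap"). */
struct demo_entry {
    pthread_mutex_t lock;
    bool present;
};

/*
 * Check presence only after taking the lock. On success the entry is
 * returned with the lock still held, so a concurrent zap cannot slip in
 * between the check and the caller's use of the entry.
 */
struct demo_entry *demo_check_present_locked(struct demo_entry *e)
{
    assert(e != NULL);

    pthread_mutex_lock(&e->lock);
    if (!e->present) {
        pthread_mutex_unlock(&e->lock);
        return NULL;
    }
    return e; /* caller must call demo_unlock() */
}

/* Release the lock taken by a successful demo_check_present_locked(). */
void demo_unlock(struct demo_entry *e)
{
    assert(e != NULL);
    pthread_mutex_unlock(&e->lock);
}

/* The "zap" side clears the entry under the same lock. */
void demo_zap(struct demo_entry *e)
{
    assert(e != NULL);

    pthread_mutex_lock(&e->lock);
    e->present = false;
    pthread_mutex_unlock(&e->lock);
}
```

Because both sides serialize on the same lock, the checker sees the entry either entirely before the zap or entirely after it, never in the half-cleared state shown in the CPU0/CPU1 diagrams above.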