1. 30 Apr, 2014 7 commits
  2. 22 Apr, 2014 2 commits
  3. 20 Apr, 2014 5 commits
  4. 19 Apr, 2014 12 commits
    • Adrien BAK's avatar
      perf tools: Improve error reporting · ffa91880
      Adrien BAK authored
      In the current version, when using perf record, if something goes
      wrong in tools/perf/builtin-record.c:375
        session = perf_session__new(file, false, NULL);
      
      The error message:
      "Not enough memory for reading per file header"
      
      is issued. This error message seems to be outdated and is not very
      helpful. This patch proposes to replace this error message by
      "Perf session creation failed"
      
      I believe this issue has been brought to lkml:
      https://lkml.org/lkml/2014/2/24/458
      although this patch only tackles a (small) part of the issue.
      
      Additionnaly, this patch improves error reporting in
      tools/perf/util/data.c open_file_write.
      
      Currently, if the call to open fails, the user is unaware of it.
      This patch logs the error, before returning the error code to
      the caller.
      Reported-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarAdrien BAK <adrien.bak@metascale.org>
      Link: http://lkml.kernel.org/r/1397786443.3093.4.camel@beast
      [ Reorganize the changelog into paragraphs ]
      [ Added empty line after fd declaration in open_file_write ]
      Signed-off-by: default avatarJiri Olsa <jolsa@redhat.com>
      ffa91880
    • Vladimir Nikulichev's avatar
      perf tools: Adjust symbols in VDSO · 922d0e4d
      Vladimir Nikulichev authored
      pert-report doesn't resolve function names in VDSO:
      
      $ perf report --stdio -g flat,0.0,15,callee --sort pid
      ...
                  8.76%
                     0x7fff6b1fe861
                     __gettimeofday
                     ACE_OS::gettimeofday()
      ...
      
      In this case symbol values should be adjusted the same way as for executables,
      relocatable objects and prelinked libraries.
      
      After fix:
      
      $ perf report --stdio -g flat,0.0,15,callee --sort pid
      ...
                  8.76%
                     __vdso_gettimeofday
                     __gettimeofday
                     ACE_OS::gettimeofday()
      Signed-off-by: default avatarVladimir Nikulichev <nvs@tbricks.com>
      Tested-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Reviewed-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Link: http://lkml.kernel.org/r/969812.163009436-sendEmail@nvsSigned-off-by: default avatarJiri Olsa <jolsa@redhat.com>
      922d0e4d
    • Alexander Yarygin's avatar
      perf kvm: Fix 'Min time' counting in report command · acb61fc8
      Alexander Yarygin authored
      Every event in the perf-kvm has a 'stats' structure, which contains
      max/min/average/etc times of handling this event.
      The problem is that the 'perf-kvm stat report' command always shows
      that 'min time' is 0us for every event. Example:
      
       # perf kvm stat report
      
       Analyze events for all VCPUs:
      
          VM-EXIT    Samples  Samples%     Time%   Min Time   Max Time Avg time
        [..]
        0xB2 MSCH         12     0.07%     0.00%        0us        8us 7.31us ( +-   2.11% )
        0xB2 CHSC         12     0.07%     0.00%        0us       18us 9.39us ( +-   9.49% )
        0xB2 STPX          8     0.05%     0.00%        0us        2us 1.88us ( +-   7.18% )
        0xB2 STSI          7     0.04%     0.00%        0us       44us 16.49us ( +-  38.20% )
        [..]
      
      This happens because the 'stats' structure is not initialized and
      stats->min equals to 0. Lets initialize the structure for every
      event after its allocation using init_stats() function. This initializes
      stats->min to -1 and makes 'Min time' statistics counting work:
      
       # perf kvm stat report
      
       Analyze events for all VCPUs:
      
          VM-EXIT    Samples  Samples%     Time%   Min Time   Max Time Avg time
        [..]
        0xB2 MSCH         12     0.07%     0.00%        6us        8us 7.31us ( +-   2.11% )
        0xB2 CHSC         12     0.07%     0.00%        7us       18us 9.39us ( +-   9.49% )
        0xB2 STPX          8     0.05%     0.00%        1us        2us 1.88us ( +-   7.18% )
        0xB2 STSI          7     0.04%     0.00%        1us       44us 16.49us ( +-  38.20% )
        [..]
      Signed-off-by: default avatarAlexander Yarygin <yarygin@linux.vnet.ibm.com>
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Link: http://lkml.kernel.org/r/1397053319-2130-3-git-send-email-borntraeger@de.ibm.com
      [ Fixing the perf examples changelog output ]
      Signed-off-by: default avatarJiri Olsa <jolsa@redhat.com>
      acb61fc8
    • Eric Dumazet's avatar
      coredump: fix va_list corruption · 404ca80e
      Eric Dumazet authored
      A va_list needs to be copied in case it needs to be used twice.
      
      Thanks to Hugh for debugging this issue, leading to various panics.
      
      Tested:
      
        lpq84:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
      
      'produce_core' is simply : main() { *(int *)0 = 1;}
      
        lpq84:~# ./produce_core
        Segmentation fault (core dumped)
        lpq84:~# dmesg | tail -1
        [  614.352947] Core dump to |/foobar12345 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 (null) pipe failed
      
      Notice the last argument was replaced by a NULL (we were lucky enough to
      not crash, but do not try this on your production machine !)
      
      After fix :
      
        lpq83:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
        lpq83:~# ./produce_core
        Segmentation fault
        lpq83:~# dmesg | tail -1
        [  740.800441] Core dump to |/foobar12345 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 pipe failed
      
      Fixes: 5fe9d8ca ("coredump: cn_vprintf() has no reason to call vsnprintf() twice")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Diagnosed-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarOleg Nesterov <oleg@redhat.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: stable@vger.kernel.org # 3.11+
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      404ca80e
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6d459690
      Linus Torvalds authored
      Pull x86 fix from Ingo Molnar:
       "This fixes the preemption-count imbalance crash reported by Owen
        Kibel"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce: Fix CMCI preemption bugs
      6d459690
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8f98f6f5
      Linus Torvalds authored
      Pull scheduler fixes from Ingo Molnar:
       "Two fixes:
      
         - a SCHED_DEADLINE task selection fix
         - a sched/numa related lockdep splat fix"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Check for stop task appearance when balancing happens
        sched/numa: Fix task_numa_free() lockdep splat
      8f98f6f5
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8de3f7a7
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Two kernel side fixes:
      
         - an Intel uncore PMU driver potential crash fix
         - a kprobes/perf-call-graph interaction fix"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel: Use rdmsrl_safe() when initializing RAPL PMU
        kprobes/x86: Fix page-fault handling logic
      8de3f7a7
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · b9312420
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Unfortunately this contains no easter eggs, its a bit larger than I'd
        like, but I included a patch that just moves code from one file to
        another and I'd like to avoid merge conflicts with that later, so it
        makes it seem worse than it is,
      
        Otherwise:
         - radeon: fixes to use new microcode to stabilise some cards, use
           some common displayport code, some runtime pm fixes, pll regression
           fixes
         - i915: fix for some context oopses, a warn in a used path, backlight
           fixes
         - nouveau: regression fix
         - omap: a bunch of fixes"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (51 commits)
        drm: bochs: drop unused struct fields
        drm: bochs: add power management support
        drm: cirrus: add power management support
        drm: Split out drm_probe_helper.c from drm_crtc_helper.c
        drm/plane-helper: Don't fake-implement primary plane disabling
        drm/ast: fix value check in cbr_scan2
        drm/nouveau/bios: fix a bit shift error introduced by 457e77b2
        drm/radeon/ci: make sure mc ucode is loaded before checking the size
        drm/radeon/si: make sure mc ucode is loaded before checking the size
        drm/radeon: improve PLL params if we don't match exactly v2
        drm/radeon: memory leak on bo reservation failure. v2
        drm/radeon: fix VCE fence command
        drm/radeon: re-enable mclk dpm on R7 260X asics
        drm/radeon: add support for newer mc ucode on CI (v2)
        drm/radeon: add support for newer mc ucode on SI (v2)
        drm/radeon: apply more strict limits for PLL params v2
        drm/radeon: update CI DPM powertune settings
        drm/radeon: fix runpm handling on APUs (v4)
        drm/radeon: disable mclk dpm on R7 260X
        drm/tegra: Remove gratuitous pad field
        ...
      b9312420
    • Dave Airlie's avatar
      Merge branch 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux into drm-next · a42892ed
      Dave Airlie authored
      Some i2c fixes over DisplayPort.
      
      * 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux:
        drm/radeon: Improve vramlimit module param documentation
        drm/radeon: fix audio pin counts for DCE6+ (v2)
        drm/radeon/dp: switch to the common i2c over aux code
        drm/dp/i2c: Update comments about common i2c over dp assumptions (v3)
        drm/dp/i2c: send bare addresses to properly reset i2c connections (v4)
        drm/radeon/dp: handle zero sized i2c over aux transactions (v2)
        drm/i915: support address only i2c-over-aux transactions
        drm/tegra: dp: Support address-only I2C-over-AUX transactions
      a42892ed
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ebfc45ee
      Linus Torvalds authored
      Pull more networking fixes from David Miller:
      
       1) Fix mlx4_en_netpoll implementation, it needs to schedule a NAPI
          context, not synchronize it.  From Chris Mason.
      
       2) Ipv4 flow input interface should never be zero, it should be
          LOOPBACK_IFINDEX instead.  From Cong Wang and Julian Anastasov.
      
       3) Properly configure MAC to PHY connection in mvneta devices, from
          Thomas Petazzoni.
      
       4) sys_recv should use SYSCALL_DEFINE.  From Jan Glauber.
      
       5) Tunnel driver ioctls do not use the correct namespace, fix from
          Nicolas Dichtel.
      
       6) Fix memory leak on seccomp filter attach, from Kees Cook.
      
       7) Fix lockdep warning for nested vlans, from Ding Tianhong.
      
       8) Crashes can happen in SCTP due to how the auth_enable value is
          managed, fix from Vlad Yasevich.
      
       9) Wireless fixes from John W Linville and co.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
        net: sctp: cache auth_enable per endpoint
        tg3: update rx_jumbo_pending ring param only when jumbo frames are enabled
        vlan: Fix lockdep warning when vlan dev handle notification
        seccomp: fix memory leak on filter attach
        isdn: icn: buffer overflow in icn_command()
        ip6_tunnel: use the right netns in ioctl handler
        sit: use the right netns in ioctl handler
        ip_tunnel: use the right netns in ioctl handler
        net: use SYSCALL_DEFINEx for sys_recv
        net: mdio-gpio: Add support for separate MDI and MDO gpio pins
        net: mdio-gpio: Add support for active low gpio pins
        net: mdio-gpio: Use devm_ functions where possible
        ipv4, route: pass 0 instead of LOOPBACK_IFINDEX to fib_validate_source()
        ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
        mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll
        net: mvneta: properly configure the MAC <-> PHY connection in all situations
        net: phy: add minimal support for QSGMII PHY
        sfc:On MCDI timeout, issue an FLR (and mark MCDI to fail-fast)
        mwifiex: fix hung task on command timeout
        mwifiex: process event before command response
        ...
      ebfc45ee
    • Linus Torvalds's avatar
      Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 · 6e66d5da
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "A set of 5 small cifs fixes"
      
      * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
        cif: fix dead code
        cifs: fix error handling cifs_user_readv
        fs: cifs: remove unused variable.
        Return correct error on query of xattr on file with empty xattrs
        cifs: Wait for writebacks to complete before attempting write.
      6e66d5da
    • Linus Torvalds's avatar
      Merge tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 25bfe4f5
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are a few driver fixes for char/misc drivers that resolve
        reported issues.
      
        All have been in linux-next successfully for a few days"
      
      * tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hosts
        Tools: hv: Handle the case when the target file exists correctly
        vme_tsi148: Utilize to_pci_dev() macro
        vme_tsi148: Fix PCI address mapping assumption
        vme_tsi148: Fix typo in tsi148_slave_get()
        w1: avoid recursive device_add
        w1: fix netlink refcnt leak on error path
        misc: Grammar s/addition/additional/
        drivers: mcb: fix memory leak in chameleon_parse_cells() error path
        mei: ignore client writing state during cb completion
        mei: me: do not load the driver if the FW doesn't support MEI interface
        GenWQE: Increase driver version number
        GenWQE: Fix multithreading problems
        GenWQE: Ensure rc is not returning an uninitialized value
        GenWQE: Add wmb before DDCB is started
        GenWQE: Enable access to VPD flash area
      25bfe4f5
  5. 18 Apr, 2014 14 commits
    • Linus Torvalds's avatar
      Merge tag 'driver-core-3.15-rc2' of... · 60fbf2bd
      Linus Torvalds authored
      Merge tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fixes from Greg KH:
       "Here are some driver core fixes for 3.15-rc2.  Also in here are some
        documentation updates, as well as an API removal that had to wait for
        after -rc1 due to the cleanups coming into you from multiple developer
        trees (this one and the PPC tree.)
      
        All have been in linux next successfully"
      
      * tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        drivers/base/dd.c incorrect pr_debug() parameters
        Documentation: Update stable address in Chinese and Japanese translations
        topology: Fix compilation warning when not in SMP
        Chinese: add translation of io_ordering.txt
        stable_kernel_rules: spelling/word usage
        sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()
        kernfs: protect lazy kernfs_iattrs allocation with mutex
        fs: Don't return 0 from get_anon_bdev
      60fbf2bd
    • Linus Torvalds's avatar
      Merge tag 'staging-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 8cb652bb
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are a few staging driver fixes for issues that have been reported
        for 3.15-rc2.
      
        Also dominating the diffstat for the pull request is the removal of
        the rtl8187se driver.  It's no longer needed in staging as a "real"
        driver for this hardware is now merged in the tree in the "correct"
        location in drivers/net/
      
        All of these patches have been tested in linux-next"
      
      * tag 'staging-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: r8188eu: Fix case where ethtype was never obtained and always be checked against 0
        staging: r8712u: Fix case where ethtype was never obtained and always be checked against 0
        staging: r8188eu: Calling rtw_get_stainfo() with a NULL sta_addr will return NULL
        staging: comedi: fix circular locking dependency in comedi_mmap()
        staging: r8723au: Add missing initialization of change_inx in sort algorithm
        Staging: unisys: use after free in list_for_each()
        staging: unisys: use after free in error messages
        staging: speakup: fix misuse of kstrtol() in handle_goto()
        staging: goldfish: Call free_irq in error path
        staging: delete rtl8187se wireless driver
        staging: rtl8723au: Fix buffer overflow in rtw_get_wfd_ie()
        staging: gs_fpgaboot: remove __TIMESTAMP__ macro
        staging: vme: fix memory leak in vme_user_probe()
        staging: fpgaboot: clean up Makefile
        staging/usbip: fix store_attach() sscanf return value check
        staging/usbip: userspace - fix usbipd SIGSEGV from refresh_exported_devices()
        staging: rtl8188eu: remove spaces, correct counts to unbreak P2P ioctls
        staging/rtl8821ae: Fix OOM handling in _rtl_init_deferred_work()
      8cb652bb
    • Linus Torvalds's avatar
      Merge tag 'tty-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 575a2929
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are a number of small tty/serial driver fixes for 3.15-rc2.  Also
        in here are some Documentation file removals for drivers that we
        removed a long time ago, no need to keep it around any longer.
      
        All of these have been in linux-next for a bit"
      
      * tag 'tty-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        Revert "serial: 8250, disable "too much work" messages"
        serial: amba-pl011: fix regression, causing an Oops on rmmod
        tty: Fix help text of SYNCLINK_CS
        tty: fix memleak in alloc_pid
        ttyprintk: Allow built as a module
        ttyprintk: Fix wrong tty_unregister_driver() call in the error path
        serial: 8250, disable "too much work" messages
        Documentation/serial: Delete obsolete driver documentation
        serial: omap: Fix missing pm_runtime_resume handling by simplifying code
        serial_core: Fix pm imbalance on unbind
        serial: pl011: change Rx burst size to half of trigger level
        serial: timberdale: Depend on X86_32
        serial: st-asc: Fix SysRq char handling
        Revert "serial: clps711x: Give a chance to perform useful tasks during wait loop"
        serial_core: Fix conditional start_tx on ring buffer not empty
        serial: efm32: use $vendor,$device scheme for compatible string
        serial: omap: free the wakeup settings in remove
      575a2929
    • Linus Torvalds's avatar
      Merge tag 'usb-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 7e55f81e
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are a number of tiny USB fixes and new device ids for 3.15-rc2.
        Nothing major, just issues some people have reported.
      
        All of these have been in linux-next"
      
      * tag 'usb-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        uas: fix deadlocky memory allocations
        uas: fix error handling during scsi_scan()
        uas: fix GFP_NOIO under spinlock
        uwb: adds missing error handling
        USB: cdc-acm: Remove Motorola/Telit H24 serial interfaces from ACM driver
        USB: ohci-jz4740: FEAT_POWER is a port feature, not a hub feature
        USB: ohci-jz4740: Fix uninitialized variable warning
        USB: EHCI: tegra: set txfill_tuning
        usb: ehci-platform: Return immediately from suspend if ehci_suspend fails
        usb: ehci-exynos: Return immediately from suspend if ehci_suspend fails
        USB: fix crash during hotplug of PCI USB controller card
        USB: cdc-acm: fix double usb_autopm_put_interface() in acm_port_activate()
        usb: usb-common: fix typo for usb_state_string
        USB: usb_wwan: fix handling of missing bulk endpoints
        USB: pl2303: add ids for Hewlett-Packard HP POS pole displays
        USB: cp210x: Add 8281 (Nanotec Plug & Drive)
        usb: option driver, add support for Telit UE910v2
        Revert "USB: serial: add usbid for dell wwan card to sierra.c"
        USB: serial: ftdi_sio: add id for Brainboxes serial cards
      7e55f81e
    • Linus Torvalds's avatar
      Merge branch 'akpm' (incoming from Andrew) · ea2388f2
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "13 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        thp: close race between split and zap huge pages
        mm: fix new kernel-doc warning in filemap.c
        mm: fix CONFIG_DEBUG_VM_RB description
        mm: use paravirt friendly ops for NUMA hinting ptes
        mips: export flush_icache_range
        mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()
        wait: explain the shadowing and type inconsistencies
        Shiraz has moved
        Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt
        powerpc/mm: fix ".__node_distance" undefined
        kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write()
        init/Kconfig: move the trusted keyring config option to general setup
        vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()
      ea2388f2
    • Kirill A. Shutemov's avatar
      thp: close race between split and zap huge pages · b5a8cad3
      Kirill A. Shutemov authored
      Sasha Levin has reported two THP BUGs[1][2].  I believe both of them
      have the same root cause.  Let's look to them one by one.
      
      The first bug[1] is "kernel BUG at mm/huge_memory.c:1829!".  It's
      BUG_ON(mapcount != page_mapcount(page)) in __split_huge_page().  From my
      testing I see that page_mapcount() is higher than mapcount here.
      
      I think it happens due to race between zap_huge_pmd() and
      page_check_address_pmd().  page_check_address_pmd() misses PMD which is
      under zap:
      
      	CPU0						CPU1
      						zap_huge_pmd()
      						  pmdp_get_and_clear()
      __split_huge_page()
        anon_vma_interval_tree_foreach()
          __split_huge_page_splitting()
            page_check_address_pmd()
              mm_find_pmd()
      	  /*
      	   * We check if PMD present without taking ptl: no
      	   * serialization against zap_huge_pmd(). We miss this PMD,
      	   * it's not accounted to 'mapcount' in __split_huge_page().
      	   */
      	  pmd_present(pmd) == 0
      
        BUG_ON(mapcount != page_mapcount(page)) // CRASH!!!
      
      						  page_remove_rmap(page)
      						    atomic_add_negative(-1, &page->_mapcount)
      
      The second bug[2] is "kernel BUG at mm/huge_memory.c:1371!".
      It's VM_BUG_ON_PAGE(!PageHead(page), page) in zap_huge_pmd().
      
      This happens in similar way:
      
      	CPU0						CPU1
      						zap_huge_pmd()
      						  pmdp_get_and_clear()
      						  page_remove_rmap(page)
      						    atomic_add_negative(-1, &page->_mapcount)
      __split_huge_page()
        anon_vma_interval_tree_foreach()
          __split_huge_page_splitting()
            page_check_address_pmd()
              mm_find_pmd()
      	  pmd_present(pmd) == 0	/* The same comment as above */
        /*
         * No crash this time since we already decremented page->_mapcount in
         * zap_huge_pmd().
         */
        BUG_ON(mapcount != page_mapcount(page))
      
        /*
         * We split the compound page here into small pages without
         * serialization against zap_huge_pmd()
         */
        __split_huge_page_refcount()
      						VM_BUG_ON_PAGE(!PageHead(page), page); // CRASH!!!
      
      So my understanding the problem is pmd_present() check in mm_find_pmd()
      without taking page table lock.
      
      The bug was introduced by me commit with commit 117b0791. Sorry for
      that. :(
      
      Let's open code mm_find_pmd() in page_check_address_pmd() and do the
      check under page table lock.
      
      Note that __page_check_address() does the same for PTE entires
      if sync != 0.
      
      I've stress tested split and zap code paths for 36+ hours by now and
      don't see crashes with the patch applied. Before it took <20 min to
      trigger the first bug and few hours for second one (if we ignore
      first).
      
      [1] https://lkml.kernel.org/g/<53440991.9090001@oracle.com>
      [2] https://lkml.kernel.org/g/<5310C56C.60709@oracle.com>
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Tested-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Cc: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>	[3.13+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b5a8cad3
    • Randy Dunlap's avatar
      mm: fix new kernel-doc warning in filemap.c · b59b8cbc
      Randy Dunlap authored
      Fix new kernel-doc warning in mm/filemap.c:
      
        Warning(mm/filemap.c:2600): Excess function parameter 'ppos' description in '__generic_file_aio_write'
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b59b8cbc
    • Davidlohr Bueso's avatar
      mm: fix CONFIG_DEBUG_VM_RB description · a663dad6
      Davidlohr Bueso authored
      This appears to be a copy/paste error.  Update the description to
      reflect extra rbtree debug and checks for the config option instead of
      duplicating CONFIG_DEBUG_VM.
      Signed-off-by: default avatarDavidlohr Bueso <davidlohr@hp.com>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a663dad6
    • Mel Gorman's avatar
      mm: use paravirt friendly ops for NUMA hinting ptes · 29c77870
      Mel Gorman authored
      David Vrabel identified a regression when using automatic NUMA balancing
      under Xen whereby page table entries were getting corrupted due to the
      use of native PTE operations.  Quoting him
      
      	Xen PV guest page tables require that their entries use machine
      	addresses if the preset bit (_PAGE_PRESENT) is set, and (for
      	successful migration) non-present PTEs must use pseudo-physical
      	addresses.  This is because on migration MFNs in present PTEs are
      	translated to PFNs (canonicalised) so they may be translated back
      	to the new MFN in the destination domain (uncanonicalised).
      
      	pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma()
      	set and clear the _PAGE_PRESENT bit using pte_set_flags(),
      	pte_clear_flags(), etc.
      
      	In a Xen PV guest, these functions must translate MFNs to PFNs
      	when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
      	_PAGE_PRESENT.
      
      His suggested fix converted p[te|md]_[set|clear]_flags to using
      paravirt-friendly ops but this is overkill.  He suggested an alternative
      of using p[te|md]_modify in the NUMA page table operations but this is
      does more work than necessary and would require looking up a VMA for
      protections.
      
      This patch modifies the NUMA page table operations to use paravirt
      friendly operations to set/clear the flags of interest.  Unfortunately
      this will take a performance hit when updating the PTEs on
      CONFIG_PARAVIRT but I do not see a way around it that does not break
      Xen.
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Acked-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Tested-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Noonan <steven@uplinklabs.net>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      29c77870
    • Kees Cook's avatar
      mips: export flush_icache_range · 8229f1a0
      Kees Cook authored
      The lkdtm module performs tests against executable memory ranges, so it
      needs to flush the icache for proper behaviors.  Other architectures
      already export this, so do the same for MIPS.
      
      [akpm@linux-foundation.org: relocate export sites]
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Sanjay Lal <sanjayl@kymasys.com>
      Cc: John Crispin <blogic@openwrt.org>
      Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8229f1a0
    • Mizuma, Masayoshi's avatar
      mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages() · 7848a4bf
      Mizuma, Masayoshi authored
      soft lockup in freeing gigantic hugepage fixed in commit 55f67141 "mm:
      hugetlb: fix softlockup when a large number of hugepages are freed." can
      happen in return_unused_surplus_pages(), so let's fix it.
      Signed-off-by: default avatarMasayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7848a4bf
    • Peter Zijlstra's avatar
      wait: explain the shadowing and type inconsistencies · 8b32201d
      Peter Zijlstra authored
      Stick in a comment before someone else tries to fix the sparse warning
      this generates.
      Suggested-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-o2ro6f3vkxklni0bc8f7m68s@git.kernel.orgSigned-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8b32201d
    • Viresh Kumar's avatar
      Shiraz has moved · 9cc23682
      Viresh Kumar authored
      shiraz.hashim@st.com email-id doesn't exist anymore as he has left the
      company.  Replace ST's id with shiraz.linux.kernel@gmail.com.
      
      It also updates .mailmap file to fix address for 'git shortlog'.
      Signed-off-by: default avatarViresh Kumar <viresh.kumar@linaro.org>
      Cc: Shiraz Hashim <shiraz.linux.kernel@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9cc23682
    • Tang Chen's avatar
      Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt · 8f28ed92
      Tang Chen authored
      In document numa_memory_policy.txt, the following examples for flag
      MPOL_F_RELATIVE_NODES are incorrect.
      
      	For example, consider a task that is attached to a cpuset with
      	mems 2-5 that sets an Interleave policy over the same set with
      	MPOL_F_RELATIVE_NODES.  If the cpuset's mems change to 3-7, the
      	interleave now occurs over nodes 3,5-6.  If the cpuset's mems
      	then change to 0,2-3,5, then the interleave occurs over nodes
      	0,3,5.
      
      According to the comment of the patch adding flag MPOL_F_RELATIVE_NODES,
      the nodemasks the user specifies should be considered relative to the
      current task's mems_allowed.
      
       (https://lkml.org/lkml/2008/2/29/428)
      
      And according to numa_memory_policy.txt, if the user's nodemask includes
      nodes that are outside the range of the new set of allowed nodes, then
      the remap wraps around to the beginning of the nodemask and, if not
      already set, sets the node in the mempolicy nodemask.
      
      So in the example, if the user specifies 2-5, for a task whose
      mems_allowed is 3-7, the nodemasks should be remapped the third, fourth,
      fifth, sixth node in mems_allowed.  like the following:
      
      	mems_allowed:       3  4  5  6  7
      
      	relative index:     0  1  2  3  4
      	                    5
      
      So the nodemasks should be remapped to 3,5-7, but not 3,5-6.
      
      And for a task whose mems_allowed is 0,2-3,5, the nodemasks should be
      remapped to 0,2-3,5, but not 0,3,5.
      
      	mems_allowed:       0  2  3  5
      
              relative index:     0  1  2  3
                                  4  5
      Signed-off-by: default avatarTang Chen <tangchen@cn.fujitsu.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8f28ed92