1. 30 Nov, 2018 40 commits
    • mm/huge_memory: splitting set mapping+index before unfreeze · 173d9d9f
      Hugh Dickins authored
      Huge tmpfs stress testing has occasionally hit shmem_undo_range()'s
      VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page).
      
      Move the setting of mapping and index up before the page_ref_unfreeze()
      in __split_huge_page_tail() to fix this: so that a page cache lookup
      cannot get a reference while the tail's mapping and index are unstable.
      
      In fact, might as well move them up before the smp_wmb(): I don't see an
      actual need for that, but if I'm missing something, this way round is
      safer than the other, and no less efficient.
      
      You might argue that VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page) is
      misplaced, and should be left until after the trylock_page(); but left as
      is has not crashed since, and gives more stringent assurance.
      
      Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261516380.2275@eggly.anvils
      Fixes: e9b61f19 ("thp: reintroduce split_huge_page()")
      Requires: 605ca5ed ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: <stable@vger.kernel.org>	[4.8+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/huge_memory: rename freeze_page() to unmap_page() · 906f9cdf
      Hugh Dickins authored
      The term "freeze" is used in several ways in the kernel, and in mm it
      has the particular meaning of forcing page refcount temporarily to 0.
      freeze_page() is just too confusing a name for a function that unmaps a
      page: rename it unmap_page(), and rename unfreeze_page() remap_page().
      
      Went to change the mention of freeze_page() added later in mm/rmap.c,
      but found it to be incorrect: ordinary page reclaim reaches there too;
      but the substance of the comment still seems correct, so edit it down.
      
      Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261514080.2275@eggly.anvils
      Fixes: e9b61f19 ("thp: reintroduce split_huge_page()")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: <stable@vger.kernel.org>	[4.8+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • initramfs: clean old path before creating a hardlink · 7c0950d4
      Li Zhijian authored
      sys_link() can fail due to the new path already existing.  This case
      often occurs when we use a concatenated initrd, for example:
      
      1) Prepare a basic rootfs that contains a regular file, rc.local:
      lizhijian@:~/yocto-tiny-i386-2016-04-22$ cat etc/rc.local
       #!/bin/sh
       echo "Running /etc/rc.local..."
      yocto-tiny-i386-2016-04-22$ find . | sed 's,^\./,,' | cpio -o -H newc | gzip -n -9 >../rootfs.cgz
      
      2) Create an extra initrd which also includes an etc/rc.local:
      lizhijian@:~/lkp-x86_64/etc$ echo "append initrd" >rc.local
      lizhijian@:~/lkp/lkp-x86_64/etc$ cat rc.local
      append initrd
      lizhijian@:~/lkp/lkp-x86_64/etc$ ln rc.local rc.local.hardlink
      append initrd
      lizhijian@:~/lkp/lkp-x86_64/etc$ stat rc.local rc.local.hardlink
        File: 'rc.local'
        Size: 14        	Blocks: 8          IO Block: 4096   regular file
      Device: 801h/2049d	Inode: 11296086    Links: 2
      Access: (0664/-rw-rw-r--)  Uid: ( 1002/lizhijian)   Gid: ( 1002/lizhijian)
      Access: 2018-11-15 16:08:28.654464815 +0800
      Modify: 2018-11-15 16:07:57.514903210 +0800
      Change: 2018-11-15 16:08:24.180228872 +0800
       Birth: -
        File: 'rc.local.hardlink'
        Size: 14        	Blocks: 8          IO Block: 4096   regular file
      Device: 801h/2049d	Inode: 11296086    Links: 2
      Access: (0664/-rw-rw-r--)  Uid: ( 1002/lizhijian)   Gid: ( 1002/lizhijian)
      Access: 2018-11-15 16:08:28.654464815 +0800
      Modify: 2018-11-15 16:07:57.514903210 +0800
      Change: 2018-11-15 16:08:24.180228872 +0800
       Birth: -
      
      lizhijian@:~/lkp/lkp-x86_64$ find . | sed 's,^\./,,' | cpio -o -H newc | gzip -n -9 >../rc-local.cgz
      lizhijian@:~/lkp/lkp-x86_64$ gzip -dc ../rc-local.cgz | cpio -t
      .
      etc
      etc/rc.local.hardlink <<< it will be extracted first at this initrd
      etc/rc.local
      
      3) Concatenate the two initrds and boot:
      lizhijian@:~/lkp$ cat rootfs.cgz rc-local.cgz >concate-initrd.cgz
      lizhijian@:~/lkp$ qemu-system-x86_64 -nographic -enable-kvm -cpu host -smp 1 -m 1024 -kernel ~/lkp/linux/arch/x86/boot/bzImage -append "console=ttyS0 earlyprint=ttyS0 ignore_loglevel" -initrd ./concate-initrd.cgz -serial stdio -nodefaults
      
      In this case, sys_link(2) will fail and return -EEXIST, so we end up with
      the rc.local from rootfs.cgz instead of the one from rc-local.cgz.
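
The failure mode can be reproduced from userspace. The sketch below is only an analogy, not the kernel change itself: it uses Python's `os.link()` in a temporary directory, and the helper names `link_replacing()` and `demo()` are hypothetical. It shows that linking onto an existing path fails with EEXIST, and that removing the old path first lets the later archive's content win.

```python
import os
import tempfile


def link_replacing(src, dst):
    """Create hard link dst -> src, removing any existing dst first.

    Userspace analogy only: os.link() raises FileExistsError (EEXIST) when
    dst already exists, so the stale path is removed before linking.
    """
    try:
        os.unlink(dst)
    except FileNotFoundError:
        pass  # nothing to clean up
    os.link(src, dst)


def demo():
    """Return (plain link failed with EEXIST, content seen after cleaning)."""
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "rc.local")           # from the first archive
        new = os.path.join(root, "rc.local.hardlink")  # from the second archive
        with open(old, "w") as f:
            f.write('echo "Running /etc/rc.local..."\n')
        with open(new, "w") as f:
            f.write("append initrd\n")

        try:
            os.link(new, old)      # what happens without cleaning first
            failed = False
        except FileExistsError:
            failed = True

        link_replacing(new, old)   # clean the old path, then link
        with open(old) as f:
            return failed, f.read()
```

Calling `demo()` exercises both the failing and the cleaned-up path in one run.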
      
      [akpm@linux-foundation.org: move code to avoid forward declaration]
      Link: http://lkml.kernel.org/r/1542352368-13299-1-git-send-email-lizhijian@cn.fujitsu.com
      Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
      Cc: Philip Li <philip.li@intel.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Li Zhijian <zhijianx.li@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace · 903e8ff8
      Anders Roxell authored
      Since __sanitizer_cov_trace_pc() is marked as notrace, function calls in
      __sanitizer_cov_trace_pc() shouldn't be traced either.
      ftrace_graph_caller() gets called for each function that isn't marked
      'notrace', like canonicalize_ip().  This is the call trace from a run:
      
      [  139.644550]  ftrace_graph_caller+0x1c/0x24
      [  139.648352]  canonicalize_ip+0x18/0x28
      [  139.652313]  __sanitizer_cov_trace_pc+0x14/0x58
      [  139.656184]  sched_clock+0x34/0x1e8
      [  139.659759]  trace_clock_local+0x40/0x88
      [  139.663722]  ftrace_push_return_trace+0x8c/0x1f0
      [  139.667767]  prepare_ftrace_return+0xa8/0x100
      [  139.671709]  ftrace_graph_caller+0x1c/0x24
      
      Rework so that check_kcov_mode() and canonicalize_ip() that are called
      from __sanitizer_cov_trace_pc() are also marked as notrace.
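
The re-entry pattern in the trace can be illustrated with a toy model. This is a loose sketch, not kernel code: it collapses the tracer and the coverage callback into a single hypothetical `hook()`, and `make_tracer()` and `run()` are invented names. When the helper called from the hook is itself instrumented, every call re-enters the hook; leaving the helper uninstrumented (the "notrace" case) breaks the cycle.

```python
def make_tracer(helper_is_traced):
    """Build a toy tracer; returns its decorator and the list of recorded calls."""
    recorded = []

    def hook(name):
        # The hook relies on a helper, as the coverage callback does.
        helper(0x12345)
        recorded.append(name)

    def traced(func):
        # Instrumentation: run the hook on entry, then the real function.
        def wrapper(*args):
            hook(func.__name__)
            return func(*args)
        return wrapper

    def canonicalize_ip(ip):
        return ip & 0xFFFF

    # "notrace" means the helper is left uninstrumented.
    helper = traced(canonicalize_ip) if helper_is_traced else canonicalize_ip
    return traced, recorded


def run(helper_is_traced):
    """Return the recorded calls, or "recursion" if the hook re-entered itself."""
    traced, recorded = make_tracer(helper_is_traced)

    @traced
    def sched_clock():
        return 42

    try:
        sched_clock()
    except RecursionError:
        return "recursion"
    return recorded
```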
      
      Link: http://lkml.kernel.org/r/20181128081239.18317-1-anders.roxell@linaro.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Co-developed-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • psi: make disabling/enabling easier for vendor kernels · e0c27447
      Johannes Weiner authored
      Mel Gorman reports a hackbench regression with psi that would prohibit
      shipping the suse kernel with it default-enabled, but he'd still like
      users to be able to opt in at little to no cost to others.
      
      With the current combination of CONFIG_PSI and the psi_disabled bool set
      from the commandline, this is a challenge.  Do the following things to
      make it easier:
      
      1. Add a config option CONFIG_PSI_DEFAULT_DISABLED that allows distros
         to enable CONFIG_PSI in their kernel but leave the feature disabled
         unless a user requests it at boot-time.
      
         To avoid double negatives, rename psi_disabled= to psi=.
      
      2. Make psi_disabled a static branch to eliminate any branch costs
         when the feature is disabled.
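
How the build-time default and the boot parameter combine can be sketched as follows. This is an illustrative model only: `psi_enabled()` is a hypothetical helper, and handling just the literal values `psi=0` and `psi=1` is an assumption of the sketch, since the text above does not list the accepted spellings.

```python
def psi_enabled(cmdline, default_disabled):
    """Return whether the feature ends up enabled for a boot command line.

    default_disabled models CONFIG_PSI_DEFAULT_DISABLED. Only the literal
    tokens "psi=0" and "psi=1" are recognised (an assumption of this sketch).
    """
    enabled = not default_disabled
    for token in cmdline.split():
        if token == "psi=1":
            enabled = True
        elif token == "psi=0":
            enabled = False
    return enabled
```

With the default-disabled option set, `psi_enabled("console=ttyS0", True)` stays off until the user adds `psi=1`.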
      
      In terms of numbers before and after this patch, Mel says:
      
      : The following is a comparison using CONFIG_PSI=n as a baseline against
      : your patch and a vanilla kernel
      :
      :                          4.20.0-rc4             4.20.0-rc4             4.20.0-rc4
      :                 kconfigdisable-v1r1                vanilla        psidisable-v1r1
      : Amean     1       1.3100 (   0.00%)      1.3923 (  -6.28%)      1.3427 (  -2.49%)
      : Amean     3       3.8860 (   0.00%)      4.1230 *  -6.10%*      3.8860 (  -0.00%)
      : Amean     5       6.8847 (   0.00%)      8.0390 * -16.77%*      6.7727 (   1.63%)
      : Amean     7       9.9310 (   0.00%)     10.8367 *  -9.12%*      9.9910 (  -0.60%)
      : Amean     12     16.6577 (   0.00%)     18.2363 *  -9.48%*     17.1083 (  -2.71%)
      : Amean     18     26.5133 (   0.00%)     27.8833 *  -5.17%*     25.7663 (   2.82%)
      : Amean     24     34.3003 (   0.00%)     34.6830 (  -1.12%)     32.0450 (   6.58%)
      : Amean     30     40.0063 (   0.00%)     40.5800 (  -1.43%)     41.5087 (  -3.76%)
      : Amean     32     40.1407 (   0.00%)     41.2273 (  -2.71%)     39.9417 (   0.50%)
      :
      : It's showing that the vanilla kernel takes a hit (as the bisection
      : indicated it would) and that disabling PSI by default is reasonably
      : close in terms of performance for this particular workload on this
      : particular machine so;
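
The percentages in the table appear to be relative differences against the first column, with negative values meaning a higher (worse) time. That formula is inferred from the numbers rather than stated in the report, and `relative_gain()` is a hypothetical helper name:

```python
def relative_gain(baseline, value):
    """Percent difference versus baseline; negative means a higher time."""
    return (baseline - value) / baseline * 100.0


# A few rows from the table: (baseline, vanilla, psidisable)
rows = {
    1: (1.3100, 1.3923, 1.3427),
    5: (6.8847, 8.0390, 6.7727),
    24: (34.3003, 34.6830, 32.0450),
}

for groups, (base, vanilla, psidisable) in rows.items():
    print(groups,
          round(relative_gain(base, vanilla), 2),
          round(relative_gain(base, psidisable), 2))
```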
      
      Link: http://lkml.kernel.org/r/20181127165329.GA29728@cmpxchg.org
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Tested-by: Mel Gorman <mgorman@techsingularity.net>
      Reported-by: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc: fixup map_files test on arm · dbd4af54
      Alexey Dobriyan authored
      https://bugs.linaro.org/show_bug.cgi?id=3782
      
      Turns out arm doesn't permit mapping address 0, so try minimum virtual
      address instead.
      
      Link: http://lkml.kernel.org/r/20181113165446.GA28157@avx2
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Reported-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
      Tested-by: Rafael David Tinoco <rafael.tinoco@linaro.org>
      Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • debugobjects: avoid recursive calls with kmemleak · 8de456cf
      Qian Cai authored
      CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to
      recursive calls.
      
      fill_pool
        kmemleak_ignore
          make_black_object
            put_object
              __call_rcu (kernel/rcu/tree.c)
                debug_rcu_head_queue
                  debug_object_activate
                    debug_object_init
                      fill_pool
                        kmemleak_ignore
                          make_black_object
                            ...
      
      So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly
      allocated debug objects at all.
      
      Link: http://lkml.kernel.org/r/20181126165343.2339-1-cai@gmx.us
      Signed-off-by: Qian Cai <cai@gmx.us>
      Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Waiman Long <longman@redhat.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yang Shi <yang.shi@linux.alibaba.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set · dcf7fe9d
      Andrea Arcangeli authored
      Set the page dirty if VM_WRITE is not set because in such case the pte
      won't be marked dirty and the page would be reclaimed without writepage
      (i.e.  swapout in the shmem case).
      
      This was found by source review.  Most apps (certainly including QEMU)
      only use UFFDIO_COPY on PROT_READ|PROT_WRITE mappings or the app can't
      modify the memory in the first place.  This is for correctness and it
      could help the non cooperative use case to avoid unexpected data loss.
      
      Link: http://lkml.kernel.org/r/20181126173452.26955-6-aarcange@redhat.com
      Reviewed-by: Hugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org
      Fixes: 4c27fe4c ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
      Reported-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • userfaultfd: shmem: add i_size checks · e2a50c1f
      Andrea Arcangeli authored
      With MAP_SHARED: recheck the i_size after taking the PT lock, to
      serialize against truncate with the PT lock.  Delete the page from the
      pagecache if the i_size_read check fails.
      
      With MAP_PRIVATE: check the i_size after the PT lock before mapping
      anonymous memory or zeropages into the MAP_PRIVATE shmem mapping.
      
      A mostly unrelated cleanup: just as we do the delete_from_page_cache()
      pagecache removal after dropping the PT lock, drop the PT lock (a
      spinlock) before taking the sleepable page lock.
      
      Link: http://lkml.kernel.org/r/20181126173452.26955-5-aarcange@redhat.com
      Fixes: 4c27fe4c ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Hugh Dickins <hughd@google.com>
      Reported-by: Jann Horn <jannh@google.com>
      Cc: <stable@vger.kernel.org>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas · 29ec9066
      Andrea Arcangeli authored
      After the VMA to register the uffd onto is found, check that it has
      VM_MAYWRITE set before allowing registration.  This way we inherit all
      common code checks before allowing to fill file holes in shmem and
      hugetlbfs with UFFDIO_COPY.
      
      The userfaultfd memory model is not applicable for readonly files unless
      it's a MAP_PRIVATE.
      
      Link: http://lkml.kernel.org/r/20181126173452.26955-4-aarcange@redhat.com
      Fixes: ff62a342 ("hugetlb: implement memfd sealing")
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Hugh Dickins <hughd@google.com>
      Reported-by: Jann Horn <jannh@google.com>
      Fixes: 4c27fe4c ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
      Cc: <stable@vger.kernel.org>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem · 5b51072e
      Andrea Arcangeli authored
      Userfaultfd did not create private memory when UFFDIO_COPY was invoked
      on a MAP_PRIVATE shmem mapping.  Instead it wrote to the shmem file,
      even when that had not been opened for writing.  Though, fortunately,
      that could only happen where there was a hole in the file.
      
      Fix the shmem-backed implementation of UFFDIO_COPY to create private
      memory for MAP_PRIVATE mappings.  The hugetlbfs-backed implementation
      was already correct.
      
      This change is visible to userland, if userfaultfd has been used in
      unintended ways: so it introduces a small risk of incompatibility, but
      is necessary in order to respect file permissions.
      
      An app that uses UFFDIO_COPY for anything like postcopy live migration
      won't notice the difference, and in fact it'll run faster because there
      will be no copy-on-write and memory waste in the tmpfs pagecache
      anymore.
      
      Userfaults on MAP_PRIVATE shmem keep triggering only on file holes like
      before.
      
      The real zeropage can also be built on a MAP_PRIVATE shmem mapping
      through UFFDIO_ZEROPAGE and that's safe because the zeropage pte is
      never dirty, in turn even an mprotect upgrading the vma permission from
      PROT_READ to PROT_READ|PROT_WRITE won't make the zeropage pte writable.
      
      Link: http://lkml.kernel.org/r/20181126173452.26955-3-aarcange@redhat.com
      Fixes: 4c27fe4c ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Reported-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Hugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • userfaultfd: use ENOENT instead of EFAULT if the atomic copy user fails · 9e368259
      Andrea Arcangeli authored
      Patch series "userfaultfd shmem updates".
      
      Jann found two bugs in the userfaultfd shmem MAP_SHARED backend: the
      lack of the VM_MAYWRITE check and the lack of i_size checks.
      
      Then looking into the above we also fixed the MAP_PRIVATE case.
      
      Hugh by source review also found a data loss source if UFFDIO_COPY is
      used on shmem MAP_SHARED PROT_READ mappings (the production usages
      incidentally run with PROT_READ|PROT_WRITE, so the data loss couldn't
      happen in those production usages like with QEMU).
      
      The whole patchset is marked for stable.
      
      We verified that QEMU postcopy live migration with the guest running on
      shmem MAP_PRIVATE runs as well after the shmem MAP_PRIVATE fix as it did
      before.  Regardless of whether it's shmem or hugetlbfs, MAP_PRIVATE or
      MAP_SHARED, QEMU unconditionally invokes a punch hole if the guest mapping
      is filebacked, and a MADV_DONTNEED too (needed to get rid of the
      MAP_PRIVATE COWs and for the anon backend).
      
      This patch (of 5):
      
      We internally used EFAULT to communicate with the caller; switch to
      ENOENT so that EFAULT can be used as a non-internal return value.
      
      Link: http://lkml.kernel.org/r/20181126173452.26955-2-aarcange@redhat.com
      Fixes: 4c27fe4c ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
      Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Hugh Dickins <hughd@google.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib/test_kmod.c: fix rmmod double free · 5618cf03
      Luis Chamberlain authored
      We free the misc device string twice on rmmod; fix this.  Without this
      we cannot remove the module without crashing.
      
      Link: http://lkml.kernel.org/r/20181124050500.5257-1-mcgrof@kernel.org
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Reported-by: Randy Dunlap <rdunlap@infradead.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: <stable@vger.kernel.org>	[4.12+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hfsplus: do not free node before using · c7d7d620
      Pan Bian authored
      hfs_bmap_free() frees the node via hfs_bnode_put(node).  However, it
      then reads node->this when dumping an error message on an error path,
      which may result in a use-after-free bug.  This patch frees the node only
      when it is no longer used.
      
      Link: http://lkml.kernel.org/r/1543053441-66942-1-git-send-email-bianpan2016@163.com
      Signed-off-by: Pan Bian <bianpan2016@163.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ernesto A. Fernandez <ernesto.mnd.fernandez@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Viacheslav Dubeyko <slava@dubeyko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hfs: do not free node before using · ce96a407
      Pan Bian authored
      hfs_bmap_free() frees the node via hfs_bnode_put(node).  However, it
      then reads node->this when dumping an error message on an error path,
      which may result in a use-after-free bug.  This patch frees the node only
      when it is no longer used.
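
The ordering problem can be shown with a toy reference-counted object. This is an illustration under stated assumptions, not the hfs code: the `Node` class, the two `report_*()` helpers and the message text are all hypothetical, and a raised exception stands in for the undefined behaviour of a real use-after-free.

```python
class Node:
    """Toy refcounted node; its field is unreadable once the last ref is put."""

    def __init__(self, this):
        self._this = this
        self._refs = 1

    @property
    def this(self):
        if self._refs == 0:
            raise RuntimeError("use after free")
        return self._this

    def put(self):
        # Drop one reference; at zero the node counts as freed.
        self._refs -= 1


def report_buggy(node):
    node.put()                              # reference dropped first...
    return f"error on node {node.this}"     # ...then the field is read


def report_fixed(node):
    message = f"error on node {node.this}"  # read while still referenced
    node.put()                              # drop the reference last
    return message
```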
      
      Link: http://lkml.kernel.org/r/1542963889-128825-1-git-send-email-bianpan2016@163.com
      Fixes: a1185ffa2fc ("HFS rewrite")
      Signed-off-by: Pan Bian <bianpan2016@163.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: Ernesto A. Fernandez <ernesto.mnd.fernandez@gmail.com>
      Cc: Viacheslav Dubeyko <slava@dubeyko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc: update MAINTAINERS with proc.txt · 94570a41
      Alexey Dobriyan authored
      Turns out that /proc has official documentation and people are even
      trying to keep it up to date.
      
      Link: http://lkml.kernel.org/r/20181116134630.GA8004@avx2
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/page_alloc.c: fix calculation of pgdat->nr_zones · 8f416836
      Wei Yang authored
      init_currently_empty_zone() will adjust pgdat->nr_zones and set it to
      'zone_idx(zone) + 1' unconditionally.  This is correct in the normal
      case, but not in the hot-plug situation.
      
      This function is used in two places:
      
        * free_area_init_core()
        * move_pfn_range_to_zone()
      
      In the first case, we are sure the zone index increases monotonically,
      while in the second one it is under the user's control.
      
      One way to reproduce this is:
      ----------------------------
      
      1. create a virtual machine with empty node1
      
         -m 4G,slots=32,maxmem=32G \
         -smp 4,maxcpus=8          \
         -numa node,nodeid=0,mem=4G,cpus=0-3 \
         -numa node,nodeid=1,mem=0G,cpus=4-7
      
      2. hot-add cpu 3-7
      
         cpu-add [3-7]
      
      3. hot-add memory to node1
      
         object_add memory-backend-ram,id=ram0,size=1G
         device_add pc-dimm,id=dimm0,memdev=ram0,node=1
      
      4. online memory in the following order
      
         echo online_movable > memory47/state
         echo online > memory40/state
      
      After this, node1 will have its nr_zones equal to (ZONE_NORMAL + 1)
      instead of (ZONE_MOVABLE + 1).
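
The effect of the onlining order can be modelled in a few lines. This is a sketch under assumptions: the zone index values are illustrative, both function names are hypothetical, and the "only ever raise the count" variant is one plausible reading of the fix, which the text above does not spell out.

```python
# Illustrative zone indices; only their relative order matters here.
ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_MOVABLE = range(4)


def nr_zones_unconditional(online_order):
    """Model of the described behaviour: always overwrite with idx + 1."""
    nr_zones = 0
    for zone_idx in online_order:
        nr_zones = zone_idx + 1
    return nr_zones


def nr_zones_monotonic(online_order):
    """Assumed fix: only ever raise nr_zones, never lower it."""
    nr_zones = 0
    for zone_idx in online_order:
        nr_zones = max(nr_zones, zone_idx + 1)
    return nr_zones


# Onlining order from the reproduction steps: movable first, then normal.
order = [ZONE_MOVABLE, ZONE_NORMAL]
print(nr_zones_unconditional(order), nr_zones_monotonic(order))
```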
      
      Michal said:
       "Having an incorrect nr_zones might result in all sorts of problems
        which would be quite hard to debug (e.g. reclaim not considering the
        movable zone). I do not expect many users would suffer from this it
        but still this is trivial and obviously right thing to do so
        backporting to the stable tree shouldn't be harmful (last famous
        words)"
      
      Link: http://lkml.kernel.org/r/20181117022022.9956-1-richard.weiyang@gmail.com
      Fixes: f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded memory to zones until online")
      Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: use swp_offset as key in shmem_replace_page() · c1cb20d4
      Yu Zhao authored
      We changed the key of swap cache tree from swp_entry_t.val to
      swp_offset.  We need to do so in shmem_replace_page() as well.
      
      Hugh said:
       "shmem_replace_page() has been wrong since the day I wrote it: good
        enough to work on swap "type" 0, which is all most people ever use
        (especially those few who need shmem_replace_page() at all), but
        broken once there are any non-0 swp_type bits set in the higher order
        bits"
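
Why the raw entry value only works as a key for swap type 0 can be seen with a small bit-packing model. This is a sketch, not the kernel's encoding: the shift of 58 bits is a hypothetical layout chosen for illustration, and the dictionary stands in for a per-type swap cache keyed by offset.

```python
SWP_TYPE_SHIFT = 58  # hypothetical split between type bits and offset bits


def swp_entry(swp_type, offset):
    """Pack a swap type and offset into one integer value."""
    return (swp_type << SWP_TYPE_SHIFT) | offset


def swp_offset(val):
    """Extract the offset bits from a packed entry value."""
    return val & ((1 << SWP_TYPE_SHIFT) - 1)


def lookup(swp_type, offset=5):
    """Return (lookup by raw value, lookup by offset) in an offset-keyed cache."""
    cache = {offset: "page"}
    val = swp_entry(swp_type, offset)
    return cache.get(val), cache.get(swp_offset(val))
```

For type 0 the raw value equals the offset, so both lookups hit; for any other type only the offset-keyed lookup finds the page.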
      
      Link: http://lkml.kernel.org/r/20181121215442.138545-1-yuzhao@google.com
      Fixes: f6ab1f7f ("mm, swap: use offset of swap entry as key of swap cache")
      Signed-off-by: Yu Zhao <yuzhao@google.com>
      Reviewed-by: Matthew Wilcox <willy@infradead.org>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>	[4.9+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: cleancache: fix corruption on missed inode invalidation · 6ff38bd4
      Pavel Tikhomirov authored
      If all pages are deleted from the mapping by memory reclaim and also
      moved to the cleancache:
      
      __delete_from_page_cache
        (no shadow case)
        unaccount_page_cache_page
          cleancache_put_page
        page_cache_delete
          mapping->nrpages -= nr
          (nrpages becomes 0)
      
      We don't clean the cleancache for an inode after final file truncation
      (removal).
      
      truncate_inode_pages_final
        check (nrpages || nrexceptional) is false
          no truncate_inode_pages
            no cleancache_invalidate_inode(mapping)
      
      This way, when reading a new file created with the same inode, we may get
      these stale leftover pages from cleancache and see wrong data instead of
      the contents of the new file.
      
      Fix it by always doing truncate_inode_pages, which already handles the
      nrpages == 0 && nrexceptional == 0 case and just invalidates the inode.
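
The sequence can be modelled with a toy address space. This is a simplified sketch, not the kernel code: the `Mapping` class and both `truncate_final_*()` helpers are hypothetical, and `nrexceptional` is left out for brevity.

```python
class Mapping:
    """Toy address space: live page count plus a per-inode cleancache."""

    def __init__(self, pages):
        self._pages = dict(pages)
        self.nrpages = len(self._pages)
        self.cleancache = {}

    def reclaim_all(self):
        # Reclaim moves every clean page into cleancache; nrpages drops to 0.
        self.cleancache.update(self._pages)
        self._pages.clear()
        self.nrpages = 0

    def truncate_inode_pages(self):
        # Drops remaining pages and always invalidates the inode's cleancache.
        self._pages.clear()
        self.nrpages = 0
        self.cleancache.clear()


def truncate_final_old(mapping):
    # Described old behaviour: skip the truncate when no pages are cached.
    if mapping.nrpages:
        mapping.truncate_inode_pages()


def truncate_final_fixed(mapping):
    # Described fix: always truncate, so cleancache is always invalidated.
    mapping.truncate_inode_pages()
```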
      
      [akpm@linux-foundation.org: add comment, per Jan]
      Link: http://lkml.kernel.org/r/20181112095734.17979-1-ptikhomirov@virtuozzo.com
      Fixes: 91b0abe3 ("mm + fs: store shadow entries in page cache")
      Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
      Reviewed-by: Vasily Averin <vvs@virtuozzo.com>
      Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix deadlock caused by ocfs2_defrag_extent() · e21e5744
      Larry Chen authored
      ocfs2_defrag_extent may fall into deadlock.
      
      ocfs2_ioctl_move_extents
          ocfs2_ioctl_move_extents
            ocfs2_move_extents
              ocfs2_defrag_extent
                ocfs2_lock_allocators_move_extents
      
                  ocfs2_reserve_clusters
                    inode_lock GLOBAL_BITMAP_SYSTEM_INODE
      
                __ocfs2_flush_truncate_log
                    inode_lock GLOBAL_BITMAP_SYSTEM_INODE
      
      As the backtrace above shows, ocfs2_reserve_clusters() will call inode_lock
      against the global bitmap if the local allocator does not have sufficient
      clusters.  Once the global bitmap can meet the demand,
      ocfs2_reserve_clusters() returns success with the global bitmap locked.
      
      After ocfs2_reserve_clusters(), if the truncate log is full,
      __ocfs2_flush_truncate_log() will definitely deadlock because it needs to
      inode_lock the global bitmap, which has already been locked.
      
      To fix this bug, we could remove from
      ocfs2_lock_allocators_move_extents() the code which intends to lock
      global allocator, and put the removed code after
      __ocfs2_flush_truncate_log().
      
      ocfs2_lock_allocators_move_extents() is called from two places: one is
      here, and the other does not need the data allocator context, which
      means this patch does not affect that caller so far.
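
      The lock ordering problem described above can be modelled in user space.
      The following is only a sketch: toy_lock, flush_truncate_log(),
      defrag_old_order() and defrag_new_order() are hypothetical names, not
      ocfs2 code, and a failed trylock stands in for "would block forever".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: the holder can never acquire it a second time. */
struct toy_lock {
    bool held;
};

/* Returns true if the lock was taken, false where a real lock would block. */
bool toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

void toy_unlock(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Hypothetical stand-in for the truncate log flush: needs the bitmap lock. */
bool flush_truncate_log(struct toy_lock *global_bitmap)
{
    if (!toy_trylock(global_bitmap))
        return false;                /* a real inode_lock would deadlock here */
    toy_unlock(global_bitmap);
    return true;
}

/* Old order: reserve clusters (taking the lock) first, then flush. */
bool defrag_old_order(struct toy_lock *global_bitmap)
{
    bool ok;

    if (!toy_trylock(global_bitmap)) /* cluster reservation locks the bitmap */
        return false;
    ok = flush_truncate_log(global_bitmap);  /* lock is already held */
    toy_unlock(global_bitmap);
    return ok;
}

/* New order: flush first, take the bitmap lock only afterwards. */
bool defrag_new_order(struct toy_lock *global_bitmap)
{
    if (!flush_truncate_log(global_bitmap))
        return false;
    if (!toy_trylock(global_bitmap))
        return false;
    toy_unlock(global_bitmap);
    return true;
}
```

      With a fresh lock, defrag_old_order() reports failure at the flush step
      while defrag_new_order() succeeds, and both leave the lock released.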
      
      Link: http://lkml.kernel.org/r/20181101071422.14470-1-lchen@suse.com
      Signed-off-by: Larry Chen <lchen@suse.com>
      Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e21e5744
    • John Hubbard's avatar
      mm/gup: finish consolidating error handling · 08be37b7
      John Hubbard authored
      Commit df06b37f ("mm/gup: cache dev_pagemap while pinning pages")
      attempted to operate on each page that get_user_pages had retrieved.  In
      order to do that, it created a common exit point from the routine.
      However, one case was missed, which this patch fixes up.
      
      Also, there was still an unnecessary shadow declaration (with a
      different type) of the "ret" variable, which this patch removes.
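
      As an illustration of why a single exit point matters here, the sketch
      below uses a hypothetical reference-counted toy_ref in place of the
      cached dev_pagemap. toy_pin_items() and its fail_at parameter are
      invented for this example and do not mirror the real gup code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted resource, standing in for a cached mapping. */
struct toy_ref {
    int count;
};

void toy_get(struct toy_ref *r)
{
    r->count++;
}

void toy_put(struct toy_ref *r)
{
    assert(r->count > 0);
    r->count--;
}

/*
 * Walk n items, taking the cached reference at the first item.
 * fail_at < 0 means no failure is injected. Every path leaves through
 * "out", so the cached reference is dropped exactly once.
 * Returns the number of items handled, or -1 if none were.
 */
long toy_pin_items(struct toy_ref *res, int n, int fail_at)
{
    struct toy_ref *cached = NULL;
    long ret = 0;    /* declared once; no inner "int ret" shadows it */
    int i;

    for (i = 0; i < n; i++) {
        if (!cached) {
            toy_get(res);
            cached = res;
        }
        if (i == fail_at) {
            /* A bare "return" here would leak the cached reference. */
            ret = i ? i : -1;
            goto out;
        }
    }
    ret = i;
out:
    if (cached)
        toy_put(cached);
    return ret;
}
```

      Whichever iteration fails, the reference count of the resource is back
      to its starting value when toy_pin_items() returns.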
      
      Keith's description of the situation is:
      
        This also fixes a potentially leaked dev_pagemap reference count if a
        failure occurs when an iteration crosses a vma boundary.  I don't think
        it's normal to have different vma's on a users mapped zone device
        memory, but good to fix anyway.
      
      I actually thought that this code:
      
          /* first iteration or cross vma bound */
          if (!vma || start >= vma->vm_end) {
              vma = find_extend_vma(mm, start);
              if (!vma && in_gate_area(mm, start)) {
                  ret = get_gate_page(mm, start & PAGE_MASK,
                          gup_flags, &vma,
                          pages ? &pages[i] : NULL);
                  if (ret)
                      goto out;
      
      dealt with the "you're trying to pin the gate page, as part of this
      call", rather than the generic case of crossing a vma boundary.  (I
      think there's a fine point that I must be overlooking.) But it's still a
      valid case, either way.
      
      Link: http://lkml.kernel.org/r/20181121081402.29641-2-jhubbard@nvidia.com
      Fixes: df06b37f ("mm/gup: cache dev_pagemap while pinning pages")
      Signed-off-by: John Hubbard <jhubbard@nvidia.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      08be37b7
    • Luis Chamberlain's avatar
      MAINTAINERS: name change for Luis · 12457e63
      Luis Chamberlain authored
      My name has changed, works better than Global Entry I tell ya.
      
      Link: http://lkml.kernel.org/r/20181122003138.7752-1-mcgrof@kernel.org
      Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      12457e63
    • Linus Torvalds's avatar
      Merge tag 'char-misc-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · b6839ef2
      Linus Torvalds authored
      Pull char/misc fixes from Greg KH:
       "Here are a few small char/misc driver fixes for 4.20-rc5 that resolve
        a number of reported issues.
      
        The "largest" here is the thunderbolt patch, which resolves an issue
        with NVM upgrade, the smallest being some fsi driver fixes. There's
        also a hyperv bugfix, and the usual binder bugfixes.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'char-misc-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        misc: mic/scif: fix copy-paste error in scif_create_remote_lookup
        thunderbolt: Prevent root port runtime suspend during NVM upgrade
        Drivers: hv: vmbus: check the creation_status in vmbus_establish_gpadl()
        binder: fix race that allows malicious free of live buffer
        fsi: fsi-scom.c: Remove duplicate header
        fsi: master-ast-cf: select GENERIC_ALLOCATOR
      b6839ef2
    • Linus Torvalds's avatar
      Merge tag 'driver-core-4.20-rc5' of... · d7aca8a7
      Linus Torvalds authored
      Merge tag 'driver-core-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fix from Greg KH:
       "Here is a single driver core fix for 4.20-rc5
      
        It resolves an issue with the data alignment in 'struct devres' for
        the ARC platform. The full details are in the commit changelog, but
        the short summary is the change is a single line:
      
      	-       unsigned long long              data[]; /* guarantee ull alignment */
      	+       u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
      
        This has been in linux-next for a while with no reported issues"
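
      The effect of that one-line change can be sketched in user space with
      GCC or Clang. TOY_MINALIGN, toy_node and the two toy_devres structs are
      assumptions made for this example; the real ARCH_KMALLOC_MINALIGN value
      is architecture-specific and is not taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in value; the kernel constant differs per architecture. */
#define TOY_MINALIGN 16

/* Hypothetical header that precedes the payload. */
struct toy_node {
    void *next;
    void *prev;
    void *release;
};

/* Old layout: data[] gets whatever alignment the ABI gives unsigned long long. */
struct toy_devres_old {
    struct toy_node node;
    unsigned long long data[];
};

/* New layout: the alignment of data[] is requested explicitly. */
struct toy_devres_new {
    struct toy_node node;
    unsigned char __attribute__((aligned(TOY_MINALIGN))) data[];
};

size_t old_data_offset(void)
{
    return offsetof(struct toy_devres_old, data);
}

size_t new_data_offset(void)
{
    return offsetof(struct toy_devres_new, data);
}
```

      In the old layout the payload offset depends on how the ABI aligns
      unsigned long long, while new_data_offset() is always a multiple of
      TOY_MINALIGN.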
      
      * tag 'driver-core-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        devres: Align data[] to ARCH_KMALLOC_MINALIGN
      d7aca8a7
    • Linus Torvalds's avatar
      Merge tag 'staging-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · cd9a0433
      Linus Torvalds authored
      Pull staging and IIO driver fixes from Greg KH:
       "Here are some small IIO and staging driver fixes for 4.20-rc5.
      
        Nothing major, the IIO fix ended up touching the HID drivers at the
        same time, but the HID maintainer acked it. The staging fixes are all
        minor patches for reported issues and regressions, full details are in
        the shortlog.
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'staging-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        iio/hid-sensors: Fix IIO_CHAN_INFO_RAW returning wrong values for signed numbers
        staging: vchiq_arm: fix compat VCHIQ_IOC_AWAIT_COMPLETION
        staging: mt7621-pinctrl: fix uninitialized variable ngroups
        staging: rtl8723bs: Add missing return for cfg80211_rtw_get_station
        staging: most: use format specifier "%s" in snprintf
        staging: rtl8723bs: Fix incorrect sense of ether_addr_equal
        staging: mt7621-dma: fix potentially dereferencing uninitialized 'tx_desc'
        staging: comedi: clarify/unify macros for NI macro-defined terminals
        drivers: staging: cedrus: find ctx before dereferencing it ctx
        staging: rtl8723bs: Fix the return value in case of error in 'rtw_wx_read32()'
        staging: comedi: ni_mio_common: scale ao INSN_CONFIG_GET_CMD_TIMING_CONSTRAINTS
        iio:st_magn: Fix enable device after trigger
      cd9a0433
    • Linus Torvalds's avatar
      Merge tag 'usb-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 40ebba2a
      Linus Torvalds authored
      Pull USB/PHY driver fixes from Greg KH:
       "Here are some small USB and PHY driver fixes for 4.20-rc5
      
        Nothing big at all, just the usual handful of USB fixes for reported
        issues, along with some gadget and PHY driver bug fixes.
      
        All of these have been in linux-next with no reported issues. Note,
        the USB gadget fixes were in linux-next on its own branch, not in
        mine, it just got merged into here yesterday and missed linux-next of
        today"
      
      * tag 'usb-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: gadget: u_ether: fix unsafe list iteration
        USB: omap_udc: fix rejection of out transfers when DMA is used
        USB: omap_udc: fix USB gadget functionality on Palm Tungsten E
        USB: omap_udc: fix omap_udc_start() on 15xx machines
        USB: omap_udc: fix crashes on probe error and module removal
        USB: omap_udc: use devm_request_irq()
        usb: core: quirks: add RESET_RESUME quirk for Cherry G230 Stream series
        USB: usb-storage: Add new IDs to ums-realtek
        Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid"
        phy: qcom-qusb2: Fix HSTX_TRIM tuning with fused value for SDM845
        phy: qcom-qusb2: Use HSTX_TRIM fused value as is
        dt-bindings: phy-qcom-qmp: Fix several mistakes from prior commits
        phy: uniphier-pcie: Depend on HAS_IOMEM
      40ebba2a
    • Linus Torvalds's avatar
      Merge tag 'mtd/fixes-for-4.20-rc5' of git://git.infradead.org/linux-mtd · da59f180
      Linus Torvalds authored
      Pull mtd fixes from Boris Brezillon:
       "NAND fix:
         - Fix BBT cache allocation done in nanddev_bbt_init()
      
        SPI NOR fixes:
         - Fix the erase type selection logic"
      
      * tag 'mtd/fixes-for-4.20-rc5' of git://git.infradead.org/linux-mtd:
        mtd: nand: Fix memory allocation in nanddev_bbt_init()
        mtd: spi-nor: fix erase_type array to indicate current map conf
      da59f180
    • Linus Torvalds's avatar
      test_hexdump: use memcpy instead of strncpy · b1286ed7
      Linus Torvalds authored
      New versions of gcc reasonably warn about the odd pattern of
      
      	strncpy(p, q, strlen(q));
      
      which really doesn't make sense: the strncpy() ends up being just a slow
      and odd way to write memcpy() in this case.
      
      Apparently there was a patch for this floating around earlier, but it
      got lost.
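
      A minimal sketch of the replacement pattern follows. The helper name
      copy_without_nul() is made up for this example and is not from
      test_hexdump. It copies exactly strlen(src) bytes and writes no
      terminating NUL, which is the behaviour the strncpy() call above had.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy the characters of src into dst without a terminating NUL.
 * The caller must make sure dst has room for strlen(src) bytes.
 * Returns the number of bytes copied.
 */
size_t copy_without_nul(char *dst, const char *src)
{
    size_t len = strlen(src);

    assert(dst != NULL);
    memcpy(dst, src, len);   /* bytes past dst[len - 1] are left untouched */
    return len;
}
```

      Because nothing is written past the copied characters, the rest of the
      destination buffer keeps its previous contents.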
      
      Acked-again-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b1286ed7
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1ec63573
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes:
      
         - MCE related boot crash fix on certain AMD systems
      
         - FPU exception handling fix
      
         - FPU handling race fix
      
         - revert+rewrite of the RSDP boot protocol extension, use boot_params
           instead
      
         - documentation fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/MCE/AMD: Fix the thresholding machinery initialization order
        x86/fpu: Use the correct exception table macro in the XSTATE_OP wrapper
        x86/fpu: Disable bottom halves while loading FPU registers
        x86/acpi, x86/boot: Take RSDP address from boot params if available
        x86/boot: Mostly revert commit ae7e1238 ("Add ACPI RSDP address to setup_header")
        x86/ptrace: Fix documentation for tracehook_report_syscall_entry()
      1ec63573
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a1b3cf6d
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Misc fixes:
      
         - counter freezing related regression fix
      
         - uprobes race fix
      
         - Intel PMU unusual event combination fix
      
         - .. and diverse tooling fixes"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        uprobes: Fix handle_swbp() vs. unregister() + register() race once more
        perf/x86/intel: Disallow precise_ip on BTS events
        perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts()
        perf/x86/intel: Move branch tracing setup to the Intel-specific source file
        perf/x86/intel: Fix regression by default disabling perfmon v4 interrupt handling
        perf tools beauty ioctl: Support new ISO7816 commands
        tools uapi asm-generic: Synchronize ioctls.h
        tools arch x86: Update tools's copy of cpufeatures.h
        tools headers uapi: Synchronize i915_drm.h
        perf tools: Restore proper cwd on return from mnt namespace
        tools build feature: Check if get_current_dir_name() is available
        perf tools: Fix crash on synthesizing the unit
      a1b3cf6d
    • Linus Torvalds's avatar
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8d9f412d
      Linus Torvalds authored
      Pull EFI fix from Ingo Molnar:
       "An arm64 warning fix"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: Prevent GICv3 WARN() by mapping the memreserve table before first use
      8d9f412d
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 575d7d0d
      Linus Torvalds authored
      Pull objtool fixes from Ingo Molnar:
       "Two fixes for boundary conditions"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool: Fix segfault in .cold detection with -ffunction-sections
        objtool: Fix double-free in .cold detection error path
      575d7d0d
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 5f1ca5c6
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "Assorted fixes all over the place.
      
        The iov_iter one is this cycle's regression (splice from UDP
        triggering WARN_ON()), the rest is older"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        afs: Use d_instantiate() rather than d_add() and don't d_drop()
        afs: Fix missing net error handling
        afs: Fix validation/callback interaction
        iov_iter: teach csum_and_copy_to_iter() to handle pipe-backed ones
        exportfs: do not read dentry after free
        exportfs: fix 'passing zero to ERR_PTR()' warning
        aio: fix failure to put the file pointer
        sysv: return 'err' instead of 0 in __sysv_write_inode
      5f1ca5c6
    • Linus Torvalds's avatar
      Merge tag 'trace-v4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 49afe661
      Linus Torvalds authored
      Pull more tracing fixes from Steven Rostedt:
       "Two more fixes:
      
         - Change idx variable in DO_TRACE macro to __idx to avoid name
           conflicts. A kvm event had "idx" as a parameter and it confused the
           macro.
      
         - Fix a race where interrupts would be traced when set_graph_function
           was set. The previous patch set increased a race window that
           tricked the function graph tracer to think it should trace
           interrupts when it really should not have.
      
           The bug has been there before, but was seldom hit. Only the last
           patch series made it more common"
      
      * tag 'trace-v4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing/fgraph: Fix set_graph_function from showing interrupts
        tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique
      49afe661
    • Linus Torvalds's avatar
      Merge tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 0f1f6923
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "While rewriting the function graph tracer, I discovered a design flaw
        that was introduced by a patch that tried to fix one bug, but by doing
        so created another bug.
      
        As both bugs corrupt the output (but do not crash the kernel), I
        decided to fix the design so that both bugs could be fixed. The
        original fix corrected the time reporting of the function graph
        tracer when using a max_depth of one, a setting that can be used to
        test how much the kernel interferes with userspace. But in doing so,
        it could corrupt the time keeping of the function profiler.
      
        The issue is that the curr_ret_stack variable was being used for two
        different meanings. One was to keep track of the stack pointer on the
        ret_stack (shadow stack used by the function graph tracer), and the
        other use case was the graph call depth. Although the two may be
        closely related, where they got updated was the issue that led to the
        two different bugs, which required the two use cases to be updated
        differently.
      
        The big issue with this fix is that it requires changing each
        architecture. The good news is, I was able to remove a lot of code
        that was duplicated within the architectures and place it into a
        single location. Then I could make the fix in one place.
      
        I pushed this code into linux-next to let it settle over a week, and
        before doing so, I cross compiled all the affected architectures to
        make sure that they built fine.
      
        In the meantime, I also pulled in a patch that fixes the sched_switch
        previous task's state output, which was not actually correct"
      
      * tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        sched, trace: Fix prev_state output in sched_switch tracepoint
        function_graph: Have profiler use curr_ret_stack and not depth
        function_graph: Reverse the order of pushing the ret_stack and the callback
        function_graph: Move return callback before update of curr_ret_stack
        function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack
        function_graph: Make ftrace_push_return_trace() static
        sparc/function_graph: Simplify with function_graph_enter()
        sh/function_graph: Simplify with function_graph_enter()
        s390/function_graph: Simplify with function_graph_enter()
        riscv/function_graph: Simplify with function_graph_enter()
        powerpc/function_graph: Simplify with function_graph_enter()
        parisc: function_graph: Simplify with function_graph_enter()
        nds32: function_graph: Simplify with function_graph_enter()
        MIPS: function_graph: Simplify with function_graph_enter()
        microblaze: function_graph: Simplify with function_graph_enter()
        arm64: function_graph: Simplify with function_graph_enter()
        ARM: function_graph: Simplify with function_graph_enter()
        x86/function_graph: Simplify with function_graph_enter()
        function_graph: Create function_graph_enter() to consolidate architecture code
      0f1f6923
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2018-11-30' of git://anongit.freedesktop.org/drm/drm · 570a3743
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "This week's instalment of fixes. Looks fairly like business as usual
        and everything seems to be rolling along. There was one MST fix applied
        and reverted in the misc tree, but otherwise nothing too strange in
        here.
      
        core:
         - incorrect master setting on error fix
      
        i915:
         - only GVT fixes this week:
            * one MOCS register load
            * rpm lock fix
            * use after free
      
        rcar-du:
         - regression fix for group start
      
        amdgpu:
         - DP MST fix
         - GPUVM fix for huge pages
         - RLC fix for vega20
      
        ast:
         - fix EDID reading stability
         - ioreg free fix
      
        meson:
         - sleep in irq fix
         - vblank fixes
         - array boundary fix"
      
      * tag 'drm-fixes-2018-11-30' of git://anongit.freedesktop.org/drm/drm:
        drm/ast: fixed reading monitor EDID not stable issue
        drm/ast: Fix incorrect free on ioregs
        Revert "drm/dp_mst: Skip validating ports during destruction, just ref"
        drm/amdgpu: Add delay after enable RLC ucode
        drm/amdgpu: Avoid endless loop in GPUVM fragment processing
        drm/amdgpu: Cast to uint64_t before left shift
        drm/meson: add support for 1080p25 mode
        drm/meson: Fix OOB memory accesses in meson_viu_set_osd_lut()
        drm/meson: Enable fast_io in meson_dw_hdmi_regmap_config
        drm/meson: Fixes for drm_crtc_vblank_on/off support
        drm: set is_master to 0 upon drm_new_set_master() failure
        drm/dp_mst: Skip validating ports during destruction, just ref
        drm: rcar-du: Fix DU3 start/stop on M3-N
        drm/amd/dm: Understand why attaching path/tile properties are needed
        drm/amd/dm: Don't forget to attach MST encoders
        drm/i915/gvt: Avoid use-after-free iterating the gtt list
        drm/i915/gvt: ensure gpu is powered before do i915_gem_gtt_insert
        drm/i915/gvt: not to touch undefined MOCS registers
      570a3743
    • Linus Torvalds's avatar
      Merge tag 'pstore-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · e9eaf72e
      Linus Torvalds authored
      Pull pstore fix from Kees Cook:
       "Fix corrupted compression due to unlucky size choice with ECC"
      
      * tag 'pstore-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        pstore/ram: Correctly calculate usable PRZ bytes
      e9eaf72e
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 2b17992f
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "This is a bit later than usual for our first -rc but I'm not seeing
        anything worrisome in the RDMA tree right now. Quiet so far this -rc
        cycle, only a few internal driver related bugs and a small series
        fixing ODP bugs found by more advanced testing.
      
        A set of small driver and core code fixes:
      
         - Small series fixing longtime user triggerable bugs in the ODP
           processing inside mlx5 and core code
      
         - Various small driver malfunctions and crashes (use after free,
           error unwind, implementation bugs)
      
         - A malfunction of the RDMA GID cache that can be triggered by the
           administrator"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/mlx5: Initialize return variable in case pagefault was skipped
        IB/mlx5: Fix page fault handling for MW
        IB/umem: Set correct address to the invalidation function
        IB/mlx5: Skip non-ODP MR when handling a page fault
        RDMA/hns: Bugfix pbl configuration for rereg mr
        iser: set sector for ambiguous mr status errors
        RDMA/rdmavt: Fix rvt_create_ah function signature
        IB/mlx5: Avoid load failure due to unknown link width
        IB/mlx5: Fix XRC QP support after introducing extended atomic
        RDMA/bnxt_re: Avoid accessing the device structure after it is freed
        RDMA/bnxt_re: Fix system hang when registration with L2 driver fails
        RDMA/core: Add GIDs while changing MAC addr only for registered ndev
        RDMA/mlx5: Fix fence type for IB_WR_LOCAL_INV WR
        net/mlx5: Fix XRC SRQ umem valid bits
      2b17992f
    • Steven Rostedt (VMware)'s avatar
      tracing/fgraph: Fix set_graph_function from showing interrupts · 5cf99a0f
      Steven Rostedt (VMware) authored
      The tracefs file set_graph_function is used to limit function graph
      tracing to the functions listed in that file (or to all functions if the
      file is empty). The way this is implemented is that the function graph
      tracer looks at every function, and if the current depth is zero and the
      function matches something in the file then it will trace that function.
      When other functions are called, the depth will be greater than zero
      (because the original function will be at depth zero), and all functions
      will be traced where the depth is greater than zero.
      
      The issue is that when a function is first entered, and the handler that
      checks this logic is called, the depth is set to zero. If an interrupt comes
      in and a function in the interrupt handler is traced, its depth will be
      greater than zero and it will automatically be traced, even if the original
      function was not. Because the logic only looks at depth, it may trace
      interrupts when it should not.
      
      The recent design change of the function graph tracer to fix other bugs
      caused the depth to be zero for a longer time while the function graph
      callback handler is being called, widening the race window. This bug
      has actually been there for longer, but because the race window was so
      small it seldom happened. The Fixes tag below is for the commit that
      widened the race window, because that commit belongs to a series that
      will also help fix the original bug.
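
      The race described above can be pictured with a small user-space model.
      This is only a sketch under simplifying assumptions: toy_task,
      toy_filter() and toy_enter() are hypothetical names, the interrupt is
      simulated inline, and none of this is the tracer's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-task state: the depth the next entered function will get. */
struct toy_task {
    int next_depth;
};

/*
 * Depth-only decision as described in the text: trace on a filter match
 * at depth zero, and trace everything at a depth greater than zero.
 */
bool toy_filter(int depth, bool in_filter_file)
{
    return depth > 0 || in_filter_file;
}

/*
 * Enter a function. The depth is bumped before the filter decision is
 * finished. If irq_func_traced is non-NULL, an interrupt is simulated in
 * that window and the result for its (unlisted) function is stored there.
 * Returns whether the entered function itself is traced.
 */
bool toy_enter(struct toy_task *t, bool in_filter_file, bool *irq_func_traced)
{
    int depth = t->next_depth++;
    bool traced;

    if (irq_func_traced != NULL) {
        /* Simulated interrupt: its function is not in the filter file. */
        int irq_depth = t->next_depth++;

        *irq_func_traced = toy_filter(irq_depth, false);
        t->next_depth--;         /* interrupt function returns */
    }

    traced = toy_filter(depth, in_filter_file);
    if (!traced)
        t->next_depth--;         /* not traced: undo the depth bump */
    return traced;
}
```

      In this model a function that is not in the filter file is correctly
      left untraced, yet a function run from an interrupt arriving in the
      window sees a non-zero depth and is traced anyway.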
      
      Cc: stable@kernel.org
      Fixes: 39eb456d ("function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack")
      Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
      Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      5cf99a0f
    • Zenghui Yu's avatar
      tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique · 0c7a52e4
      Zenghui Yu authored
      After enabling KVM event tracing, almost all of trace_kvm_exit()'s
      printk output shows
      
      	"kvm_exit: IRQ: ..."
      
      even if the actual exception_type is NOT IRQ.  More specifically,
      trace_kvm_exit() is defined in virt/kvm/arm/trace.h by TRACE_EVENT.
      
      This slight problem may have existed since commit e6753f23
      ("tracepoint: Make rcuidle tracepoint callers use SRCU"). There are
      two variables in trace_kvm_exit() and __DO_TRACE() which have the
      same name, *idx*. Thus the actual value of *idx* will be overwritten
      when tracing. Fix it by adding a simple prefix.
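
      The name clash can be reproduced with a much smaller macro. The sketch
      below is not the kernel's __DO_TRACE(); toy_probe() and the two
      TOY_DO_TRACE macros are hypothetical and only show how a macro-local
      variable captures an argument that has the same name.

```c
#include <assert.h>

int toy_last_recorded = -1;

/* Hypothetical probe that records the value it is handed. */
void toy_probe(int value)
{
    toy_last_recorded = value;
}

/* The macro's own local is named idx, so "idx" inside args binds to it. */
#define TOY_DO_TRACE_CLASH(args)    \
    do {                            \
        int idx = 0;                \
        (void)idx;                  \
        toy_probe(args);            \
    } while (0)

/*
 * Same macro with the local renamed, so "idx" in args is the caller's.
 * The __ prefix mirrors the commit; plain user-space code would normally
 * avoid such reserved names.
 */
#define TOY_DO_TRACE_FIXED(args)    \
    do {                            \
        int __idx = 0;              \
        (void)__idx;                \
        toy_probe(args);            \
    } while (0)

/* Caller whose parameter happens to be called idx. */
int toy_trace_clash(int idx)
{
    TOY_DO_TRACE_CLASH(idx);
    return toy_last_recorded;
}

int toy_trace_fixed(int idx)
{
    TOY_DO_TRACE_FIXED(idx);
    return toy_last_recorded;
}
```

      Called with 7, toy_trace_clash() records the macro's own zero while
      toy_trace_fixed() records the caller's 7.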
      
      Cc: Joel Fernandes <joel@joelfernandes.org>
      Cc: Wang Haibin <wanghaibin.wang@huawei.com>
      Cc: linux-trace-devel@vger.kernel.org
      Cc: stable@vger.kernel.org
      Fixes: e6753f23 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
      Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      0c7a52e4