1. 18 Nov, 2018 3 commits
    • Mike Kravetz's avatar
      hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444! · 5e41540c
      Mike Kravetz authored
      This bug has been experienced several times by the Oracle DB team.  The
      BUG is in remove_inode_hugepages() as follows:
      
      	/*
      	 * If page is mapped, it was faulted in after being
      	 * unmapped in caller.  Unmap (again) now after taking
      	 * the fault mutex.  The mutex will prevent faults
      	 * until we finish removing the page.
      	 *
      	 * This race can only happen in the hole punch case.
      	 * Getting here in a truncate operation is a bug.
      	 */
      	if (unlikely(page_mapped(page))) {
      		BUG_ON(truncate_op);
      
      In this case, the elevated map count is not the result of a race.
      Rather it was incorrectly incremented as the result of a bug in the huge
      pmd sharing code.  Consider the following:
      
       - Process A maps a hugetlbfs file of sufficient size and alignment
         (PUD_SIZE) that a pmd page could be shared.
      
       - Process B maps the same hugetlbfs file with the same size and
         alignment such that a pmd page is shared.
      
       - Process B then calls mprotect() to change protections for the mapping
         with the shared pmd. As a result, the pmd is 'unshared'.
      
       - Process B then calls mprotect() again to chage protections for the
         mapping back to their original value. pmd remains unshared.
      
       - Process B then forks and process C is created. During the fork
         process, we do dup_mm -> dup_mmap -> copy_page_range to copy page
         tables. Copying page tables for hugetlb mappings is done in the
         routine copy_hugetlb_page_range.
      
      In copy_hugetlb_page_range(), the destination pte is obtained by:
      
      	dst_pte = huge_pte_alloc(dst, addr, sz);
      
      If pmd sharing is possible, the returned pointer will be to a pte in an
      existing page table.  In the situation above, process C could share with
      either process A or process B.  Since process A is first in the list,
      the returned pte is a pointer to a pte in process A's page table.
      
      However, the check for pmd sharing in copy_hugetlb_page_range is:
      
      	/* If the pagetables are shared don't copy or take references */
      	if (dst_pte == src_pte)
      		continue;
      
      Since process C is sharing with process A instead of process B, the
      above test fails.  The code in copy_hugetlb_page_range which follows
      assumes dst_pte points to a huge_pte_none pte.  It copies the pte entry
      from src_pte to dst_pte and increments this map count of the associated
      page.  This is how we end up with an elevated map count.
      
      To solve, check the dst_pte entry for huge_pte_none.  If !none, this
      implies PMD sharing so do not copy.
      
      Link: http://lkml.kernel.org/r/20181105212315.14125-1-mike.kravetz@oracle.com
      Fixes: c5c99429 ("fix hugepages leak due to pagetable page sharing")
      Signed-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5e41540c
    • Olof Johansson's avatar
      kernel/sched/psi.c: simplify cgroup_move_task() · 8fcb2312
      Olof Johansson authored
      The existing code triggered an invalid warning about 'rq' possibly being
      used uninitialized.  Instead of doing the silly warning suppression by
      initializa it to NULL, refactor the code to bail out early instead.
      
      Warning was:
      
        kernel/sched/psi.c: In function `cgroup_move_task':
        kernel/sched/psi.c:639:13: warning: `rq' may be used uninitialized in this function [-Wmaybe-uninitialized]
      
      Link: http://lkml.kernel.org/r/20181103183339.8669-1-olof@lixom.net
      Fixes: 2ce7135a ("psi: cgroup support")
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Acked-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8fcb2312
    • Vitaly Wool's avatar
      z3fold: fix possible reclaim races · ca0246bb
      Vitaly Wool authored
      Reclaim and free can race on an object which is basically fine but in
      order for reclaim to be able to map "freed" object we need to encode
      object length in the handle.  handle_to_chunks() is then introduced to
      extract object length from a handle and use it during mapping.
      
      Moreover, to avoid racing on a z3fold "headless" page release, we should
      not try to free that page in z3fold_free() if the reclaim bit is set.
      Also, in the unlikely case of trying to reclaim a page being freed, we
      should not proceed with that page.
      
      While at it, fix the page accounting in reclaim function.
      
      This patch supersedes "[PATCH] z3fold: fix reclaim lock-ups".
      
      Link: http://lkml.kernel.org/r/20181105162225.74e8837d03583a9b707cf559@gmail.comSigned-off-by: default avatarVitaly Wool <vitaly.vul@sony.com>
      Signed-off-by: default avatarJongseok Kim <ks77sj@gmail.com>
      Reported-by-by: default avatarJongseok Kim <ks77sj@gmail.com>
      Reviewed-by: default avatarSnild Dolkow <snild@sony.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ca0246bb
  2. 16 Nov, 2018 9 commits
    • Linus Torvalds's avatar
      Merge tag 'fsnotify_for_v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 1ce80e0f
      Linus Torvalds authored
      Pull fsnotify fix from Jan Kara:
       "One small fsnotify fix for duplicate events"
      
      * tag 'fsnotify_for_v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        fanotify: fix handling of events on child sub-directory
      1ce80e0f
    • Linus Torvalds's avatar
      Merge tag 'gfs2-4.20.fixes3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · e6a2562f
      Linus Torvalds authored
      Pull bfs2 fixes from Andreas Gruenbacher:
       "Fix two bugs leading to leaked buffer head references:
      
         - gfs2: Put bitmap buffers in put_super
         - gfs2: Fix iomap buffer head reference counting bug
      
        And one bug leading to significant slow-downs when deleting large
        files:
      
         - gfs2: Fix metadata read-ahead during truncate (2)"
      
      * tag 'gfs2-4.20.fixes3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
        gfs2: Fix iomap buffer head reference counting bug
        gfs2: Fix metadata read-ahead during truncate (2)
        gfs2: Put bitmap buffers in put_super
      e6a2562f
    • Andreas Gruenbacher's avatar
      gfs2: Fix iomap buffer head reference counting bug · c26b5aa8
      Andreas Gruenbacher authored
      GFS2 passes the inode buffer head (dibh) from gfs2_iomap_begin to
      gfs2_iomap_end in iomap->private.  It sets that private pointer in
      gfs2_iomap_get.  Users of gfs2_iomap_get other than gfs2_iomap_begin
      would have to release iomap->private, but this isn't done correctly,
      leading to a leak of buffer head references.
      
      To fix this, move the code for setting iomap->private from
      gfs2_iomap_get to gfs2_iomap_begin.
      
      Fixes: 64bc06bb ("gfs2: iomap buffered write support")
      Cc: stable@vger.kernel.org # v4.19+
      Signed-off-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c26b5aa8
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 32e2524a
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "This fixes the following issues:
      
         - Potential memory overwrite in simd
      
         - Kernel info leaks in crypto_user
      
         - NULL dereference and use-after-free in hisilicon"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: user - Zeroize whole structure given to user space
        crypto: user - fix leaking uninitialized memory to userspace
        crypto: simd - correctly take reqsize of wrapped skcipher into account
        crypto: hisilicon - Fix reference after free of memories on error path
        crypto: hisilicon - Fix NULL dereference for same dst and src
      32e2524a
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2018-11-16' of git://anongit.freedesktop.org/drm/drm · 4efd3460
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Live from Vancouver, SoC maintainer talk, this weeks drm fixes pull
        for rc3:
      
        omapdrm:
         - regression fixes for the reordering bridge stuff that went into rc1
      
        i915:
         - incorrect EU count fix
         - HPD storm fix
         - MST fix
         - relocation fix for gen4/5
      
        amdgpu:
         - huge page handling fix
         - IH ring setup
         - XGMI aperture setup
         - watermark setup fix
      
        misc:
         - docs and MST fix"
      
      * tag 'drm-fixes-2018-11-16' of git://anongit.freedesktop.org/drm/drm: (23 commits)
        drm/i915: Account for scale factor when calculating initial phase
        drm/i915: Clean up skl_program_scaler()
        drm/i915: Move programming plane scaler to its own function.
        drm/i915/icl: Drop spurious register read from icl_dbuf_slices_update
        drm/i915: fix broadwell EU computation
        drm/amdgpu: fix huge page handling on Vega10
        drm/amd/pp: Fix truncated clock value when set watermark
        drm/amdgpu: fix bug with IH ring setup
        drm/meson: venc: dmt mode must use encp
        drm/amdgpu: set system aperture to cover whole FB region
        drm/i915: Fix hpd handling for pins with two encoders
        drm/i915/execlists: Force write serialisation into context image vs execution
        drm/i915/icl: Fix power well 2 wrt. DC-off toggling order
        drm/i915: Fix NULL deref when re-enabling HPD IRQs on systems with MST
        drm/i915: Fix possible race in intel_dp_add_mst_connector()
        drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5
        drm/omap: dsi: Fix missing of_platform_depopulate()
        drm/omap: Move DISPC runtime PM handling to omapdrm
        drm/omap: dsi: Ensure the device is active during probe
        drm/omap: hdmi4: Ensure the device is active during bind
        ...
      4efd3460
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · ef268de1
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Two weeks worth of fixes since rc1.
      
         - I broke 16-byte alignment of the stack when we moved PPR into
           pt_regs. Despite being required by the ABI this broke almost
           nothing, we eventually hit it in code where GCC does arithmetic on
           the stack pointer assuming the bottom 4 bits are clear. Fix it by
           padding the in-kernel pt_regs by 8 bytes.
      
         - A couple of commits fixing minor bugs in the recent SLB rewrite.
      
         - A build fix related to tracepoints in KVM in some configurations.
      
         - Our old "IO workarounds" code written for Cell couldn't coexist in
           a kernel that runs on Power9 with the Radix MMU, fix that.
      
         - Remove the NPU DMA ops, these just printed a warning and should
           never have been called.
      
         - Suppress an overly chatty message triggered by CPU hotplug in some
           configs.
      
         - Two small selftest fixes.
      
        Thanks to: Alistair Popple, Gustavo Romero, Nicholas Piggin, Satheesh
        Rajendran, Scott Wood"
      
      * tag 'powerpc-4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        selftests/powerpc: Adjust wild_bctr to build with old binutils
        powerpc/64: Fix kernel stack 16-byte alignment
        powerpc/numa: Suppress "VPHN is not supported" messages
        selftests/powerpc: Fix wild_bctr test to work on ppc64
        powerpc/io: Fix the IO workarounds code to work with Radix
        powerpc/mm/64s: Fix preempt warning in slb_allocate_kernel()
        KVM: PPC: Move and undef TRACE_INCLUDE_PATH/FILE
        powerpc/mm/64s: Only use slbfee on CPUs that support it
        powerpc/mm/64s: Use PPC_SLBFEE macro
        powerpc/mm/64s: Consolidate SLB assertions
        powerpc/powernv/npu: Remove NPU DMA ops
      ef268de1
    • Linus Torvalds's avatar
      Merge tag 'xtensa-20181115' of git://github.com/jcmvbkbc/linux-xtensa · 50d25bdc
      Linus Torvalds authored
      Pull Xtensa fixes from Max Filippov:
      
       - fix stack alignment for bFLT binaries.
      
       - fix physical-to-virtual address translation for boot parameters in
         MMUv3 256+256 and 512+512 virtual memory layouts.
      
      * tag 'xtensa-20181115' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: fix boot parameters address translation
        xtensa: make sure bFLT stack is 16 byte aligned
      50d25bdc
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20181115' of git://git.kernel.dk/linux-block · 59749c2d
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Discard loop fix, caused by integer overflow (Dave)
      
       - Blacklist of Samsung drive that hangs with power management (Diego)
      
       - Copy bio priority when cloning it (Hannes)
      
       - Fix race condition exposed in floppy (me)
      
       - Fix SCSI queue cleanup regression. While elusive, it caused oopses in
         queue running (Ming)
      
       - Fix bad string copy in kyber tracing (Omar)
      
      * tag 'for-linus-20181115' of git://git.kernel.dk/linux-block:
        SCSI: fix queue cleanup race before queue initialization is done
        block: fix 32 bit overflow in __blkdev_issue_discard()
        libata: blacklist SAMSUNG MZ7TD256HAFV-000L9 SSD
        block: copy ioprio in __bio_clone_fast() and bounce
        kyber: fix wrong strlcpy() size in trace_kyber_latency()
        floppy: fix race condition in __floppy_read_block_0()
      59749c2d
    • Linus Torvalds's avatar
      Merge tag 'fuse-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 9b5f361a
      Linus Torvalds authored
      Pull fuse fixes from Miklos Szeredi:
       "A couple of fixes, all bound for -stable (i.e. not regressions in this
        cycle)"
      
      * tag 'fuse-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: fix use-after-free in fuse_direct_IO()
        fuse: fix possibly missed wake-up after abort
        fuse: fix leaked notify reply
      9b5f361a
  3. 15 Nov, 2018 11 commits
  4. 14 Nov, 2018 17 commits