1. 24 Dec, 2018 19 commits
  2. 23 Dec, 2018 2 commits
  3. 22 Dec, 2018 5 commits
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 9105b8aa
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is two simple target fixes and one discard related I/O starvation
        problem in sd.
      
        The discard problem occurs because the discard page doesn't have a
        mempool backing so if the allocation fails due to memory pressure, we
        then lose the forward progress we require if the writeout is on the
        same device. The fix is to back it with a mempool"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: sd: use mempool for discard special page
        scsi: target: iscsi: cxgbit: add missing spin_lock_init()
        scsi: target: iscsi: cxgbit: fix csk leak
    • Merge tag 'compiler-attributes-for-linus-v4.20' of https://github.com/ojeda/linux · 1104bd96
      Linus Torvalds authored
      Pull compiler_types.h fix from Miguel Ojeda:
       "A cleanup for userspace in compiler_types.h: don't pollute userspace
        with macro definitions (Xiaozhou Liu)
      
        This is harmless for the kernel, but v4.19 was released with a few
        macros exposed to userspace as the patch explains; which this removes,
        so it *could* happen that we break something for someone (although
        leaving inline redefined is probably worse)"
      
      * tag 'compiler-attributes-for-linus-v4.20' of https://github.com/ojeda/linux:
        include/linux/compiler_types.h: don't pollute userspace with macro definitions
    • Merge tag 'auxdisplay-for-linus-v4.20' of https://github.com/ojeda/linux · 38c0ecf6
      Linus Torvalds authored
      Pull auxdisplay fix from Miguel Ojeda:
       "charlcd: fix x/y command parsing (Mans Rullgard)"
      
      * tag 'auxdisplay-for-linus-v4.20' of https://github.com/ojeda/linux:
        auxdisplay: charlcd: fix x/y command parsing
    • Revert "vfs: Allow userns root to call mknod on owned filesystems." · 94f82008
      Christian Brauner authored
      This reverts commit 55956b59.
      
      commit 55956b59 ("vfs: Allow userns root to call mknod on owned filesystems.")
      enabled mknod() in user namespaces for userns root if CAP_MKNOD is
      available. However, these device nodes are useless since any filesystem
      mounted from a non-initial user namespace will set the SB_I_NODEV flag on
      the filesystem. Now, when a device node is created in a non-initial user
      namespace, a call to open() on said device node will fail due to:
      
      bool may_open_dev(const struct path *path)
      {
              return !(path->mnt->mnt_flags & MNT_NODEV) &&
                      !(path->mnt->mnt_sb->s_iflags & SB_I_NODEV);
      }
      
      The problem with this is that as of the aforementioned commit mknod()
      creates partially functional device nodes in non-initial user namespaces.
      In particular, it has the consequence that as of the aforementioned commit
      open() will be more privileged with respect to device nodes than mknod().
      Before it was the other way around. Specifically, if mknod() succeeded
      then it was transparent for any userspace application that a fatal error
      must have occurred when open() failed.
      
      All of this breaks multiple userspace workloads and a widespread assumption
      about how to handle mknod(). Basically, all container runtimes and systemd
      live by the slogan "ask for forgiveness not permission" when running user
      namespace workloads. For mknod() the assumption is that if the syscall
      succeeds the device nodes are useable irrespective of whether it succeeds
      in a non-initial user namespace or not. This logic was chosen explicitly
      to allow for the glorious day when mknod() will actually be able to create
      fully functional device nodes in user namespaces.
      A specific problem people are already running into when running 4.18 rc
      kernels is failing systemd services. For any distro that is run in a
      container systemd services started with the PrivateDevices= property set
      will fail to start since the device nodes in question cannot be
      opened (cf. the arguments in [1]).
      
      Full disclosure, Seth made the very sound argument that it is already
      possible to end up with partially functional device nodes. Any filesystem
      mounted with MS_NODEV set will allow mknod() to succeed but will not allow
      open() to succeed. The difference to the case here is that the MS_NODEV
      case is transparent to userspace since it is an explicitly set mount option
      while the SB_I_NODEV case is an implicit property enforced by the kernel
      and hence opaque to userspace.
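The two ways a device node can end up unopenable can be modeled in userspace. This is a minimal sketch, not kernel code: the struct layouts and flag values are simplified stand-ins chosen for illustration, and dev_openable() is a hypothetical helper. Only the check itself mirrors the may_open_dev() snippet quoted above.

```c
#include <stdbool.h>

/* Illustrative flag values only; the real kernel constants differ. */
#define MNT_NODEV   0x02u
#define SB_I_NODEV  0x04u

/* Simplified stand-ins for the kernel's superblock, mount and path types. */
struct super_block { unsigned int s_iflags; };
struct vfsmount    { unsigned int mnt_flags; struct super_block *mnt_sb; };
struct path        { struct vfsmount *mnt; };

/* Same logic as the quoted check: both flags must be clear. */
static bool may_open_dev(const struct path *path)
{
        return !(path->mnt->mnt_flags & MNT_NODEV) &&
               !(path->mnt->mnt_sb->s_iflags & SB_I_NODEV);
}

/*
 * Hypothetical helper: builds a path on a mount with the given flags and
 * reports whether a device node on it could be opened.
 */
static bool dev_openable(unsigned int mnt_flags, unsigned int sb_iflags)
{
        struct super_block sb = { .s_iflags = sb_iflags };
        struct vfsmount mnt = { .mnt_flags = mnt_flags, .mnt_sb = &sb };
        struct path p = { .mnt = &mnt };

        return may_open_dev(&p);
}
```

In this model the explicit mount option and the implicit superblock flag block open() in exactly the same way, which is why userspace cannot tell the two cases apart from the open() failure alone.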
      
      [1]: https://github.com/systemd/systemd/pull/9483

      Signed-off-by: Christian Brauner <christian@brauner.io>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Seth Forshee <seth.forshee@canonical.com>
      Cc: Serge Hallyn <serge@hallyn.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • dma-mapping: fix flags in dma_alloc_wc · 0cd60eb1
      Christoph Hellwig authored
      We really need the writecombine flag in dma_alloc_wc, fix a stupid
      oversight.
      
      Fixes: 7ed1d91a ("dma-mapping: translate __GFP_NOFAIL to DMA_ATTR_NO_WARN")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 21 Dec, 2018 14 commits
    • Merge branch 'akpm' (patches from Andrew) · 23203e3f
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "4 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm, page_alloc: fix has_unmovable_pages for HugePages
        fork,memcg: fix crash in free_thread_stack on memcg charge fail
        mm: thp: fix flags for pmd migration when split
        mm, memory_hotplug: initialize struct pages for the full memory section
    • mm, page_alloc: fix has_unmovable_pages for HugePages · 17e2e7d7
      Oscar Salvador authored
      While playing with gigantic hugepages and memory_hotplug, I triggered
      the following #PF when running "cat memoryX/removable":
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
        #PF error: [normal kernel read fault]
        PGD 0 P4D 0
        Oops: 0000 [#1] SMP PTI
        CPU: 1 PID: 1481 Comm: cat Tainted: G            E     4.20.0-rc6-mm1-1-default+ #18
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
        RIP: 0010:has_unmovable_pages+0x154/0x210
        Call Trace:
         is_mem_section_removable+0x7d/0x100
         removable_show+0x90/0xb0
         dev_attr_show+0x1c/0x50
         sysfs_kf_seq_show+0xca/0x1b0
         seq_read+0x133/0x380
         __vfs_read+0x26/0x180
         vfs_read+0x89/0x140
         ksys_read+0x42/0x90
         do_syscall_64+0x5b/0x180
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The reason is that we do not pass the head page to page_hstate(), so the
      call to compound_order() in page_hstate() returns 0, and we end up
      checking every hstate's size against PAGE_SIZE.

      Obviously, we do not find any hstate matching that size, and we return
      NULL.  Then we dereference that NULL pointer in
      hugepage_migration_supported() and get the #PF shown above.
      
      Fix that by getting the head page before calling page_hstate().
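The failure mode and the fix can be illustrated with a toy userspace model. This is a sketch under stated assumptions, not the kernel implementation: the struct page and struct hstate layouts, the two huge page orders, and the lookup_unfixed()/lookup_fixed() helpers are all simplified or hypothetical. Only the idea (resolve the head page before the lookup) comes from the text.

```c
#include <stddef.h>

#define PAGE_SHIFT 12  /* assumed 4 KiB base pages */

/* Toy page: tail pages point at their head; only the head stores the order. */
struct page {
        struct page *head;     /* points to itself for head pages */
        unsigned int order;    /* meaningful on the head page only, 0 on tails */
};

struct hstate { unsigned int order; };

/* Two assumed huge page sizes: 2 MiB (order 9) and 1 GiB (order 18). */
static struct hstate hstates[] = { { 9 }, { 18 } };

static struct page *compound_head(struct page *page) { return page->head; }

/* Returns the hstate whose size matches the page's order, or NULL. */
static struct hstate *page_hstate(struct page *page)
{
        unsigned long size = (1UL << PAGE_SHIFT) << page->order;
        size_t i;

        for (i = 0; i < sizeof(hstates) / sizeof(hstates[0]); i++)
                if (((1UL << PAGE_SHIFT) << hstates[i].order) == size)
                        return &hstates[i];
        return NULL;
}

/* Unfixed lookup: a tail page reports order 0, so nothing matches. */
static struct hstate *lookup_unfixed(struct page *page)
{
        return page_hstate(page);
}

/* Fixed lookup: resolve the head page first. */
static struct hstate *lookup_fixed(struct page *page)
{
        return page_hstate(compound_head(page));
}
```

In this model, passing a tail page to lookup_unfixed() yields NULL, which is the pointer the real code went on to dereference.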
      
      Also, since gigantic pages span several pageblocks, re-adjust the logic
      for skipping pages.  While at it, we can also get rid of the
      round_up().
      
      [osalvador@suse.de: remove round_up(), adjust skip pages logic per Michal]
        Link: http://lkml.kernel.org/r/20181221062809.31771-1-osalvador@suse.de
      Link: http://lkml.kernel.org/r/20181217225113.17864-1-osalvador@suse.de
      Signed-off-by: Oscar Salvador <osalvador@suse.de>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fork,memcg: fix crash in free_thread_stack on memcg charge fail · 5eed6f1d
      Rik van Riel authored
      Commit 9b6f7e16 ("mm: rework memcg kernel stack accounting") will
      result in fork failing if allocating a kernel stack for a task in
      dup_task_struct exceeds the kernel memory allowance for that cgroup.
      
      Unfortunately, it also results in a crash.
      
      This is due to the code jumping to free_stack and calling
      free_thread_stack when the memcg kernel stack charge fails, but without
      tsk->stack pointing at the freshly allocated stack.
      
      This in turn results in the vfree_atomic in free_thread_stack oopsing
      with a backtrace like this:
      
       #5 [ffffc900244efc88] die at ffffffff8101f0ab
       #6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86
       #7 [ffffc900244efce0] general_protection at ffffffff818ff082
          [exception RIP: llist_add_batch+7]
          RIP: ffffffff8150d487  RSP: ffffc900244efd98  RFLAGS: 00010282
          RAX: 0000000000000000  RBX: ffff88085ef55980  RCX: 0000000000000000
          RDX: ffff88085ef55980  RSI: 343834343531203a  RDI: 343834343531203a
          RBP: ffffc900244efd98   R8: 0000000000000001   R9: ffff8808578c3600
          R10: 0000000000000000  R11: 0000000000000001  R12: ffff88029f6c21c0
          R13: 0000000000000286  R14: ffff880147759b00  R15: 0000000000000000
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
       #8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7
       #9 [ffffc900244efdb8] copy_process at ffffffff81086e37
      #10 [ffffc900244efe98] _do_fork at ffffffff810884e0
      #11 [ffffc900244eff10] sys_vfork at ffffffff810887ff
      #12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43
          RIP: 000000000049b948  RSP: 00007ffcdb307830  RFLAGS: 00000246
          RAX: ffffffffffffffda  RBX: 0000000000896030  RCX: 000000000049b948
          RDX: 0000000000000000  RSI: 00007ffcdb307790  RDI: 00000000005d7421
          RBP: 000000000067370f   R8: 00007ffcdb3077b0   R9: 000000000001ed00
          R10: 0000000000000008  R11: 0000000000000246  R12: 0000000000000040
          R13: 000000000000000f  R14: 0000000000000000  R15: 000000000088d018
          ORIG_RAX: 000000000000003a  CS: 0033  SS: 002b
      
      The simplest fix is to assign tsk->stack right where it is allocated.
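The pattern behind this fix can be sketched in userspace C. The code below is a simplified model under stated assumptions, not the kernel code: struct task, charge_stack(), and dup_task() are hypothetical stand-ins, and malloc()/free() replace the kernel's stack allocator. It shows why recording the pointer at allocation time keeps the error path safe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct task { void *stack; };

/* Hypothetical stand-in for the memcg charge; fails when told to. */
static bool charge_stack(bool should_fail) { return !should_fail; }

/* Cleanup path: frees whatever tsk->stack points at and clears it. */
static void free_thread_stack(struct task *tsk)
{
        free(tsk->stack);
        tsk->stack = NULL;
}

/*
 * Allocation records the stack in the task immediately, so any later
 * error path that calls free_thread_stack() sees the right pointer.
 */
static void *alloc_thread_stack(struct task *tsk, size_t size)
{
        void *stack = malloc(size);

        tsk->stack = stack;
        return stack;
}

/* Returns 0 on success, -1 on failure; the error path frees the new stack. */
static int dup_task(struct task *tsk, bool charge_fails)
{
        if (!alloc_thread_stack(tsk, 4096))
                return -1;
        if (!charge_stack(charge_fails)) {
                free_thread_stack(tsk);
                return -1;
        }
        return 0;
}
```

If the assignment were instead left to the caller after the charge step, the cleanup path would free whatever stale value tsk->stack held, which is the class of bug described above.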
      
      Link: http://lkml.kernel.org/r/20181214231726.7ee4843c@imladris.surriel.com
      Fixes: 9b6f7e16 ("mm: rework memcg kernel stack accounting")
      Signed-off-by: Rik van Riel <riel@surriel.com>
      Acked-by: Roman Gushchin <guro@fb.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: thp: fix flags for pmd migration when split · 2e83ee1d
      Peter Xu authored
      When splitting a huge migrating PMD, we'll transfer all the existing PMD
      bits and apply them again onto the small PTEs.  However, we are fetching
      the bits unconditionally via pmd_soft_dirty(), pmd_write() or
      pmd_young(), while they don't make sense at all when it's a
      migration entry.  Fix them up.  While at it, drop the ifdef as well,
      since it is not needed.
      
      Note that, if my understanding of the problem is correct, without this
      patch there is a chance of losing some of the dirty bits in the
      migrating pmd pages (on x86_64 we're fetching bit 11, which is part of
      the swap offset, instead of bit 2), which could potentially corrupt the
      memory of a userspace program that depends on the dirty bit.
      
      Link: http://lkml.kernel.org/r/20181213051510.20306-1-peterx@redhat.com
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Reviewed-by: William Kucharski <william.kucharski@oracle.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: <stable@vger.kernel.org>	[4.14+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, memory_hotplug: initialize struct pages for the full memory section · 2830bf6f
      Mikhail Zaslonko authored
      If memory end is not aligned with the sparse memory section boundary,
      the mapping of such a section is only partly initialized.  This may lead
      to VM_BUG_ON due to uninitialized struct page access from
      is_mem_section_removable() or test_pages_in_a_zone() function triggered
      by memory_hotplug sysfs handlers:
      
      Here are the panic examples:
       CONFIG_DEBUG_VM=y
       CONFIG_DEBUG_VM_PGFLAGS=y
      
       kernel parameter mem=2050M
       --------------------------
       page:000003d082008000 is uninitialized and poisoned
       page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
       Call Trace:
       ( test_pages_in_a_zone+0xde/0x160)
         show_valid_zones+0x5c/0x190
         dev_attr_show+0x34/0x70
         sysfs_kf_seq_show+0xc8/0x148
         seq_read+0x204/0x480
         __vfs_read+0x32/0x178
         vfs_read+0x82/0x138
         ksys_read+0x5a/0xb0
         system_call+0xdc/0x2d8
       Last Breaking-Event-Address:
         test_pages_in_a_zone+0xde/0x160
       Kernel panic - not syncing: Fatal exception: panic_on_oops
      
       kernel parameter mem=3075M
       --------------------------
       page:000003d08300c000 is uninitialized and poisoned
       page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
       Call Trace:
       ( is_mem_section_removable+0xb4/0x190)
         show_mem_removable+0x9a/0xd8
         dev_attr_show+0x34/0x70
         sysfs_kf_seq_show+0xc8/0x148
         seq_read+0x204/0x480
         __vfs_read+0x32/0x178
         vfs_read+0x82/0x138
         ksys_read+0x5a/0xb0
         system_call+0xdc/0x2d8
       Last Breaking-Event-Address:
         is_mem_section_removable+0xb4/0x190
       Kernel panic - not syncing: Fatal exception: panic_on_oops
      
      Fix the problem by initializing the last memory section of each zone in
      memmap_init_zone() till the very end, even if it goes beyond the zone end.
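The size of the uninitialized tail follows from simple alignment arithmetic. The sketch below assumes 4 KiB pages and 128 MiB memory sections purely for illustration (the real values are architecture-specific), and the helper names are hypothetical rather than kernel APIs.

```c
/* Assumed for illustration: 4 KiB pages and 128 MiB memory sections. */
#define PAGE_SHIFT          12
#define SECTION_SIZE_BITS   27
#define PAGES_PER_SECTION   (1UL << (SECTION_SIZE_BITS - PAGE_SHIFT))

/* Round a page frame number up to the next section boundary. */
static unsigned long section_align_up(unsigned long pfn)
{
        return (pfn + PAGES_PER_SECTION - 1) & ~(PAGES_PER_SECTION - 1);
}

/* Number of struct pages past end_pfn that still belong to its section. */
static unsigned long uninitialized_tail(unsigned long end_pfn)
{
        return section_align_up(end_pfn) - end_pfn;
}

/* Convert a memory size in MiB to a page count. */
static unsigned long mib_to_pages(unsigned long mib)
{
        return mib << (20 - PAGE_SHIFT);
}
```

Under these assumptions, mem=2050M ends 32256 pages short of a section boundary and mem=3075M ends 32000 pages short, while a section-aligned size such as 2048M leaves no tail at all. Those trailing struct pages are the ones the fix initializes.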
      
      Michal said:
      
      : This has always been a problem AFAIU.  It just went unnoticed because we
      : have zeroed memmaps during allocation before f7f99100 ("mm: stop
      : zeroing memory during allocation in vmemmap") and so the above test
      : would simply skip these ranges as belonging to zone 0 or provide
      : garbage.
      :
      : So I guess we do care for post f7f99100 kernels mostly and
      : therefore Fixes: f7f99100 ("mm: stop zeroing memory during
      : allocation in vmemmap")
      
      Link: http://lkml.kernel.org/r/20181212172712.34019-2-zaslonko@linux.ibm.com
      Fixes: f7f99100 ("mm: stop zeroing memory during allocation in vmemmap")
      Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
      Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Suggested-by: Michal Hocko <mhocko@kernel.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
      Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 6cafab50
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
       "Just some small fixes here and there, and a refcount leak in a serial
        driver, nothing serious"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        serial/sunsu: fix refcount leak
        sparc: Set "ARCH: sunxx" information on the same line
        sparc: vdso: Drop implicit common-page-size linker flag
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 87935eee
      Linus Torvalds authored
      Pull more networking fixes from David Miller:
       "Some more bug fixes have trickled in, we have:
      
        1) Lock MAC entries properly in mscc driver, from Allan W. Nielsen.
      
        2) Eric Dumazet found some more of the typical "pskb_may_pull() -->
           oops forgot to reload the header pointer" bugs in ipv6 tunnel
           handling.
      
        3) Bad SKB socket pointer in ipv6 fragmentation handling, from Herbert
           Xu.
      
        4) Overflow fix in sk_msg_clone(), from Vakul Garg.
      
        5) Validate address lengths in AF_PACKET, from Willem de Bruijn"
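The "forgot to reload the header pointer" class of bug from item 2 can be modeled in userspace. This is a toy sketch under stated assumptions, not networking code: struct pkt, pkt_may_pull(), and read_first_byte() are hypothetical, and the model always moves the data on growth to make stale pointers obvious, which the real helper does not necessarily do.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer; 'data' may move whenever the buffer is grown. */
struct pkt { unsigned char *data; size_t len; };

/*
 * Hypothetical stand-in for a "may pull" helper: ensures at least 'need'
 * bytes are available, reallocating (and moving) the data if necessary.
 * Returns 1 on success, 0 on failure.
 */
static int pkt_may_pull(struct pkt *p, size_t need)
{
        unsigned char *n;

        if (need <= p->len)
                return 1;
        n = malloc(need);       /* always move, so stale pointers are obvious */
        if (!n)
                return 0;
        memcpy(n, p->data, p->len);
        memset(n + p->len, 0, need - p->len);
        free(p->data);
        p->data = n;
        p->len = need;
        return 1;
}

/* Correct pattern: re-derive the header pointer after the pull. */
static int read_first_byte(struct pkt *p, size_t need)
{
        const unsigned char *hdr = p->data;     /* may go stale below */

        if (!pkt_may_pull(p, need))
                return -1;
        hdr = p->data;                          /* reload before use */
        return hdr[0];
}
```

Dropping the reload line would leave hdr pointing at freed memory whenever the pull had to move the data, which is a use-after-free.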
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        qmi_wwan: Fix qmap header retrieval in qmimux_rx_fixup
        qmi_wwan: Add support for Fibocom NL678 series
        tls: Do not call sk_memcopy_from_iter with zero length
        ipv6: tunnels: fix two use-after-free
        Prevent overflow of sk_msg in sk_msg_clone()
        packet: validate address length
        net: netxen: fix a missing check and an uninitialized use
        tcp: fix a race in inet_diag_dump_icsk()
        MAINTAINERS: update cxgb4 and cxgb3 maintainer
        ipv6: frags: Fix bogus skb->sk in reassembled packets
        mscc: Configured MAC entries should be locked.
    • auxdisplay: charlcd: fix x/y command parsing · 9bc30ab8
      Mans Rullgard authored
      The x/y command parsing has been broken since commit 12995706
      ("staging: panel: Fixed checkpatch warning about simple_strtoul()").
      
      Commit b34050fa ("auxdisplay: charlcd: Fix and clean up handling of
      x/y commands") fixed some problems by rewriting the parsing code,
      but also broke things further by removing the check for a complete
      command before attempting to parse it.  As a result, parsing is
      terminated at the first x or y character.
      
      This reinstates the check for a final semicolon.  The original code
      used strchr(), but this is wasteful seeing as the semicolon is always
      at the end of the buffer.  Thus check this character directly instead.
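The difference between the two checks can be sketched as follows. This is an illustrative userspace sketch, not the driver code: the function names are hypothetical, and the sample command strings used to exercise it are assumptions about what a buffer might hold.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Search-based approach: scan the whole buffer for ';'. */
static bool complete_strchr(const char *buf)
{
        return strchr(buf, ';') != NULL;
}

/* Direct check: a complete command has ';' as its last character. */
static bool complete_last_char(const char *buf, size_t len)
{
        return len > 0 && buf[len - 1] == ';';
}
```

For a buffer such as "x10y2;" both functions report a complete command, but the direct check inspects a single character instead of scanning the buffer. For an incomplete buffer such as "x10y" both report false, so parsing is deferred until the terminator arrives.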
      Signed-off-by: Mans Rullgard <mans@mansr.com>
      Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
    • serial/sunsu: fix refcount leak · d430aff8
      Yangtao Li authored
      The function of_find_node_by_path() acquires a reference to the node
      returned by it and that reference needs to be dropped by its caller.
      
      su_get_type() doesn't do that. The match node is used only as an
      identifier to compare against the current node, so we can drop the
      refcount directly after getting the node from the path, as it is not
      used as a pointer.

      Fix this by using a single variable and dropping the refcount right
      after of_find_node_by_path().
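The reasoning can be illustrated with a toy reference-counting model. This is a sketch under stated assumptions, not the driver or device-tree API: struct node, find_node(), put_node(), and node_matches() are hypothetical stand-ins for a lookup that returns a referenced object and its matching release call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy node with a reference count; not the kernel's device node type. */
struct node { const char *path; int refcount; };

/* Hypothetical lookup that returns the node with an extra reference. */
static struct node *find_node(struct node *n)
{
        if (n)
                n->refcount++;
        return n;
}

/* Hypothetical release call that drops one reference. */
static void put_node(struct node *n)
{
        if (n)
                n->refcount--;
}

/*
 * The looked-up node is only compared by address, never dereferenced,
 * so the reference can be dropped right after the lookup.
 */
static bool node_matches(struct node *current, struct node *candidate)
{
        struct node *match = find_node(candidate);

        put_node(match);
        return match == current;
}
```

After node_matches() returns, the candidate's reference count is back to its starting value whether or not the comparison succeeded, so repeated calls do not leak references.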
      Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc: Set "ARCH: sunxx" information on the same line · afaffac3
      Corentin Labbe authored
      While checking the boot log from SPARC qemu, I saw that the "ARCH: sunxx"
      information was split across two different lines.
      This patch merges both lines together.
      While at it, this information needs to be printed via pr_info,
      since printk prints it at the warning loglevel by default.
      Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc: vdso: Drop implicit common-page-size linker flag · 0ff70f62
      ndesaulniers@google.com authored
      GNU linker's -z common-page-size's default value is based on the target
      architecture. arch/sparc/vdso/Makefile sets it to the architecture
      default, which is implicit and redundant. Drop it.
      
      Link: https://lkml.kernel.org/r/20181206191231.192355-1-ndesaulniers@google.com
      Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 5092adb2
      Linus Torvalds authored
      Pull kvm fix from Paolo Bonzini:
       "A simple patch for a pretty bad bug: Unbreak AMD nested
        virtualization."
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: nSVM: fix switch to guest mmu
    • qmi_wwan: Fix qmap header retrieval in qmimux_rx_fixup · d667044f
      Daniele Palmas authored
      This patch fixes qmap header retrieval when the modem is configured for
      DL data aggregation.
      Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e572fa0e
      Linus Torvalds authored
      Pull timer fix from Ingo Molnar:
       "Fix a division by zero crash in the posix-timers code"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        posix-timers: Fix division by zero bug