1. 21 Nov, 2018 28 commits
    • Helge Deller's avatar
      parisc: Fix exported address of os_hpmc handler · b7706a82
      Helge Deller authored
      [ Upstream commit 99a3ae51 ]
      
      In the C-code we need to put the physical address of the hpmc handler in
      the interrupt vector table (IVA) in order to get HPMCs working.  Since
      on parisc64 function pointers are indirect (in fact they are function
      descriptors) we instead export the address as variable and not as
      function.
      
      This reverts a small part of commit f39cce65 ("parisc: Add
      cfi_startproc and cfi_endproc to assembly code").
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org>    [4.9+]
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b7706a82
    • Helge Deller's avatar
      parisc: Fix HPMC handler by increasing size to multiple of 16 bytes · 5a41c652
      Helge Deller authored
      [ Upstream commit d5654e15 ]
      
      Make sure that the HPMC (High Priority Machine Check) handler is 16-byte
      aligned and that it's length in the IVT is a multiple of 16 bytes.
      Otherwise PDC may decide not to call the HPMC crash handler.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5a41c652
    • Helge Deller's avatar
      parisc: Align os_hpmc_size on word boundary · 4d01ed83
      Helge Deller authored
      [ Upstream commit 0ed9d3de ]
      
      The os_hpmc_size variable sometimes wasn't aligned at word boundary and thus
      triggered the unaligned fault handler at startup.
      Fix it by aligning it properly.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # v4.14+
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4d01ed83
    • Kees Cook's avatar
      bna: ethtool: Avoid reading past end of buffer · 3d67d629
      Kees Cook authored
      [ Upstream commit 4dc69c1c ]
      
      Using memcpy() from a string that is shorter than the length copied means
      the destination buffer is being filled with arbitrary data from the kernel
      rodata segment. Instead, use strncpy() which will fill the trailing bytes
      with zeros.
      
      This was found with the future CONFIG_FORTIFY_SOURCE feature.
      
      Cc: Daniel Micay <danielmicay@gmail.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3d67d629
    • Vincenzo Maffione's avatar
      e1000: fix race condition between e1000_down() and e1000_watchdog · e029b837
      Vincenzo Maffione authored
      [ Upstream commit 44c445c3 ]
      
      This patch fixes a race condition that can result into the interface being
      up and carrier on, but with transmits disabled in the hardware.
      The bug may show up by repeatedly IFF_DOWN+IFF_UP the interface, which
      allows e1000_watchdog() interleave with e1000_down().
      
          CPU x                           CPU y
          --------------------------------------------------------------------
          e1000_down():
              netif_carrier_off()
                                          e1000_watchdog():
                                              if (carrier == off) {
                                                  netif_carrier_on();
                                                  enable_hw_transmit();
                                              }
              disable_hw_transmit();
                                          e1000_watchdog():
                                              /* carrier on, do nothing */
      Signed-off-by: default avatarVincenzo Maffione <v.maffione@gmail.com>
      Tested-by: default avatarAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e029b837
    • Colin Ian King's avatar
      e1000: avoid null pointer dereference on invalid stat type · 55edbf1c
      Colin Ian King authored
      [ Upstream commit 5983587c ]
      
      Currently if the stat type is invalid then data[i] is being set
      either by dereferencing a null pointer p, or it is reading from
      an incorrect previous location if we had a valid stat type
      previously.  Fix this by skipping over the read of p on an invalid
      stat type.
      
      Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Reviewed-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: default avatarAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      55edbf1c
    • Michal Hocko's avatar
      mm: do not bug_on on incorrect length in __mm_populate() · 1a55a71e
      Michal Hocko authored
      commit bb177a73 upstream.
      
      syzbot has noticed that a specially crafted library can easily hit
      VM_BUG_ON in __mm_populate
      
        kernel BUG at mm/gup.c:1242!
        invalid opcode: 0000 [#1] SMP
        CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
        Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
        RIP: 0010:__mm_populate+0x1e2/0x1f0
        Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
        Call Trace:
           vm_brk_flags+0xc3/0x100
           vm_brk+0x1f/0x30
           load_elf_library+0x281/0x2e0
           __ia32_sys_uselib+0x170/0x1e0
           do_fast_syscall_32+0xca/0x420
           entry_SYSENTER_compat+0x70/0x7f
      
      The reason is that the length of the new brk is not page aligned when we
      try to populate the it.  There is no reason to bug on that though.
      do_brk_flags already aligns the length properly so the mapping is
      expanded as it should.  All we need is to tell mm_populate about it.
      Besides that there is absolutely no reason to to bug_on in the first
      place.  The worst thing that could happen is that the last page wouldn't
      get populated and that is far from putting system into an inconsistent
      state.
      
      Fix the issue by moving the length sanitization code from do_brk_flags
      up to vm_brk_flags.  The only other caller of do_brk_flags is brk
      syscall entry and it makes sure to provide the proper length so t here
      is no need for sanitation and so we can use do_brk_flags without it.
      
      Also remove the bogus BUG_ONs.
      
      [osalvador@techadventures.net: fix up vm_brk_flags s@request@len@]
      Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.czSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reported-by: default avatarsyzbot <syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com>
      Tested-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: default avatarOscar Salvador <osalvador@suse.de>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 4.9:
       - There is no do_brk_flags() function; update do_brk()
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1a55a71e
    • Miklos Szeredi's avatar
      fuse: set FR_SENT while locked · c31ccf1f
      Miklos Szeredi authored
      commit 4c316f2f upstream.
      
      Otherwise fuse_dev_do_write() could come in and finish off the request, and
      the set_bit(FR_SENT, ...) could trigger the WARN_ON(test_bit(FR_SENT, ...))
      in request_end().
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Reported-by: syzbot+ef054c4d3f64cd7f7cec@syzkaller.appspotmai
      Fixes: 46c34a34 ("fuse: no fc->lock for pqueue parts")
      Cc: <stable@vger.kernel.org> # v4.2
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c31ccf1f
    • Miklos Szeredi's avatar
      fuse: fix blocked_waitq wakeup · 7238fcbd
      Miklos Szeredi authored
      commit 908a572b upstream.
      
      Using waitqueue_active() is racy.  Make sure we issue a wake_up()
      unconditionally after storing into fc->blocked.  After that it's okay to
      optimize with waitqueue_active() since the first wake up provides the
      necessary barrier for all waiters, not the just the woken one.
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 3c18ef81 ("fuse: optimize wake_up")
      Cc: <stable@vger.kernel.org> # v3.10
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7238fcbd
    • Kirill Tkhai's avatar
      fuse: Fix use-after-free in fuse_dev_do_write() · 0799a93d
      Kirill Tkhai authored
      commit d2d2d4fb upstream.
      
      After we found req in request_find() and released the lock,
      everything may happen with the req in parallel:
      
      cpu0                              cpu1
      fuse_dev_do_write()               fuse_dev_do_write()
        req = request_find(fpq, ...)    ...
        spin_unlock(&fpq->lock)         ...
        ...                             req = request_find(fpq, oh.unique)
        ...                             spin_unlock(&fpq->lock)
        queue_interrupt(&fc->iq, req);   ...
        ...                              ...
        ...                              ...
        request_end(fc, req);
          fuse_put_request(fc, req);
        ...                              queue_interrupt(&fc->iq, req);
      Signed-off-by: default avatarKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 46c34a34 ("fuse: no fc->lock for pqueue parts")
      Cc: <stable@vger.kernel.org> # v4.2
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0799a93d
    • Kirill Tkhai's avatar
      fuse: Fix use-after-free in fuse_dev_do_read() · 7996b1c0
      Kirill Tkhai authored
      commit bc78abbd upstream.
      
      We may pick freed req in this way:
      
      [cpu0]                                  [cpu1]
      fuse_dev_do_read()                      fuse_dev_do_write()
         list_move_tail(&req->list, ...);     ...
         spin_unlock(&fpq->lock);             ...
         ...                                  request_end(fc, req);
         ...                                    fuse_put_request(fc, req);
         if (test_bit(FR_INTERRUPTED, ...))
               queue_interrupt(fiq, req);
      
      Fix that by keeping req alive until we finish all manipulations.
      
      Reported-by: syzbot+4e975615ca01f2277bdd@syzkaller.appspotmail.com
      Signed-off-by: default avatarKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 46c34a34 ("fuse: no fc->lock for pqueue parts")
      Cc: <stable@vger.kernel.org> # v4.2
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7996b1c0
    • Quinn Tran's avatar
      scsi: qla2xxx: shutdown chip if reset fail · 0dd45a4d
      Quinn Tran authored
      commit 1e4ac5d6 upstream.
      
      If chip unable to fully initialize, use full shutdown sequence to clear out
      any stale FW state.
      
      Fixes: e315cd28 ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
      Cc: stable@vger.kernel.org  #4.10
      Signed-off-by: default avatarQuinn Tran <quinn.tran@cavium.com>
      Signed-off-by: default avatarHimanshu Madhani <himanshu.madhani@cavium.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0dd45a4d
    • Himanshu Madhani's avatar
      scsi: qla2xxx: Fix incorrect port speed being set for FC adapters · 0b639352
      Himanshu Madhani authored
      commit 4c1458df upstream.
      
      Fixes: 6246b8a1 ("[SCSI] qla2xxx: Enhancements to support ISP83xx.")
      Fixes: 1bb39548 ("qla2xxx: Correct iiDMA-update calling conventions.")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarHimanshu Madhani <himanshu.madhani@cavium.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0b639352
    • Young_X's avatar
      cdrom: fix improper type cast, which can leat to information leak. · 8dd745a8
      Young_X authored
      commit e4f3aa2e upstream.
      
      There is another cast from unsigned long to int which causes
      a bounds check to fail with specially crafted input. The value is
      then used as an index in the slot array in cdrom_slot_status().
      
      This issue is similar to CVE-2018-16658 and CVE-2018-10940.
      Signed-off-by: default avatarYoung_X <YangX92@hotmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8dd745a8
    • Dominique Martinet's avatar
      9p: clear dangling pointers in p9stat_free · 0cf4fa79
      Dominique Martinet authored
      [ Upstream commit 62e39417 ]
      
      p9stat_free is more of a cleanup function than a 'free' function as it
      only frees the content of the struct; there are chances of use-after-free
      if it is improperly used (e.g. p9stat_free called twice as it used to be
      possible to)
      
      Clearing dangling pointers makes the function idempotent and safer to use.
      
      Link: http://lkml.kernel.org/r/1535410108-20650-2-git-send-email-asmadeus@codewreck.orgSigned-off-by: default avatarDominique Martinet <dominique.martinet@cea.fr>
      Reported-by: syzbot+d4252148d198410b864f@syzkaller.appspotmail.com
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0cf4fa79
    • Dominique Martinet's avatar
      9p locks: fix glock.client_id leak in do_lock · 9e79094d
      Dominique Martinet authored
      [ Upstream commit b4dc44b3 ]
      
      the 9p client code overwrites our glock.client_id pointing to a static
      buffer by an allocated string holding the network provided value which
      we do not care about; free and reset the value as appropriate.
      
      This is almost identical to the leak in v9fs_file_getlock() fixed by
      Al Viro in commit ce85dd58 ("9p: we are leaking glock.client_id
      in v9fs_file_getlock()"), which was returned as an error by a coverity
      false positive -- while we are here attempt to make the code slightly
      more robust to future change of the net/9p/client code and hopefully
      more clear to coverity that there is no problem.
      
      Link: http://lkml.kernel.org/r/1536339057-21974-5-git-send-email-asmadeus@codewreck.orgSigned-off-by: default avatarDominique Martinet <dominique.martinet@cea.fr>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9e79094d
    • Breno Leitao's avatar
      powerpc/selftests: Wait all threads to join · abefbf42
      Breno Leitao authored
      [ Upstream commit 693b31b2 ]
      
      Test tm-tmspr might exit before all threads stop executing, because it just
      waits for the very last thread to join before proceeding/exiting.
      
      This patch makes sure that all threads that were created will join before
      proceeding/exiting.
      
      This patch also guarantees that the amount of threads being created is equal
      to thread_num.
      Signed-off-by: default avatarBreno Leitao <leitao@debian.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      abefbf42
    • Marco Felsch's avatar
      media: tvp5150: fix width alignment during set_selection() · c19136a6
      Marco Felsch authored
      [ Upstream commit bd24db04 ]
      
      The driver ignored the width alignment which exists due to the UYVY
      colorspace format. Fix the width alignment and make use of the the
      provided v4l2 helper function to set the width, height and all
      alignments in one.
      
      Fixes: 963ddc63 ("[media] media: tvp5150: Add cropping support")
      Signed-off-by: default avatarMarco Felsch <m.felsch@pengutronix.de>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c19136a6
    • Phil Elwell's avatar
      sc16is7xx: Fix for multi-channel stall · 64a5369b
      Phil Elwell authored
      [ Upstream commit 83444987 ]
      
      The SC16IS752 is a dual-channel device. The two channels are largely
      independent, but the IRQ signals are wired together as an open-drain,
      active low signal which will be driven low while either of the
      channels requires attention, which can be for significant periods of
      time until operations complete and the interrupt can be acknowledged.
      In that respect it is should be treated as a true level-sensitive IRQ.
      
      The kernel, however, needs to be able to exit interrupt context in
      order to use I2C or SPI to access the device registers (which may
      involve sleeping).  Therefore the interrupt needs to be masked out or
      paused in some way.
      
      The usual way to manage sleeping from within an interrupt handler
      is to use a threaded interrupt handler - a regular interrupt routine
      does the minimum amount of work needed to triage the interrupt before
      waking the interrupt service thread. If the threaded IRQ is marked as
      IRQF_ONESHOT the kernel will automatically mask out the interrupt
      until the thread runs to completion. The sc16is7xx driver used to
      use a threaded IRQ, but a patch switched to using a kthread_worker
      in order to set realtime priorities on the handler thread and for
      other optimisations. The end result is non-threaded IRQ that
      schedules some work then returns IRQ_HANDLED, making the kernel
      think that all IRQ processing has completed.
      
      The work-around to prevent a constant stream of interrupts is to
      mark the interrupt as edge-sensitive rather than level-sensitive,
      but interpreting an active-low source as a falling-edge source
      requires care to prevent a total cessation of interrupts. Whereas
      an edge-triggering source will generate a new edge for every interrupt
      condition a level-triggering source will keep the signal at the
      interrupting level until it no longer requires attention; in other
      words, the host won't see another edge until all interrupt conditions
      are cleared. It is therefore vital that the interrupt handler does not
      exit with an outstanding interrupt condition, otherwise the kernel
      will not receive another interrupt unless some other operation causes
      the interrupt state on the device to be cleared.
      
      The existing sc16is7xx driver has a very simple interrupt "thread"
      (kthread_work job) that processes interrupts on each channel in turn
      until there are no more. If both channels are active and the first
      channel starts interrupting while the handler for the second channel
      is running then it will not be detected and an IRQ stall ensues. This
      could be handled easily if there was a shared IRQ status register, or
      a convenient way to determine if the IRQ had been deasserted for any
      length of time, but both appear to be lacking.
      
      Avoid this problem (or at least make it much less likely to happen)
      by reducing the granularity of per-channel interrupt processing
      to one condition per iteration, only exiting the overall loop when
      both channels are no longer interrupting.
      Signed-off-by: default avatarPhil Elwell <phil@raspberrypi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      64a5369b
    • Huacai Chen's avatar
      MIPS/PCI: Call pcie_bus_configure_settings() to set MPS/MRRS · ff37c427
      Huacai Chen authored
      [ Upstream commit 2794f688 ]
      
      Call pcie_bus_configure_settings() on MIPS, like for other platforms.
      The function pcie_bus_configure_settings() makes sure the MPS (Max
      Payload Size) across the bus is uniform and provides the ability to
      tune the MRSS (Max Read Request Size) and MPS (Max Payload Size) to
      higher performance values. Some devices will not operate properly if
      these aren't set correctly because the firmware doesn't always do it.
      Signed-off-by: default avatarHuacai Chen <chenhc@lemote.com>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Patchwork: https://patchwork.linux-mips.org/patch/20649/
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Cc: Huacai Chen <chenhuacai@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ff37c427
    • Joel Stanley's avatar
      powerpc/boot: Ensure _zimage_start is a weak symbol · 76c386b3
      Joel Stanley authored
      [ Upstream commit ee9d21b3 ]
      
      When building with clang crt0's _zimage_start is not marked weak, which
      breaks the build when linking the kernel image:
      
       $ objdump -t arch/powerpc/boot/crt0.o |grep _zimage_start$
       0000000000000058 g       .text  0000000000000000 _zimage_start
      
       ld: arch/powerpc/boot/wrapper.a(crt0.o): in function '_zimage_start':
       (.text+0x58): multiple definition of '_zimage_start';
       arch/powerpc/boot/pseries-head.o:(.text+0x0): first defined here
      
      Clang requires the .weak directive to appear after the symbol is
      declared. The binutils manual says:
      
       This directive sets the weak attribute on the comma separated list of
       symbol names. If the symbols do not already exist, they will be
       created.
      
      So it appears this is different with clang. The only reference I could
      see for this was an OpenBSD mailing list post[1].
      
      Changing it to be after the declaration fixes building with Clang, and
      still works with GCC.
      
       $ objdump -t arch/powerpc/boot/crt0.o |grep _zimage_start$
       0000000000000058  w      .text	0000000000000000 _zimage_start
      
      Reported to clang as https://bugs.llvm.org/show_bug.cgi?id=38921
      
      [1] https://groups.google.com/forum/#!topic/fa.openbsd.tech/PAgKKen2YCYSigned-off-by: default avatarJoel Stanley <joel@jms.id.au>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76c386b3
    • Dengcheng Zhu's avatar
      MIPS: kexec: Mark CPU offline before disabling local IRQ · 12b49fd3
      Dengcheng Zhu authored
      [ Upstream commit dc57aaf9 ]
      
      After changing CPU online status, it will not be sent any IPIs such as in
      __flush_cache_all() on software coherency systems. Do this before disabling
      local IRQ.
      Signed-off-by: default avatarDengcheng Zhu <dzhu@wavecomp.com>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Patchwork: https://patchwork.linux-mips.org/patch/20571/
      Cc: pburton@wavecomp.com
      Cc: ralf@linux-mips.org
      Cc: linux-mips@linux-mips.org
      Cc: rachel.mozes@intel.com
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      12b49fd3
    • Nicholas Mc Guire's avatar
      media: pci: cx23885: handle adding to list failure · 2a55f098
      Nicholas Mc Guire authored
      [ Upstream commit c5d59528 ]
      
      altera_hw_filt_init() which calls append_internal() assumes
      that the node was successfully linked in while in fact it can
      silently fail. So the call-site needs to set return to -ENOMEM
      on append_internal() returning NULL and exit through the err path.
      
      Fixes: 349bcf02 ("[media] Altera FPGA based CI driver module")
      Signed-off-by: default avatarNicholas Mc Guire <hofrat@osadl.org>
      Signed-off-by: default avatarHans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2a55f098
    • Tomi Valkeinen's avatar
      drm/omap: fix memory barrier bug in DMM driver · 4bdf5ec1
      Tomi Valkeinen authored
      [ Upstream commit 538f66ba ]
      
      A DMM timeout "timed out waiting for done" has been observed on DRA7
      devices. The timeout happens rarely, and only when the system is under
      heavy load.
      
      Debugging showed that the timeout can be made to happen much more
      frequently by optimizing the DMM driver, so that there's almost no code
      between writing the last DMM descriptors to RAM, and writing to DMM
      register which starts the DMM transaction.
      
      The current theory is that a wmb() does not properly ensure that the
      data written to RAM is observable by all the components in the system.
      
      This DMM timeout has caused interesting (and rare) bugs as the error
      handling was not functioning properly (the error handling has been fixed
      in previous commits):
      
       * If a DMM timeout happened when a GEM buffer was being pinned for
         display on the screen, a timeout error would be shown, but the driver
         would continue programming DSS HW with broken buffer, leading to
         SYNCLOST floods and possible crashes.
      
       * If a DMM timeout happened when other user (say, video decoder) was
         pinning a GEM buffer, a timeout would be shown but if the user
         handled the error properly, no other issues followed.
      
       * If a DMM timeout happened when a GEM buffer was being released, the
         driver does not even notice the error, leading to crashes or hang
         later.
      
      This patch adds wmb() and readl() calls after the last bit is written to
      RAM, which should ensure that the execution proceeds only after the data
      is actually in RAM, and thus observable by DMM.
      
      The read-back should not be needed. Further study is required to understand
      if DMM is somehow special case and read-back is ok, or if DRA7's memory
      barriers do not work correctly.
      Signed-off-by: default avatarTomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: default avatarPeter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4bdf5ec1
    • Daniel Axtens's avatar
      powerpc/nohash: fix undefined behaviour when testing page size support · 0e4dde1a
      Daniel Axtens authored
      [ Upstream commit f5e28480 ]
      
      When enumerating page size definitions to check hardware support,
      we construct a constant which is (1U << (def->shift - 10)).
      
      However, the array of page size definitions is only initalised for
      various MMU_PAGE_* constants, so it contains a number of 0-initialised
      elements with def->shift == 0. This means we end up shifting by a
      very large number, which gives the following UBSan splat:
      
      ================================================================================
      UBSAN: Undefined behaviour in /home/dja/dev/linux/linux/arch/powerpc/mm/tlb_nohash.c:506:21
      shift exponent 4294967286 is too large for 32-bit type 'unsigned int'
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc3-00045-ga604f927b012-dirty #6
      Call Trace:
      [c00000000101bc20] [c000000000a13d54] .dump_stack+0xa8/0xec (unreliable)
      [c00000000101bcb0] [c0000000004f20a8] .ubsan_epilogue+0x18/0x64
      [c00000000101bd30] [c0000000004f2b10] .__ubsan_handle_shift_out_of_bounds+0x110/0x1a4
      [c00000000101be20] [c000000000d21760] .early_init_mmu+0x1b4/0x5a0
      [c00000000101bf10] [c000000000d1ba28] .early_setup+0x100/0x130
      [c00000000101bf90] [c000000000000528] start_here_multiplatform+0x68/0x80
      ================================================================================
      
      Fix this by first checking if the element exists (shift != 0) before
      constructing the constant.
      Signed-off-by: default avatarDaniel Axtens <dja@axtens.net>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e4dde1a
    • Fabio Estevam's avatar
      ARM: imx_v6_v7_defconfig: Select CONFIG_TMPFS_POSIX_ACL · 427ccff8
      Fabio Estevam authored
      [ Upstream commit 35d3cbe8 ]
      
      Andreas Müller reports:
      
      "Fixes:
      
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[220]: Failed to apply ACL on /dev/v4l-subdev0: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[224]: Failed to apply ACL on /dev/v4l-subdev1: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[215]: Failed to apply ACL on /dev/v4l-subdev10: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[228]: Failed to apply ACL on /dev/v4l-subdev2: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[232]: Failed to apply ACL on /dev/v4l-subdev5: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[217]: Failed to apply ACL on /dev/v4l-subdev11: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[214]: Failed to apply ACL on /dev/dri/card1: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[216]: Failed to apply ACL on /dev/v4l-subdev8: Operation not supported
      | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[226]: Failed to apply ACL on /dev/v4l-subdev9: Operation not supported
      
      and nasty follow-ups: Starting weston from sddm as unpriviledged user fails
      with some hints on missing access rights."
      
      Select the CONFIG_TMPFS_POSIX_ACL option to fix these issues.
      Reported-by: default avatarAndreas Müller <schnitzeltony@gmail.com>
      Signed-off-by: default avatarFabio Estevam <fabio.estevam@nxp.com>
      Acked-by: default avatarOtavio Salvador <otavio@ossystems.com.br>
      Signed-off-by: default avatarShawn Guo <shawnguo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      427ccff8
    • Miles Chen's avatar
      tty: check name length in tty_find_polling_driver() · 9206016c
      Miles Chen authored
      [ Upstream commit 33a1a7be ]
      
      The issue is found by a fuzzing test.
      If tty_find_polling_driver() recevies an incorrect input such as
      ',,' or '0b', the len becomes 0 and strncmp() always return 0.
      In this case, a null p->ops->poll_init() is called and it causes a kernel
      panic.
      
      Fix this by checking name length against zero in tty_find_polling_driver().
      
      $echo ,, > /sys/module/kgdboc/parameters/kgdboc
      [   20.804451] WARNING: CPU: 1 PID: 104 at drivers/tty/serial/serial_core.c:457
      uart_get_baud_rate+0xe8/0x190
      [   20.804917] Modules linked in:
      [   20.805317] CPU: 1 PID: 104 Comm: sh Not tainted 4.19.0-rc7ajb #8
      [   20.805469] Hardware name: linux,dummy-virt (DT)
      [   20.805732] pstate: 20000005 (nzCv daif -PAN -UAO)
      [   20.805895] pc : uart_get_baud_rate+0xe8/0x190
      [   20.806042] lr : uart_get_baud_rate+0xc0/0x190
      [   20.806476] sp : ffffffc06acff940
      [   20.806676] x29: ffffffc06acff940 x28: 0000000000002580
      [   20.806977] x27: 0000000000009600 x26: 0000000000009600
      [   20.807231] x25: ffffffc06acffad0 x24: 00000000ffffeff0
      [   20.807576] x23: 0000000000000001 x22: 0000000000000000
      [   20.807807] x21: 0000000000000001 x20: 0000000000000000
      [   20.808049] x19: ffffffc06acffac8 x18: 0000000000000000
      [   20.808277] x17: 0000000000000000 x16: 0000000000000000
      [   20.808520] x15: ffffffffffffffff x14: ffffffff00000000
      [   20.808757] x13: ffffffffffffffff x12: 0000000000000001
      [   20.809011] x11: 0101010101010101 x10: ffffff880d59ff5f
      [   20.809292] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3
      [   20.809549] x7 : 0000000000000000 x6 : ffffff880d59ff5f
      [   20.809803] x5 : 0000000080008001 x4 : 0000000000000003
      [   20.810056] x3 : ffffff900853e6b4 x2 : dfffff9000000000
      [   20.810693] x1 : ffffffc06acffad0 x0 : 0000000000000cb0
      [   20.811005] Call trace:
      [   20.811214]  uart_get_baud_rate+0xe8/0x190
      [   20.811479]  serial8250_do_set_termios+0xe0/0x6f4
      [   20.811719]  serial8250_set_termios+0x48/0x54
      [   20.811928]  uart_set_options+0x138/0x1bc
      [   20.812129]  uart_poll_init+0x114/0x16c
      [   20.812330]  tty_find_polling_driver+0x158/0x200
      [   20.812545]  configure_kgdboc+0xbc/0x1bc
      [   20.812745]  param_set_kgdboc_var+0xb8/0x150
      [   20.812960]  param_attr_store+0xbc/0x150
      [   20.813160]  module_attr_store+0x40/0x58
      [   20.813364]  sysfs_kf_write+0x8c/0xa8
      [   20.813563]  kernfs_fop_write+0x154/0x290
      [   20.813764]  vfs_write+0xf0/0x278
      [   20.813951]  __arm64_sys_write+0x84/0xf4
      [   20.814400]  el0_svc_common+0xf4/0x1dc
      [   20.814616]  el0_svc_handler+0x98/0xbc
      [   20.814804]  el0_svc+0x8/0xc
      [   20.822005] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      [   20.826913] Mem abort info:
      [   20.827103]   ESR = 0x84000006
      [   20.827352]   Exception class = IABT (current EL), IL = 16 bits
      [   20.827655]   SET = 0, FnV = 0
      [   20.827855]   EA = 0, S1PTW = 0
      [   20.828135] user pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
      [   20.828484] [0000000000000000] pgd=00000000aadee003, pud=00000000aadee003, pmd=0000000000000000
      [   20.829195] Internal error: Oops: 84000006 [#1] SMP
      [   20.829564] Modules linked in:
      [   20.829890] CPU: 1 PID: 104 Comm: sh Tainted: G        W         4.19.0-rc7ajb #8
      [   20.830545] Hardware name: linux,dummy-virt (DT)
      [   20.830829] pstate: 60000085 (nZCv daIf -PAN -UAO)
      [   20.831174] pc :           (null)
      [   20.831457] lr : serial8250_do_set_termios+0x358/0x6f4
      [   20.831727] sp : ffffffc06acff9b0
      [   20.831936] x29: ffffffc06acff9b0 x28: ffffff9008d7c000
      [   20.832267] x27: ffffff900969e16f x26: 0000000000000000
      [   20.832589] x25: ffffff900969dfb0 x24: 0000000000000000
      [   20.832906] x23: ffffffc06acffad0 x22: ffffff900969e160
      [   20.833232] x21: 0000000000000000 x20: ffffffc06acffac8
      [   20.833559] x19: ffffff900969df90 x18: 0000000000000000
      [   20.833878] x17: 0000000000000000 x16: 0000000000000000
      [   20.834491] x15: ffffffffffffffff x14: ffffffff00000000
      [   20.834821] x13: ffffffffffffffff x12: 0000000000000001
      [   20.835143] x11: 0101010101010101 x10: ffffff880d59ff5f
      [   20.835467] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3
      [   20.835790] x7 : 0000000000000000 x6 : ffffff880d59ff5f
      [   20.836111] x5 : c06419717c314100 x4 : 0000000000000007
      [   20.836419] x3 : 0000000000000000 x2 : 0000000000000000
      [   20.836732] x1 : 0000000000000001 x0 : ffffff900969df90
      [   20.837100] Process sh (pid: 104, stack limit = 0x(____ptrval____))
      [   20.837396] Call trace:
      [   20.837566]            (null)
      [   20.837816]  serial8250_set_termios+0x48/0x54
      [   20.838089]  uart_set_options+0x138/0x1bc
      [   20.838570]  uart_poll_init+0x114/0x16c
      [   20.838834]  tty_find_polling_driver+0x158/0x200
      [   20.839119]  configure_kgdboc+0xbc/0x1bc
      [   20.839380]  param_set_kgdboc_var+0xb8/0x150
      [   20.839658]  param_attr_store+0xbc/0x150
      [   20.839920]  module_attr_store+0x40/0x58
      [   20.840183]  sysfs_kf_write+0x8c/0xa8
      [   20.840183]  sysfs_kf_write+0x8c/0xa8
      [   20.840440]  kernfs_fop_write+0x154/0x290
      [   20.840702]  vfs_write+0xf0/0x278
      [   20.840942]  __arm64_sys_write+0x84/0xf4
      [   20.841209]  el0_svc_common+0xf4/0x1dc
      [   20.841471]  el0_svc_handler+0x98/0xbc
      [   20.841713]  el0_svc+0x8/0xc
      [   20.842057] Code: bad PC value
      [   20.842764] ---[ end trace a8835d7de79aaadf ]---
      [   20.843134] Kernel panic - not syncing: Fatal exception
      [   20.843515] SMP: stopping secondary CPUs
      [   20.844289] Kernel Offset: disabled
      [   20.844634] CPU features: 0x0,21806002
      [   20.844857] Memory Limit: none
      [   20.845172] ---[ end Kernel panic - not syncing: Fatal exception ]---
      Signed-off-by: default avatarMiles Chen <miles.chen@mediatek.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9206016c
    • Sam Bobroff's avatar
      powerpc/eeh: Fix possible null deref in eeh_dump_dev_log() · 9bed31c9
      Sam Bobroff authored
      [ Upstream commit f9bc28ae ]
      
      If an error occurs during an unplug operation, it's possible for
      eeh_dump_dev_log() to be called when edev->pdn is null, which
      currently leads to dereferencing a null pointer.
      
      Handle this by skipping the error log for those devices.
      Signed-off-by: default avatarSam Bobroff <sbobroff@linux.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9bed31c9
  2. 13 Nov, 2018 12 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.9.137 · 55526837
      Greg Kroah-Hartman authored
      55526837
    • Shaohua Li's avatar
      MD: fix invalid stored role for a disk - try2 · 627ab1fa
      Shaohua Li authored
      commit 9e753ba9 upstream.
      
      Commit d595567d (MD: fix invalid stored role for a disk) broke linear
      hotadd. Let's only fix the role for disks in raid1/10.
      Based on Guoqing's original patch.
      Reported-by: default avatarkernel test robot <rong.a.chen@intel.com>
      Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
      Cc: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      627ab1fa
    • Josef Bacik's avatar
      btrfs: set max_extent_size properly · f3bc71fa
      Josef Bacik authored
      commit ad22cf6e upstream.
      
      We can't use entry->bytes if our entry is a bitmap entry, we need to use
      entry->max_extent_size in that case.  Fix up all the logic to make this
      consistent.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f3bc71fa
    • Filipe Manana's avatar
      Btrfs: fix null pointer dereference on compressed write path error · a088035d
      Filipe Manana authored
      commit 3527a018 upstream.
      
      At inode.c:compress_file_range(), under the "free_pages_out" label, we can
      end up dereferencing the "pages" pointer when it has a NULL value. This
      case happens when "start" has a value of 0 and we fail to allocate memory
      for the "pages" pointer. When that happens we jump to the "cont" label and
      then enter the "if (start == 0)" branch where we immediately call the
      cow_file_range_inline() function. If that function returns 0 (success
      creating an inline extent) or an error (like -ENOMEM for example) we jump
      to the "free_pages_out" label and then access "pages[i]" leading to a NULL
      pointer dereference, since "nr_pages" has a value greater than zero at
      that point.
      
      Fix this by setting "nr_pages" to 0 when we fail to allocate memory for
      the "pages" pointer.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201119
      Fixes: 771ed689 ("Btrfs: Optimize compressed writeback and reads")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a088035d
    • Qu Wenruo's avatar
      btrfs: qgroup: Dirty all qgroups before rescan · 91b3675e
      Qu Wenruo authored
      commit 9c7b0c2e upstream.
      
      [BUG]
      In the following case, rescan won't zero out the number of qgroup 1/0:
      
        $ mkfs.btrfs -fq $DEV
        $ mount $DEV /mnt
      
        $ btrfs quota enable /mnt
        $ btrfs qgroup create 1/0 /mnt
        $ btrfs sub create /mnt/sub
        $ btrfs qgroup assign 0/257 1/0 /mnt
      
        $ dd if=/dev/urandom of=/mnt/sub/file bs=1k count=1000
        $ btrfs sub snap /mnt/sub /mnt/snap
        $ btrfs quota rescan -w /mnt
        $ btrfs qgroup show -pcre /mnt
        qgroupid         rfer         excl     max_rfer     max_excl parent  child
        --------         ----         ----     --------     -------- ------  -----
        0/5          16.00KiB     16.00KiB         none         none ---     ---
        0/257      1016.00KiB     16.00KiB         none         none 1/0     ---
        0/258      1016.00KiB     16.00KiB         none         none ---     ---
        1/0        1016.00KiB     16.00KiB         none         none ---     0/257
      
      So far so good, but:
      
        $ btrfs qgroup remove 0/257 1/0 /mnt
        WARNING: quotas may be inconsistent, rescan needed
        $ btrfs quota rescan -w /mnt
        $ btrfs qgroup show -pcre  /mnt
        qgoupid         rfer         excl     max_rfer     max_excl parent  child
        --------         ----         ----     --------     -------- ------  -----
        0/5          16.00KiB     16.00KiB         none         none ---     ---
        0/257      1016.00KiB     16.00KiB         none         none ---     ---
        0/258      1016.00KiB     16.00KiB         none         none ---     ---
        1/0        1016.00KiB     16.00KiB         none         none ---     ---
      	     ^^^^^^^^^^     ^^^^^^^^ not cleared
      
      [CAUSE]
      Before rescan we call qgroup_rescan_zero_tracking() to zero out all
      qgroups' accounting numbers.
      
      However we don't mark all qgroups dirty, but rely on rescan to do so.
      
      If we have any high level qgroup without children, it won't be marked
      dirty during rescan, since we cannot reach that qgroup.
      
      This will cause QGROUP_INFO items of childless qgroups never get updated
      in the quota tree, thus their numbers will stay the same in "btrfs
      qgroup show" output.
      
      [FIX]
      Just mark all qgroups dirty in qgroup_rescan_zero_tracking(), so even if
      we have childless qgroups, their QGROUP_INFO items will still get
      updated during rescan.
      Reported-by: default avatarMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      Tested-by: default avatarMisono Tomohiro <misono.tomohiro@jp.fujitsu.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      91b3675e
    • Filipe Manana's avatar
      Btrfs: fix wrong dentries after fsync of file that got its parent replaced · 6a0403a4
      Filipe Manana authored
      commit 0f375eed upstream.
      
      In a scenario like the following:
      
        mkdir /mnt/A               # inode 258
        mkdir /mnt/B               # inode 259
        touch /mnt/B/bar           # inode 260
      
        sync
      
        mv /mnt/B/bar /mnt/A/bar
        mv -T /mnt/A /mnt/B
        fsync /mnt/B/bar
      
        <power fail>
      
      After replaying the log we end up with file bar having 2 hard links, both
      with the name 'bar' and one in the directory with inode number 258 and the
      other in the directory with inode number 259. Also, we end up with the
      directory inode 259 still existing and with the directory inode 258 still
      named as 'A', instead of 'B'. In this scenario, file 'bar' should only
      have one hard link, located at directory inode 258, the directory inode
      259 should not exist anymore and the name for directory inode 258 should
      be 'B'.
      
      This incorrect behaviour happens because when attempting to log the old
      parents of an inode, we skip any parents that no longer exist. Fix this
      by forcing a full commit if an old parent no longer exists.
      
      A test case for fstests follows soon.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6a0403a4
    • Josef Bacik's avatar
      btrfs: make sure we create all new block groups · 4c7b9a46
      Josef Bacik authored
      commit 545e3366 upstream.
      
      Allocating new chunks modifies both the extent and chunk tree, which can
      trigger new chunk allocations.  So instead of doing list_for_each_safe,
      just do while (!list_empty()) so we make sure we don't exit with other
      pending bg's still on our list.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarOmar Sandoval <osandov@fb.com>
      Reviewed-by: default avatarLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c7b9a46
    • Josef Bacik's avatar
      btrfs: reset max_extent_size on clear in a bitmap · 9ff40fbf
      Josef Bacik authored
      commit 553cceb4 upstream.
      
      We need to clear the max_extent_size when we clear bits from a bitmap
      since it could have been from the range that contains the
      max_extent_size.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ff40fbf
    • Josef Bacik's avatar
      btrfs: wait on caching when putting the bg cache · c160dae0
      Josef Bacik authored
      commit 3aa7c7a3 upstream.
      
      While testing my backport I noticed there was a panic if I ran
      generic/416 generic/417 generic/418 all in a row.  This just happened to
      uncover a race where we had outstanding IO after we destroy all of our
      workqueues, and then we'd go to queue the endio work on those free'd
      workqueues.
      
      This is because we aren't waiting for the caching threads to be done
      before freeing everything up, so to fix this make sure we wait on any
      outstanding caching that's being done before we free up the block group,
      so we're sure to be done with all IO by the time we get to
      btrfs_stop_all_workers().  This fixes the panic I was seeing
      consistently in testing.
      
      ------------[ cut here ]------------
      kernel BUG at fs/btrfs/volumes.c:6112!
      SMP PTI
      Modules linked in:
      CPU: 1 PID: 27165 Comm: kworker/u4:7 Not tainted 4.16.0-02155-g3553e54a578d-dirty #875
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
      Workqueue: btrfs-cache btrfs_cache_helper
      RIP: 0010:btrfs_map_bio+0x346/0x370
      RSP: 0000:ffffc900061e79d0 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: ffff880071542e00 RCX: 0000000000533000
      RDX: ffff88006bb74380 RSI: 0000000000000008 RDI: ffff880078160000
      RBP: 0000000000000001 R08: ffff8800781cd200 R09: 0000000000503000
      R10: ffff88006cd21200 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff8800781cd200 R15: ffff880071542e00
      FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000000817ffc4 CR3: 0000000078314000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       btree_submit_bio_hook+0x8a/0xd0
       submit_one_bio+0x5d/0x80
       read_extent_buffer_pages+0x18a/0x320
       btree_read_extent_buffer_pages+0xbc/0x200
       ? alloc_extent_buffer+0x359/0x3e0
       read_tree_block+0x3d/0x60
       read_block_for_search.isra.30+0x1a5/0x360
       btrfs_search_slot+0x41b/0xa10
       btrfs_next_old_leaf+0x212/0x470
       caching_thread+0x323/0x490
       normal_work_helper+0xc5/0x310
       process_one_work+0x141/0x340
       worker_thread+0x44/0x3c0
       kthread+0xf8/0x130
       ? process_one_work+0x340/0x340
       ? kthread_bind+0x10/0x10
       ret_from_fork+0x35/0x40
      RIP: btrfs_map_bio+0x346/0x370 RSP: ffffc900061e79d0
      ---[ end trace 827eb13e50846033 ]---
      Kernel panic - not syncing: Fatal exception
      Kernel Offset: disabled
      ---[ end Kernel panic - not syncing: Fatal exception
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarOmar Sandoval <osandov@fb.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c160dae0
    • Jeff Mahoney's avatar
      btrfs: don't attempt to trim devices that don't support it · 391235fd
      Jeff Mahoney authored
      commit 0be88e36 upstream.
      
      We check whether any device the file system is using supports discard in
      the ioctl call, but then we attempt to trim free extents on every device
      regardless of whether discard is supported.  Due to the way we mask off
      EOPNOTSUPP, we can end up issuing the trim operations on each free range
      on devices that don't support it, just wasting time.
      
      Fixes: 499f377f ("btrfs: iterate over unused chunk space in FITRIM")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      391235fd
    • Jeff Mahoney's avatar
      btrfs: iterate all devices during trim, instead of fs_devices::alloc_list · d87af98c
      Jeff Mahoney authored
      commit d4e329de upstream.
      
      btrfs_trim_fs iterates over the fs_devices->alloc_list while holding the
      device_list_mutex.  The problem is that ->alloc_list is protected by the
      chunk mutex.  We don't want to hold the chunk mutex over the trim of the
      entire file system.  Fortunately, the ->dev_list list is protected by
      the dev_list mutex and while it will give us all devices, including
      read-only devices, we already just skip the read-only devices.  Then we
      can continue to take and release the chunk mutex while scanning each
      device.
      
      Fixes: 499f377f ("btrfs: iterate over unused chunk space in FITRIM")
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d87af98c
    • Jeff Mahoney's avatar
      btrfs: fix error handling in free_log_tree · 1bd7112b
      Jeff Mahoney authored
      commit 374b0e2d upstream.
      
      When we hit an I/O error in free_log_tree->walk_log_tree during file system
      shutdown we can crash due to there not being a valid transaction handle.
      
      Use btrfs_handle_fs_error when there's no transaction handle to use.
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
        IP: free_log_tree+0xd2/0x140 [btrfs]
        PGD 0 P4D 0
        Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
        Modules linked in: <modules>
        CPU: 2 PID: 23544 Comm: umount Tainted: G        W        4.12.14-kvmsmall #9 SLE15 (unreleased)
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
        task: ffff96bfd3478880 task.stack: ffffa7cf40d78000
        RIP: 0010:free_log_tree+0xd2/0x140 [btrfs]
        RSP: 0018:ffffa7cf40d7bd10 EFLAGS: 00010282
        RAX: 00000000fffffffb RBX: 00000000fffffffb RCX: 0000000000000002
        RDX: 0000000000000000 RSI: ffff96c02f07d4c8 RDI: 0000000000000282
        RBP: ffff96c013cf1000 R08: ffff96c02f07d4c8 R09: ffff96c02f07d4d0
        R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000000
        R13: ffff96c005e800c0 R14: ffffa7cf40d7bdb8 R15: 0000000000000000
        FS:  00007f17856bcfc0(0000) GS:ffff96c03f600000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000060 CR3: 0000000045ed6002 CR4: 00000000003606e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         ? wait_for_writer+0xb0/0xb0 [btrfs]
         btrfs_free_log+0x17/0x30 [btrfs]
         btrfs_drop_and_free_fs_root+0x9a/0xe0 [btrfs]
         btrfs_free_fs_roots+0xc0/0x130 [btrfs]
         ? wait_for_completion+0xf2/0x100
         close_ctree+0xea/0x2e0 [btrfs]
         ? kthread_stop+0x161/0x260
         generic_shutdown_super+0x6c/0x120
         kill_anon_super+0xe/0x20
         btrfs_kill_super+0x13/0x100 [btrfs]
         deactivate_locked_super+0x3f/0x70
         cleanup_mnt+0x3b/0x70
         task_work_run+0x78/0x90
         exit_to_usermode_loop+0x77/0xa6
         do_syscall_64+0x1c5/0x1e0
         entry_SYSCALL_64_after_hwframe+0x42/0xb7
        RIP: 0033:0x7f1784f90827
        RSP: 002b:00007ffdeeb03118 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
        RAX: 0000000000000000 RBX: 0000556a60c62970 RCX: 00007f1784f90827
        RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000556a60c62b50
        RBP: 0000000000000000 R08: 0000000000000005 R09: 00000000ffffffff
        R10: 0000556a60c63900 R11: 0000000000000246 R12: 0000556a60c62b50
        R13: 00007f17854a81c4 R14: 0000000000000000 R15: 0000000000000000
        RIP: free_log_tree+0xd2/0x140 [btrfs] RSP: ffffa7cf40d7bd10
        CR2: 0000000000000060
      
      Fixes: 681ae509 ("Btrfs: cleanup reserved space when freeing tree log on error")
      CC: <stable@vger.kernel.org> # v3.13
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1bd7112b