1. 21 Nov, 2018 40 commits
    • Arnd Bergmann's avatar
      mtd: docg3: don't set conflicting BCH_CONST_PARAMS option · c7d2166d
      Arnd Bergmann authored
      commit be2e1c9d upstream.
      
      I noticed during the creation of another bugfix that the BCH_CONST_PARAMS
      option that is set by DOCG3 breaks setting variable parameters for any
      other users of the BCH library code.
      
      The only other user we have today is the MTD_NAND software BCH
      implementation (most flash controllers use hardware BCH these days
      and are not affected). I considered removing BCH_CONST_PARAMS entirely
      because of the inherent conflict, but according to the description in
      lib/bch.c there is a significant performance benefit in keeping it.
      
      To avoid the immediate problem of the conflict between MTD_NAND_BCH
      and DOCG3, this only sets the constant parameters if MTD_NAND_BCH
      is disabled, which should fix the problem for all cases that
      are affected. This should also work for all stable kernels.
      
      Note that there is only one machine that actually seems to use the
      DOCG3 driver (arch/arm/mach-pxa/mioa701.c), so most users should have
      the driver disabled, but it almost certainly shows up if we wanted
      to test random kernels on machines that use software BCH in MTD.
      
      Fixes: d13d19ec ("mtd: docg3: add ECC correction code")
      Cc: stable@vger.kernel.org
      Cc: Robert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7d2166d
    • Andrea Arcangeli's avatar
      mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings · 877813e0
      Andrea Arcangeli authored
      commit ac5b2c18 upstream.
      
      THP allocation might be really disruptive when allocated on NUMA system
      with the local node full or hard to reclaim.  Stefan has posted an
      allocation stall report on 4.12 based SLES kernel which suggests the
      same issue:
      
        kvm: page allocation stalls for 194572ms, order:9, mode:0x4740ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
        kvm cpuset=/ mems_allowed=0-1
        CPU: 10 PID: 84752 Comm: kvm Tainted: G        W 4.12.0+98-ph <a href="/view.php?id=1" title="[geschlossen] Integration Ramdisk" class="resolved">0000001</a> SLE15 (unreleased)
        Hardware name: Supermicro SYS-1029P-WTRT/X11DDW-NT, BIOS 2.0 12/05/2017
        Call Trace:
         dump_stack+0x5c/0x84
         warn_alloc+0xe0/0x180
         __alloc_pages_slowpath+0x820/0xc90
         __alloc_pages_nodemask+0x1cc/0x210
         alloc_pages_vma+0x1e5/0x280
         do_huge_pmd_wp_page+0x83f/0xf00
         __handle_mm_fault+0x93d/0x1060
         handle_mm_fault+0xc6/0x1b0
         __do_page_fault+0x230/0x430
         do_page_fault+0x2a/0x70
         page_fault+0x7b/0x80
         [...]
        Mem-Info:
        active_anon:126315487 inactive_anon:1612476 isolated_anon:5
         active_file:60183 inactive_file:245285 isolated_file:0
         unevictable:15657 dirty:286 writeback:1 unstable:0
         slab_reclaimable:75543 slab_unreclaimable:2509111
         mapped:81814 shmem:31764 pagetables:370616 bounce:0
         free:32294031 free_pcp:6233 free_cma:0
        Node 0 active_anon:254680388kB inactive_anon:1112760kB active_file:240648kB inactive_file:981168kB unevictable:13368kB isolated(anon):0kB isolated(file):0kB mapped:280240kB dirty:1144kB writeback:0kB shmem:95832kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 81225728kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
        Node 1 active_anon:250583072kB inactive_anon:5337144kB active_file:84kB inactive_file:0kB unevictable:49260kB isolated(anon):20kB isolated(file):0kB mapped:47016kB dirty:0kB writeback:4kB shmem:31224kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 31897600kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
      
      The defrag mode is "madvise" and from the above report it is clear that
      the THP has been allocated for MADV_HUGEPAGA vma.
      
      Andrea has identified that the main source of the problem is
      __GFP_THISNODE usage:
      
      : The problem is that direct compaction combined with the NUMA
      : __GFP_THISNODE logic in mempolicy.c is telling reclaim to swap very
      : hard the local node, instead of failing the allocation if there's no
      : THP available in the local node.
      :
      : Such logic was ok until __GFP_THISNODE was added to the THP allocation
      : path even with MPOL_DEFAULT.
      :
      : The idea behind the __GFP_THISNODE addition, is that it is better to
      : provide local memory in PAGE_SIZE units than to use remote NUMA THP
      : backed memory. That largely depends on the remote latency though, on
      : threadrippers for example the overhead is relatively low in my
      : experience.
      :
      : The combination of __GFP_THISNODE and __GFP_DIRECT_RECLAIM results in
      : extremely slow qemu startup with vfio, if the VM is larger than the
      : size of one host NUMA node. This is because it will try very hard to
      : unsuccessfully swapout get_user_pages pinned pages as result of the
      : __GFP_THISNODE being set, instead of falling back to PAGE_SIZE
      : allocations and instead of trying to allocate THP on other nodes (it
      : would be even worse without vfio type1 GUP pins of course, except it'd
      : be swapping heavily instead).
      
      Fix this by removing __GFP_THISNODE for THP requests which are
      requesting the direct reclaim.  This effectivelly reverts 5265047a
      on the grounds that the zone/node reclaim was known to be disruptive due
      to premature reclaim when there was memory free.  While it made sense at
      the time for HPC workloads without NUMA awareness on rare machines, it
      was ultimately harmful in the majority of cases.  The existing behaviour
      is similar, if not as widespare as it applies to a corner case but
      crucially, it cannot be tuned around like zone_reclaim_mode can.  The
      default behaviour should always be to cause the least harm for the
      common case.
      
      If there are specialised use cases out there that want zone_reclaim_mode
      in specific cases, then it can be built on top.  Longterm we should
      consider a memory policy which allows for the node reclaim like behavior
      for the specific memory ranges which would allow a
      
      [1] http://lkml.kernel.org/r/20180820032204.9591-1-aarcange@redhat.com
      
      Mel said:
      
      : Both patches look correct to me but I'm responding to this one because
      : it's the fix.  The change makes sense and moves further away from the
      : severe stalling behaviour we used to see with both THP and zone reclaim
      : mode.
      :
      : I put together a basic experiment with usemem configured to reference a
      : buffer multiple times that is 80% the size of main memory on a 2-socket
      : box with symmetric node sizes and defrag set to "always".  The defrag
      : setting is not the default but it would be functionally similar to
      : accessing a buffer with madvise(MADV_HUGEPAGE).  Usemem is configured to
      : reference the buffer multiple times and while it's not an interesting
      : workload, it would be expected to complete reasonably quickly as it fits
      : within memory.  The results were;
      :
      : usemem
      :                                   vanilla           noreclaim-v1
      : Amean     Elapsd-1       42.78 (   0.00%)       26.87 (  37.18%)
      : Amean     Elapsd-3       27.55 (   0.00%)        7.44 (  73.00%)
      : Amean     Elapsd-4        5.72 (   0.00%)        5.69 (   0.45%)
      :
      : This shows the elapsed time in seconds for 1 thread, 3 threads and 4
      : threads referencing buffers 80% the size of memory.  With the patches
      : applied, it's 37.18% faster for the single thread and 73% faster with two
      : threads.  Note that 4 threads showing little difference does not indicate
      : the problem is related to thread counts.  It's simply the case that 4
      : threads gets spread so their workload mostly fits in one node.
      :
      : The overall view from /proc/vmstats is more startling
      :
      :                          4.19.0-rc1  4.19.0-rc1
      :                             vanillanoreclaim-v1r1
      : Minor Faults               35593425      708164
      : Major Faults                 484088          36
      : Swap Ins                    3772837           0
      : Swap Outs                   3932295           0
      :
      : Massive amounts of swap in/out without the patch
      :
      : Direct pages scanned        6013214           0
      : Kswapd pages scanned              0           0
      : Kswapd pages reclaimed            0           0
      : Direct pages reclaimed      4033009           0
      :
      : Lots of reclaim activity without the patch
      :
      : Kswapd efficiency              100%        100%
      : Kswapd velocity               0.000       0.000
      : Direct efficiency               67%        100%
      : Direct velocity           11191.956       0.000
      :
      : Mostly from direct reclaim context as you'd expect without the patch.
      :
      : Page writes by reclaim  3932314.000       0.000
      : Page writes file                 19           0
      : Page writes anon            3932295           0
      : Page reclaim immediate        42336           0
      :
      : Writes from reclaim context is never good but the patch eliminates it.
      :
      : We should never have default behaviour to thrash the system for such a
      : basic workload.  If zone reclaim mode behaviour is ever desired but on a
      : single task instead of a global basis then the sensible option is to build
      : a mempolicy that enforces that behaviour.
      
      This was a severe regression compared to previous kernels that made
      important workloads unusable and it starts when __GFP_THISNODE was
      added to THP allocations under MADV_HUGEPAGE.  It is not a significant
      risk to go to the previous behavior before __GFP_THISNODE was added, it
      worked like that for years.
      
      This was simply an optimization to some lucky workloads that can fit in
      a single node, but it ended up breaking the VM for others that can't
      possibly fit in a single node, so going back is safe.
      
      [mhocko@suse.com: rewrote the changelog based on the one from Andrea]
      Link: http://lkml.kernel.org/r/20180925120326.24392-2-mhocko@kernel.org
      Fixes: 5265047a ("mm, thp: really limit transparent hugepage allocation to local node")
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reported-by: default avatarStefan Priebe <s.priebe@profihost.ag>
      Debugged-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Reported-by: default avatarAlex Williamson <alex.williamson@redhat.com>
      Reviewed-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Tested-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: <stable@vger.kernel.org>	[4.1+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      877813e0
    • Changwei Ge's avatar
      ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry · 298ed64f
      Changwei Ge authored
      commit 29aa3016 upstream.
      
      Somehow, file system metadata was corrupted, which causes
      ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().
      
      According to the original design intention, if above happens we should
      skip the problematic block and continue to retrieve dir entry.  But
      there is obviouse misuse of brelse around related code.
      
      After failure of ocfs2_check_dir_entry(), current code just moves to
      next position and uses the problematic buffer head again and again
      during which the problematic buffer head is released for multiple times.
      I suppose, this a serious issue which is long-lived in ocfs2.  This may
      cause other file systems which is also used in a the same host insane.
      
      So we should also consider about bakcporting this patch into linux
      -stable.
      
      Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.comSigned-off-by: default avatarChangwei Ge <ge.changwei@h3c.com>
      Suggested-by: default avatarChangkuo Shi <shi.changkuo@h3c.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      298ed64f
    • Greg Edwards's avatar
      vhost/scsi: truncate T10 PI iov_iter to prot_bytes · d2926beb
      Greg Edwards authored
      commit 4542d623 upstream.
      
      Commands with protection information included were not truncating the
      protection iov_iter to the number of protection bytes in the command.
      This resulted in vhost_scsi mis-calculating the size of the protection
      SGL in vhost_scsi_calc_sgls(), and including both the protection and
      data SG entries in the protection SGL.
      
      Fixes: 09b13fa8 ("vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq")
      Signed-off-by: default avatarGreg Edwards <gedwards@ddn.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Fixes: 09b13fa8
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d2926beb
    • Mikulas Patocka's avatar
      mach64: fix image corruption due to reading accelerator registers · 1ffd6317
      Mikulas Patocka authored
      commit c09bcc91 upstream.
      
      Reading the registers without waiting for engine idle returns
      unpredictable values. These unpredictable values result in display
      corruption - if atyfb_imageblit reads the content of DP_PIX_WIDTH with the
      bit DP_HOST_TRIPLE_EN set (from previous invocation), the driver would
      never ever clear the bit, resulting in display corruption.
      
      We don't want to wait for idle because it would degrade performance, so
      this patch modifies the driver so that it never reads accelerator
      registers.
      
      HOST_CNTL doesn't have to be read, we can just write it with
      HOST_BYTE_ALIGN because no other part of the driver cares if
      HOST_BYTE_ALIGN is set.
      
      DP_PIX_WIDTH is written in the functions atyfb_copyarea and atyfb_fillrect
      with the default value and in atyfb_imageblit with the value set according
      to the source image data.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Reviewed-by: default avatarVille Syrjälä <syrjala@sci.fi>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ffd6317
    • Mikulas Patocka's avatar
      mach64: fix display corruption on big endian machines · 5d4ce43c
      Mikulas Patocka authored
      commit 3c6c6a78 upstream.
      
      The code for manual bit triple is not endian-clean. It builds the variable
      "hostdword" using byte accesses, therefore we must read the variable with
      "le32_to_cpu".
      
      The patch also enables (hardware or software) bit triple only if the image
      is monochrome (image->depth). If we want to blit full-color image, we
      shouldn't use the triple code.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Reviewed-by: default avatarVille Syrjälä <syrjala@sci.fi>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d4ce43c
    • Ilya Dryomov's avatar
      libceph: bump CEPH_MSG_MAX_DATA_LEN · f29b6655
      Ilya Dryomov authored
      commit 94e6992b upstream.
      
      If the read is large enough, we end up spinning in the messenger:
      
        libceph: osd0 192.168.122.1:6801 io error
        libceph: osd0 192.168.122.1:6801 io error
        libceph: osd0 192.168.122.1:6801 io error
      
      This is a receive side limit, so only reads were affected.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f29b6655
    • Krzysztof Kozlowski's avatar
      clk: s2mps11: Fix matching when built as module and DT node contains compatible · c2fc59dc
      Krzysztof Kozlowski authored
      commit 8985167e upstream.
      
      When driver is built as module and DT node contains clocks compatible
      (e.g. "samsung,s2mps11-clk"), the module will not be autoloaded because
      module aliases won't match.
      
      The modalias from uevent: of:NclocksT<NULL>Csamsung,s2mps11-clk
      The modalias from driver: platform:s2mps11-clk
      
      The devices are instantiated by parent's MFD.  However both Device Tree
      bindings and parent define the compatible for clocks devices.  In case
      of module matching this DT compatible will be used.
      
      The issue will not happen if this is a built-in (no need for module
      matching) or when clocks DT node does not contain compatible (not
      correct from bindings perspective but working for driver).
      
      Note when backporting to stable kernels: adjust the list of device ID
      entries.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 53c31b34 ("mfd: sec-core: Add of_compatible strings for clock MFD cells")
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Acked-by: default avatarStephen Boyd <sboyd@kernel.org>
      Signed-off-by: default avatarStephen Boyd <sboyd@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c2fc59dc
    • Max Filippov's avatar
      xtensa: fix boot parameters address translation · bf6ae42c
      Max Filippov authored
      commit 40dc948f upstream.
      
      The bootloader may pass physical address of the boot parameters structure
      to the MMUv3 kernel in the register a2. Code in the _SetupMMU block in
      the arch/xtensa/kernel/head.S is supposed to map that physical address to
      the virtual address in the configured virtual memory layout.
      
      This code haven't been updated when additional 256+256 and 512+512
      memory layouts were introduced and it may produce wrong addresses when
      used with these layouts.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bf6ae42c
    • Max Filippov's avatar
      xtensa: make sure bFLT stack is 16 byte aligned · fdc26b9d
      Max Filippov authored
      commit 0773495b upstream.
      
      Xtensa ABI requires stack alignment to be at least 16. In noMMU
      configuration ARCH_SLAB_MINALIGN is used to align stack. Make it at
      least 16.
      
      This fixes the following runtime error in noMMU configuration, caused by
      interaction between insufficiently aligned stack and alloca function,
      that results in corruption of on-stack variable in the libc function
      glob:
      
       Caught unhandled exception in 'sh' (pid = 47, pc = 0x02d05d65)
        - should not happen
        EXCCAUSE is 15
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fdc26b9d
    • Max Filippov's avatar
      xtensa: add NOTES section to the linker script · f9d23644
      Max Filippov authored
      commit 4119ba21 upstream.
      
      This section collects all source .note.* sections together in the
      vmlinux image. Without it .note.Linux section may be placed at address
      0, while the rest of the kernel is at its normal address, resulting in a
      huge vmlinux.bin image that may not be linked into the xtensa Image.elf.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f9d23644
    • Huacai Chen's avatar
      MIPS: Loongson-3: Fix BRIDGE irq delivery problem · 4353f89b
      Huacai Chen authored
      [ Upstream commit 360fe725 ]
      
      After commit e509bd7d ("genirq: Allow migration of chained
      interrupts by installing default action") Loongson-3 fails at here:
      
      setup_irq(LOONGSON_HT1_IRQ, &cascade_irqaction);
      
      This is because both chained_action and cascade_irqaction don't have
      IRQF_SHARED flag. This will cause Loongson-3 resume fails because HPET
      timer interrupt can't be delivered during S3. So we set the irqchip of
      the chained irq to loongson_irq_chip which doesn't disable the chained
      irq in CP0.Status.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHuacai Chen <chenhc@lemote.com>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Patchwork: https://patchwork.linux-mips.org/patch/20434/
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Cc: Huacai Chen <chenhuacai@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4353f89b
    • Huacai Chen's avatar
      MIPS: Loongson-3: Fix CPU UART irq delivery problem · 09b61caa
      Huacai Chen authored
      [ Upstream commit d06f8a2f ]
      
      Masking/unmasking the CPU UART irq in CP0_Status (and redirecting it to
      other CPUs) may cause interrupts be lost, especially in multi-package
      machines (Package-0's UART irq cannot be delivered to others). So make
      mask_loongson_irq() and unmask_loongson_irq() be no-ops.
      
      The original problem (UART IRQ may deliver to any core) is also because
      of masking/unmasking the CPU UART irq in CP0_Status. So it is safe to
      remove all of the stuff.
      Signed-off-by: default avatarHuacai Chen <chenhc@lemote.com>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Patchwork: https://patchwork.linux-mips.org/patch/20433/
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Cc: Huacai Chen <chenhuacai@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      09b61caa
    • Kees Cook's avatar
      bna: ethtool: Avoid reading past end of buffer · e1993df1
      Kees Cook authored
      [ Upstream commit 4dc69c1c ]
      
      Using memcpy() from a string that is shorter than the length copied means
      the destination buffer is being filled with arbitrary data from the kernel
      rodata segment. Instead, use strncpy() which will fill the trailing bytes
      with zeros.
      
      This was found with the future CONFIG_FORTIFY_SOURCE feature.
      
      Cc: Daniel Micay <danielmicay@gmail.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e1993df1
    • Vincenzo Maffione's avatar
      e1000: fix race condition between e1000_down() and e1000_watchdog · 9b574c3d
      Vincenzo Maffione authored
      [ Upstream commit 44c445c3 ]
      
      This patch fixes a race condition that can result into the interface being
      up and carrier on, but with transmits disabled in the hardware.
      The bug may show up by repeatedly IFF_DOWN+IFF_UP the interface, which
      allows e1000_watchdog() interleave with e1000_down().
      
          CPU x                           CPU y
          --------------------------------------------------------------------
          e1000_down():
              netif_carrier_off()
                                          e1000_watchdog():
                                              if (carrier == off) {
                                                  netif_carrier_on();
                                                  enable_hw_transmit();
                                              }
              disable_hw_transmit();
                                          e1000_watchdog():
                                              /* carrier on, do nothing */
      Signed-off-by: default avatarVincenzo Maffione <v.maffione@gmail.com>
      Tested-by: default avatarAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9b574c3d
    • Colin Ian King's avatar
      e1000: avoid null pointer dereference on invalid stat type · 5637fad6
      Colin Ian King authored
      [ Upstream commit 5983587c ]
      
      Currently if the stat type is invalid then data[i] is being set
      either by dereferencing a null pointer p, or it is reading from
      an incorrect previous location if we had a valid stat type
      previously.  Fix this by skipping over the read of p on an invalid
      stat type.
      
      Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Reviewed-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Tested-by: default avatarAaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5637fad6
    • Michal Hocko's avatar
      mm: do not bug_on on incorrect length in __mm_populate() · 6f0cb0e3
      Michal Hocko authored
      commit bb177a73 upstream.
      
      syzbot has noticed that a specially crafted library can easily hit
      VM_BUG_ON in __mm_populate
      
        kernel BUG at mm/gup.c:1242!
        invalid opcode: 0000 [#1] SMP
        CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
        Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
        RIP: 0010:__mm_populate+0x1e2/0x1f0
        Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
        Call Trace:
           vm_brk_flags+0xc3/0x100
           vm_brk+0x1f/0x30
           load_elf_library+0x281/0x2e0
           __ia32_sys_uselib+0x170/0x1e0
           do_fast_syscall_32+0xca/0x420
           entry_SYSENTER_compat+0x70/0x7f
      
      The reason is that the length of the new brk is not page aligned when we
      try to populate the it.  There is no reason to bug on that though.
      do_brk_flags already aligns the length properly so the mapping is
      expanded as it should.  All we need is to tell mm_populate about it.
      Besides that there is absolutely no reason to to bug_on in the first
      place.  The worst thing that could happen is that the last page wouldn't
      get populated and that is far from putting system into an inconsistent
      state.
      
      Fix the issue by moving the length sanitization code from do_brk_flags
      up to vm_brk_flags.  The only other caller of do_brk_flags is brk
      syscall entry and it makes sure to provide the proper length so t here
      is no need for sanitation and so we can use do_brk_flags without it.
      
      Also remove the bogus BUG_ONs.
      
      [osalvador@techadventures.net: fix up vm_brk_flags s@request@len@]
      Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.czSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reported-by: default avatarsyzbot <syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com>
      Tested-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: default avatarOscar Salvador <osalvador@suse.de>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 4.4:
       - There is no do_brk_flags() function; update do_brk()
       - do_brk(), vm_brk() return the address on success
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6f0cb0e3
    • Oscar Salvador's avatar
      fs, elf: make sure to page align bss in load_elf_library · a0b71d1b
      Oscar Salvador authored
      commit 24962af7 upstream.
      
      The current code does not make sure to page align bss before calling
      vm_brk(), and this can lead to a VM_BUG_ON() in __mm_populate() due to
      the requested lenght not being correctly aligned.
      
      Let us make sure to align it properly.
      
      Kees: only applicable to CONFIG_USELIB kernels: 32-bit and configured
      for libc5.
      
      Link: http://lkml.kernel.org/r/20180705145539.9627-1-osalvador@techadventures.netSigned-off-by: default avatarOscar Salvador <osalvador@suse.de>
      Reported-by: syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com
      Tested-by: default avatarTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a0b71d1b
    • Kees Cook's avatar
      mm: refuse wrapped vm_brk requests · 2c69d1f0
      Kees Cook authored
      commit ba093a6d upstream.
      
      The vm_brk() alignment calculations should refuse to overflow.  The ELF
      loader depending on this, but it has been fixed now.  No other unsafe
      callers have been found.
      
      Link: http://lkml.kernel.org/r/1468014494-25291-3-git-send-email-keescook@chromium.orgSigned-off-by: default avatarKees Cook <keescook@chromium.org>
      Reported-by: default avatarHector Marco-Gisbert <hecmargi@upv.es>
      Cc: Ismael Ripoll Ripoll <iripoll@upv.es>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Chen Gang <gang.chen.5i5j@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 4.4: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2c69d1f0
    • Kees Cook's avatar
      binfmt_elf: fix calculations for bss padding · 4a0c08d7
      Kees Cook authored
      commit 0036d1f7 upstream.
      
      A double-bug exists in the bss calculation code, where an overflow can
      happen in the "last_bss - elf_bss" calculation, but vm_brk internally
      aligns the argument, underflowing it, wrapping back around safe.  We
      shouldn't depend on these bugs staying in sync, so this cleans up the
      bss padding handling to avoid the overflow.
      
      This moves the bss padzero() before the last_bss > elf_bss case, since
      the zero-filling of the ELF_PAGE should have nothing to do with the
      relationship of last_bss and elf_bss: any trailing portion should be
      zeroed, and a zero size is already handled by padzero().
      
      Then it handles the math on elf_bss vs last_bss correctly.  These need
      to both be ELF_PAGE aligned to get the comparison correct, since that's
      the expected granularity of the mappings.  Since elf_bss already had
      alignment-based padding happen in padzero(), the "start" of the new
      vm_brk() should be moved forward as done in the original code.  However,
      since the "end" of the vm_brk() area will already become PAGE_ALIGNed in
      vm_brk() then last_bss should get aligned here to avoid hiding it as a
      side-effect.
      
      Additionally makes a cosmetic change to the initial last_bss calculation
      so it's easier to read in comparison to the load_addr calculation above
      it (i.e.  the only difference is p_filesz vs p_memsz).
      
      Link: http://lkml.kernel.org/r/1468014494-25291-2-git-send-email-keescook@chromium.orgSigned-off-by: default avatarKees Cook <keescook@chromium.org>
      Reported-by: default avatarHector Marco-Gisbert <hecmargi@upv.es>
      Cc: Ismael Ripoll Ripoll <iripoll@upv.es>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Chen Gang <gang.chen.5i5j@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4a0c08d7
    • Michal Hocko's avatar
      mm, elf: handle vm_brk error · cac93390
      Michal Hocko authored
      commit ecc2bc8a upstream.
      
      load_elf_library doesn't handle vm_brk failure although nothing really
      indicates it cannot do that because the function is allowed to fail due
      to vm_mmap failures already.  This might be not a problem now but later
      patch will make vm_brk killable (resp.  mmap_sem for write waiting will
      become killable) and so the failure will be more probable.
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      cac93390
    • Miklos Szeredi's avatar
      fuse: set FR_SENT while locked · f04651b9
      Miklos Szeredi authored
      commit 4c316f2f upstream.
      
      Otherwise fuse_dev_do_write() could come in and finish off the request, and
      the set_bit(FR_SENT, ...) could trigger the WARN_ON(test_bit(FR_SENT, ...))
      in request_end().
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Reported-by: syzbot+ef054c4d3f64cd7f7cec@syzkaller.appspotmai
      Fixes: 46c34a34 ("fuse: no fc->lock for pqueue parts")
      Cc: <stable@vger.kernel.org> # v4.2
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f04651b9
    • Miklos Szeredi's avatar
      fuse: fix blocked_waitq wakeup · 2fe23468
      Miklos Szeredi authored
      commit 908a572b upstream.
      
      Using waitqueue_active() is racy.  Make sure we issue a wake_up()
      unconditionally after storing into fc->blocked.  After that it's okay to
      optimize with waitqueue_active() since the first wake up provides the
      necessary barrier for all waiters, not the just the woken one.
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 3c18ef81 ("fuse: optimize wake_up")
      Cc: <stable@vger.kernel.org> # v3.10
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2fe23468
    • Kirill Tkhai's avatar
      fuse: Fix use-after-free in fuse_dev_do_write() · 8bb4354a
      Kirill Tkhai authored
      commit d2d2d4fb upstream.
      
      After we found req in request_find() and released the lock,
      everything may happen with the req in parallel:
      
      cpu0                              cpu1
      fuse_dev_do_write()               fuse_dev_do_write()
        req = request_find(fpq, ...)    ...
        spin_unlock(&fpq->lock)         ...
        ...                             req = request_find(fpq, oh.unique)
        ...                             spin_unlock(&fpq->lock)
        queue_interrupt(&fc->iq, req);   ...
        ...                              ...
        ...                              ...
        request_end(fc, req);
          fuse_put_request(fc, req);
        ...                              queue_interrupt(&fc->iq, req);
      Signed-off-by: default avatarKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 46c34a34 ("fuse: no fc->lock for pqueue parts")
      Cc: <stable@vger.kernel.org> # v4.2
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8bb4354a
    • Kirill Tkhai's avatar
      fuse: Fix use-after-free in fuse_dev_do_read() · 7574afe0
      Kirill Tkhai authored
      commit bc78abbd upstream.
      
      We may pick freed req in this way:
      
      [cpu0]                                  [cpu1]
      fuse_dev_do_read()                      fuse_dev_do_write()
         list_move_tail(&req->list, ...);     ...
         spin_unlock(&fpq->lock);             ...
         ...                                  request_end(fc, req);
         ...                                    fuse_put_request(fc, req);
         if (test_bit(FR_INTERRUPTED, ...))
               queue_interrupt(fiq, req);
      
      Fix that by keeping req alive until we finish all manipulations.
      
      Reported-by: syzbot+4e975615ca01f2277bdd@syzkaller.appspotmail.com
      Signed-off-by: default avatarKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 46c34a34 ("fuse: no fc->lock for pqueue parts")
      Cc: <stable@vger.kernel.org> # v4.2
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7574afe0
    • Himanshu Madhani's avatar
      scsi: qla2xxx: Fix incorrect port speed being set for FC adapters · e304bf93
      Himanshu Madhani authored
      commit 4c1458df upstream.
      
      Fixes: 6246b8a1 ("[SCSI] qla2xxx: Enhancements to support ISP83xx.")
      Fixes: 1bb39548 ("qla2xxx: Correct iiDMA-update calling conventions.")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarHimanshu Madhani <himanshu.madhani@cavium.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e304bf93
    • Young_X's avatar
      cdrom: fix improper type cast, which can leat to information leak. · 661aa0b4
      Young_X authored
      commit e4f3aa2e upstream.
      
      There is another cast from unsigned long to int which causes
      a bounds check to fail with specially crafted input. The value is
      then used as an index in the slot array in cdrom_slot_status().
      
      This issue is similar to CVE-2018-16658 and CVE-2018-10940.
      Signed-off-by: default avatarYoung_X <YangX92@hotmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      661aa0b4
    • Dominique Martinet's avatar
      9p: clear dangling pointers in p9stat_free · 1c6c5d0c
      Dominique Martinet authored
      [ Upstream commit 62e39417 ]
      
      p9stat_free is more of a cleanup function than a 'free' function as it
      only frees the content of the struct; there are chances of use-after-free
      if it is improperly used (e.g. p9stat_free called twice as it used to be
      possible to)
      
      Clearing dangling pointers makes the function idempotent and safer to use.
      
      Link: http://lkml.kernel.org/r/1535410108-20650-2-git-send-email-asmadeus@codewreck.orgSigned-off-by: default avatarDominique Martinet <dominique.martinet@cea.fr>
      Reported-by: syzbot+d4252148d198410b864f@syzkaller.appspotmail.com
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1c6c5d0c
    • Dominique Martinet's avatar
      9p locks: fix glock.client_id leak in do_lock · 55e46496
      Dominique Martinet authored
      [ Upstream commit b4dc44b3 ]
      
      the 9p client code overwrites our glock.client_id pointing to a static
      buffer by an allocated string holding the network provided value which
      we do not care about; free and reset the value as appropriate.
      
      This is almost identical to the leak in v9fs_file_getlock() fixed by
      Al Viro in commit ce85dd58 ("9p: we are leaking glock.client_id
      in v9fs_file_getlock()"), which was returned as an error by a coverity
      false positive -- while we are here attempt to make the code slightly
      more robust to future change of the net/9p/client code and hopefully
      more clear to coverity that there is no problem.
      
      Link: http://lkml.kernel.org/r/1536339057-21974-5-git-send-email-asmadeus@codewreck.orgSigned-off-by: default avatarDominique Martinet <dominique.martinet@cea.fr>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      55e46496
    • Marco Felsch's avatar
      media: tvp5150: fix width alignment during set_selection() · 155bd1c4
      Marco Felsch authored
      [ Upstream commit bd24db04 ]
      
      The driver ignored the width alignment which exists due to the UYVY
      colorspace format. Fix the width alignment and make use of the the
      provided v4l2 helper function to set the width, height and all
      alignments in one.
      
      Fixes: 963ddc63 ("[media] media: tvp5150: Add cropping support")
      Signed-off-by: default avatarMarco Felsch <m.felsch@pengutronix.de>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      155bd1c4
    • Phil Elwell's avatar
      sc16is7xx: Fix for multi-channel stall · 06bbe238
      Phil Elwell authored
      [ Upstream commit 83444987 ]
      
      The SC16IS752 is a dual-channel device. The two channels are largely
      independent, but the IRQ signals are wired together as an open-drain,
      active low signal which will be driven low while either of the
      channels requires attention, which can be for significant periods of
      time until operations complete and the interrupt can be acknowledged.
      In that respect it is should be treated as a true level-sensitive IRQ.
      
      The kernel, however, needs to be able to exit interrupt context in
      order to use I2C or SPI to access the device registers (which may
      involve sleeping).  Therefore the interrupt needs to be masked out or
      paused in some way.
      
      The usual way to manage sleeping from within an interrupt handler
      is to use a threaded interrupt handler - a regular interrupt routine
      does the minimum amount of work needed to triage the interrupt before
      waking the interrupt service thread. If the threaded IRQ is marked as
      IRQF_ONESHOT the kernel will automatically mask out the interrupt
      until the thread runs to completion. The sc16is7xx driver used to
      use a threaded IRQ, but a patch switched to using a kthread_worker
      in order to set realtime priorities on the handler thread and for
      other optimisations. The end result is non-threaded IRQ that
      schedules some work then returns IRQ_HANDLED, making the kernel
      think that all IRQ processing has completed.
      
      The work-around to prevent a constant stream of interrupts is to
      mark the interrupt as edge-sensitive rather than level-sensitive,
      but interpreting an active-low source as a falling-edge source
      requires care to prevent a total cessation of interrupts. Whereas
      an edge-triggering source will generate a new edge for every interrupt
      condition a level-triggering source will keep the signal at the
      interrupting level until it no longer requires attention; in other
      words, the host won't see another edge until all interrupt conditions
      are cleared. It is therefore vital that the interrupt handler does not
      exit with an outstanding interrupt condition, otherwise the kernel
      will not receive another interrupt unless some other operation causes
      the interrupt state on the device to be cleared.
      
      The existing sc16is7xx driver has a very simple interrupt "thread"
      (kthread_work job) that processes interrupts on each channel in turn
      until there are no more. If both channels are active and the first
      channel starts interrupting while the handler for the second channel
      is running then it will not be detected and an IRQ stall ensues. This
      could be handled easily if there was a shared IRQ status register, or
      a convenient way to determine if the IRQ had been deasserted for any
      length of time, but both appear to be lacking.
      
      Avoid this problem (or at least make it much less likely to happen)
      by reducing the granularity of per-channel interrupt processing
      to one condition per iteration, only exiting the overall loop when
      both channels are no longer interrupting.
      Signed-off-by: default avatarPhil Elwell <phil@raspberrypi.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      06bbe238
    • Joel Stanley's avatar
      powerpc/boot: Ensure _zimage_start is a weak symbol · c269f0b6
      Joel Stanley authored
      [ Upstream commit ee9d21b3 ]
      
      When building with clang crt0's _zimage_start is not marked weak, which
      breaks the build when linking the kernel image:
      
       $ objdump -t arch/powerpc/boot/crt0.o |grep _zimage_start$
       0000000000000058 g       .text  0000000000000000 _zimage_start
      
       ld: arch/powerpc/boot/wrapper.a(crt0.o): in function '_zimage_start':
       (.text+0x58): multiple definition of '_zimage_start';
       arch/powerpc/boot/pseries-head.o:(.text+0x0): first defined here
      
      Clang requires the .weak directive to appear after the symbol is
      declared. The binutils manual says:
      
       This directive sets the weak attribute on the comma separated list of
       symbol names. If the symbols do not already exist, they will be
       created.
      
      So it appears this is different with clang. The only reference I could
      see for this was an OpenBSD mailing list post[1].
      
      Changing it to be after the declaration fixes building with Clang, and
      still works with GCC.
      
       $ objdump -t arch/powerpc/boot/crt0.o |grep _zimage_start$
       0000000000000058  w      .text	0000000000000000 _zimage_start
      
      Reported to clang as https://bugs.llvm.org/show_bug.cgi?id=38921
      
      [1] https://groups.google.com/forum/#!topic/fa.openbsd.tech/PAgKKen2YCYSigned-off-by: default avatarJoel Stanley <joel@jms.id.au>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c269f0b6
    • Dengcheng Zhu's avatar
      MIPS: kexec: Mark CPU offline before disabling local IRQ · c5e2c6f9
      Dengcheng Zhu authored
      [ Upstream commit dc57aaf9 ]
      
      After changing CPU online status, it will not be sent any IPIs such as in
      __flush_cache_all() on software coherency systems. Do this before disabling
      local IRQ.
      Signed-off-by: default avatarDengcheng Zhu <dzhu@wavecomp.com>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Patchwork: https://patchwork.linux-mips.org/patch/20571/
      Cc: pburton@wavecomp.com
      Cc: ralf@linux-mips.org
      Cc: linux-mips@linux-mips.org
      Cc: rachel.mozes@intel.com
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5e2c6f9
    • Nicholas Mc Guire's avatar
      media: pci: cx23885: handle adding to list failure · 1664dd1f
      Nicholas Mc Guire authored
      [ Upstream commit c5d59528 ]
      
      altera_hw_filt_init() which calls append_internal() assumes
      that the node was successfully linked in while in fact it can
      silently fail. So the call-site needs to set return to -ENOMEM
      on append_internal() returning NULL and exit through the err path.
      
      Fixes: 349bcf02 ("[media] Altera FPGA based CI driver module")
      Signed-off-by: default avatarNicholas Mc Guire <hofrat@osadl.org>
      Signed-off-by: default avatarHans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1664dd1f
    • Tomi Valkeinen's avatar
      drm/omap: fix memory barrier bug in DMM driver · f79cc6a2
      Tomi Valkeinen authored
      [ Upstream commit 538f66ba ]
      
      A DMM timeout "timed out waiting for done" has been observed on DRA7
      devices. The timeout happens rarely, and only when the system is under
      heavy load.
      
      Debugging showed that the timeout can be made to happen much more
      frequently by optimizing the DMM driver, so that there's almost no code
      between writing the last DMM descriptors to RAM, and writing to DMM
      register which starts the DMM transaction.
      
      The current theory is that a wmb() does not properly ensure that the
      data written to RAM is observable by all the components in the system.
      
      This DMM timeout has caused interesting (and rare) bugs as the error
      handling was not functioning properly (the error handling has been fixed
      in previous commits):
      
       * If a DMM timeout happened when a GEM buffer was being pinned for
         display on the screen, a timeout error would be shown, but the driver
         would continue programming DSS HW with broken buffer, leading to
         SYNCLOST floods and possible crashes.
      
       * If a DMM timeout happened when other user (say, video decoder) was
         pinning a GEM buffer, a timeout would be shown but if the user
         handled the error properly, no other issues followed.
      
       * If a DMM timeout happened when a GEM buffer was being released, the
         driver does not even notice the error, leading to crashes or hang
         later.
      
      This patch adds wmb() and readl() calls after the last bit is written to
      RAM, which should ensure that the execution proceeds only after the data
      is actually in RAM, and thus observable by DMM.
      
      The read-back should not be needed. Further study is required to understand
      if DMM is somehow special case and read-back is ok, or if DRA7's memory
      barriers do not work correctly.
      Signed-off-by: default avatarTomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: default avatarPeter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f79cc6a2
    • Daniel Axtens's avatar
      powerpc/nohash: fix undefined behaviour when testing page size support · 1e8628eb
      Daniel Axtens authored
      [ Upstream commit f5e28480 ]
      
      When enumerating page size definitions to check hardware support,
      we construct a constant which is (1U << (def->shift - 10)).
      
      However, the array of page size definitions is only initalised for
      various MMU_PAGE_* constants, so it contains a number of 0-initialised
      elements with def->shift == 0. This means we end up shifting by a
      very large number, which gives the following UBSan splat:
      
      ================================================================================
      UBSAN: Undefined behaviour in /home/dja/dev/linux/linux/arch/powerpc/mm/tlb_nohash.c:506:21
      shift exponent 4294967286 is too large for 32-bit type 'unsigned int'
      CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc3-00045-ga604f927b012-dirty #6
      Call Trace:
      [c00000000101bc20] [c000000000a13d54] .dump_stack+0xa8/0xec (unreliable)
      [c00000000101bcb0] [c0000000004f20a8] .ubsan_epilogue+0x18/0x64
      [c00000000101bd30] [c0000000004f2b10] .__ubsan_handle_shift_out_of_bounds+0x110/0x1a4
      [c00000000101be20] [c000000000d21760] .early_init_mmu+0x1b4/0x5a0
      [c00000000101bf10] [c000000000d1ba28] .early_setup+0x100/0x130
      [c00000000101bf90] [c000000000000528] start_here_multiplatform+0x68/0x80
      ================================================================================
      
      Fix this by first checking if the element exists (shift != 0) before
      constructing the constant.
      Signed-off-by: default avatarDaniel Axtens <dja@axtens.net>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e8628eb
    • Miles Chen's avatar
      tty: check name length in tty_find_polling_driver() · 7c7378d4
      Miles Chen authored
      [ Upstream commit 33a1a7be ]
      
      The issue is found by a fuzzing test.
      If tty_find_polling_driver() recevies an incorrect input such as
      ',,' or '0b', the len becomes 0 and strncmp() always return 0.
      In this case, a null p->ops->poll_init() is called and it causes a kernel
      panic.
      
      Fix this by checking name length against zero in tty_find_polling_driver().
      
      $echo ,, > /sys/module/kgdboc/parameters/kgdboc
      [   20.804451] WARNING: CPU: 1 PID: 104 at drivers/tty/serial/serial_core.c:457
      uart_get_baud_rate+0xe8/0x190
      [   20.804917] Modules linked in:
      [   20.805317] CPU: 1 PID: 104 Comm: sh Not tainted 4.19.0-rc7ajb #8
      [   20.805469] Hardware name: linux,dummy-virt (DT)
      [   20.805732] pstate: 20000005 (nzCv daif -PAN -UAO)
      [   20.805895] pc : uart_get_baud_rate+0xe8/0x190
      [   20.806042] lr : uart_get_baud_rate+0xc0/0x190
      [   20.806476] sp : ffffffc06acff940
      [   20.806676] x29: ffffffc06acff940 x28: 0000000000002580
      [   20.806977] x27: 0000000000009600 x26: 0000000000009600
      [   20.807231] x25: ffffffc06acffad0 x24: 00000000ffffeff0
      [   20.807576] x23: 0000000000000001 x22: 0000000000000000
      [   20.807807] x21: 0000000000000001 x20: 0000000000000000
      [   20.808049] x19: ffffffc06acffac8 x18: 0000000000000000
      [   20.808277] x17: 0000000000000000 x16: 0000000000000000
      [   20.808520] x15: ffffffffffffffff x14: ffffffff00000000
      [   20.808757] x13: ffffffffffffffff x12: 0000000000000001
      [   20.809011] x11: 0101010101010101 x10: ffffff880d59ff5f
      [   20.809292] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3
      [   20.809549] x7 : 0000000000000000 x6 : ffffff880d59ff5f
      [   20.809803] x5 : 0000000080008001 x4 : 0000000000000003
      [   20.810056] x3 : ffffff900853e6b4 x2 : dfffff9000000000
      [   20.810693] x1 : ffffffc06acffad0 x0 : 0000000000000cb0
      [   20.811005] Call trace:
      [   20.811214]  uart_get_baud_rate+0xe8/0x190
      [   20.811479]  serial8250_do_set_termios+0xe0/0x6f4
      [   20.811719]  serial8250_set_termios+0x48/0x54
      [   20.811928]  uart_set_options+0x138/0x1bc
      [   20.812129]  uart_poll_init+0x114/0x16c
      [   20.812330]  tty_find_polling_driver+0x158/0x200
      [   20.812545]  configure_kgdboc+0xbc/0x1bc
      [   20.812745]  param_set_kgdboc_var+0xb8/0x150
      [   20.812960]  param_attr_store+0xbc/0x150
      [   20.813160]  module_attr_store+0x40/0x58
      [   20.813364]  sysfs_kf_write+0x8c/0xa8
      [   20.813563]  kernfs_fop_write+0x154/0x290
      [   20.813764]  vfs_write+0xf0/0x278
      [   20.813951]  __arm64_sys_write+0x84/0xf4
      [   20.814400]  el0_svc_common+0xf4/0x1dc
      [   20.814616]  el0_svc_handler+0x98/0xbc
      [   20.814804]  el0_svc+0x8/0xc
      [   20.822005] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      [   20.826913] Mem abort info:
      [   20.827103]   ESR = 0x84000006
      [   20.827352]   Exception class = IABT (current EL), IL = 16 bits
      [   20.827655]   SET = 0, FnV = 0
      [   20.827855]   EA = 0, S1PTW = 0
      [   20.828135] user pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
      [   20.828484] [0000000000000000] pgd=00000000aadee003, pud=00000000aadee003, pmd=0000000000000000
      [   20.829195] Internal error: Oops: 84000006 [#1] SMP
      [   20.829564] Modules linked in:
      [   20.829890] CPU: 1 PID: 104 Comm: sh Tainted: G        W         4.19.0-rc7ajb #8
      [   20.830545] Hardware name: linux,dummy-virt (DT)
      [   20.830829] pstate: 60000085 (nZCv daIf -PAN -UAO)
      [   20.831174] pc :           (null)
      [   20.831457] lr : serial8250_do_set_termios+0x358/0x6f4
      [   20.831727] sp : ffffffc06acff9b0
      [   20.831936] x29: ffffffc06acff9b0 x28: ffffff9008d7c000
      [   20.832267] x27: ffffff900969e16f x26: 0000000000000000
      [   20.832589] x25: ffffff900969dfb0 x24: 0000000000000000
      [   20.832906] x23: ffffffc06acffad0 x22: ffffff900969e160
      [   20.833232] x21: 0000000000000000 x20: ffffffc06acffac8
      [   20.833559] x19: ffffff900969df90 x18: 0000000000000000
      [   20.833878] x17: 0000000000000000 x16: 0000000000000000
      [   20.834491] x15: ffffffffffffffff x14: ffffffff00000000
      [   20.834821] x13: ffffffffffffffff x12: 0000000000000001
      [   20.835143] x11: 0101010101010101 x10: ffffff880d59ff5f
      [   20.835467] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3
      [   20.835790] x7 : 0000000000000000 x6 : ffffff880d59ff5f
      [   20.836111] x5 : c06419717c314100 x4 : 0000000000000007
      [   20.836419] x3 : 0000000000000000 x2 : 0000000000000000
      [   20.836732] x1 : 0000000000000001 x0 : ffffff900969df90
      [   20.837100] Process sh (pid: 104, stack limit = 0x(____ptrval____))
      [   20.837396] Call trace:
      [   20.837566]            (null)
      [   20.837816]  serial8250_set_termios+0x48/0x54
      [   20.838089]  uart_set_options+0x138/0x1bc
      [   20.838570]  uart_poll_init+0x114/0x16c
      [   20.838834]  tty_find_polling_driver+0x158/0x200
      [   20.839119]  configure_kgdboc+0xbc/0x1bc
      [   20.839380]  param_set_kgdboc_var+0xb8/0x150
      [   20.839658]  param_attr_store+0xbc/0x150
      [   20.839920]  module_attr_store+0x40/0x58
      [   20.840183]  sysfs_kf_write+0x8c/0xa8
      [   20.840183]  sysfs_kf_write+0x8c/0xa8
      [   20.840440]  kernfs_fop_write+0x154/0x290
      [   20.840702]  vfs_write+0xf0/0x278
      [   20.840942]  __arm64_sys_write+0x84/0xf4
      [   20.841209]  el0_svc_common+0xf4/0x1dc
      [   20.841471]  el0_svc_handler+0x98/0xbc
      [   20.841713]  el0_svc+0x8/0xc
      [   20.842057] Code: bad PC value
      [   20.842764] ---[ end trace a8835d7de79aaadf ]---
      [   20.843134] Kernel panic - not syncing: Fatal exception
      [   20.843515] SMP: stopping secondary CPUs
      [   20.844289] Kernel Offset: disabled
      [   20.844634] CPU features: 0x0,21806002
      [   20.844857] Memory Limit: none
      [   20.845172] ---[ end Kernel panic - not syncing: Fatal exception ]---
      Signed-off-by: default avatarMiles Chen <miles.chen@mediatek.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7c7378d4
    • Shaohua Li's avatar
      MD: fix invalid stored role for a disk - try2 · b234c2a0
      Shaohua Li authored
      commit 9e753ba9 upstream.
      
      Commit d595567d (MD: fix invalid stored role for a disk) broke linear
      hotadd. Let's only fix the role for disks in raid1/10.
      Based on Guoqing's original patch.
      Reported-by: default avatarkernel test robot <rong.a.chen@intel.com>
      Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
      Cc: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b234c2a0
    • Josef Bacik's avatar
      btrfs: set max_extent_size properly · bddfd0a2
      Josef Bacik authored
      commit ad22cf6e upstream.
      
      We can't use entry->bytes if our entry is a bitmap entry, we need to use
      entry->max_extent_size in that case.  Fix up all the logic to make this
      consistent.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bddfd0a2
    • Filipe Manana's avatar
      Btrfs: fix null pointer dereference on compressed write path error · b38ace86
      Filipe Manana authored
      commit 3527a018 upstream.
      
      At inode.c:compress_file_range(), under the "free_pages_out" label, we can
      end up dereferencing the "pages" pointer when it has a NULL value. This
      case happens when "start" has a value of 0 and we fail to allocate memory
      for the "pages" pointer. When that happens we jump to the "cont" label and
      then enter the "if (start == 0)" branch where we immediately call the
      cow_file_range_inline() function. If that function returns 0 (success
      creating an inline extent) or an error (like -ENOMEM for example) we jump
      to the "free_pages_out" label and then access "pages[i]" leading to a NULL
      pointer dereference, since "nr_pages" has a value greater than zero at
      that point.
      
      Fix this by setting "nr_pages" to 0 when we fail to allocate memory for
      the "pages" pointer.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201119
      Fixes: 771ed689 ("Btrfs: Optimize compressed writeback and reads")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarLiu Bo <bo.liu@linux.alibaba.com>
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b38ace86