1. 05 Oct, 2019 40 commits
    • Michal Hocko's avatar
      memcg, kmem: do not fail __GFP_NOFAIL charges · b4a734a5
      Michal Hocko authored
      commit e55d9d9b upstream.
      
      Thomas has noticed the following NULL ptr dereference when using cgroup
      v1 kmem limit:
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      PGD 0
      P4D 0
      Oops: 0000 [#1] PREEMPT SMP PTI
      CPU: 3 PID: 16923 Comm: gtk-update-icon Not tainted 4.19.51 #42
      Hardware name: Gigabyte Technology Co., Ltd. Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015
      RIP: 0010:create_empty_buffers+0x24/0x100
      Code: cd 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 d4 ba 01 00 00 00 55 53 48 89 fb e8 97 fe ff ff 48 89 c5 48 89 c2 eb 03 48 89 ca <48> 8b 4a 08 4c 09 22 48 85 c9 75 f1 48 89 6a 08 48 8b 43 18 48 8d
      RSP: 0018:ffff927ac1b37bf8 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: fffff2d4429fd740 RCX: 0000000100097149
      RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff9075a99fbe00
      RBP: 0000000000000000 R08: fffff2d440949cc8 R09: 00000000000960c0
      R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
      R13: ffff907601f18360 R14: 0000000000002000 R15: 0000000000001000
      FS:  00007fb55b288bc0(0000) GS:ffff90761f8c0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 000000007aebc002 CR4: 00000000001606e0
      Call Trace:
       create_page_buffers+0x4d/0x60
       __block_write_begin_int+0x8e/0x5a0
       ? ext4_inode_attach_jinode.part.82+0xb0/0xb0
       ? jbd2__journal_start+0xd7/0x1f0
       ext4_da_write_begin+0x112/0x3d0
       generic_perform_write+0xf1/0x1b0
       ? file_update_time+0x70/0x140
       __generic_file_write_iter+0x141/0x1a0
       ext4_file_write_iter+0xef/0x3b0
       __vfs_write+0x17e/0x1e0
       vfs_write+0xa5/0x1a0
       ksys_write+0x57/0xd0
       do_syscall_64+0x55/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Tetsuo then noticed that this is because the __memcg_kmem_charge_memcg
      fails __GFP_NOFAIL charge when the kmem limit is reached.  This is a wrong
      behavior because nofail allocations are not allowed to fail.  Normal
      charge path simply forces the charge even if that means to cross the
      limit.  Kmem accounting should be doing the same.
      
      Link: http://lkml.kernel.org/r/20190906125608.32129-1-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reported-by: default avatarThomas Lindroth <thomas.lindroth@gmail.com>
      Debugged-by: default avatarTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Thomas Lindroth <thomas.lindroth@gmail.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b4a734a5
    • Tetsuo Handa's avatar
      memcg, oom: don't require __GFP_FS when invoking memcg OOM killer · d40b3eaf
      Tetsuo Handa authored
      commit f9c64562 upstream.
      
      Masoud Sharbiani noticed that commit 29ef680a ("memcg, oom: move
      out_of_memory back to the charge path") broke memcg OOM called from
      __xfs_filemap_fault() path.  It turned out that try_charge() is retrying
      forever without making forward progress because mem_cgroup_oom(GFP_NOFS)
      cannot invoke the OOM killer due to commit 3da88fb3 ("mm, oom:
      move GFP_NOFS check to out_of_memory").
      
      Allowing forced charge due to being unable to invoke memcg OOM killer will
      lead to global OOM situation.  Also, just returning -ENOMEM will be risky
      because OOM path is lost and some paths (e.g.  get_user_pages()) will leak
      -ENOMEM.  Therefore, invoking memcg OOM killer (despite GFP_NOFS) will be
      the only choice we can choose for now.
      
      Until 29ef680a, we were able to invoke memcg OOM killer when
      GFP_KERNEL reclaim failed [1].  But since 29ef680a, we need to
      invoke memcg OOM killer when GFP_NOFS reclaim failed [2].  Although in the
      past we did invoke memcg OOM killer for GFP_NOFS [3], we might get
      pre-mature memcg OOM reports due to this patch.
      
      [1]
      
       leaker invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
       CPU: 0 PID: 2746 Comm: leaker Not tainted 4.18.0+ #19
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
       Call Trace:
        dump_stack+0x63/0x88
        dump_header+0x67/0x27a
        ? mem_cgroup_scan_tasks+0x91/0xf0
        oom_kill_process+0x210/0x410
        out_of_memory+0x10a/0x2c0
        mem_cgroup_out_of_memory+0x46/0x80
        mem_cgroup_oom_synchronize+0x2e4/0x310
        ? high_work_func+0x20/0x20
        pagefault_out_of_memory+0x31/0x76
        mm_fault_error+0x55/0x115
        ? handle_mm_fault+0xfd/0x220
        __do_page_fault+0x433/0x4e0
        do_page_fault+0x22/0x30
        ? page_fault+0x8/0x30
        page_fault+0x1e/0x30
       RIP: 0033:0x4009f0
       Code: 03 00 00 00 e8 71 fd ff ff 48 83 f8 ff 49 89 c6 74 74 48 89 c6 bf c0 0c 40 00 31 c0 e8 69 fd ff ff 45 85 ff 7e 21 31 c9 66 90 <41> 0f be 14 0e 01 d3 f7 c1 ff 0f 00 00 75 05 41 c6 04 0e 2a 48 83
       RSP: 002b:00007ffe29ae96f0 EFLAGS: 00010206
       RAX: 000000000000001b RBX: 0000000000000000 RCX: 0000000001ce1000
       RDX: 0000000000000000 RSI: 000000007fffffe5 RDI: 0000000000000000
       RBP: 000000000000000c R08: 0000000000000000 R09: 00007f94be09220d
       R10: 0000000000000002 R11: 0000000000000246 R12: 00000000000186a0
       R13: 0000000000000003 R14: 00007f949d845000 R15: 0000000002800000
       Task in /leaker killed as a result of limit of /leaker
       memory: usage 524288kB, limit 524288kB, failcnt 158965
       memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
       kmem: usage 2016kB, limit 9007199254740988kB, failcnt 0
       Memory cgroup stats for /leaker: cache:844KB rss:521136KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:132KB writeback:0KB inactive_anon:0KB active_anon:521224KB inactive_file:1012KB active_file:8KB unevictable:0KB
       Memory cgroup out of memory: Kill process 2746 (leaker) score 998 or sacrifice child
       Killed process 2746 (leaker) total-vm:536704kB, anon-rss:521176kB, file-rss:1208kB, shmem-rss:0kB
       oom_reaper: reaped process 2746 (leaker), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
      
      [2]
      
       leaker invoked oom-killer: gfp_mask=0x600040(GFP_NOFS), nodemask=(null), order=0, oom_score_adj=0
       CPU: 1 PID: 2746 Comm: leaker Not tainted 4.18.0+ #20
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
       Call Trace:
        dump_stack+0x63/0x88
        dump_header+0x67/0x27a
        ? mem_cgroup_scan_tasks+0x91/0xf0
        oom_kill_process+0x210/0x410
        out_of_memory+0x109/0x2d0
        mem_cgroup_out_of_memory+0x46/0x80
        try_charge+0x58d/0x650
        ? __radix_tree_replace+0x81/0x100
        mem_cgroup_try_charge+0x7a/0x100
        __add_to_page_cache_locked+0x92/0x180
        add_to_page_cache_lru+0x4d/0xf0
        iomap_readpages_actor+0xde/0x1b0
        ? iomap_zero_range_actor+0x1d0/0x1d0
        iomap_apply+0xaf/0x130
        iomap_readpages+0x9f/0x150
        ? iomap_zero_range_actor+0x1d0/0x1d0
        xfs_vm_readpages+0x18/0x20 [xfs]
        read_pages+0x60/0x140
        __do_page_cache_readahead+0x193/0x1b0
        ondemand_readahead+0x16d/0x2c0
        page_cache_async_readahead+0x9a/0xd0
        filemap_fault+0x403/0x620
        ? alloc_set_pte+0x12c/0x540
        ? _cond_resched+0x14/0x30
        __xfs_filemap_fault+0x66/0x180 [xfs]
        xfs_filemap_fault+0x27/0x30 [xfs]
        __do_fault+0x19/0x40
        __handle_mm_fault+0x8e8/0xb60
        handle_mm_fault+0xfd/0x220
        __do_page_fault+0x238/0x4e0
        do_page_fault+0x22/0x30
        ? page_fault+0x8/0x30
        page_fault+0x1e/0x30
       RIP: 0033:0x4009f0
       Code: 03 00 00 00 e8 71 fd ff ff 48 83 f8 ff 49 89 c6 74 74 48 89 c6 bf c0 0c 40 00 31 c0 e8 69 fd ff ff 45 85 ff 7e 21 31 c9 66 90 <41> 0f be 14 0e 01 d3 f7 c1 ff 0f 00 00 75 05 41 c6 04 0e 2a 48 83
       RSP: 002b:00007ffda45c9290 EFLAGS: 00010206
       RAX: 000000000000001b RBX: 0000000000000000 RCX: 0000000001a1e000
       RDX: 0000000000000000 RSI: 000000007fffffe5 RDI: 0000000000000000
       RBP: 000000000000000c R08: 0000000000000000 R09: 00007f6d061ff20d
       R10: 0000000000000002 R11: 0000000000000246 R12: 00000000000186a0
       R13: 0000000000000003 R14: 00007f6ce59b2000 R15: 0000000002800000
       Task in /leaker killed as a result of limit of /leaker
       memory: usage 524288kB, limit 524288kB, failcnt 7221
       memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
       kmem: usage 1944kB, limit 9007199254740988kB, failcnt 0
       Memory cgroup stats for /leaker: cache:3632KB rss:518232KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:518408KB inactive_file:3908KB active_file:12KB unevictable:0KB
       Memory cgroup out of memory: Kill process 2746 (leaker) score 992 or sacrifice child
       Killed process 2746 (leaker) total-vm:536704kB, anon-rss:518264kB, file-rss:1188kB, shmem-rss:0kB
       oom_reaper: reaped process 2746 (leaker), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
      
      [3]
      
       leaker invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=0
       leaker cpuset=/ mems_allowed=0
       CPU: 1 PID: 3206 Comm: leaker Not tainted 3.10.0-957.27.2.el7.x86_64 #1
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
       Call Trace:
        [<ffffffffaf364147>] dump_stack+0x19/0x1b
        [<ffffffffaf35eb6a>] dump_header+0x90/0x229
        [<ffffffffaedbb456>] ? find_lock_task_mm+0x56/0xc0
        [<ffffffffaee32a38>] ? try_get_mem_cgroup_from_mm+0x28/0x60
        [<ffffffffaedbb904>] oom_kill_process+0x254/0x3d0
        [<ffffffffaee36c36>] mem_cgroup_oom_synchronize+0x546/0x570
        [<ffffffffaee360b0>] ? mem_cgroup_charge_common+0xc0/0xc0
        [<ffffffffaedbc194>] pagefault_out_of_memory+0x14/0x90
        [<ffffffffaf35d072>] mm_fault_error+0x6a/0x157
        [<ffffffffaf3717c8>] __do_page_fault+0x3c8/0x4f0
        [<ffffffffaf371925>] do_page_fault+0x35/0x90
        [<ffffffffaf36d768>] page_fault+0x28/0x30
       Task in /leaker killed as a result of limit of /leaker
       memory: usage 524288kB, limit 524288kB, failcnt 20628
       memory+swap: usage 524288kB, limit 9007199254740988kB, failcnt 0
       kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
       Memory cgroup stats for /leaker: cache:840KB rss:523448KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:523448KB inactive_file:464KB active_file:376KB unevictable:0KB
       Memory cgroup out of memory: Kill process 3206 (leaker) score 970 or sacrifice child
       Killed process 3206 (leaker) total-vm:536692kB, anon-rss:523304kB, file-rss:412kB, shmem-rss:0kB
      
      Bisected by Masoud Sharbiani.
      
      Link: http://lkml.kernel.org/r/cbe54ed1-b6ba-a056-8899-2dc42526371d@i-love.sakura.ne.jp
      Fixes: 3da88fb3 ("mm, oom: move GFP_NOFS check to out_of_memory") [necessary after 29ef680a]
      Signed-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: default avatarMasoud Sharbiani <msharbiani@apple.com>
      Tested-by: default avatarMasoud Sharbiani <msharbiani@apple.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>	[4.19+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d40b3eaf
    • Bob Peterson's avatar
      gfs2: clear buf_in_tr when ending a transaction in sweep_bh_for_rgrps · e0c1e6e5
      Bob Peterson authored
      commit f0b444b3 upstream.
      
      In function sweep_bh_for_rgrps, which is a helper for punch_hole,
      it uses variable buf_in_tr to keep track of when it needs to commit
      pending block frees on a partial delete that overflows the
      transaction created for the delete. The problem is that the
      variable was initialized at the start of function sweep_bh_for_rgrps
      but it was never cleared, even when starting a new transaction.
      
      This patch reinitializes the variable when the transaction is
      ended, so the next transaction starts out with it cleared.
      
      Fixes: d552a2b9 ("GFS2: Non-recursive delete")
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: default avatarBob Peterson <rpeterso@redhat.com>
      Signed-off-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0c1e6e5
    • Hans de Goede's avatar
      efifb: BGRT: Improve efifb_bgrt_sanity_check · 3620b06b
      Hans de Goede authored
      commit 51677dfc upstream.
      
      For various reasons, at least with x86 EFI firmwares, the xoffset and
      yoffset in the BGRT info are not always reliable.
      
      Extensive testing has shown that when the info is correct, the
      BGRT image is always exactly centered horizontally (the yoffset variable
      is more variable and not always predictable).
      
      This commit simplifies / improves the bgrt_sanity_check to simply
      check that the BGRT image is exactly centered horizontally and skips
      (re)drawing it when it is not.
      
      This fixes the BGRT image sometimes being drawn in the wrong place.
      
      Cc: stable@vger.kernel.org
      Fixes: 88fe4ceb ("efifb: BGRT: Do not copy the boot graphics for non native resolutions")
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Cc: Peter Jones <pjones@redhat.com>,
      Signed-off-by: default avatarBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190721131918.10115-1-hdegoede@redhat.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3620b06b
    • Mark Brown's avatar
      regulator: Defer init completion for a while after late_initcall · c4f65c2f
      Mark Brown authored
      commit 55576cf1 upstream.
      
      The kernel has no way of knowing when we have finished instantiating
      drivers, between deferred probe and systems that build key drivers as
      modules we might be doing this long after userspace has booted. This has
      always been a bit of an issue with regulator_init_complete since it can
      power off hardware that's not had it's driver loaded which can result in
      user visible effects, the main case is powering off displays. Practically
      speaking it's not been an issue in real systems since most systems that
      use the regulator API are embedded and build in key drivers anyway but
      with Arm laptops coming on the market it's becoming more of an issue so
      let's do something about it.
      
      In the absence of any better idea just defer the powering off for 30s
      after late_initcall(), this is obviously a hack but it should mask the
      issue for now and it's no more arbitrary than late_initcall() itself.
      Ideally we'd have some heuristics to detect if we're on an affected
      system and tune or skip the delay appropriately, and there may be some
      need for a command line option to be added.
      
      Link: https://lore.kernel.org/r/20190904124250.25844-1-broonie@kernel.orgSigned-off-by: default avatarMark Brown <broonie@kernel.org>
      Tested-by: default avatarLee Jones <lee.jones@linaro.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c4f65c2f
    • Thadeu Lima de Souza Cascardo's avatar
      alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP · 3784576f
      Thadeu Lima de Souza Cascardo authored
      commit f18ddc13 upstream.
      
      ENOTSUPP is not supposed to be returned to userspace. This was found on an
      OpenPower machine, where the RTC does not support set_alarm.
      
      On that system, a clock_nanosleep(CLOCK_REALTIME_ALARM, ...) results in
      "524 Unknown error 524"
      
      Replace it with EOPNOTSUPP which results in the expected "95 Operation not
      supported" error.
      
      Fixes: 1c6b39ad (alarmtimers: Return -ENOTSUPP if no RTC device is present)
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190903171802.28314-1-cascardo@canonical.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3784576f
    • Shawn Lin's avatar
      arm64: dts: rockchip: limit clock rate of MMC controllers for RK3328 · 174bbcc5
      Shawn Lin authored
      commit 03e61929 upstream.
      
      150MHz is a fundamental limitation of RK3328 Soc, w/o this limitation,
      eMMC, for instance, will run into 200MHz clock rate in HS200 mode, which
      makes the RK3328 boards not always boot properly. By adding it in
      rk3328.dtsi would also obviate the worry of missing it when adding new
      boards.
      
      Fixes: 52e02d37 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
      Cc: stable@vger.kernel.org
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Liang Chen <cl@rock-chips.com>
      Signed-off-by: default avatarShawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      174bbcc5
    • Will Deacon's avatar
      arm64: tlb: Ensure we execute an ISB following walk cache invalidation · 8cfe3b8a
      Will Deacon authored
      commit 51696d34 upstream.
      
      05f2d2f8 ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
      added a new TLB invalidation helper which is used when freeing
      intermediate levels of page table used for kernel mappings, but is
      missing the required ISB instruction after completion of the TLBI
      instruction.
      
      Add the missing barrier.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 05f2d2f8 ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
      Reviewed-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8cfe3b8a
    • Will Deacon's avatar
      Revert "arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}" · fc7d6bfd
      Will Deacon authored
      commit d0b7a302 upstream.
      
      This reverts commit 24fe1b0e.
      
      Commit 24fe1b0e ("arm64: Remove unnecessary ISBs from
      set_{pte,pmd,pud}") removed ISB instructions immediately following updates
      to the page table, on the grounds that they are not required by the
      architecture and a DSB alone is sufficient to ensure that subsequent data
      accesses use the new translation:
      
        DDI0487E_a, B2-128:
      
        | ... no instruction that appears in program order after the DSB
        | instruction can alter any state of the system or perform any part of
        | its functionality until the DSB completes other than:
        |
        | * Being fetched from memory and decoded
        | * Reading the general-purpose, SIMD and floating-point,
        |   Special-purpose, or System registers that are directly or indirectly
        |   read without causing side-effects.
      
      However, the same document also states the following:
      
        DDI0487E_a, B2-125:
      
        | DMB and DSB instructions affect reads and writes to the memory system
        | generated by Load/Store instructions and data or unified cache
        | maintenance instructions being executed by the PE. Instruction fetches
        | or accesses caused by a hardware translation table access are not
        | explicit accesses.
      
      which appears to claim that the DSB alone is insufficient.  Unfortunately,
      some CPU designers have followed the second clause above, whereas in Linux
      we've been relying on the first. This means that our mapping sequence:
      
      	MOV	X0, <valid pte>
      	STR	X0, [Xptep]	// Store new PTE to page table
      	DSB	ISHST
      	LDR	X1, [X2]	// Translates using the new PTE
      
      can actually raise a translation fault on the load instruction because the
      translation can be performed speculatively before the page table update and
      then marked as "faulting" by the CPU. For user PTEs, this is ok because we
      can handle the spurious fault, but for kernel PTEs and intermediate table
      entries this results in a panic().
      
      Revert the offending commit to reintroduce the missing barriers.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 24fe1b0e ("arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}")
      Reviewed-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc7d6bfd
    • Luis Araneda's avatar
      ARM: zynq: Use memcpy_toio instead of memcpy on smp bring-up · 881edc16
      Luis Araneda authored
      commit b7005d4e upstream.
      
      This fixes a kernel panic on memcpy when
      FORTIFY_SOURCE is enabled.
      
      The initial smp implementation on commit aa7eb2bb
      ("arm: zynq: Add smp support")
      used memcpy, which worked fine until commit ee333554
      ("ARM: 8749/1: Kconfig: Add ARCH_HAS_FORTIFY_SOURCE")
      enabled overflow checks at runtime, producing a read
      overflow panic.
      
      The computed size of memcpy args are:
      - p_size (dst): 4294967295 = (size_t) -1
      - q_size (src): 1
      - size (len): 8
      
      Additionally, the memory is marked as __iomem, so one of
      the memcpy_* functions should be used for read/write.
      
      Fixes: aa7eb2bb ("arm: zynq: Add smp support")
      Signed-off-by: default avatarLuis Araneda <luaraneda@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMichal Simek <michal.simek@xilinx.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      881edc16
    • Lihua Yao's avatar
      ARM: samsung: Fix system restart on S3C6410 · 22092794
      Lihua Yao authored
      commit 16986074 upstream.
      
      S3C6410 system restart is triggered by watchdog reset.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 9f55342c ("ARM: dts: s3c64xx: Fix infinite interrupt in soft mode")
      Signed-off-by: default avatarLihua Yao <ylhuajnu@outlook.com>
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      22092794
    • Amadeusz Sławiński's avatar
      ASoC: Intel: Fix use of potentially uninitialized variable · ad884155
      Amadeusz Sławiński authored
      commit 810f3b86 upstream.
      
      If ipc->ops.reply_msg_match is NULL, we may end up using uninitialized
      mask value.
      
      reported by smatch:
      sound/soc/intel/common/sst-ipc.c:266 sst_ipc_reply_find_msg() error: uninitialized symbol 'mask'.
      Signed-off-by: default avatarAmadeusz Sławiński <amadeuszx.slawinski@intel.com>
      Link: https://lore.kernel.org/r/20190827141712.21015-3-amadeuszx.slawinski@linux.intel.comReviewed-by: default avatarPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad884155
    • Amadeusz Sławiński's avatar
      ASoC: Intel: Skylake: Use correct function to access iomem space · 7bdab364
      Amadeusz Sławiński authored
      commit 17d29ff9 upstream.
      
      For copying from __iomem, we should use __ioread32_copy.
      
      reported by sparse:
      sound/soc/intel/skylake/skl-debug.c:437:34: warning: incorrect type in argument 1 (different address spaces)
      sound/soc/intel/skylake/skl-debug.c:437:34:    expected void [noderef] <asn:2> *to
      sound/soc/intel/skylake/skl-debug.c:437:34:    got unsigned char *
      sound/soc/intel/skylake/skl-debug.c:437:51: warning: incorrect type in argument 2 (different address spaces)
      sound/soc/intel/skylake/skl-debug.c:437:51:    expected void const *from
      sound/soc/intel/skylake/skl-debug.c:437:51:    got void [noderef] <asn:2> *[assigned] fw_reg_addr
      Signed-off-by: default avatarAmadeusz Sławiński <amadeuszx.slawinski@intel.com>
      Link: https://lore.kernel.org/r/20190827141712.21015-2-amadeuszx.slawinski@linux.intel.comReviewed-by: default avatarPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7bdab364
    • Amadeusz Sławiński's avatar
      ASoC: Intel: NHLT: Fix debug print format · 3c54f463
      Amadeusz Sławiński authored
      commit 855a06da upstream.
      
      oem_table_id is 8 chars long, so we need to limit it, otherwise it
      may print some unprintable characters into dmesg.
      Signed-off-by: default avatarAmadeusz Sławiński <amadeuszx.slawinski@intel.com>
      Link: https://lore.kernel.org/r/20190827141712.21015-7-amadeuszx.slawinski@linux.intel.comReviewed-by: default avatarPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3c54f463
    • Kees Cook's avatar
      binfmt_elf: Do not move brk for INTERP-less ET_EXEC · 29ecf8ca
      Kees Cook authored
      commit 7be3cb01 upstream.
      
      When brk was moved for binaries without an interpreter, it should have
      been limited to ET_DYN only. In other words, the special case was an
      ET_DYN that lacks an INTERP, not just an executable that lacks INTERP.
      The bug manifested for giant static executables, where the brk would end
      up in the middle of the text area on 32-bit architectures.
      Reported-and-tested-by: default avatarRichard Kojedzinszky <richard@kojedz.in>
      Fixes: bbdc6076 ("binfmt_elf: move brk out of mmap when doing direct loader exec")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      29ecf8ca
    • Arnd Bergmann's avatar
      media: don't drop front-end reference count for ->detach · 02ef5c29
      Arnd Bergmann authored
      commit 14e3cdbb upstream.
      
      A bugfix introduce a link failure in configurations without CONFIG_MODULES:
      
      In file included from drivers/media/usb/dvb-usb/pctv452e.c:20:0:
      drivers/media/usb/dvb-usb/pctv452e.c: In function 'pctv452e_frontend_attach':
      drivers/media/dvb-frontends/stb0899_drv.h:151:36: error: weak declaration of 'stb0899_attach' being applied to a already existing, static definition
      
      The problem is that the !IS_REACHABLE() declaration of stb0899_attach()
      is a 'static inline' definition that clashes with the weak definition.
      
      I further observed that the bugfix was only done for one of the five users
      of stb0899_attach(), the other four still have the problem.  This reverts
      the bugfix and instead addresses the problem by not dropping the reference
      count when calling '->detach()', instead we call this function directly
      in dvb_frontend_put() before dropping the kref on the front-end.
      
      I first submitted this in early 2018, and after some discussion it
      was apparently discarded.  While there is a long-term plan in place,
      that plan is obviously not nearing completion yet, and the current
      kernel is still broken unless this patch is applied.
      
      Link: https://patchwork.kernel.org/patch/10140175/
      Link: https://patchwork.linuxtv.org/patch/54831/
      
      Cc: Max Kellermann <max.kellermann@gmail.com>
      Cc: Wolfgang Rohdewald <wolfgang@rohdewald.de>
      Cc: stable@vger.kernel.org
      Fixes: f686c143 ("[media] stb0899: move code to "detach" callback")
      Fixes: 6cdeaed3 ("media: dvb_usb_pctv452e: module refcount changes were unbalanced")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarSean Young <sean@mess.org>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      02ef5c29
    • Hans de Goede's avatar
      media: sn9c20x: Add MSI MS-1039 laptop to flip_dmi_table · 589ca8ec
      Hans de Goede authored
      commit 7e0bb582 upstream.
      
      Like a bunch of other MSI laptops the MS-1039 uses a 0c45:627b
      SN9C201 + OV7660 webcam which is mounted upside down.
      
      Add it to the sn9c20x flip_dmi_table to deal with this.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarRui Salvaterra <rsalvaterra@gmail.com>
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      589ca8ec
    • Sean Christopherson's avatar
      KVM: x86: Manually calculate reserved bits when loading PDPTRS · 496cf984
      Sean Christopherson authored
      commit 16cfacc8 upstream.
      
      Manually generate the PDPTR reserved bit mask when explicitly loading
      PDPTRs.  The reserved bits that are being tracked by the MMU reflect the
      current paging mode, which is unlikely to be PAE paging in the vast
      majority of flows that use load_pdptrs(), e.g. CR0 and CR4 emulation,
      __set_sregs(), etc...  This can cause KVM to incorrectly signal a bad
      PDPTR, or more likely, miss a reserved bit check and subsequently fail
      a VM-Enter due to a bad VMCS.GUEST_PDPTR.
      
      Add a one off helper to generate the reserved bits instead of sharing
      code across the MMU's calculations and the PDPTR emulation.  The PDPTR
      reserved bits are basically set in stone, and pushing a helper into
      the MMU's calculation adds unnecessary complexity without improving
      readability.
      
      Oppurtunistically fix/update the comment for load_pdptrs().
      
      Note, the buggy commit also introduced a deliberate functional change,
      "Also remove bit 5-6 from rsvd_bits_mask per latest SDM.", which was
      effectively (and correctly) reverted by commit cd9ae5fe ("KVM: x86:
      Fix page-tables reserved bits").  A bit of SDM archaeology shows that
      the SDM from late 2008 had a bug (likely a copy+paste error) where it
      listed bits 6:5 as AVL and A for PDPTEs used for 4k entries but reserved
      for 2mb entries.  I.e. the SDM contradicted itself, and bits 6:5 are and
      always have been reserved.
      
      Fixes: 20c466b5 ("KVM: Use rsvd_bits_mask in load_pdptrs()")
      Cc: stable@vger.kernel.org
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Reported-by: default avatarDoug Reiland <doug.reiland@intel.com>
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: default avatarPeter Xu <peterx@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      496cf984
    • Jan Dakinevich's avatar
      KVM: x86: set ctxt->have_exception in x86_decode_insn() · 933e3e2b
      Jan Dakinevich authored
      commit c8848cee upstream.
      
      x86_emulate_instruction() takes into account ctxt->have_exception flag
      during instruction decoding, but in practice this flag is never set in
      x86_decode_insn().
      
      Fixes: 6ea6e843 ("KVM: x86: inject exceptions produced by x86_decode_insn")
      Cc: stable@vger.kernel.org
      Cc: Denis Lunev <den@virtuozzo.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Signed-off-by: default avatarJan Dakinevich <jan.dakinevich@virtuozzo.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      933e3e2b
    • Jan Dakinevich's avatar
      KVM: x86: always stop emulation on page fault · 9723e445
      Jan Dakinevich authored
      commit 8530a79c upstream.
      
      inject_emulated_exception() returns true if and only if nested page
      fault happens. However, page fault can come from guest page tables
      walk, either nested or not nested. In both cases we should stop an
      attempt to read under RIP and give guest to step over its own page
      fault handler.
      
      This is also visible when an emulated instruction causes a #GP fault
      and the VMware backdoor is enabled.  To handle the VMware backdoor,
      KVM intercepts #GP faults; with only the next patch applied,
      x86_emulate_instruction() injects a #GP but returns EMULATE_FAIL
      instead of EMULATE_DONE.   EMULATE_FAIL causes handle_exception_nmi()
      (or gp_interception() for SVM) to re-inject the original #GP because it
      thinks emulation failed due to a non-VMware opcode.  This patch prevents
      the issue as x86_emulate_instruction() will return EMULATE_DONE after
      injecting the #GP.
      
      Fixes: 6ea6e843 ("KVM: x86: inject exceptions produced by x86_decode_insn")
      Cc: stable@vger.kernel.org
      Cc: Denis Lunev <den@virtuozzo.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Signed-off-by: default avatarJan Dakinevich <jan.dakinevich@virtuozzo.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9723e445
    • Helge Deller's avatar
      parisc: Disable HP HSC-PCI Cards to prevent kernel crash · 8225db4a
      Helge Deller authored
      commit 5fa16591 upstream.
      
      The HP Dino PCI controller chip can be used in two variants: as on-board
      controller (e.g. in B160L), or on an Add-On card ("Card-Mode") to bridge
      PCI components to systems without a PCI bus, e.g. to a HSC/GSC bus.  One
      such Add-On card is the HP HSC-PCI Card which has one or more DEC Tulip
      PCI NIC chips connected to the on-card Dino PCI controller.
      
      Dino in Card-Mode has a big disadvantage: All PCI memory accesses need
      to go through the DINO_MEM_DATA register, so Linux drivers will not be
      able to use the ioremap() function. Without ioremap() many drivers will
      not work, one example is the tulip driver which then simply crashes the
      kernel if it tries to access the ports on the HP HSC card.
      
      This patch disables the HP HSC card if it finds one, and as such
      fixes the kernel crash on a HP D350/2 machine.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Noticed-by: default avatarPhil Scarr <phil.scarr@pm.me>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8225db4a
    • Vasily Averin's avatar
      fuse: fix missing unlock_page in fuse_writepage() · ad411629
      Vasily Averin authored
      commit d5880c7a upstream.
      
      unlock_page() was missing in case of an already in-flight write against the
      same page.
      Signed-off-by: default avatarVasily Averin <vvs@virtuozzo.com>
      Fixes: ff17be08 ("fuse: writepage: skip already in flight")
      Cc: <stable@vger.kernel.org> # v3.13
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad411629
    • Madhavan Srinivasan's avatar
      powerpc/imc: Dont create debugfs files for cpu-less nodes · ecfe4b5f
      Madhavan Srinivasan authored
      commit 41ba17f2 upstream.
      
      Commit <684d9840> ('powerpc/powernv: Add debugfs interface for
      imc-mode and imc') added debugfs interface for the nest imc pmu
      devices to support changing of different ucode modes. Primarily adding
      this capability for debug. But when doing so, the code did not
      consider the case of cpu-less nodes. So when reading the _cmd_ or
      _mode_ file of a cpu-less node will create this crash.
      
        Faulting instruction address: 0xc0000000000d0d58
        Oops: Kernel access of bad area, sig: 11 [#1]
        ...
        CPU: 67 PID: 5301 Comm: cat Not tainted 5.2.0-rc6-next-20190627+ #19
        NIP:  c0000000000d0d58 LR: c00000000049aa18 CTR:c0000000000d0d50
        REGS: c00020194548f9e0 TRAP: 0300   Not tainted  (5.2.0-rc6-next-20190627+)
        MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR:28022822  XER: 00000000
        CFAR: c00000000049aa14 DAR: 000000000003fc08 DSISR:40000000 IRQMASK: 0
        ...
        NIP imc_mem_get+0x8/0x20
        LR  simple_attr_read+0x118/0x170
        Call Trace:
          simple_attr_read+0x70/0x170 (unreliable)
          debugfs_attr_read+0x6c/0xb0
          __vfs_read+0x3c/0x70
           vfs_read+0xbc/0x1a0
          ksys_read+0x7c/0x140
          system_call+0x5c/0x70
      
      Patch fixes the issue with a more robust check for vbase to NULL.
      
      Before patch, ls output for the debugfs imc directory
      
        # ls /sys/kernel/debug/powerpc/imc/
        imc_cmd_0    imc_cmd_251  imc_cmd_253  imc_cmd_255  imc_mode_0    imc_mode_251  imc_mode_253  imc_mode_255
        imc_cmd_250  imc_cmd_252  imc_cmd_254  imc_cmd_8    imc_mode_250  imc_mode_252  imc_mode_254  imc_mode_8
      
      After patch, ls output for the debugfs imc directory
      
        # ls /sys/kernel/debug/powerpc/imc/
        imc_cmd_0  imc_cmd_8  imc_mode_0  imc_mode_8
      
      Actual bug here is that, we have two loops with potentially different
      loop counts. That is, in imc_get_mem_addr_nest(), loop count is
      obtained from the dt entries. But in case of export_imc_mode_and_cmd(),
      loop was based on for_each_nid() count. Patch fixes the loop count in
      latter based on the struct mem_info. Ideally it would be better to
      have array size in struct imc_pmu.
      
      Fixes: 684d9840 ('powerpc/powernv: Add debugfs interface for imc-mode and imc')
      Reported-by: default avatarQian Cai <cai@lca.pw>
      Suggested-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190827101635.6942-1-maddy@linux.vnet.ibm.com
      Cc: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ecfe4b5f
    • Ming Lei's avatar
      scsi: implement .cleanup_rq callback · e94443fc
      Ming Lei authored
      [ Upstream commit b7e9e1fb ]
      
      Implement .cleanup_rq() callback for freeing driver private part
      of the request. Then we can avoid to leak this part if the request isn't
      completed by SCSI, and freed by blk-mq or upper layer(such as dm-rq) finally.
      
      Cc: Ewan D. Milne <emilne@redhat.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: <stable@vger.kernel.org>
      Fixes: 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e94443fc
    • Ming Lei's avatar
      blk-mq: add callback of .cleanup_rq · 4ec3ca27
      Ming Lei authored
      [ Upstream commit 226b4fc7 ]
      
      SCSI maintains its own driver private data hooked off of each SCSI
      request, and the pridate data won't be freed after scsi_queue_rq()
      returns BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE. An upper layer driver
      (e.g. dm-rq) may need to retry these SCSI requests, before SCSI has
      fully dispatched them, due to a lower level SCSI driver's resource
      limitation identified in scsi_queue_rq(). Currently SCSI's per-request
      private data is leaked when the upper layer driver (dm-rq) frees and
      then retries these requests in response to BLK_STS_RESOURCE or
      BLK_STS_DEV_RESOURCE returns from scsi_queue_rq().
      
      This usecase is so specialized that it doesn't warrant training an
      existing blk-mq interface (e.g. blk_mq_free_request) to allow SCSI to
      account for freeing its driver private data -- doing so would add an
      extra branch for handling a special case that all other consumers of
      SCSI (and blk-mq) won't ever need to worry about.
      
      So the most pragmatic way forward is to delegate freeing SCSI driver
      private data to the upper layer driver (dm-rq).  Do so by adding
      new .cleanup_rq callback and calling a new blk_mq_cleanup_rq() method
      from dm-rq.  A following commit will implement the .cleanup_rq() hook
      in scsi_mq_ops.
      
      Cc: Ewan D. Milne <emilne@redhat.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: <stable@vger.kernel.org>
      Fixes: 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4ec3ca27
    • Jan-Marek Glogowski's avatar
      ALSA: hda/realtek - PCI quirk for Medion E4254 · 4848fb93
      Jan-Marek Glogowski authored
      [ Upstream commit bd9c10bc ]
      
      The laptop has a combined jack to attach headsets on the right.
      The BIOS encodes them as two different colored jacks at the front,
      but otherwise it seems to be configured ok. But any adaption of
      the pins config on its own doesn't fix the jack detection to work
      in Linux. Still Windows works correct.
      
      This is somehow fixed by chaining ALC256_FIXUP_ASUS_HEADSET_MODE,
      which seems to register the microphone jack as a headset part and
      also results in fixing jack sensing, visible in dmesg as:
      
      -snd_hda_codec_realtek hdaudioC0D0:      Mic=0x19
      +snd_hda_codec_realtek hdaudioC0D0:      Headset Mic=0x19
      
      [ Actually the essential change is the location of the jack; the
        driver created "Front Mic Jack" without the matching volume / mute
        control element due to its jack location, which confused PA.
        -- tiwai ]
      Signed-off-by: default avatarJan-Marek Glogowski <glogow@fbihome.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/8f4f9b20-0aeb-f8f1-c02f-fd53c09679f1@fbihome.deSigned-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4848fb93
    • Yan, Zheng's avatar
      ceph: use ceph_evict_inode to cleanup inode's resource · e9bcaf82
      Yan, Zheng authored
      [ Upstream commit 87bc5b89 ]
      
      remove_session_caps() relies on __wait_on_freeing_inode(), to wait for
      freeing inode to remove its caps. But VFS wakes freeing inode waiters
      before calling destroy_inode().
      
      [ jlayton: mainline moved to ->free_inode before the original patch was
      	   merged. This backport reinstates ceph_destroy_inode and just
      	   has it do the call_rcu call. ]
      
      Cc: stable@vger.kernel.org
      Link: https://tracker.ceph.com/issues/40102Signed-off-by: default avatar"Yan, Zheng" <zyan@redhat.com>
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e9bcaf82
    • Sasha Levin's avatar
      Revert "ceph: use ceph_evict_inode to cleanup inode's resource" · 72f0fff3
      Sasha Levin authored
      This reverts commit 81281039.
      
      The backport was incorrect and was causing kernel panics. Revert and
      re-apply a correct backport from Jeff Layton.
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      72f0fff3
    • Joonwon Kang's avatar
      randstruct: Check member structs in is_pure_ops_struct() · 98dc6d95
      Joonwon Kang authored
      commit 60f2c82e upstream.
      
      While no uses in the kernel triggered this case, it was possible to have
      a false negative where a struct contains other structs which contain only
      function pointers because of unreachable code in is_pure_ops_struct().
      Signed-off-by: default avatarJoonwon Kang <kjw1627@gmail.com>
      Link: https://lore.kernel.org/r/20190727155841.GA13586@host
      Fixes: 313dd1b6 ("gcc-plugins: Add the randstruct plugin")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      98dc6d95
    • Ira Weiny's avatar
      IB/hfi1: Define variables as unsigned long to fix KASAN warning · ad6819cd
      Ira Weiny authored
      commit f8659d68 upstream.
      
      Define the working variables to be unsigned long to be compatible with
      for_each_set_bit and change types as needed.
      
      While we are at it remove unused variables from a couple of functions.
      
      This was found because of the following KASAN warning:
       ==================================================================
         BUG: KASAN: stack-out-of-bounds in find_first_bit+0x19/0x70
         Read of size 8 at addr ffff888362d778d0 by task kworker/u308:2/1889
      
         CPU: 21 PID: 1889 Comm: kworker/u308:2 Tainted: G W         5.3.0-rc2-mm1+ #2
         Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.02.04.0003.102320141138 10/23/2014
         Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
         Call Trace:
          dump_stack+0x9a/0xf0
          ? find_first_bit+0x19/0x70
          print_address_description+0x6c/0x332
          ? find_first_bit+0x19/0x70
          ? find_first_bit+0x19/0x70
          __kasan_report.cold.6+0x1a/0x3b
          ? find_first_bit+0x19/0x70
          kasan_report+0xe/0x12
          find_first_bit+0x19/0x70
          pma_get_opa_portstatus+0x5cc/0xa80 [hfi1]
          ? ret_from_fork+0x3a/0x50
          ? pma_get_opa_port_ectrs+0x200/0x200 [hfi1]
          ? stack_trace_consume_entry+0x80/0x80
          hfi1_process_mad+0x39b/0x26c0 [hfi1]
          ? __lock_acquire+0x65e/0x21b0
          ? clear_linkup_counters+0xb0/0xb0 [hfi1]
          ? check_chain_key+0x1d7/0x2e0
          ? lock_downgrade+0x3a0/0x3a0
          ? match_held_lock+0x2e/0x250
          ib_mad_recv_done+0x698/0x15e0 [ib_core]
          ? clear_linkup_counters+0xb0/0xb0 [hfi1]
          ? ib_mad_send_done+0xc80/0xc80 [ib_core]
          ? mark_held_locks+0x79/0xa0
          ? _raw_spin_unlock_irqrestore+0x44/0x60
          ? rvt_poll_cq+0x1e1/0x340 [rdmavt]
          __ib_process_cq+0x97/0x100 [ib_core]
          ib_cq_poll_work+0x31/0xb0 [ib_core]
          process_one_work+0x4ee/0xa00
          ? pwq_dec_nr_in_flight+0x110/0x110
          ? do_raw_spin_lock+0x113/0x1d0
          worker_thread+0x57/0x5a0
          ? process_one_work+0xa00/0xa00
          kthread+0x1bb/0x1e0
          ? kthread_create_on_node+0xc0/0xc0
          ret_from_fork+0x3a/0x50
      
         The buggy address belongs to the page:
         page:ffffea000d8b5dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
         flags: 0x17ffffc0000000()
         raw: 0017ffffc0000000 0000000000000000 ffffea000d8b5dc8 0000000000000000
         raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
         page dumped because: kasan: bad access detected
      
         addr ffff888362d778d0 is located in stack of task kworker/u308:2/1889 at offset 32 in frame:
          pma_get_opa_portstatus+0x0/0xa80 [hfi1]
      
         this frame has 1 object:
          [32, 36) 'vl_select_mask'
      
         Memory state around the buggy address:
          ffff888362d77780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          ffff888362d77800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         >ffff888362d77880: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 00 00
                                                          ^
          ffff888362d77900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          ffff888362d77980: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2
      
       ==================================================================
      
      Cc: <stable@vger.kernel.org>
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Link: https://lore.kernel.org/r/20190911113053.126040.47327.stgit@awfm-01.aw.intel.comReviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarIra Weiny <ira.weiny@intel.com>
      Signed-off-by: default avatarKaike Wan <kaike.wan@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad6819cd
    • Danit Goldberg's avatar
      IB/mlx5: Free mpi in mp_slave mode · a924850c
      Danit Goldberg authored
      commit 5d44adeb upstream.
      
      ib_add_slave_port() allocates a multiport struct but never frees it.
      Don't leak memory, free the allocated mpi struct during driver unload.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 32f69e4b ("{net, IB}/mlx5: Manage port association for multiport RoCE")
      Link: https://lore.kernel.org/r/20190916064818.19823-3-leon@kernel.orgSigned-off-by: default avatarDanit Goldberg <danitg@mellanox.com>
      Reviewed-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a924850c
    • Vincent Whitchurch's avatar
      printk: Do not lose last line in kmsg buffer dump · 40b07199
      Vincent Whitchurch authored
      commit c9dccacf upstream.
      
      kmsg_dump_get_buffer() is supposed to select all the youngest log
      messages which fit into the provided buffer.  It determines the correct
      start index by using msg_print_text() with a NULL buffer to calculate
      the size of each entry.  However, when performing the actual writes,
      msg_print_text() only writes the entry to the buffer if the written len
      is lesser than the size of the buffer.  So if the lengths of the
      selected youngest log messages happen to precisely fill up the provided
      buffer, the last log message is not included.
      
      We don't want to modify msg_print_text() to fill up the buffer and start
      returning a length which is equal to the size of the buffer, since
      callers of its other users, such as kmsg_dump_get_line(), depend upon
      the current behaviour.
      
      Instead, fix kmsg_dump_get_buffer() to compensate for this.
      
      For example, with the following two final prints:
      
      [    6.427502] AAAAAAAAAAAAA
      [    6.427769] BBBBBBBB12345
      
      A dump of a 64-byte buffer filled by kmsg_dump_get_buffer(), before this
      patch:
      
       00000000: 3c 30 3e 5b 20 20 20 20 36 2e 35 32 32 31 39 37  <0>[    6.522197
       00000010: 5d 20 41 41 41 41 41 41 41 41 41 41 41 41 41 0a  ] AAAAAAAAAAAAA.
       00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      
      After this patch:
      
       00000000: 3c 30 3e 5b 20 20 20 20 36 2e 34 35 36 36 37 38  <0>[    6.456678
       00000010: 5d 20 42 42 42 42 42 42 42 42 31 32 33 34 35 0a  ] BBBBBBBB12345.
       00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      
      Link: http://lkml.kernel.org/r/20190711142937.4083-1-vincent.whitchurch@axis.com
      Fixes: e2ae715d ("kmsg - kmsg_dump() use iterator to receive log buffer content")
      To: rostedt@goodmis.org
      Cc: linux-kernel@vger.kernel.org
      Cc: <stable@vger.kernel.org> # v3.5+
      Signed-off-by: default avatarVincent Whitchurch <vincent.whitchurch@axis.com>
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      40b07199
    • Quinn Tran's avatar
      scsi: qla2xxx: Fix Relogin to prevent modifying scan_state flag · 28f142b9
      Quinn Tran authored
      commit 8b5292bc upstream.
      
      Relogin fails to move forward due to scan_state flag indicating device is
      not there. Before relogin process, Session delete process accidently
      modified the scan_state flag.
      
      [mkp: typos plus corrected Fixes: sha as reported by sfr]
      
      Fixes: 2dee5521 ("scsi: qla2xxx: Fix login state machine freeze")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarQuinn Tran <qutran@marvell.com>
      Signed-off-by: default avatarHimanshu Madhani <hmadhani@marvell.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      28f142b9
    • Martin Wilck's avatar
      scsi: scsi_dh_rdac: zero cdb in send_mode_select() · 03b75e65
      Martin Wilck authored
      commit 57adf5d4 upstream.
      
      cdb in send_mode_select() is not zeroed and is only partially filled in
      rdac_failover_get(), which leads to some random data getting to the
      device. Users have reported storage responding to such commands with
      INVALID FIELD IN CDB. Code before commit 32782557 was not affected, as
      it called blk_rq_set_block_pc().
      
      Fix this by zeroing out the cdb first.
      
      Identified & fix proposed by HPE.
      
      Fixes: 32782557 ("scsi_dh_rdac: switch to scsi_execute_req_flags()")
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20190904155205.1666-1-martin.wilck@suse.comSigned-off-by: default avatarMartin Wilck <mwilck@suse.com>
      Acked-by: default avatarAles Novak <alnovak@suse.cz>
      Reviewed-by: default avatarShane Seymour <shane.seymour@hpe.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      03b75e65
    • Takashi Sakamoto's avatar
      ALSA: firewire-tascam: check intermediate state of clock status and retry · 2e21e5b2
      Takashi Sakamoto authored
      commit e1a00b5b upstream.
      
      2 bytes in MSB of register for clock status is zero during intermediate
      state after changing status of sampling clock in models of TASCAM FireWire
      series. The duration of this state differs depending on cases. During the
      state, it's better to retry reading the register for current status of
      the clock.
      
      In current implementation, the intermediate state is checked only when
      getting current sampling transmission frequency, then retry reading.
      This care is required for the other operations to read the register.
      
      This commit moves the codes of check and retry into helper function
      commonly used for operations to read the register.
      
      Fixes: e453df44 ("ALSA: firewire-tascam: add PCM functionality")
      Cc: <stable@vger.kernel.org> # v4.4+
      Signed-off-by: default avatarTakashi Sakamoto <o-takashi@sakamocchi.jp>
      Link: https://lore.kernel.org/r/20190910135152.29800-3-o-takashi@sakamocchi.jpSigned-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e21e5b2
    • Takashi Sakamoto's avatar
      ALSA: firewire-tascam: handle error code when getting current source of clock · f5779e44
      Takashi Sakamoto authored
      commit 2617120f upstream.
      
      The return value of snd_tscm_stream_get_clock() is ignored. This commit
      checks the value and handle error.
      
      Fixes: e453df44 ("ALSA: firewire-tascam: add PCM functionality")
      Cc: <stable@vger.kernel.org> # v4.4+
      Signed-off-by: default avatarTakashi Sakamoto <o-takashi@sakamocchi.jp>
      Link: https://lore.kernel.org/r/20190910135152.29800-2-o-takashi@sakamocchi.jpSigned-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5779e44
    • Luca Coelho's avatar
      iwlwifi: fw: don't send GEO_TX_POWER_LIMIT command to FW version 36 · fdd131ea
      Luca Coelho authored
      commit fddbfeec upstream.
      
      The intention was to have the GEO_TX_POWER_LIMIT command in FW version
      36 as well, but not all 8000 family got this feature enabled.  The
      8000 family is the only one using version 36, so skip this version
      entirely.  If we try to send this command to the firmwares that do not
      support it, we get a BAD_COMMAND response from the firmware.
      
      This fixes https://bugzilla.kernel.org/show_bug.cgi?id=204151.
      
      Cc: stable@vger.kernel.org # 4.19+
      Signed-off-by: default avatarLuca Coelho <luciano.coelho@intel.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fdd131ea
    • MyungJoo Ham's avatar
      PM / devfreq: passive: fix compiler warning · 6437ec27
      MyungJoo Ham authored
      [ Upstream commit 04658148 ]
      
      The recent commit of
      PM / devfreq: passive: Use non-devm notifiers
      had incurred compiler warning, "unused variable 'dev'".
      Reported-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: default avatarMyungJoo Ham <myungjoo.ham@samsung.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6437ec27
    • Sakari Ailus's avatar
      media: omap3isp: Set device on omap3isp subdevs · 814f7fe5
      Sakari Ailus authored
      [ Upstream commit e9eb103f ]
      
      The omap3isp driver registered subdevs without the dev field being set. Do
      that now.
      Signed-off-by: default avatarSakari Ailus <sakari.ailus@linux.intel.com>
      Reviewed-by: default avatarLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      814f7fe5
    • Qu Wenruo's avatar
      btrfs: extent-tree: Make sure we only allocate extents from block groups with the same type · c5dbd74f
      Qu Wenruo authored
      [ Upstream commit 2a28468e ]
      
      [BUG]
      With fuzzed image and MIXED_GROUPS super flag, we can hit the following
      BUG_ON():
      
        kernel BUG at fs/btrfs/delayed-ref.c:491!
        invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
        CPU: 0 PID: 1849 Comm: sync Tainted: G           O      5.2.0-custom #27
        RIP: 0010:update_existing_head_ref.cold+0x44/0x46 [btrfs]
        Call Trace:
         add_delayed_ref_head+0x20c/0x2d0 [btrfs]
         btrfs_add_delayed_tree_ref+0x1fc/0x490 [btrfs]
         btrfs_free_tree_block+0x123/0x380 [btrfs]
         __btrfs_cow_block+0x435/0x500 [btrfs]
         btrfs_cow_block+0x110/0x240 [btrfs]
         btrfs_search_slot+0x230/0xa00 [btrfs]
         ? __lock_acquire+0x105e/0x1e20
         btrfs_insert_empty_items+0x67/0xc0 [btrfs]
         alloc_reserved_file_extent+0x9e/0x340 [btrfs]
         __btrfs_run_delayed_refs+0x78e/0x1240 [btrfs]
         ? kvm_clock_read+0x18/0x30
         ? __sched_clock_gtod_offset+0x21/0x50
         btrfs_run_delayed_refs.part.0+0x4e/0x180 [btrfs]
         btrfs_run_delayed_refs+0x23/0x30 [btrfs]
         btrfs_commit_transaction+0x53/0x9f0 [btrfs]
         btrfs_sync_fs+0x7c/0x1c0 [btrfs]
         ? __ia32_sys_fdatasync+0x20/0x20
         sync_fs_one_sb+0x23/0x30
         iterate_supers+0x95/0x100
         ksys_sync+0x62/0xb0
         __ia32_sys_sync+0xe/0x20
         do_syscall_64+0x65/0x240
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [CAUSE]
      This situation is caused by several factors:
      - Fuzzed image
        The extent tree of this fs missed one backref for extent tree root.
        So we can allocated space from that slot.
      
      - MIXED_BG feature
        Super block has MIXED_BG flag.
      
      - No mixed block groups exists
        All block groups are just regular ones.
      
      This makes data space_info->block_groups[] contains metadata block
      groups.  And when we reserve space for data, we can use space in
      metadata block group.
      
      Then we hit the following file operations:
      
      - fallocate
        We need to allocate data extents.
        find_free_extent() choose to use the metadata block to allocate space
        from, and choose the space of extent tree root, since its backref is
        missing.
      
        This generate one delayed ref head with is_data = 1.
      
      - extent tree update
        We need to update extent tree at run_delayed_ref time.
      
        This generate one delayed ref head with is_data = 0, for the same
        bytenr of old extent tree root.
      
      Then we trigger the BUG_ON().
      
      [FIX]
      The quick fix here is to check block_group->flags before using it.
      
      The problem can only happen for MIXED_GROUPS fs. Regular filesystems
      won't have space_info with DATA|METADATA flag, and no way to hit the
      bug.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203255Reported-by: default avatarJungyeon Yoon <jungyeon.yoon@gmail.com>
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c5dbd74f