1. 31 May, 2019 40 commits
    • bcache: avoid clang -Wunintialized warning · 9b143f35
      Arnd Bergmann authored
      [ Upstream commit 78d4eb8a ]
      
      clang has identified a code path in which it thinks a
      variable may be used uninitialized:
      
      drivers/md/bcache/alloc.c:333:4: error: variable 'bucket' is used uninitialized whenever 'if' condition is false
            [-Werror,-Wsometimes-uninitialized]
                              fifo_pop(&ca->free_inc, bucket);
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/md/bcache/util.h:219:27: note: expanded from macro 'fifo_pop'
       #define fifo_pop(fifo, i)       fifo_pop_front(fifo, (i))
                                      ^~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/md/bcache/util.h:189:6: note: expanded from macro 'fifo_pop_front'
              if (_r) {                                                       \
                  ^~
      drivers/md/bcache/alloc.c:343:46: note: uninitialized use occurs here
                              allocator_wait(ca, bch_allocator_push(ca, bucket));
                                                                        ^~~~~~
      drivers/md/bcache/alloc.c:287:7: note: expanded from macro 'allocator_wait'
                      if (cond)                                               \
                          ^~~~
      drivers/md/bcache/alloc.c:333:4: note: remove the 'if' if its condition is always true
                              fifo_pop(&ca->free_inc, bucket);
                              ^
      drivers/md/bcache/util.h:219:27: note: expanded from macro 'fifo_pop'
       #define fifo_pop(fifo, i)       fifo_pop_front(fifo, (i))
                                      ^
      drivers/md/bcache/util.h:189:2: note: expanded from macro 'fifo_pop_front'
              if (_r) {                                                       \
              ^
      drivers/md/bcache/alloc.c:331:15: note: initialize the variable 'bucket' to silence this warning
                              long bucket;
                                         ^
      
      This cannot happen in practice because we only enter the loop
      if there is at least one element in the list.
      
      Slightly rearranging the code makes this clearer to both the
      reader and the compiler, which avoids the warning.

      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
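
The rearrangement described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct fifo`, `fifo_push()`, `fifo_pop()` and `drain_sum()` are hypothetical names for a toy ring buffer, not the bcache macros. The point is that looping on the result of the pop itself means `bucket` is never read on a path where it was not written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FIFO_CAP 8

/* Toy ring buffer standing in for the bcache fifo (illustrative only). */
struct fifo {
	long data[FIFO_CAP];
	size_t front, used;
};

static bool fifo_push(struct fifo *f, long v)
{
	if (f->used == FIFO_CAP)
		return false;
	f->data[(f->front + f->used) % FIFO_CAP] = v;
	f->used++;
	return true;
}

/* Writes *out only on success and reports that through the return value. */
static bool fifo_pop(struct fifo *f, long *out)
{
	if (f->used == 0)
		return false;
	*out = f->data[f->front];
	f->front = (f->front + 1) % FIFO_CAP;
	f->used--;
	return true;
}

/* Loop on the pop result instead of a separate emptiness check. */
static long drain_sum(struct fifo *f)
{
	long sum = 0;

	assert(f != NULL);
	while (1) {
		long bucket;

		if (!fifo_pop(f, &bucket))
			break;
		sum += bucket;
	}
	return sum;
}
```

With the pop and the exit test fused, both the reader and the compiler can see that `bucket` is only used after a successful pop.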
    • bcache: add failure check to run_cache_set() for journal replay · b24e16eb
      Coly Li authored
      [ Upstream commit ce3e4cfb ]
      
      Currently run_cache_set() has no return value. If bch_journal_replay()
      fails, the caller of run_cache_set() has no idea about the failure and
      simply continues executing the code that follows run_cache_set(). The
      failure is raised inside bch_journal_replay() and handled
      asynchronously. This is inefficient: while the failure is being handled
      inside bch_journal_replay(), the cache register code is still running
      to start the cache set. Running the registering and unregistering code
      at the same time may introduce rare race conditions and makes the code
      harder to understand.

      This patch adds a return value to run_cache_set() and returns -EIO if
      bch_journal_replay() fails. The caller of run_cache_set() can then
      detect the failure and stop the registering code flow immediately
      inside register_cache_set().

      If journal replay fails, run_cache_set() can report the error
      immediately to register_cache_set(). This patch makes the failure
      handling for bch_journal_replay() synchronous, which is easier to
      understand and debug, and avoids a potential race between registering
      and unregistering at the same time.

      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
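
The synchronous error propagation described above can be sketched as follows. This is a minimal illustration, not the bcache code: the three `*_stub` functions are hypothetical stand-ins for bch_journal_replay(), run_cache_set() and register_cache_set(), and the `registered` flag is an assumption added to make the effect observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for bch_journal_replay(): fails when told to (illustrative). */
static int journal_replay_stub(bool fail)
{
	return fail ? -EIO : 0;
}

/* Stand-in for run_cache_set(): returns an error code to its caller. */
static int run_cache_set_stub(bool replay_fails)
{
	if (journal_replay_stub(replay_fails))
		return -EIO;
	/* ... the remaining start-up steps would go here ... */
	return 0;
}

/* Stand-in for register_cache_set(): stops as soon as the run step fails. */
static int register_cache_set_stub(bool replay_fails, bool *registered)
{
	int ret;

	assert(registered);
	*registered = false;
	ret = run_cache_set_stub(replay_fails);
	if (ret)
		return ret;
	*registered = true;
	return 0;
}
```

Because the failure travels back through return values, no registering step runs after the replay has failed.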
    • bcache: fix failure in journal relplay · b2993307
      Tang Junhui authored
      [ Upstream commit 63120731 ]
      
      journal replay failed with messages:
      Sep 10 19:10:43 ceph kernel: bcache: error on
      bb379a64-e44e-4812-b91d-a5599871a3b1: bcache: journal entries
      2057493-2057567 missing! (replaying 2057493-20766016), disabling
      caching
      
      The reason is in journal_reclaim(): when discard is enabled, we send
      the discard command and reclaim those journal buckets whose seq is
      older than last_seq_now. If the machine restarts before we write a
      journal with last_seq_now, the journal with last_seq_now is never
      written to the journal bucket, and the last_seq_wrote in the newest
      journal is older than the last_seq_now we expect. So when we do the
      replay, the journals from last_seq_wrote to last_seq_now are missing.

      It is hard to write a journal immediately after journal_reclaim(),
      and it is harmless if the missing journals were caused by discarding,
      since those journals have already been written to the btree node. So,
      if the missing seqs start from the beginning journal, we treat it as
      normal and only print a message that shows the missing journals and
      points out that this may be caused by discarding.

      Patch v2 adds a condition to ignore the missing journals only when
      discard is enabled, as Coly suggested.
      
      (Coly Li: rebase the patch with other changes in bch_journal_replay())

      Signed-off-by: Tang Junhui <tang.junhui.linux@gmail.com>
      Tested-by: Dennis Schridde <devurandom@gmx.net>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
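
The decision described above (tolerate a gap only at the very start of the replay range, and only when discard is enabled) might be sketched like this. `check_journal_gap()` and its parameters are hypothetical names chosen for illustration, not identifiers from the bcache source.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide what to do when the entry found has sequence number `found`
 * but `expected` was next in line. `start` is the first sequence the
 * replay expects. Returns 0 to continue, -EIO to abort.
 */
static int check_journal_gap(uint64_t expected, uint64_t found,
			     uint64_t start, bool discard_enabled)
{
	assert(found >= expected);
	if (found == expected)
		return 0;	/* no gap */
	if (expected == start && discard_enabled)
		return 0;	/* gap at the very beginning: likely discarded */
	return -EIO;		/* gap in the middle, or discard is off */
}
```

A real implementation would also log the missing range in the tolerated case, as the commit message says.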
    • bcache: return error immediately in bch_journal_replay() · e067f2f0
      Coly Li authored
      [ Upstream commit 68d10e69 ]
      
      When a failure happens inside bch_journal_replay(), calling
      cache_set_err_on() and handling the failure asynchronously is not a
      good idea: after bch_journal_replay() returns, the registering code
      continues with the following steps while the unregistering code
      triggered by cache_set_err_on() runs at the same time. First, it is
      unnecessary to handle the failure and unregister the cache set
      asynchronously; second, running register and unregister code for the
      same cache set may race.

      So in this patch, if a failure happens in bch_journal_replay(), we don't
      call cache_set_err_on(); we just print the same error message to the
      kernel message buffer and return -EIO immediately to the caller. The
      caller can then detect the failure and handle it synchronously.

      Signed-off-by: Coly Li <colyli@suse.de>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bcache: avoid potential memleak of list of journal_replay(s) in the CACHE_SYNC... · 11425640
      Shenghui Wang authored
      bcache: avoid potential memleak of list of journal_replay(s) in the CACHE_SYNC branch of run_cache_set
      
      [ Upstream commit 95f18c9d ]
      
      In the CACHE_SYNC branch of run_cache_set(), LIST_HEAD(journal) is used
      to collect journal_replay(s) and filled by bch_journal_read().
      
      If all goes well, bch_journal_replay() will release the list of
      journal_replay(s) at the end of the branch.

      If something goes wrong, code flow will jump to the label "err:" and leave
      the list unreleased.

      This patch releases the list of journal_replay(s) when an error is
      detected.

      v1 -> v2:
      * Move the release code to the location after label 'err:' to
        simplify the change.

      Signed-off-by: Shenghui Wang <shhuiw@foxmail.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
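
The cleanup pattern described above can be sketched in userspace C. Everything here is hypothetical scaffolding (a toy linked list, a `live_nodes` counter that exists only so the effect can be checked); it is not the bcache code, only an instance of releasing a list on the error label as well as on the success path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy singly linked list standing in for the journal_replay list. */
struct replay_node {
	struct replay_node *next;
};

static int live_nodes;	/* allocation counter, only for the demonstration */

static struct replay_node *read_entries(int n)
{
	struct replay_node *head = NULL;

	while (n-- > 0) {
		struct replay_node *node = malloc(sizeof(*node));

		if (!node)
			break;
		node->next = head;
		head = node;
		live_nodes++;
	}
	return head;
}

static void free_entries(struct replay_node *head)
{
	while (head) {
		struct replay_node *next = head->next;

		free(head);
		live_nodes--;
		head = next;
	}
}

/* Both the success path and the error label release the list. */
static int run_with_cleanup(int entries, bool fail)
{
	struct replay_node *journal;

	assert(entries >= 0);
	journal = read_entries(entries);
	if (fail)
		goto err;
	free_entries(journal);	/* success: the replay step releases the list */
	return 0;
err:
	free_entries(journal);	/* failure: release it here as well */
	return -EIO;
}
```

Without the call under `err:`, the failing path would return with `live_nodes` still non-zero, which is the leak the commit describes.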
    • crypto: sun4i-ss - Fix invalid calculation of hash end · 8d4819fd
      Corentin Labbe authored
      [ Upstream commit f8739155 ]
      
      When nbytes < 4, end is wrongly set to a negative value which, because
      it is a uint, is then interpreted as a large value, leading to a
      deadlock in the following code.

      This patch fixes this problem.
      
      Fixes: 6298e948 ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator")
      Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
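
The unsigned wrap-around described above is easy to reproduce in isolation. The sketch below does not use the driver's actual formula for `end`; `end_naive()` and `end_guarded()` are hypothetical helpers that only show how a subtraction on an unsigned value turns a "negative" result into a very large one, and one way to guard against it.

```c
#include <assert.h>
#include <limits.h>

/* Naive version: for nbytes < 4 the subtraction wraps around. */
static unsigned int end_naive(unsigned int nbytes)
{
	return nbytes - 4;
}

/* Guarded version: clamp to 0 instead of wrapping. */
static unsigned int end_guarded(unsigned int nbytes)
{
	return nbytes < 4 ? 0 : nbytes - 4;
}
```

A loop that runs until a counter reaches the wrapped value would spin for an extremely long time, which matches the hang described in the commit message.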
    • nvme-tcp: fix a NULL deref when an admin connect times out · f3423700
      Sagi Grimberg authored
      [ Upstream commit 7a425896 ]
      
      If we time out the admin startup sequence we might not yet have
      an I/O tagset allocated, which causes the teardown sequence to crash.
      Make nvme_tcp_teardown_io_queues safe by not iterating inflight tags
      if the tagset wasn't allocated.
      
      Fixes: 39d57757 ("nvme-tcp: fix timeout handler")
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • nvme-rdma: fix a NULL deref when an admin connect times out · b4e256d8
      Sagi Grimberg authored
      [ Upstream commit 1007709d ]
      
      If we time out the admin startup sequence we might not yet have
      an I/O tagset allocated, which causes the teardown sequence to crash.
      Make nvme_rdma_teardown_io_queues safe by not iterating inflight tags
      if the tagset wasn't allocated.
      
      Fixes: 4c174e63 ("nvme-rdma: fix timeout handler")
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • nvme: set 0 capacity if namespace block size exceeds PAGE_SIZE · 12b83abc
      Sagi Grimberg authored
      [ Upstream commit 01fa0174 ]
      
      If our target exposed a namespace with a block size that is greater
      than PAGE_SIZE, set 0 capacity on the namespace as we do not support it.
      
      This issue was encountered when the nvmet namespace was backed by a tempfile.

      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
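
The rule described above can be sketched as a small helper. This is an illustration under stated assumptions: `ns_capacity_sectors()` and `SKETCH_PAGE_SHIFT` are hypothetical names, a 4 KiB page is assumed, and capacity is expressed in 512-byte sectors.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumes 4 KiB pages */

/* Capacity in 512-byte sectors; 0 if the block size is unsupported. */
static uint64_t ns_capacity_sectors(uint64_t nr_blocks, unsigned int lba_shift)
{
	assert(lba_shift >= 9 && lba_shift < 32);
	if (lba_shift > SKETCH_PAGE_SHIFT)
		return 0;	/* block size exceeds the page size */
	return nr_blocks << (lba_shift - 9);
}
```

Reporting zero capacity keeps the namespace visible while making sure no I/O is issued with a block size the host cannot handle.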
    • net: cw1200: fix a NULL pointer dereference · 6fb42f3c
      Kangjie Lu authored
      [ Upstream commit 0ed2a005 ]
      
      In case create_singlethread_workqueue fails, the fix frees the
      hardware and returns NULL to avoid a NULL pointer dereference.

      Signed-off-by: Kangjie Lu <kjlu@umn.edu>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • rsi: Fix NULL pointer dereference in kmalloc · 90905286
      Aditya Pakki authored
      [ Upstream commit d5414c23 ]
      
      kmalloc can fail in rsi_register_rates_channels but memcpy still attempts
      to write to channels. The patch replaces these calls with kmemdup and
      passes the error upstream.

      Signed-off-by: Aditya Pakki <pakki001@umn.edu>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mwifiex: prevent an array overflow · ae745a8b
      Dan Carpenter authored
      [ Upstream commit b4c35c17 ]
      
      The "rate_index" is only used as an index into the phist_data->rx_rate[]
      array in the mwifiex_hist_data_set() function.  That array has
      MWIFIEX_MAX_AC_RX_RATES (74) elements and it's used to generate some
      debugfs information.  The "rate_index" variable comes from the network
      skb->data[] and it is a u8 so it's in the 0-255 range.  We need to cap
      it to prevent an array overflow.
      
      Fixes: cbf6e055 ("mwifiex: add rx histogram statistics support")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
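
Capping an untrusted index, as described above, might look like the following sketch. `struct hist_sketch` and `record_rx_rate()` are hypothetical names; only the array size of 74 comes from the commit message, and the exact capping expression used by the driver is not shown here.

```c
#include <assert.h>
#include <stdint.h>

#define MWIFIEX_MAX_AC_RX_RATES 74

struct hist_sketch {
	unsigned int rx_rate[MWIFIEX_MAX_AC_RX_RATES];
};

/* Clamp an untrusted u8 index before using it on the array. */
static uint8_t record_rx_rate(struct hist_sketch *h, uint8_t rate_index)
{
	assert(h);
	if (rate_index >= MWIFIEX_MAX_AC_RX_RATES)
		rate_index = MWIFIEX_MAX_AC_RX_RATES - 1;
	h->rx_rate[rate_index]++;
	return rate_index;
}
```

Since the index arrives in packet data, any value from 0 to 255 is possible, so the clamp has to happen before the array access rather than being assumed by the caller.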
    • Fix nfs4.2 return -EINVAL when do dedupe operation · 50c312ad
      Xiaoli Feng authored
      [ Upstream commit ce96e888 ]
      
      The dedupe_file_range operation has been combined into remap_file_range,
      but in nfs42_remap_file_range dedupe operations are skipped.
      Before this patch:
        # dd if=/dev/zero of=nfs/file bs=1M count=1
        # xfs_io -c "dedupe nfs/file 4k 64k 4k" nfs/file
        XFS_IOC_FILE_EXTENT_SAME: Invalid argument
      After this patch:
        # dd if=/dev/zero of=nfs/file bs=1M count=1
        # xfs_io -c "dedupe nfs/file 4k 64k 4k" nfs/file
        deduped 4096/4096 bytes at offset 65536
        4 KiB, 1 ops; 0.0046 sec (865.988 KiB/sec and 216.4971 ops/sec)

      Signed-off-by: Xiaoli Feng <fengxiaoli0714@gmail.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ASoC: fsl_sai: Update is_slave_mode with correct value · e2130db1
      Daniel Baluta authored
      [ Upstream commit ddb35114 ]
      
      is_slave_mode defaults to false because the sai structure
      that contains it is kzalloc'ed.

      However, if we switch the configuration from SAI slave to
      SAI master, is_slave_mode remains true, although with SAI
      as master it should be false.
      
      Fix this by updating is_slave_mode for each call of
      fsl_sai_set_dai_fmt.

      Signed-off-by: Daniel Baluta <daniel.baluta@nxp.com>
      Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • regulator: core: Actually put the gpiod after use · a973c17c
      Linus Walleij authored
      [ Upstream commit 78927aa4 ]
      
      I went to great lengths to hand over the management of the GPIO
      descriptors to the regulator core, and some stray rebased
      oneliner in the old patch must have been assuming the devices
      were still doing devres management of it.
      
      We handed the management over to the regulator core, so of
      course the regulator core shall issue gpiod_put() when done.
      
      Sorry for the descriptor leak.
      
      Fixes: 541d052d ("regulator: core: Only support passing enable GPIO descriptors")
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • slimbus: fix a potential NULL pointer dereference in of_qcom_slim_ngd_register · c3bbd47c
      Kangjie Lu authored
      [ Upstream commit 06d5d6b7 ]
      
      In case platform_device_alloc fails, the fix returns an error
      code to avoid the NULL pointer dereference.

      Signed-off-by: Kangjie Lu <kjlu@umn.edu>
      Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • libbpf: fix samples/bpf build failure due to undefined UINT32_MAX · f1df511d
      Daniel T. Lee authored
      [ Upstream commit 32e621e5 ]
      
      Currently, building bpf samples will cause the following error.
      
          ./tools/lib/bpf/bpf.h:132:27: error: 'UINT32_MAX' undeclared here (not in a function) ..
           #define BPF_LOG_BUF_SIZE (UINT32_MAX >> 8) /* verifier maximum in kernels <= 5.1 */
                                     ^
          ./samples/bpf/bpf_load.h:31:25: note: in expansion of macro 'BPF_LOG_BUF_SIZE'
           extern char bpf_log_buf[BPF_LOG_BUF_SIZE];
                                   ^~~~~~~~~~~~~~~~
      
      Due to commit 4519efa6 ("libbpf: fix BPF_LOG_BUF_SIZE off-by-one error")
      hard-coded size of BPF_LOG_BUF_SIZE has been replaced with UINT32_MAX which is
      defined in <stdint.h> header.
      
      Even with this change, bpf selftests are running fine since these are built
      with clang and it includes header(-idirafter) from clang/6.0.0/include.
      (it has <stdint.h>)
      
          clang -I. -I./include/uapi -I../../../include/uapi -idirafter /usr/local/include -idirafter /usr/include \
          -idirafter /usr/lib/llvm-6.0/lib/clang/6.0.0/include -idirafter /usr/include/x86_64-linux-gnu \
          -Wno-compare-distinct-pointer-types -O2 -target bpf -emit-llvm -c progs/test_sysctl_prog.c -o - | \
          llc -march=bpf -mcpu=generic  -filetype=obj -o /linux/tools/testing/selftests/bpf/test_sysctl_prog.o
      
      But bpf samples are compiled with GCC, and it only searches and includes
      headers declared at the target file. As '#include <stdint.h>' hasn't been
      declared in tools/lib/bpf/bpf.h, it causes build failure of bpf samples.
      
          gcc -Wp,-MD,./samples/bpf/.sockex3_user.o.d -Wall -Wmissing-prototypes -Wstrict-prototypes \
          -O2 -fomit-frame-pointer -std=gnu89 -I./usr/include -I./tools/lib/ -I./tools/testing/selftests/bpf/ \
          -I./tools/  lib/ -I./tools/include -I./tools/perf -c -o ./samples/bpf/sockex3_user.o ./samples/bpf/sockex3_user.c;
      
      This commit adds '#include <stdint.h>' to tools/lib/bpf/bpf.h
      to fix this problem.

      Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
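
The dependency described above can be shown in a few lines. The macro definition is the one quoted in the error message; `log_buf_size()` is a hypothetical helper added only so the value can be inspected.

```c
#include <assert.h>
#include <stdint.h>	/* the header that defines UINT32_MAX */

#define BPF_LOG_BUF_SIZE (UINT32_MAX >> 8)

/* Expands the macro; this only compiles when UINT32_MAX is visible. */
static unsigned long log_buf_size(void)
{
	return BPF_LOG_BUF_SIZE;
}
```

A header that uses UINT32_MAX in a macro should include <stdint.h> itself, so that it does not depend on what each including file happens to pull in first.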
    • drm: prefix header search paths with $(srctree)/ · 59c8fa1f
      Masahiro Yamada authored
      [ Upstream commit 43068cb7 ]
      
      Currently, the Kbuild core manipulates header search paths in a crazy
      way [1].
      
      To fix this mess, I want all Makefiles to add explicit $(srctree)/ to
      the search paths in the srctree. Some Makefiles are already written in
      that way, but not all. The goal of this work is to make the notation
      consistent, and finally get rid of the gross hacks.
      
      Having whitespaces after -I does not matter since commit 48f6e3cf
      ("kbuild: do not drop -I without parameter").
      
      [1]: https://patchwork.kernel.org/patch/9632347/

      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
      Reviewed-by: James Qian Wang (Arm Technology China) <james.qian.wang@arm.com>
      Acked-by: Liviu Dudau <liviu.dudau@arm.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/1553859161-2628-1-git-send-email-yamada.masahiro@socionext.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mac80211/cfg80211: update bss channel on channel switch · 4659bb96
      Sergey Matyukevich authored
      [ Upstream commit 5dc8cdce ]
      
      FullMAC STAs have no way to update bss channel after CSA channel switch
      completion. As a result, user-space tools may provide inconsistent
      channel info. For instance, consider the following two commands:
      $ sudo iw dev wlan0 link
      $ sudo iw dev wlan0 info
      The latter command gets channel info from the hardware, so most probably
      its output will be correct. However the former command gets channel info
      from scan cache, so its output will contain outdated channel info.
      In fact, current bss channel info will not be updated until the
      next [re-]connect.
      
      Note that mac80211 STAs have a workaround for this, but it requires
      access to internal cfg80211 data, see ieee80211_chswitch_work:
      
      	/* XXX: shouldn't really modify cfg80211-owned data! */
      	ifmgd->associated->channel = sdata->csa_chandef.chan;
      
      This patch proposes converting the mac80211 workaround into cfg80211
      behavior and updating the current bss channel in cfg80211_ch_switch_notify.

      Signed-off-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • dmaengine: pl330: _stop: clear interrupt status · 1606cbce
      Sugar Zhang authored
      [ Upstream commit 2da254cc ]
      
      This patch issues DMAKILL, which instructs the DMAC to immediately
      terminate execution of a thread, then clears the interrupt status,
      and finally stops generating interrupts for DMASEV. This guarantees
      that the next DMA start is clean. Otherwise, an interrupt may be left
      pending until the next start and cause problems.

      We can reproduce the problem as follows:

      DMASEV: modifies the event-interrupt resource; if INTEN configures
      the event as an interrupt, the DMAC sets irq<event_num> HIGH to
      generate an interrupt. Write INTCLR to clear the interrupt.
      
      	DMA EXECUTING INSTRUCTS		DMA TERMINATE
      		|				|
      		|				|
      	       ...			      _stop
      		|				|
      		|			spin_lock_irqsave
      	     DMASEV				|
      		|				|
      		|			    mask INTEN
      		|				|
      		|			     DMAKILL
      		|				|
      		|			spin_unlock_irqrestore
      
      In the above case, an interrupt is left pending, and if we unmask
      INTEN, the DMAC will set irq<event_num> HIGH to generate an interrupt.

      To fix this, do as follows:
      
      	DMA EXECUTING INSTRUCTS		DMA TERMINATE
      		|				|
      		|				|
      	       ...			      _stop
      		|				|
      		|			spin_lock_irqsave
      	     DMASEV				|
      		|				|
      		|			     DMAKILL
      		|				|
      		|			   clear INTCLR
      		|			    mask INTEN
      		|				|
      		|			spin_unlock_irqrestore

      Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
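
The fixed ordering in the second diagram can be modelled with a toy state machine. This is not the PL330 register map: `struct dmac_sketch`, `dmasev()` and `stop_fixed()` are hypothetical names, and the three fields only approximate the roles of INTEN, the latched interrupt status and the running thread.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the event-interrupt state (not the real register map). */
struct dmac_sketch {
	uint32_t inten;		/* events allowed to raise an interrupt */
	uint32_t pending;	/* latched interrupt status */
	int running;		/* is the thread executing? */
};

/* The thread executes DMASEV for event `ev`. */
static void dmasev(struct dmac_sketch *d, unsigned int ev)
{
	assert(ev < 32);
	if (d->running && (d->inten & (1u << ev)))
		d->pending |= 1u << ev;
}

/* Stop in the fixed order: kill, clear status, then mask. */
static void stop_fixed(struct dmac_sketch *d, unsigned int ev)
{
	assert(ev < 32);
	d->running = 0;			/* DMAKILL */
	d->pending &= ~(1u << ev);	/* write INTCLR */
	d->inten &= ~(1u << ev);	/* mask INTEN */
}
```

Clearing the status after the kill means an event signalled just before the stop cannot survive into the next start.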
    • net: hns3: use atomic_t replace u32 for arq's count · 4918adcf
      Huazhong Tan authored
      [ Upstream commit 30780a8b ]
      
      Both the irq handler and the mailbox task update arq's count,
      so the count should be an atomic_t instead of a u32; otherwise
      its value may end up wrong.
      
      Fixes: 07a0556a ("net: hns3: Changes to support ARQ(Asynchronous Receive Queue)")
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
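
The idea of a counter updated from two contexts can be sketched in userspace with C11 atomics. The kernel uses atomic_t and its helpers instead; `struct arq_sketch`, `arq_msg_received()` and `arq_msg_handled()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Counter shared by two contexts, e.g. an IRQ handler and a task. */
struct arq_sketch {
	atomic_uint count;
};

/* Producer side: one more message is pending. */
static void arq_msg_received(struct arq_sketch *q)
{
	assert(q);
	atomic_fetch_add(&q->count, 1);
}

/* Consumer side: returns 1 if a message was taken, 0 if none pending. */
static int arq_msg_handled(struct arq_sketch *q)
{
	unsigned int cur;

	assert(q);
	cur = atomic_load(&q->count);
	while (cur != 0) {
		/* On failure, cur is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&q->count, &cur, cur - 1))
			return 1;
	}
	return 0;
}
```

With a plain integer, a read-modify-write in one context could overwrite an update made concurrently by the other; the atomic operations rule that out.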
    • arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value · 1c05aa60
      Will Deacon authored
      [ Upstream commit 84ff7a09 ]
      
      Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't
      explicitly set the return value on the non-faulting path and instead
      leaves it holding the result of the underlying atomic operation. This
      means that any FUTEX_WAKE_OP atomic operation which computes a non-zero
      value will be reported as having failed. Regrettably, I wrote the buggy
      code back in 2011 and it was upstreamed as part of the initial arm64
      support in 2012.
      
      The reasons we appear to get away with this are:
      
        1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get
           exercised by futex() test applications
      
        2. If the result of the atomic operation is zero, the system call
           behaves correctly
      
        3. Prior to version 2.25, the only operation used by GLIBC set the
           futex to zero, and therefore worked as expected. From 2.25 onwards,
           FUTEX_WAKE_OP is not used by GLIBC at all.
      
      Fix the implementation by ensuring that the return value is either 0
      to indicate that the atomic operation completed successfully, or -EFAULT
      if we encountered a fault when accessing the user mapping.
      
      Cc: <stable@kernel.org>
      Fixes: 6170a974 ("arm64: Atomic operations")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
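
The contract described above (return a status code, never the computed value) can be sketched in plain C. This is not the arm64 assembly: `futex_add_sketch()` is a hypothetical helper, the operation is a simple add, and a NULL pointer stands in for a faulting user address.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Apply "*uaddr += oparg", hand the old value back through *oldval and
 * return only a status code: 0 on success, -EFAULT on a bad address.
 */
static int futex_add_sketch(int *uaddr, int oparg, int *oldval)
{
	assert(oldval != NULL);
	if (uaddr == NULL)
		return -EFAULT;		/* stands in for a user-access fault */
	*oldval = *uaddr;
	*uaddr += oparg;
	return 0;			/* never the computed value */
}
```

Keeping the result in an output parameter and the status in the return value avoids the bug where a non-zero result is mistaken for a failure.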
    • s390: qeth: address type mismatch warning · 8f1f40f7
      Arnd Bergmann authored
      [ Upstream commit 46b83629 ]
      
      clang produces a harmless warning for each use of the qeth_adp_supported
      macro:
      
      drivers/s390/net/qeth_l2_main.c:559:31: warning: implicit conversion from enumeration type 'enum qeth_ipa_setadp_cmd' to
            different enumeration type 'enum qeth_ipa_funcs' [-Wenum-conversion]
              if (qeth_adp_supported(card, IPA_SETADP_SET_PROMISC_MODE))
                  ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/s390/net/qeth_core.h:179:41: note: expanded from macro 'qeth_adp_supported'
              qeth_is_ipa_supported(&c->options.adp, f)
              ~~~~~~~~~~~~~~~~~~~~~                  ^
      
      Add a version of this macro that uses the correct types, and
      remove the unused qeth_adp_enabled() macro that has the same
      problem.

      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: phy: improve genphy_soft_reset · 0dd7e77b
      Heiner Kallweit authored
      [ Upstream commit 8c90b795 ]
      
      PHYs behave differently when being reset. Some reset registers to
      defaults, some don't. Some trigger an autoneg restart, some don't.
      
      So let's also set the autoneg restart bit when resetting. Then PHY
      behavior should be more consistent. Clearing BMCR_ISOLATE serves the
      same purpose and is borrowed from genphy_restart_aneg.
      
      BMCR holds the speed / duplex settings in fixed mode. Therefore
      we may have an issue if a soft reset resets BMCR to its default.
      So better call genphy_setup_forced() afterwards in fixed mode.
      We've seen no related complaint in the last >10 yrs, so let's
      treat it as an improvement.

      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
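
The BMCR manipulation described above can be sketched as a pure function. The three bit values follow the standard MII basic mode control register layout; `soft_reset_bmcr()` is a hypothetical helper that only computes the value to write and is not the phylib implementation.

```c
#include <assert.h>
#include <stdint.h>

/* BMCR bit values from the standard MII register layout. */
#define BMCR_RESET	0x8000
#define BMCR_ISOLATE	0x0400
#define BMCR_ANRESTART	0x0200

/* Compute the BMCR value to write for a soft reset. */
static uint16_t soft_reset_bmcr(uint16_t bmcr, int autoneg_enabled)
{
	assert(autoneg_enabled == 0 || autoneg_enabled == 1);
	bmcr &= (uint16_t)~BMCR_ISOLATE;	/* leave isolate mode */
	bmcr |= BMCR_RESET;
	if (autoneg_enabled)
		bmcr |= BMCR_ANRESTART;		/* ask for a fresh negotiation */
	return bmcr;
}
```

In fixed mode, the speed and duplex bits would still have to be rewritten after the reset, as the commit message notes.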
    • net: hns3: fix for TX clean num when cleaning TX BD · 4c1c9dab
      Yunsheng Lin authored
      [ Upstream commit 63380a1a ]
      
      hns3_desc_unused() returns how many BDs have been cleaned but have
      not yet had a new buffer attached. The HNS3_RING_RX_RING_FBDNUM_REG
      register returns how many BDs need a new buffer allocated or need
      to be cleaned. So the number of remaining BDs that need to be cleaned
      is HNS3_RING_RX_RING_FBDNUM_REG - hns3_desc_unused().

      Also, a new buffer cannot be attached to the pending BDs while the
      last BD is not handled, because memcpy has not been done on the first
      pending BD.

      This patch fixes this by subtracting the pending BD num from
      unused_count after 'HNS3_RING_RX_RING_FBDNUM_REG - unused_count' is
      used to calculate the number of BDs that need to be cleaned.
      
      Fixes: e5597095 ("net: hns3: Add handling of GRO Pkts not fully RX'ed in NAPI poll")
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: hns3: fix pause configure fail problem · f5fc42d9
      Huazhong Tan authored
      [ Upstream commit fba2efda ]
      
      When configuring pause, the current implementation returns directly
      after setting up PFC without setting up BP, which is not sufficient.

      This patch fixes it by returning early only when setting PFC fails.
      
      Fixes: 44e59e37 ("net: hns3: do not return GE PFC setting err when initializing")
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • w1: fix the resume command API · ae9505c0
      Mariusz Bialonczyk authored
      [ Upstream commit 62909da8 ]
      
      From the DS2408 datasheet [1]:
      "Resume Command function checks the status of the RC flag and, if it is set,
       directly transfers control to the control functions, similar to a Skip ROM
       command. The only way to set the RC flag is through successfully executing
       the Match ROM, Search ROM, Conditional Search ROM, or Overdrive-Match ROM
       command"
      
      The function currently works perfectly fine on a multidrop bus, but when
      only a single slave is connected, only a Skip ROM is used and Match
      ROM is not called at all. This leads to problems, e.g. with a single
      DS2408 connected: the Resume Command does not work properly and the
      device responds with failing results after the Resume Command.

      This commit fixes this by using a Skip ROM instead in those cases.
      The bandwidth / performance advantage is exactly the same.
      
      Refs:
      [1] https://datasheets.maximintegrated.com/en/ds/DS2408.pdf
      
      Signed-off-by: Mariusz Bialonczyk <manio@skyboo.net>
      Reviewed-by: Jean-Francois Dagenais <jeff.dagenais@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ae9505c0
    • Grygorii Strashko's avatar
      net: ethernet: ti: cpsw: fix allmulti cfg in dual_mac mode · edb51581
      Grygorii Strashko authored
      [ Upstream commit 06095f34 ]
      
      Currently, when the ALLMULTI mode flag is set for a netdev, CPSW ALE
      sets/clears the Host port bit in the Unregistered Multicast Flood Mask
      (UNREG_MCAST_FLOOD_MASK) for every VLAN without checking whether this
      port belongs to the VLAN. This works in non-dual_mac mode, but in
      dual_mac mode it enables/disables the ALLMULTI flag for both ports.
      
      Hence fix it by adding additional parameter to cpsw_ale_set_allmulti() to
      specify ALE port number for which ALLMULTI has to be enabled and check if
      port belongs to VLAN before modifying UNREG_MCAST_FLOOD_MASK.
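
The membership check can be illustrated with a small, self-contained model of a VLAN table. Everything here is a hypothetical sketch: `struct vlan_entry`, `set_allmulti()` and the assumption that the host port is bit 0 are illustrative, not the driver's actual layout or API.

```c
#include <stddef.h>
#include <stdint.h>

#define HOST_PORT_BIT 0x1u  /* assumption: the host port is bit 0 of each mask */

/* Hypothetical model of one VLAN entry in the ALE table. */
struct vlan_entry {
	uint32_t member_mask;       /* ports that belong to this VLAN */
	uint32_t unreg_mcast_mask;  /* ports unregistered multicast floods to */
};

/*
 * Enable or disable ALLMULTI for one port: only VLANs that the port is a
 * member of get their host-port flood bit changed.
 */
void set_allmulti(struct vlan_entry *tbl, size_t n, int allmulti,
		  unsigned int port)
{
	for (size_t i = 0; i < n; i++) {
		/* Skip VLANs this port does not belong to. */
		if (!(tbl[i].member_mask & (1u << port)))
			continue;

		if (allmulti)
			tbl[i].unreg_mcast_mask |= HOST_PORT_BIT;
		else
			tbl[i].unreg_mcast_mask &= ~HOST_PORT_BIT;
	}
}
```

In a dual_mac-like setup with one VLAN per external port, toggling ALLMULTI for port 1 then leaves the VLAN of port 2 untouched.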
      
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      edb51581
    • Nicholas Piggin's avatar
      sched/nohz: Run NOHZ idle load balancer on HK_FLAG_MISC CPUs · 5cd3fddb
      Nicholas Piggin authored
      [ Upstream commit 9b019acb ]
      
      The NOHZ idle balancer runs on the lowest idle CPU. This can
      interfere with isolated CPUs, so confine it to HK_FLAG_MISC
      housekeeping CPUs.
      
      HK_FLAG_SCHED is not used for this because it is not set anywhere
      at the moment. This could be folded into HK_FLAG_SCHED once that
      option is fixed.
      
      The problem was observed with increased jitter on an application
      running on CPU0, caused by NOHZ idle load balancing being run on
      CPU1 (an SMT sibling).
      
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190412042613.28930-1-npiggin@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5cd3fddb
    • Bard liao's avatar
      ALSA: hda: fix unregister device twice on ASoC driver · c13f2ccf
      Bard liao authored
      [ Upstream commit 4d95c517 ]
      
      snd_hda_codec_device_new() is used by both the legacy HDA and ASoC
      drivers. However, for an ASoC device snd_hdac_device_unregister() is
      already called in snd_hdac_ext_bus_device_remove(). This patch uses
      the type flag in the hdac_device struct to determine whether it is an
      ASoC device or a legacy HDA device, and calls snd_hdac_device_unregister()
      in snd_hda_codec_dev_free() only if it is a legacy HDA device.
      
      Signed-off-by: Bard liao <yung-chuan.liao@linux.intel.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c13f2ccf
    • Philipp Rudo's avatar
      s390/kexec_file: Fix detection of text segment in ELF loader · ea9e874b
      Philipp Rudo authored
      [ Upstream commit 729829d7 ]
      
      To register data for the next kernel (command line, oldmem_base, etc.) the
      current kernel needs to find the ELF segment that contains head.S. This is
      currently done by checking for 'phdr->p_paddr == 0'. This works fine for
      the current kernel build but in theory the first few pages could be
      skipped. Make the detection more robust by checking if the entry point lies
      within the segment.
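
The two detection strategies can be compared in a short sketch. `struct seg_hdr` is a hypothetical, trimmed-down stand-in for an ELF program header (only the two fields the check needs), and both helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down program header: only the fields used here. */
struct seg_hdr {
	uint64_t p_paddr;  /* physical load address of the segment */
	uint64_t p_memsz;  /* size of the segment in memory */
};

/* Old heuristic from the text: the segment loaded at address 0. */
bool is_text_by_paddr(const struct seg_hdr *phdr)
{
	return phdr->p_paddr == 0;
}

/* More robust: the entry point lies in [p_paddr, p_paddr + p_memsz). */
bool is_text_by_entry(const struct seg_hdr *phdr, uint64_t entry)
{
	return entry >= phdr->p_paddr &&
	       entry - phdr->p_paddr < phdr->p_memsz;
}
```

If the first pages were skipped, for example a segment loaded at `0x2000` with an entry point of `0x2010`, the address-zero heuristic misses the segment while the entry-point check still finds it.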
      
      Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ea9e874b
    • Manish Rangankar's avatar
      scsi: qedi: Abort ep termination if offload not scheduled · 1df245c3
      Manish Rangankar authored
      [ Upstream commit f848bfd8 ]
      
      Sometimes during connection recovery, when ARP resolution fails and the
      offload connection was not issued, the driver tries to flush pending
      offload connection work that was never queued.
      
      kernel: WARNING: CPU: 19 PID: 10110 at kernel/workqueue.c:3030 __flush_work.isra.34+0x19c/0x1b0
      kernel: CPU: 19 PID: 10110 Comm: iscsid Tainted: G W 5.1.0-rc4 #11
      kernel: Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 2.9.1 12/04/2018
      kernel: RIP: 0010:__flush_work.isra.34+0x19c/0x1b0
      kernel: Code: 8b fb 66 0f 1f 44 00 00 31 c0 eb ab 48 89 ef c6 07 00 0f 1f 40 00 fb 66 0f 1f 44 00 00 31 c0 eb 96 e8 08 16 fe ff 0f 0b eb 8d <0f> 0b 31 c0 eb 87 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
      kernel: RSP: 0018:ffffa6b4054dba68 EFLAGS: 00010246
      kernel: RAX: 0000000000000000 RBX: ffff91df21c36fc0 RCX: 0000000000000000
      kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff91df21c36fc0
      kernel: RBP: ffff91df21c36ef0 R08: 0000000000000000 R09: 0000000000000000
      kernel: R10: 0000000000000038 R11: ffffa6b4054dbd60 R12: ffffffffc05e72c0
      kernel: R13: ffff91db10280820 R14: 0000000000000048 R15: 0000000000000000
      kernel: FS:  00007f5d83cc1740(0000) GS:ffff91df2f840000(0000) knlGS:0000000000000000
      kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      kernel: CR2: 0000000001cc5000 CR3: 0000000465450002 CR4: 00000000001606e0
      kernel: Call Trace:
      kernel: ? try_to_del_timer_sync+0x4d/0x80
      kernel: qedi_ep_disconnect+0x3b/0x410 [qedi]
      kernel: ? 0xffffffffc083c000
      kernel: ? klist_iter_exit+0x14/0x20
      kernel: ? class_find_device+0x93/0xf0
      kernel: iscsi_if_ep_disconnect.isra.18+0x58/0x70 [scsi_transport_iscsi]
      kernel: iscsi_if_recv_msg+0x10e2/0x1510 [scsi_transport_iscsi]
      kernel: ? copyout+0x22/0x30
      kernel: ? _copy_to_iter+0xa0/0x430
      kernel: ? _cond_resched+0x15/0x30
      kernel: ? __kmalloc_node_track_caller+0x1f9/0x270
      kernel: iscsi_if_rx+0xa5/0x1e0 [scsi_transport_iscsi]
      kernel: netlink_unicast+0x17f/0x230
      kernel: netlink_sendmsg+0x2d2/0x3d0
      kernel: sock_sendmsg+0x36/0x50
      kernel: ___sys_sendmsg+0x280/0x2a0
      kernel: ? timerqueue_add+0x54/0x80
      kernel: ? enqueue_hrtimer+0x38/0x90
      kernel: ? hrtimer_start_range_ns+0x19f/0x2c0
      kernel: __sys_sendmsg+0x58/0xa0
      kernel: do_syscall_64+0x5b/0x180
      kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1df245c3
    • Fabien Dessenne's avatar
      rtc: stm32: manage the get_irq probe defer case · a4e1b27e
      Fabien Dessenne authored
      [ Upstream commit cf612c59 ]
      
      Manage the -EPROBE_DEFER error case for the wake IRQ.
      
      Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com>
      Acked-by: Amelie Delaunay <amelie.delaunay@st.com>
      Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a4e1b27e
    • Sven Van Asbroeck's avatar
      rtc: 88pm860x: prevent use-after-free on device remove · 88053c93
      Sven Van Asbroeck authored
      [ Upstream commit f22b1ba1 ]
      
      The device's remove() attempts to shut down the delayed_work scheduled
      on the kernel-global workqueue by calling flush_scheduled_work().
      
      Unfortunately, flush_scheduled_work() does not prevent the delayed_work
      from re-scheduling itself. The delayed_work might run after the device
      has been removed, and touch the already de-allocated info structure.
      This is a potential use-after-free.
      
      Fix by calling cancel_delayed_work_sync() during remove(): this ensures
      that the delayed work is properly cancelled, is no longer running, and
      is not able to re-schedule itself.
      
      This issue was detected with the help of Coccinelle.
      
      Signed-off-by: Sven Van Asbroeck <TheSven73@gmail.com>
      Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      88053c93
    • Johannes Berg's avatar
      iwlwifi: pcie: don't crash on invalid RX interrupt · 902e3a83
      Johannes Berg authored
      [ Upstream commit 30f24eab ]
      
      If for some reason the device gives us an RX interrupt before we're
      ready for it, perhaps during device power-on with misconfigured IRQ
      causes mapping or so, we can crash trying to access the queues.
      
      Prevent that by checking that we actually have RXQs and that they
      were properly allocated.
      
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      902e3a83
    • Qu Wenruo's avatar
      btrfs: Don't panic when we can't find a root key · 40b111b4
      Qu Wenruo authored
      [ Upstream commit 7ac1e464 ]
      
      When we fail to find a root key in btrfs_update_root(), we just panic.
      
      That's definitely not cool. Fix it by outputting a unique error
      message, aborting the current transaction and returning -EUCLEAN. This
      should not normally happen as the root has been used by the callers in
      some way.
      
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      40b111b4
    • Josef Bacik's avatar
      btrfs: fix panic during relocation after ENOSPC before writeback happens · 2d39b32c
      Josef Bacik authored
      [ Upstream commit ff612ba7 ]
      
      We've been seeing the following sporadically throughout our fleet:
      
      panic: kernel BUG at fs/btrfs/relocation.c:4584!
      netversion: 5.0-0
      Backtrace:
       #0 [ffffc90003adb880] machine_kexec at ffffffff81041da8
       #1 [ffffc90003adb8c8] __crash_kexec at ffffffff8110396c
       #2 [ffffc90003adb988] crash_kexec at ffffffff811048ad
       #3 [ffffc90003adb9a0] oops_end at ffffffff8101c19a
       #4 [ffffc90003adb9c0] do_trap at ffffffff81019114
       #5 [ffffc90003adba00] do_error_trap at ffffffff810195d0
       #6 [ffffc90003adbab0] invalid_op at ffffffff81a00a9b
          [exception RIP: btrfs_reloc_cow_block+692]
          RIP: ffffffff8143b614  RSP: ffffc90003adbb68  RFLAGS: 00010246
          RAX: fffffffffffffff7  RBX: ffff8806b9c32000  RCX: ffff8806aad00690
          RDX: ffff880850b295e0  RSI: ffff8806b9c32000  RDI: ffff88084f205bd0
          RBP: ffff880849415000   R8: ffffc90003adbbe0   R9: ffff88085ac90000
          R10: ffff8805f7369140  R11: 0000000000000000  R12: ffff880850b295e0
          R13: ffff88084f205bd0  R14: 0000000000000000  R15: 0000000000000000
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
       #7 [ffffc90003adbbb0] __btrfs_cow_block at ffffffff813bf1cd
       #8 [ffffc90003adbc28] btrfs_cow_block at ffffffff813bf4b3
       #9 [ffffc90003adbc78] btrfs_search_slot at ffffffff813c2e6c
      
      The way relocation moves data extents is by creating a reloc inode and
      preallocating extents in this inode and then copying the data into these
      preallocated extents.  Once we've done this for all of our extents,
      we'll write out these dirty pages, which marks the extent written, and
      goes into btrfs_reloc_cow_block().  From here we get our current
      reloc_control, which _should_ match the reloc_control for the current
      block group we're relocating.
      
      However if we get an ENOSPC in this path at some point we'll bail out,
      never initiating writeback on this inode.  Not a huge deal, unless we
      happen to be doing relocation on a different block group, and this block
      group is now rc->stage == UPDATE_DATA_PTRS.  This trips the BUG_ON() in
      btrfs_reloc_cow_block(), because we expect to be done modifying the data
      inode.  We are in fact done modifying the metadata for the data inode
      we're currently using, but not the one from the failed block group, and
      thus we BUG_ON().
      
      (This happens when writeback finishes for extents from the previous
      group, when we are at btrfs_finish_ordered_io() which updates the data
      reloc tree (inode item, drops/adds extent items, etc).)
      
      Fix this by writing out the reloc data inode always, and then breaking
      out of the loop after that point to keep from tripping this BUG_ON()
      later.
      
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      [ add note from Filipe ]
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2d39b32c
    • Robbie Ko's avatar
      Btrfs: fix data bytes_may_use underflow with fallocate due to failed quota reserve · f033d6a2
      Robbie Ko authored
      [ Upstream commit 39ad3173 ]
      
      When doing fallocate, we first add the range to the reserve_list and
      then reserve the quota.  If quota reservation fails, we'll release all
      reserved parts of reserve_list.
      
      However, cur_offset is not updated to indicate that this range has
      already been inserted into the list.  Therefore, the same range is freed
      twice: once in the list_for_each_entry loop, and once at the end of the
      function.  This results in a WARN_ON on bytes_may_use when we free the
      remaining space.
      
      At the end, under the 'out' label we have a call to:
      
         btrfs_free_reserved_data_space(inode, data_reserved, alloc_start, alloc_end - cur_offset);
      
      The start offset, third argument, should be cur_offset.
      
      Everything from alloc_start to cur_offset was freed by the
      list_for_each_entry_safe loop.
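
The effect of the wrong start offset can be reproduced with a toy accounting model. This is a sketch under stated assumptions, not btrfs code: the "blocks", `struct space`, `reserve_range()`, `free_range()` and `cleanup_double_frees()` are all hypothetical, and the model only tracks which units are currently reserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 16  /* toy address space, counted in hypothetical blocks */

struct space {
	bool reserved[NBLOCKS];
	int double_frees;  /* blocks freed while not reserved (the underflow) */
};

static void reserve_range(struct space *s, size_t start, size_t len)
{
	for (size_t i = start; i < start + len; i++)
		s->reserved[i] = true;
}

static void free_range(struct space *s, size_t start, size_t len)
{
	for (size_t i = start; i < start + len; i++) {
		if (!s->reserved[i])
			s->double_frees++;
		s->reserved[i] = false;
	}
}

/* Returns how many blocks the error path frees twice. */
int cleanup_double_frees(size_t alloc_start, size_t cur_offset,
			 size_t alloc_end, bool fixed)
{
	struct space s = { 0 };

	assert(alloc_start <= cur_offset && cur_offset <= alloc_end &&
	       alloc_end <= NBLOCKS);

	reserve_range(&s, alloc_start, alloc_end - alloc_start);

	/* The list loop releases everything in [alloc_start, cur_offset). */
	free_range(&s, alloc_start, cur_offset - alloc_start);

	/* The 'out' label: the buggy call starts again at alloc_start. */
	free_range(&s, fixed ? cur_offset : alloc_start,
		   alloc_end - cur_offset);

	return s.double_frees;
}
```

For `alloc_start = 0`, `cur_offset = 8`, `alloc_end = 12`, the buggy variant frees blocks 0 to 3 a second time, while the fixed variant frees only the remaining blocks 8 to 11.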
      
      Fixes: 18513091 ("btrfs: update btrfs_space_info's bytes_may_use timely")
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Robbie Ko <robbieko@synology.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f033d6a2
    • Nadav Amit's avatar
      x86/modules: Avoid breaking W^X while loading modules · eee8c24f
      Nadav Amit authored
      [ Upstream commit f2c65fb3 ]
      
      When modules and BPF filters are loaded, there is a time window in
      which some memory is both writable and executable. An attacker that has
      already found another vulnerability (e.g., a dangling pointer) might be
      able to exploit this behavior to overwrite kernel code. Prevent having
      writable executable PTEs in this stage.
      
      In addition, avoiding having W+X mappings can also slightly simplify the
      patching of modules code on initialization (e.g., by alternatives and
      static-key), as would be done in the next patch. This was actually the
      main motivation for this patch.
      
      To avoid having W+X mappings, set them initially as RW (NX) and after
      they are set as RO set them as X as well. Setting them as executable is
      done as a separate step to avoid one core in which the old PTE is cached
      (hence writable), and another which sees the updated PTE (executable),
      which would break the W^X protection.
      
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Suggested-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Nadav Amit <namit@vmware.com>
      Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <akpm@linux-foundation.org>
      Cc: <ard.biesheuvel@linaro.org>
      Cc: <deneen.t.dock@intel.com>
      Cc: <kernel-hardening@lists.openwall.com>
      Cc: <kristen@linux.intel.com>
      Cc: <linux_dti@icloud.com>
      Cc: <will.deacon@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Rik van Riel <riel@surriel.com>
      Link: https://lkml.kernel.org/r/20190426001143.4983-12-namit@vmware.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      eee8c24f
    • Bart Van Assche's avatar
      scsi: qla2xxx: Fix hardirq-unsafe locking · f4edac5b
      Bart Van Assche authored
      [ Upstream commit 300ec741 ]
      
      Since fc_remote_port_delete() must be called with interrupts enabled, do
      not disable interrupts when calling that function. Remove the locking calls
      from around the put_sess() call. This is safe because the function that is
      called when the final reference is dropped, qlt_unreg_sess(), grabs the
      proper locks. This patch prevents lockdep from reporting the following:
      
      WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
      kworker/2:1/62 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
      0000000009e679b3 (&(&k->k_lock)->rlock){+.+.}, at: klist_next+0x43/0x1d0
      
      and this task is already holding:
      00000000a033b71c (&(&ha->tgt.sess_lock)->rlock){-...}, at: qla24xx_delete_sess_fn+0x55/0xf0 [qla2xxx_scst]
      which would create a new lock dependency:
       (&(&ha->tgt.sess_lock)->rlock){-...} -> (&(&k->k_lock)->rlock){+.+.}
      
      but this new dependency connects a HARDIRQ-irq-safe lock:
       (&(&ha->tgt.sess_lock)->rlock){-...}
      
      ... which became HARDIRQ-irq-safe at:
        lock_acquire+0xe3/0x200
        _raw_spin_lock_irqsave+0x3d/0x60
        qla24xx_report_id_acquisition+0xa69/0xe30 [qla2xxx_scst]
        qla24xx_process_response_queue+0x69e/0x1270 [qla2xxx_scst]
        qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx_scst]
        __handle_irq_event_percpu+0x79/0x3c0
        handle_irq_event_percpu+0x70/0xf0
        handle_irq_event+0x5a/0x8b
        handle_edge_irq+0x12c/0x310
        handle_irq+0x192/0x20a
        do_IRQ+0x73/0x160
        ret_from_intr+0x0/0x1d
        default_idle+0x23/0x1f0
        arch_cpu_idle+0x15/0x20
        default_idle_call+0x35/0x40
        do_idle+0x2bb/0x2e0
        cpu_startup_entry+0x1d/0x20
        start_secondary+0x2a8/0x320
        secondary_startup_64+0xa4/0xb0
      
      to a HARDIRQ-irq-unsafe lock:
       (&(&k->k_lock)->rlock){+.+.}
      
      ... which became HARDIRQ-irq-unsafe at:
      ...
        lock_acquire+0xe3/0x200
        _raw_spin_lock+0x32/0x50
        klist_add_tail+0x33/0xb0
        device_add+0x7e1/0xb50
        device_create_groups_vargs+0x11c/0x150
        device_create_with_groups+0x89/0xb0
        vtconsole_class_init+0xb2/0x124
        do_one_initcall+0xc5/0x3ce
        kernel_init_freeable+0x295/0x32e
        kernel_init+0x11/0x11b
        ret_from_fork+0x3a/0x50
      
      other info that might help us debug this:
      
       Possible interrupt unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&(&k->k_lock)->rlock);
                                     local_irq_disable();
                                     lock(&(&ha->tgt.sess_lock)->rlock);
                                     lock(&(&k->k_lock)->rlock);
        <Interrupt>
          lock(&(&ha->tgt.sess_lock)->rlock);
      
       *** DEADLOCK ***
      
      3 locks held by kworker/2:1/62:
       #0: 00000000a4319c16 ((wq_completion)"qla2xxx_wq"){+.+.}, at: process_one_work+0x437/0xa80
       #1: 00000000ffa34c42 ((work_completion)(&sess->del_work)){+.+.}, at: process_one_work+0x437/0xa80
       #2: 00000000a033b71c (&(&ha->tgt.sess_lock)->rlock){-...}, at: qla24xx_delete_sess_fn+0x55/0xf0 [qla2xxx_scst]
      
      the dependencies between HARDIRQ-irq-safe lock and the holding lock:
      -> (&(&ha->tgt.sess_lock)->rlock){-...} ops: 8 {
         IN-HARDIRQ-W at:
                          lock_acquire+0xe3/0x200
                          _raw_spin_lock_irqsave+0x3d/0x60
                          qla24xx_report_id_acquisition+0xa69/0xe30 [qla2xxx_scst]
                          qla24xx_process_response_queue+0x69e/0x1270 [qla2xxx_scst]
                          qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx_scst]
                          __handle_irq_event_percpu+0x79/0x3c0
                          handle_irq_event_percpu+0x70/0xf0
                          handle_irq_event+0x5a/0x8b
                          handle_edge_irq+0x12c/0x310
                          handle_irq+0x192/0x20a
                          do_IRQ+0x73/0x160
                          ret_from_intr+0x0/0x1d
                          default_idle+0x23/0x1f0
                          arch_cpu_idle+0x15/0x20
                          default_idle_call+0x35/0x40
                          do_idle+0x2bb/0x2e0
                          cpu_startup_entry+0x1d/0x20
                          start_secondary+0x2a8/0x320
                          secondary_startup_64+0xa4/0xb0
         INITIAL USE at:
                         lock_acquire+0xe3/0x200
                         _raw_spin_lock_irqsave+0x3d/0x60
                         qla24xx_report_id_acquisition+0xa69/0xe30 [qla2xxx_scst]
                         qla24xx_process_response_queue+0x69e/0x1270 [qla2xxx_scst]
                         qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx_scst]
                         __handle_irq_event_percpu+0x79/0x3c0
                         handle_irq_event_percpu+0x70/0xf0
                         handle_irq_event+0x5a/0x8b
                         handle_edge_irq+0x12c/0x310
                         handle_irq+0x192/0x20a
                         do_IRQ+0x73/0x160
                         ret_from_intr+0x0/0x1d
                         default_idle+0x23/0x1f0
                         arch_cpu_idle+0x15/0x20
                         default_idle_call+0x35/0x40
                         do_idle+0x2bb/0x2e0
                         cpu_startup_entry+0x1d/0x20
                         start_secondary+0x2a8/0x320
                         secondary_startup_64+0xa4/0xb0
       }
       ... key      at: [<ffffffffa0c0d080>] __key.85462+0x0/0xfffffffffff7df80 [qla2xxx_scst]
       ... acquired at:
         lock_acquire+0xe3/0x200
         _raw_spin_lock_irqsave+0x3d/0x60
         klist_next+0x43/0x1d0
         device_for_each_child+0x96/0x110
         scsi_target_block+0x3c/0x40 [scsi_mod]
         fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc]
         qla2x00_mark_device_lost+0xa0b/0xa30 [qla2xxx_scst]
         qlt_unreg_sess+0x1c6/0x380 [qla2xxx_scst]
         qla24xx_delete_sess_fn+0xe6/0xf0 [qla2xxx_scst]
         process_one_work+0x511/0xa80
         worker_thread+0x67/0x5b0
         kthread+0x1d2/0x1f0
         ret_from_fork+0x3a/0x50
      
      the dependencies between the lock to be acquired
       and HARDIRQ-irq-unsafe lock:
      -> (&(&k->k_lock)->rlock){+.+.} ops: 13831 {
         HARDIRQ-ON-W at:
                          lock_acquire+0xe3/0x200
                          _raw_spin_lock+0x32/0x50
                          klist_add_tail+0x33/0xb0
                          device_add+0x7e1/0xb50
                          device_create_groups_vargs+0x11c/0x150
                          device_create_with_groups+0x89/0xb0
                          vtconsole_class_init+0xb2/0x124
                          do_one_initcall+0xc5/0x3ce
                          kernel_init_freeable+0x295/0x32e
                          kernel_init+0x11/0x11b
                          ret_from_fork+0x3a/0x50
         SOFTIRQ-ON-W at:
                          lock_acquire+0xe3/0x200
                          _raw_spin_lock+0x32/0x50
                          klist_add_tail+0x33/0xb0
                          device_add+0x7e1/0xb50
                          device_create_groups_vargs+0x11c/0x150
                          device_create_with_groups+0x89/0xb0
                          vtconsole_class_init+0xb2/0x124
                          do_one_initcall+0xc5/0x3ce
                          kernel_init_freeable+0x295/0x32e
                          kernel_init+0x11/0x11b
                          ret_from_fork+0x3a/0x50
         INITIAL USE at:
                         lock_acquire+0xe3/0x200
                         _raw_spin_lock+0x32/0x50
                         klist_add_tail+0x33/0xb0
                         device_add+0x7e1/0xb50
                         device_create_groups_vargs+0x11c/0x150
                         device_create_with_groups+0x89/0xb0
                         vtconsole_class_init+0xb2/0x124
                         do_one_initcall+0xc5/0x3ce
                         kernel_init_freeable+0x295/0x32e
                         kernel_init+0x11/0x11b
                         ret_from_fork+0x3a/0x50
       }
       ... key      at: [<ffffffff83ed8780>] __key.15491+0x0/0x40
       ... acquired at:
         lock_acquire+0xe3/0x200
         _raw_spin_lock_irqsave+0x3d/0x60
         klist_next+0x43/0x1d0
         device_for_each_child+0x96/0x110
         scsi_target_block+0x3c/0x40 [scsi_mod]
         fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc]
         qla2x00_mark_device_lost+0xa0b/0xa30 [qla2xxx_scst]
         qlt_unreg_sess+0x1c6/0x380 [qla2xxx_scst]
         qla24xx_delete_sess_fn+0xe6/0xf0 [qla2xxx_scst]
         process_one_work+0x511/0xa80
         worker_thread+0x67/0x5b0
         kthread+0x1d2/0x1f0
         ret_from_fork+0x3a/0x50
      
      stack backtrace:
      CPU: 2 PID: 62 Comm: kworker/2:1 Tainted: G           O      5.0.7-dbg+ #8
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Workqueue: qla2xxx_wq qla24xx_delete_sess_fn [qla2xxx_scst]
      Call Trace:
       dump_stack+0x86/0xca
       check_usage.cold.52+0x473/0x563
       __lock_acquire+0x11c0/0x23e0
       lock_acquire+0xe3/0x200
       _raw_spin_lock_irqsave+0x3d/0x60
       klist_next+0x43/0x1d0
       device_for_each_child+0x96/0x110
       scsi_target_block+0x3c/0x40 [scsi_mod]
       fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc]
       qla2x00_mark_device_lost+0xa0b/0xa30 [qla2xxx_scst]
       qlt_unreg_sess+0x1c6/0x380 [qla2xxx_scst]
       qla24xx_delete_sess_fn+0xe6/0xf0 [qla2xxx_scst]
       process_one_work+0x511/0xa80
       worker_thread+0x67/0x5b0
       kthread+0x1d2/0x1f0
       ret_from_fork+0x3a/0x50
      
      Cc: Himanshu Madhani <hmadhani@marvell.com>
      Cc: Giridhar Malavali <gmalavali@marvell.com>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Acked-by: Himanshu Madhani <hmadhani@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f4edac5b