1. 05 Oct, 2017 10 commits
    • crypto: talitos - fix sha224 · 362711d5
      LEROY Christophe authored
      commit afd62fa2 upstream.
      
      Kernel crypto tests report the following error at startup
      
      [    2.752626] alg: hash: Test 4 failed for sha224-talitos
      [    2.757907] 00000000: 30 e2 86 e2 e7 8a dd 0d d7 eb 9f d5 83 fe f1 b0
      00000010: 2d 5a 6c a5 f9 55 ea fd 0e 72 05 22
      
      This patch fixes it
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: talitos - Don't provide setkey for non hmac hashing algs. · 231c4f64
      LEROY Christophe authored
      commit 56136631 upstream.
      
      Today, md5sum fails with error -ENOKEY because a setkey
      function is set for non hmac hashing algs, see strace output below:
      
      mmap(NULL, 378880, PROT_READ, MAP_SHARED, 6, 0) = 0x77f50000
      accept(3, 0, NULL)                      = 7
      vmsplice(5, [{"bin/\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 378880}], 1, SPLICE_F_MORE|SPLICE_F_GIFT) = 262144
      splice(4, NULL, 7, NULL, 262144, SPLICE_F_MORE) = -1 ENOKEY (Required key not available)
      write(2, "Generation of hash for file kcap"..., 50) = 50
      munmap(0x77f50000, 378880)              = 0
      
      This patch ensures that setkey() function is set only
      for hmac hashing.
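
The idea can be sketched in user-space C as follows. This is only an illustration under stated assumptions, not the driver's code: `struct hash_alg_sketch`, `sketch_setkey()` and `register_hash_sketch()` are hypothetical names, and the "hmac" name-prefix test is an assumed way of telling keyed algorithms apart.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a driver's hash algorithm template. */
struct hash_alg_sketch {
	const char *name;	/* e.g. "md5" or "hmac(md5)" */
	int (*setkey)(const unsigned char *key, size_t len);
};

/* Placeholder keyed-hash setup; a real driver would store the key here. */
static int sketch_setkey(const unsigned char *key, size_t len)
{
	(void)key;
	(void)len;
	return 0;
}

/*
 * Install a setkey callback only for algorithms whose name starts with
 * "hmac"; plain digests get NULL so they are not treated as keyed.
 */
static void register_hash_sketch(struct hash_alg_sketch *alg)
{
	if (strncmp(alg->name, "hmac", 4) == 0)
		alg->setkey = sketch_setkey;
	else
		alg->setkey = NULL;
}
```

With this shape, a plain "md5" entry ends up without a setkey callback, so a caller that never supplies a key has no reason to be refused.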
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly · 9d253491
      Xin Long authored
      commit c88f0e6b upstream.
      
      ChunYu found a kernel crash by syzkaller:
      
      [  651.617875] kasan: CONFIG_KASAN_INLINE enabled
      [  651.618217] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [  651.618731] general protection fault: 0000 [#1] SMP KASAN
      [  651.621543] CPU: 1 PID: 9539 Comm: scsi Not tainted 4.11.0.cov #32
      [  651.621938] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [  651.622309] task: ffff880117780000 task.stack: ffff8800a3188000
      [  651.622762] RIP: 0010:skb_release_data+0x26c/0x590
      [...]
      [  651.627260] Call Trace:
      [  651.629156]  skb_release_all+0x4f/0x60
      [  651.629450]  consume_skb+0x1a5/0x600
      [  651.630705]  netlink_unicast+0x505/0x720
      [  651.632345]  netlink_sendmsg+0xab2/0xe70
      [  651.633704]  sock_sendmsg+0xcf/0x110
      [  651.633942]  ___sys_sendmsg+0x833/0x980
      [  651.637117]  __sys_sendmsg+0xf3/0x240
      [  651.638820]  SyS_sendmsg+0x32/0x50
      [  651.639048]  entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      The crash happens because skb_shared_info, which sits at the end of the
      sk_buff data, is overwritten with ISCSI_KEVENT_IF_ERROR while iscsi_if_rx
      parses nlmsg info from the skb.
      
      During the loop, if skb->len == nlh->nlmsg_len and both equal sizeof(*nlh),
      ev = nlmsg_data(nlh) actually points at skb_shinfo(SKB), so the write to
      ev->type sets a new value in skb_shinfo(SKB)->nr_frags.
      
      This patch fixes it by checking nlh->nlmsg_len properly, so that the
      parser no longer accesses memory beyond the sk_buff data.
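
A length check of this kind might look like the sketch below. It is a simplified user-space illustration, not the kernel code: `struct msg_hdr_sketch`, `struct event_sketch` and `msg_is_parsable()` are hypothetical, and the layouts are not the real netlink or iSCSI definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical header and event layouts. */
struct msg_hdr_sketch { uint32_t len; uint16_t type; uint16_t flags; };
struct event_sketch   { uint32_t type; uint32_t payload; };

/*
 * Accept a message only when the buffer holds a full header, the length
 * claimed in the header covers a full event after the header, and that
 * claimed length fits inside the buffer.
 */
static bool msg_is_parsable(size_t buf_len, const struct msg_hdr_sketch *hdr)
{
	if (buf_len < sizeof(*hdr))
		return false;
	if (hdr->len < sizeof(*hdr) + sizeof(struct event_sketch))
		return false;
	if (buf_len < hdr->len)
		return false;
	return true;
}
```

The middle test is the one that matters for the case described above: a message whose length equals the bare header size is rejected before any event field is touched.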
      Reported-by: ChunYu Wang <chunwang@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Chris Leech <cleech@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • md/raid5: preserve STRIPE_ON_UNPLUG_LIST in break_stripe_batch_list · 29854a77
      Dennis Yang authored
      commit 184a09eb upstream.
      
      In release_stripe_plug(), if a stripe_head has its STRIPE_ON_UNPLUG_LIST
      set, it indicates that this stripe_head is already in the raid5_plug_cb
      list and release_stripe() would be called instead to drop a reference
      count. Otherwise, the STRIPE_ON_UNPLUG_LIST bit would be set for this
      stripe_head and it will get queued into the raid5_plug_cb list.
      
      Since break_stripe_batch_list() did not preserve STRIPE_ON_UNPLUG_LIST,
      a stripe could be re-added to the plug list while it is still on that
      list, in the following situation. If stripe_head A is added to another
      stripe_head B's batch list, A will have its batch_head != NULL and be
      added to the plug list. After that, stripe_head B gets handled and calls
      break_stripe_batch_list(), which resets the state of all the batched
      stripe_heads (including A, which is still on the plug list) and resets
      their batch_head to NULL. If another write request comes in and gets
      stripe_head A before the plug list is processed, A will have its
      batch_head == NULL (cleared by calling break_stripe_batch_list() on B)
      and be added to the plug list once again.
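
The general technique, resetting a state word while keeping selected bits, can be sketched as below. The flag names and values are hypothetical stand-ins, and the real function preserves and copies a different set of bits; this only shows the masking pattern.

```c
#include <stdint.h>

/* Hypothetical flag bits; the kernel's real values differ. */
enum {
	SKETCH_BATCH_READY    = 1u << 0,
	SKETCH_HANDLE         = 1u << 1,
	SKETCH_ON_UNPLUG_LIST = 1u << 2,
};

/* Bits that must survive when a batch is broken up. */
#define SKETCH_PRESERVED_BITS ((uint32_t)SKETCH_ON_UNPLUG_LIST)

/*
 * Reset a batch member's state from the head's state, but keep the
 * member's own preserved bits so list membership is not forgotten.
 */
static uint32_t reset_state_on_batch_break(uint32_t member_state,
					   uint32_t head_state)
{
	return (head_state & ~SKETCH_PRESERVED_BITS) |
	       (member_state & SKETCH_PRESERVED_BITS);
}
```

Because the on-list bit survives the reset, a later check of that bit still sees that the stripe is queued and does not queue it a second time.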
      Signed-off-by: Dennis Yang <dennisyang@qnap.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • md/raid5: fix a race condition in stripe batch · d03d1567
      Shaohua Li authored
      commit 3664847d upstream.
      
      We have a race condition in the scenario below. Say we have 3 contiguous
      stripes, sh1, sh2 and sh3, where sh1 is the stripe_head of sh2 and sh3:
      
      CPU1				CPU2				CPU3
      handle_stripe(sh3)
      				stripe_add_to_batch_list(sh3)
      				-> lock(sh2, sh3)
      				-> lock batch_lock(sh1)
      				-> add sh3 to batch_list of sh1
      				-> unlock batch_lock(sh1)
      								clear_batch_ready(sh1)
      								-> lock(sh1) and batch_lock(sh1)
      								-> clear STRIPE_BATCH_READY for all stripes in batch_list
      								-> unlock(sh1) and batch_lock(sh1)
      ->clear_batch_ready(sh3)
      -->test_and_clear_bit(STRIPE_BATCH_READY, sh3)
      --->return 0 as sh->batch == NULL
      				-> sh3->batch_head = sh1
      				-> unlock (sh2, sh3)
      
      On CPU1, handle_stripe will continue to handle sh3 even though it is in the
      batch stripe list of sh1. By moving the sh3->batch_head assignment inside
      batch_lock, we make it impossible to clear STRIPE_BATCH_READY before
      batch_head is set.
      
      Thanks Stephane for helping debug this tricky issue.
      Reported-and-tested-by: Stephane Thiell <sthiell@stanford.edu>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Erase irqsoff trace with empty write · 68a4a528
      Bo Yan authored
      commit 8dd33bcb upstream.
      
      One convenient way to erase the trace is "echo > trace". However, this
      is currently broken if the current tracer is the irqsoff tracer, because
      the irqsoff tracer uses max_buffer as the default trace buffer.
      
      Set the max_buffer as the one to be cleared when it's the trace
      buffer currently in use.
      
      Link: http://lkml.kernel.org/r/1505754215-29411-1-git-send-email-byan@nvidia.com
      
      Cc: <mingo@redhat.com>
      Fixes: 4acd4d00 ("tracing: give easy way to clear trace buffer")
      Signed-off-by: Bo Yan <byan@nvidia.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Fix trace_pipe behavior for instance traces · 9c5afa72
      Tahsin Erdogan authored
      commit 75df6e68 upstream.
      
      When reading data from trace_pipe, tracing_wait_pipe() performs a
      check to see if tracing has been turned off after some data was read.
      Currently, this check always looks at global trace state, but it
      should check the trace instance in which trace_pipe is located.
      
      Because of this bug, cat instances/i1/trace_pipe in the following
      script will immediately exit instead of waiting for data:
      
      cd /sys/kernel/debug/tracing
      echo 0 > tracing_on
      mkdir -p instances/i1
      echo 1 > instances/i1/tracing_on
      echo 1 > instances/i1/events/sched/sched_process_exec/enable
      cat instances/i1/trace_pipe
      
      Link: http://lkml.kernel.org/r/20170917102348.1615-1-tahsin@google.com
      
      Fixes: 10246fa3 ("tracing: give easy way to clear trace buffer")
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce() · f75c0042
      Paul Mackerras authored
      commit 47c5310a upstream, with part
      of commit edd03602 folded in.
      
      Nixiaoming pointed out that there is a memory leak in
      kvm_vm_ioctl_create_spapr_tce() if the call to anon_inode_getfd()
      fails; the memory allocated for the kvmppc_spapr_tce_table struct
      is not freed, and nor are the pages allocated for the iommu
      tables.
      
      David Hildenbrand pointed out that there is a race in that the
      function checks early on that there is not already an entry in the
      stt->iommu_tables list with the same LIOBN, but an entry with the
      same LIOBN could get added between then and when the new entry is
      added to the list.
      
      This fixes both problems.  To simplify things, we now call
      anon_inode_getfd() before placing the new entry in the list.  The
      check for an existing entry is done while holding the kvm->lock
      mutex, immediately before adding the new entry to the list.
      
      [paulus@ozlabs.org - folded in that part of edd03602 ("KVM:
       PPC: Book3S HV: Protect updates to spapr_tce_tables list", 2017-08-28)
       which restructured the code that 47c5310a modified, to avoid
       a build failure caused by the absence of put_unused_fd().
       Also removed the locked memory accounting, since it doesn't exist
       in this version, and adjusted the commit message.]
      
      Fixes: 54738c09 ("KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode")
      Reported-by: Nixiaoming <nixiaoming@huawei.com>
      Reported-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: flush hw_roc_start work before cancelling the ROC · 7d8fbf3d
      Avraham Stern authored
      commit 6e46d8ce upstream.
      
      When HW ROC is supported it is possible that after the HW notified
      that the ROC has started, the ROC was cancelled and another ROC was
      added while the hw_roc_start worker is waiting on the mutex (since
      cancelling the ROC and adding another one also holds the same mutex).
      As a result, the hw_roc_start worker will continue to run after the
      new ROC is added but before it is actually started by the HW.
      This may result in notifying userspace that the ROC has started before
      it actually does, or in case of management tx ROC, in an attempt to
      tx while not on the right channel.
      
      In addition, when the driver notifies mac80211 that the second ROC
      has started, mac80211 will warn that this ROC has already been
      notified.
      
      Fix this by flushing the hw_roc_start work before cancelling an ROC.
      Signed-off-by: Avraham Stern <avraham.stern@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cifs: release auth_key.response for reconnect. · fcc949a4
      Shu Wang authored
      commit f5c4ba81 upstream.
      
      There is a race that causes a cifs reconnect during cifs_mount:
      - cifs_mount
        - cifs_get_tcp_session
          - [ start thread cifs_demultiplex_thread
            - cifs_read_from_socket: -ECONNABORTED
              - DELAY_WORK smb2_reconnect_server ]
        - cifs_setup_session
        - [ smb2_reconnect_server ]
      
      auth_key.response is allocated in cifs_setup_session and is released
      when the session is destroyed. So when the session reconnects,
      auth_key.response should be checked and released.
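
The pattern of releasing a leftover buffer before allocating a new one can be sketched like this. It is a user-space illustration with hypothetical names (`struct auth_key_sketch`, `release_previous_response()`, `setup_session_sketch()`), using `malloc`-family calls in place of the kernel allocators.

```c
#include <stdlib.h>

/* Hypothetical session key holder, loosely modelled on the text above. */
struct auth_key_sketch {
	unsigned char *response;
	size_t len;
};

/* Release any response left over from an earlier session setup. */
static void release_previous_response(struct auth_key_sketch *key)
{
	free(key->response);	/* free(NULL) is a no-op */
	key->response = NULL;
	key->len = 0;
}

/* Set up a session: drop the stale buffer first, then allocate anew. */
static int setup_session_sketch(struct auth_key_sketch *key, size_t len)
{
	release_previous_response(key);
	key->response = calloc(1, len);
	if (!key->response)
		return -1;
	key->len = len;
	return 0;
}
```

Calling `setup_session_sketch()` twice in a row, as a reconnect effectively does, then frees the first buffer instead of leaking it.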
      
      Tested with my system:
      CIFS VFS: Free previous auth_key.response = ffff8800320bbf80
      
      A simple auth_key.response allocation call trace:
      - cifs_setup_session
      - SMB2_sess_setup
      - SMB2_sess_auth_rawntlmssp_authenticate
      - build_ntlmssp_auth_blob
      - setup_ntlmv2_rsp
      Signed-off-by: Shu Wang <shuwang@redhat.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
      Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 27 Sep, 2017 30 commits
    • Linux 4.4.89 · 10def3a6
      Greg Kroah-Hartman authored
    • ftrace: Fix memleak when unregistering dynamic ops when tracing disabled · ed1bf439
      Steven Rostedt (VMware) authored
      commit edb096e0 upstream.
      
      If function tracing is disabled by the user via the function-trace option or
      the proc sysctl file, and a ftrace_ops that was allocated on the heap is
      unregistered, then the shutdown code exits out without doing the proper
      clean up. This was found via kmemleak and running the ftrace selftests, as
      one of the tests unregisters with function tracing disabled.
      
       # cat kmemleak
      unreferenced object 0xffffffffa0020000 (size 4096):
        comm "swapper/0", pid 1, jiffies 4294668889 (age 569.209s)
        hex dump (first 32 bytes):
          55 ff 74 24 10 55 48 89 e5 ff 74 24 18 55 48 89  U.t$.UH...t$.UH.
          e5 48 81 ec a8 00 00 00 48 89 44 24 50 48 89 4c  .H......H.D$PH.L
        backtrace:
          [<ffffffff81d64665>] kmemleak_vmalloc+0x85/0xf0
          [<ffffffff81355631>] __vmalloc_node_range+0x281/0x3e0
          [<ffffffff8109697f>] module_alloc+0x4f/0x90
          [<ffffffff81091170>] arch_ftrace_update_trampoline+0x160/0x420
          [<ffffffff81249947>] ftrace_startup+0xe7/0x300
          [<ffffffff81249bd2>] register_ftrace_function+0x72/0x90
          [<ffffffff81263786>] trace_selftest_ops+0x204/0x397
          [<ffffffff82bb8971>] trace_selftest_startup_function+0x394/0x624
          [<ffffffff81263a75>] run_tracer_selftest+0x15c/0x1d7
          [<ffffffff82bb83f1>] init_trace_selftests+0x75/0x192
          [<ffffffff81002230>] do_one_initcall+0x90/0x1e2
          [<ffffffff82b7d620>] kernel_init_freeable+0x350/0x3fe
          [<ffffffff81d61ec3>] kernel_init+0x13/0x122
          [<ffffffff81d72c6a>] ret_from_fork+0x2a/0x40
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      Fixes: 12cce594 ("ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bcache: fix bch_hprint crash and improve output · a069d0a4
      Michael Lyle authored
      commit 9276717b upstream.
      
      Most importantly, solve a crash where %llu was used to format signed
      numbers.  This would cause a buffer overflow when reading sysfs
      writeback_rate_debug, as only 20 bytes were allocated for this and
      %llu writes 20 characters plus a null.
      
      Always use the units mechanism rather than having different output
      paths for simplicity.
      
      Also, correct problems with display output where 1.10 was a larger
      number than 1.09, by multiplying by 10 and then dividing by 1024 instead
      of dividing by 100.  (Remainders of >= 1000 would print as .10).
      
      Minor changes: Always display the decimal point instead of trying to
      omit it based on number of digits shown.  Decide what units to use
      based on 1000 as a threshold, not 1024 (in other words, always print
      at most 3 digits before the decimal point).
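
A formatter with the properties described above could be sketched as follows. This is an illustrative user-space version under assumptions, not the patched function: `hprint_sketch()` and its explicit buffer-size parameter are hypothetical, and the unit table is an assumed choice.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format v as at most three integer digits, one decimal digit and a unit
 * suffix. Worst case: '-' + 3 digits + '.' + 1 digit + unit + NUL = 8 bytes.
 */
static int hprint_sketch(char *buf, size_t size, int64_t v)
{
	static const char units[] = "?kMGTPEZY";
	/* Take the magnitude in unsigned arithmetic so INT64_MIN is safe. */
	uint64_t q = v < 0 ? (uint64_t)0 - (uint64_t)v : (uint64_t)v;
	unsigned int rem;
	int u = 0;

	/* Divide by 1024 at least once, until fewer than four digits remain. */
	do {
		u++;
		rem = (unsigned int)(q & 1023);
		q >>= 10;
	} while (q >= 1000);

	/* rem * 10 / 1024 is always a single digit in 0..9. */
	return snprintf(buf, size, "%s%" PRIu64 ".%u%c",
			v < 0 ? "-" : "", q, rem * 10 / 1024, units[u]);
}
```

For example, 1536 formats as `1.5k` and -2048 as `-2.0k`. A remainder of 1000 (as for 1024000) yields the digit 9, whereas dividing the remainder by 100 would have produced the two-character "10".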
      Signed-off-by: Michael Lyle <mlyle@lyle.org>
      Reported-by: Dmitry Yu Okunev <dyokunev@ut.mephi.ru>
      Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
      Reviewed-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bcache: fix for gc and write-back race · f522051a
      Tang Junhui authored
      commit 9baf3097 upstream.
      
      gc and write-back can race (see the email "bcache get stucked" I sent
      before):
      gc thread                               write-back thread
      |                                       |bch_writeback_thread()
      |bch_gc_thread()                        |
      |                                       |==>read_dirty()
      |==>bch_btree_gc()                      |
      |==>btree_root() //get btree root       |
      |                //node write locker    |
      |==>bch_btree_gc_root()                 |
      |                                       |==>read_dirty_submit()
      |                                       |==>write_dirty()
      |                                       |==>continue_at(cl,
      |                                       |               write_dirty_finish,
      |                                       |               system_wq);
      |                                       |==>write_dirty_finish()//execute
      |                                       |               //in system_wq
      |                                       |==>bch_btree_insert()
      |                                       |==>bch_btree_map_leaf_nodes()
      |                                       |==>__bch_btree_map_nodes()
      |                                       |==>btree_root //try to get btree
      |                                       |              //root node read
      |                                       |              //lock
      |                                       |-----stuck here
      |==>bch_btree_set_root()
      |==>bch_journal_meta()
      |==>bch_journal()
      |==>journal_try_write()
      |==>journal_write_unlocked() //journal_full(&c->journal)
      |                            //condition satisfied
      |==>continue_at(cl, journal_write, system_wq); //try to execute
      |                               //journal_write in system_wq
      |                               //but work queue is executing
      |                               //write_dirty_finish()
      |==>closure_sync(); //wait journal_write execute
      |                   //over and wake up gc,
      |-------------stuck here
      |==>release root node write locker
      
      This patch allocates a separate work-queue for the write-back thread to
      avoid such a race.
      
      (Commit log re-organized by Coly Li to pass checkpatch.pl checking)
      Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
      Acked-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bcache: Correct return value for sysfs attach errors · a6c5e7a0
      Tony Asleson authored
      commit 77fa100f upstream.
      
      If bch_cached_dev_attach encounters an error, it returns a negative
      error code.  The variable 'v' which stores the result is unsigned, so
      user space sees a very large value returned for bytes written, which can
      cause incorrect user space behavior.  Use one signed variable throughout
      the function to preserve the error return capability.
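
The effect of storing a negative error code in an unsigned variable can be shown with a small sketch. All names here (`attach_sketch()`, `store_buggy()`, `store_fixed()`, `SKETCH_EINVAL`) are hypothetical and only model the pattern, not the bcache sysfs handler itself.

```c
#include <stddef.h>
#include <sys/types.h>	/* ssize_t (POSIX) */

#define SKETCH_EINVAL 22

/* Hypothetical attach step that can fail with a negative error code. */
static int attach_sketch(int should_fail)
{
	return should_fail ? -SKETCH_EINVAL : 0;
}

/* Buggy pattern: the error ends up in an unsigned variable. */
static size_t store_buggy(int should_fail, size_t size)
{
	size_t v = size;

	if (should_fail)
		v = (size_t)attach_sketch(1);	/* -22 wraps to a huge value */
	return v;
}

/* Fixed pattern: one signed variable carries byte counts and errors. */
static ssize_t store_fixed(int should_fail, size_t size)
{
	ssize_t v = attach_sketch(should_fail);

	return v ? v : (ssize_t)size;
}
```

With the signed version the caller can still test for a negative result, while the unsigned version reports what looks like an enormous number of bytes written.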
      Signed-off-by: Tony Asleson <tasleson@redhat.com>
      Acked-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bcache: correct cache_dirty_target in __update_writeback_rate() · d9c6a28a
      Tang Junhui authored
      commit a8394090 upstream.
      
      __update_writeback_rate() uses a Proportion-Differentiation Controller
      algorithm to control writeback rate. A dirty target number is used in
      this PD controller to control writeback rate. A larger target number
      will make the writeback rate smaller; conversely, a smaller target
      number will make the writeback rate larger.
      
      bcache uses the following steps to calculate the target number,
      1) cache_sectors = all-buckets-of-cache-set * buckets-size
      2) cache_dirty_target = cache_sectors * cached-device-writeback_percent
      3) target = cache_dirty_target *
      (sectors-of-cached-device/sectors-of-all-cached-devices-of-this-cache-set)
      
      The calculation at step 1) for cache_sectors is incorrect: it does
      not consider dirty blocks occupied by flash only volumes.
      
      A flash only volume can be regarded as a bcache device without a cached
      device. All data sectors allocated for it are persistent on the cache
      device and marked dirty; they are not touched by the bcache writeback and
      garbage collection code. So data blocks of flash only volumes should be
      ignored when calculating cache_sectors of the cache set.
      
      The current code does not subtract dirty sectors of flash only volumes,
      which results in a larger target number from the above 3 steps.
      Consequently the cache device's writeback rate is smaller than the
      correct value, and writeback is slower on all cached devices.
      
      This patch fixes the incorrect slower writeback rate by subtracting
      dirty sectors of flash only volumes in __update_writeback_rate().
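
The three steps, with the flash only correction applied in step 1, can be worked through with a small sketch. The function and parameter names are hypothetical, `writeback_percent` is assumed to be a percentage (hence the division by 100), and the caller is assumed to pass a `flash_only_dirty` value no larger than the total cache size.

```c
#include <stdint.h>

/*
 * Sketch of the dirty target calculation; all sizes are in sectors.
 * Assumes flash_only_dirty <= nbuckets * bucket_size.
 */
static uint64_t dirty_target_sketch(uint64_t nbuckets, uint64_t bucket_size,
				    uint64_t flash_only_dirty,
				    unsigned int writeback_percent,
				    uint64_t dev_sectors,
				    uint64_t all_cached_dev_sectors)
{
	/* Step 1, corrected: exclude sectors pinned by flash only volumes. */
	uint64_t cache_sectors = nbuckets * bucket_size - flash_only_dirty;
	/* Step 2: share of the cache that may hold dirty data. */
	uint64_t cache_dirty_target = cache_sectors * writeback_percent / 100;

	/* Step 3: this device's share, proportional to its size. */
	if (all_cached_dev_sectors == 0)
		return 0;
	return cache_dirty_target * dev_sectors / all_cached_dev_sectors;
}
```

As a worked example, 1000 buckets of 1024 sectors with 24000 flash only dirty sectors, a 10 percent setting and a device holding half of all cached sectors give a target of 50000 sectors. Without the subtraction the same inputs give 51200, the larger target that slows writeback.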
      
      (Commit log composed by Coly Li to pass checkpatch.pl checking)
      Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
      Reviewed-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bcache: do not subtract sectors_to_gc for bypassed IO · 0471f58e
      Tang Junhui authored
      commit 69daf03a upstream.
      
      Since bypassed IOs use no bucket, do not subtract sectors_to_gc for
      them, as doing so would trigger the gc thread.
      Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
      Acked-by: Coly Li <colyli@suse.de>
      Reviewed-by: Eric Wheeler <bcache@linux.ewheeler.net>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bcache: Fix leak of bdev reference · 093457f2
      Jan Kara authored
      commit 4b758df2 upstream.
      
      If blkdev_get_by_path() in register_bcache() fails, we try to lookup the
      block device using lookup_bdev() to detect which situation we are in to
      properly report error. However we never drop the reference returned to
      us from lookup_bdev(). Fix that.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Acked-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bcache: initialize dirty stripes in flash_dev_run() · 5025da3b
      Tang Junhui authored
      commit 175206cf upstream.
      
      bcache uses a Proportion-Differentiation Controller algorithm to control
      writeback rate to cached devices. In the PD controller algorithm, dirty
      stripes of thin flash device should not be counted in, because flash only
      volumes never write back dirty data.
      
      Currently the dirty stripe counter for a thin flash device is not
      initialized when the thin flash device starts. This means the following
      calculation in the PD controller will reference an undefined dirty
      stripes number, and all cached devices attached to the same cache set on
      which the thin flash device lies may have an inaccurate writeback rate.
      
      This patch calls bch_sectors_dirty_init() in flash_dev_run() to
      correctly initialize the dirty stripe counter when the thin flash device
      starts to run. This patch also makes the following parameter data type change,
       -void bch_sectors_dirty_init(struct cached_dev *dc);
       +void bch_sectors_dirty_init(struct bcache_device *);
      to call this function conveniently in flash_dev_run().
      
      (Commit log is composed by Coly Li)
      Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
      Reviewed-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • media: uvcvideo: Prevent heap overflow when accessing mapped controls · 4931578f
      Guenter Roeck authored
      commit 7e09f7d5 upstream.
      
      The size of uvc_control_mapping is user controlled leading to a
      potential heap overflow in the uvc driver. This adds a check to verify
      the user provided size fits within the bounds of the defined buffer
      size.
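
A bounds check of this kind might be sketched as below. The structure, the fixed buffer size and `mapping_fits()` are hypothetical and only illustrate validating a user-supplied offset and size against a buffer; they are not the uvc driver's definitions.

```c
#define SKETCH_CTRL_DATA_SIZE 8	/* hypothetical control buffer size, bytes */

/* Hypothetical mapping: a bit field inside the control's data buffer. */
struct mapping_sketch {
	unsigned int offset_bits;
	unsigned int size_bits;
};

/* Return 1 if the mapped bit range fits inside the buffer, else 0. */
static int mapping_fits(const struct mapping_sketch *m)
{
	const unsigned int total_bits = SKETCH_CTRL_DATA_SIZE * 8;

	if (m->size_bits == 0 || m->size_bits > total_bits)
		return 0;
	/* Written as a subtraction so offset + size cannot wrap around. */
	return m->offset_bits <= total_bits - m->size_bits;
}
```

A mapping that fails this test would be rejected before any data is copied through it.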
      
      Originally-from: Richard Simmons <rssimmo@amazon.com>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • media: v4l2-compat-ioctl32: Fix timespec conversion · 04affe4e
      Daniel Mentz authored
      commit 9c7ba1d7 upstream.
      
      Certain syscalls like recvmmsg support 64 bit timespec values for the
      X32 ABI. The helper function compat_put_timespec converts a timespec
      value to a 32 bit or 64 bit value depending on what ABI is used. The
      v4l2 compat layer, however, is not designed to support 64 bit timespec
      values and always uses 32 bit values. Hence, compat_put_timespec must
      not be used.
      
      Without this patch, user space will be provided with bad timestamp
      values from the VIDIOC_DQEVENT ioctl. Also, fields of the struct
      v4l2_event32 that come immediately after timestamp get overwritten,
      namely the field named id.
      
      Fixes: 81993e81 ("compat: Get rid of (get|put)_compat_time(val|spec)")
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
      Cc: Tiffany Lin <tiffany.lin@mediatek.com>
      Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
      Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
      Signed-off-by: Daniel Mentz <danielmentz@google.com>
      Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PCI: shpchp: Enable bridge bus mastering if MSI is enabled · 7498bd60
      Aleksandr Bezzubikov authored
      commit 48b79a14 upstream.
      
      An SHPC may generate MSIs to notify software about slot or controller
      events (SHPC spec r1.0, sec 4.7).  A PCI device can only generate an MSI if
      it has bus mastering enabled.
      
      Enable bus mastering if the bridge contains an SHPC that uses MSI for event
      notifications.
      Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
      [bhelgaas: changelog]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARC: Re-enable MMU upon Machine Check exception · 81306fc3
      Jose Abreu authored
      commit 1ee55a8f upstream.
      
      I recently came upon a scenario where I would get a double fault
      machine check exception triggered by a kernel module.
      However the ensuing crash stacktrace (ksym lookup) was not working
      correctly.
      
      It turns out that a machine check auto-disables the MMU, while modules are
      allocated in kernel vaddr space.
      
      This patch re-enables the MMU before starting to print the stacktrace,
      making stacktracing of modules work upon a fatal exception.
      Signed-off-by: Jose Abreu <joabreu@synopsys.com>
      Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      [vgupta: moved code into low level handler to avoid in 2 places]
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Apply trace_clock changes to instance max buffer · d28e96be
      Baohong Liu authored
      commit 170b3b10 upstream.
      
      Currently trace_clock timestamps are applied to both regular and max
      buffers only for global trace. For instance trace, trace_clock
      timestamps are applied only to regular buffer. But, regular and max
      buffers can be swapped, for example, following a snapshot. So, for
      instance trace, bad timestamps can be seen following a snapshot.
      Let's apply trace_clock timestamps to instance max buffer as well.
      
      Link: http://lkml.kernel.org/r/ebdb168d0be042dcdf51f81e696b17fabe3609c1.1504642143.git.tom.zanussi@linux.intel.com
      
      Fixes: 277ba044 ("tracing: Add interface to allow multiple trace buffers")
      Signed-off-by: Baohong Liu <baohong.liu@intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ftrace: Fix selftest goto location on error · 753154fc
      Steven Rostedt (VMware) authored
      commit 46320a6a upstream.
      
      In the second iteration of trace_selftest_ops(), the error goto label is
      wrong in the case where trace_selftest_test_global_cnt is off. In the
      case of error, it leaks the dynamic ops that was allocated.
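
The bug class can be shown with a generic sketch of goto-based cleanup. This is not the selftest code: selftest_iteration, tracked_alloc, tracked_free and live_allocs are hypothetical names used only to make the leak observable.

```c
#include <stdlib.h>

/* Counts live allocations so a leak would be visible (illustration only). */
static int live_allocs;

static void *tracked_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* Hypothetical stand-in for one selftest iteration. */
static int selftest_iteration(int counter_is_off)
{
	int ret = -1;
	void *dyn_ops = tracked_alloc(64);

	if (!dyn_ops)
		goto out;

	if (counter_is_off)
		goto out_free;	/* jumping to "out" here would leak dyn_ops */

	ret = 0;
out_free:
	tracked_free(dyn_ops);
out:
	return ret;
}
```

Every error exit taken after the allocation has to land on the label that frees it; exits taken before the allocation may skip that label.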
      
      Fixes: 95950c2e ("ftrace: Add self-tests for multiple function trace users")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      753154fc
    • Dan Carpenter's avatar
      scsi: qla2xxx: Fix an integer overflow in sysfs code · d8663aa2
      Dan Carpenter authored
      commit e6f77540 upstream.
      
      The value of "size" comes from the user.  When we add "start + size" it
      could lead to an integer overflow bug.
      
      It means we vmalloc() a lot more memory than we had intended.  I believe
      that on 64 bit systems vmalloc() can succeed even if we ask it to
      allocate huge 4GB buffers.  So we would get memory corruption and likely
      a crash when we call ha->isp_ops->write_optrom() and ->read_optrom().
      
      Only root can trigger this bug.
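
A minimal sketch of the overflow and of one overflow-safe way to bound the request follows. The function names and the 32-bit types are assumptions for illustration and are not taken from the driver; the sketch only demonstrates why "start + size" must not be compared directly against the region size.

```c
#include <stdint.h>
#include <stdbool.h>

/* Naive check: the sum wraps modulo 2^32 and can pass by accident. */
static bool naive_end_ok(uint32_t start, uint32_t size, uint32_t flash_size)
{
	return (uint32_t)(start + size) <= flash_size;
}

/*
 * Overflow-safe variant: reject a start beyond the region, then compare
 * size against the space that is left instead of forming start + size.
 * Returns the number of bytes that may actually be accessed.
 */
static uint32_t clamp_size(uint32_t start, uint32_t size, uint32_t flash_size)
{
	uint32_t remaining;

	if (start > flash_size)
		return 0;

	remaining = flash_size - start;	/* cannot underflow after the check */
	return size > remaining ? remaining : size;
}
```

With start = 0x1000 and size = 0xFFFFF000 the naive sum wraps to 0 and is accepted, while clamp_size() limits the request to the space left in the region.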
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=194061
      
      Fixes: b7cc176c ("[SCSI] qla2xxx: Allow region-based flash-part accesses.")
      Reported-by: shqking <shqking@gmail.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d8663aa2
    • Hannes Reinecke's avatar
      scsi: sg: fixup infoleak when using SG_GET_REQUEST_TABLE · 72896ca3
      Hannes Reinecke authored
      commit 3e009749 upstream.
      
      When calling SG_GET_REQUEST_TABLE ioctl only a half-filled table is
      returned; the remaining part will then contain stale kernel memory
      information.  This patch zeroes out the entire table to avoid this
      issue.
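
The idea of zeroing before filling can be sketched as follows. struct req_info and fill_table are hypothetical and much simpler than the sg driver's structures; the sketch only shows that slots which are never written must not keep whatever the memory held before.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical, simplified table entry. */
struct req_info {
	int in_use;
	int pack_id;
	unsigned int duration;
};

/*
 * Fill a fixed-size table from a shorter list of active requests.
 * The memset() guarantees that unused slots read back as zero rather
 * than as stale memory contents.
 */
static void fill_table(struct req_info *table, size_t slots,
		       const int *active_ids, size_t n_active)
{
	size_t i;

	memset(table, 0, slots * sizeof(*table));

	for (i = 0; i < n_active && i < slots; i++) {
		table[i].in_use = 1;
		table[i].pack_id = active_ids[i];
	}
}
```
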
      Signed-off-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      72896ca3
    • Hannes Reinecke's avatar
      scsi: sg: factor out sg_fill_request_table() · c04996ad
      Hannes Reinecke authored
      commit 4759df90 upstream.
      
      Factor out sg_fill_request_table() for better readability.
      
      [mkp: typos, applied by hand]
      Signed-off-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c04996ad
    • Dan Carpenter's avatar
      scsi: sg: off by one in sg_ioctl() · f0cd701d
      Dan Carpenter authored
      commit bd46fc40 upstream.
      
      If "val" is SG_MAX_QUEUE then we are one element beyond the end of the
      "rinfo" array so the > should be >=.
      
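The comparison can be checked in isolation. MAX_QUEUE below is an illustrative constant standing in for SG_MAX_QUEUE, and the two helper names are hypothetical; valid indices into an array of MAX_QUEUE elements are 0 .. MAX_QUEUE - 1.

```c
#include <stdbool.h>

#define MAX_QUEUE 16	/* illustrative stand-in for SG_MAX_QUEUE */

/* Buggy bound check: lets val == MAX_QUEUE through. */
static bool rejects_buggy(int val)
{
	return val > MAX_QUEUE;
}

/* Fixed bound check: val == MAX_QUEUE is one past the last element. */
static bool rejects_fixed(int val)
{
	return val >= MAX_QUEUE;
}
```
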
      Fixes: 109bade9 ("scsi: sg: use standard lists for sg_requests")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f0cd701d
    • Hannes Reinecke's avatar
      scsi: sg: use standard lists for sg_requests · 3682e0c6
      Hannes Reinecke authored
      commit 109bade9 upstream.
      
      'Sg_request' is using a private list implementation; convert it to
      standard lists.
      Signed-off-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3682e0c6
    • Long Li's avatar
      scsi: storvsc: fix memory leak on ring buffer busy · cf22210c
      Long Li authored
      commit 0208eeaa upstream.
      
      When storvsc is sending I/O to Hyper-v, it may allocate a bigger buffer
      descriptor for large data payload that can't fit into a pre-allocated
      buffer descriptor. This bigger buffer is freed on return path.
      
      If I/O request to Hyper-v fails due to ring buffer busy, the storvsc
      allocated buffer descriptor should also be freed.
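
A rough sketch of the described ownership rule, under stated assumptions: struct request, INLINE_DESC_BYTES, send_stub and submit are hypothetical names, and the real driver's data structures and send path differ. The sketch only shows that a descriptor allocated for a large payload has to be released when the send itself fails, because no completion will arrive to free it on the return path.

```c
#include <stdlib.h>
#include <stddef.h>

#define INLINE_DESC_BYTES 64	/* hypothetical pre-allocated descriptor size */

struct request {
	unsigned char inline_desc[INLINE_DESC_BYTES];
	void *big_desc;		/* only set when the payload does not fit */
};

/* Stand-in for the send; fails when the ring buffer is busy. */
static int send_stub(int ring_busy)
{
	return ring_busy ? -1 : 0;
}

static int submit(struct request *rq, size_t desc_bytes, int ring_busy)
{
	int ret;

	rq->big_desc = NULL;
	if (desc_bytes > sizeof(rq->inline_desc)) {
		rq->big_desc = malloc(desc_bytes);
		if (!rq->big_desc)
			return -1;
	}

	ret = send_stub(ring_busy);
	if (ret != 0) {
		/* No completion will run for this request: free it here. */
		free(rq->big_desc);
		rq->big_desc = NULL;
	}
	return ret;
}
```

On success the descriptor stays attached to the request and is expected to be freed by whoever handles the completion.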
      
      [mkp: applied by hand]
      
      Fixes: be0cf6ca ("scsi: storvsc: Set the tablesize based on the information given by the host")
      Signed-off-by: Long Li <longli@microsoft.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf22210c
    • Shivasharan S's avatar
      scsi: megaraid_sas: Return pended IOCTLs with cmd_status MFI_STAT_WRONG_STATE... · b4730f45
      Shivasharan S authored
      scsi: megaraid_sas: Return pended IOCTLs with cmd_status MFI_STAT_WRONG_STATE in case adapter is dead
      
      commit eb3fe263 upstream.
      
      After a kill adapter, since the cmd_status is not set, the IOCTLs will
      be hung in driver resulting in application hang.  Set cmd_status
      MFI_STAT_WRONG_STATE when completing pended IOCTLs.
      Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
      Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Tomas Henzl <thenzl@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      b4730f45
    • Steffen Maier's avatar
      scsi: zfcp: trace high part of "new" 64 bit SCSI LUN · 4dd6cbbc
      Steffen Maier authored
      commit 5d4a3d0a upstream.
      
      Complements debugging aspects of the otherwise functionally complete
      v3.17 commit 9cb78c16 ("scsi: use 64-bit LUNs").
      
      While I don't have access to a target exporting 3 or 4 level LUNs,
      I did test it by explicitly attaching a non-existent fake 4 level LUN
      by means of zfcp sysfs attribute "unit_add".
      In order to see corresponding trace records of otherwise successful
      events, we had to increase the trace level of area SCSI and HBA to 6.
      
      $ echo 6 > /sys/kernel/debug/s390dbf/zfcp_0.0.1880_scsi/level
      $ echo 6 > /sys/kernel/debug/s390dbf/zfcp_0.0.1880_hba/level
      
      $ echo 0x4011402240334044 > \
        /sys/bus/ccw/drivers/zfcp/0.0.1880/0x50050763031bd327/unit_add
      
      Example output formatted by an updated zfcpdbf from the s390-tools
      package interspersed with kernel messages at scsi_logging_level=4605:
      
      Timestamp      : ...
      Area           : REC
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : scsla_1
      LUN            : 0x4011402240334044
      WWPN           : 0x50050763031bd327
      D_ID           : 0x00......
      Adapter status : 0x5400050b
      Port status    : 0x54000001
      LUN status     : 0x41000000
      Ready count    : 0x00000001
      Running count  : 0x00000000
      ERP want       : 0x01
      ERP need       : 0x01
      
      scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY pass 1 length 36
      scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY successful with code 0x0
      
      Timestamp      : ...
      Area           : HBA
      Subarea        : 00
      Level          : 6
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : fs_norm
      Request ID     : 0x<inquiry2-req-id>
      Request status : 0x00000010
      FSF cmnd       : 0x00000001
      FSF sequence no: 0x...
      FSF issued     : ...
      FSF stat       : 0x00000000
      FSF stat qual  : 00000000 00000000 00000000 00000000
      Prot stat      : 0x00000001
      Prot stat qual : ........ ........ 00000000 00000000
      Port handle    : 0x...
      LUN handle     : 0x...
      |
      Timestamp      : ...
      Area           : SCSI
      Subarea        : 00
      Level          : 6
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : rsl_nor
      Request ID     : 0x<inquiry2-req-id>
      SCSI ID        : 0x00000000
      SCSI LUN       : 0x40224011
      SCSI LUN high  : 0x40444033 <=======================
      SCSI result    : 0x00000000
      SCSI retries   : 0x00
      SCSI allowed   : 0x03
      SCSI scribble  : 0x<inquiry2-req-id>
      SCSI opcode    : 12000000 a4000000 00000000 00000000
      FCP rsp inf cod: 0x00
      FCP rsp IU     : 00000000 00000000 00000000 00000000
                       00000000 00000000
      
      scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY pass 2 length 164
      scsi 2:0:0:4630896905707208721: scsi scan: INQUIRY successful with code 0x0
      scsi 2:0:0:4630896905707208721: scsi scan: peripheral device type of 31, \
      no device added
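
The numbers in this example are related as follows: the 8-byte FCP LUN written to unit_add, the decimal LUN in the kernel messages, and the two 32-bit halves in the SCSI trace record. The sketch below packs the LUN bytes two at a time, lowest addressing level first, which is believed to correspond to what the kernel's scsilun_to_int() does; the function names used here are hypothetical.

```c
#include <stdint.h>

/*
 * Pack the 8 LUN bytes into one integer, 16 bits per addressing level,
 * with the first level ending up in the least significant 16 bits.
 */
static uint64_t lun_bytes_to_int(const uint8_t b[8])
{
	uint64_t lun = 0;
	int i;

	for (i = 0; i < 8; i += 2)
		lun |= ((uint64_t)b[i] << ((i + 1) * 8)) |
		       ((uint64_t)b[i + 1] << (i * 8));
	return lun;
}

/* Low part, as shown in the trace field "SCSI LUN". */
static uint32_t lun_low(uint64_t lun)
{
	return (uint32_t)lun;
}

/* High part, as shown in the new trace field "SCSI LUN high". */
static uint32_t lun_high(uint64_t lun)
{
	return (uint32_t)(lun >> 32);
}
```

For the bytes 40 11 40 22 40 33 40 44 this yields 0x4044403340224011, which is 4630896905707208721 in decimal, with a low part of 0x40224011 and a high part of 0x40444033.
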
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 9cb78c16 ("scsi: use 64-bit LUNs")
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Jens Remus <jremus@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4dd6cbbc
    • Steffen Maier's avatar
      scsi: zfcp: trace HBA FSF response by default on dismiss or timedout late response · 1e6c640a
      Steffen Maier authored
      commit fdb7cee3 upstream.
      
      At the default trace level, we only trace unsuccessful events including
      FSF responses.
      
      zfcp_dbf_hba_fsf_response() only used protocol status and FSF status to
      decide on an unsuccessful response. However, this is only one of multiple
      possible sources determining a failed struct zfcp_fsf_req.
      
      An FSF request can also "fail" if its response runs into an ERP timeout
      or if it gets dismissed because a higher level recovery was triggered
      [trace tags "erscf_1" or "erscf_2" in zfcp_erp_strategy_check_fsfreq()].
      FSF requests with ERP timeout are:
      FSF_QTCB_EXCHANGE_CONFIG_DATA, FSF_QTCB_EXCHANGE_PORT_DATA,
      FSF_QTCB_OPEN_PORT_WITH_DID or FSF_QTCB_CLOSE_PORT or
      FSF_QTCB_CLOSE_PHYSICAL_PORT for target ports,
      FSF_QTCB_OPEN_LUN, FSF_QTCB_CLOSE_LUN.
      One example is slow queue processing which can cause follow-on errors,
      e.g. FSF_PORT_ALREADY_OPEN after FSF_QTCB_OPEN_PORT_WITH_DID timed out.
      In order to see the root cause, we need to see late responses even if the
      channel presented them successfully with FSF_PROT_GOOD and FSF_GOOD.
      Example trace records formatted with zfcpdbf from the s390-tools package:
      
      Timestamp      : ...
      Area           : REC
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : ...
      Record ID      : 1
      Tag            : fcegpf1
      LUN            : 0xffffffffffffffff
      WWPN           : 0x<WWPN>
      D_ID           : 0x00<D_ID>
      Adapter status : 0x5400050b
      Port status    : 0x41200000
      LUN status     : 0x00000000
      Ready count    : 0x00000001
      Running count  : 0x...
      ERP want       : 0x02				ZFCP_ERP_ACTION_REOPEN_PORT
      ERP need       : 0x02				ZFCP_ERP_ACTION_REOPEN_PORT
      |
      Timestamp      : ...				30 seconds later
      Area           : REC
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : ...
      Record ID      : 2
      Tag            : erscf_2
      LUN            : 0xffffffffffffffff
      WWPN           : 0x<WWPN>
      D_ID           : 0x00<D_ID>
      Adapter status : 0x5400050b
      Port status    : 0x41200000
      LUN status     : 0x00000000
      Request ID     : 0x<request_ID>
      ERP status     : 0x10000000			ZFCP_STATUS_ERP_TIMEDOUT
      ERP step       : 0x0800				ZFCP_ERP_STEP_PORT_OPENING
      ERP action     : 0x02				ZFCP_ERP_ACTION_REOPEN_PORT
      ERP count      : 0x00
      |
      Timestamp      : ...				later than previous record
      Area           : HBA
      Subarea        : 00
      Level          : 5	> default level		=> 3	<= default level
      Exception      : -
      CPU ID         : 00
      Caller         : ...
      Record ID      : 1
      Tag            : fs_qtcb			=> fs_rerr
      Request ID     : 0x<request_ID>
      Request status : 0x00001010			ZFCP_STATUS_FSFREQ_DISMISSED
      						| ZFCP_STATUS_FSFREQ_CLEANUP
      FSF cmnd       : 0x00000005
      FSF sequence no: 0x...
      FSF issued     : ...				> 30 seconds ago
      FSF stat       : 0x00000000			FSF_GOOD
      FSF stat qual  : 00000000 00000000 00000000 00000000
      Prot stat      : 0x00000001			FSF_PROT_GOOD
      Prot stat qual : 00000000 00000000 00000000 00000000
      Port handle    : 0x...
      LUN handle     : 0x00000000
      QTCB log length: ...
      QTCB log info  : ...
      
      In case of problems detecting that new responses are waiting on the input
      queue, we sooner or later trigger adapter recovery due to an FSF request
      timeout (trace tag "fsrth_1").
      FSF requests with FSF request timeout are:
      typically FSF_QTCB_ABORT_FCP_CMND; but theoretically also
      FSF_QTCB_EXCHANGE_CONFIG_DATA or FSF_QTCB_EXCHANGE_PORT_DATA via sysfs,
      FSF_QTCB_OPEN_PORT_WITH_DID or FSF_QTCB_CLOSE_PORT for WKA ports,
      FSF_QTCB_FCP_CMND for task management function (LUN / target reset).
      One or more pending requests can meanwhile have FSF_PROT_GOOD and FSF_GOOD
      because the channel filled in the response via DMA into the request's QTCB.
      
      In a theoretical case, inject code can create an erroneous FSF request
      on purpose. If data router is enabled, it uses deferred error reporting.
      A READ SCSI command can succeed with FSF_PROT_GOOD, FSF_GOOD, and
      SAM_STAT_GOOD. But on writing the read data to host memory via DMA,
      it can still fail, e.g. if an intentionally wrong scatter list does not
      provide enough space. Rather than getting an unsuccessful response,
      we get a QDIO activate check which in turn triggers adapter recovery.
      One or more pending requests can meanwhile have FSF_PROT_GOOD and FSF_GOOD
      because the channel filled in the response via DMA into the request's QTCB.
      Example trace records formatted with zfcpdbf from the s390-tools package:
      
      Timestamp      : ...
      Area           : HBA
      Subarea        : 00
      Level          : 6	> default level		=> 3	<= default level
      Exception      : -
      CPU ID         : ..
      Caller         : ...
      Record ID      : 1
      Tag            : fs_norm			=> fs_rerr
      Request ID     : 0x<request_ID2>
      Request status : 0x00001010			ZFCP_STATUS_FSFREQ_DISMISSED
      						| ZFCP_STATUS_FSFREQ_CLEANUP
      FSF cmnd       : 0x00000001
      FSF sequence no: 0x...
      FSF issued     : ...
      FSF stat       : 0x00000000			FSF_GOOD
      FSF stat qual  : 00000000 00000000 00000000 00000000
      Prot stat      : 0x00000001			FSF_PROT_GOOD
      Prot stat qual : ........ ........ 00000000 00000000
      Port handle    : 0x...
      LUN handle     : 0x...
      |
      Timestamp      : ...
      Area           : SCSI
      Subarea        : 00
      Level          : 3
      Exception      : -
      CPU ID         : ..
      Caller         : ...
      Record ID      : 1
      Tag            : rsl_err
      Request ID     : 0x<request_ID2>
      SCSI ID        : 0x...
      SCSI LUN       : 0x...
      SCSI result    : 0x000e0000			DID_TRANSPORT_DISRUPTED
      SCSI retries   : 0x00
      SCSI allowed   : 0x05
      SCSI scribble  : 0x<request_ID2>
      SCSI opcode    : 28...				Read(10)
      FCP rsp inf cod: 0x00
      FCP rsp IU     : 00000000 00000000 00000000 00000000
                                               ^^	SAM_STAT_GOOD
                       00000000 00000000
      
      Only with luck in both above cases, we could see a follow-on trace record
      of an unsuccessful event following a successful but late FSF response with
      FSF_PROT_GOOD and FSF_GOOD. Typically this was the case for I/O requests
      resulting in a SCSI trace record "rsl_err" with DID_TRANSPORT_DISRUPTED
      [On ZFCP_STATUS_FSFREQ_DISMISSED, zfcp_fsf_protstatus_eval() sets
      ZFCP_STATUS_FSFREQ_ERROR seen by the request handler functions as failure].
      However, the reason for this follow-on trace was invisible because the
      corresponding HBA trace record was missing at the default trace level
      (by default hidden records with tags "fs_norm", "fs_qtcb", or "fs_open").
      
      On adapter recovery, after we had shut down the QDIO queues, we perform
      unsuccessful pseudo completions with flag ZFCP_STATUS_FSFREQ_DISMISSED
      for each pending FSF request in zfcp_fsf_req_dismiss_all().
      In order to find the root cause, we need to see all pseudo responses even
      if the channel presented them successfully with FSF_PROT_GOOD and FSF_GOOD.
      
      Therefore, check zfcp_fsf_req.status for ZFCP_STATUS_FSFREQ_DISMISSED
      or ZFCP_STATUS_FSFREQ_ERROR and trace with a new tag "fs_rerr".
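
A simplified model of that decision, as a sketch only: pick_trace and struct trace_choice are hypothetical, the branches for "fs_perr", "fs_ferr" and "fs_open" are left out, and the numeric value used for the error flag is an assumption. The dismissed bit is inferred from the request status 0x00001010 in the records above, and the levels 3, 5 and 6 are the ones shown there.

```c
#include <stdint.h>

/* Inferred from "Request status : 0x00001010" (DISMISSED | CLEANUP). */
#define REQ_DISMISSED	0x00001000u
/* Assumed value, for illustration only. */
#define REQ_ERROR	0x00000008u

struct trace_choice {
	const char *tag;
	int level;
};

/*
 * Look at the request status first, so that a dismissed or failed request
 * is traced at a level visible by default even if the channel filled in a
 * good protocol and FSF status.
 */
static struct trace_choice pick_trace(uint32_t req_status, uint32_t qtcb_log_len)
{
	struct trace_choice c;

	if (req_status & (REQ_DISMISSED | REQ_ERROR)) {
		c.tag = "fs_rerr";
		c.level = 3;
	} else if (qtcb_log_len) {
		c.tag = "fs_qtcb";
		c.level = 5;
	} else {
		c.tag = "fs_norm";
		c.level = 6;
	}
	return c;
}
```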
      
      It does not matter that there are numerous places which set
      ZFCP_STATUS_FSFREQ_ERROR after the location where we trace an FSF response
      early. These cases are based on protocol status != FSF_PROT_GOOD or
      == FSF_PROT_FSF_STATUS_PRESENTED and are thus already traced by default
      as trace tag "fs_perr" or "fs_ferr" respectively.
      
      NB: The trace record with tag "fssrh_1" for status read buffers on dismiss
      all remains. zfcp_fsf_req_complete() handles this and returns early.
      All other FSF request types are handled separately and as described above.
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 8a36e453 ("[SCSI] zfcp: enhancement of zfcp debug features")
      Fixes: 2e261af8 ("[SCSI] zfcp: Only collect FSF/HBA debug data for matching trace levels")
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e6c640a
    • Steffen Maier's avatar
      scsi: zfcp: fix payload with full FCP_RSP IU in SCSI trace records · 71948224
      Steffen Maier authored
      commit 12c3e575 upstream.
      
      If the FCP_RSP IU has optional parts (FCP_SNS_INFO or FCP_RSP_INFO) and
      thus does not fit into the fcp_rsp field built into a SCSI trace record,
      trace the full FCP_RSP IU with all optional parts as payload record
      instead of just FCP_SNS_INFO as payload and
      a 1 byte RSP_INFO_CODE part of FCP_RSP_INFO built into the SCSI record.
      
      That way we would also get the full FCP_SNS_INFO in case a
      target would ever send more than
      min(SCSI_SENSE_BUFFERSIZE==96, ZFCP_DBF_PAY_MAX_REC==256)==96.
      
      The mandatory part of FCP_RSP IU is only 24 bytes.
      PAYload costs at least one full PAY record of 256 bytes anyway.
      We cap to the hardware response size which is only FSF_FCP_RSP_SIZE==128.
      So we can just put the whole FCP_RSP IU with any optional parts into
      PAYload similarly as we do for SAN PAY since v4.9 commit aceeffbb
      ("zfcp: trace full payload of all SAN records (req,resp,iels)").
      This does not cause any additional trace records wasting memory.
      
      Decoded trace records were confusing because they showed a hard-coded
      sense data length of 96 even if the FCP_RSP_IU field FCP_SNS_LEN showed
      actually less.
      
      Since the same commit, we set pl_len for SAN traces to the full length of a
      request/response even if we cap the corresponding trace.
      In contrast, here for SCSI traces we set pl_len to the pre-computed
      length of FCP_RSP IU considering SNS_LEN or RSP_LEN if valid.
      Nonetheless we trace a hardcoded payload of length FSF_FCP_RSP_SIZE==128
      if there were optional parts.
      This makes it easier for the zfcpdbf tool to format only the relevant
      part of the long FCP_RSP IU buffer. And any trailing information is still
      available in the payload trace record just in case.
      
      Rename the payload record tag from "fcp_sns" to "fcp_riu" to make the new
      content explicit to zfcpdbf which can then pick a suitable field name such
      as "FCP rsp IU all:" instead of "Sense info :".
      Also, the same zfcpdbf can still be backwards compatible with "fcp_sns".
      
      Old example trace record before this fix, formatted with the tool zfcpdbf
      from s390-tools:
      
      Timestamp      : ...
      Area           : SCSI
      Subarea        : 00
      Level          : 3
      Exception      : -
      CPU id         : ..
      Caller         : 0x...
      Record id      : 1
      Tag            : rsl_err
      Request id     : 0x<request_id>
      SCSI ID        : 0x...
      SCSI LUN       : 0x...
      SCSI result    : 0x00000002
      SCSI retries   : 0x00
      SCSI allowed   : 0x05
      SCSI scribble  : 0x<request_id>
      SCSI opcode    : 00000000 00000000 00000000 00000000
      FCP rsp inf cod: 0x00
      FCP rsp IU     : 00000000 00000000 00000202 00000000
                                             ^^==FCP_SNS_LEN_VALID
                       00000020 00000000
                       ^^^^^^^^==FCP_SNS_LEN==32
      Sense len      : 96 <==min(SCSI_SENSE_BUFFERSIZE,ZFCP_DBF_PAY_MAX_REC)
      Sense info     : 70000600 00000018 00000000 29000000
                       00000400 00000000 00000000 00000000
                       00000000 00000000 00000000 00000000<==superfluous
                       00000000 00000000 00000000 00000000<==superfluous
                       00000000 00000000 00000000 00000000<==superfluous
                       00000000 00000000 00000000 00000000<==superfluous
      
      New example trace records with this fix:
      
      Timestamp      : ...
      Area           : SCSI
      Subarea        : 00
      Level          : 3
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : rsl_err
      Request ID     : 0x<request_id>
      SCSI ID        : 0x...
      SCSI LUN       : 0x...
      SCSI result    : 0x00000002
      SCSI retries   : 0x00
      SCSI allowed   : 0x03
      SCSI scribble  : 0x<request_id>
      SCSI opcode    : a30c0112 00000000 02000000 00000000
      FCP rsp inf cod: 0x00
      FCP rsp IU     : 00000000 00000000 00000a02 00000200
                       00000020 00000000
      FCP rsp IU len : 56
      FCP rsp IU all : 00000000 00000000 00000a02 00000200
                                             ^^=FCP_RESID_UNDER|FCP_SNS_LEN_VALID
                       00000020 00000000 70000500 00000018
                       ^^^^^^^^==FCP_SNS_LEN
                                         ^^^^^^^^^^^^^^^^^
                       00000000 240000cb 00011100 00000000
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                       00000000 00000000
                       ^^^^^^^^^^^^^^^^^==FCP_SNS_INFO
      
      Timestamp      : ...
      Area           : SCSI
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : lr_okay
      Request ID     : 0x<request_id>
      SCSI ID        : 0x...
      SCSI LUN       : 0x...
      SCSI result    : 0x00000000
      SCSI retries   : 0x00
      SCSI allowed   : 0x05
      SCSI scribble  : 0x<request_id>
      SCSI opcode    : <CDB of unrelated SCSI command passed to eh handler>
      FCP rsp inf cod: 0x00
      FCP rsp IU     : 00000000 00000000 00000100 00000000
                       00000000 00000008
      FCP rsp IU len : 32
      FCP rsp IU all : 00000000 00000000 00000100 00000000
                                             ^^==FCP_RSP_LEN_VALID
                       00000000 00000008 00000000 00000000
                                ^^^^^^^^==FCP_RSP_LEN
                                         ^^^^^^^^^^^^^^^^^==FCP_RSP_INFO
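
The "FCP rsp IU len" values in the two new records can be reproduced with a short calculation: the 24 mandatory bytes plus each optional part whose length is flagged as valid. The helper below is a sketch; its name and the macro names are hypothetical, and the flag values are read off the annotated flag bytes in the records above (0x02 for a valid sense length, 0x01 for a valid response info length).

```c
#include <stdint.h>

#define RSP_LEN_VALID	0x01	/* as annotated for flags byte 0x01 above */
#define SNS_LEN_VALID	0x02	/* as annotated for flags bytes 0x02 and 0x0a above */
#define RSP_IU_BASE_LEN	24	/* mandatory part of the FCP_RSP IU */

/* Length of the FCP_RSP IU including the optional parts that are valid. */
static uint32_t fcp_rsp_iu_len(uint8_t flags, uint32_t rsp_len, uint32_t sns_len)
{
	uint32_t len = RSP_IU_BASE_LEN;

	if (flags & RSP_LEN_VALID)
		len += rsp_len;
	if (flags & SNS_LEN_VALID)
		len += sns_len;
	return len;
}
```

For the "rsl_err" record (flags 0x0a, FCP_SNS_LEN 0x20) this gives 24 + 32 = 56, and for the "lr_okay" record (flags 0x01, FCP_RSP_LEN 8) it gives 24 + 8 = 32.
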
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 250a1352 ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      71948224
    • Steffen Maier's avatar
      scsi: zfcp: fix missing trace records for early returns in TMF eh handlers · d0fbe221
      Steffen Maier authored
      commit 1a5d999e upstream.
      
      For problem determination we need to see that we were in scsi_eh
      as well as whether and why we were successful or not.
      
      The following commits introduced new early returns without adding
      a trace record:
      
      v2.6.35 commit a1dbfddd
      ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh")
      on fc_block_scsi_eh() returning != 0 which is FAST_IO_FAIL,
      
      v2.6.30 commit 63caf367
      ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp")
      on not having gotten an FSF request after the maximum number of retry
      attempts, so that no TMF could be issued and FAILED has to be returned.
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: a1dbfddd ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh")
      Fixes: 63caf367 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp")
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0fbe221
    • Steffen Maier's avatar
      scsi: zfcp: fix passing fsf_req to SCSI trace on TMF to correlate with HBA · 1a847369
      Steffen Maier authored
      commit 9fe5d2b2 upstream.
      
      Without this fix we get SCSI trace records on task management functions
      which cannot be correlated to HBA trace records because all fields
      related to the FSF request are empty (zero).
      Also, the FCP_RSP_IU is missing as well as any sense data if available.
      
      This was caused by v2.6.14 commit 8a36e453 ("[SCSI] zfcp: enhancement
      of zfcp debug features") introducing trace records for TMFs but
      hard coding NULL for a possibly existing TMF FSF request.
      The scsi_cmnd scribble is also zero or unrelated for the TMF request
      so it also could not look up a suitable FSF request from there.
      
      A broken example trace record formatted with zfcpdbf from the s390-tools
      package:
      
      Timestamp      : ...
      Area           : SCSI
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : lr_fail
      Request ID     : 0x0000000000000000
                         ^^^^^^^^^^^^^^^^ no correlation to HBA record
      SCSI ID        : 0x<scsitarget>
      SCSI LUN       : 0x<scsilun>
      SCSI result    : 0x000e0000
      SCSI retries   : 0x00
      SCSI allowed   : 0x05
      SCSI scribble  : 0x0000000000000000
      SCSI opcode    : 2a000017 3bb80000 08000000 00000000
      FCP rsp inf cod: 0x00
                         ^^ no TMF response
      FCP rsp IU     : 00000000 00000000 00000000 00000000
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                       00000000 00000000
                       ^^^^^^^^^^^^^^^^^ no interesting FCP_RSP_IU
      Sense len      : ...
      ^^^^^^^^^^^^^^^^^^^^ no sense data length
      Sense info     : ...
      ^^^^^^^^^^^^^^^^^^^^ no sense data content, even if present
      
      There are some true cases where we really do not have an FSF request:
      "rsl_fai" from zfcp_dbf_scsi_fail_send() called for early
      returns / completions in zfcp_scsi_queuecommand(),
      "abrt_or", "abrt_bl", "abrt_ru", "abrt_ar" from
      zfcp_scsi_eh_abort_handler() where we did not get as far,
      "lr_nres", "tr_nres" from zfcp_task_mgmt_function() where we're
      successful and do not need to do anything because adapter stopped.
      For these cases it's correct to pass NULL for fsf_req to _zfcp_dbf_scsi().
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 8a36e453 ("[SCSI] zfcp: enhancement of zfcp debug features")
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1a847369
    • Steffen Maier's avatar
      scsi: zfcp: fix capping of unsuccessful GPN_FT SAN response trace records · 52661717
      Steffen Maier authored
      commit 975171b4 upstream.
      
      v4.9 commit aceeffbb ("zfcp: trace full payload of all SAN records
      (req,resp,iels)") fixed trace data loss of 2.6.38 commit 2c55b750
      ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
      necessary for problem determination, e.g. to see the
      currently active zone set during automatic port scan.
      
      While it already saves space by not dumping any empty residual entries
      of the large successful GPN_FT response (4 pages), there are rare cases
      where the GPN_FT response is unsuccessful and likely does not have
      FC_NS_FID_LAST set in fp_flags so we did not cap the trace record.
      We typically see such a case for an initiator WWPN which is not in any zone.
      
      Cap unsuccessful responses to at least the actual basic CT_IU response
      plus whatever fits the SAN trace record built-in "payload" buffer
      just in case there's trailing information
      of which we would at least see the existence and its beginning.
      
      In order not to erroneously cap successful responses, we need to swap
      calling the trace function and setting the CT / ELS status to success (0).
      
      Example trace record pair formatted with zfcpdbf:
      
      Timestamp      : ...
      Area           : SAN
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 1
      Tag            : fssct_1
      Request ID     : 0x<request_id>
      Destination ID : 0x00fffffc
      SAN req short  : 01000000 fc020000 01720ffc 00000000
                       00000008
      SAN req length : 20
      |
      Timestamp      : ...
      Area           : SAN
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU ID         : ..
      Caller         : 0x...
      Record ID      : 2
      Tag            : fsscth2
      Request ID     : 0x<request_id>
      Destination ID : 0x00fffffc
      SAN resp short : 01000000 fc020000 80010000 00090700
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
      SAN resp length: 16384
      San resp info  : 01000000 fc020000 80010000 00090700
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
                       00000000 00000000 00000000 00000000 [trailing info]
      
      The fix saves all but one of the previously associated 64 PAYload trace
      record chunks of size 256 bytes each.
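
The saving can be checked with a small calculation. The sketch below uses
the 256-byte chunk size and the 16384-byte response length quoted above. The
helper name is hypothetical, and the capped length of 48 bytes is an
assumption; any capped length from 1 to 256 bytes gives the same chunk count.

```c
#include <stddef.h>

#define SKETCH_PAY_CHUNK 256u /* bytes per PAYload trace record chunk */

/* Number of chunks needed to hold traced_len bytes (rounded up). */
static size_t sketch_pay_chunks(size_t traced_len)
{
    return (traced_len + SKETCH_PAY_CHUNK - 1) / SKETCH_PAY_CHUNK;
}
```

Here sketch_pay_chunks(16384) gives 64 chunks for the uncapped response and
sketch_pay_chunks(48) gives 1 for the capped one, so 63 chunks are saved.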
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: aceeffbb ("zfcp: trace full payload of all SAN records (req,resp,iels)")
      Fixes: 2c55b750 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      52661717