1. 04 Apr, 2019 3 commits
  2. 29 Mar, 2019 1 commit
  3. 28 Mar, 2019 10 commits
    • scsi: ibmvfc: Clean up transport events · d6e2635b
      Tyrel Datwyler authored
      No change to functionality. Simply make transport event messages a little
      clearer, and rework CRQ format enums such that we have separate enums for
      INIT messages and XPORT events.
      
      [mkp: typo]
      Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: ibmvfc: Byte swap status and error codes when logging · 3e6f7de4
      Tyrel Datwyler authored
      Status and error codes are returned in big endian from the VIOS. The values
      are translated into a human-readable format when logged, but the raw values
      are logged as well. This patch byte swaps those values so that they are
      consistent between BE and LE platforms.
      Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
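
The byte swap described in this commit can be illustrated with a small userspace sketch. In the kernel the conversion is done with helpers such as be16_to_cpu(); the function name demo_be16_to_host() below is hypothetical and only stands in for that helper, it is not code from the ibmvfc driver.

```c
#include <stdint.h>

/*
 * Decode a 16-bit big-endian wire value into host byte order.
 * Assembling the value from individual bytes gives the same result on
 * big-endian and little-endian hosts, which is what makes the logged
 * number consistent across platforms.
 * (Hypothetical helper for illustration only.)
 */
static inline uint16_t demo_be16_to_host(const uint8_t raw[2])
{
	return (uint16_t)(((unsigned int)raw[0] << 8) | raw[1]);
}
```

Passing the two bytes 0x80 0x00 yields 0x8000 on any host, whereas printing the raw 16-bit field without conversion would show a different number on a little-endian host.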
    • scsi: ibmvfc: Add failed PRLI to cmd_status lookup array · 95237c25
      Tyrel Datwyler authored
      The VIOS uses the SCSI_ERROR class to report PRLI failures. These errors
      are indicated by the combination of an IBMVFC_FC_SCSI_ERROR return status
      and a 0x8000 error code. Add these codes to cmd_status[] with an
      appropriate human-readable error message.
      Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
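
A lookup array keyed on a (status, error) pair, as described in this commit, might look roughly like the sketch below. All names and the numeric value of DEMO_FC_SCSI_ERROR are assumptions made for illustration; only the 0x8000 error code comes from the commit message, and the message text is invented.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder value for illustration; NOT taken from the ibmvfc headers. */
#define DEMO_FC_SCSI_ERROR	0x0001u
/* Error code named in the commit message for a failed PRLI. */
#define DEMO_PRLI_FAILED	0x8000u

/* One table row: a status/error pair and its human-readable text. */
struct demo_cmd_status {
	uint16_t status;
	uint16_t error;
	const char *name;
};

static const struct demo_cmd_status demo_cmd_status[] = {
	{ DEMO_FC_SCSI_ERROR, DEMO_PRLI_FAILED, "PRLI to device failed" },
};

/* Return the text for a status/error pair, or "unknown" if not listed. */
static const char *demo_cmd_name(uint16_t status, uint16_t error)
{
	size_t i;

	for (i = 0; i < sizeof(demo_cmd_status) / sizeof(demo_cmd_status[0]); i++) {
		if (demo_cmd_status[i].status == status &&
		    demo_cmd_status[i].error == error)
			return demo_cmd_status[i].name;
	}
	return "unknown";
}
```

Adding a new failure class then only requires a new table row; the lookup function itself does not change.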
    • scsi: ibmvfc: Remove "failed" from logged errors · 6dc6a944
      Tyrel Datwyler authored
      The text of messages logged with ibmvfc_log_error() always contains the term
      "failed". Commands cancelled during EH are reported back by the VIOS using
      error codes, so someone reading these log messages cannot easily tell
      whether a command was successfully cancelled. In the following real log
      message, for example, it is unclear whether the transaction was actually
      cancelled.
      
      <6>sd 0:0:1:1: Cancelling outstanding commands.
      <3>sd 0:0:1:1: [sde] Command (28) failed: transaction cancelled (2:6) flags: 0 fcp_rsp: 0, resid=0, scsi_status: 0
      
      Remove the "failed" prefix from all logged error messages. The
      ibmvfc_log_error() function already translates the returned error/status
      codes to a human-readable message.
      Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: zfcp: reduce flood of fcrscn1 trace records on multi-element RSCN · c8206579
      Steffen Maier authored
      If an incoming ELS of type RSCN contains more than one element, zfcp
      needlessly emits repeated ERP trigger NOP trace records for each
      previously failed port. These could be ports that went away. The cause is
      that it loops over each RSCN element and, for each of those, runs an inner
      loop over all zfcp_ports.
      
      The trigger to recover failed ports should be just the reception of some
      RSCN, no matter how many elements it has. So we can loop over failed ports
      separately, and only then loop over each RSCN element to handle the
      non-failed ports.
      
      The call chain was:
      
        zfcp_fc_incoming_rscn
          for (i = 1; i < no_entries; i++)
            _zfcp_fc_incoming_rscn
              list_for_each_entry(port, &adapter->port_list, list)
                if (masked port->d_id match) zfcp_fc_test_link
                if (!port->d_id) zfcp_erp_port_reopen "fcrscn1"   <===
      
      In order to reduce the "flooding" of the REC trace area in such cases, we
      factor out handling the failed ports to be outside of the entries loop:
      
        zfcp_fc_incoming_rscn
          if (no_entries > 1)                                     <===
            list_for_each_entry(port, &adapter->port_list, list)  <===
              if (!port->d_id) zfcp_erp_port_reopen "fcrscn1"     <===
          for (i = 1; i < no_entries; i++)
            _zfcp_fc_incoming_rscn
              list_for_each_entry(port, &adapter->port_list, list)
                if (masked port->d_id match) zfcp_fc_test_link
      
      Abbreviated example trace records before this code change:
      
      Tag            : fcrscn1
      WWPN           : 0x500507630310d327
      ERP want       : 0x02
      ERP need       : 0x02
      
      Tag            : fcrscn1
      WWPN           : 0x500507630310d327
      ERP want       : 0x02
      ERP need       : 0x00                 NOP => superfluous trace record
      
      The last trace entry repeats if there are more than 2 RSCN elements.
      Signed-off-by: Steffen Maier <maier@linux.ibm.com>
      Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
      Reviewed-by: Jens Remus <jremus@linux.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
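
The effect of hoisting the failed-port handling out of the per-element loop can be modelled with a small counting sketch. This is not zfcp code: the struct and function names are hypothetical, and a port with d_id == 0 simply stands for a previously failed port as in the call chains above.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for a zfcp port; d_id == 0 models a failed port. */
struct demo_port {
	uint32_t d_id;
};

/* Old flow: every RSCN element revisits all failed ports. */
static unsigned int demo_triggers_old(const struct demo_port *ports,
				      size_t nports, unsigned int no_entries)
{
	unsigned int triggers = 0;

	for (unsigned int i = 1; i < no_entries; i++)
		for (size_t p = 0; p < nports; p++)
			if (!ports[p].d_id)
				triggers++;	/* one "fcrscn1" trigger */
	return triggers;
}

/* New flow: failed ports are visited once per RSCN, before the element loop. */
static unsigned int demo_triggers_new(const struct demo_port *ports,
				      size_t nports, unsigned int no_entries)
{
	unsigned int triggers = 0;

	if (no_entries > 1)
		for (size_t p = 0; p < nports; p++)
			if (!ports[p].d_id)
				triggers++;	/* one "fcrscn1" trigger */
	return triggers;
}
```

With two failed ports and no_entries == 4 (three RSCN elements after the header), the old flow produces six triggers and the new flow two, so the number of trace records no longer grows with the number of RSCN elements.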
    • scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices · 242ec145
      Steffen Maier authored
      Suppose more than one non-NPIV FCP device is active on the same channel.
      Send I/O to storage and have some of the pending I/O run into a SCSI
      command timeout, e.g. due to bit errors on the fibre. Now the error
      situation stops. However, we saw FCP requests continue to time out in the
      channel. The abort will be successful, but the subsequent TUR fails.
      Scsi_eh starts. The LUN reset fails. The target reset fails.  The host
      reset only did an FCP device recovery. However, for non-NPIV FCP devices,
      this does not close and reopen ports on the SAN-side if other non-NPIV FCP
      device(s) share the same open ports.
      
      In order to resolve the continuing FCP request timeouts, we need to
      explicitly close and reopen ports on the SAN-side.
      
      This was missing since the beginning of zfcp in v2.6.0 history commit
      ea127f97 ("[PATCH] s390 (7/7): zfcp host adapter.").
      
      Note: The FSF requests for forced port reopen could run into FSF request
      timeouts due to other reasons. This would trigger an internal FCP device
      recovery. Pending forced port reopen recoveries would get dismissed. So
      some ports might not get fully reopened during this host reset handler.
      However, subsequent I/O would trigger the above-described escalation and
      eventually all ports would be forcibly reopened to resolve any continuing
      FCP request timeouts due to earlier bit errors.
      Signed-off-by: Steffen Maier <maier@linux.ibm.com>
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Cc: <stable@vger.kernel.org> #3.0+
      Reviewed-by: Jens Remus <jremus@linux.ibm.com>
      Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: zfcp: fix rport unblock if deleted SCSI devices on Scsi_Host · fe67888f
      Steffen Maier authored
      An already deleted SCSI device can exist on the Scsi_Host and remain there
      because something still holds a reference.  A new SCSI device with the same
      H:C:T:L and FCP device, target port WWPN, and FCP LUN can be created.  When
      we try to unblock an rport, we still find the deleted SCSI device and
      return early because the zfcp_scsi_dev of that SCSI device is not
      ZFCP_STATUS_COMMON_UNBLOCKED. Hence we fail to unblock the rport, even if
      the new, proper SCSI device is in a good state.
      
      Therefore, skip deleted SCSI devices when iterating the sdevs of the shost.
      [cf. __scsi_device_lookup{_by_target}() or scsi_device_get()]
      
      The following abbreviated trace sequence can indicate such a problem:
      
      Area           : REC
      Tag            : ersfs_3
      LUN            : 0x4045400300000000
      WWPN           : 0x50050763031bd327
      LUN status     : 0x40000000     not ZFCP_STATUS_COMMON_UNBLOCKED
      Ready count    : n		not incremented yet
      Running count  : 0x00000000
      ERP want       : 0x01
      ERP need       : 0xc1		ZFCP_ERP_ACTION_NONE
      
      Area           : REC
      Tag            : ersfs_3
      LUN            : 0x4045400300000000
      WWPN           : 0x50050763031bd327
      LUN status     : 0x41000000
      Ready count    : n+1
      Running count  : 0x00000000
      ERP want       : 0x01
      ERP need       : 0x01
      
      ...
      
      Area           : REC
      Level          : 4		only with increased trace level
      Tag            : ertru_l
      LUN            : 0x4045400300000000
      WWPN           : 0x50050763031bd327
      LUN status     : 0x40000000
      Request ID     : 0x0000000000000000
      ERP status     : 0x01800000
      ERP step       : 0x1000
      ERP action     : 0x01
      ERP count      : 0x00
      
      NOT followed by a trace record with tag "scpaddy"
      for WWPN 0x50050763031bd327.
      Signed-off-by: Steffen Maier <maier@linux.ibm.com>
      Fixes: 6f2ce1c6 ("scsi: zfcp: fix rport unblock race with LUN recovery")
      Cc: <stable@vger.kernel.org> #2.6.32+
      Reviewed-by: Jens Remus <jremus@linux.ibm.com>
      Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
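
The idea of skipping deleted devices while deciding whether an rport may be unblocked can be sketched as follows. This is a simplified model, not the zfcp implementation: the types and names are hypothetical, and the boolean "unblocked" only stands in for the ZFCP_STATUS_COMMON_UNBLOCKED status bit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced device states for illustration. */
enum demo_sdev_state {
	DEMO_SDEV_RUNNING,
	DEMO_SDEV_DELETED,
};

struct demo_sdev {
	enum demo_sdev_state state;
	bool unblocked;	/* models ZFCP_STATUS_COMMON_UNBLOCKED */
};

/*
 * Return true if the rport may be unblocked, i.e. no live device on the
 * host is still blocked. Deleted devices that linger because something
 * holds a reference are ignored instead of causing an early return.
 */
static bool demo_may_unblock_rport(const struct demo_sdev *sdevs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (sdevs[i].state == DEMO_SDEV_DELETED)
			continue;	/* stale entry, do not let it veto */
		if (!sdevs[i].unblocked)
			return false;
	}
	return true;
}
```

Without the "continue" for deleted entries, a stale device with the same H:C:T:L as a healthy new one would make the function return false, which corresponds to the missed unblock described above.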
    • scsi: sd: Quiesce warning if device does not report optimal I/O size · 1d5de5bd
      Martin K. Petersen authored
      Commit a83da8a4 ("scsi: sd: Optimal I/O size should be a multiple
      of physical block size") split one conditional into several separate
      statements in an effort to provide more accurate warning messages when
      a device reports a nonsensical value. However, this reorganization
      accidentally dropped the precondition of the reported value being
      larger than zero. This led to a warning getting emitted on devices
      that do not report an optimal I/O size at all.
      
      Remain silent if a device does not report an optimal I/O size.
      
      Fixes: a83da8a4 ("scsi: sd: Optimal I/O size should be a multiple of physical block size")
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: <stable@vger.kernel.org>
      Reported-by: Hussam Al-Tayeb <ht990332@gmx.com>
      Tested-by: Hussam Al-Tayeb <ht990332@gmx.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
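
The restored precondition can be pictured as a three-way decision: use the reported value, ignore it with a warning, or ignore it silently when nothing was reported. The sketch below is an assumption-laden model, not the sd driver code; the names are hypothetical and only the "multiple of physical block size" rule and the "larger than zero" precondition come from the commit text.

```c
#include <stdint.h>

/* Hypothetical outcomes of validating a reported optimal I/O size. */
enum demo_opt_io {
	DEMO_OPT_USE,		/* value is plausible, use it */
	DEMO_OPT_IGNORE_SILENT,	/* device reported nothing, stay quiet */
	DEMO_OPT_IGNORE_WARN,	/* value is nonsensical, warn and ignore */
};

static enum demo_opt_io demo_check_opt_io(uint32_t opt_bytes,
					  uint32_t phys_block_bytes)
{
	/* Precondition that was accidentally dropped: zero means "not reported". */
	if (opt_bytes == 0)
		return DEMO_OPT_IGNORE_SILENT;

	/* A reported value must be a multiple of the physical block size. */
	if (phys_block_bytes == 0 || opt_bytes % phys_block_bytes != 0)
		return DEMO_OPT_IGNORE_WARN;

	return DEMO_OPT_USE;
}
```

Checking for zero first is the design point: the individual sanity checks that follow can then each emit a precise warning without firing for devices that simply do not report the field.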
    • scsi: sd: Fix a race between closing an sd device and sd I/O · c14a5726
      Bart Van Assche authored
      The scsi_end_request() function calls scsi_cmd_to_driver() indirectly and
      hence needs the disk->private_data pointer. Make sure that this pointer is
      not cleared before all affected I/O requests have finished. This patch
      prevents the following crash:
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      Call trace:
       scsi_mq_uninit_cmd+0x1c/0x30
       scsi_end_request+0x7c/0x1b8
       scsi_io_completion+0x464/0x668
       scsi_finish_command+0xbc/0x160
       scsi_eh_flush_done_q+0x10c/0x170
       sas_scsi_recover_host+0x84c/0xa98 [libsas]
       scsi_error_handler+0x140/0x5b0
       kthread+0x100/0x12c
       ret_from_fork+0x10/0x18
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Jason Yan <yanaijie@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reported-by: Jason Yan <yanaijie@huawei.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: core: Run queue when state is set to running after being blocked · 70fc085c
      zhengbin authored
      Use dd to test a SCSI device:
      
        1. echo "blocked" >/sys/block/sda/device/state
        2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10
        3. echo "running" >/sys/block/sda/device/state
      
      dd should finish this work after step 3, but it hangs.
      
      After step 2, the call chain is:
      
      blk_mq_dispatch_rq_list-->scsi_queue_rq-->prep_to_mq
      
      prep_to_mq will return BLK_STS_RESOURCE, and scsi_queue_rq will
      transition it to BLK_STS_DEV_RESOURCE, which means that the driver
      guarantees that I/O dispatch will be triggered in the future when the
      resource becomes available. We need to honor that guarantee when the
      device state is set to running.
      
      [mkp: tweaked commit description and code comment as suggested by Bart]
      Signed-off-by: zhengbin <zhengbin13@huawei.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  4. 26 Mar, 2019 3 commits
  5. 21 Mar, 2019 2 commits
    • scsi: ibmvscsi: Fix empty event pool access during host removal · 7f5203c1
      Tyrel Datwyler authored
      The event pool used for queueing commands is destroyed fairly early in the
      ibmvscsi_remove() code path. Since this happens prior to the call to
      scsi_remove_host(), it is possible for further calls to queuecommand to be
      processed, which manifests as a panic due to a NULL pointer dereference as
      seen here:
      
      PANIC: "Unable to handle kernel paging request for data at address
      0x00000000"
      
      Context process backtrace:
      
      DSISR: 0000000042000000 ????Syscall Result: 0000000000000000
      4 [c000000002cb3820] memcpy_power7 at c000000000064204
      [Link Register] [c000000002cb3820] ibmvscsi_send_srp_event at d000000003ed14a4
      5 [c000000002cb3920] ibmvscsi_send_srp_event at d000000003ed14a4 [ibmvscsi] ?(unreliable)
      6 [c000000002cb39c0] ibmvscsi_queuecommand at d000000003ed2388 [ibmvscsi]
      7 [c000000002cb3a70] scsi_dispatch_cmd at d00000000395c2d8 [scsi_mod]
      8 [c000000002cb3af0] scsi_request_fn at d00000000395ef88 [scsi_mod]
      9 [c000000002cb3be0] __blk_run_queue at c000000000429860
      10 [c000000002cb3c10] blk_delay_work at c00000000042a0ec
      11 [c000000002cb3c40] process_one_work at c0000000000dac30
      12 [c000000002cb3cd0] worker_thread at c0000000000db110
      13 [c000000002cb3d80] kthread at c0000000000e3378
      14 [c000000002cb3e30] ret_from_kernel_thread at c00000000000982c
      
      The kernel log buffer is flooded with this message:
      
      [11261.952732] ibmvscsi: found no event struct in pool!
      
      This patch reorders the operations during host teardown. Start by calling
      the SRP transport and Scsi_Host remove functions to flush any outstanding
      work and set the host offline. LLDD teardown follows including destruction
      of the event pool, freeing the Command Response Queue (CRQ), and unmapping
      any persistent buffers. The event pool destruction is protected by the
      scsi_host lock, and any requests for which we never received a response are
      purged from the pool first. Finally, move the removal of the scsi host from
      our global list to the end so that the host is easily locatable for
      debugging purposes during teardown.
      
      Cc: <stable@vger.kernel.org> # v2.6.12+
      Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: ibmvscsi: Protect ibmvscsi_head from concurrent modification · 7205981e
      Tyrel Datwyler authored
      For each ibmvscsi host created during a probe or destroyed during a remove
      we either add or remove that host to/from the global ibmvscsi_head
      list. This runs the risk of concurrent modification.
      
      This patch adds a simple spinlock around the list modification calls to
      prevent concurrent updates as is done similarly in the ibmvfc driver and
      ipr driver.
      
      Fixes: 32d6e4b6 ("scsi: ibmvscsi: add vscsi hosts to global list_head")
      Cc: <stable@vger.kernel.org> # v4.10+
      Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
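
The pattern of guarding a global list with a lock can be sketched in userspace C. In the kernel this would use a spinlock together with the list helpers; the version below is only a stand-in that uses a C11 atomic_flag as a simple spinlock and a hand-rolled singly linked list. All names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical host object linked into one global list. */
struct demo_host {
	struct demo_host *next;
};

static struct demo_host *demo_head;		/* global list head */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;	/* protects demo_head */

static void demo_lock_acquire(void)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&demo_lock, memory_order_acquire))
		;
}

static void demo_lock_release(void)
{
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/* Called from the probe path: link the host under the lock. */
static void demo_host_add(struct demo_host *h)
{
	demo_lock_acquire();
	h->next = demo_head;
	demo_head = h;
	demo_lock_release();
}

/* Called from the remove path: unlink the host under the lock. */
static void demo_host_remove(struct demo_host *h)
{
	demo_lock_acquire();
	for (struct demo_host **pp = &demo_head; *pp; pp = &(*pp)->next) {
		if (*pp == h) {
			*pp = h->next;
			break;
		}
	}
	demo_lock_release();
}

/* Count the linked hosts, also under the lock. */
static size_t demo_host_count(void)
{
	size_t n = 0;

	demo_lock_acquire();
	for (struct demo_host *h = demo_head; h; h = h->next)
		n++;
	demo_lock_release();
	return n;
}
```

Because every reader and writer of demo_head takes the same lock, a probe and a remove running concurrently cannot observe or produce a half-updated list.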
  6. 20 Mar, 2019 1 commit
  7. 19 Mar, 2019 4 commits
    • scsi: qla2xxx: Fix NULL pointer crash due to stale CPUID · ac444b4f
      Himanshu Madhani authored
      This patch fixes a crash due to a NULL pointer dereference that occurs
      because the CPU pointer is not set but is used by the driver. Instead, the
      driver now passes the CPU as a tag via
      ha->isp_ops->{lun_reset|target_reset}.
      
      [   30.160780] qla2xxx [0000:a0:00.1]-8038:9: Cable is unplugged...
      [   69.984045] qla2xxx [0000:a0:00.0]-8009:8: DEVICE RESET ISSUED nexus=8:0:0 cmd=00000000b0d62f46.
      [   69.992849] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
      [   70.000680] PGD 0 P4D 0
      [   70.003232] Oops: 0000 [#1] SMP PTI
      [   70.006727] CPU: 2 PID: 6714 Comm: sg_reset Kdump: loaded Not tainted 4.18.0-67.el8.x86_64 #1
      [   70.015258] Hardware name: NEC Express5800/T110j [N8100-2758Y]/MX32-PH0-NJ, BIOS F11 02/13/2019
      [   70.024016] RIP: 0010:blk_mq_rq_cpu+0x9/0x10
      [   70.028315] Code: 01 58 01 00 00 48 83 c0 28 48 3d 80 02 00 00 75 ab c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 08 <8b> 40 40 c3 0f 1f 00 0f 1f 44 00 00 48 83 ec 10 48 c7 c6 20 6e 7c
      [   70.047087] RSP: 0018:ffff99a481487d58 EFLAGS: 00010246
      [   70.052322] RAX: 0000000000000000 RBX: ffffffffc041b08b RCX: 0000000000000000
      [   70.059466] RDX: 0000000000000000 RSI: ffff8d10b6b16898 RDI: ffff8d10b341e400
      [   70.066615] RBP: ffffffffc03a6bd0 R08: 0000000000000415 R09: 0000000000aaaaaa
      [   70.073765] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8d10b341e528
      [   70.080914] R13: ffff8d10aadefc00 R14: ffff8d0f64efa998 R15: ffff8d0f64efa000
      [   70.088083] FS:  00007f90a201e540(0000) GS:ffff8d10b6b00000(0000) knlGS:0000000000000000
      [   70.096188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   70.101959] CR2: 0000000000000040 CR3: 0000000268886005 CR4: 00000000003606e0
      [   70.109127] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   70.116277] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   70.123425] Call Trace:
      [   70.125896]  __qla2xxx_eh_generic_reset+0xb1/0x220 [qla2xxx]
      [   70.131572]  scsi_ioctl_reset+0x1f5/0x2a0
      [   70.135600]  scsi_ioctl+0x18e/0x397
      [   70.139099]  ? sd_ioctl+0x7c/0x100 [sd_mod]
      [   70.143287]  blkdev_ioctl+0x32b/0x9f0
      [   70.146954]  ? __check_object_size+0xa3/0x181
      [   70.151323]  block_ioctl+0x39/0x40
      [   70.154735]  do_vfs_ioctl+0xa4/0x630
      [   70.158322]  ? syscall_trace_enter+0x1d3/0x2c0
      [   70.162769]  ksys_ioctl+0x60/0x90
      [   70.166104]  __x64_sys_ioctl+0x16/0x20
      [   70.169859]  do_syscall_64+0x5b/0x1b0
      [   70.173532]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      [   70.178587] RIP: 0033:0x7f90a1b3445b
      [   70.182183] Code: 0f 1e fa 48 8b 05 2d aa 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd a9 2c 00 f7 d8 64 89 01 48
      [   70.200956] RSP: 002b:00007fffdca88b68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [   70.208535] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f90a1b3445b
      [   70.215684] RDX: 00007fffdca88b84 RSI: 0000000000002284 RDI: 0000000000000003
      [   70.222833] RBP: 00007fffdca88ca8 R08: 00007fffdca88b84 R09: 0000000000000000
      [   70.229981] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffdca88b84
      [   70.237131] R13: 0000000000000000 R14: 000055ab09b0bd28 R15: 0000000000000000
      [   70.244284] Modules linked in: nft_chain_route_ipv4 xt_CHECKSUM nft_chain_nat_ipv4 ipt_MASQUERADE nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 nft_counter nft_compat tun bridge stp llc nf_tables nfnetlink devlink sunrpc vfat fat intel_rapl intel_pmc_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm wmi_bmof iTCO_wdt iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ipmi_ssif intel_cstate intel_uncore intel_rapl_perf ipmi_si joydev pcspkr ipmi_devintf sg wmi ipmi_msghandler video acpi_power_meter acpi_pad mei_me i2c_i801 mei ip_tables ext4 mbcache jbd2 sr_mod cdrom sd_mod qla2xxx ast i2c_algo_bit drm_kms_helper nvme_fc syscopyarea sysfillrect uas sysimgblt fb_sys_fops nvme_fabrics ttm
      [   70.314805]  usb_storage nvme_core crc32c_intel scsi_transport_fc ahci drm libahci tg3 libata megaraid_sas pinctrl_cannonlake pinctrl_intel
      [   70.327335] CR2: 0000000000000040
      
      Fixes: 9cf2bab6 ("block: kill request ->cpu member")
      Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: qla2xxx: Fix FC-AL connection target discovery · 4705f10e
      Quinn Tran authored
      Commit 7f147f9b ("scsi: qla2xxx: Fix N2N target discovery with Local
      loop") fixed N2N target discovery for local loop. However, the same code is
      used for FC-AL discovery as well. Add a check to make sure we bypass the
      area and domain check only in N2N topology for target discovery.
      
      Fixes: 7f147f9b ("scsi: qla2xxx: Fix N2N target discovery with Local loop")
      Cc: stable@vger.kernel.org # 5.0+
      Signed-off-by: Quinn Tran <qtran@marvell.com>
      Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
      Reviewed-by: Ewan D. Milne <emilne@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: core: Avoid that a kernel warning appears during system resume · 17605afa
      Bart Van Assche authored
      Since scsi_device_quiesce() skips SCSI devices that are in a state other
      than RUNNING, OFFLINE or TRANSPORT_OFFLINE, scsi_device_resume() should not
      complain about SCSI devices that have been skipped. This patch prevents
      the following warning from appearing during resume:
      
      WARNING: CPU: 3 PID: 1039 at blk_clear_pm_only+0x2a/0x30
      CPU: 3 PID: 1039 Comm: kworker/u8:49 Not tainted 5.0.0+ #1
      Hardware name: LENOVO 4180F42/4180F42, BIOS 83ET75WW (1.45 ) 05/10/2013
      Workqueue: events_unbound async_run_entry_fn
      RIP: 0010:blk_clear_pm_only+0x2a/0x30
      Call Trace:
       ? scsi_device_resume+0x28/0x50
       ? scsi_dev_type_resume+0x2b/0x80
       ? async_run_entry_fn+0x2c/0xd0
       ? process_one_work+0x1f0/0x3f0
       ? worker_thread+0x28/0x3c0
       ? process_one_work+0x3f0/0x3f0
       ? kthread+0x10c/0x130
       ? __kthread_create_on_node+0x150/0x150
       ? ret_from_fork+0x1f/0x30
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
      Cc: Martin Steigerwald <martin@lichtvoll.de>
      Cc: <stable@vger.kernel.org>
      Reported-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
      Tested-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
      Fixes: 3a0a5299 ("block, scsi: Make SCSI quiesce and resume work reliably") # v4.15
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
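
The symmetry this commit asks for (resume only undoes what quiesce actually did) can be modelled with a small state sketch. This is not the SCSI core implementation: the struct, the "quiesced" flag and the pm_only counter are assumptions chosen to illustrate the idea, and the state names are reduced to those mentioned in the commit message plus one other state.

```c
#include <stdbool.h>

/* Reduced, hypothetical set of device states. */
enum demo_state {
	DEMO_RUNNING,
	DEMO_OFFLINE,
	DEMO_TRANSPORT_OFFLINE,
	DEMO_BLOCKED,		/* example of a state that quiesce skips */
};

struct demo_dev {
	enum demo_state state;
	bool quiesced;	/* set only if quiesce really acted on the device */
	int pm_only;	/* models a per-queue "PM only" counter */
};

/* Quiesce acts only on devices in one of the three listed states. */
static void demo_quiesce(struct demo_dev *d)
{
	if (d->state != DEMO_RUNNING && d->state != DEMO_OFFLINE &&
	    d->state != DEMO_TRANSPORT_OFFLINE)
		return;		/* skipped: nothing to undo later */
	d->pm_only++;
	d->quiesced = true;
}

/*
 * Resume undoes quiesce only if quiesce acted. Returns false for a
 * skipped device, which is the case where no warning should be raised.
 */
static bool demo_resume(struct demo_dev *d)
{
	if (!d->quiesced)
		return false;
	d->quiesced = false;
	d->pm_only--;
	return true;
}
```

Recording whether quiesce acted keeps the counter balanced: a device that was skipped on the way down is never decremented on the way up, so an unbalanced-decrement check has nothing to complain about.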
    • scsi: core: Also call destroy_rcu_head() for passthrough requests · db983f6e
      Bart Van Assche authored
      cmd->rcu is initialized by scsi_initialize_rq(). For passthrough
      requests, blk_get_request() calls scsi_initialize_rq(). For filesystem
      requests, scsi_init_command() calls scsi_initialize_rq(). Make sure
      that destroy_rcu_head() is called for passthrough requests.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Ewan D. Milne <emilne@redhat.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Reported-by: Ewan D. Milne <emilne@redhat.com>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  8. 18 Mar, 2019 1 commit
    • scsi: iscsi: flush running unbind operations when removing a session · 165aa2bf
      Maurizio Lombardi authored
      In some cases, the iscsi_remove_session() function is called while an
      unbind_work operation is still running.  This may cause a situation where
      sysfs objects are removed in an incorrect order, triggering a kernel
      warning.
      
      [  605.249442] ------------[ cut here ]------------
      [  605.259180] sysfs group 'power' not found for kobject 'target2:0:0'
      [  605.321371] WARNING: CPU: 1 PID: 26794 at fs/sysfs/group.c:235 sysfs_remove_group+0x76/0x80
      [  605.341266] Modules linked in: dm_service_time target_core_user target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod nls_utf8 isofs ppdev bochs_drm nfit ttm libnvdimm drm_kms_helper syscopyarea sysfillrect sysimgblt joydev pcspkr fb_sys_fops drm i2c_piix4 sg parport_pc parport xfs libcrc32c dm_multipath sr_mod sd_mod cdrom ata_generic 8021q garp mrp ata_piix stp crct10dif_pclmul crc32_pclmul llc libata crc32c_intel virtio_net net_failover ghash_clmulni_intel serio_raw failover sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
      [  605.627479] CPU: 1 PID: 26794 Comm: kworker/u32:2 Not tainted 4.18.0-60.el8.x86_64 #1
      [  605.721401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
      [  605.823651] Workqueue: scsi_wq_2 __iscsi_unbind_session [scsi_transport_iscsi]
      [  605.830940] RIP: 0010:sysfs_remove_group+0x76/0x80
      [  605.922907] Code: 48 89 df 5b 5d 41 5c e9 38 c4 ff ff 48 89 df e8 e0 bf ff ff eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 38 73 cb a7 e8 24 77 d7 ff <0f> 0b 5b 5d 41 5c c3 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 55
      [  606.122304] RSP: 0018:ffffbadcc8d1bda8 EFLAGS: 00010286
      [  606.218492] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [  606.326381] RDX: ffff98bdfe85eb40 RSI: ffff98bdfe856818 RDI: ffff98bdfe856818
      [  606.514498] RBP: ffffffffa7ab73e0 R08: 0000000000000268 R09: 0000000000000007
      [  606.529469] R10: 0000000000000000 R11: ffffffffa860d9ad R12: ffff98bdf978e838
      [  606.630535] R13: ffff98bdc2cd4010 R14: ffff98bdc2cd3ff0 R15: ffff98bdc2cd4000
      [  606.824707] FS:  0000000000000000(0000) GS:ffff98bdfe840000(0000) knlGS:0000000000000000
      [  607.018333] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  607.117844] CR2: 00007f84b78ac024 CR3: 000000002c00a003 CR4: 00000000003606e0
      [  607.117844] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  607.420926] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  607.524236] Call Trace:
      [  607.530591]  device_del+0x56/0x350
      [  607.624393]  ? ata_tlink_match+0x30/0x30 [libata]
      [  607.727805]  ? attribute_container_device_trigger+0xb4/0xf0
      [  607.829911]  scsi_target_reap_ref_release+0x39/0x50
      [  607.928572]  scsi_remove_target+0x1a2/0x1d0
      [  608.017350]  __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi]
      [  608.117435]  process_one_work+0x1a7/0x360
      [  608.132917]  worker_thread+0x30/0x390
      [  608.222900]  ? pwq_unbound_release_workfn+0xd0/0xd0
      [  608.323989]  kthread+0x112/0x130
      [  608.418318]  ? kthread_bind+0x30/0x30
      [  608.513821]  ret_from_fork+0x35/0x40
      [  608.613909] ---[ end trace 0b98c310c8a6138c ]---
      Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
      Acked-by: Chris Leech <cleech@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  9. 17 Mar, 2019 14 commits
  10. 16 Mar, 2019 1 commit
    • Merge tag 'pidfd-v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · a9dce667
      Linus Torvalds authored
      Pull pidfd system call from Christian Brauner:
       "This introduces the ability to use file descriptors from /proc/<pid>/
        as stable handles on struct pid. Even if a pid is recycled the handle
        will not change. For a start these fds can be used to send signals to
        the processes they refer to.
      
        With the ability to use /proc/<pid> fds as stable handles on struct
        pid we can fix a long-standing issue where after a process has exited
        its pid can be reused by another process. If a caller sends a signal
        to a reused pid it will end up signaling the wrong process.
      
        With this patchset we enable a variety of use cases. One obvious
        example is that we can now safely delegate an important part of
        process management - sending signals - to processes other than the
        parent of a given process by sending file descriptors around via scm
        rights and not fearing that the given process will have been recycled
        in the meantime. It also allows for easy testing whether a given
        process is still alive or not by sending signal 0 to a pidfd which is
        quite handy.
      
        There has been some interest in this feature e.g. from systems
        management (systemd, glibc) and container managers. I have requested
        and gotten comments from glibc to make sure that this syscall is
        suitable for their needs as well. In the future I expect it to take on
        most other pid-based signal syscalls. But such features are left for
        the future once they are needed.
      
        This has been sitting in linux-next for quite a while and has not
        caused any issues. It comes with selftests which verify basic
        functionality and also test that a recycled pid cannot be signaled via
        a pidfd.
      
        Jon has written about a prior version of this patchset. It should
        cover the basic functionality since not a lot has changed since then:
      
            https://lwn.net/Articles/773459/
      
        The commit message for the syscall itself extensively documents
        the syscall, including its functionality and extensibility"
      
      * tag 'pidfd-v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        selftests: add tests for pidfd_send_signal()
        signal: add pidfd_send_signal() syscall
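
The "signal 0 as a liveness test" use case from the pull message can be sketched as below. This is a minimal illustration under several assumptions: it needs a kernel that provides pidfd_send_signal() (5.1 or later) and a mounted /proc, the helper name demo_pid_alive() is hypothetical, and the fallback syscall number 424 is assumed to match the target architecture (it does on x86-64 and arm64) when the C library headers do not define it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424	/* assumed value, see lead-in */
#endif

/*
 * Probe a process through a /proc/<pid> directory fd.
 * Returns 1 if the process is alive, 0 if it is gone, and -1 if the
 * probe itself failed (old kernel, no /proc, no permission, ...).
 */
static int demo_pid_alive(pid_t pid)
{
	char path[32];
	int fd, saved_errno;
	long ret;

	snprintf(path, sizeof(path), "/proc/%ld", (long)pid);
	fd = open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
	if (fd < 0)
		return errno == ENOENT ? 0 : -1;

	/* Signal 0 performs the existence and permission checks only. */
	ret = syscall(SYS_pidfd_send_signal, fd, 0, NULL, 0u);
	saved_errno = errno;
	close(fd);

	if (ret == 0)
		return 1;
	return saved_errno == ESRCH ? 0 : -1;
}
```

The benefit over kill(pid, 0) is that the fd keeps referring to the process it was opened for: if that process exits and its pid is recycled, the call fails with ESRCH instead of probing the unrelated new process.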