Commits · 8d2c9d25b725a699479d388da7e116d1d2bc0ea1 · Kirill Smelkov / linux · GitLab

30 Dec, 2022 3 commits

scsi: libsas: Remove useless dev_list delete in sas_ex_discover_end_dev() · 8d2c9d25

Jason Yan authored Dec 14, 2022

The domain device 'child' is allocated in sas_ex_discover_end_dev() and
used to be added to the dev_list in this function. After the following two
fixes the device is added to the disco_list instead. As a result, the
list_del() and locking left behind is now redundant.

Fixes: 87c8331f ("[SCSI] libsas: prevent domain rediscovery competing with ata error handling")
Fixes: 92625f9b ("[SCSI] libsas: restore scan order")
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

8d2c9d25

scsi: libsas: Change the coding style of sas_discover_sata() · ffebb38e

Jason Yan authored Dec 14, 2022

The coding style where calling this interface is inconsistent with other
interfaces for SATA devices. The standard style for other SATA interfaces
is like:

    #ifdefine CONFIG_SCSI_SAS_ATA
    void sas_ata_task_abort(struct sas_task *task);
    #else
    static inline void sas_ata_task_abort(struct sas_task *task)
    {
    }
    #endif

And the callers does not have to do things like "#ifdefine CONFIG_SCSI_SAS_ATA"
and may call the interface directly. So follow the standard style here.

Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

ffebb38e

scsi: libsas: Move sas_get_ata_command_set() up to save the declaration · 6c90466e

Jason Yan authored Dec 14, 2022

There is a sas_get_ata_command_set() declaration above sas_get_ata_info()
to make it compile. However, this function is defined in the same
file. Move it up to save the forward declaration.

Also remove the variable 'fis' which is not needed in this function.

Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

6c90466e

26 Nov, 2022 17 commits

scsi: libsas: Do not export sas_ata_wait_after_reset() · 4d450cf2

Jie Zhan authored Nov 18, 2022

sas_ata_wait_after_reset() does not need to be exported since it is no
longer referenced outside libsas.
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Link: https://lore.kernel.org/r/20221118083714.4034612-6-zhanjie9@hisilicon.comReviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

4d450cf2

scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset · 3c2673a0

Jie Zhan authored Nov 18, 2022

SATA devices on an expander may be removed and not be found again when I_T
nexus reset and revalidation are processed simultaneously.

The issue comes from:

 - Revalidation can remove SATA devices in link reset, e.g. in
   hisi_sas_clear_nexus_ha().

 - However, hisi_sas_debug_I_T_nexus_reset() polls the state of a SATA
   device on an expander after sending link_reset, where it calls:
    hisi_sas_debug_I_T_nexus_reset
     sas_ata_wait_after_reset
      ata_wait_after_reset
       ata_wait_ready
        smp_ata_check_ready
         sas_ex_phy_discover
          sas_ex_phy_discover_helper
           sas_set_ex_phy

   The ex_phy's change count is updated in sas_set_ex_phy(), so SATA
   devices after a link reset may not be found later through revalidation.

A similar issue was reported in:
commit 0f3fce5c ("[SCSI] libsas: fix ata_eh clobbering ex_phys via
smp_ata_check_ready")
commit 87c8331f ("[SCSI] libsas: prevent domain rediscovery competing
with ata error handling").

To address this issue, in hisi_sas_debug_I_T_nexus_reset(), we now call
smp_ata_check_ready_type() that only polls the device type while not
updating the ex_phy's data of libsas.

Fixes: 71453bd9 ("scsi: hisi_sas: Use sas_ata_wait_after_reset() in IT nexus reset")
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Link: https://lore.kernel.org/r/20221118083714.4034612-5-zhanjie9@hisilicon.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

3c2673a0

scsi: libsas: Add smp_ata_check_ready_type() · 9181ce3c

Jie Zhan authored Nov 18, 2022

Create function smp_ata_check_ready_type() for LLDDs to wait for SATA
devices to come up after a link reset.
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Link: https://lore.kernel.org/r/20221118083714.4034612-4-zhanjie9@hisilicon.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

9181ce3c

scsi: Revert "scsi: hisi_sas: Don't send bcast events from HW during nexus HA reset" · 94a3555d

Jie Zhan authored Nov 18, 2022

This reverts commit f5f2a271.

This is now unnecessary to solve the SATA devices missing issue in
hisi_sas_clear_nexus_ha(). Hence, we should not ignore bcast events during
sas_eh_handle_sas_errors() in case of missing bcast events, unless a
justified need is found and a mechanism to defer (but not ignore) bcast
events in sas_eh_handle_sas_errors() is provided.

Also, in hisi_sas_clear_nexus_ha(), there is nothing further to handle in
"out: " other than return, so that part can be reverted.
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Link: https://lore.kernel.org/r/20221118083714.4034612-3-zhanjie9@hisilicon.comReviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

94a3555d

scsi: Revert "scsi: hisi_sas: Drain bcast events in hisi_sas_rescan_topology()" · 7e613be7

Jie Zhan authored Nov 18, 2022

This reverts commit 11ff0c98.

Draining or flushing events in hisi_sas_rescan_topology() can hang the
driver, typically with phy up or phy down events being processed,
i.e. sas_porte_bytes_dmaed() or sas_phye_loss_of_signal().
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Link: https://lore.kernel.org/r/20221118083714.4034612-2-zhanjie9@hisilicon.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7e613be7

scsi: ufs: ufs-mediatek: Modify the return value · 96a2dfa1

ChanWoo Lee authored Nov 18, 2022

Be consistent with the rest of driver wrt. functions returning bool.

91: return !!(host->caps & UFS_MTK_CAP_BOOST_CRYPT_ENGINE);
98: return !!(host->caps & UFS_MTK_CAP_VA09_PWR_CTRL);
105: return !!(host->caps & UFS_MTK_CAP_BROKEN_VCC);
Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com>
Link: https://lore.kernel.org/r/20221118045242.2770-1-cw9316.lee@samsung.comReviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

96a2dfa1

scsi: ufs: ufs-mediatek: Remove unneeded code · 54155528

ChanWoo Lee authored Nov 18, 2022

Remove unnecessary if/goto code.
Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com>
Link: https://lore.kernel.org/r/20221118044136.921-1-cw9316.lee@samsung.comReviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

54155528

scsi: device_handler: alua: Call scsi_device_put() from non-atomic context · 50759b88

Bart Van Assche authored Nov 17, 2022

Since commit f93ed747 ("scsi: core: Release SCSI devices
synchronously"), scsi_device_put() might sleep. Avoid calling it from
alua_rtpg_queue() with the pg_lock held. The lock only pretects h->pg,
anyway. To avoid the pg being freed under us, because of a race with
another thread, take a temporary reference. In alua_rtpg_queue(), verify
that the pg still belongs to the sdev being passed before actually queueing
the RTPG.

This patch fixes the following smatch warning:

drivers/scsi/device_handler/scsi_dh_alua.c:1013 alua_rtpg_queue() warn: sleeping in atomic context

alua_check_vpd() <- disables preempt
-> alua_rtpg_queue()
   -> scsi_device_put()

Cc: Martin Wilck <mwilck@suse.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sachin Sant <sachinp@linux.ibm.com>
Cc: Benjamin Block <bblock@linux.ibm.com>
Suggested-by: Martin Wilck <mwilck@suse.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20221117183626.2656196-3-bvanassche@acm.orgTested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

50759b88

scsi: device_handler: alua: Revert "Move a scsi_device_put() call out of alua_check_vpd()" · a500c4cc

Bart Van Assche authored Nov 17, 2022

There is a bug in commit 0b25e17e ("scsi: alua: Move a
scsi_device_put() call out of alua_check_vpd()"): that patch may cause
alua_rtpg_queue() callers to call scsi_device_put() even if that function
should not be called. Revert that commit to prepare for a different
solution.

Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Sachin Sant <sachinp@linux.ibm.com>
Cc: Benjamin Block <bblock@linux.ibm.com>
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Reported-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20221117183626.2656196-2-bvanassche@acm.orgTested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

a500c4cc

scsi: snic: Fix possible UAF in snic_tgt_create() · e118df49

Gaosheng Cui authored Nov 17, 2022

Smatch reports a warning as follows:

drivers/scsi/snic/snic_disc.c:307 snic_tgt_create() warn:
  '&tgt->list' not removed from list

If device_add() fails in snic_tgt_create(), tgt will be freed, but
tgt->list will not be removed from snic->disc.tgt_list, then list traversal
may cause UAF.

Remove from snic->disc.tgt_list before free().

Fixes: c8806b6c ("snic: driver for Cisco SCSI HBA")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Link: https://lore.kernel.org/r/20221117035100.2944812-1-cuigaosheng1@huawei.comAcked-by: Narsimhulu Musini <nmusini@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e118df49

scsi: qla2xxx: Initialize vha->unknown_atio_[list, work] for NPIV hosts · 95da5e58

Gleb Chesnokov authored Nov 15, 2022

Initialization of vha->unknown_atio_list and vha->unknown_atio_work only
happens for base_vha in qlt_probe_one_stage1(). But there is no
initialization for NPIV hosts that are created in qla24xx_vport_create().

This causes a crash when trying to access these NPIV host fields.

Fix this by adding initialization to qla_vport_create().
Signed-off-by: Gleb Chesnokov <gleb.chesnokov@scst.dev>
Link: https://lore.kernel.org/r/376c89a2-a9ac-bcf9-bf0f-dfe89a02fd4b@scst.devSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

95da5e58

scsi: qla2xxx: Remove duplicate of vha->iocb_work initialization · 3620e174

Gleb Chesnokov authored Nov 15, 2022

Commit 9b3e0f4d ("scsi: qla2xxx: Move work element processing out of
DPC thread") introduced the initialization of vha->iocb_work in
qla2x00_create_host() function.

This initialization is also called from qla2x00_probe_one() function, just
after qla2x00_create_host().

Hence remove this duplicate call since it has already been called before.
Signed-off-by: Gleb Chesnokov <gleb.chesnokov@scst.dev>
Link: https://lore.kernel.org/r/822b3823-f344-67d6-30f1-16e31cf68eed@scst.devSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

3620e174

scsi: fcoe: Fix transport not deattached when fcoe_if_init() fails · 4155658c

Chen Zhongjin authored Nov 15, 2022

fcoe_init() calls fcoe_transport_attach(&fcoe_sw_transport), but when
fcoe_if_init() fails, &fcoe_sw_transport is not detached and leaves freed
&fcoe_sw_transport on fcoe_transports list. This causes panic when
reinserting module.

 BUG: unable to handle page fault for address: fffffbfff82e2213
 RIP: 0010:fcoe_transport_attach+0xe1/0x230 [libfcoe]
 Call Trace:
  <TASK>
  do_one_initcall+0xd0/0x4e0
  load_module+0x5eee/0x7210
  ...

Fixes: 78a58246 ("[SCSI] fcoe: convert fcoe.ko to become an fcoe transport provider driver")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Link: https://lore.kernel.org/r/20221115092442.133088-1-chenzhongjin@huawei.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

4155658c

scsi: sd: Use 16-byte SYNCHRONIZE CACHE on ZBC devices · 42c59077

Shin'ichiro Kawasaki authored Nov 15, 2022

ZBC Zoned Block Commands specification mandates SYNCHRONIZE CACHE(16) for
host-managed zoned block devices, but does not mandate SYNCHRONIZE
CACHE(10). Call SYNCHRONIZE CACHE(16) in place of SYNCHRONIZE CACHE(10) to
ensure that the command is always supported. For this purpose, add
use_16_for_sync flag to struct scsi_device in same manner as use_16_for_rw
flag.

To be precise, ZBC does not mandate SYNCHRONIZE CACHE(16) for host-aware
zoned block devices. However, modern devices should support 16-byte
commands. Hence, call SYNCHRONIZE CACHE (16) on both types of ZBC devices,
host-aware and host-managed. Of note is that READ(16) and WRITE(16) have
same story and they are already called for both types of ZBC devices.

Another note is that this patch depends on the fix commit ea045fd3
("ata: libata-scsi: fix SYNCHRONIZE CACHE (16) command failure").
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20221115002905.1709006-1-shinichiro.kawasaki@wdc.comReviewed-by: Damien Le Moal <damien.lemoal@opendource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

42c59077

scsi: ipr: Fix WARNING in ipr_init() · e6f108bf

Shang XiaoJing authored Nov 13, 2022

ipr_init() will not call unregister_reboot_notifier() when
pci_register_driver() fails, which causes a WARNING. Call
unregister_reboot_notifier() when pci_register_driver() fails.

notifier callback ipr_halt [ipr] already registered
WARNING: CPU: 3 PID: 299 at kernel/notifier.c:29
notifier_chain_register+0x16d/0x230
Modules linked in: ipr(+) xhci_pci_renesas xhci_hcd ehci_hcd usbcore
led_class gpu_sched drm_buddy video wmi drm_ttm_helper ttm
drm_display_helper drm_kms_helper drm drm_panel_orientation_quirks
agpgart cfbft
CPU: 3 PID: 299 Comm: modprobe Tainted: G        W
6.1.0-rc1-00190-g39508d23b672-dirty #332
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
RIP: 0010:notifier_chain_register+0x16d/0x230
Call Trace:
 <TASK>
 __blocking_notifier_chain_register+0x73/0xb0
 ipr_init+0x30/0x1000 [ipr]
 do_one_initcall+0xdb/0x480
 do_init_module+0x1cf/0x680
 load_module+0x6a50/0x70a0
 __do_sys_finit_module+0x12f/0x1c0
 do_syscall_64+0x3f/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: f72919ec ("[SCSI] ipr: implement shutdown changes and remove obsolete write cache parameter")
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Link: https://lore.kernel.org/r/20221113064513.14028-1-shangxiaojing@huawei.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e6f108bf

scsi: scsi_debug: Fix possible name leak in sdebug_add_host_helper() · e6d773f9

Yang Yingliang authored Nov 12, 2022

Afer commit 1fa5ae85 ("driver core: get rid of struct device's bus_id
string array"), the name of device is allocated dynamically, it needs be
freed when device_register() returns error.

As comment of device_register() says, one should use put_device() to give
up the reference in the error path. Fix this by calling put_device(), then
the name can be freed in kobject_cleanup(), and sdbg_host is freed in
sdebug_release_adapter().

When the device release is not set, it means the device is not initialized.
We can not call put_device() in this case. Use kfree() to free memory.

Fixes: 1fa5ae85 ("driver core: get rid of struct device's bus_id string array")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221112131010.3757845-1-yangyingliang@huawei.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e6d773f9

scsi: fcoe: Fix possible name leak when device_register() fails · 47b6a122

Yang Yingliang authored Nov 12, 2022

If device_register() returns an error, the name allocated by dev_set_name()
needs to be freed. As the comment of device_register() says, one should use
put_device() to give up the reference in the error path. Fix this by
calling put_device(), then the name can be freed in kobject_cleanup().

The 'fcf' is freed in fcoe_fcf_device_release(), so the kfree() in the
error path can be removed.

The 'ctlr' is freed in fcoe_ctlr_device_release(), so don't use the error
label, just return NULL after calling put_device().

Fixes: 9a74e884 ("[SCSI] libfcoe: Add fcoe_sysfs")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221112094310.3633291-1-yangyingliang@huawei.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

47b6a122

25 Nov, 2022 9 commits

scsi: scsi_debug: Fix a warning in resp_report_zones() · 07f2ca13

Harshit Mogalapalli authored Nov 11, 2022

As 'alloc_len' is user controlled data, if user tries to allocate memory
larger than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack
trace and messes up dmesg with a warning.

Add __GFP_NOWARN in order to avoid too large allocation warning. This is
detected by static analysis using smatch.

Fixes: 7db0e0c8 ("scsi: scsi_debug: Fix buffer size of REPORT ZONES command")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Link: https://lore.kernel.org/r/20221112070612.2121535-1-harshit.m.mogalapalli@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

07f2ca13

scsi: scsi_debug: Fix a warning in resp_verify() · ed0f17b7

Harshit Mogalapalli authored Nov 11, 2022

As 'vnum' is controlled by user, so if user tries to allocate memory larger
than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack trace and
messes up dmesg with a warning.

Add __GFP_NOWARN in order to avoid too large allocation warning. This is
detected by static analysis using smatch.

Fixes: c3e2fe92 ("scsi: scsi_debug: Implement VERIFY(10), add VERIFY(16)")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Link: https://lore.kernel.org/r/20221112070031.2121068-1-harshit.m.mogalapalli@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

ed0f17b7

scsi: efct: Fix possible memleak in efct_device_init() · bb0cd225

Chen Zhongjin authored Nov 11, 2022

In efct_device_init(), when efct_scsi_reg_fc_transport() fails,
efct_scsi_tgt_driver_exit() is not called to release memory for
efct_scsi_tgt_driver_init() and causes memleak:

unreferenced object 0xffff8881020ce000 (size 2048):
  comm "modprobe", pid 465, jiffies 4294928222 (age 55.872s)
  backtrace:
    [<0000000021a1ef1b>] kmalloc_trace+0x27/0x110
    [<000000004c3ed51c>] target_register_template+0x4fd/0x7b0 [target_core_mod]
    [<00000000f3393296>] efct_scsi_tgt_driver_init+0x18/0x50 [efct]
    [<00000000115de533>] 0xffffffffc0d90011
    [<00000000d608f646>] do_one_initcall+0xd0/0x4e0
    [<0000000067828cf1>] do_init_module+0x1cc/0x6a0
    ...

Fixes: 4df84e84 ("scsi: elx: efct: Driver initialization routines")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Link: https://lore.kernel.org/r/20221111074046.57061-1-chenzhongjin@huawei.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

bb0cd225

scsi: ufs: core: Fix unnecessary operation for early return · 222d227f

ChanWoo Lee authored Nov 11, 2022

Setting bitmap_len is not required when returning early. Defer until it is
needed.
Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com>
Link: https://lore.kernel.org/r/20221111062301.7423-1-cw9316.lee@samsung.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

222d227f

scsi: ufs: core: Switch 'check_for_bkops' to bool · 5277326d

ChanWoo Lee authored Nov 11, 2022

Only checks true and false so it can be converted to bool.
Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com>
Link: https://lore.kernel.org/r/20221111062209.7365-1-cw9316.lee@samsung.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

5277326d

scsi: ufs: core: Separate function name and message · 859ed37c

ChanWoo Lee authored Nov 11, 2022

Separate the function name and message to make it easier to check the log.
Modify messages to fit the format of others.
Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com>
Link: https://lore.kernel.org/r/20221111062126.7307-1-cw9316.lee@samsung.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

859ed37c

scsi: hpsa: Fix possible memory leak in hpsa_add_sas_device() · fda34a5d

Yang Yingliang authored Nov 11, 2022

If hpsa_sas_port_add_rphy() returns an error, the 'rphy' allocated in
sas_end_device_alloc() needs to be freed. Address this by calling
sas_rphy_free() in the error path.

Fixes: d04e62b9 ("hpsa: add in sas transport class")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221111043012.1074466-1-yangyingliang@huawei.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

fda34a5d

scsi: hpsa: Fix error handling in hpsa_add_sas_host() · 4ef174a3

Yang Yingliang authored Nov 10, 2022

hpsa_sas_port_add_phy() does:
  ...
  sas_phy_add()  -> may return error here
  sas_port_add_phy()
  ...

Whereas hpsa_free_sas_phy() does:
  ...
  sas_port_delete_phy()
  sas_phy_delete()
  ...

If hpsa_sas_port_add_phy() returns an error, hpsa_free_sas_phy() can not be
called to free the memory because the port and the phy have not been added
yet.

Replace hpsa_free_sas_phy() with sas_phy_free() and kfree() to avoid kernel
crash in this case.

Fixes: d04e62b9 ("hpsa: add in sas transport class")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221110151129.394389-1-yangyingliang@huawei.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

4ef174a3

scsi: mpt3sas: Fix possible resource leaks in mpt3sas_transport_port_add() · 78316e9d

Yang Yingliang authored Nov 09, 2022

In mpt3sas_transport_port_add(), if sas_rphy_add() returns error,
sas_rphy_free() needs be called to free the resource allocated in
sas_end_device_alloc(). Otherwise a kernel crash will happen:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000108
CPU: 45 PID: 37020 Comm: bash Kdump: loaded Tainted: G        W          6.1.0-rc1+ #189
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : device_del+0x54/0x3d0
lr : device_del+0x37c/0x3d0
Call trace:
 device_del+0x54/0x3d0
 attribute_container_class_device_del+0x28/0x38
 transport_remove_classdev+0x6c/0x80
 attribute_container_device_trigger+0x108/0x110
 transport_remove_device+0x28/0x38
 sas_rphy_remove+0x50/0x78 [scsi_transport_sas]
 sas_port_delete+0x30/0x148 [scsi_transport_sas]
 do_sas_phy_delete+0x78/0x80 [scsi_transport_sas]
 device_for_each_child+0x68/0xb0
 sas_remove_children+0x30/0x50 [scsi_transport_sas]
 sas_rphy_remove+0x38/0x78 [scsi_transport_sas]
 sas_port_delete+0x30/0x148 [scsi_transport_sas]
 do_sas_phy_delete+0x78/0x80 [scsi_transport_sas]
 device_for_each_child+0x68/0xb0
 sas_remove_children+0x30/0x50 [scsi_transport_sas]
 sas_remove_host+0x20/0x38 [scsi_transport_sas]
 scsih_remove+0xd8/0x420 [mpt3sas]

Because transport_add_device() is not called when sas_rphy_add() fails, the
device is not added. When sas_rphy_remove() is subsequently called to
remove the device in the remove() path, a NULL pointer dereference happens.

Fixes: f92363d1 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221109032403.1636422-1-yangyingliang@huawei.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

78316e9d

24 Nov, 2022 11 commits

scsi: hpsa: Fix possible memory leak in hpsa_init_one() · 9c9ff300

Yuan Can authored Nov 22, 2022

The hpda_alloc_ctlr_info() allocates h and its field reply_map. However, in
hpsa_init_one(), if alloc_percpu() failed, the hpsa_init_one() jumps to
clean1 directly, which frees h and leaks the h->reply_map.

Fix by calling hpda_free_ctlr_info() to release h->replay_map and h instead
free h directly.

Fixes: 8b834bff ("scsi: hpsa: fix selection of reply queue")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Link: https://lore.kernel.org/r/20221122015751.87284-1-yuancan@huawei.comReviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

9c9ff300

scsi: core: Do not increase scsi_device's iorequest_cnt if dispatch failed · cfee29ff

Wenchao Hao authored Nov 23, 2022

If scsi_dispatch_cmd() failed, the SCSI command was not sent to the target.
scsi_queue_rq() would return BLK_STS_RESOURCE if scsi_dispatch_cmd()
failed, and the related request would be requeued. The timeout of this
request would not fire, so noone would increase iodone_cnt.
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Link: https://lore.kernel.org/r/20221123122137.150776-3-haowenchao@huawei.comReviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

cfee29ff

scsi: core: Increase scsi_device's iodone_cnt in scsi_timeout() · ec9780e4

Wenchao Hao authored Nov 23, 2022

If a SCSI command times out and is going to be aborted, we should increase
the iodone_cnt of the related scsi_device. Otherwise the iodone_cnt would
be smaller than iorequest_cnt.

Increasing iodone_cnt in scsi_timeout() would not cause a double accounting
issue. Brief analysis follows:

 - We add the iodone_cnt when BLK_EH_DONE is returned in
   scsi_timeout(). The related command's timeout event would not happen.

 - If the abort succeeds and the command is not retried, the command would
   be completed with scsi_finish_command() which would not increase
   iodone_cnt.

 - If the abort succeeds and the command is retried, it would be requeue. A
   scsi_dispatch_cmd() would be called and iorequest_cnt would be increased
   again.

 - If the abort fails, the error handler successfully recovers the device,
   and the command is not retried, the command would be completed with
   scsi_finish_command() which would not increase iodone_cnt.

 - If the abort fails, the error handler successfully recovers the device,
   and the command is retried, the iorequest_cnt would be increased again.
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Link: https://lore.kernel.org/r/20221123122137.150776-2-haowenchao@huawei.comReviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

ec9780e4

scsi: iscsi: Rename iscsi_set_param() to iscsi_if_set_param() · 0c26a2d7

Wenchao Hao authored Nov 22, 2022

There are two iscsi_set_param() functions defined in libiscsi.c and
scsi_transport_iscsi.c respectively which is confusing.

Rename the one in scsi_transport_iscsi.c to iscsi_if_set_param().
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Link: https://lore.kernel.org/r/20221122181105.4123935-1-haowenchao@huawei.comReviewed-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

0c26a2d7

scsi: target: core: Fix hard lockup when executing a compare-and-write command · a72629b5

Maurizio Lombardi authored Nov 21, 2022

While handling an I/O completion for the compare portion of a
COMPARE_AND_WRITE command, it may happen that the
compare_and_write_callback function submits new bio structs while still in
softirq context.

Low level drivers like md raid5 do not expect their make_request call to be
used in softirq context, they call into schedule() and create a deadlocked
system.

 __schedule at ffffffff873a0807
 schedule at ffffffff873a0cc5
 raid5_get_active_stripe at ffffffffc0875744 [raid456]
 raid5_make_request at ffffffffc0875a50 [raid456]
 md_handle_request at ffffffff8713b9f9
 md_make_request at ffffffff8713bacb
 generic_make_request at ffffffff86e6f14b
 submit_bio at ffffffff86e6f27c
 iblock_submit_bios at ffffffffc0b4e4dc [target_core_iblock]
 iblock_execute_rw at ffffffffc0b4f3ce [target_core_iblock]
 __target_execute_cmd at ffffffffc1090079 [target_core_mod]
 compare_and_write_callback at ffffffffc1093602 [target_core_mod]
 target_cmd_interrupted at ffffffffc108d1ec [target_core_mod]
 target_complete_cmd_with_sense at ffffffffc108d27c [target_core_mod]
 iblock_complete_cmd at ffffffffc0b4e23a [target_core_iblock]
 dm_io_dec_pending at ffffffffc00db29e [dm_mod]
 clone_endio at ffffffffc00dbf07 [dm_mod]
 raid5_align_endio at ffffffffc086d6c2 [raid456]
 blk_update_request at ffffffff86e6d950
 scsi_end_request at ffffffff87063d48
 scsi_io_completion at ffffffff87063ee8
 blk_complete_reqs at ffffffff86e77b05
 __softirqentry_text_start at ffffffff876000d7

This problem appears to be an issue between target_cmd_interrupted() and
compare_and_write_callback(). target_cmd_interrupted() calls the se_cmd's
transport_complete_callback function pointer if the se_cmd is being stopped
or aborted, and CMD_T_ABORTED was set on the se_cmd.

When calling compare_and_write_callback(), the success parameter was set to
false. target_cmd_interrupted() seems to expect this means the callback
will do cleanup that does not require a process context. But
compare_and_write_callback() ignores the parameter if there was I/O done
for the compare part of COMPARE_AND_WRITE.

Since there was data, the function continued on, passed the compare, and
issued a write while ignoring the value of the success parameter.  The
submit of a bio for the write portion of the COMPARE_AND_WRITE then causes
schedule to be unsafely called from the softirq context.

Fix the bug in compare_and_write_callback by jumping to the out label if
success == "false", after checking if we have been called by
transport_generic_request_failure(); The command is being aborted or
stopped so there is no need to submit the write bio for the write part of
the COMPARE_AND_WRITE command.
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Link: https://lore.kernel.org/r/20221121092703.316489-1-mlombard@redhat.comReviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

a72629b5

scsi: target: iscsi: Fix a race condition between login_work and the login thread · fec1b2fa

Maurizio Lombardi authored Nov 15, 2022

In case a malicious initiator sends some random data immediately after a
login PDU; the iscsi_target_sk_data_ready() callback will schedule the
login_work and, at the same time, the negotiation may end without clearing
the LOGIN_FLAGS_INITIAL_PDU flag (because no additional PDU exchanges are
required to complete the login).

The login has been completed but the login_work function will find the
LOGIN_FLAGS_INITIAL_PDU flag set and will never stop from rescheduling
itself; at this point, if the initiator drops the connection, the
iscsit_conn structure will be freed, login_work will dereference a released
socket structure and the kernel crashes.

BUG: kernel NULL pointer dereference, address: 0000000000000230
PF: supervisor write access in kernel mode
PF: error_code(0x0002) - not-present page
Workqueue: events iscsi_target_do_login_rx [iscsi_target_mod]
RIP: 0010:_raw_read_lock_bh+0x15/0x30
Call trace:
 iscsi_target_do_login_rx+0x75/0x3f0 [iscsi_target_mod]
 process_one_work+0x1e8/0x3c0

Fix this bug by forcing login_work to stop after the login has been
completed and the socket callbacks have been restored.

Add a comment to clearify the return values of iscsi_target_do_login()
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Link: https://lore.kernel.org/r/20221115125638.102517-1-mlombard@redhat.comReviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

fec1b2fa

scsi: target: core: Change the way target_xcopy_do_work() sets restiction on max I/O · 689d94ec

Anastasia Kovaleva authored Nov 14, 2022

To determine how many blocks sends in one command, the minimum value is
selected from the hw_max_sectors of both devices. In target_xcopy_do_work,
hw_max_sectors are used as blocks, not sectors; it also ignores the fact
that sectors can be of different sizes, for example 512 and 4096
bytes. Because of this, a number of blocks can be transmitted that the
device will not be able to accept.

Change the selection of max transmission size into bytes.
Reviewed-by: Konstantin Shelekhin <k.shelekhin@yadro.com>
Reviewed-by: Dmitriy Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Link: https://lore.kernel.org/r/20221114102500.88892-4-a.kovaleva@yadro.comReviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

689d94ec

scsi: target: core: Make hw_max_sectors store the sectors amount in blocks · 9375031e

Anastasia Kovaleva authored Nov 14, 2022

By default, hw_max_sectors stores its value in 512 blocks in iblock,
despite the fact that the block size can be 4096 bytes. Change
hw_max_sectors to store the number of sectors in hw_block_size blocks.
Reviewed-by: Konstantin Shelekhin <k.shelekhin@yadro.com>
Reviewed-by: Dmitriy Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Link: https://lore.kernel.org/r/20221114102500.88892-3-a.kovaleva@yadro.comReviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

9375031e

scsi: target: core: Send max transfer length in blocks · 7870d248

Anastasia Kovaleva authored Nov 14, 2022

A MAXIMUM TRANSFER LENGTH value indicates the maximum transfer length in
logical blocks that the device server accepts for a single command. Fix
function sending the length in sectors instead of blocks.

This patch also removes the special casing for fileio in block_size_store
since this logic in now unified in spc_emulate_evpd_b0() for all backends.
Reviewed-by: Konstantin Shelekhin <k.shelekhin@yadro.com>
Reviewed-by: Dmitriy Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Link: https://lore.kernel.org/r/20221114102500.88892-2-a.kovaleva@yadro.comReviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7870d248

scsi: lpfc: Remove linux/msi.h include · cdd9344e

Thomas Gleixner authored Nov 13, 2022

Nothing in this file needs anything from linux/msi.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20221113202428.436270297@linutronix.de
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

cdd9344e

scsi: lpfc: Update lpfc version to 14.2.0.9 · d57d98fe

Justin Tee authored Nov 15, 2022

Update lpfc version to 14.2.0.9.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20221116011921.105995-7-justintee8345@gmail.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>

d57d98fe