1. 01 Aug, 2021 8 commits
    • scsi: ufs: ufshpb: L2P map management for HPB read · 4b5f4907
      Daejun Park authored
      Implement L2P map management in HPB.
      
      The HPB divides logical addresses into several regions. A region consists
      of several sub-regions. The sub-region is the basic unit in which L2P
      mapping is managed. The driver loads L2P mapping data per sub-region; a
      loaded sub-region is in the active state. The HPB driver unloads L2P
      mapping data per region; an unloaded region is in the inactive state.
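
      The region/sub-region split can be pictured as plain index arithmetic.
      The following is a minimal user-space sketch, not the driver's code: the
      geometry constants and the names l2p_pos and l2p_locate() are
      hypothetical, and the real sizes come from the UFS device.

```c
#include <stdint.h>

/* Hypothetical geometry, chosen only for illustration. */
#define ENTRIES_PER_SRGN 1024u  /* L2P entries per sub-region */
#define SRGNS_PER_RGN    4u     /* sub-regions per region */

struct l2p_pos {
	uint32_t rgn;     /* region index */
	uint32_t srgn;    /* sub-region index within the region */
	uint32_t offset;  /* entry index within the sub-region */
};

/* Map a logical page number to its region, sub-region and entry offset. */
static struct l2p_pos l2p_locate(uint64_t lpn)
{
	struct l2p_pos pos;
	uint64_t srgn_global = lpn / ENTRIES_PER_SRGN;

	pos.rgn = (uint32_t)(srgn_global / SRGNS_PER_RGN);
	pos.srgn = (uint32_t)(srgn_global % SRGNS_PER_RGN);
	pos.offset = (uint32_t)(lpn % ENTRIES_PER_SRGN);
	return pos;
}
```

      With these assumed sizes, logical page 5000 falls in region 1,
      sub-region 0, at entry offset 904.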
      
      Candidate sub-regions to load and regions to unload are delivered from
      the UFS device. The UFS device uses sense data to tell the driver which
      sub-regions it recommends activating and which regions it recommends
      inactivating. The HPB module performs L2P mapping management on the host
      based on the delivered information.
      
      A pinned region is a preset region on the UFS device that is always
      in the active state.
      
      The data structures for map data requests and L2P mappings use the mempool
      API, minimizing allocation overhead while avoiding static allocation.
      
      The minimum size of the memory pool used by the HPB is implemented
      as a module parameter so that the user can configure it.

      For example, to guarantee a minimum memory pool size of 4 MB, set
      ufshpb_host_map_kbytes=4096.
      
      map_work manages activation and inactivation through two "to-do" lists:
      
       - hpb->lh_inact_rgn: regions to be inactivated
       - hpb->lh_act_srgn: subregions to be activated
      
      These lists are maintained on I/O completion.
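
      The interplay between the two lists and the deferred work can be
      modelled in user space. This is only a sketch under stated assumptions:
      fixed-size arrays stand in for the kernel lists, all names other than
      the two list names are hypothetical, and the order in which the lists
      are drained (inactivation first) is an assumption, not taken from the
      driver.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_RGNS      4   /* hypothetical geometry */
#define SRGNS_PER_RGN 2
#define MAX_PENDING   8   /* hypothetical list capacity */

struct srgn_id {
	int rgn;
	int srgn;
};

struct hpb_model {
	int inact_rgn[MAX_PENDING];            /* stands in for hpb->lh_inact_rgn */
	size_t n_inact;
	struct srgn_id act_srgn[MAX_PENDING];  /* stands in for hpb->lh_act_srgn */
	size_t n_act;
	bool active[NUM_RGNS][SRGNS_PER_RGN];  /* true while L2P data is loaded */
};

/* I/O completion path: note a region the device wants inactivated. */
static bool queue_inactivate(struct hpb_model *h, int rgn)
{
	if (rgn < 0 || rgn >= NUM_RGNS || h->n_inact == MAX_PENDING)
		return false;
	h->inact_rgn[h->n_inact++] = rgn;
	return true;
}

/* I/O completion path: note a sub-region the device wants activated. */
static bool queue_activate(struct hpb_model *h, int rgn, int srgn)
{
	if (rgn < 0 || rgn >= NUM_RGNS || srgn < 0 || srgn >= SRGNS_PER_RGN ||
	    h->n_act == MAX_PENDING)
		return false;
	h->act_srgn[h->n_act].rgn = rgn;
	h->act_srgn[h->n_act].srgn = srgn;
	h->n_act++;
	return true;
}

/* Deferred worker: drain both lists (inactivation first is an assumption). */
static void run_map_work(struct hpb_model *h)
{
	size_t i;
	int s;

	/* Unloading happens per region, so clear every sub-region in it. */
	for (i = 0; i < h->n_inact; i++)
		for (s = 0; s < SRGNS_PER_RGN; s++)
			h->active[h->inact_rgn[i]][s] = false;
	h->n_inact = 0;

	/* Loading happens per sub-region. */
	for (i = 0; i < h->n_act; i++)
		h->active[h->act_srgn[i].rgn][h->act_srgn[i].srgn] = true;
	h->n_act = 0;
}
```

      Keeping the completion path to a cheap enqueue and doing the map
      updates later in a worker is the design point the model tries to show.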
      
      [mkp: switch to REQ_OP_DRV_*]
      
      Link: https://lore.kernel.org/r/20210712085859epcms2p36e420f19564f6cd0c4a45d54949619eb@epcms2p3
      Tested-by: Bean Huo <beanhuo@micron.com>
      Tested-by: Can Guo <cang@codeaurora.org>
      Tested-by: Stanley Chu <stanley.chu@mediatek.com>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Can Guo <cang@codeaurora.org>
      Reviewed-by: Bean Huo <beanhuo@micron.com>
      Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
      Acked-by: Avri Altman <Avri.Altman@wdc.com>
      Signed-off-by: Daejun Park <daejun7.park@samsung.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: ufs: ufshpb: Introduce Host Performance Buffer feature · f02bc975
      Daejun Park authored
      Implement Host Performance Buffer (HPB) initialization and add function
      calls to UFS core driver.
      
      NAND flash-based storage devices, including UFS, have mechanisms to
      translate logical addresses of I/O requests to the corresponding physical
      addresses of the flash storage. In UFS, the logical-to-physical (L2P)
      address map data, which is required to identify the physical address for
      the requested I/Os, can only be partially loaded from NAND flash into
      SRAM. Because of this partial loading, accessing a flash address area
      whose L2P information is not loaded in SRAM can result in serious
      performance degradation.
      
      The basic concept of HPB is to cache L2P mapping entries in host system
      memory so that both the physical block address (PBA) and the logical
      block address (LBA) can be delivered in an HPB read command. An HPB read
      command reads data faster than a regular UFS read command because it
      provides the physical address (HPB entry) of the desired logical block in
      addition to its logical address. The UFS device can then access the
      physical block in NAND directly, without searching for and loading the
      L2P mapping table. This improves read performance because the NAND read
      operation that loads the L2P mapping table is eliminated.
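
      The choice between a regular read and an HPB read can be sketched as a
      host-side cache lookup. This is a simplified user-space illustration
      under stated assumptions: the flat per-LBA cache, the structure layouts
      and every name below are hypothetical and do not mirror the driver's
      data structures or the UFS command format.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_ENTRIES 8u  /* hypothetical cache size, one entry per LBA */

struct l2p_cache {
	bool valid[CACHE_ENTRIES];    /* entry holds a cached mapping */
	uint64_t pba[CACHE_ENTRIES];  /* cached physical block address */
};

enum read_kind { READ_REGULAR, READ_HPB };

struct read_cmd {
	enum read_kind kind;
	uint64_t lba;  /* always sent */
	uint64_t pba;  /* meaningful only for READ_HPB */
};

/* Build an HPB read on a cache hit, otherwise fall back to a regular read. */
static struct read_cmd build_read(const struct l2p_cache *cache, uint64_t lba)
{
	struct read_cmd cmd = { READ_REGULAR, lba, 0 };

	if (lba < CACHE_ENTRIES && cache->valid[lba]) {
		cmd.kind = READ_HPB;
		cmd.pba = cache->pba[lba];
	}
	return cmd;
}
```

      On a miss the device still resolves the address itself, so correctness
      does not depend on the cache; only the hit path saves the extra NAND
      read.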
      
      During HPB initialization, the host checks whether the UFS device
      supports the HPB feature and retrieves the related device capabilities.
      The HPB parameters are then configured in the device.
      
      The total start-up time of popular applications was measured with HPB
      enabled and with HPB disabled, and the two were compared. The
      applications were 12 game apps and 24 non-game apps, and each test cycle
      consisted of running all 36 in sequence. The cycle was repeated to
      observe the performance improvement from L2P mapping cache hits in HPB.
      
      The following is the test environment:
      
       - kernel version: 4.4.0
       - RAM: 8GB
       - UFS 2.1 (64GB)
      
      Results:
      
         +-------+----------+----------+-------+
         | cycle | baseline | with HPB | diff  |
         +-------+----------+----------+-------+
         | 1     | 272.4    | 264.9    | -7.5  |
         | 2     | 250.4    | 248.2    | -2.2  |
         | 3     | 226.2    | 215.6    | -10.6 |
         | 4     | 230.6    | 214.8    | -15.8 |
         | 5     | 232.0    | 218.1    | -13.9 |
         | 6     | 231.9    | 212.6    | -19.3 |
         +-------+----------+----------+-------+
      
      We also measured HPB performance using iozone:
      
         $ iozone -r 4k -+n -i2 -ecI -t 16 -l 16 -u 16 -s $IO_RANGE/16 -F \
         mnt/tmp_1 mnt/tmp_2 mnt/tmp_3 mnt/tmp_4 mnt/tmp_5 mnt/tmp_6 mnt/tmp_7 \
         mnt/tmp_8 mnt/tmp_9 mnt/tmp_10 mnt/tmp_11 mnt/tmp_12 mnt/tmp_13 \
         mnt/tmp_14 mnt/tmp_15 mnt/tmp_16
      
      Results:
      
         +----------+--------+---------+
         | IO range | HPB on | HPB off |
         +----------+--------+---------+
         |   1 GB   | 294.8  | 300.87  |
         |   4 GB   | 293.51 | 179.35  |
         |   8 GB   | 294.85 | 162.52  |
         |  16 GB   | 293.45 | 156.26  |
         |  32 GB   | 277.4  | 153.25  |
         +----------+--------+---------+
      
      Link: https://lore.kernel.org/r/20210712085830epcms2p8c1288b7f7a81b044158a18232617b572@epcms2p8
      Reported-by: kernel test robot <lkp@intel.com>
      Tested-by: Bean Huo <beanhuo@micron.com>
      Tested-by: Can Guo <cang@codeaurora.org>
      Tested-by: Stanley Chu <stanley.chu@mediatek.com>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Can Guo <cang@codeaurora.org>
      Reviewed-by: Bean Huo <beanhuo@micron.com>
      Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
      Acked-by: Avri Altman <Avri.Altman@wdc.com>
      Signed-off-by: Daejun Park <daejun7.park@samsung.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: qla4xxx: Convert uses of __constant_cpu_to_<foo> to cpu_to_<foo> · 33529018
      Dwaipayan Ray authored
      The macros cpu_to_le16() and cpu_to_le32() have special cases for
      constants.  Their __constant_<foo> versions are not required.
      
      On little-endian systems, both cpu_to_le16() and __constant_cpu_to_le16()
      expand to the same expression. The same holds for cpu_to_le32().
      
      On big endian systems, cpu_to_le16() expands to __swab16() which has a
      __builtin_constant_p check. Similarly, cpu_to_le32() expands to __swab32().
      
      Consequently, these macros can be safely used with constants, and all
      such uses are converted. This was discovered during a checkpatch
      evaluation that looked at all reports of the WARNING:CONSTANT_CONVERSION
      type.
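
      The __builtin_constant_p pattern can be shown in isolation. The sketch
      below is a user-space illustration that assumes GCC or Clang; the names
      CONST_SWAB16, runtime_swab16() and MY_SWAB16 are hypothetical stand-ins,
      not the kernel's macros.

```c
#include <stdint.h>

/* Pure constant expression: the compiler can fold it for constant input. */
#define CONST_SWAB16(x) \
	((uint16_t)((((uint16_t)(x) & 0x00ffu) << 8) | \
		    (((uint16_t)(x) & 0xff00u) >> 8)))

/* Runtime path: an ordinary function doing the same byte swap. */
static inline uint16_t runtime_swab16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/*
 * One macro serves both cases: constants take the foldable expression,
 * everything else takes the function. A separate "constant" variant is
 * therefore unnecessary for callers.
 */
#define MY_SWAB16(x) \
	(__builtin_constant_p(x) ? CONST_SWAB16(x) : runtime_swab16(x))
```

      Both MY_SWAB16(0x1234) and MY_SWAB16(v) for a runtime variable v give
      the byte-swapped value; only the code the compiler emits differs.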
      
      Link: https://lore.kernel.org/r/20210716112852.24598-1-dwaipayanray1@gmail.com
      Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: BusLogic: Use %X for u32 sized integer rather than %lX · 2127cd21
      Colin Ian King authored
      An earlier fix changed the print format specifier for adapter->bios_addr to
      use %lX. However, the integer is a u32 so the fix was wrong. Fix this by
      using the correct %X format specifier.
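
      The same class of mismatch can be reproduced outside the kernel. In the
      kernel, u32 is an unsigned int, so %X is the matching specifier; in
      portable user-space C the equivalent for uint32_t is the PRIX32 macro.
      A minimal sketch follows, where format_bios_addr() is a hypothetical
      helper and not a driver function:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit address in upper-case hex. Using "%lX" here would tell
 * printf to expect an unsigned long, which does not match a uint32_t
 * argument on platforms where long is 64 bits wide.
 */
static int format_bios_addr(char *buf, size_t len, uint32_t bios_addr)
{
	return snprintf(buf, len, "0x%" PRIX32, bios_addr);
}
```

      Compilers typically flag the mismatched variant with a format warning,
      which is how this kind of bug is usually caught.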
      
      Link: https://lore.kernel.org/r/20210730095031.26981-1-colin.king@canonical.com
      Fixes: 43622697 ("scsi: BusLogic: use %lX for unsigned long rather than %X")
      Acked-by: Khalid Aziz <khalid@gonehiking.org>
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Addresses-Coverity: ("Invalid type in argument")
    • scsi: BusLogic: Avoid unbounded vsprintf() use · a40662c9
      Maciej W. Rozycki authored
      Existing blogic_msg() invocations do not appear to overrun its internal
      buffer, which has a fixed length of 100; an overrun would cause stack
      corruption. However, such an overrun would be easy to introduce
      unnoticed in future updates, and a fix is cheap in performance terms, so
      limit the output written to the buffer by using vscnprintf() rather than
      vsprintf().
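
      vscnprintf() is a kernel function, but its contract can be approximated
      in user space on top of the standard vsnprintf(). The sketch below is an
      assumption-labelled analogue, not the kernel implementation:
      bounded_vformat() and bounded_format() are hypothetical names, and the
      return value is the number of characters actually stored, excluding the
      terminating NUL.

```c
#include <stdarg.h>
#include <stdio.h>

/* Write at most size - 1 characters plus a NUL; return characters stored. */
static int bounded_vformat(char *buf, size_t size, const char *fmt,
			   va_list args)
{
	int n;

	if (size == 0)
		return 0;

	n = vsnprintf(buf, size, fmt, args);
	if (n < 0) {              /* encoding error: leave an empty string */
		buf[0] = '\0';
		return 0;
	}
	if ((size_t)n >= size)    /* output was truncated to fit */
		return (int)(size - 1);
	return n;
}

/* Variadic convenience wrapper around bounded_vformat(). */
static int bounded_format(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = bounded_vformat(buf, size, fmt, args);
	va_end(args);
	return n;
}
```

      Unlike vsprintf(), a call with an 8-byte buffer and a 10-character
      string stores 7 characters and a NUL instead of writing past the end.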
      
      Link: https://lore.kernel.org/r/alpine.DEB.2.21.2104201939390.44318@angie.orcam.me.uk
      Acked-by: Khalid Aziz <khalid@gonehiking.org>
      Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: BusLogic: Fix missing pr_cont() use · 44d01fc8
      Maciej W. Rozycki authored
      Update BusLogic driver's messaging system to use pr_cont() for continuation
      lines, bringing messy output:
      
      pci 0000:00:13.0: PCI->APIC IRQ transform: INT A -> IRQ 17
      scsi: ***** BusLogic SCSI Driver Version 2.1.17 of 12 September 2013 *****
      scsi: Copyright 1995-1998 by Leonard N. Zubkoff <lnz@dandelion.com>
      scsi0: Configuring BusLogic Model BT-958 PCI Wide Ultra SCSI Host Adapter
      scsi0:   Firmware Version: 5.07B, I/O Address: 0x7000, IRQ Channel: 17/Level
      scsi0:   PCI Bus: 0, Device: 19, Address:
      0xE0012000,
      Host Adapter SCSI ID: 7
      scsi0:   Parity Checking: Enabled, Extended Translation: Enabled
      scsi0:   Synchronous Negotiation: Ultra, Wide Negotiation: Enabled
      scsi0:   Disconnect/Reconnect: Enabled, Tagged Queuing: Enabled
      scsi0:   Scatter/Gather Limit: 128 of 8192 segments, Mailboxes: 211
      scsi0:   Driver Queue Depth: 211, Host Adapter Queue Depth: 192
      scsi0:   Tagged Queue Depth:
      Automatic
      , Untagged Queue Depth: 3
      scsi0:   SCSI Bus Termination: Both Enabled
      , SCAM: Disabled
      
      scsi0: *** BusLogic BT-958 Initialized Successfully ***
      scsi host0: BusLogic BT-958
      
      back to order:
      
      pci 0000:00:13.0: PCI->APIC IRQ transform: INT A -> IRQ 17
      scsi: ***** BusLogic SCSI Driver Version 2.1.17 of 12 September 2013 *****
      scsi: Copyright 1995-1998 by Leonard N. Zubkoff <lnz@dandelion.com>
      scsi0: Configuring BusLogic Model BT-958 PCI Wide Ultra SCSI Host Adapter
      scsi0:   Firmware Version: 5.07B, I/O Address: 0x7000, IRQ Channel: 17/Level
      scsi0:   PCI Bus: 0, Device: 19, Address: 0xE0012000, Host Adapter SCSI ID: 7
      scsi0:   Parity Checking: Enabled, Extended Translation: Enabled
      scsi0:   Synchronous Negotiation: Ultra, Wide Negotiation: Enabled
      scsi0:   Disconnect/Reconnect: Enabled, Tagged Queuing: Enabled
      scsi0:   Scatter/Gather Limit: 128 of 8192 segments, Mailboxes: 211
      scsi0:   Driver Queue Depth: 211, Host Adapter Queue Depth: 192
      scsi0:   Tagged Queue Depth: Automatic, Untagged Queue Depth: 3
      scsi0:   SCSI Bus Termination: Both Enabled, SCAM: Disabled
      scsi0: *** BusLogic BT-958 Initialized Successfully ***
      scsi host0: BusLogic BT-958
      
      Also diagnostic output such as with the BusLogic=TraceConfiguration
      parameter is affected and becomes vertical and therefore hard to read.
      This has now been corrected, e.g.:
      
      pci 0000:00:13.0: PCI->APIC IRQ transform: INT A -> IRQ 17
      blogic_cmd(86) Status = 30:  4 ==>  4: FF 05 93 00
      blogic_cmd(95) Status = 28: (Modify I/O Address)
      blogic_cmd(91) Status = 30:  1 ==>  1: 01
      blogic_cmd(04) Status = 30:  4 ==>  4: 41 41 35 30
      blogic_cmd(8D) Status = 30: 14 ==> 14: 45 DC 00 20 00 00 00 00 00 40 30 37 42 1D
      scsi: ***** BusLogic SCSI Driver Version 2.1.17 of 12 September 2013 *****
      scsi: Copyright 1995-1998 by Leonard N. Zubkoff <lnz@dandelion.com>
      blogic_cmd(04) Status = 30:  4 ==>  4: 41 41 35 30
      blogic_cmd(0B) Status = 30:  3 ==>  3: 00 08 07
      blogic_cmd(0D) Status = 30: 34 ==> 34: 03 01 07 04 00 00 00 00 00 00 00 00 00 00 00 00 FF 42 44 46 FF 00 00 00 00 00 00 00 00 00 FF 00 FF 00
      blogic_cmd(8D) Status = 30: 14 ==> 14: 45 DC 00 20 00 00 00 00 00 40 30 37 42 1D
      blogic_cmd(84) Status = 30:  1 ==>  1: 37
      blogic_cmd(8B) Status = 30:  5 ==>  5: 39 35 38 20 20
      blogic_cmd(85) Status = 30:  1 ==>  1: 42
      blogic_cmd(86) Status = 30:  4 ==>  4: FF 05 93 00
      blogic_cmd(91) Status = 30: 64 ==> 64: 41 46 3E 20 39 35 38 20 20 00 C4 00 04 01 07 2F 07 04 35 FF FF FF FF FF FF FF FF FF FF 01 00 FE FF 08 FF FF 00 00 00 00 00 00 00 01 00 01 00 00 FF FF 00 00 00 00 00 00 00 00 00 00 00 00 00 FC
      scsi0: Configuring BusLogic Model BT-958 PCI Wide Ultra SCSI Host Adapter
      
      etc.
      
      Link: https://lore.kernel.org/r/alpine.DEB.2.21.2104201940430.44318@angie.orcam.me.uk
      Fixes: 4bcc595c ("printk: reinstate KERN_CONT for printing continuation lines")
      Cc: stable@vger.kernel.org # v4.9+
      Acked-by: Khalid Aziz <khalid@gonehiking.org>
      Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: bsg-lib: Fix commands without data transfer in bsg_transport_sg_io_fn() · 659a3784
      Christoph Hellwig authored
      Set ret to 0 after the initial permission checks to avoid leaking -EPERM
      for commands without data transfer.
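
      The failure mode is a common one with a shared ret variable. The sketch
      below is a deliberately simplified model, not the bsg code:
      submit_cmd(), do_transfer() and their parameters are hypothetical, and
      only the control flow around ret is meant to match the description.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for mapping and transferring user data. */
static int do_transfer(size_t len)
{
	(void)len;
	return 0;
}

static int submit_cmd(bool permitted, size_t data_len)
{
	int ret = -EPERM;  /* preset so the permission checks can bail out */

	if (!permitted)
		return ret;

	/*
	 * Reset after the checks. Without this line, a command with
	 * data_len == 0 skips the transfer branch below and returns the
	 * stale -EPERM even though it was allowed.
	 */
	ret = 0;

	if (data_len > 0)
		ret = do_transfer(data_len);

	return ret;
}
```

      In this model only the rejected command reports -EPERM; a permitted
      command without data returns 0.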
      
      Link: https://lore.kernel.org/r/20210731074027.1185545-3-hch@lst.de
      Fixes: 75ca5640 ("scsi: bsg: Move the whole request execution into the SCSI/transport handlers")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: bsg: Fix commands without data transfer in scsi_bsg_sg_io_fn() · 5c0f6137
      Christoph Hellwig authored
      Set ret to 0 after the initial permission checks to avoid leaking -EPERM
      for commands without data transfer.
      
      Link: https://lore.kernel.org/r/20210731074027.1185545-2-hch@lst.de
      Fixes: 75ca5640 ("scsi: bsg: Move the whole request execution into the SCSI/transport handlers")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  2. 31 Jul, 2021 8 commits
  3. 29 Jul, 2021 24 commits