1. 13 Nov, 2019 10 commits
    • scsi: sd_zbc: Cleanup sd_zbc_alloc_report_buffer() · 23a50861
      Damien Le Moal authored
      There is no need to arbitrarily limit the size of a zone report to the
      number of zones defined by SD_ZBC_REPORT_MAX_ZONES. Rather, simply
      calculate the report buffer size needed for the requested number of
      zones without exceeding the device total number of zones. The existing
      limitation of this buffer size to the hardware maximum transfer size
      and page mapping capabilities is kept unchanged. Starting with this
      initial buffer size, the allocation is optimized by iterating over
      decreasing buffer sizes until the allocation succeeds (each iteration
      is allowed to fail fast using the __GFP_NORETRY flag). This ensures
      forward progress for zone reports and avoids zone revalidation
      failures under memory pressure.
      
      While at it, also replace the hard coded 512 B sector size with the
      SECTOR_SIZE macro.
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
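The sizing and fallback strategy described in this commit message can be sketched in user-space C. This is only an illustration under stated assumptions, not the driver code: `alloc_report_buffer()`, `alloc_fn` and `limited_alloc()` are hypothetical names, the 64 B header and 64 B per-zone descriptor sizes follow the ZBC report zones layout, and a plain allocator callback stands in for a fail-fast kernel allocation. The sketch also rounds each retry size down to a whole number of sectors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SECTOR_SIZE    512u  /* same value as the kernel macro */
#define ZONE_DESC_SIZE 64u   /* assumed: 64 B header + 64 B per zone descriptor */

/* Hypothetical allocator hook so the fallback path can be exercised. */
typedef void *(*alloc_fn)(size_t);

/* Hypothetical allocator that refuses anything above 4 KiB. */
static void *limited_alloc(size_t n)
{
    return n > 4096 ? NULL : malloc(n);
}

/*
 * Size the buffer for nr_zones (capped at the device zone count and at
 * max_bytes), then retry with roughly half the size until an allocation
 * succeeds. Returns NULL if not even one sector can be allocated.
 */
static void *alloc_report_buffer(unsigned int nr_zones,
                                 unsigned int dev_nr_zones,
                                 size_t max_bytes, alloc_fn alloc,
                                 size_t *buflen)
{
    size_t bufsize;

    assert(alloc && buflen && max_bytes >= SECTOR_SIZE);

    if (nr_zones > dev_nr_zones)
        nr_zones = dev_nr_zones;

    /* One header plus one descriptor per zone, rounded up to a sector. */
    bufsize = ((size_t)nr_zones + 1) * ZONE_DESC_SIZE;
    bufsize = (bufsize + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
    if (bufsize > max_bytes)
        bufsize = max_bytes / SECTOR_SIZE * SECTOR_SIZE;

    while (bufsize >= SECTOR_SIZE) {
        void *buf = alloc(bufsize);  /* stands in for a fail-fast allocation */

        if (buf) {
            *buflen = bufsize;
            return buf;
        }
        /* Halve, keeping the size a whole number of sectors. */
        bufsize = bufsize / 2 / SECTOR_SIZE * SECTOR_SIZE;
    }
    return NULL;
}
```

A caller then works with whatever size ended up in `*buflen`, which may hold fewer zone descriptors than originally requested.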
    • null_blk: Add zone_nr_conv to features · 6d09c408
      Damien Le Moal authored
      For a null_blk device with zoned mode enabled, the number of
      conventional zones can be configured through configfs with the
      zone_nr_conv parameter. Add this missing parameter in the features
      string.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • null_blk: clean up report zones · 7fc8fb51
      Christoph Hellwig authored
      Make the instance name match the method name and define the name to NULL
      instead of providing an inline stub, which is rather pointless for a
      method call.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • null_blk: clean up the block device operations · e3f89564
      Christoph Hellwig authored
      Remove the pointless stub open and release methods, give the operations
      vector a slightly less confusing name, and use normal alignment for the
      assignment operators.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: Remove partition support for zoned block devices · 5eac3eb3
      Damien Le Moal authored
      No known partitioning tool supports zoned block devices, especially the
      host managed flavor with strong sequential write constraints.
      Furthermore, there are no known users or use cases for partitioned
      zoned block devices.
      
      This patch removes partition device creation for zoned block devices,
      which allows simplifying the processing of zone commands for zoned
      block devices. A warning is added if a partition table is found on the
      device.
      
      For report zones operations no zone sector information remapping is
      necessary anymore, simplifying the code. Of note is that remapping of
      zone reports for DM targets is still necessary as done by
      dm_remap_zone_report().
      
      Similarly, remapping of a zone reset bio is not necessary anymore.
      Testing for the applicability of the zone reset all request also becomes
      simpler and only needs to check that the number of sectors of the
      requested zone range is equal to the disk capacity.
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
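The simplified "reset all" applicability test mentioned in this commit message can be pictured as follows. This is a minimal sketch, not the block layer code: `zone_reset_covers_whole_disk()` is a hypothetical name, and the additional requirement that the range starts at sector 0 is an assumption of this example (the message itself only mentions comparing the range length with the disk capacity).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;  /* count of 512 B sectors */

/*
 * Without partitions there is no offset to remap, so a zone reset range
 * targets the whole device exactly when its length equals the capacity.
 * Assumption: the range must also start at sector 0.
 */
static bool zone_reset_covers_whole_disk(sector_t start, sector_t nr_sectors,
                                         sector_t capacity)
{
    assert(capacity > 0);
    return start == 0 && nr_sectors == capacity;
}
```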
    • block: Simplify report zones execution · ceeb373a
      Damien Le Moal authored
      All kernel users of blkdev_report_zones(), as well as applications
      using it through ioctl(BLKZONEREPORT), expect to potentially get fewer
      zone descriptors than requested. As such, the internal report zones
      command execution loop implemented by blk_report_zones() is not
      necessary and can even be harmful to performance by causing the
      execution of inefficient small report zones commands to service the
      remainder of a requested zone array.
      
      This patch removes blk_report_zones(), simplifying the code. Also
      remove a now incorrect comment in dm_blk_report_zones().
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Javier Gonzalez <javier@javigon.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
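Because a single report may return fewer zone descriptors than requested, a caller that wants a full array has to loop on its own. The sketch below shows that caller-side pattern in user-space C. Everything in it is hypothetical: `fake_report_zones()` merely simulates a device with 10 zones of 8 sectors that returns at most 4 descriptors per call, and it is not the kernel or ioctl interface.

```c
#include <assert.h>
#include <stdint.h>

struct zone {
    uint64_t start;  /* first sector of the zone */
    uint64_t len;    /* zone length in sectors */
};

#define DEV_NR_ZONES         10u  /* simulated device geometry */
#define ZONE_SECTORS         8u
#define MAX_ZONES_PER_REPORT 4u   /* simulated per-call limit */

/* Hypothetical report call: may return fewer zones than asked for. */
static unsigned int fake_report_zones(uint64_t sector, struct zone *zones,
                                      unsigned int nr_zones)
{
    unsigned int first = (unsigned int)(sector / ZONE_SECTORS);
    unsigned int n = 0;

    while (n < nr_zones && n < MAX_ZONES_PER_REPORT &&
           first + n < DEV_NR_ZONES) {
        zones[n].start = (uint64_t)(first + n) * ZONE_SECTORS;
        zones[n].len = ZONE_SECTORS;
        n++;
    }
    return n;
}

/* Caller-side loop: keep asking from the end of the last reported zone. */
static unsigned int report_all_zones(struct zone *zones, unsigned int nr_zones)
{
    unsigned int got = 0;
    uint64_t sector = 0;

    assert(zones);
    while (got < nr_zones) {
        unsigned int n = fake_report_zones(sector, zones + got,
                                           nr_zones - got);
        if (n == 0)
            break;  /* end of device reached */
        got += n;
        sector = zones[got - 1].start + zones[got - 1].len;
    }
    return got;
}
```

With the loop on the caller side, each call can use the largest report the device and buffer allow, and the caller decides whether a short result is acceptable.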
    • block: cleanup the !zoned case in blk_revalidate_disk_zones · c98c3d09
      Christoph Hellwig authored
      blk_revalidate_disk_zones is never called for non-zoned devices.  Just
      return early and warn instead of trying to handle this case.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: Enhance blk_revalidate_disk_zones() · d9dd7308
      Damien Le Moal authored
      For ZBC and ZAC zoned devices, the scsi driver revalidation processing
      implemented by sd_revalidate_disk() includes a call to
      sd_zbc_read_zones() which executes a full disk zone report used to
      check that all zones of the disk are the same size. This processing is
      followed by a call to blk_revalidate_disk_zones(), used to initialize
      the device request queue zone bitmaps (zone type and zone write lock
      bitmaps). To do so, blk_revalidate_disk_zones() also executes a full
      device zone report to obtain zone types. As a result, the entire
      zoned block device revalidation process includes two full device zone
      reports.
      
      By moving the zone size checks into blk_revalidate_disk_zones(), this
      process can be optimized to a single full device zone report, leading to
      shorter device scan and revalidation times. This patch implements this
      optimization, reducing the original full device zone report implemented
      in sd_zbc_check_zones() to a single, small, report zones command
      execution to obtain the size of the first zone of the device. Checks
      whether all zones of the device are the same size as the first zone
      size are moved to the generic blk_check_zone() function called from
      blk_revalidate_disk_zones().
      
      This optimization also has the following benefits:
      1) Fewer memory allocations in the scsi layer during disk revalidation
         as the potentially large buffer for zone report execution is not
         needed.
      2) Zone checks are implemented in a generic manner, reducing the burden
         on device drivers, which only need to obtain the zone size and check
         that this size is a power of 2 number of LBAs. Any new type of zoned
         block device will benefit from this.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
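The two checks named in this commit message (zone size is a power of 2, and every zone matches the size of the first zone) can be illustrated with a small user-space sketch. `zones_are_uniform()` is a hypothetical helper, not blk_check_zone(). The sketch additionally tolerates a smaller last zone, which is an assumption of this example and is not stated in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool is_power_of_2(uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * zone_len[i] is the length of zone i. The first zone must be a power of 2
 * in size and every other zone must match it.
 * Assumption: a non-empty last zone may be smaller than the others.
 */
static bool zones_are_uniform(const uint64_t *zone_len, size_t nr_zones)
{
    assert(zone_len || nr_zones == 0);

    if (nr_zones == 0 || !is_power_of_2(zone_len[0]))
        return false;

    for (size_t i = 1; i < nr_zones; i++) {
        bool last = (i == nr_zones - 1);

        if (zone_len[i] == zone_len[0])
            continue;
        if (last && zone_len[i] != 0 && zone_len[i] < zone_len[0])
            continue;
        return false;
    }
    return true;
}
```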
    • Merge branch 'for-5.5/drivers-post' into for-5.5/zoned · 0788c4ed
      Jens Axboe authored
      * for-5.5/drivers-post:
        scsi: sd_zbc: add zone open, close, and finish support
        scsi: core: Handle drivers which set sg_tablesize to zero
        scsi: qla2xxx: fix NPIV tear down process
        scsi: sd_zbc: Fix sd_zbc_complete()
        scsi: qla2xxx: stop timer in shutdown path
        scsi: sd: define variable dif as unsigned int instead of bool
        scsi: target: cxgbit: Fix cxgbit_fw4_ack()
        scsi: qla2xxx: Fix partial flash write of MBI
        scsi: qla2xxx: Initialized mailbox to prevent driver load failure
        scsi: lpfc: Honor module parameter lpfc_use_adisc
        scsi: ufs-bsg: Wake the device before sending raw upiu commands
        scsi: lpfc: Check queue pointer before use
        scsi: qla2xxx: fixup incorrect usage of host_byte
    • Merge branch 'for-5.5/drivers' into for-5.5/zoned · d29510d3
      Jens Axboe authored
      * for-5.5/drivers: (38 commits)
        null_blk: add zone open, close, and finish support
        dm: add zone open, close and finish support
        nvme: Fix parsing of ANA log page
        nvmet: stop using bio_set_op_attrs
        nvmet: add plugging for read/write when ns is bdev
        nvmet: clean up command parsing a bit
        nvme-pci: Spelling s/resdicovered/rediscovered/
        nvmet: fill discovery controller sn, fr and mn correctly
        nvmet: Open code nvmet_req_execute()
        nvmet: Remove the data_len field from the nvmet_req struct
        nvmet: Introduce nvmet_dsm_len() helper
        nvmet: Cleanup discovery execute handlers
        nvmet: Introduce common execute function for get_log_page and identify
        nvmet-tcp: Don't set the request's data_len
        nvmet-tcp: Don't check data_len in nvmet_tcp_map_data()
        nvme: Introduce nvme_lba_to_sect()
        nvme: Cleanup and rename nvme_block_nr()
        nvme: resync include/linux/nvme.h with nvmecli
        nvme: move common call to nvme_cleanup_cmd to core layer
        nvme: introduce "Command Aborted By host" status code
        ...
  2. 08 Nov, 2019 2 commits
  3. 07 Nov, 2019 18 commits
  4. 06 Nov, 2019 4 commits
  5. 05 Nov, 2019 5 commits
    • Merge branch 'nvme-5.4-rc7' of git://git.infradead.org/nvme into for-linus · 0473976c
      Jens Axboe authored
      Pull NVMe fixes from Keith:
      
      "We have a few late nvme fixes for a couple device removal kernel
       crashes, and a compat fix for a new ioctl introduced during this merge
       window."
      
      * 'nvme-5.4-rc7' of git://git.infradead.org/nvme:
        nvme: change nvme_passthru_cmd64 to explicitly mark rsvd
        nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
        nvme-rdma: fix a segmentation fault during module unload
    • nvme: change nvme_passthru_cmd64 to explicitly mark rsvd · 0d6eeb1f
      Charles Machalow authored
      Change nvme_passthru_cmd64 to add a field: rsvd2. This field is an explicit
      marker for the padding space added on certain platforms as a result of the
      enlargement of the result field from 32 bits to 64 bits in size, and
      fixes differences in struct size when using compat ioctl for 32-bit
      binaries on 64-bit architectures.
      
      Fixes: 65e68edc ("nvme: allow 64-bit results in passthru commands")
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Charles Machalow <csm10495@gmail.com>
      [changelog]
      Signed-off-by: Keith Busch <kbusch@kernel.org>
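The size mismatch this commit addresses comes from how different ABIs align a 64-bit member that follows a 32-bit one. The sketch below reduces the structure to its last fields; the struct names are hypothetical, and the GCC/Clang `packed` attribute is used here only to imitate an ABI that does not pad before the 64-bit field. It is an illustration of the layout effect, not the actual UAPI header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tail of a command struct with compiler-inserted (implicit) padding. */
struct tail_implicit {
    uint32_t timeout_ms;
    uint64_t result;
};

/* Same tail with the padding spelled out as an explicit reserved field. */
struct tail_explicit {
    uint32_t timeout_ms;
    uint32_t rsvd2;
    uint64_t result;
};

/* Packed variants imitate an ABI that inserts no padding before result. */
struct tail_implicit_nopad {
    uint32_t timeout_ms;
    uint64_t result;
} __attribute__((packed));

struct tail_explicit_nopad {
    uint32_t timeout_ms;
    uint32_t rsvd2;
    uint64_t result;
} __attribute__((packed));

/* With rsvd2 the layout no longer depends on the alignment rules. */
static_assert(sizeof(struct tail_explicit) ==
              sizeof(struct tail_explicit_nopad),
              "explicit padding gives one layout for all ABIs");
static_assert(offsetof(struct tail_explicit, result) == 8,
              "result always starts at byte 8");
```

Without the reserved field, the unpadded variant is 12 bytes while a typical 64-bit ABI pads the same members to 16 bytes, which is the kind of difference a compat ioctl path would trip over.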
    • nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths · 763303a8
      Anton Eidelman authored
      nvme_mpath_clear_ctrl_paths() iterates through
      the ctrl->namespaces list while holding ctrl->scan_lock.
      This does not seem to be the correct way of protecting
      from concurrent list modification.
      
      Specifically, nvme_scan_work() sorts ctrl->namespaces
      AFTER unlocking scan_lock.
      
      This may result in the following (rare) crash in ctrl disconnect
      during scan_work:
      
          BUG: kernel NULL pointer dereference, address: 0000000000000050
          Oops: 0000 [#1] SMP PTI
          CPU: 0 PID: 3995 Comm: nvme 5.3.5-050305-generic
          RIP: 0010:nvme_mpath_clear_current_path+0xe/0x90 [nvme_core]
          ...
          Call Trace:
           nvme_mpath_clear_ctrl_paths+0x3c/0x70 [nvme_core]
           nvme_remove_namespaces+0x35/0xe0 [nvme_core]
           nvme_do_delete_ctrl+0x47/0x90 [nvme_core]
           nvme_sysfs_delete+0x49/0x60 [nvme_core]
           dev_attr_store+0x17/0x30
           sysfs_kf_write+0x3e/0x50
           kernfs_fop_write+0x11e/0x1a0
           __vfs_write+0x1b/0x40
           vfs_write+0xb9/0x1a0
           ksys_write+0x67/0xe0
           __x64_sys_write+0x1a/0x20
           do_syscall_64+0x5a/0x130
           entry_SYSCALL_64_after_hwframe+0x44/0xa9
          RIP: 0033:0x7f8d02bfb154
      
      Fix:
      After taking scan_lock in nvme_mpath_clear_ctrl_paths(), also take
      down_read(&ctrl->namespaces_rwsem) to make list traversal safe.
      This will not cause deadlocks because taking scan_lock never happens
      while holding the namespaces_rwsem.
      Moreover, scan work downs namespaces_rwsem in the same order.
      
      Alternative: sort ctrl->namespaces in nvme_scan_work()
      while still holding the scan_lock.
      This would leave nvme_mpath_clear_ctrl_paths() without correct protection
      against ctrl->namespaces modification by anyone other than scan_work.
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
    • nvme-rdma: fix a segmentation fault during module unload · 9ad9e8d6
      Max Gurtovoy authored
      If there are controllers that are not associated with any RDMA
      device (e.g. during unsuccessful reconnection) and the user unloads
      the module, these controllers will not be freed and will access already
      freed memory. The same logic appears in other fabric drivers as well.
      
      Fixes: 87fd1253 ("nvme-rdma: remove redundant reference between ib_device and tagset")
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Keith Busch <kbusch@kernel.org>
    • block: avoid blk_bio_segment_split for small I/O operations · fa532287
      Christoph Hellwig authored
      __blk_queue_split() adds significant overhead for small I/O operations.
      Add a shortcut to avoid it for cases where we know we never need to
      split.
      
      Based on a patch from Ming Lei.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
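The commit message does not spell out when a split is known to be unnecessary. One plausible reading, sketched below in user-space C, is an I/O made of a single segment that fits within one page on a queue with no chunk size limit. These conditions, `struct bio_desc`, `can_skip_split()` and the 4096 B page size are all assumptions of this example, not a description of the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096u  /* assumed page size */

/* Hypothetical, reduced description of an I/O request. */
struct bio_desc {
    unsigned int nr_vecs;       /* number of data segments */
    unsigned int first_offset;  /* offset of the first segment in its page */
    unsigned int first_len;     /* length of the first segment in bytes */
};

/*
 * Assumed shortcut: a single segment contained in one page cannot exceed
 * segment limits, so the expensive split path can be skipped when the
 * queue has no chunk size limit (chunk_sectors == 0).
 */
static bool can_skip_split(const struct bio_desc *bio,
                           unsigned int chunk_sectors)
{
    assert(bio);
    return chunk_sectors == 0 && bio->nr_vecs == 1 &&
           bio->first_offset + bio->first_len <= PAGE_SIZE;
}
```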
  6. 04 Nov, 2019 1 commit