1. 30 Jun, 2017 1 commit
  2. 19 Jun, 2017 15 commits
    • dm zoned: drive-managed zoned block device target · 3b1a94c8
      Damien Le Moal authored
      The dm-zoned device mapper target provides transparent write access
      to zoned block devices (ZBC and ZAC compliant block devices).
      dm-zoned hides from the device user (a file system or an application
      doing raw block device accesses) any constraint imposed on write
      requests by the device, making it equivalent to a drive-managed zoned
      block device model.
      
      Write requests are processed using a combination of on-disk buffering
      using the device conventional zones and direct in-place processing for
      requests aligned to a zone sequential write pointer position.
      A background reclaim process implemented using dm_kcopyd_copy ensures
      that conventional zones are always available for executing unaligned
      write requests. The reclaim process overhead is minimized by managing
      buffer zones in a least-recently-written order and first targeting the
      oldest buffer zones. Doing so, blocks under regular write access (such
      as metadata blocks of a file system) remain stored in conventional
      zones, resulting in no apparent overhead.
      
      The dm-zoned implementation focuses on simplicity and on minimizing
      overhead (CPU, memory and storage overhead). For a 14TB host-managed
      disk with 256 MB zones, dm-zoned memory usage per disk instance is at
      most about 3 MB and as few as 5 zones will be used internally for
      storing metadata and performing buffer zone reclaim operations. This is
      achieved using
      zone level indirection rather than a full block indirection system for
      managing block movement between zones.
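The saving from zone-level indirection can be made concrete with a little arithmetic. The following is a minimal sketch, not the actual dm-zoned metadata format: `struct chunk_map`, the 4 KiB block size and the 4-byte block map entry are assumptions chosen only to compare the two approaches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical zone-level map entry: one data zone plus one optional
 * buffer zone per chunk. This is NOT the on-disk dm-zoned format. */
struct chunk_map {
    uint32_t data_zone;
    uint32_t buffer_zone;
};

/* Number of whole zones on a device of the given capacity. */
static uint64_t zone_count(uint64_t capacity_bytes, uint64_t zone_bytes)
{
    assert(zone_bytes != 0);
    return capacity_bytes / zone_bytes;
}

/* Size of a zone-level map: one chunk_map entry per zone. */
static uint64_t zone_map_bytes(uint64_t nr_zones)
{
    return nr_zones * sizeof(struct chunk_map);
}

/* Size of a hypothetical block-level map: one 4-byte entry per
 * 4 KiB block (both sizes are assumptions for illustration). */
static uint64_t block_map_bytes(uint64_t capacity_bytes)
{
    return capacity_bytes / 4096 * sizeof(uint32_t);
}
```

With a 14 TB (14 * 10^12 bytes) device and 256 MiB zones, `zone_count()` gives 52154 zones, so the zone-level table in this sketch needs a few hundred kilobytes, while the assumed block-level table would need more than 13 GB.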
      
      The primary target of dm-zoned is host-managed zoned block devices, but it can
      also be used with host-aware device models to mitigate potential
      device-side performance degradation due to excessive random writing.
      
      Zoned block devices can be formatted and checked for use with the dm-zoned
      target using the dmzadm utility available at:
      
      https://github.com/hgst/dm-zoned-tools

      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      [Mike Snitzer partly refactored Damien's original work to clean up the code]
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm kcopyd: add sequential write feature · b73c67c2
      Damien Le Moal authored
      When copying blocks to host-managed zoned block devices, writes must be
      sequential.  However, dm_kcopyd_copy() does not guarantee this as writes
      are issued in the completion order of reads, and reads may complete out
      of order despite being issued sequentially.
      
      Fix this by introducing the DM_KCOPYD_WRITE_SEQ feature flag.  This can
      be specified when calling dm_kcopyd_copy() and should be set
      automatically if one of the destinations is a host-managed zoned block
      device.  For a split job, the master job maintains the write position at
      which writes must be issued.  This is checked with the pop() function
      which is modified to not return any write I/O sub job that is not at the
      correct write position.
      
      When DM_KCOPYD_WRITE_SEQ is specified for a job, errors cannot be
      ignored and the flag DM_KCOPYD_IGNORE_ERROR is ignored, even if
      specified by the user.
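The write-position check described above can be illustrated in userspace. This is a simplified sketch under stated assumptions: `struct sub_job`, `struct master_job` and `pop_seq_write()` are hypothetical stand-ins and do not reproduce the kcopyd job structures or its pop() function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical write sub job produced by splitting a copy job. */
struct sub_job {
    uint64_t offset;      /* destination offset of this write, in sectors */
    uint64_t nr_sectors;  /* length of this write, in sectors */
    int queued;           /* non-zero once its read completed */
};

/* Hypothetical master job tracking where the next write must land. */
struct master_job {
    uint64_t write_offset;
};

/* Return a queued write located exactly at the master's write position
 * and advance that position, or return NULL if no such write is ready. */
static struct sub_job *pop_seq_write(struct master_job *master,
                                     struct sub_job *jobs, size_t n)
{
    assert(master != NULL);
    for (size_t i = 0; i < n; i++) {
        struct sub_job *job = &jobs[i];

        if (job->queued && job->offset == master->write_offset) {
            job->queued = 0;
            master->write_offset += job->nr_sectors;
            return job;
        }
    }
    return NULL;
}
```

If the read for the second half of a copy completes first, `pop_seq_write()` returns NULL until the write for the first half has been queued and popped, so writes leave in sequential order regardless of read completion order.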
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm linear: add support for zoned block devices · 0be12c1c
      Damien Le Moal authored
      Add support for zoned block devices by allowing targets mapped onto
      host-managed zoned block devices, remapping REQ_OP_ZONE_RESET bios and
      post-processing (reply remapping) REQ_OP_ZONE_REPORT bios.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm flakey: add support for zoned block devices · 124c4454
      Damien Le Moal authored
      With the development of file system support for zoned block devices
      (e.g. f2fs), having dm-flakey support these devices is useful for
      improving testing.

      Add host-aware and host-managed zoned block device support to
      dm-flakey.  The target type feature is set to DM_TARGET_ZONED_HM to
      indicate support for host-managed models.  Also add hooks for remapping
      of REQ_OP_ZONE_RESET and REQ_OP_ZONE_REPORT bios.  Additionally, in the
      bio completion path, (backward) remapping of a zone report reply is
      added.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: introduce dm_remap_zone_report() · 10999307
      Damien Le Moal authored
      A target driver that supports zoned block devices and exposes itself as
      such may receive REQ_OP_ZONE_REPORT requests from a user who wants to
      determine the mapped device zone configuration. To process such requests
      properly, the target
      driver may need to remap the zone descriptors provided in the report
      reply. The helper function dm_remap_zone_report() does this generically
      using only the target start offset and length and the start offset
      within the target device.
      
      dm_remap_zone_report() will remap the start sector of all zones
      reported. If the report includes sequential zones, the write pointer
      position of these zones will also be remapped.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: fix REQ_OP_ZONE_REPORT bio handling · 264c869d
      Damien Le Moal authored
      A REQ_OP_ZONE_REPORT bio is not a medium access command.  Its number of
      sectors indicates the maximum size allowed for the report reply size and
      not an amount of sectors accessed from the device.  REQ_OP_ZONE_REPORT
      bios should thus not be split depending on the target device maximum I/O
      length but passed as-is.  Note that it is the responsibility of the
      target to remap and format the report reply.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: fix REQ_OP_ZONE_RESET bio handling · a4aa5e56
      Damien Le Moal authored
      The REQ_OP_ZONE_RESET bio has no payload and zero sectors.  Its position
      is the only information used to indicate the zone to reset on the
      device.  Due to its zero length, this bio is not cloned and sent to the
      target through the non-flush case in __split_and_process_bio().  Add an
      additional case in that function to call __split_and_process_non_flush()
      without checking the clone info size.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm table: add zoned block devices validation · dd88d313
      Damien Le Moal authored
      1) Introduce DM_TARGET_ZONED_HM feature flag:
      
      The target drivers currently available will not operate correctly if a
      table target maps onto a host-managed zoned block device.
      
      To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to
      allow a target to explicitly state that it supports host-managed zoned
      block devices.  This feature is checked for all targets in a table if
      any of the table's block devices are host-managed.
      
      Note that as host-aware zoned block devices are backward compatible with
      regular block devices, they can be used by any of the current target
      types.  This new feature is thus restricted to host-managed zoned block
      devices.
      
      2) Check device area zone alignment:
      
      If a target maps to a zoned block device, check that the device area is
      aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET
      operations (resetting a partially mapped sequential zone would not be
      possible).  This also facilitates the processing of zone report with
      REQ_OP_ZONE_REPORT bios.
      
      3) Check block devices zone model compatibility:
      
      When setting the DM device's queue limits, several possibilities exist
      for zoned block devices:
      1) The DM target driver may want to expose a different zone model
      (e.g. host-managed device emulation or regular block device on top of
      host-managed zoned block devices)
      2) Expose the underlying zone model of the devices as-is
      
      To allow both cases, the underlying block device zone model must be set
      in the target limits in dm_set_device_limits() and the compatibility of
      all devices checked similarly to the logical block size alignment.  For
      this last check, introduce validate_hardware_zoned_model() to check that
      all targets of a table have the same zone model and that the zone sizes
      of the target devices are equal.
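Checks 2) and 3) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the dm-table code: `struct dev_area`, `enum zoned_model` and both function names are hypothetical, and the real checks operate on request queue limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum zoned_model { ZONED_NONE, ZONED_HOST_AWARE, ZONED_HOST_MANAGED };

/* Hypothetical description of the device area used by one target. */
struct dev_area {
    enum zoned_model model;
    uint64_t zone_sectors;  /* zone size in sectors, 0 if not zoned */
    uint64_t start;         /* first sector of the area on the device */
    uint64_t len;           /* length of the area in sectors */
};

/* Check 2: a zoned device area must start and end on zone boundaries. */
static bool area_is_zone_aligned(const struct dev_area *a)
{
    if (a->model == ZONED_NONE)
        return true;
    assert(a->zone_sectors != 0);
    return a->start % a->zone_sectors == 0 &&
           a->len % a->zone_sectors == 0;
}

/* Check 3: all areas of a table share one zone model and one zone size. */
static bool table_zone_model_is_consistent(const struct dev_area *areas,
                                           size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (areas[i].model != areas[0].model ||
            areas[i].zone_sectors != areas[0].zone_sectors)
            return false;
    }
    return true;
}
```

An area that starts half a zone into the device fails the first check, which is exactly the case in which a zone reset could only touch a partially mapped sequential zone.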
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      [Mike Snitzer refactored Damien's original work to simplify the code]
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: convert DM printk macros to pr_<level> macros · d2c3c8dc
      Joe Perches authored
      Using pr_<level> is the more common logging style.
      
      Standardize style and use new macro DM_FMT.
      Use no_printk in DMDEBUG macros when CONFIG_DM_DEBUG is not #defined.
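The pattern can be mimicked in userspace to show how a single format macro builds the message prefix. This is a sketch under stated assumptions: `fprintf()` stands in for the kernel's pr_<level> helpers, `DEMO_DM_DEBUG` stands in for CONFIG_DM_DEBUG, the "demo" prefix and `dm_demo_report()` are hypothetical, and the exact expansion of DM_FMT in the kernel header may differ. The `##__VA_ARGS__` form is a GNU C extension supported by GCC and Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DM_NAME "device-mapper"
#define DM_MSG_PREFIX "demo"  /* hypothetical per-file prefix */

/* Build "<name>: <prefix>: <message>\n" by string literal concatenation. */
#define DM_FMT(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt "\n"

/* fprintf() stands in for pr_err() in this userspace sketch. */
#define DMERR(fmt, ...) fprintf(stderr, DM_FMT(fmt), ##__VA_ARGS__)

#ifdef DEMO_DM_DEBUG
#define DMDEBUG(fmt, ...) fprintf(stderr, DM_FMT(fmt), ##__VA_ARGS__)
#else
/* Like no_printk(): arguments are still type-checked, nothing is printed. */
#define DMDEBUG(fmt, ...) \
    do { if (0) fprintf(stderr, DM_FMT(fmt), ##__VA_ARGS__); } while (0)
#endif

/* Format an error message into buf using the shared format macro. */
static int dm_demo_report(char *buf, size_t size, int err)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, DM_FMT("table load failed: %d"), err);
}
```

Keeping the disabled DMDEBUG as an `if (0)` call rather than an empty macro means a wrong format specifier is still reported by the compiler even when debugging output is compiled out.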
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm crypt: add big-endian variant of plain64 IV · 7e3fd855
      Milan Broz authored
      The big-endian IV (plain64be) is needed to map images from extracted
      disks that are used in some external (on-chip FDE) disk encryption
      drives, e.g.: data recovery from external USB/SATA drives that support
      "internal" encryption.
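The difference between the two IV variants is only in how the 64-bit sector number is serialized. The sketch below is a userspace illustration: the function names are hypothetical, and placing the big-endian value in the last 8 bytes of the IV is an assumption about the layout that should be checked against the dm-crypt source before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* plain64: 64-bit sector number, little-endian, at the start of the IV;
 * the remaining bytes are zero. */
static void iv_plain64(uint8_t *iv, size_t iv_size, uint64_t sector)
{
    assert(iv_size >= 8);
    memset(iv, 0, iv_size);
    for (size_t i = 0; i < 8; i++)
        iv[i] = (uint8_t)(sector >> (8 * i));
}

/* plain64be: 64-bit sector number, big-endian, assumed here to occupy
 * the last 8 bytes of the IV; the leading bytes are zero. */
static void iv_plain64be(uint8_t *iv, size_t iv_size, uint64_t sector)
{
    assert(iv_size >= 8);
    memset(iv, 0, iv_size);
    for (size_t i = 0; i < 8; i++)
        iv[iv_size - 1 - i] = (uint8_t)(sector >> (8 * i));
}
```

For sector 1 and a 16-byte IV, `iv_plain64()` sets only the first byte to 0x01, while `iv_plain64be()` sets only the last byte to 0x01.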
      Signed-off-by: Milan Broz <gmazyland@gmail.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm bio prison: use rb_entry() rather than container_of() · 6e333d0b
      Geliang Tang authored
      To make the code clearer, use rb_entry() instead of container_of() to
      deal with rbtree.
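In the kernel, rb_entry() is a thin wrapper around container_of(), so the change is about readability rather than behavior. The userspace sketch below uses hypothetical `_demo` names to avoid suggesting that it is the kernel's own definition.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's struct rb_node. */
struct rb_node_demo {
    struct rb_node_demo *left, *right;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of_demo(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Same operation, but the name states that ptr is an rbtree node. */
#define rb_entry_demo(ptr, type, member) \
    container_of_demo(ptr, type, member)

/* Hypothetical structure embedding an rbtree node. */
struct cell_demo {
    int key;
    struct rb_node_demo node;
};

/* Map a tree node back to the cell that embeds it. */
static struct cell_demo *cell_from_node(struct rb_node_demo *n)
{
    assert(n != NULL);
    return rb_entry_demo(n, struct cell_demo, node);
}
```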
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Acked-by: Coly Li <colyli@suse.de>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm ioctl: report event number in DM_LIST_DEVICES · 23d70c5e
      Mikulas Patocka authored
      Report the event numbers for all the devices, so that the user doesn't
      have to ask them one by one.  The event number is reported after the
      name field in the dm_name_list structure.
      
      The location of the next record is specified in the dm_name_list->next
      field, which means that the new data can be placed after the end of the
      name while remaining backward compatible with the old code.  The old
      code just skips the event number without interpreting it.
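The backward compatibility argument can be demonstrated with a small userspace model of the record list. This is a sketch under stated assumptions: the field offsets mirror a record with a 64-bit device number, a 32-bit `next` offset and a NUL-terminated name, and the 8-byte alignment of the event number is an assumption made for illustration. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NEXT_OFF 8   /* 'next' follows the 64-bit device number */
#define NAME_OFF 12  /* the name follows the 32-bit 'next' field */

static size_t align8(size_t x) { return (x + 7) & ~(size_t)7; }

/* Offset of the event number inside a record (assumed 8-byte aligned). */
static size_t event_off(const char *name)
{
    return align8(NAME_OFF + strlen(name) + 1);
}

/* Write one record at buf + off; return the offset of the next record. */
static size_t put_record(unsigned char *buf, size_t off, uint64_t dev,
                         const char *name, uint32_t event_nr, int last)
{
    size_t ev = event_off(name);
    size_t len = align8(ev + sizeof(event_nr));
    uint32_t next = last ? 0 : (uint32_t)len;

    assert(buf != NULL);
    memset(buf + off, 0, len);
    memcpy(buf + off, &dev, sizeof(dev));
    memcpy(buf + off + NEXT_OFF, &next, sizeof(next));
    memcpy(buf + off + NAME_OFF, name, strlen(name) + 1);
    memcpy(buf + off + ev, &event_nr, sizeof(event_nr));
    return off + len;
}

/* Old-style reader: follows 'next' only, never looks past the name. */
static size_t count_records(const unsigned char *buf)
{
    size_t n = 0;
    uint32_t next;

    do {
        memcpy(&next, buf + NEXT_OFF, sizeof(next));
        n++;
        buf += next;
    } while (next != 0);
    return n;
}

/* New-style reader: also fetches the event number stored after the name. */
static int find_event(const unsigned char *buf, const char *name,
                      uint32_t *event_nr)
{
    uint32_t next;

    do {
        const char *rec_name = (const char *)buf + NAME_OFF;

        memcpy(&next, buf + NEXT_OFF, sizeof(next));
        if (strcmp(rec_name, name) == 0) {
            memcpy(event_nr, buf + event_off(rec_name), sizeof(*event_nr));
            return 1;
        }
        buf += next;
    } while (next != 0);
    return 0;
}
```

Because `count_records()` advances purely by the `next` offset, it walks the list correctly whether or not extra data follows each name, which is the property the commit message relies on.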
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Andy Grover <agrover@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm ioctl: add a new DM_DEV_ARM_POLL ioctl · fc1841e1
      Mikulas Patocka authored
      This ioctl records the current global event number in the structure
      dm_file, so that the next select or poll call waits until new events
      have arrived since this ioctl.
      
      The DM_DEV_ARM_POLL ioctl has the same effect as closing and reopening
      the handle.
      
      Using the DM_DEV_ARM_POLL ioctl is optional - if the userspace is OK
      with closing and reopening the /dev/mapper/control handle after select
      or poll, there is no need to re-arm via ioctl.
      
      Usage:
      1. open the /dev/mapper/control device
      2. send the DM_DEV_ARM_POLL ioctl
      3. scan the event numbers of all devices we are interested in and process
         them
      4. call select, poll or epoll on the handle (it waits until some new event
         happens since the DM_DEV_ARM_POLL ioctl)
      5. go to step 2
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Andy Grover <agrover@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: add basic support for using the select or poll function · 93e6442c
      Mikulas Patocka authored
      Add the ability to poll on the /dev/mapper/control device.  The select
      or poll function waits until any event happens on any dm device since
      opening the /dev/mapper/control device.  When select or poll returns the
      device as readable, we must close and reopen the device to wait for new
      dm events.
      
      Usage:
      1. open the /dev/mapper/control device
      2. scan the event numbers of all devices we are interested in and process
         them
      3. call select, poll or epoll on the handle (it waits until some new event
         happens since opening the device)
      4. close the /dev/mapper/control handle
      5. go to step 1
      
      The next commit allows re-arming the polling without closing and
      reopening the device.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Andy Grover <agrover@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • nvme: host: unquiesce queue in nvme_kill_queues() · 443bd90f
      Ming Lei authored
      When nvme_kill_queues() runs, queues may be in the quiesced
      state, so forcibly unquiesce them to avoid blocking dispatch
      and to prevent an I/O hang in the remove path.
      
      Previously blk_mq_start_stopped_hw_queues() was used as the
      counterpart of blk_mq_quiesce_queue(). Now that
      blk_mq_unquiesce_queue() has been introduced, use it explicitly.
      
      Cc: linux-nvme@lists.infradead.org
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 18 Jun, 2017 24 commits