  1. 14 Mar, 2016 1 commit
    • dm: fix rq_end_stats() NULL pointer in dm_requeue_original_request() · 98dbc9c6
      Bryn M. Reeves authored
      An "old" (.request_fn) DM 'struct request' stores a pointer to the
      associated 'struct dm_rq_target_io' in rq->special.
      
      dm_requeue_original_request(), previously named
      dm_requeue_unmapped_original_request(), called dm_unprep_request() to
      reset rq->special to NULL.  But rq_end_stats() would go on to hit a NULL
      pointer dereference because its call to tio_from_request() returned NULL.
      
      Fix this by calling rq_end_stats() _before_ dm_unprep_request().
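      
      A minimal sketch of the corrected ordering (the actual requeue logic is elided; names follow the commit text above):
      
        static void dm_requeue_original_request(struct mapped_device *md,
                                                struct request *rq)
        {
                /*
                 * Read the stats while rq->special still points at the
                 * 'struct dm_rq_target_io'; tio_from_request() depends on it.
                 */
                rq_end_stats(md, rq);
      
                /* Only now is it safe to reset rq->special to NULL. */
                dm_unprep_request(rq);
      
                /* ... requeue the request as before ... */
        }
      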
      Signed-off-by: Bryn M. Reeves <bmr@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Fixes: e262f347 ("dm stats: add support for request-based DM devices")
      Cc: stable@vger.kernel.org # 4.2+
  2. 11 Mar, 2016 1 commit
    • dm thin: consistently return -ENOSPC if pool has run out of data space · c3667cc6
      Mike Snitzer authored
      Commit 0a927c2f ("dm thin: return -ENOSPC when erroring retry list due
      to out of data space") was a step in the right direction but didn't go
      far enough.
      
      Add a new 'out_of_data_space' flag to 'struct pool' and set it if/when
      the pool runs out of data space.  This fixes cell_error() and
      error_retry_list() to not blindly return -EIO.
      
      We cannot rely on the 'error_if_no_space' feature flag since it is
      transient (in that it can be reset once space is added, plus it only
      controls whether errors are issued, it doesn't reflect whether the
      pool is actually out of space).
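      
      A minimal sketch of the shape of the change (surrounding dm-thin code elided; the helper name here is illustrative):
      
        struct pool {
                /* ... existing fields ... */
                bool out_of_data_space:1;  /* set when the data device fills up */
        };
      
        /* Decide which error to surface when failing queued I/O. */
        static int get_pool_io_error_code(struct pool *pool)
        {
                return pool->out_of_data_space ? -ENOSPC : -EIO;
        }
      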
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  3. 10 Mar, 2016 14 commits
  4. 23 Feb, 2016 15 commits
  5. 22 Feb, 2016 9 commits
    • dm: allocate blk_mq_tag_set rather than embed in mapped_device · 1c357a1e
      Mike Snitzer authored
      The blk_mq_tag_set is only needed for dm-mq support.  There is no point
      wasting space in 'struct mapped_device' for non-dm-mq devices.
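      
      A minimal sketch of the allocate-on-demand pattern (tag-set setup elided; the NULL check is the one noted in Dan Carpenter's tag below):
      
        /* In 'struct mapped_device': a pointer, NULL for non-dm-mq devices. */
        struct blk_mq_tag_set *tag_set;
      
        /* In the dm-mq queue-init path: */
        md->tag_set = kzalloc(sizeof(*md->tag_set), GFP_KERNEL);
        if (!md->tag_set)
                return -ENOMEM;
        /* ... fill in *md->tag_set and call blk_mq_alloc_tag_set() ... */
      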
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> # check kzalloc return
    • dm: add 'dm_mq_nr_hw_queues' and 'dm_mq_queue_depth' module params · faad87df
      Mike Snitzer authored
      Allow the user to change these values via module params or sysfs.
      
      'dm_mq_nr_hw_queues' defaults to 1 (max 32).
      
      'dm_mq_queue_depth' defaults to 2048 (up from 64, which proved far too
      small under moderate sized workloads -- the dm-multipath device would
      continuously block waiting for tags (requests) to become available).
      The maximum is BLK_MQ_MAX_DEPTH (currently 10240).
      
      Keep in mind the total number of pre-allocated requests per
      request-based dm-mq device is 'dm_mq_nr_hw_queues' * 'dm_mq_queue_depth'
      (currently 2048).
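      
      A minimal sketch of the two tunables (range clamping elided; defaults and limits as described above):
      
        static unsigned dm_mq_nr_hw_queues = 1;    /* default 1, max 32 */
        static unsigned dm_mq_queue_depth = 2048;  /* default 2048, max BLK_MQ_MAX_DEPTH */
      
        module_param(dm_mq_nr_hw_queues, uint, S_IRUGO | S_IWUSR);
        MODULE_PARM_DESC(dm_mq_nr_hw_queues, "Number of hardware queues for request-based dm-mq devices");
      
        module_param(dm_mq_queue_depth, uint, S_IRUGO | S_IWUSR);
        MODULE_PARM_DESC(dm_mq_queue_depth, "Queue depth for request-based dm-mq devices");
      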
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: optimize dm_request_fn() · c91852ff
      Mike Snitzer authored
      DM multipath is the only request-based DM target -- which only supports
      tables with a single target that is immutable.  Leverage this fact in
      dm_request_fn().
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: optimize dm_mq_queue_rq() · 16f12266
      Mike Snitzer authored
      DM multipath is the only dm-mq target.  But that aside, request-based DM
      only supports tables with a single target that is immutable.  Leverage
      this fact in dm_mq_queue_rq() by using the 'immutable_target' stored in
      the mapped_device when the table was made active.  This saves the need
      to even take the read-side of the SRCU via dm_{get,put}_live_table.
      
      If the active DM table does not have an immutable target (e.g. "error"
      target was swapped in) then fall back to the slow path where the target
      is looked up from the live table.
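      
      A minimal sketch of the fast/slow path split (the rest of dm_mq_queue_rq() elided):
      
        struct dm_target *ti = md->immutable_target;  /* cached at table bind */
      
        if (unlikely(!ti)) {
                /* Slow path, e.g. a wildcard "error" target was swapped in. */
                int srcu_idx;
                struct dm_table *map = dm_get_live_table(md, &srcu_idx);
      
                ti = dm_table_find_target(map, 0);
                dm_put_live_table(md, srcu_idx);
        }
      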
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: set DM_TARGET_WILDCARD feature on "error" target · f083b09b
      Mike Snitzer authored
      The DM_TARGET_WILDCARD feature indicates that the "error" target may
      replace any target; even immutable targets.  This feature will be useful
      to preserve the ability to replace the "multipath" target even once it
      is formally converted over to having the DM_TARGET_IMMUTABLE feature.
      
      Also, implicit in the DM_TARGET_WILDCARD feature flag being set is that
      .map, .map_rq, .clone_and_map_rq and .release_clone_rq are all defined
      in the target_type.
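      
      A minimal sketch of the feature flag and its use (the flag value and the variable name here are illustrative assumptions; hook implementations elided):
      
        #define DM_TARGET_WILDCARD  0x00000008
        #define dm_target_is_wildcard(type)  ((type)->features & DM_TARGET_WILDCARD)
      
        static struct target_type error_target = {
                .name     = "error",
                .features = DM_TARGET_WILDCARD,
                /* .map, .map_rq, .clone_and_map_rq, .release_clone_rq all set */
        };
      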
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: cleanup dm_any_congested() · e522c039
      Mike Snitzer authored
      The request-based DM support for checking queue congestion doesn't
      require access to the live DM table.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: remove unused dm_get_rq_mapinfo() · ae6ad75e
      Mike Snitzer authored
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: fix excessive dm-mq context switching · 6acfe68b
      Mike Snitzer authored
      Request-based DM's blk-mq support (dm-mq) was reported to be 50% slower
      than if an underlying null_blk device were used directly.  One of the
      reasons for this drop in performance is that blk_insert_clone_request()
      was calling blk_mq_insert_request() with @async=true.  This forced the
      use of kblockd_schedule_delayed_work_on() to run the blk-mq hw queues
      which ushered in ping-ponging between process context (fio in this case)
      and kblockd's kworker to submit the cloned request.  The ftrace
      function_graph tracer showed:
      
        kworker-2013  =>   fio-12190
        fio-12190    =>  kworker-2013
        ...
        kworker-2013  =>   fio-12190
        fio-12190    =>  kworker-2013
        ...
      
      Fixing blk_insert_clone_request()'s blk_mq_insert_request() call to
      _not_ use kblockd to submit the cloned requests isn't enough to
      eliminate the observed context switches.
      
      In addition to this dm-mq specific blk-core fix, there are 2 DM core
      fixes to dm-mq that (when paired with the blk-core fix) completely
      eliminate the observed context switching:
      
      1)  don't blk_mq_run_hw_queues in blk-mq request completion
      
          Motivated by desire to reduce overhead of dm-mq, punting to kblockd
          just increases context switches.
      
          In my testing against a really fast null_blk device there was no benefit
          to running blk_mq_run_hw_queues() on completion (and no other blk-mq
          driver does this).  So hopefully this change doesn't induce the need for
          yet another revert like commit 621739b0 !
      
      2)  use blk_mq_complete_request() in dm_complete_request()
      
          blk_complete_request() doesn't offer the traditional q->mq_ops vs
          .request_fn branching pattern that other historic block interfaces
          do (e.g. blk_get_request).  Using blk_mq_complete_request() for
          blk-mq requests is important for performance.  It should be noted
          that, like blk_complete_request(), blk_mq_complete_request() doesn't
          natively handle partial completions -- but the request-based
          DM-multipath target does provide the required partial completion
          support by dm.c:end_clone_bio() triggering requeueing of the request
          via dm-mpath.c:multipath_end_io()'s return of DM_ENDIO_REQUEUE.
      
      dm-mq fix #2 is _much_ more important than #1 for eliminating the
      context switches.
      Before: cpu          : usr=15.10%, sys=59.39%, ctx=7905181, majf=0, minf=475
      After:  cpu          : usr=20.60%, sys=79.35%, ctx=2008, majf=0, minf=472
      
      With these changes multithreaded async read IOPs improved from ~950K
      to ~1350K for this dm-mq stacked on null_blk test-case.  The raw read
      IOPs of the underlying null_blk device for the same workload is ~1950K.
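      
      A minimal sketch of fix #2 above (tio bookkeeping elided; signatures as of this kernel era, when blk_mq_complete_request() still took an error argument):
      
        static void dm_complete_request(struct request *rq, int error)
        {
                struct dm_rq_target_io *tio = tio_from_request(rq);
      
                tio->error = error;
                if (!rq->q->mq_ops)
                        blk_complete_request(rq);           /* .request_fn path */
                else
                        blk_mq_complete_request(rq, error); /* blk-mq path */
        }
      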
      
      Fixes: 7fb4898e ("block: add blk-mq support to blk_insert_cloned_request()")
      Fixes: bfebd1cd ("dm: add full blk-mq support to request-based DM")
      Cc: stable@vger.kernel.org # 4.1+
      Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Jens Axboe <axboe@kernel.dk>
    • dm: fix sparse "unexpected unlock" warnings in ioctl code · 956a4025
      Mike Snitzer authored
      Rename dm_get_live_table_for_ioctl to dm_grab_bdev_for_ioctl and have it
      do the dm_{get,put}_live_table() rather than split those operations.
      
      The dm_grab_bdev_for_ioctl() callers only care about the block_device
      associated with a singleton DM device so there isn't any need to retain
      a reference to the live DM table.  It is sufficient to (as sketched below):
      1) dm_get_live_table()
      2) bdgrab() the bdev associated with the singleton table's target
      3) dm_put_live_table()
      4) bdput() the bdev
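      
      A minimal sketch of that sequence (target lookup and error handling elided):
      
        int srcu_idx;
        struct dm_table *map = dm_get_live_table(md, &srcu_idx);  /* 1 */
      
        /* ... find the singleton target's block_device 'bdev' ... */
        bdgrab(bdev);                                             /* 2 */
        dm_put_live_table(md, srcu_idx);                          /* 3 */
      
        /* ... the caller issues the ioctl against 'bdev', then ... */
        bdput(bdev);                                              /* 4 */
      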
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>