1. 03 Oct, 2017 2 commits
    • blk-throttle: fix possible io stall when upgrade to max · 4f02fb76
      Joseph Qi authored
      There is a case which can lead to an io stall, described as
      follows.
      /test1
        |-subtest1
      /test2
        |-subtest2
      subtest1 and subtest2 each have 32 queued bios already.
      
      Now upgrade to max. In throtl_upgrade_state, it will try to dispatch
      bios as follows:
      1) tg=subtest1, do nothing;
      2) tg=test1, transfer 32 queued bios from subtest1 to test1; no pending
      left, no need to schedule next dispatch;
      3) tg=subtest2, do nothing;
      4) tg=test2, transfer 32 queued bios from subtest2 to test2; no pending
      left, no need to schedule next dispatch;
      5) tg=/, transfer 8 queued bios from test1 to /, 8 queued bios from
      test2 to /, 8 queued bios from test1 to /, and 8 queued bios from test2
      to /; note that test1 and test2 each still have 16 queued bios left;
      6) tg=/, try to schedule the next dispatch, but since disptime is now
      (updated in tg_update_disptime, wait=0), the pending timer is in fact
      not scheduled (see the throtl_schedule_next_dispatch() sketch after
      this list);
      7) in total, throtl_upgrade_state dispatches 32 queued bios, with 32
      left over: test1 and test2 each have 16 queued bios;
      8) throtl_pending_timer_fn sees the left-over bios but can do nothing,
      because throtl_select_dispatch returns 0 and test1/test2 have no
      pending tg.
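      
      For reference, the scheduling check looks roughly like this (a sketch
      reconstructed from the behavior described above; the real
      block/blk-throttle.c may differ in detail):
      
      /* Sketch only; reconstructed, not the verbatim kernel code. */
      static bool throtl_schedule_next_dispatch(struct throtl_service_queue *sq,
                                                bool force)
      {
          /* any pending children left? */
          if (!sq->nr_pending)
              return true;
      
          update_min_dispatch_time(sq);
      
          /* Without @force, a disptime of "now" (wait=0) fails the
           * time_after() check, so no pending timer is armed even though
           * queued bios remain -- this is why step 6 schedules nothing. */
          if (force || time_after(sq->first_pending_disptime, jiffies)) {
              throtl_schedule_pending_timer(sq, sq->first_pending_disptime);
              return true;
          }
      
          /* tell the caller to continue dispatching */
          return false;
      }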
      
      The blktrace shows the following:
      8,32   0        0     2.539007641     0  m   N throtl upgrade to max
      8,32   0        0     2.539072267     0  m   N throtl /test2 dispatch nr_queued=16 read=0 write=16
      8,32   7        0     2.539077142     0  m   N throtl /test1 dispatch nr_queued=16 read=0 write=16
      
      So force-schedule the next dispatch if there are pending children.
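      
      As a hedged sketch of that fix (reconstructed from the description
      above; the exact hunk may differ), throtl_upgrade_state() passes true
      for the force parameter when it schedules the next dispatch:
      
      -		throtl_schedule_next_dispatch(sq, false);
      +		throtl_schedule_next_dispatch(sq, true);
      ...
      -	throtl_schedule_next_dispatch(&td->service_queue, false);
      +	throtl_schedule_next_dispatch(&td->service_queue, true);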
      Reviewed-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Joseph Qi <qijiang.qj@alibaba-inc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • MAINTAINERS: update list for NBD · 38b249bc
      Wouter Verhelst authored
      nbd-general@sourceforge.net becomes nbd@other.debian.org, because
      sourceforge is just a spamtrap these days.
      Signed-off-by: Wouter Verhelst <w@uter.be>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 02 Oct, 2017 1 commit
    • nbd: fix -ERESTARTSYS handling · 6e60a3bb
      Josef Bacik authored
      Christoph's switch of ->queue_rq to blk_status_t made it so that
      whenever we got -ERESTARTSYS from sending our packets we'd return
      BLK_STS_OK instead of BLK_STS_RESOURCE, which means we'd never requeue
      and would just hang.  We really need to return the right value from
      the upper layer.
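      
      A minimal sketch of the intended error mapping (illustrative only, not
      the literal patch; surrounding nbd code is elided):
      
      static blk_status_t nbd_queue_rq(struct blk_mq_hw_ctx *hctx,
                                       const struct blk_mq_queue_data *bd)
      {
          struct nbd_cmd *cmd = blk_mq_rq_to_pdu(bd->rq);
          int ret = nbd_handle_cmd(cmd, hctx->queue_num);
      
          /* -ERESTARTSYS from sending the packets must surface as
           * BLK_STS_RESOURCE so blk-mq requeues the request, instead of
           * being collapsed into BLK_STS_OK (never requeued -> hang). */
          if (ret == -ERESTARTSYS)
              return BLK_STS_RESOURCE;
          if (ret < 0)
              return BLK_STS_IOERR;
          return BLK_STS_OK;
      }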
      
      Fixes: fc17b653 ("blk-mq: switch ->queue_rq return value to blk_status_t")
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 27 Sep, 2017 1 commit
    • bcache: use llist_for_each_entry_safe() in __closure_wake_up() · a5f3d8a5
      Coly Li authored
      Commit 09b3efec ("bcache: Don't reinvent the wheel but use existing llist
      API") replaces the following while loop by llist_for_each_entry(),
      
      -
      -	while (reverse) {
      -		cl = container_of(reverse, struct closure, list);
      -		reverse = llist_next(reverse);
      -
      +	llist_for_each_entry(cl, reverse, list) {
       		closure_set_waiting(cl, 0);
       		closure_sub(cl, CLOSURE_WAITING + 1);
       	}
      
      This modification introduces a potential race by iterating a corrupted
      list. Here is how it happens.
      
      In the above modification, closure_sub() may wake up a process which
      is waiting on the reverse list. If this process decides to wait again
      by calling closure_wait(), its cl->list will be added to another wait
      list. Then when llist_for_each_entry() continues to the next node, it
      will walk the new wait list added in closure_wait(), not the original
      reverse list in __closure_wake_up(). This is more likely to happen on
      a UP machine, because the woken process may preempt the process which
      woke it up.
      
      Using llist_for_each_entry_safe() fixes the issue: the safe version
      fetches the next node before waking up a process, so the saved copy of
      the next node keeps the iteration on the original reverse list.
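      
      A sketch of the fixed loop (assuming the surrounding
      __closure_wake_up() code is unchanged):
      
      struct closure *cl, *t;
      
      /* The _safe variant caches the next node in @t before closure_sub()
       * can wake @cl's owner, so the walk cannot wander onto a new wait
       * list built later by closure_wait(). */
      llist_for_each_entry_safe(cl, t, reverse, list) {
          closure_set_waiting(cl, 0);
          closure_sub(cl, CLOSURE_WAITING + 1);
      }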
      
      Fixes: 09b3efec ("bcache: Don't reinvent the wheel but use existing llist API")
      Signed-off-by: Coly Li <colyli@suse.de>
      Reported-by: Michael Lyle <mlyle@lyle.org>
      Reviewed-by: Byungchul Park <byungchul.park@lge.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 26 Sep, 2017 4 commits
  5. 25 Sep, 2017 32 commits