1. 13 Nov, 2019 11 commits
    • bcache: at least try to shrink 1 node in bch_mca_scan() · 9fcc34b1
      Coly Li authored
      In bch_mca_scan(), the number of btree nodes to shrink is calculated
      by code like this,

              unsigned long nr = sc->nr_to_scan;

              nr /= c->btree_pages;
              nr = min_t(unsigned long, nr, mca_can_free(c));

      The variable sc->nr_to_scan is the number of objects (here, the number
      of bcache B+tree nodes) to shrink, and the pointer sc is passed in by
      the memory management code as a parameter of the callback.

      If sc->nr_to_scan is smaller than c->btree_pages, 'nr' is 0 after the
      above calculation and nothing is shrunk. It is frequently observed
      that sc->nr_to_scan is only 1 or 2, which makes nr zero. Then
      bch_mca_scan() does nothing more than acquire and release the mutex
      c->bucket_lock.

      This patch checks whether nr is 0 after the above calculation; if it
      is, nr is set to 1. Then bch_mca_scan() will at least try to shrink a
      single B+tree node.
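The rounding problem described above can be modelled in userspace. This is a minimal sketch, not the kernel code: the function name is hypothetical, its three parameters stand in for sc->nr_to_scan, c->btree_pages and mca_can_free(c), and placing the clamp before the cap on freeable nodes is an assumption.

```c
/*
 * Userspace model of the node-count calculation in bch_mca_scan().
 * mca_nodes_to_scan() is a hypothetical name used for illustration only.
 */
unsigned long mca_nodes_to_scan(unsigned long nr_to_scan,
				unsigned long btree_pages,
				unsigned long can_free)
{
	unsigned long nr = nr_to_scan / btree_pages;

	/* A small request must not round down to "scan nothing". */
	if (nr == 0)
		nr = 1;

	/* Assumption: the cap on freeable nodes is applied after the clamp. */
	return nr < can_free ? nr : can_free;
}
```

With btree_pages set to 4, a request of 1 or 2 now yields one node instead of zero, while a cache with nothing freeable still yields zero.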
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: add idle_max_writeback_rate sysfs interface · c5fcdedc
      Coly Li authored
      In writeback mode, if there is no regular I/O request for a while,
      the writeback rate is set to the maximum value (1TB/s for now).
      This is good for most storage workloads, but there are still
      people who don't want the maximum writeback rate during I/O idle time.

      This patch adds a sysfs interface file idle_max_writeback_rate to
      let people disable the maximum writeback rate. The minimum
      writeback rate can then be advised by writeback_rate_minimum in the
      bcache device's sysfs interface.
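The interaction of the two knobs can be pictured with a small userspace model. This is only a sketch under assumptions: the struct, the function name and the way the regular rate is floored are hypothetical, and the real rate calculation in bcache is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two sysfs knobs named in the message. */
struct wb_knobs {
	bool idle_max_writeback_rate;		/* true: use max rate when idle */
	uint64_t writeback_rate_minimum;	/* floor for the regular rate */
};

/* Pick a rate: the idle boost applies only when the knob is enabled. */
uint64_t pick_writeback_rate(const struct wb_knobs *k, bool io_idle,
			     uint64_t regular_rate, uint64_t max_rate)
{
	if (io_idle && k->idle_max_writeback_rate)
		return max_rate;

	/* Assumption: otherwise the regular rate is floored at the minimum. */
	return regular_rate > k->writeback_rate_minimum ?
	       regular_rate : k->writeback_rate_minimum;
}
```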
      Reported-by: Christian Balzer <chibi@gol.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: add code comments in bch_btree_leaf_dirty() · 5dccefd3
      Coly Li authored
      This patch adds code comments in bch_btree_leaf_dirty() to explain
      why w->journal should always reference the eldest journal pin of
      all the writing bkeys in the btree node, to make the bcache journal
      code easier to understand.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: fix deadlock in bcache_allocator · 84c529ae
      Andrea Righi authored
      bcache_allocator can call the following:
      
       bch_allocator_thread()
        -> bch_prio_write()
           -> bch_bucket_alloc()
              -> wait on &ca->set->bucket_wait
      
      But the wake up event on bucket_wait is supposed to come from
      bch_allocator_thread() itself => deadlock:
      
      [ 1158.490744] INFO: task bcache_allocato:15861 blocked for more than 10 seconds.
      [ 1158.495929]       Not tainted 5.3.0-050300rc3-generic #201908042232
      [ 1158.500653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 1158.504413] bcache_allocato D    0 15861      2 0x80004000
      [ 1158.504419] Call Trace:
      [ 1158.504429]  __schedule+0x2a8/0x670
      [ 1158.504432]  schedule+0x2d/0x90
      [ 1158.504448]  bch_bucket_alloc+0xe5/0x370 [bcache]
      [ 1158.504453]  ? wait_woken+0x80/0x80
      [ 1158.504466]  bch_prio_write+0x1dc/0x390 [bcache]
      [ 1158.504476]  bch_allocator_thread+0x233/0x490 [bcache]
      [ 1158.504491]  kthread+0x121/0x140
      [ 1158.504503]  ? invalidate_buckets+0x890/0x890 [bcache]
      [ 1158.504506]  ? kthread_park+0xb0/0xb0
      [ 1158.504510]  ret_from_fork+0x35/0x40
      
      Fix by making the call to bch_prio_write() non-blocking, so that
      bch_allocator_thread() never waits on itself.
      
      Moreover, make sure to wake up the garbage collector thread when
      bch_prio_write() is failing to allocate buckets.
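The shape of the fix described above can be sketched in userspace. This is a simplified model, not the patch itself: the struct, the function names and the use of -ENOMEM as the failure code are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: a count of free buckets and a "GC was woken" flag. */
struct bucket_pool {
	unsigned int free_buckets;
	bool gc_woken;
};

/* Non-blocking: take `needed` buckets, or fail without sleeping. */
int prio_write_nowait(struct bucket_pool *p, unsigned int needed)
{
	if (p->free_buckets < needed)
		return -ENOMEM;
	p->free_buckets -= needed;
	return 0;
}

/*
 * One pass of the allocator loop. It never sleeps waiting for a
 * wake-up that only this same thread could deliver.
 */
int allocator_step(struct bucket_pool *p, unsigned int needed)
{
	int ret = prio_write_nowait(p, needed);

	if (ret < 0)
		p->gc_woken = true;	/* ask GC to free buckets, retry later */
	return ret;
}
```

On failure the pool is left untouched and the garbage collector is flagged, so a later pass can retry once buckets have been reclaimed.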
      
      BugLink: https://bugs.launchpad.net/bugs/1784665
      BugLink: https://bugs.launchpad.net/bugs/1796292
      Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: add code comment bch_keylist_pop() and bch_keylist_pop_front() · 06c1526d
      Coly Li authored
      This patch adds simple code comments for bch_keylist_pop() and
      bch_keylist_pop_front() in bset.c, to make the code easier to
      understand.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: deleted code comments for dead code in bch_data_insert_keys() · 41fa4dee
      Coly Li authored
      In request.c:bch_data_insert_keys(), there is a code comment for a
      piece of dead code. This patch deletes the dead code and its comment
      since they are useless in practice.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: add more accurate error messages in read_super() · aaf8dbea
      Coly Li authored
      The previous code returns only "Not a bcache superblock" for both a
      bad super block offset and a bad magic number. This patch adds more
      accurate error messages:
      - for super block unmatched offset:
        "Not a bcache superblock (bad offset)"
      - for super block unmatched magic number:
        "Not a bcache superblock (bad magic)"
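The two messages above map onto two separate checks. The following is a minimal userspace sketch: the struct, the function name, the expected offset and the magic bytes are placeholders rather than the real on-disk values, and the order of the checks is an assumption.

```c
#include <stdint.h>
#include <string.h>

#define SB_OFFSET_EXPECTED 8u	/* placeholder, not taken from the patch */

/* Placeholder magic bytes for illustration only. */
static const uint8_t sb_magic_expected[16] = { 0xbc, 0xac, 0x4e };

/* Hypothetical, reduced view of a superblock. */
struct sb_model {
	uint64_t offset;
	uint8_t magic[16];
};

/* Returns NULL when both checks pass, else the specific error string. */
const char *check_super(const struct sb_model *sb)
{
	if (sb->offset != SB_OFFSET_EXPECTED)
		return "Not a bcache superblock (bad offset)";
	if (memcmp(sb->magic, sb_magic_expected, sizeof(sb->magic)) != 0)
		return "Not a bcache superblock (bad magic)";
	return NULL;
}
```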
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: fix static checker warning in bcache_device_free() · 2d886951
      Coly Li authored
      Commit cafe5635 ("bcache: A block layer cache") leads to the
      following static checker warning:
      
          ./drivers/md/bcache/super.c:770 bcache_device_free()
          warn: variable dereferenced before check 'd->disk' (see line 766)
      
      drivers/md/bcache/super.c
         762  static void bcache_device_free(struct bcache_device *d)
         763  {
         764          lockdep_assert_held(&bch_register_lock);
         765
         766          pr_info("%s stopped", d->disk->disk_name);
                                            ^^^^^^^^^
      Unchecked dereference.
      
         767
         768          if (d->c)
         769                  bcache_device_detach(d);
         770          if (d->disk && d->disk->flags & GENHD_FL_UP)
                          ^^^^^^^
      Check too late.
      
         771                  del_gendisk(d->disk);
         772          if (d->disk && d->disk->queue)
         773                  blk_cleanup_queue(d->disk->queue);
         774          if (d->disk) {
         775                  ida_simple_remove(&bcache_device_idx,
         776                                    first_minor_to_idx(d->disk->first_minor));
         777                  put_disk(d->disk);
         778          }
         779
      
      It is not 100% certain that the gendisk struct of a bcache device will
      always be there, so the warning makes sense when there is a problem in
      the block core.

      This patch removes the static checker warning by checking d->disk
      before it is used, to avoid NULL pointer dereferences.
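The check-before-dereference pattern can be shown with a small userspace model. This is a sketch only: the struct and function names are hypothetical, and the fallback message text is an assumption rather than a quote from the patch.

```c
#include <stdio.h>

/* Hypothetical, reduced stand-ins for the gendisk and bcache_device. */
struct disk_model { const char *disk_name; };
struct device_model { struct disk_model *disk; };

/* Writes the stop message into buf and returns the snprintf() result. */
int format_stop_message(const struct device_model *d, char *buf, size_t len)
{
	/* Check the pointer before the first dereference, not after. */
	if (d->disk)
		return snprintf(buf, len, "%s stopped", d->disk->disk_name);

	return snprintf(buf, len, "bcache device (NULL gendisk) stopped");
}
```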
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: fix a lost wake-up problem caused by mca_cannibalize_lock · 34cf78bf
      Guoju Fang authored
      This patch fixes a lost wake-up problem caused by the race between
      mca_cannibalize_lock and bch_cannibalize_unlock.

      Consider two processes, A and B. Process A is executing
      mca_cannibalize_lock, while process B holds c->btree_cache_alloc_lock
      and is executing bch_cannibalize_unlock. The problem occurs after
      process A has executed cmpxchg and is about to execute prepare_to_wait.
      In this timeslice process B executes wake_up; only after that does
      process A execute prepare_to_wait and set its state to
      TASK_INTERRUPTIBLE. Process A then goes to sleep and no one will wake
      it up. This problem may cause the bcache device to hang.
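The message does not say how the race is closed, so the following is not the patch's method. It is a userspace sketch of one common remedy for lost wake-ups, using POSIX threads: the ownership check and the decision to sleep happen under the same mutex that the waker takes. All names are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the "cannibalize" ownership token. */
struct cannibal {
	pthread_mutex_t lock;
	pthread_cond_t wait;
	const void *owner;	/* NULL when nobody holds the token */
};

#define CANNIBAL_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, NULL }

/* Take the token if it is free (or already ours); never sleeps. */
bool cannibal_trylock(struct cannibal *c, const void *me)
{
	bool ok;

	pthread_mutex_lock(&c->lock);
	ok = (c->owner == NULL || c->owner == me);
	if (ok)
		c->owner = me;
	pthread_mutex_unlock(&c->lock);
	return ok;
}

/*
 * Sleep until the token is free, then take it. The check and the wait
 * are both done with c->lock held, so an unlock cannot run in between.
 */
void cannibal_lock(struct cannibal *c, const void *me)
{
	pthread_mutex_lock(&c->lock);
	while (c->owner != NULL && c->owner != me)
		pthread_cond_wait(&c->wait, &c->lock);
	c->owner = me;
	pthread_mutex_unlock(&c->lock);
}

/* Release the token and wake every waiter, under the same mutex. */
void cannibal_unlock(struct cannibal *c, const void *me)
{
	pthread_mutex_lock(&c->lock);
	if (c->owner == me) {
		c->owner = NULL;
		pthread_cond_broadcast(&c->wait);
	}
	pthread_mutex_unlock(&c->lock);
}
```

Because the waiter re-checks the owner in a loop while holding the mutex, a wake-up issued before it starts waiting is harmless: the waiter sees the token already free and does not sleep.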
      Signed-off-by: Guoju Fang <fangguoju@gmail.com>
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • bcache: fix fifo index swapping condition in journal_pin_cmp() · c0e0954e
      Coly Li authored
      The fifo structure journal.pin is implemented as a circular buffer. If
      the back index reaches the highest location of the circular buffer, it
      is swapped back (wraps around) to 0. Once the swapping happens, a
      smaller fifo index might be associated with a newer journal entry, so
      the btree node with the oldest journal entry won't be selected in
      bch_btree_leaf_dirty() to reference the dirty B+tree leaf node. As a
      result the bcache journal may fail to protect the oldest unflushed
      dirty B+tree leaf node across a power failure, and this B+tree leaf
      node may be inconsistent after rebooting from the power failure.

      This patch fixes the fifo index comparison logic in journal_pin_cmp(),
      to avoid a potentially corrupted B+tree leaf node when the back index
      of the journal pin is swapped.
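The ordering problem can be illustrated with a small ring-buffer model. This is a sketch under assumptions, not the kernel macro: the struct and function names are hypothetical, and the buffer size is assumed to be a power of two so that a mask can be used.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring view: size is a power of two, mask = size - 1. */
struct ring {
	size_t front;	/* raw slot of the oldest entry */
	size_t mask;
};

/* Distance of a raw slot from the front; 0 means the oldest entry. */
size_t ring_age_rank(const struct ring *r, size_t slot)
{
	/* Unsigned subtraction wraps, and the mask folds it into range. */
	return (slot - r->front) & r->mask;
}

/* True when the entry in slot a is newer than the entry in slot b. */
bool ring_newer(const struct ring *r, size_t a, size_t b)
{
	return ring_age_rank(r, a) > ring_age_rank(r, b);
}
```

With 8 slots and the front at slot 6, the entry in slot 1 was pushed after the one in slot 7. Comparing raw indices gets this backwards, while comparing distances from the front orders them correctly.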
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge branch 'md-next' of... · e2a7b9f4
      Jens Axboe authored
      Merge branch 'md-next' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.5/drivers
      
      Pull MD changes from Song.
      
      * 'md-next' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md/raid10: prevent access of uninitialized resync_pages offset
        md: avoid invalid memory access for array sb->dev_roles
        md/raid1: avoid soft lockup under high load
  2. 12 Nov, 2019 3 commits
  3. 07 Nov, 2019 7 commits
  4. 06 Nov, 2019 1 commit
  5. 05 Nov, 2019 1 commit
  6. 04 Nov, 2019 17 commits