- 10 Nov, 2017 2 commits
-
-
Mike Snitzer authored
No functional changes, just a bit cleaner than passing the cache_features structure.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
When a DM cache in writeback mode moves data between the slow and fast device it can often avoid a copy if the triggering bio either:

i) covers the whole block (no point copying if we're about to overwrite it)
ii) is a promotion and the origin block is currently discarded

Prior to this fix there was a race with case (ii). The discard status was checked with a shared lock held (rather than exclusive). This meant another bio could run in parallel and write data to the origin, removing the discard state. After the promotion the parallel write would have been lost.

With this fix the discard status is re-checked once the exclusive lock has been acquired. If the block is no longer discarded it falls back to the slower full copy path.

Fixes: b29d4986 ("dm cache: significant rework to leverage dm-bio-prison-v2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
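The fix applies the usual pattern of re-validating an optimistic observation once the stronger lock is held. A minimal sketch with hypothetical lock and helper names (the real code uses dm-bio-prison-v2 cells, not a bare lock):

  /* An observation made under the shared lock can go stale before the
   * exclusive lock is taken, so it has to be re-checked afterwards. */
  static void issue_promotion(struct dm_cache_migration *mg, dm_oblock_t oblock)
  {
          bool skip_copy;

          shared_lock(oblock);
          skip_copy = block_is_discarded(oblock);   /* racy hint only */
          shared_unlock(oblock);

          exclusive_lock(oblock);
          if (skip_copy && !block_is_discarded(oblock))
                  skip_copy = false;                /* a parallel write cleared the discard */
          if (skip_copy)
                  promote_without_copy(mg);         /* origin data is irrelevant */
          else
                  promote_with_full_copy(mg);       /* slower, but always correct */
          exclusive_unlock(oblock);
  }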
-
- 28 Aug, 2017 1 commit
-
-
Eric Biggers authored
The arrays of 'struct dm_arg' are never modified by the device-mapper core, so constify them so that they are placed in .rodata. (Exception: the args array in dm-raid cannot be constified because it is allocated on the stack and modified.)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
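A brief sketch of the pattern this enables (the array name and error string are invented for illustration; dm_read_arg() is the existing device-mapper helper, which now takes a const argument):

  static const struct dm_arg _args[] = {          /* ends up in .rodata */
          {0, 2, "Invalid number of feature arguments"},
  };

  static int parse_features(struct dm_arg_set *as, struct dm_target *ti,
                            unsigned *nr_features)
  {
          return dm_read_arg(_args, as, nr_features, &ti->error);
  }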
-
- 23 Aug, 2017 1 commit
-
-
Christoph Hellwig authored
This way we don't need a block_device structure to submit I/O. The block_device has different lifetime rules from the gendisk and request_queue and is usually only available when the block device node is open. Other callers need to explicitly create one (e.g. the lightnvm passthrough code, or the new nvme multipathing code). For the actual I/O path all that we need is the gendisk, which exists once per block device. But given that the block layer also does partition remapping we additionally need a partition index, which is used for said remapping in generic_make_request. Note that all the block drivers generally want the request_queue or sometimes the gendisk, so this removes a layer of indirection all over the stack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
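Roughly, the caller-visible change looks like the sketch below (illustrative only; bdev is assumed to be an already-resolved struct block_device *):

  static void prepare_bio(struct bio *bio, struct block_device *bdev)
  {
          /* before: bio->bi_bdev = bdev; */
          bio_set_dev(bio, bdev);   /* fills bio->bi_disk and bio->bi_partno */
  }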
-
- 09 Jun, 2017 2 commits
-
-
Christoph Hellwig authored
Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep around at least for now and thus propagate to a proper blk_status_t value.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
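A minimal sketch of the new convention for a driver completing a bio with an error, assuming this series is applied:

  static void fail_bio(struct bio *bio)
  {
          bio->bi_status = BLK_STS_IOERR;   /* was: bio->bi_error = -EIO; */
          bio_endio(bio);
  }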
-
Christoph Hellwig authored
Turn the error parameter into a pointer so that target drivers can change the value, and make sure only DM_ENDIO_* values are returned from the methods.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
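A hedged sketch of what a target's ->end_io method looks like under the new convention (the target is hypothetical; the blk_status_t type comes from the adjacent bi_status conversion):

  static int example_end_io(struct dm_target *ti, struct bio *bio,
                            blk_status_t *error)
  {
          if (*error == BLK_STS_NOTSUPP)
                  *error = BLK_STS_OK;   /* targets may now rewrite the status */

          return DM_ENDIO_DONE;          /* only DM_ENDIO_* values are allowed */
  }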
-
- 15 May, 2017 3 commits
-
-
Joe Thornber authored
Drop the MODERATE state since it wasn't buying us much. Also, in check_migrations(), prepare for the next commit ("dm cache policy smq: don't do any writebacks unless IDLE") by deferring to the policy to make the final decision on whether writebacks can be serviced.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
IO tracking is used to throttle writebacks when the origin device is busy. Even if all the IO is going to the fast device, writebacks can significantly degrade performance. So track all IO to gauge whether the cache is busy or not. Otherwise, synthetic IO tests (e.g. fio) that might send all IO to the fast device wouldn't cause writebacks to get throttled.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Some bios have no payload (e.g. a FLUSH); don't reset the idle_time when these come in.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 08 Apr, 2017 1 commit
-
-
Christoph Hellwig authored
Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can kill this hack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 31 Mar, 2017 1 commit
-
-
Joe Thornber authored
When loading metadata make sure to set/clear the dirty bits in the cache core's dirty_bitset as well as the policy. Otherwise the cache core is unaware that any blocks were dirty when the cache was last shut down. A very serious side-effect is that the cleaner policy would therefore never be tasked with writing back dirty data from a cache that was in writeback mode (e.g. when switching from the smq policy to the cleaner policy when decommissioning a writeback cache). This fixes a serious data corruption bug associated with writeback mode.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 07 Mar, 2017 2 commits
-
-
Joe Thornber authored
The cache policy interfaces have been updated to work well with the new bio-prison v2 interface's ability to queue work immediately (promotion, demotion, etc) -- the overriding benefit being reduced latency on processing IO through the cache. Previously such work would be left for the DM cache core to queue on various lists and then process in batches later -- this caused a serious delay in latency for IO driven by the cache. The background tracker code was factored out so that all cache policies can make use of it. Also, the "cleaner" policy has been removed and is now a variant of the smq policy that simply disallows migrations.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
The deferred set is gone and all methods have _v2 appended to the end of their names to allow for continued use of the original bio prison in DM thin-provisioning.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 16 Feb, 2017 2 commits
-
-
Joe Thornber authored
If "metadata2" is provided as a table argument when creating/loading a cache target, a more compact metadata format, with separate dirty bits, is used. "metadata2" improves the speed of shutting down a cache target.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
A rounding bug, due to a compiler-generated temporary being 32-bit, was found in remap_to_cache(). A localized cast in remap_to_cache() fixes the corruption, but this preferred fix (changing from uint32_t to sector_t) eliminates the potential for future rounding errors elsewhere.

Cc: stable@vger.kernel.org
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
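The arithmetic at fault, in simplified form (values chosen to show the wrap; names are illustrative, not the exact remap_to_cache() code):

  static void demo_rounding(void)
  {
          uint32_t sectors_per_block = 1024;      /* 512 KiB cache blocks */
          uint32_t cblock = 5000000;              /* block number on a large cache */

          sector_t bad  = cblock * sectors_per_block;             /* 32-bit multiply wraps first */
          sector_t good = (sector_t) cblock * sectors_per_block;  /* the localized-cast workaround */

          /* bad != good here; making sectors_per_block a sector_t removes the cast entirely */
  }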
-
- 02 Feb, 2017 1 commit
-
-
Jan Kara authored
We will want to have struct backing_dev_info allocated separately from struct request_queue. As the first step add a pointer to backing_dev_info to request_queue and convert all users touching it. No functional changes in this patch.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
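For existing users the access pattern changes as in this small sketch (ra_pages is chosen merely as an example field):

  static void tune_readahead(struct request_queue *q, unsigned long ra_pages)
  {
          /* before: q->backing_dev_info.ra_pages = ra_pages;  (embedded struct) */
          q->backing_dev_info->ra_pages = ra_pages;             /* now a pointer */
  }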
-
- 27 Jan, 2017 1 commit
-
-
Christoph Hellwig authored
This centralizes the checks for bios that need to go into the flush state machine.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 21 Nov, 2016 1 commit
-
-
Mike Snitzer authored
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 07 Aug, 2016 1 commit
-
-
Jens Axboe authored
Since commit 63a4cc24, bio->bi_rw contains flags in the lower portion and the op code in the higher portions. This means that old code that relies on manually setting bi_rw is most likely going to be broken. Instead of letting that brokenness linger, rename the member, to force old and out-of-tree code to break at compile time instead of at runtime. No intended functional changes in this commit.

Signed-off-by: Jens Axboe <axboe@fb.com>
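In practice the rename is mechanical, e.g. (flag chosen only for illustration):

  static bool bio_is_sync(struct bio *bio)
  {
          return bio->bi_opf & REQ_SYNC;    /* was: bio->bi_rw & REQ_SYNC */
  }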
-
- 07 Jun, 2016 2 commits
-
-
Mike Christie authored
To avoid confusion between REQ_OP_FLUSH, which is handled by request_fn drivers, and upper layers requesting that the block layer perform a flush sequence along with possibly a WRITE, this patch renames REQ_FLUSH to REQ_PREFLUSH.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
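A rough sketch of the resulting split in naming (the bio field is shown with its later bi_opf name; at the time of this commit it was still bi_rw):

  static bool bio_wants_preflush(struct bio *bio)
  {
          return bio->bi_opf & REQ_PREFLUSH;   /* submitter asks for a flush first */
  }

  static bool rq_is_flush(struct request *rq)
  {
          return req_op(rq) == REQ_OP_FLUSH;   /* what a request_fn driver sees */
  }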
-
Mike Christie authored
Separate the op from the rq_flag_bits and have dm set/get the bio op using bio_set_op_attrs()/bio_op().

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
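The accessor pattern, as a small sketch (the op and flag values are only examples):

  static void mark_sync_write(struct bio *bio)
  {
          bio_set_op_attrs(bio, REQ_OP_WRITE, REQ_SYNC);   /* set op and flags together */
  }

  static bool bio_is_discard(struct bio *bio)
  {
          return bio_op(bio) == REQ_OP_DISCARD;            /* read the op back */
  }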
-
- 10 Mar, 2016 2 commits
-
-
Mike Snitzer authored
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
Otherwise operations may be attempted that will only ever go on to crash (since the metadata device is either missing or unreliable if 'fail_io' is set).

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
-
- 23 Feb, 2016 1 commit
-
-
Mike Snitzer authored
Request-based DM will also make use of per_bio_data_size.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 10 Dec, 2015 1 commit
-
-
Mikulas Patocka authored
Device mapper used the field bi_private to point to dm_target_io. However, since kernel 3.15, the bi_private field is unused, and so the targets do not need to save and restore this field. This patch removes code that saves and restores bi_private from dm-cache, dm-snapshot and dm-verity.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
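Afterwards the end_io hook in these targets keeps roughly this shape (struct and field names are invented for the sketch):

  struct hook_info {
          bio_end_io_t *saved_end_io;
          /* void *saved_private;  -- no longer saved, bi_private is unused */
  };

  static void hook_bio(struct hook_info *h, struct bio *bio, bio_end_io_t *fn)
  {
          h->saved_end_io = bio->bi_end_io;   /* only bi_end_io is saved/restored */
          bio->bi_end_io = fn;
  }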
-
- 31 Oct, 2015 1 commit
-
-
Julia Lawall authored
Remove DM's unneeded NULL tests before calling these destroy functions, now that they check for NULL, thanks to these v4.3 commits:

3942d299 ("mm/slab_common: allow NULL cache pointer in kmem_cache_destroy()")
4e3ca3e0 ("mm/mempool: allow NULL `pool' pointer in mempool_destroy()")

The semantic patch that makes this change is as follows (http://coccinelle.lip6.fr/):

// <smpl>
@@ expression x; @@
-if (x != NULL)
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
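In C terms the change is simply dropping the guard, e.g. (the field name is illustrative):

  static void destroy_pools(struct cache *cache)
  {
          /* before: if (cache->migration_pool)
           *                 mempool_destroy(cache->migration_pool); */
          mempool_destroy(cache->migration_pool);   /* NULL-safe since v4.3 */
  }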
-
- 01 Sep, 2015 1 commit
-
-
Joe Thornber authored
Both free_io_migration() and issue_discard() dereference a migration that was just freed. Fix those by saving off the migration's cache object before freeing the migration. Also clean up needless mg->cache dereferences now that the cache object is available directly.

Fixes: e44b6a5a ("dm cache: move wake_waker() from free_migrations() to where it is needed")
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 31 Aug, 2015 2 commits
-
-
Mike Snitzer authored
Eliminate __cell_release() since it only had one caller that always released the cell holder. Switch cell_error_with_code() to using free_prison_cell() for the sake of consistency.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Joe Thornber authored
There were two cases where dm_cell_visit_release() was being called, which removes the cell from the prison's rbtree, but the callers didn't also return the cell to the mempool. Fix this by having them call free_prison_cell(). This leak manifested as the 'kmalloc-96' slab growing until OOM.

Fixes: 651f5fa2 ("dm cache: defer whole cells")
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 4.1+
-
- 13 Aug, 2015 1 commit
-
-
Kent Overstreet authored
As generic_make_request() is now able to handle arbitrarily sized bios, it's no longer necessary for each individual block driver to define its own ->merge_bvec_fn() callback. Remove every invocation completely.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: drbd-user@lists.linbit.com
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@kernel.org>
Cc: ceph-devel@vger.kernel.org
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Neil Brown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: NeilBrown <neilb@suse.de> (for the 'md' bits)
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
[dpark: also remove ->merge_bvec_fn() in dm-thin as well as dm-era-target, and resolve merge conflicts]
Signed-off-by: Dongsu Park <dpark@posteo.net>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 12 Aug, 2015 1 commit
-
-
Joe Thornber authored
This stops spurious wake ups from calls to prealloc_free_structs().

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 29 Jul, 2015 3 commits
-
-
Mike Snitzer authored
Commit 665022d7 ("dm cache: avoid calls to prealloc_free_structs() if possible") introduced a regression that caused the removal of a DM cache device to hang in cache_postsuspend()'s call to wait_for_migrations() with the following stack trace:

  [<ffffffff81651457>] schedule+0x37/0x80
  [<ffffffffa041e21b>] cache_postsuspend+0xbb/0x470 [dm_cache]
  [<ffffffff810ba970>] ? prepare_to_wait_event+0xf0/0xf0
  [<ffffffffa0006f77>] dm_table_postsuspend_targets+0x47/0x60 [dm_mod]
  [<ffffffffa0001eb5>] __dm_destroy+0x215/0x250 [dm_mod]
  [<ffffffffa0004113>] dm_destroy+0x13/0x20 [dm_mod]
  [<ffffffffa00098cd>] dev_remove+0x10d/0x170 [dm_mod]
  [<ffffffffa00097c0>] ? dev_suspend+0x240/0x240 [dm_mod]
  [<ffffffffa0009f85>] ctl_ioctl+0x255/0x4d0 [dm_mod]
  [<ffffffff8127ac00>] ? SYSC_semtimedop+0x280/0xe10
  [<ffffffffa000a213>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
  [<ffffffff811fd432>] do_vfs_ioctl+0x2d2/0x4b0
  [<ffffffff81117d5f>] ? __audit_syscall_entry+0xaf/0x100
  [<ffffffff81022636>] ? do_audit_syscall_entry+0x66/0x70
  [<ffffffff811fd689>] SyS_ioctl+0x79/0x90
  [<ffffffff81023e58>] ? syscall_trace_leave+0xb8/0x110
  [<ffffffff81654f6e>] entry_SYSCALL_64_fastpath+0x12/0x71

Fix this by accounting for the call to prealloc_data_structs() immediately _before_ the call as opposed to after. This is needed because it is possible to break out of the control loop after the call to prealloc_data_structs() but before prealloc_used was set to true.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
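A simplified sketch of the corrected control flow (the real loop body does more work; the point is that the flag is set before the call the loop can break out of):

  static void writeback_some_dirty_blocks(struct cache *cache)
  {
          struct prealloc structs;
          bool prealloc_used = false;

          while (spare_migration_bandwidth(cache)) {
                  prealloc_used = true;                /* account *before* the call */
                  if (prealloc_data_structs(cache, &structs))
                          break;                       /* breaking out no longer skips the flag */
                  /* ... pop a dirty block from the policy and queue its writeback ... */
          }

          if (prealloc_used)
                  prealloc_free_structs(cache, &structs);
  }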
-
Mike Snitzer authored
This reverts commit 386cb7cd. Taking the wake_worker() out of free_migration() will slow writeback dramatically, and hence adaptability. Say we have 10k blocks that need writing back, but are only able to issue 5 concurrently due to the migration bandwidth: it's imperative that we wake_worker() immediately after migration completion; waiting for the next 1 second wake up (via do_waker) means it'll take a long time to write that all back.

Reported-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Christoph Hellwig authored
Currently we have two different ways to signal an I/O error on a BIO:

(1) by clearing the BIO_UPTODATE flag
(2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible error (-EIO), and the second one has the drawback of not being persistent when bios are queued up, and are not passed along from child to parent bio in the ever more popular chaining scenario. Having both mechanisms available has the additional drawback of utterly confusing driver authors and introducing bugs where various I/O submitters only deal with one of them, and the others have to add boilerplate code to deal with both kinds of error returns. So add a new bi_error field to store an errno value directly in struct bio and remove the existing mechanisms to clean all this up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
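A minimal sketch of the before/after signalling from a driver's point of view, for the kernels this series targets:

  static void signal_io_error(struct bio *bio)
  {
          /* before: clear_bit(BIO_UPTODATE, &bio->bi_flags);
           *         bio_endio(bio, -EIO); */
          bio->bi_error = -EIO;   /* the errno now lives directly in the bio */
          bio_endio(bio);         /* bio_endio() loses its error argument */
  }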
-
- 17 Jul, 2015 3 commits
-
-
Mike Snitzer authored
If no work was performed then prealloc_data_structs() wasn't ever called, so there isn't any need to call prealloc_free_structs().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mike Snitzer authored
Refactor writeback_some_dirty_blocks() to avoid prealloc_data_structs() if the policy doesn't have any dirty blocks ready for writeback.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
Mike Snitzer authored
All methods that queue work call wake_worker() as you'd expect. E.g. cell_defer, defer_bio, quiesce_migration (which is called by writeback, promote, demote_then_promote, invalidate, discard, etc).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 16 Jul, 2015 1 commit
-
-
Mike Snitzer authored
There is currently no way to see that the needs_check flag has been set in the metadata. Display 'needs_check' in the cache status if it is set in the cache metadata. Also, update cache documentation.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-
- 11 Jun, 2015 2 commits
-
-
Joe Thornber authored
The policy tick() method is normally called from interrupt context. Both the mq and smq policies do some bottom-half work for the tick method in their map functions. However, if no IO is going through the cache, then that bottom-half work doesn't occur. With these policies this means recently hit entries do not age and do not get written back as early as we'd like. Fix this by introducing a new 'can_block' parameter to the tick() method. When this is set the bottom-half work occurs immediately. 'can_block' is set when the tick method is called every second by the core target (not in interrupt context).

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
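As a rough sketch, the policy method table changes along these lines (other methods elided):

  struct dm_cache_policy {
          /* ... other policy methods ... */

          /*
           * can_block is true only when called from the core target's
           * once-a-second worker (process context), so the deferred
           * bottom-half aging work may run immediately.
           */
          void (*tick)(struct dm_cache_policy *p, bool can_block);
  };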
-
Mike Snitzer authored
Having the DM device name associated with the ERR or INFO message is very helpful.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-