Commit eeee2827 authored by Linus Torvalds

Merge tag 'for-5.5/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - Fix DM core to disallow stacking request-based DM on partitions.

 - Fix DM raid target to properly resync raidset even if bitmap needed
   additional pages.

 - Fix DM crypt performance regression due to use of WQ_HIGHPRI for the
   IO and crypt workqueues.

 - Fix DM integrity metadata layout that was aligned on 128K boundary
   rather than the intended 4K boundary (removes 124K of wasted space
   for each metadata block).

 - Improve the DM thin, cache and clone targets to use spin_lock_irq
   rather than spin_lock_irqsave where possible (see the sketch after
   this list).

 - Fix DM thin single thread performance that was lost due to needless
   workqueue wakeups.

 - Fix DM zoned target performance that was lost due to excessive
   backing device checks.

 - Add ability to trigger write failure with the DM dust test target.

 - Fix whitespace indentation in drivers/md/Kconfig.

 - Various small fixes and cleanups (e.g. use struct_size, fix
   uninitialized variable, variable renames, etc).
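
The spin_lock_irq conversions noted in the list above all follow the same
shape: paths that only ever run in process context (workqueues, target
message/constructor paths) do not need to save and restore the interrupt
flags. A minimal before/after sketch, using a hypothetical structure and
helper names rather than any particular target's fields:

  /*
   * Hedged sketch of the spin_lock_irqsave -> spin_lock_irq conversion.
   * struct example_ctx and the defer_bio_*() helpers are illustrative
   * only; they are not taken from any of the targets touched by this
   * merge. The "after" form is valid only when the caller never runs
   * with interrupts already disabled.
   */
  #include <linux/spinlock.h>
  #include <linux/bio.h>

  struct example_ctx {
          spinlock_t lock;
          struct bio_list deferred_bios;
  };

  /* Before: saves/restores IRQ state, safe from any context. */
  static void defer_bio_irqsave(struct example_ctx *c, struct bio *bio)
  {
          unsigned long flags;

          spin_lock_irqsave(&c->lock, flags);
          bio_list_add(&c->deferred_bios, bio);
          spin_unlock_irqrestore(&c->lock, flags);
  }

  /* After: the plain _irq variant is cheaper, but is only correct
   * because the callers are known to run in process context with
   * interrupts enabled. */
  static void defer_bio_irq(struct example_ctx *c, struct bio *bio)
  {
          spin_lock_irq(&c->lock);
          bio_list_add(&c->deferred_bios, bio);
          spin_unlock_irq(&c->lock);
  }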

* tag 'for-5.5/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (22 commits)
  Revert "dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues"
  dm: Fix Kconfig indentation
  dm thin: wakeup worker only when deferred bios exist
  dm integrity: fix excessive alignment of metadata runs
  dm raid: Remove unnecessary negation of a shift in raid10_format_to_md_layout
  dm zoned: reduce overhead of backing device checks
  dm dust: add limited write failure mode
  dm dust: change ret to r in dust_map_read and dust_map
  dm dust: change result vars to r
  dm cache: replace spin_lock_irqsave with spin_lock_irq
  dm bio prison: replace spin_lock_irqsave with spin_lock_irq
  dm thin: replace spin_lock_irqsave with spin_lock_irq
  dm clone: add bucket_lock_irq/bucket_unlock_irq helpers
  dm clone: replace spin_lock_irqsave with spin_lock_irq
  dm writecache: handle REQ_FUA
  dm writecache: fix uninitialized variable warning
  dm stripe: use struct_size() in kmalloc()
  dm raid: streamline rs_get_progress() and its raid_status() caller side
  dm raid: simplify rs_setup_recovery call chain
  dm raid: to ensure resynchronization, perform raid set grow in preresume
  ...
parents 7e5192b9 f612b213
@@ -177,6 +177,11 @@ bitmap_flush_interval:number
    The bitmap flush interval in milliseconds. The metadata buffers
    are synchronized when this interval expires.
+fix_padding
+   Use a smaller padding of the tag area that is more
+   space-efficient. If this option is not present, large padding is
+   used - that is for compatibility with older kernels.
+
The journal mode (D/J), buffer_sectors, journal_watermark, commit_time can
be changed when reloading the target (load an inactive table and swap the
......
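
The fix_padding option documented above is what shrinks the dm-integrity
metadata alignment from 128K to 4K (see the corresponding bullet in the
merge summary). A rough userspace sketch of the arithmetic, assuming
METADATA_PADDING_SECTORS is 8 and SECTOR_SHIFT is 9; these values are
assumptions chosen to match the 128K/4K figures quoted above, not copied
definitions:

  /* Illustrative only: the macro values below are assumptions that are
   * consistent with the 128K-vs-4K numbers in the merge summary. */
  #include <stdio.h>

  #define SECTOR_SHIFT              9   /* 512-byte sectors */
  #define METADATA_PADDING_SECTORS  8

  int main(void)
  {
          /* Old code shifted 512 bytes *by* METADATA_PADDING_SECTORS. */
          unsigned long long old_pad = 1ULL << SECTOR_SHIFT << METADATA_PADDING_SECTORS;
          /* New code pads by METADATA_PADDING_SECTORS worth of sectors. */
          unsigned long long new_pad = (unsigned long long)METADATA_PADDING_SECTORS << SECTOR_SHIFT;

          printf("old padding: %llu bytes\n", old_pad);  /* 131072 = 128K */
          printf("new padding: %llu bytes\n", new_pad);  /* 4096 = 4K */
          return 0;
  }
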
@@ -417,3 +417,5 @@ Version History
        deadlock/potential data corruption. Update superblock when
        specific devices are requested via rebuild. Fix RAID leg
        rebuild errors.
+1.15.0 Fix size extensions not being synchronized in case of new MD bitmap
+       pages allocated; also fix those not occurring after previous reductions
@@ -38,9 +38,9 @@ config MD_AUTODETECT
    default y
    ---help---
      If you say Y here, then the kernel will try to autodetect raid
      arrays as part of its boot process.

      If you don't use raid and say Y, this autodetection can cause
      a several-second delay in the boot time due to various
      synchronisation steps that are part of this step.

@@ -290,7 +290,7 @@ config DM_SNAPSHOT
    depends on BLK_DEV_DM
    select DM_BUFIO
    ---help---
      Allow volume managers to take writable snapshots of a device.

config DM_THIN_PROVISIONING
    tristate "Thin provisioning target"

@@ -298,7 +298,7 @@ config DM_THIN_PROVISIONING
    select DM_PERSISTENT_DATA
    select DM_BIO_PRISON
    ---help---
      Provides thin provisioning and snapshots that share a data store.

config DM_CACHE
    tristate "Cache target (EXPERIMENTAL)"

@@ -307,23 +307,23 @@ config DM_CACHE
    select DM_PERSISTENT_DATA
    select DM_BIO_PRISON
    ---help---
      dm-cache attempts to improve performance of a block device by
      moving frequently used data to a smaller, higher performance
      device. Different 'policy' plugins can be used to change the
      algorithms used to select which blocks are promoted, demoted,
      cleaned etc. It supports writeback and writethrough modes.

config DM_CACHE_SMQ
    tristate "Stochastic MQ Cache Policy (EXPERIMENTAL)"
    depends on DM_CACHE
    default y
    ---help---
      A cache policy that uses a multiqueue ordered by recent hits
      to select which blocks should be promoted and demoted.
      This is meant to be a general purpose policy. It prioritises
      reads over writes. This SMQ policy (vs MQ) offers the promise
      of less memory utilization, improved performance and increased
      adaptability in the face of changing workloads.

config DM_WRITECACHE
    tristate "Writecache target"

@@ -343,9 +343,9 @@ config DM_ERA
    select DM_PERSISTENT_DATA
    select DM_BIO_PRISON
    ---help---
      dm-era tracks which parts of a block device are written to
      over time. Useful for maintaining cache coherency when using
      vendor snapshots.

config DM_CLONE
    tristate "Clone target (EXPERIMENTAL)"

@@ -353,20 +353,20 @@ config DM_CLONE
    default n
    select DM_PERSISTENT_DATA
    ---help---
      dm-clone produces a one-to-one copy of an existing, read-only source
      device into a writable destination device. The cloned device is
      visible/mountable immediately and the copy of the source device to the
      destination device happens in the background, in parallel with user
      I/O.

      If unsure, say N.

config DM_MIRROR
    tristate "Mirror target"
    depends on BLK_DEV_DM
    ---help---
      Allow volume managers to mirror logical volumes, also
      needed for live data migration tools such as 'pvmove'.

config DM_LOG_USERSPACE
    tristate "Mirror userspace logging"

@@ -483,7 +483,7 @@ config DM_FLAKEY
    tristate "Flakey target"
    depends on BLK_DEV_DM
    ---help---
      A target that intermittently fails I/O for debugging purposes.

config DM_VERITY
    tristate "Verity target support"
......
@@ -150,11 +150,10 @@ static int bio_detain(struct dm_bio_prison *prison,
                      struct dm_bio_prison_cell **cell_result)
{
    int r;
-   unsigned long flags;
-   spin_lock_irqsave(&prison->lock, flags);
+   spin_lock_irq(&prison->lock);
    r = __bio_detain(prison, key, inmate, cell_prealloc, cell_result);
-   spin_unlock_irqrestore(&prison->lock, flags);
+   spin_unlock_irq(&prison->lock);
    return r;
}

@@ -198,11 +197,9 @@ void dm_cell_release(struct dm_bio_prison *prison,
                     struct dm_bio_prison_cell *cell,
                     struct bio_list *bios)
{
-   unsigned long flags;
-   spin_lock_irqsave(&prison->lock, flags);
+   spin_lock_irq(&prison->lock);
    __cell_release(prison, cell, bios);
-   spin_unlock_irqrestore(&prison->lock, flags);
+   spin_unlock_irq(&prison->lock);
}
EXPORT_SYMBOL_GPL(dm_cell_release);

@@ -250,12 +247,10 @@ void dm_cell_visit_release(struct dm_bio_prison *prison,
                           void *context,
                           struct dm_bio_prison_cell *cell)
{
-   unsigned long flags;
-   spin_lock_irqsave(&prison->lock, flags);
+   spin_lock_irq(&prison->lock);
    visit_fn(context, cell);
    rb_erase(&cell->node, &prison->cells);
-   spin_unlock_irqrestore(&prison->lock, flags);
+   spin_unlock_irq(&prison->lock);
}
EXPORT_SYMBOL_GPL(dm_cell_visit_release);

@@ -275,11 +270,10 @@ int dm_cell_promote_or_release(struct dm_bio_prison *prison,
                               struct dm_bio_prison_cell *cell)
{
    int r;
-   unsigned long flags;
-   spin_lock_irqsave(&prison->lock, flags);
+   spin_lock_irq(&prison->lock);
    r = __promote_or_release(prison, cell);
-   spin_unlock_irqrestore(&prison->lock, flags);
+   spin_unlock_irq(&prison->lock);
    return r;
}

@@ -379,10 +373,9 @@ EXPORT_SYMBOL_GPL(dm_deferred_entry_dec);
int dm_deferred_set_add_work(struct dm_deferred_set *ds, struct list_head *work)
{
    int r = 1;
-   unsigned long flags;
    unsigned next_entry;
-   spin_lock_irqsave(&ds->lock, flags);
+   spin_lock_irq(&ds->lock);
    if ((ds->sweeper == ds->current_entry) &&
        !ds->entries[ds->current_entry].count)
        r = 0;

@@ -392,7 +385,7 @@ int dm_deferred_set_add_work(struct dm_deferred_set *ds, struct list_head *work)
        if (!ds->entries[next_entry].count)
            ds->current_entry = next_entry;
    }
-   spin_unlock_irqrestore(&ds->lock, flags);
+   spin_unlock_irq(&ds->lock);
    return r;
}
......
@@ -177,11 +177,10 @@ bool dm_cell_get_v2(struct dm_bio_prison_v2 *prison,
                    struct dm_bio_prison_cell_v2 **cell_result)
{
    int r;
-   unsigned long flags;
-   spin_lock_irqsave(&prison->lock, flags);
+   spin_lock_irq(&prison->lock);
    r = __get(prison, key, lock_level, inmate, cell_prealloc, cell_result);
-   spin_unlock_irqrestore(&prison->lock, flags);
+   spin_unlock_irq(&prison->lock);
    return r;
}

@@ -261,11 +260,10 @@ int dm_cell_lock_v2(struct dm_bio_prison_v2 *prison,
                    struct dm_bio_prison_cell_v2 **cell_result)
{
    int r;
-   unsigned long flags;
-   spin_lock_irqsave(&prison->lock, flags);
+   spin_lock_irq(&prison->lock);
    r = __lock(prison, key, lock_level, cell_prealloc, cell_result);
-   spin_unlock_irqrestore(&prison->lock, flags);
+   spin_unlock_irq(&prison->lock);
    return r;
}

@@ -285,11 +283,9 @@ void dm_cell_quiesce_v2(struct dm_bio_prison_v2 *prison,
                        struct dm_bio_prison_cell_v2 *cell,
                        struct work_struct *continuation)
{
-   unsigned long flags;
-   spin_lock_irqsave(&prison->lock, flags);
+   spin_lock_irq(&prison->lock);
    __quiesce(prison, cell, continuation);
-   spin_unlock_irqrestore(&prison->lock, flags);
+   spin_unlock_irq(&prison->lock);
}
EXPORT_SYMBOL_GPL(dm_cell_quiesce_v2);

@@ -309,11 +305,10 @@ int dm_cell_lock_promote_v2(struct dm_bio_prison_v2 *prison,
                            unsigned new_lock_level)
{
    int r;
-   unsigned long flags;
-   spin_lock_irqsave(&prison->lock, flags);
+   spin_lock_irq(&prison->lock);
    r = __promote(prison, cell, new_lock_level);
-   spin_unlock_irqrestore(&prison->lock, flags);
+   spin_unlock_irq(&prison->lock);
    return r;
}

@@ -342,11 +337,10 @@ bool dm_cell_unlock_v2(struct dm_bio_prison_v2 *prison,
                       struct bio_list *bios)
{
    bool r;
-   unsigned long flags;
-   spin_lock_irqsave(&prison->lock, flags);
+   spin_lock_irq(&prison->lock);
    r = __unlock(prison, cell, bios);
-   spin_unlock_irqrestore(&prison->lock, flags);
+   spin_unlock_irq(&prison->lock);
    return r;
}
......
@@ -74,22 +74,19 @@ static bool __iot_idle_for(struct io_tracker *iot, unsigned long jifs)
static bool iot_idle_for(struct io_tracker *iot, unsigned long jifs)
{
    bool r;
-   unsigned long flags;
-   spin_lock_irqsave(&iot->lock, flags);
+   spin_lock_irq(&iot->lock);
    r = __iot_idle_for(iot, jifs);
-   spin_unlock_irqrestore(&iot->lock, flags);
+   spin_unlock_irq(&iot->lock);
    return r;
}

static void iot_io_begin(struct io_tracker *iot, sector_t len)
{
-   unsigned long flags;
-   spin_lock_irqsave(&iot->lock, flags);
+   spin_lock_irq(&iot->lock);
    iot->in_flight += len;
-   spin_unlock_irqrestore(&iot->lock, flags);
+   spin_unlock_irq(&iot->lock);
}

static void __iot_io_end(struct io_tracker *iot, sector_t len)

@@ -172,7 +169,6 @@ static void __commit(struct work_struct *_ws)
{
    struct batcher *b = container_of(_ws, struct batcher, commit_work);
    blk_status_t r;
-   unsigned long flags;
    struct list_head work_items;
    struct work_struct *ws, *tmp;
    struct continuation *k;

@@ -186,12 +182,12 @@ static void __commit(struct work_struct *_ws)
     * We have to grab these before the commit_op to avoid a race
     * condition.
     */
-   spin_lock_irqsave(&b->lock, flags);
+   spin_lock_irq(&b->lock);
    list_splice_init(&b->work_items, &work_items);
    bio_list_merge(&bios, &b->bios);
    bio_list_init(&b->bios);
    b->commit_scheduled = false;
-   spin_unlock_irqrestore(&b->lock, flags);
+   spin_unlock_irq(&b->lock);
    r = b->commit_op(b->commit_context);

@@ -238,13 +234,12 @@ static void async_commit(struct batcher *b)
static void continue_after_commit(struct batcher *b, struct continuation *k)
{
-   unsigned long flags;
    bool commit_scheduled;
-   spin_lock_irqsave(&b->lock, flags);
+   spin_lock_irq(&b->lock);
    commit_scheduled = b->commit_scheduled;
    list_add_tail(&k->ws.entry, &b->work_items);
-   spin_unlock_irqrestore(&b->lock, flags);
+   spin_unlock_irq(&b->lock);
    if (commit_scheduled)
        async_commit(b);

@@ -255,13 +250,12 @@ static void continue_after_commit(struct batcher *b, struct continuation *k)
 */
static void issue_after_commit(struct batcher *b, struct bio *bio)
{
-   unsigned long flags;
    bool commit_scheduled;
-   spin_lock_irqsave(&b->lock, flags);
+   spin_lock_irq(&b->lock);
    commit_scheduled = b->commit_scheduled;
    bio_list_add(&b->bios, bio);
-   spin_unlock_irqrestore(&b->lock, flags);
+   spin_unlock_irq(&b->lock);
    if (commit_scheduled)
        async_commit(b);

@@ -273,12 +267,11 @@ static void issue_after_commit(struct batcher *b, struct bio *bio)
static void schedule_commit(struct batcher *b)
{
    bool immediate;
-   unsigned long flags;
-   spin_lock_irqsave(&b->lock, flags);
+   spin_lock_irq(&b->lock);
    immediate = !list_empty(&b->work_items) || !bio_list_empty(&b->bios);
    b->commit_scheduled = true;
-   spin_unlock_irqrestore(&b->lock, flags);
+   spin_unlock_irq(&b->lock);
    if (immediate)
        async_commit(b);

@@ -630,23 +623,19 @@ static struct per_bio_data *init_per_bio_data(struct bio *bio)
static void defer_bio(struct cache *cache, struct bio *bio)
{
-   unsigned long flags;
-   spin_lock_irqsave(&cache->lock, flags);
+   spin_lock_irq(&cache->lock);
    bio_list_add(&cache->deferred_bios, bio);
-   spin_unlock_irqrestore(&cache->lock, flags);
+   spin_unlock_irq(&cache->lock);
    wake_deferred_bio_worker(cache);
}

static void defer_bios(struct cache *cache, struct bio_list *bios)
{
-   unsigned long flags;
-   spin_lock_irqsave(&cache->lock, flags);
+   spin_lock_irq(&cache->lock);
    bio_list_merge(&cache->deferred_bios, bios);
    bio_list_init(bios);
-   spin_unlock_irqrestore(&cache->lock, flags);
+   spin_unlock_irq(&cache->lock);
    wake_deferred_bio_worker(cache);
}

@@ -756,33 +745,27 @@ static dm_dblock_t oblock_to_dblock(struct cache *cache, dm_oblock_t oblock)
static void set_discard(struct cache *cache, dm_dblock_t b)
{
-   unsigned long flags;
    BUG_ON(from_dblock(b) >= from_dblock(cache->discard_nr_blocks));
    atomic_inc(&cache->stats.discard_count);
-   spin_lock_irqsave(&cache->lock, flags);
+   spin_lock_irq(&cache->lock);
    set_bit(from_dblock(b), cache->discard_bitset);
-   spin_unlock_irqrestore(&cache->lock, flags);
+   spin_unlock_irq(&cache->lock);
}

static void clear_discard(struct cache *cache, dm_dblock_t b)
{
-   unsigned long flags;
-   spin_lock_irqsave(&cache->lock, flags);
+   spin_lock_irq(&cache->lock);
    clear_bit(from_dblock(b), cache->discard_bitset);
-   spin_unlock_irqrestore(&cache->lock, flags);
+   spin_unlock_irq(&cache->lock);
}

static bool is_discarded(struct cache *cache, dm_dblock_t b)
{
    int r;
-   unsigned long flags;
-   spin_lock_irqsave(&cache->lock, flags);
+   spin_lock_irq(&cache->lock);
    r = test_bit(from_dblock(b), cache->discard_bitset);
-   spin_unlock_irqrestore(&cache->lock, flags);
+   spin_unlock_irq(&cache->lock);
    return r;
}

@@ -790,12 +773,10 @@ static bool is_discarded(struct cache *cache, dm_dblock_t b)
static bool is_discarded_oblock(struct cache *cache, dm_oblock_t b)
{
    int r;
-   unsigned long flags;
-   spin_lock_irqsave(&cache->lock, flags);
+   spin_lock_irq(&cache->lock);
    r = test_bit(from_dblock(oblock_to_dblock(cache, b)),
                 cache->discard_bitset);
-   spin_unlock_irqrestore(&cache->lock, flags);
+   spin_unlock_irq(&cache->lock);
    return r;
}

@@ -827,17 +808,16 @@ static void remap_to_cache(struct cache *cache, struct bio *bio,
static void check_if_tick_bio_needed(struct cache *cache, struct bio *bio)
{
-   unsigned long flags;
    struct per_bio_data *pb;
-   spin_lock_irqsave(&cache->lock, flags);
+   spin_lock_irq(&cache->lock);
    if (cache->need_tick_bio && !op_is_flush(bio->bi_opf) &&
        bio_op(bio) != REQ_OP_DISCARD) {
        pb = get_per_bio_data(bio);
        pb->tick = true;
        cache->need_tick_bio = false;
    }
-   spin_unlock_irqrestore(&cache->lock, flags);
+   spin_unlock_irq(&cache->lock);
}

static void __remap_to_origin_clear_discard(struct cache *cache, struct bio *bio,

@@ -1889,17 +1869,16 @@ static void process_deferred_bios(struct work_struct *ws)
{
    struct cache *cache = container_of(ws, struct cache, deferred_bio_worker);
-   unsigned long flags;
    bool commit_needed = false;
    struct bio_list bios;
    struct bio *bio;
    bio_list_init(&bios);
-   spin_lock_irqsave(&cache->lock, flags);
+   spin_lock_irq(&cache->lock);
    bio_list_merge(&bios, &cache->deferred_bios);
    bio_list_init(&cache->deferred_bios);
-   spin_unlock_irqrestore(&cache->lock, flags);
+   spin_unlock_irq(&cache->lock);
    while ((bio = bio_list_pop(&bios))) {
        if (bio->bi_opf & REQ_PREFLUSH)
......
@@ -712,7 +712,7 @@ static int __metadata_commit(struct dm_clone_metadata *cmd)
static int __flush_dmap(struct dm_clone_metadata *cmd, struct dirty_map *dmap)
{
    int r;
-   unsigned long word, flags;
+   unsigned long word;
    word = 0;
    do {

@@ -736,9 +736,9 @@ static int __flush_dmap(struct dm_clone_metadata *cmd, struct dirty_map *dmap)
        return r;
    /* Update the changed flag */
-   spin_lock_irqsave(&cmd->bitmap_lock, flags);
+   spin_lock_irq(&cmd->bitmap_lock);
    dmap->changed = 0;
-   spin_unlock_irqrestore(&cmd->bitmap_lock, flags);
+   spin_unlock_irq(&cmd->bitmap_lock);
    return 0;
}

@@ -746,7 +746,6 @@ static int __flush_dmap(struct dm_clone_metadata *cmd, struct dirty_map *dmap)
int dm_clone_metadata_commit(struct dm_clone_metadata *cmd)
{
    int r = -EPERM;
-   unsigned long flags;
    struct dirty_map *dmap, *next_dmap;
    down_write(&cmd->lock);

@@ -770,9 +769,9 @@ int dm_clone_metadata_commit(struct dm_clone_metadata *cmd)
    }
    /* Swap dirty bitmaps */
-   spin_lock_irqsave(&cmd->bitmap_lock, flags);
+   spin_lock_irq(&cmd->bitmap_lock);
    cmd->current_dmap = next_dmap;
-   spin_unlock_irqrestore(&cmd->bitmap_lock, flags);
+   spin_unlock_irq(&cmd->bitmap_lock);
    /*
     * No one is accessing the old dirty bitmap anymore, so we can flush

@@ -817,9 +816,9 @@ int dm_clone_cond_set_range(struct dm_clone_metadata *cmd, unsigned long start,
{
    int r = 0;
    struct dirty_map *dmap;
-   unsigned long word, region_nr, flags;
+   unsigned long word, region_nr;
-   spin_lock_irqsave(&cmd->bitmap_lock, flags);
+   spin_lock_irq(&cmd->bitmap_lock);
    if (cmd->read_only) {
        r = -EPERM;

@@ -836,7 +835,7 @@ int dm_clone_cond_set_range(struct dm_clone_metadata *cmd, unsigned long start,
        }
    }
out:
-   spin_unlock_irqrestore(&cmd->bitmap_lock, flags);
+   spin_unlock_irq(&cmd->bitmap_lock);
    return r;
}

@@ -903,13 +902,11 @@ int dm_clone_metadata_abort(struct dm_clone_metadata *cmd)
void dm_clone_metadata_set_read_only(struct dm_clone_metadata *cmd)
{
-   unsigned long flags;
    down_write(&cmd->lock);
-   spin_lock_irqsave(&cmd->bitmap_lock, flags);
+   spin_lock_irq(&cmd->bitmap_lock);
    cmd->read_only = 1;
-   spin_unlock_irqrestore(&cmd->bitmap_lock, flags);
+   spin_unlock_irq(&cmd->bitmap_lock);
    if (!cmd->fail_io)
        dm_bm_set_read_only(cmd->bm);

@@ -919,13 +916,11 @@ void dm_clone_metadata_set_read_only(struct dm_clone_metadata *cmd)
void dm_clone_metadata_set_read_write(struct dm_clone_metadata *cmd)
{
-   unsigned long flags;
    down_write(&cmd->lock);
-   spin_lock_irqsave(&cmd->bitmap_lock, flags);
+   spin_lock_irq(&cmd->bitmap_lock);
    cmd->read_only = 0;
-   spin_unlock_irqrestore(&cmd->bitmap_lock, flags);
+   spin_unlock_irq(&cmd->bitmap_lock);
    if (!cmd->fail_io)
        dm_bm_set_read_write(cmd->bm);
......
@@ -44,7 +44,9 @@ int dm_clone_set_region_hydrated(struct dm_clone_metadata *cmd, unsigned long re
 * @start: Starting region number
 * @nr_regions: Number of regions in the range
 *
- * This function doesn't block, so it's safe to call it from interrupt context.
+ * This function doesn't block, but since it uses spin_lock_irq()/spin_unlock_irq()
+ * it's NOT safe to call it from any context where interrupts are disabled, e.g.,
+ * from interrupt context.
 */
int dm_clone_cond_set_range(struct dm_clone_metadata *cmd, unsigned long start,
                            unsigned long nr_regions);
......
@@ -332,8 +332,6 @@ static void submit_bios(struct bio_list *bios)
 */
static void issue_bio(struct clone *clone, struct bio *bio)
{
-   unsigned long flags;
    if (!bio_triggers_commit(clone, bio)) {
        generic_make_request(bio);
        return;

@@ -352,9 +350,9 @@ static void issue_bio(struct clone *clone, struct bio *bio)
     * Batch together any bios that trigger commits and then issue a single
     * commit for them in process_deferred_flush_bios().
     */
-   spin_lock_irqsave(&clone->lock, flags);
+   spin_lock_irq(&clone->lock);
    bio_list_add(&clone->deferred_flush_bios, bio);
-   spin_unlock_irqrestore(&clone->lock, flags);
+   spin_unlock_irq(&clone->lock);
    wake_worker(clone);
}

@@ -469,7 +467,7 @@ static void complete_discard_bio(struct clone *clone, struct bio *bio, bool succ
static void process_discard_bio(struct clone *clone, struct bio *bio)
{
-   unsigned long rs, re, flags;
+   unsigned long rs, re;
    bio_region_range(clone, bio, &rs, &re);
    BUG_ON(re > clone->nr_regions);

@@ -501,9 +499,9 @@ static void process_discard_bio(struct clone *clone, struct bio *bio)
    /*
     * Defer discard processing.
     */
-   spin_lock_irqsave(&clone->lock, flags);
+   spin_lock_irq(&clone->lock);
    bio_list_add(&clone->deferred_discard_bios, bio);
-   spin_unlock_irqrestore(&clone->lock, flags);
+   spin_unlock_irq(&clone->lock);
    wake_worker(clone);
}

@@ -554,6 +552,12 @@ struct hash_table_bucket {
#define bucket_unlock_irqrestore(bucket, flags) \
    spin_unlock_irqrestore(&(bucket)->lock, flags)

+#define bucket_lock_irq(bucket) \
+   spin_lock_irq(&(bucket)->lock)
+
+#define bucket_unlock_irq(bucket) \
+   spin_unlock_irq(&(bucket)->lock)
+
static int hash_table_init(struct clone *clone)
{
    unsigned int i, sz;

@@ -851,7 +855,6 @@ static void hydration_overwrite(struct dm_clone_region_hydration *hd, struct bio
 */
static void hydrate_bio_region(struct clone *clone, struct bio *bio)
{
-   unsigned long flags;
    unsigned long region_nr;
    struct hash_table_bucket *bucket;
    struct dm_clone_region_hydration *hd, *hd2;

@@ -859,19 +862,19 @@ static void hydrate_bio_region(struct clone *clone, struct bio *bio)
    region_nr = bio_to_region(clone, bio);
    bucket = get_hash_table_bucket(clone, region_nr);
-   bucket_lock_irqsave(bucket, flags);
+   bucket_lock_irq(bucket);
    hd = __hash_find(bucket, region_nr);
    if (hd) {
        /* Someone else is hydrating the region */
        bio_list_add(&hd->deferred_bios, bio);
-       bucket_unlock_irqrestore(bucket, flags);
+       bucket_unlock_irq(bucket);
        return;
    }
    if (dm_clone_is_region_hydrated(clone->cmd, region_nr)) {
        /* The region has been hydrated */
-       bucket_unlock_irqrestore(bucket, flags);
+       bucket_unlock_irq(bucket);
        issue_bio(clone, bio);
        return;
    }

@@ -880,16 +883,16 @@ static void hydrate_bio_region(struct clone *clone, struct bio *bio)
     * We must allocate a hydration descriptor and start the hydration of
     * the corresponding region.
     */
-   bucket_unlock_irqrestore(bucket, flags);
+   bucket_unlock_irq(bucket);
    hd = alloc_hydration(clone);
    hydration_init(hd, region_nr);
-   bucket_lock_irqsave(bucket, flags);
+   bucket_lock_irq(bucket);
    /* Check if the region has been hydrated in the meantime. */
    if (dm_clone_is_region_hydrated(clone->cmd, region_nr)) {
-       bucket_unlock_irqrestore(bucket, flags);
+       bucket_unlock_irq(bucket);
        free_hydration(hd);
        issue_bio(clone, bio);
        return;

@@ -899,7 +902,7 @@ static void hydrate_bio_region(struct clone *clone, struct bio *bio)
    if (hd2 != hd) {
        /* Someone else started the region's hydration. */
        bio_list_add(&hd2->deferred_bios, bio);
-       bucket_unlock_irqrestore(bucket, flags);
+       bucket_unlock_irq(bucket);
        free_hydration(hd);
        return;
    }

@@ -911,7 +914,7 @@ static void hydrate_bio_region(struct clone *clone, struct bio *bio)
     */
    if (unlikely(get_clone_mode(clone) >= CM_READ_ONLY)) {
        hlist_del(&hd->h);
-       bucket_unlock_irqrestore(bucket, flags);
+       bucket_unlock_irq(bucket);
        free_hydration(hd);
        bio_io_error(bio);
        return;

@@ -925,11 +928,11 @@ static void hydrate_bio_region(struct clone *clone, struct bio *bio)
     * to the destination device.
     */
    if (is_overwrite_bio(clone, bio)) {
-       bucket_unlock_irqrestore(bucket, flags);
+       bucket_unlock_irq(bucket);
        hydration_overwrite(hd, bio);
    } else {
        bio_list_add(&hd->deferred_bios, bio);
-       bucket_unlock_irqrestore(bucket, flags);
+       bucket_unlock_irq(bucket);
        hydration_copy(hd, 1);
    }
}

@@ -996,7 +999,6 @@ static unsigned long __start_next_hydration(struct clone *clone,
                                            unsigned long offset,
                                            struct batch_info *batch)
{
-   unsigned long flags;
    struct hash_table_bucket *bucket;
    struct dm_clone_region_hydration *hd;
    unsigned long nr_regions = clone->nr_regions;

@@ -1010,13 +1012,13 @@ static unsigned long __start_next_hydration(struct clone *clone,
            break;
        bucket = get_hash_table_bucket(clone, offset);
-       bucket_lock_irqsave(bucket, flags);
+       bucket_lock_irq(bucket);
        if (!dm_clone_is_region_hydrated(clone->cmd, offset) &&
            !__hash_find(bucket, offset)) {
            hydration_init(hd, offset);
            __insert_region_hydration(bucket, hd);
-           bucket_unlock_irqrestore(bucket, flags);
+           bucket_unlock_irq(bucket);
            /* Batch hydration */
            __batch_hydration(batch, hd);

@@ -1024,7 +1026,7 @@ static unsigned long __start_next_hydration(struct clone *clone,
            return (offset + 1);
        }
-       bucket_unlock_irqrestore(bucket, flags);
+       bucket_unlock_irq(bucket);
    } while (++offset < nr_regions);

@@ -1140,13 +1142,13 @@ static void process_deferred_discards(struct clone *clone)
    int r = -EPERM;
    struct bio *bio;
    struct blk_plug plug;
-   unsigned long rs, re, flags;
+   unsigned long rs, re;
    struct bio_list discards = BIO_EMPTY_LIST;
-   spin_lock_irqsave(&clone->lock, flags);
+   spin_lock_irq(&clone->lock);
    bio_list_merge(&discards, &clone->deferred_discard_bios);
    bio_list_init(&clone->deferred_discard_bios);
-   spin_unlock_irqrestore(&clone->lock, flags);
+   spin_unlock_irq(&clone->lock);
    if (bio_list_empty(&discards))
        return;

@@ -1176,13 +1178,12 @@ static void process_deferred_discards(struct clone *clone)
static void process_deferred_bios(struct clone *clone)
{
-   unsigned long flags;
    struct bio_list bios = BIO_EMPTY_LIST;
-   spin_lock_irqsave(&clone->lock, flags);
+   spin_lock_irq(&clone->lock);
    bio_list_merge(&bios, &clone->deferred_bios);
    bio_list_init(&clone->deferred_bios);
-   spin_unlock_irqrestore(&clone->lock, flags);
+   spin_unlock_irq(&clone->lock);
    if (bio_list_empty(&bios))
        return;

@@ -1193,7 +1194,6 @@ static void process_deferred_bios(struct clone *clone)
static void process_deferred_flush_bios(struct clone *clone)
{
    struct bio *bio;
-   unsigned long flags;
    struct bio_list bios = BIO_EMPTY_LIST;
    struct bio_list bio_completions = BIO_EMPTY_LIST;

@@ -1201,13 +1201,13 @@ static void process_deferred_flush_bios(struct clone *clone)
     * If there are any deferred flush bios, we must commit the metadata
     * before issuing them or signaling their completion.
     */
-   spin_lock_irqsave(&clone->lock, flags);
+   spin_lock_irq(&clone->lock);
    bio_list_merge(&bios, &clone->deferred_flush_bios);
    bio_list_init(&clone->deferred_flush_bios);
    bio_list_merge(&bio_completions, &clone->deferred_flush_completions);
    bio_list_init(&clone->deferred_flush_completions);
-   spin_unlock_irqrestore(&clone->lock, flags);
+   spin_unlock_irq(&clone->lock);
    if (bio_list_empty(&bios) && bio_list_empty(&bio_completions) &&
        !(dm_clone_changed_this_transaction(clone->cmd) && need_commit_due_to_time(clone)))
......
@@ -2700,21 +2700,18 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
    }
    ret = -ENOMEM;
-   cc->io_queue = alloc_workqueue("kcryptd_io/%s",
-                                  WQ_HIGHPRI | WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM,
-                                  1, devname);
+   cc->io_queue = alloc_workqueue("kcryptd_io/%s", WQ_MEM_RECLAIM, 1, devname);
    if (!cc->io_queue) {
        ti->error = "Couldn't create kcryptd io queue";
        goto bad;
    }
    if (test_bit(DM_CRYPT_SAME_CPU, &cc->flags))
-       cc->crypt_queue = alloc_workqueue("kcryptd/%s",
-                                         WQ_HIGHPRI | WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM,
-                                         1, devname);
+       cc->crypt_queue = alloc_workqueue("kcryptd/%s", WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM,
+                                         1, devname);
    else
        cc->crypt_queue = alloc_workqueue("kcryptd/%s",
-                                         WQ_HIGHPRI | WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | WQ_UNBOUND,
+                                         WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | WQ_UNBOUND,
                                          num_online_cpus(), devname);
    if (!cc->crypt_queue) {
        ti->error = "Couldn't create kcryptd queue";
......
@@ -17,6 +17,7 @@
struct badblock {
    struct rb_node node;
    sector_t bb;
+   unsigned char wr_fail_cnt;
};

struct dust_device {

@@ -101,7 +102,8 @@ static int dust_remove_block(struct dust_device *dd, unsigned long long block)
    return 0;
}

-static int dust_add_block(struct dust_device *dd, unsigned long long block)
+static int dust_add_block(struct dust_device *dd, unsigned long long block,
+                          unsigned char wr_fail_cnt)
{
    struct badblock *bblock;
    unsigned long flags;

@@ -115,6 +117,7 @@ static int dust_add_block(struct dust_device *dd, unsigned long long block)
    spin_lock_irqsave(&dd->dust_lock, flags);
    bblock->bb = block;
+   bblock->wr_fail_cnt = wr_fail_cnt;
    if (!dust_rb_insert(&dd->badblocklist, bblock)) {
        if (!dd->quiet_mode) {
            DMERR("%s: block %llu already in badblocklist",

@@ -126,8 +129,10 @@ static int dust_add_block(struct dust_device *dd, unsigned long long block)
    }
    dd->badblock_count++;
-   if (!dd->quiet_mode)
-       DMINFO("%s: badblock added at block %llu", __func__, block);
+   if (!dd->quiet_mode) {
+       DMINFO("%s: badblock added at block %llu with write fail count %hhu",
+              __func__, block, wr_fail_cnt);
+   }
    spin_unlock_irqrestore(&dd->dust_lock, flags);
    return 0;

@@ -163,22 +168,27 @@ static int dust_map_read(struct dust_device *dd, sector_t thisblock,
                         bool fail_read_on_bb)
{
    unsigned long flags;
-   int ret = DM_MAPIO_REMAPPED;
+   int r = DM_MAPIO_REMAPPED;
    if (fail_read_on_bb) {
        thisblock >>= dd->sect_per_block_shift;
        spin_lock_irqsave(&dd->dust_lock, flags);
-       ret = __dust_map_read(dd, thisblock);
+       r = __dust_map_read(dd, thisblock);
        spin_unlock_irqrestore(&dd->dust_lock, flags);
    }
-   return ret;
+   return r;
}

-static void __dust_map_write(struct dust_device *dd, sector_t thisblock)
+static int __dust_map_write(struct dust_device *dd, sector_t thisblock)
{
    struct badblock *bblk = dust_rb_search(&dd->badblocklist, thisblock);
+   if (bblk && bblk->wr_fail_cnt > 0) {
+       bblk->wr_fail_cnt--;
+       return DM_MAPIO_KILL;
+   }
    if (bblk) {
        rb_erase(&bblk->node, &dd->badblocklist);
        dd->badblock_count--;

@@ -189,37 +199,40 @@ static void __dust_map_write(struct dust_device *dd, sector_t thisblock)
                  (unsigned long long)thisblock);
        }
    }
+   return DM_MAPIO_REMAPPED;
}

static int dust_map_write(struct dust_device *dd, sector_t thisblock,
                          bool fail_read_on_bb)
{
    unsigned long flags;
+   int ret = DM_MAPIO_REMAPPED;
    if (fail_read_on_bb) {
        thisblock >>= dd->sect_per_block_shift;
        spin_lock_irqsave(&dd->dust_lock, flags);
-       __dust_map_write(dd, thisblock);
+       ret = __dust_map_write(dd, thisblock);
        spin_unlock_irqrestore(&dd->dust_lock, flags);
    }
-   return DM_MAPIO_REMAPPED;
+   return ret;
}

static int dust_map(struct dm_target *ti, struct bio *bio)
{
    struct dust_device *dd = ti->private;
-   int ret;
+   int r;
    bio_set_dev(bio, dd->dev->bdev);
    bio->bi_iter.bi_sector = dd->start + dm_target_offset(ti, bio->bi_iter.bi_sector);
    if (bio_data_dir(bio) == READ)
-       ret = dust_map_read(dd, bio->bi_iter.bi_sector, dd->fail_read_on_bb);
+       r = dust_map_read(dd, bio->bi_iter.bi_sector, dd->fail_read_on_bb);
    else
-       ret = dust_map_write(dd, bio->bi_iter.bi_sector, dd->fail_read_on_bb);
+       r = dust_map_write(dd, bio->bi_iter.bi_sector, dd->fail_read_on_bb);
-   return ret;
+   return r;
}

static bool __dust_clear_badblocks(struct rb_root *tree,

@@ -375,8 +388,10 @@ static int dust_message(struct dm_target *ti, unsigned int argc, char **argv,
    struct dust_device *dd = ti->private;
    sector_t size = i_size_read(dd->dev->bdev->bd_inode) >> SECTOR_SHIFT;
    bool invalid_msg = false;
-   int result = -EINVAL;
+   int r = -EINVAL;
    unsigned long long tmp, block;
+   unsigned char wr_fail_cnt;
+   unsigned int tmp_ui;
    unsigned long flags;
    char dummy;

@@ -388,45 +403,69 @@ static int dust_message(struct dm_target *ti, unsigned int argc, char **argv,
        } else if (!strcasecmp(argv[0], "disable")) {
            DMINFO("disabling read failures on bad sectors");
            dd->fail_read_on_bb = false;
-           result = 0;
+           r = 0;
        } else if (!strcasecmp(argv[0], "enable")) {
            DMINFO("enabling read failures on bad sectors");
            dd->fail_read_on_bb = true;
-           result = 0;
+           r = 0;
        } else if (!strcasecmp(argv[0], "countbadblocks")) {
            spin_lock_irqsave(&dd->dust_lock, flags);
            DMINFO("countbadblocks: %llu badblock(s) found",
                   dd->badblock_count);
            spin_unlock_irqrestore(&dd->dust_lock, flags);
-           result = 0;
+           r = 0;
        } else if (!strcasecmp(argv[0], "clearbadblocks")) {
-           result = dust_clear_badblocks(dd);
+           r = dust_clear_badblocks(dd);
        } else if (!strcasecmp(argv[0], "quiet")) {
            if (!dd->quiet_mode)
                dd->quiet_mode = true;
            else
                dd->quiet_mode = false;
-           result = 0;
+           r = 0;
        } else {
            invalid_msg = true;
        }
    } else if (argc == 2) {
        if (sscanf(argv[1], "%llu%c", &tmp, &dummy) != 1)
-           return result;
+           return r;
        block = tmp;
        sector_div(size, dd->sect_per_block);
        if (block > size) {
            DMERR("selected block value out of range");
-           return result;
+           return r;
        }
        if (!strcasecmp(argv[0], "addbadblock"))
-           result = dust_add_block(dd, block);
+           r = dust_add_block(dd, block, 0);
        else if (!strcasecmp(argv[0], "removebadblock"))
-           result = dust_remove_block(dd, block);
+           r = dust_remove_block(dd, block);
        else if (!strcasecmp(argv[0], "queryblock"))
-           result = dust_query_block(dd, block);
+           r = dust_query_block(dd, block);
+       else
+           invalid_msg = true;
+   } else if (argc == 3) {
+       if (sscanf(argv[1], "%llu%c", &tmp, &dummy) != 1)
+           return r;
+       if (sscanf(argv[2], "%u%c", &tmp_ui, &dummy) != 1)
+           return r;
+       block = tmp;
+       if (tmp_ui > 255) {
+           DMERR("selected write fail count out of range");
+           return r;
+       }
+       wr_fail_cnt = tmp_ui;
+       sector_div(size, dd->sect_per_block);
+       if (block > size) {
+           DMERR("selected block value out of range");
+           return r;
+       }
+       if (!strcasecmp(argv[0], "addbadblock"))
+           r = dust_add_block(dd, block, wr_fail_cnt);
        else
            invalid_msg = true;

@@ -436,7 +475,7 @@ static int dust_message(struct dm_target *ti, unsigned int argc, char **argv,
    if (invalid_msg)
        DMERR("unrecognized message '%s' received", argv[0]);
-   return result;
+   return r;
}

static void dust_status(struct dm_target *ti, status_type_t type,

@@ -499,12 +538,12 @@ static struct target_type dust_target = {
static int __init dm_dust_init(void)
{
-   int result = dm_register_target(&dust_target);
-   if (result < 0)
-       DMERR("dm_register_target failed %d", result);
-   return result;
+   int r = dm_register_target(&dust_target);
+   if (r < 0)
+       DMERR("dm_register_target failed %d", r);
+   return r;
}

static void __exit dm_dust_exit(void)
......
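
With the argc == 3 handling added above, addbadblock can take an optional
write failure count in addition to the block number. A hypothetical usage
example (the device name "dust1" and the block number are made up for
illustration; the message names come from the code above):

  # fail the next 3 writes to block 60; once the count is exhausted,
  # the next write succeeds and clears the entry, as a plain
  # "addbadblock 60" entry would
  dmsetup message dust1 0 addbadblock 60 3
  dmsetup message dust1 0 queryblock 60
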
@@ -53,6 +53,7 @@
 #define SB_VERSION_1 1
 #define SB_VERSION_2 2
 #define SB_VERSION_3 3
+#define SB_VERSION_4 4
 #define SB_SECTORS 8
 #define MAX_SECTORS_PER_BLOCK 8
@@ -73,6 +74,7 @@ struct superblock {
 #define SB_FLAG_HAVE_JOURNAL_MAC 0x1
 #define SB_FLAG_RECALCULATING 0x2
 #define SB_FLAG_DIRTY_BITMAP 0x4
+#define SB_FLAG_FIXED_PADDING 0x8
 #define JOURNAL_ENTRY_ROUNDUP 8
@@ -250,6 +252,7 @@ struct dm_integrity_c {
 	bool journal_uptodate;
 	bool just_formatted;
 	bool recalculate_flag;
+	bool fix_padding;
 	struct alg_spec internal_hash_alg;
 	struct alg_spec journal_crypt_alg;
@@ -463,7 +466,9 @@ static void wraparound_section(struct dm_integrity_c *ic, unsigned *sec_ptr)
 static void sb_set_version(struct dm_integrity_c *ic)
 {
-	if (ic->mode == 'B' || ic->sb->flags & cpu_to_le32(SB_FLAG_DIRTY_BITMAP))
+	if (ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING))
+		ic->sb->version = SB_VERSION_4;
+	else if (ic->mode == 'B' || ic->sb->flags & cpu_to_le32(SB_FLAG_DIRTY_BITMAP))
 		ic->sb->version = SB_VERSION_3;
 	else if (ic->meta_dev || ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING))
 		ic->sb->version = SB_VERSION_2;
@@ -2955,6 +2960,7 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
 		arg_count += !!ic->internal_hash_alg.alg_string;
 		arg_count += !!ic->journal_crypt_alg.alg_string;
 		arg_count += !!ic->journal_mac_alg.alg_string;
+		arg_count += (ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0;
 		DMEMIT("%s %llu %u %c %u", ic->dev->name, (unsigned long long)ic->start,
 		       ic->tag_size, ic->mode, arg_count);
 		if (ic->meta_dev)
@@ -2974,6 +2980,8 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
 			DMEMIT(" sectors_per_bit:%llu", (unsigned long long)ic->sectors_per_block << ic->log2_blocks_per_bitmap_bit);
 			DMEMIT(" bitmap_flush_interval:%u", jiffies_to_msecs(ic->bitmap_flush_interval));
 		}
+		if ((ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING)) != 0)
+			DMEMIT(" fix_padding");

 #define EMIT_ALG(a, n) \
 		do { \
@@ -3042,8 +3050,14 @@ static int calculate_device_limits(struct dm_integrity_c *ic)
 	if (!ic->meta_dev) {
 		sector_t last_sector, last_area, last_offset;

-		ic->metadata_run = roundup((__u64)ic->tag_size << (ic->sb->log2_interleave_sectors - ic->sb->log2_sectors_per_block),
-					   (__u64)(1 << SECTOR_SHIFT << METADATA_PADDING_SECTORS)) >> SECTOR_SHIFT;
+		/* we have to maintain excessive padding for compatibility with existing volumes */
+		__u64 metadata_run_padding =
+			ic->sb->flags & cpu_to_le32(SB_FLAG_FIXED_PADDING) ?
+			(__u64)(METADATA_PADDING_SECTORS << SECTOR_SHIFT) :
+			(__u64)(1 << SECTOR_SHIFT << METADATA_PADDING_SECTORS);
+
+		ic->metadata_run = round_up((__u64)ic->tag_size << (ic->sb->log2_interleave_sectors - ic->sb->log2_sectors_per_block),
+					    metadata_run_padding) >> SECTOR_SHIFT;
 		if (!(ic->metadata_run & (ic->metadata_run - 1)))
 			ic->log2_metadata_run = __ffs(ic->metadata_run);
 		else
@@ -3086,6 +3100,8 @@ static int initialize_superblock(struct dm_integrity_c *ic, unsigned journal_sec
 		journal_sections = 1;

 	if (!ic->meta_dev) {
+		if (ic->fix_padding)
+			ic->sb->flags |= cpu_to_le32(SB_FLAG_FIXED_PADDING);
 		ic->sb->journal_sections = cpu_to_le32(journal_sections);
 		if (!interleave_sectors)
 			interleave_sectors = DEFAULT_INTERLEAVE_SECTORS;
@@ -3725,6 +3741,8 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
 				goto bad;
 		} else if (!strcmp(opt_string, "recalculate")) {
 			ic->recalculate_flag = true;
+		} else if (!strcmp(opt_string, "fix_padding")) {
+			ic->fix_padding = true;
 		} else {
 			r = -EINVAL;
 			ti->error = "Invalid argument";
@@ -3867,7 +3885,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
 		should_write_sb = true;
 	}

-	if (!ic->sb->version || ic->sb->version > SB_VERSION_3) {
+	if (!ic->sb->version || ic->sb->version > SB_VERSION_4) {
 		r = -EINVAL;
 		ti->error = "Unknown version";
 		goto bad;
@@ -4182,7 +4200,7 @@ static void dm_integrity_dtr(struct dm_target *ti)
 static struct target_type integrity_target = {
 	.name = "integrity",
-	.version = {1, 3, 0},
+	.version = {1, 4, 0},
 	.module = THIS_MODULE,
 	.features = DM_TARGET_SINGLETON | DM_TARGET_INTEGRITY,
 	.ctr = dm_integrity_ctr,
...
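The hunks above are the dm-integrity metadata-alignment fix: without the new fix_padding option, each metadata run is rounded up to 1 << SECTOR_SHIFT << METADATA_PADDING_SECTORS bytes; with the option it is rounded up to METADATA_PADDING_SECTORS << SECTOR_SHIFT bytes. A minimal userspace sketch of that arithmetic, assuming the in-tree values SECTOR_SHIFT = 9 and METADATA_PADDING_SECTORS = 8 (so 128 KiB versus 4 KiB of alignment):

/* Standalone sketch of the two padding formulas from the hunk above,
 * assuming SECTOR_SHIFT = 9 and METADATA_PADDING_SECTORS = 8 as in
 * drivers/md/dm-integrity.c. Not kernel code; just the arithmetic. */
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SHIFT             9
#define METADATA_PADDING_SECTORS 8

int main(void)
{
	uint64_t old_padding = (uint64_t)1 << SECTOR_SHIFT << METADATA_PADDING_SECTORS;
	uint64_t new_padding = (uint64_t)METADATA_PADDING_SECTORS << SECTOR_SHIFT;

	/* old: 1 << 9 << 8 = 131072 bytes (128 KiB) of alignment per run */
	/* new: 8 << 9     =   4096 bytes (4 KiB) of alignment per run   */
	printf("legacy padding: %llu bytes\n", (unsigned long long)old_padding);
	printf("fix_padding:    %llu bytes\n", (unsigned long long)new_padding);
	return 0;
}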
This diff is collapsed.
@@ -55,19 +55,6 @@ static void trigger_event(struct work_struct *work)
 	dm_table_event(sc->ti->table);
 }

-static inline struct stripe_c *alloc_context(unsigned int stripes)
-{
-	size_t len;
-
-	if (dm_array_too_big(sizeof(struct stripe_c), sizeof(struct stripe),
-			     stripes))
-		return NULL;
-
-	len = sizeof(struct stripe_c) + (sizeof(struct stripe) * stripes);
-
-	return kmalloc(len, GFP_KERNEL);
-}
-
 /*
  * Parse a single <dev> <sector> pair
  */
@@ -142,7 +129,7 @@ static int stripe_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		return -EINVAL;
 	}

-	sc = alloc_context(stripes);
+	sc = kmalloc(struct_size(sc, stripe, stripes), GFP_KERNEL);
 	if (!sc) {
 		ti->error = "Memory allocation for striped context "
 			    "failed";
...
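The dm-stripe hunks above replace the hand-rolled alloc_context()/dm_array_too_big() pair with struct_size(), which sizes a structure ending in a flexible array member and saturates on overflow so the subsequent kmalloc() simply fails. A userspace sketch of the size computation, using a simplified stripe_c layout (the real structure in dm-stripe.c has more members):

/* Userspace sketch of what struct_size(sc, stripe, stripes) computes for the
 * allocation above. The struct layout is reduced to the flexible array. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct stripe {
	uint64_t physical_start;
	int dev;			/* stand-in for struct dm_dev * */
};

struct stripe_c {
	uint32_t stripes;
	uint32_t chunk_size_shift;
	struct stripe stripe[];		/* flexible array member */
};

/* Overflow-safe equivalent of the kernel's struct_size() for this case. */
static size_t stripe_c_size(size_t stripes)
{
	size_t per = sizeof(struct stripe);

	if (stripes > (SIZE_MAX - sizeof(struct stripe_c)) / per)
		return SIZE_MAX;	/* struct_size() saturates, so the allocation fails */
	return sizeof(struct stripe_c) + stripes * per;
}

int main(void)
{
	struct stripe_c *sc = malloc(stripe_c_size(4));

	if (sc)
		printf("allocated %zu bytes for 4 stripes\n", stripe_c_size(4));
	free(sc);
	return 0;
}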
@@ -918,21 +918,15 @@ bool dm_table_supports_dax(struct dm_table *t,
 static bool dm_table_does_not_support_partial_completion(struct dm_table *t);

-struct verify_rq_based_data {
-	unsigned sq_count;
-	unsigned mq_count;
-};
-
-static int device_is_rq_based(struct dm_target *ti, struct dm_dev *dev,
-			      sector_t start, sector_t len, void *data)
+static int device_is_rq_stackable(struct dm_target *ti, struct dm_dev *dev,
+				  sector_t start, sector_t len, void *data)
 {
-	struct request_queue *q = bdev_get_queue(dev->bdev);
-	struct verify_rq_based_data *v = data;
+	struct block_device *bdev = dev->bdev;
+	struct request_queue *q = bdev_get_queue(bdev);

-	if (queue_is_mq(q))
-		v->mq_count++;
-	else
-		v->sq_count++;
+	/* request-based cannot stack on partitions! */
+	if (bdev != bdev->bd_contains)
+		return false;

 	return queue_is_mq(q);
 }
@@ -941,7 +935,6 @@ static int dm_table_determine_type(struct dm_table *t)
 {
 	unsigned i;
 	unsigned bio_based = 0, request_based = 0, hybrid = 0;
-	struct verify_rq_based_data v = {.sq_count = 0, .mq_count = 0};
 	struct dm_target *tgt;
 	struct list_head *devices = dm_table_get_devices(t);
 	enum dm_queue_mode live_md_type = dm_get_md_type(t->md);
@@ -1045,14 +1038,10 @@ static int dm_table_determine_type(struct dm_table *t)
 	/* Non-request-stackable devices can't be used for request-based dm */
 	if (!tgt->type->iterate_devices ||
-	    !tgt->type->iterate_devices(tgt, device_is_rq_based, &v)) {
+	    !tgt->type->iterate_devices(tgt, device_is_rq_stackable, NULL)) {
 		DMERR("table load rejected: including non-request-stackable devices");
 		return -EINVAL;
 	}
-	if (v.sq_count > 0) {
-		DMERR("table load rejected: not all devices are blk-mq request-stackable");
-		return -EINVAL;
-	}

 	return 0;
 }
...
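The dm-table change above disallows stacking request-based DM on partitions: in kernels of this era a partition's block_device has bd_contains pointing at the whole-disk block_device, while a whole disk points at itself, so bdev != bdev->bd_contains identifies a partition. A toy model of that predicate (the types and values below are illustrative, not the kernel's):

/* Toy model of the partition test used in device_is_rq_stackable() above. */
#include <stdbool.h>
#include <stdio.h>

struct block_device {
	const char *name;
	struct block_device *bd_contains;	/* whole-disk device */
};

static bool is_partition(const struct block_device *bdev)
{
	return bdev != bdev->bd_contains;
}

int main(void)
{
	struct block_device sda  = { "sda",  &sda };	/* whole disk: points at itself */
	struct block_device sda1 = { "sda1", &sda };	/* partition: points at sda */

	printf("%s: %s\n", sda.name,  is_partition(&sda)  ? "partition" : "whole disk");
	printf("%s: %s\n", sda1.name, is_partition(&sda1) ? "partition" : "whole disk");
	return 0;
}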
This diff is collapsed.
@@ -1218,7 +1218,8 @@ static int writecache_map(struct dm_target *ti, struct bio *bio)
 		}
 	} while (bio->bi_iter.bi_size);

-	if (unlikely(wc->uncommitted_blocks >= wc->autocommit_blocks))
+	if (unlikely(bio->bi_opf & REQ_FUA ||
+		     wc->uncommitted_blocks >= wc->autocommit_blocks))
 		writecache_flush(wc);
 	else
 		writecache_schedule_autocommit(wc);
@@ -1561,7 +1562,7 @@ static void writecache_writeback(struct work_struct *work)
 {
 	struct dm_writecache *wc = container_of(work, struct dm_writecache, writeback_work);
 	struct blk_plug plug;
-	struct wc_entry *f, *g, *e = NULL;
+	struct wc_entry *f, *uninitialized_var(g), *e = NULL;
 	struct rb_node *node, *next_node;
 	struct list_head skipped;
 	struct writeback_list wbl;
...
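The first dm-writecache hunk above makes REQ_FUA bios flush immediately instead of waiting for the autocommit threshold, since FUA requires the written data to be durable before the bio completes. A userspace model of the commit policy, with simplified stand-ins for the fields and flag bit shown in the hunk:

/* Userspace model of the flush decision after the REQ_FUA change: a FUA write
 * is committed at once, everything else waits for the autocommit threshold.
 * The flag value and struct below are illustrative, not the kernel's. */
#include <stdbool.h>
#include <stdio.h>

#define REQ_FUA (1u << 0)	/* illustrative bit, not the kernel's REQ_FUA value */

struct toy_writecache {
	unsigned uncommitted_blocks;
	unsigned autocommit_blocks;
};

static bool must_flush(const struct toy_writecache *wc, unsigned bi_opf)
{
	return (bi_opf & REQ_FUA) ||
	       wc->uncommitted_blocks >= wc->autocommit_blocks;
}

int main(void)
{
	struct toy_writecache wc = { .uncommitted_blocks = 3, .autocommit_blocks = 64 };

	printf("plain write: %s\n", must_flush(&wc, 0) ? "flush now" : "autocommit later");
	printf("FUA write:   %s\n", must_flush(&wc, REQ_FUA) ? "flush now" : "autocommit later");
	return 0;
}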
@@ -554,6 +554,7 @@ static struct dmz_mblock *dmz_get_mblock(struct dmz_metadata *zmd,
 		       TASK_UNINTERRUPTIBLE);
 	if (test_bit(DMZ_META_ERROR, &mblk->state)) {
 		dmz_release_mblock(zmd, mblk);
+		dmz_check_bdev(zmd->dev);
 		return ERR_PTR(-EIO);
 	}
@@ -625,6 +626,8 @@ static int dmz_rdwr_block(struct dmz_metadata *zmd, int op, sector_t block,
 	ret = submit_bio_wait(bio);
 	bio_put(bio);

+	if (ret)
+		dmz_check_bdev(zmd->dev);
 	return ret;
 }
@@ -691,6 +694,7 @@ static int dmz_write_dirty_mblocks(struct dmz_metadata *zmd,
 			       TASK_UNINTERRUPTIBLE);
 		if (test_bit(DMZ_META_ERROR, &mblk->state)) {
 			clear_bit(DMZ_META_ERROR, &mblk->state);
+			dmz_check_bdev(zmd->dev);
 			ret = -EIO;
 		}
 		nr_mblks_submitted--;
@@ -768,7 +772,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 	/* If there are no dirty metadata blocks, just flush the device cache */
 	if (list_empty(&write_list)) {
 		ret = blkdev_issue_flush(zmd->dev->bdev, GFP_NOIO, NULL);
-		goto out;
+		goto err;
 	}
@@ -778,7 +782,7 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 	 */
 	ret = dmz_log_dirty_mblocks(zmd, &write_list);
 	if (ret)
-		goto out;
+		goto err;

 	/*
 	 * The log is on disk. It is now safe to update in place
@@ -786,11 +790,11 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 	 */
 	ret = dmz_write_dirty_mblocks(zmd, &write_list, zmd->mblk_primary);
 	if (ret)
-		goto out;
+		goto err;

 	ret = dmz_write_sb(zmd, zmd->mblk_primary);
 	if (ret)
-		goto out;
+		goto err;

 	while (!list_empty(&write_list)) {
 		mblk = list_first_entry(&write_list, struct dmz_mblock, link);
@@ -805,16 +809,20 @@ int dmz_flush_metadata(struct dmz_metadata *zmd)
 	zmd->sb_gen++;
 out:
-	if (ret && !list_empty(&write_list)) {
-		spin_lock(&zmd->mblk_lock);
-		list_splice(&write_list, &zmd->mblk_dirty_list);
-		spin_unlock(&zmd->mblk_lock);
-	}
-
 	dmz_unlock_flush(zmd);
 	up_write(&zmd->mblk_sem);

 	return ret;
+
+err:
+	if (!list_empty(&write_list)) {
+		spin_lock(&zmd->mblk_lock);
+		list_splice(&write_list, &zmd->mblk_dirty_list);
+		spin_unlock(&zmd->mblk_lock);
+	}
+
+	if (!dmz_check_bdev(zmd->dev))
+		ret = -EIO;
+
+	goto out;
 }

 /*
@@ -1221,6 +1229,7 @@ static int dmz_update_zone(struct dmz_metadata *zmd, struct dm_zone *zone)
 	if (ret < 0) {
 		dmz_dev_err(zmd->dev, "Get zone %u report failed",
 			    dmz_id(zmd, zone));
+		dmz_check_bdev(zmd->dev);
 		return ret;
 	}
...
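The dm-zoned metadata hunks above call dmz_check_bdev() on every metadata I/O failure, and dmz_flush_metadata() gains a dedicated err: label that re-queues unwritten dirty blocks and forces -EIO when the backing device turns out to be gone. A toy illustration of that err/out goto shape (the helpers and error value below are stand-ins, not the dm-zoned API):

/* Toy illustration of the err:/out: structure adopted above: the error path
 * re-queues unfinished work and re-checks the backing device before falling
 * through to the common unlock/return path. */
#include <stdbool.h>
#include <stdio.h>

static bool toy_check_bdev(void)
{
	return true;		/* pretend the device re-check succeeded */
}

static int toy_flush(bool write_fails)
{
	int ret = 0;

	if (write_fails) {
		ret = -5;	/* -EIO from the failed write */
		goto err;
	}
out:
	/* common path: drop locks, return status */
	return ret;
err:
	/* error path: put dirty blocks back on the dirty list (elided here),
	 * then see whether the backing device itself has gone away */
	if (!toy_check_bdev())
		ret = -5;	/* force -EIO if the device is dying */
	goto out;
}

int main(void)
{
	printf("clean flush  -> %d\n", toy_flush(false));
	printf("failed flush -> %d\n", toy_flush(true));
	return 0;
}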
@@ -82,6 +82,7 @@ static int dmz_reclaim_align_wp(struct dmz_reclaim *zrc, struct dm_zone *zone,
 			    "Align zone %u wp %llu to %llu (wp+%u) blocks failed %d",
 			    dmz_id(zmd, zone), (unsigned long long)wp_block,
 			    (unsigned long long)block, nr_blocks, ret);
+		dmz_check_bdev(zrc->dev);
 		return ret;
 	}
@@ -489,12 +490,7 @@ static void dmz_reclaim_work(struct work_struct *work)
 	ret = dmz_do_reclaim(zrc);
 	if (ret) {
 		dmz_dev_debug(zrc->dev, "Reclaim error %d\n", ret);
-		if (ret == -EIO)
-			/*
-			 * LLD might be performing some error handling sequence
-			 * at the underlying device. To not interfere, do not
-			 * attempt to schedule the next reclaim run immediately.
-			 */
+		if (!dmz_check_bdev(zrc->dev))
 			return;
 	}
...
@@ -80,6 +80,8 @@ static inline void dmz_bio_endio(struct bio *bio, blk_status_t status)
 	if (status != BLK_STS_OK && bio->bi_status == BLK_STS_OK)
 		bio->bi_status = status;
+	if (bio->bi_status != BLK_STS_OK)
+		bioctx->target->dev->flags |= DMZ_CHECK_BDEV;

 	if (refcount_dec_and_test(&bioctx->ref)) {
 		struct dm_zone *zone = bioctx->zone;
@@ -565,31 +567,51 @@ static int dmz_queue_chunk_work(struct dmz_target *dmz, struct bio *bio)
 }

 /*
- * Check the backing device availability. If it's on the way out,
+ * Check if the backing device is being removed. If it's on the way out,
  * start failing I/O. Reclaim and metadata components also call this
  * function to cleanly abort operation in the event of such failure.
  */
 bool dmz_bdev_is_dying(struct dmz_dev *dmz_dev)
 {
-	struct gendisk *disk;
-
-	if (!(dmz_dev->flags & DMZ_BDEV_DYING)) {
-		disk = dmz_dev->bdev->bd_disk;
-		if (blk_queue_dying(bdev_get_queue(dmz_dev->bdev))) {
-			dmz_dev_warn(dmz_dev, "Backing device queue dying");
-			dmz_dev->flags |= DMZ_BDEV_DYING;
-		} else if (disk->fops->check_events) {
-			if (disk->fops->check_events(disk, 0) &
-					DISK_EVENT_MEDIA_CHANGE) {
-				dmz_dev_warn(dmz_dev, "Backing device offline");
-				dmz_dev->flags |= DMZ_BDEV_DYING;
-			}
-		}
-	}
+	if (dmz_dev->flags & DMZ_BDEV_DYING)
+		return true;
+
+	if (dmz_dev->flags & DMZ_CHECK_BDEV)
+		return !dmz_check_bdev(dmz_dev);
+
+	if (blk_queue_dying(bdev_get_queue(dmz_dev->bdev))) {
+		dmz_dev_warn(dmz_dev, "Backing device queue dying");
+		dmz_dev->flags |= DMZ_BDEV_DYING;
+	}

 	return dmz_dev->flags & DMZ_BDEV_DYING;
 }

+/*
+ * Check the backing device availability. This detects such events as
+ * backing device going offline due to errors, media removals, etc.
+ * This check is less efficient than dmz_bdev_is_dying() and should
+ * only be performed as a part of error handling.
+ */
+bool dmz_check_bdev(struct dmz_dev *dmz_dev)
+{
+	struct gendisk *disk;
+
+	dmz_dev->flags &= ~DMZ_CHECK_BDEV;
+
+	if (dmz_bdev_is_dying(dmz_dev))
+		return false;
+
+	disk = dmz_dev->bdev->bd_disk;
+	if (disk->fops->check_events &&
+	    disk->fops->check_events(disk, 0) & DISK_EVENT_MEDIA_CHANGE) {
+		dmz_dev_warn(dmz_dev, "Backing device offline");
+		dmz_dev->flags |= DMZ_BDEV_DYING;
+	}
+
+	return !(dmz_dev->flags & DMZ_BDEV_DYING);
+}
+
 /*
  * Process a new BIO.
  */
@@ -902,8 +924,8 @@ static int dmz_prepare_ioctl(struct dm_target *ti, struct block_device **bdev)
 {
 	struct dmz_target *dmz = ti->private;

-	if (dmz_bdev_is_dying(dmz->dev))
-		return -ENODEV;
+	if (!dmz_check_bdev(dmz->dev))
+		return -EIO;

 	*bdev = dmz->dev->bdev;
...
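The dm-zoned target hunks above split availability checking in two: dmz_bdev_is_dying() stays cheap enough for the hot path (it only looks at flags and the queue state), while the heavier dmz_check_bdev(), which may call the disk's check_events method, runs only during error handling and is triggered lazily through the DMZ_CHECK_BDEV flag set by dmz_bio_endio() on a failed bio. A userspace model of that lazy re-check scheme (flag values and helpers are illustrative, not the dm-zoned definitions):

/* Userspace model of the lazy re-check above: I/O completion only sets a
 * "needs check" flag; the cheap is_dying() test runs the expensive probe
 * at most when an error was actually seen. */
#include <stdbool.h>
#include <stdio.h>

#define TOY_BDEV_DYING (1u << 0)
#define TOY_CHECK_BDEV (1u << 1)

struct toy_dev {
	unsigned flags;
	bool media_gone;	/* stands in for the check_events() result */
};

static bool toy_check_bdev(struct toy_dev *dev)
{
	dev->flags &= ~TOY_CHECK_BDEV;		/* the expensive probe consumes the request */
	if (dev->media_gone)
		dev->flags |= TOY_BDEV_DYING;
	return !(dev->flags & TOY_BDEV_DYING);	/* true = still usable */
}

static bool toy_bdev_is_dying(struct toy_dev *dev)
{
	if (dev->flags & TOY_BDEV_DYING)
		return true;
	if (dev->flags & TOY_CHECK_BDEV)	/* only probe after an I/O error */
		return !toy_check_bdev(dev);
	return false;
}

int main(void)
{
	struct toy_dev dev = { 0, false };

	printf("healthy, no errors seen: dying=%d\n", toy_bdev_is_dying(&dev));

	dev.media_gone = true;
	dev.flags |= TOY_CHECK_BDEV;		/* what the endio path does on error */
	printf("after failed bio:        dying=%d\n", toy_bdev_is_dying(&dev));
	return 0;
}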
@@ -72,6 +72,7 @@ struct dmz_dev {
 /* Device flags. */
 #define DMZ_BDEV_DYING (1 << 0)
+#define DMZ_CHECK_BDEV (2 << 0)

 /*
  * Zone descriptor.
@@ -255,5 +256,6 @@ void dmz_schedule_reclaim(struct dmz_reclaim *zrc);
  * Functions defined in dm-zoned-target.c
  */
 bool dmz_bdev_is_dying(struct dmz_dev *dmz_dev);
+bool dmz_check_bdev(struct dmz_dev *dmz_dev);

 #endif /* DM_ZONED_H */
@@ -608,9 +608,6 @@ void *dm_vcalloc(unsigned long nmemb, unsigned long elem_size);
  */
 #define dm_round_up(n, sz) (dm_div_up((n), (sz)) * (sz))

-#define dm_array_too_big(fixed, obj, num) \
-	((num) > (UINT_MAX - (fixed)) / (obj))
-
 /*
  * Sector offset taken relative to the start of the target instead of
  * relative to the start of the device.
...
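With dm-stripe converted to struct_size(), dm_array_too_big() has no remaining users and is removed above. The macro implemented the classic pre-multiplication overflow test; a small demonstration of what it checked (struct_size() performs an equivalent checked computation, saturating instead of reporting):

/* Demonstration of the overflow test the dropped macro performed: before
 * computing fixed + obj * num, make sure the product cannot exceed UINT_MAX. */
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

static bool array_too_big(unsigned fixed, unsigned obj, unsigned num)
{
	return num > (UINT_MAX - fixed) / obj;
}

int main(void)
{
	/* 16-byte header plus one million 24-byte elements: fits comfortably */
	printf("small array too big? %d\n", array_too_big(16, 24, 1000000));
	/* absurd element count: the unchecked product would wrap around */
	printf("huge array too big?  %d\n", array_too_big(16, 24, UINT_MAX));
	return 0;
}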