Commits · 7b41b51a705ec0eb5f88060c9f724c8bc0e79eab · nexedi / linux

08 Apr, 2013 4 commits

bcache: Documentation updates · 7b41b51a

Kent Overstreet authored Mar 27, 2013

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Use WARN_ONCE() instead of __WARN() · cc0f4eaa

Kent Overstreet authored Mar 27, 2013

Signed-off-by: Kent Overstreet <koverstreet@google.com>

bcache: Add missing #include <linux/prefetch.h> · cd953ed0

Geert Uytterhoeven authored Mar 27, 2013

m68k/allmodconfig:

drivers/md/bcache/bset.c: In function ‘bset_search_tree’:
drivers/md/bcache/bset.c:727: error: implicit declaration of function ‘prefetch’

drivers/md/bcache/btree.c: In function ‘bch_btree_node_get’:
drivers/md/bcache/btree.c:933: error: implicit declaration of function ‘prefetch’
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kent Overstreet <koverstreet@google.com>


bcache: Sparse fixes · c19ed23a

Kent Overstreet authored Mar 26, 2013

Signed-off-by: Kent Overstreet <koverstreet@google.com>

28 Mar, 2013 18 commits

bcache: Don't export utility code, prefix with bch_ · 169ef1cf

Kent Overstreet authored Mar 28, 2013

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: fix if(); found by kbuild test robot · 0b6ef416

Lars Ellenberg authored Mar 27, 2013

Recently introduced al_begin_io_nonblock() was returning -EBUSY,
even when it should return -EWOULDBLOCK.

Impact:
A few spurious wake_up() calls in prepare_al_transaction_nonblock().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: use sched_setscheduler() · 3990e04d

Philipp Reisner authored Mar 27, 2013

It was unnoticed for some time that assigning to current->policy is
no longer sufficient to set a real time priority for a kernel thread.
Reported-by: Charlie Suffin <Charlie.Suffin@stratus.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: fix for deadlock when using automatic split-brain-recovery · 7c689e63

Philipp Reisner authored Mar 27, 2013

With an automatic after split-brain recovery policy of
"after-sb-1pri call-pri-lost-after-sb",
when trying to drbd_set_role() to R_SECONDARY,
we run into a deadlock.

This was first recognized and supposedly fixed by
2009-06-10 "Fixed a deadlock when using automatic split brain recovery when both nodes are"
replacing drbd_set_role() with drbd_change_state() in that code-path,
but the first hunk of that patch forgets to remove the drbd_set_role().

We apparently only ever tested the "two primaries" case.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: add module_put() on error path in drbd_proc_open() · 193d0153

Alexey Khoroshilov authored Mar 27, 2013

If single_open() fails in drbd_proc_open(), module refcount is left incremented.
The patch adds module_put() on the error path.

Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
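
The error-path pattern described above can be sketched in userspace C. This is only an illustration, not the drbd code: `demo_try_module_get()`, `demo_module_put()`, `demo_single_open()` and the plain integer counter are hypothetical stand-ins for the kernel's module reference counting and `single_open()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the module reference count. */
static int demo_module_refs;

static bool demo_try_module_get(void)
{
	demo_module_refs++;
	return true;
}

static void demo_module_put(void)
{
	assert(demo_module_refs > 0);	/* a put without a matching get is a bug */
	demo_module_refs--;
}

/* Hypothetical stand-in for single_open(); fails when asked to. */
static int demo_single_open(bool should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/* Take a reference, try to open, and drop the reference again on failure. */
int demo_proc_open(bool open_should_fail)
{
	int err;

	if (!demo_try_module_get())
		return -ENODEV;

	err = demo_single_open(open_should_fail);
	if (err)
		demo_module_put();	/* the kind of cleanup the commit adds */
	return err;
}

/* Expose the counter so callers can check that it stays balanced. */
int demo_refcount(void)
{
	return demo_module_refs;
}
```

After a failed `demo_proc_open(true)` the counter is back at its previous value; only a successful open keeps the reference until the matching release.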


drbd: fix drbd epoch write count for ahead/behind mode · 607f25e5

Lars Ellenberg authored Mar 27, 2013

The sanity check when receiving P_BARRIER_ACK does expect all write
requests with a given req->epoch to have been either all replicated,
or all not replicated.

Because req->epoch was assigned before calling maybe_pull_ahead(),
this expectation was not met, leading to an off-by-one in the sanity
check, and further to a "Protocol Error".

Fix: move the call to maybe_pull_ahead() a few lines up,
and assign req->epoch only after that.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: Fix build error when CONFIG_CRYPTO_HMAC is not set · ef57f9e6

Philipp Reisner authored Mar 27, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: validate resync_after dependency on attach already · a3f8f7dc

Lars Ellenberg authored Mar 27, 2013

We validated resync_after dependencies, if changed via disk-options.
But we did not validate them when first created via attach.
We also did not check or cleanup dependencies that used to be correct,
but now point to meanwhile removed minor devices.

If the drbd_resync_after_valid() validation in disk-options tried to
follow a dependency chain in this way, this could lead to NULL pointer
dereference.

Validate resync_after settings in drbd_adm_attach() already, as well as
in drbd_adm_disk_opts(), and only reject dependency loops.
Depending on non-existing disks is allowed and equivalent to no dependency.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
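
The rule stated above (reject dependency loops, treat a missing disk as no dependency) can be sketched as a bounded chain walk. This is a simplified userspace model, not the drbd implementation: `struct demo_device`, `DEMO_MAX_MINORS`, `DEMO_NO_DEP` and `demo_resync_after_valid()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_MINORS 8	/* hypothetical table size */
#define DEMO_NO_DEP (-1)	/* hypothetical "no dependency" marker */

struct demo_device {
	bool present;		/* false: minor was removed or never created */
	int resync_after;	/* minor to resync after, or DEMO_NO_DEP */
};

static bool demo_exists(const struct demo_device *devs, int minor)
{
	return minor >= 0 && minor < DEMO_MAX_MINORS && devs[minor].present;
}

/*
 * Would setting devs[minor].resync_after = new_dep be acceptable?
 * Only loops are rejected; a chain that ends at a missing device
 * counts the same as a chain that ends with no dependency.
 */
bool demo_resync_after_valid(const struct demo_device *devs, int minor, int new_dep)
{
	int cur = new_dep;
	int steps;

	assert(devs != NULL);
	for (steps = 0; steps < DEMO_MAX_MINORS; steps++) {
		if (cur == minor)
			return false;	/* chain leads back to this device */
		if (cur == DEMO_NO_DEP || !demo_exists(devs, cur))
			return true;	/* chain ends: no loop */
		cur = devs[cur].resync_after;
	}
	return false;	/* walk did not terminate: a loop exists elsewhere */
}
```

Checking `demo_exists()` before following each link is what keeps the walk from dereferencing an entry for a device that has since been removed.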


drbd: fix memory leak · 94ad0a10

Lars Ellenberg authored Mar 27, 2013

We forgot to free the disk_conf,
so for each attach/detach cycle we leaked 336 bytes.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: only fail empty flushes if no good data is reachable · 7074e4a7

Lars Ellenberg authored Mar 27, 2013

We completed empty flushes (blkdev_issue_flush()) with IO error
if we lost the local disk, even if we still have an established
replication link to a healthy remote disk.

Fix this to only report errors to upper layers,
if neither local nor remote data is reachable.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: Fix disconnect to keep the peer disk state if connection breaks during operation · 2bd5ed5d

Philipp Reisner authored Mar 27, 2013

The issue was that if the connection broke while we did the
graceful state change to C_DISCONNECTING (C_TEARDOWN), then
we returned a success code (SS_CW_NO_NEED) from the state engine.

As a result, we failed to call the fence-peer
script in such a case.

Fixed that by introducing a new error code (SS_OUTDATE_WO_CONN).
This one should never reach back into user space.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: fix spurious warning about bitmap being locked from detach · bb45185d

Philipp Reisner authored Mar 27, 2013

Introduced in drbd: always write bitmap on detach,
the bitmap bulk writeout on detach was indicating
it expected exclusive bitmap access.

Where I meant to say: expect no more modifications,
but testing/counting is still allowed.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: drop now useless duplicate state request from invalidate · 0b2dafcd

Philipp Reisner authored Mar 27, 2013

Patch best viewed with git diff --ignore-space-change.

Now that we attempt the fallback to local bitmap operation
only when disconnected, we can safely drop the extra "silent"
state request from both invalidate and invalidate-remote.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: fix effective error returned when refusing an invalidate · 5c4f13d9

Philipp Reisner authored Mar 27, 2013

Since commit
  drbd: Disallow the peer_disk_state to be D_OUTDATED while connected
trying to invalidate a disconnected Primary returned an error code
that did not really match the situation:
"Refusing to be Outdated while Connected"

Insert two more specific conditions into is_valid_state(),
changing that to "Need access to UpToDate data",
respectively "Need a connection to start verify or resync".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: move invalidating the whole bitmap out of after_state_ch() · 9376d9f8

Philipp Reisner authored Mar 27, 2013

To avoid other state change requests, after passing through
sanitize_state(), to be mistaken for an invalidate,
move the "set all bits as out-of-sync" into the invalidate path.

Make invalidate and invalidate-remote behave consistently wrt.
current connection state (need either an established replication link,
or really be disconnected). Also mention that in the documentation.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: abort start of resync early, if it raced with connection breakage · a700471b

Philipp Reisner authored Mar 27, 2013

We've seen a spurious full resync, because a connection breakage
raced with drbd_start_resync(, C_SYNC_TARGET),
and the resulting state change request intended to start the resync
ended up looking like a local invalidate.

Fix:
Double check the state inside the lock,
and don't even request that state change,
if we had connection or IO problems.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: reset ap_in_flight counter for new connections · 2d56a974

Philipp Reisner authored Mar 27, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


idr: document exit conditions on idr_for_each_entry better · b949be58

George Spelvin authored Mar 27, 2013

And some manual common subexpression elimination which may help the
compiler produce smaller code.
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


26 Mar, 2013 1 commit

bcache: Fix for the build fixes · 29177b89

Kent Overstreet authored Mar 25, 2013

Commit 82a84eaf7e51ba3da0c36cbc401034a4e943492d left a return 0 in
closure_debug_init(). Whoops.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


25 Mar, 2013 4 commits

aoe: get rid of cached bv variable in bufinit() · 2124469e

Jens Axboe authored Mar 25, 2013

Less error prone if we just kill it, it's only used once
anyway.
Signed-off-by: Jens Axboe <axboe@kernel.dk>


bcache: Style/checkpatch fixes · b1a67b0f

Kent Overstreet authored Mar 25, 2013

Took out some nested functions, and fixed some more checkpatch
complaints.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>


bcache: Build fixes from test robot · 07e86ccb

Kent Overstreet authored Mar 25, 2013

config: make ARCH=i386 allmodconfig

All error/warnings:

   drivers/md/bcache/bset.c: In function 'bch_ptr_bad':
>> drivers/md/bcache/bset.c:164:2: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
--
   drivers/md/bcache/debug.c: In function 'bch_pbtree':
>> drivers/md/bcache/debug.c:86:4: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
--
   drivers/md/bcache/btree.c: In function 'bch_btree_read_done':
>> drivers/md/bcache/btree.c:245:8: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' [-Wformat]
--
   drivers/md/bcache/closure.o: In function `closure_debug_init':
>> (.init.text+0x0): multiple definition of `init_module'
>> drivers/md/bcache/super.o:super.c:(.init.text+0x0): first defined here
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
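
The -Wformat warnings above appear on i386 because size_t is unsigned int there, while '%li' and '%lu' expect a long. The log does not show how this commit changed the format strings; a common fix for this class of warning is the C99 `%zu` conversion, sketched below with a hypothetical helper `demo_format_sizes()`.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format two size_t values portably. "%zu" matches size_t on both
 * 32-bit and 64-bit targets, so no cast is needed and -Wformat stays quiet.
 */
int demo_format_sizes(char *buf, size_t buflen, size_t used, size_t total)
{
	return snprintf(buf, buflen, "%zu/%zu", used, total);
}
```

The return value is the number of characters `snprintf()` would have written, so a caller can compare it against `buflen` to detect truncation.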


Merge branch 'bcache-for-upstream' of... · e226e341

Jens Axboe authored Mar 24, 2013

Merge branch 'bcache-for-upstream' of http://evilpiepirate.org/git/linux-bcache into for-3.10/drivers


23 Mar, 2013 13 commits

bcache: A block layer cache · cafe5635

Kent Overstreet authored Mar 23, 2013

Does writethrough and writeback caching, handles unclean shutdown, and
has a bunch of other nifty features motivated by real world usage.

See the wiki at http://bcache.evilpiepirate.org for more.
Signed-off-by: Kent Overstreet <koverstreet@google.com>


Export __lockdep_no_validate__ · ea6749c7

Kent Overstreet authored Dec 27, 2012

Hack, but bcache needs a way around lockdep for locking during garbage
collection - we need to keep multiple btree nodes locked for coalescing
and rw_lock_nested() isn't really sufficient or appropriate here.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Ingo Molnar <mingo@redhat.com>


Export blk_fill_rwbs() · 9ca8f8e5

Kent Overstreet authored Apr 13, 2012

Exported so it can be used by bcache's tracepoints
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Ingo Molnar <mingo@redhat.com>


Export get_random_int() · 1f8e8ed0

Kent Overstreet authored Apr 09, 2012

Needed for bcache - need a cheap source of random numbers for perturbing
IO sizes, for rate limiting IO to the SSD.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: "Theodore Ts'o" <tytso@mit.edu>


Revert "rw_semaphore: remove up/down_read_non_owner" · 84759c6d

Kent Overstreet authored Sep 21, 2011

This reverts commit 11b80f45.

Bcache needs rw semaphores for cache coherency in writeback mode -
writes have to take a read lock on a per cache device rw sem, and
release it when the bio completes.

But since this is for bios it's naturally not in the context of the
process that originally took the lock.
Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Christoph Hellwig <hch@infradead.org>
CC: David Howells <dhowells@redhat.com>


drbd: adjust upper limit for activity log extents · 5bbcf5e6

Lars Ellenberg authored Mar 19, 2013

Now that the on-disk activity-log ring buffer size is adjustable,
the maximum active set can become larger, and is now limited by
the use of 16bit "labels".

This increases the maximum working set from 6433 to 65534 extents,
each of which covers an area of 4MiB.
Which means that if you use the maximum, you'd have to resync
more than 250 GiB after an unclean Primary shutdown.
With capable backend storage and replication links,
this is entirely feasible.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
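
The arithmetic behind the "more than 250 GiB" figure can be checked with a short sketch. The extent counts and the 4 MiB extent size come from the message above; the helper names `demo_al_coverage_mib()` and `demo_mib_to_whole_gib()` are hypothetical.

```c
#include <stdint.h>

#define DEMO_AL_EXTENT_MIB 4u	/* each extent covers 4 MiB, per the commit message */

/* Area covered by a given number of activity log extents, in MiB. */
uint64_t demo_al_coverage_mib(uint32_t extents)
{
	return (uint64_t)extents * DEMO_AL_EXTENT_MIB;
}

/* Convert MiB to whole GiB, rounding down. */
uint64_t demo_mib_to_whole_gib(uint64_t mib)
{
	return mib / 1024u;
}
```

With 65534 extents this gives 262136 MiB, just under 256 GiB; the previous limit of 6433 extents covered 25732 MiB, roughly 25 GiB.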


drbd: try hard to max out the updates per AL transaction · 45ad07b3

Lars Ellenberg authored Mar 19, 2013

There may have been more incoming requests while we were preparing
the current transaction. Try to consolidate more updates into this
transaction until we make no more progress.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: move start io accounting before activity log transaction · 7e8c288f

Lars Ellenberg authored Mar 19, 2013

The IO accounting of the drbd "queue depth" was misleading.
We only started IO accounting once we already wrote the activity log.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: consolidate as many updates as possible into one AL transaction · 08a1ddab

Lars Ellenberg authored Mar 19, 2013

Depending on current IO depth, try to consolidate as many updates
as possible into one activity log transaction.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


lru_cache: introduce lc_get_cumulative() · cbe5e610

Lars Ellenberg authored Mar 22, 2013

New helper to be able to consolidate more updates
into a single transaction.
Without this, we can only grab a single refcount
on an updated element while preparing a transaction.

lc_get_cumulative - like lc_get; also finds to-be-changed elements
  @lc: the lru cache to operate on
  @enr: the label to look up

  Unlike lc_get this also returns the element for @enr, if it belongs to
  a pending transaction, so the return values are like for lc_get(),
  plus:

  pointer to an element already on the "to_be_changed" list.
	  In this case, the cache was already marked %LC_DIRTY.

  Caller needs to make sure that the pending transaction is completed,
  before proceeding to actually use this element.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

Fixed up by Jens to export lc_get_cumulative().
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: queue writes on submitter thread, unless they pass the activity log fastpath · 779b3fe4

Lars Ellenberg authored Mar 19, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: split out some helper functions to drbd_al_begin_io · 6c3c4355

Lars Ellenberg authored Mar 19, 2013

To make the code easier to follow,
use an explicit find_active_resync_extent(),
and add a "nonblock" parameter to _al_get().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


drbd: split drbd_al_begin_io into fastpath, prepare, and commit · b5bc8e08

Lars Ellenberg authored Mar 19, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
