Commits · 981aa8c091e164ea51dd1e81b71a1f3852bbcceb · nexedi / linux

16 Dec, 2013 7 commits

bcache: bugfix - moving_gc now moves only correct buckets · 981aa8c0

Nicholas Swenson authored Nov 07, 2013

Removed gc_move_threshold because picking buckets only by
threshold could lead to moving extra buckets (i.e. if there are
buckets at the threshold that aren't supposed to be moved
due to space considerations).

This is replaced by a GC_MOVE bit in the gc_mark bitmask.
Now only marked buckets get moved.
Signed-off-by: Nicholas Swenson <nks@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
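
A minimal sketch of the idea described above, not the bcache code itself: selection sets a move flag inside a per-bucket mark, and the mover later consults only that flag. The struct layout, the GC_MOVE_BIT name and bit position, the helper names, and the selection policy (array order against a sector budget) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one flag bit inside a 16-bit per-bucket mark. */
#define GC_MOVE_BIT ((uint16_t)1u << 15)

struct bucket {
	uint16_t gc_mark;       /* bitmask of GC state, including the move flag */
	unsigned sectors_used;  /* live sectors still held in this bucket */
};

static void set_gc_move(struct bucket *b, bool move)
{
	if (move)
		b->gc_mark |= GC_MOVE_BIT;
	else
		b->gc_mark &= (uint16_t)~GC_MOVE_BIT;
}

static bool gc_move(const struct bucket *b)
{
	return (b->gc_mark & GC_MOVE_BIT) != 0;
}

/*
 * Mark buckets in array order until the space budget (in sectors) runs out.
 * Every bucket's flag is cleared first, so only buckets chosen in this pass
 * end up marked. Returns the number of buckets marked.
 */
static size_t mark_buckets_to_move(struct bucket *buckets, size_t n,
				   unsigned budget)
{
	size_t marked = 0;

	assert(buckets != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		set_gc_move(&buckets[i], false);
		if (buckets[i].sectors_used &&
		    buckets[i].sectors_used <= budget) {
			budget -= buckets[i].sectors_used;
			set_gc_move(&buckets[i], true);
			marked++;
		}
	}
	return marked;
}
```

With three buckets holding 10, 20 and 20 sectors and a budget of 30, this marks only the first two. A pure "sectors_used <= 20" threshold test would have selected all three, which is the kind of extra move the commit message describes.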

bcache: fix for gc crashing when no sectors are used · bee63f40

Nicholas Swenson authored Oct 31, 2013

Signed-off-by: Nicholas Swenson <nks@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Fix heap_peek() macro · 97d11a66

Nicholas Swenson authored Oct 23, 2013

Signed-off-by: Nicholas Swenson <nks@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Fix for can_attach_cache() · 9eb8ebeb

Nicholas Swenson authored Oct 22, 2013

Signed-off-by: Nicholas Swenson <nks@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Fix dirty_data accounting · d24a6e10

Kent Overstreet authored Nov 10, 2013

Dirty data accounting wasn't quite right - firstly, we were adding the key we're
inserting after it could have merged with another dirty key already in the
btree, and secondly we could sometimes pass the wrong offset to
bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is
important when tracking dirty data by stripe.

NOTE FOR BACKPORTERS: For 3.10 (and 3.11?) there are other accounting fixes
necessary that got squashed in with other patches; the full patch against 3.10
is 408cc2f47eeac93a, available at:
  git://evilpiepirate.org/~kent/linux-bcache.git bcache-3.10-writeback-fixes
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 2a46036..4a12b2f 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1817,7 +1817,8 @@ static bool fix_overlapping_extents(struct btree *b, struct bkey *insert,
 			if (KEY_START(k) > KEY_START(insert) + sectors_found)
 				goto check_failed;

-			if (KEY_PTRS(replace_key) != KEY_PTRS(k))
+			if (KEY_PTRS(k) != KEY_PTRS(replace_key) ||
+			    KEY_DIRTY(k) != KEY_DIRTY(replace_key))
 				goto check_failed;

 			/* skip past gen */

bcache: Use uninterruptible sleep in writeback · ce2b3f59

Kent Overstreet authored Nov 28, 2013

We're just waiting on kthread_should_stop(), nothing else, so
interruptible sleep was wrong here.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: kthread don't set writeback task to INTERUPTIBLE · f665c0f8

Stefan Priebe authored Nov 16, 2013

at the beginning (schedule_timeout_interruptible) and others
do this on their own

This prevents wrong load average calculation (load of 1 per thread)
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

29 Nov, 2013 1 commit

bcache: fix sparse non static symbol warning · 08239ca2

Wei Yongjun authored Nov 28, 2013

Fixes the following sparse warning:

drivers/md/bcache/btree.c:2220:5: warning:
 symbol 'btree_insert_fn' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

11 Nov, 2013 32 commits

bcache: defensively handle format strings · c8694948

Kees Cook authored Sep 10, 2013

Just to be safe, call the error reporting function with "%s" to avoid
any possible future format string leak.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
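
The pattern can be sketched in userspace C as follows. The reporter here is a hypothetical stand-in (report_error() and report_error_str() are not bcache functions); the point is only that a message passed as an argument to "%s" is treated as data, so any conversion specifiers it happens to contain are never interpreted.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical printf-style reporter that formats into a caller buffer. */
static int report_error(char *out, size_t len, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(out != NULL && fmt != NULL);
	va_start(args, fmt);
	n = vsnprintf(out, len, fmt, args);
	va_end(args);
	return n;
}

/*
 * Defensive wrapper: the message is passed as an argument to "%s", never as
 * the format itself, so '%' sequences inside it are copied verbatim.
 */
static int report_error_str(char *out, size_t len, const char *msg)
{
	return report_error(out, len, "%s", msg);
}
```

Calling report_error(buf, sizeof(buf), msg) directly would be unsafe whenever msg can contain a '%' sequence, since vsnprintf() would then try to read arguments that were never passed.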

bcache: Bypass torture test · 5ceaaad7

Kent Overstreet authored Sep 10, 2013

More testing ftw! Also, now verify mode doesn't break if you read dirty
data.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Delete some slower inline asm · 098fb254

Kent Overstreet authored Aug 21, 2013

Never saw a profile of bset_search_tree() where it wasn't bottlenecked
on memory until I got my new Haswell machine, but when I tried it there
it was suddenly burning 20% of the cpu in the inner loop on shrd...

Turns out, the version of shrd that takes 64 bit operands has a 9 cycle
latency. hah.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Use ida for bcache block dev minor · 28935ab5

Kent Overstreet authored Jul 31, 2013

Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Fix sysfs splat on shutdown with flash only devs · c4d951dd

Kent Overstreet authored Aug 21, 2013

Whoops.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Better full stripe scanning · 48a915a8

Kent Overstreet authored Oct 31, 2013

The old scanning-by-stripe code burned too much CPU; this should be
better.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Have btree_split() insert into parent directly · 17e21a9f

Kent Overstreet authored Jul 26, 2013

The flow control in btree_insert_node() was... fragile... before,
this'll use more stack (but since our btrees are never more than depth
1, that shouldn't matter) and it should be significantly clearer and
less fragile.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Move spinlock into struct time_stats · 65d22e91

Kent Overstreet authored Jul 31, 2013

Minor cleanup.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Kill sequential_merge option · 8aee1220

Kent Overstreet authored Jul 30, 2013

It never really made sense to expose this, so just kill it.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Kill bch_next_recurse_key() · 50310164

Kent Overstreet authored Sep 10, 2013

This dates from before the btree iterator, and now it's finally gone.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Avoid deadlocking in garbage collection · bc9389ee

Kent Overstreet authored Sep 10, 2013

Not a complete fix - we could still deadlock if btree_insert_node() has
to split...
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Incremental gc · a1f0358b

Kent Overstreet authored Sep 10, 2013

Big garbage collection rewrite; now, garbage collection uses the same
mechanisms as used elsewhere for inserting/updating btree node pointers,
instead of rewriting interior btree nodes in place.

This makes the code significantly cleaner and less fragile, and means we
can now make garbage collection incremental - it doesn't have to hold a
write lock on the root of the btree for the entire duration of garbage
collection.

This means that there's less of a latency hit for doing garbage
collection, which means we can gc more frequently (and do a better job
of reclaiming from the cache), and we can coalesce across more btree
nodes (improving our space efficiency).
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Add make_btree_freeing_key() · 8835c123

Kent Overstreet authored Jul 24, 2013

Refactoring, prep work for incremental garbage collection.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Add btree_node_write_sync() · f269af5a

Kent Overstreet authored Jul 23, 2013

More refactoring - mostly making the interfaces more explicit about what
we actually want to do.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: PRECEDING_KEY() · 0eacac22

Kent Overstreet authored Jul 01, 2013

btree_insert_key() was open coding this, this is just refactoring.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: bch_(btree|extent)_ptr_invalid() · d5cc66e9

Kent Overstreet authored Jul 24, 2013

Trying to treat btree pointers and leaf node pointers the same way was a
mistake - going to start being more explicit about the type of
key/pointer we're dealing with. This is the first part of that
refactoring; this patch shouldn't change any actual behaviour.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Don't bother with bucket refcount for btree node allocations · 3a3b6a4e

Kent Overstreet authored Jul 24, 2013

The bucket refcount (dropped with bkey_put()) is only needed to prevent
the newly allocated bucket from being garbage collected until we've
added a pointer to it somewhere. But for btree node allocations, the
fact that we have btree nodes locked is enough to guard against races
with garbage collection.

Eventually the per bucket refcount is going to be replaced with
something specific to bch_alloc_sectors().
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Debug code improvements · 280481d0

Kent Overstreet authored Oct 24, 2013

Couple changes:
 * Consolidate bch_check_keys() and bch_check_key_order(), and move the
   checks that only check_key_order() could do to bch_btree_iter_next().

 * Get rid of CONFIG_BCACHE_EDEBUG - now, all that code is compiled in
   when CONFIG_BCACHE_DEBUG is enabled, and there's now a sysfs file to
   flip on the EDEBUG checks at runtime.

 * Dropped an old, not terribly useful check in rw_unlock(), and
   refactored/improved some of the other debug code.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Fix bch_ptr_bad() · e58ff155

Kent Overstreet authored Jul 24, 2013

Previously, bch_ptr_bad() could return false when there was a pointer to
a nonexistent device... it only filtered out keys with PTR_CHECK_DEV
pointers.

This behaviour was intended for multiple cache device support; for that,
just because the device for one of the pointers has gone away doesn't
mean we want to filter out the rest of the pointers.

But we don't yet explicitly filter/check individual pointers, so without
that this behaviour was wrong - a corrupt bkey with a bad device pointer
could cause us to deref a bad pointer. Doh.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
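
A rough userspace sketch of the stricter check, under stated assumptions: the types, the MAX_PTRS and MAX_CACHES limits, and the function name key_ptr_bad() are hypothetical and do not mirror the real bkey or cache_set layout. It shows only the rule the message argues for: a key is rejected as soon as any of its pointers names a device that is not present, before anything dereferences that device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PTRS   3   /* assumed upper bound on pointers per key */
#define MAX_CACHES 8   /* assumed upper bound on devices per set */

struct cache { int id; };

struct cache_set {
	struct cache *cache[MAX_CACHES];  /* NULL where no device is attached */
};

struct key {
	unsigned nr_ptrs;
	unsigned dev[MAX_PTRS];           /* device index for each pointer */
};

/* True if any pointer in the key refers to a missing or out-of-range device. */
static bool key_ptr_bad(const struct cache_set *c, const struct key *k)
{
	assert(c != NULL && k != NULL);
	if (k->nr_ptrs > MAX_PTRS)
		return true;
	for (unsigned i = 0; i < k->nr_ptrs; i++)
		if (k->dev[i] >= MAX_CACHES || !c->cache[k->dev[i]])
			return true;
	return false;
}
```

Filtering individual pointers instead of the whole key, as the message says multiple cache device support would eventually want, would need a per-pointer variant of this check.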

bcache: Pull on disk data structures out into a separate header · 81ab4190

Kent Overstreet authored Oct 31, 2013

Now, the on disk data structures are in a header that can be exported to
userspace - and having them all centralized is nice too.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Move sector allocator to alloc.c · 2599b53b

Kent Overstreet authored Jul 24, 2013

Just reorganizing things a bit.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Break up struct search · 220bb38c

Kent Overstreet authored Sep 10, 2013

With all the recent refactoring around struct btree_op, struct search has
gotten rather large.

But we can now easily break it up in a different way - we break out
struct btree_insert_op which is for inserting data into the cache, and
that's now what the copying gc code uses - struct search is now specific
to request.c.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Convert bch_btree_insert() to bch_btree_map_leaf_nodes() · cc7b8819

Kent Overstreet authored Jul 24, 2013

Last of the btree_map() conversions. Main visible effect is that
bch_btree_insert() no longer takes a struct btree_op as an argument -
there's no fancy state machine stuff going on, it's just a
normal function.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Don't use op->insert_collision · 6054c6d4

Kent Overstreet authored Jul 24, 2013

When we convert bch_btree_insert() to bch_btree_map_leaf_nodes(), we
won't be passing struct btree_op to bch_btree_insert() anymore - so we
need a different way of returning whether there was a collision (really,
a replace collision).
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Kill op->replace · 1b207d80

Kent Overstreet authored Sep 10, 2013

This is prep work for converting bch_btree_insert to
bch_btree_map_leaf_nodes() - we have to convert all its arguments to
actual arguments. Bunch of churn, but should be straightforward.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Drop some closure stuff · faadf0c9

Kent Overstreet authored Nov 01, 2013

With the recent bcache refactoring, some of the closure code isn't
needed anymore.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Kill op->cl · b54d6934

Kent Overstreet authored Jul 24, 2013

This isn't used for waiting asynchronously anymore - so this is a fairly
trivial refactoring.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Prune struct btree_op · c18536a7

Kent Overstreet authored Jul 24, 2013

Eventual goal is for struct btree_op to contain only what is necessary
for traversing the btree.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Clean up cache_lookup_fn · cc231966

Kent Overstreet authored Jul 24, 2013

There was some looping in submit_partial_cache_miss() and
submit_partial_cache_hit() that isn't needed anymore - originally, we
wouldn't necessarily process the full hit or miss all at once because
when splitting the bio, we took into account the restrictions of the
device we were sending it to.

But, device bio size restrictions are now handled elsewhere, with a
wrapper around generic_make_request() - so that looping has been
unnecessary for a while now and we can now do quite a bit of cleanup.

And if we trim the key we're reading from to match the subset we're
actually reading, we don't have to explicitly calculate bi_sector
anymore. Neat.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Convert bch_btree_read_async() to bch_btree_map_keys() · 2c1953e2

Kent Overstreet authored Jul 24, 2013

This is a fairly straightforward conversion, mostly reshuffling -
op->lookup_done goes away, replaced by MAP_DONE/MAP_CONTINUE. And the
code for handling cache hits and misses wasn't really btree code, so it
gets moved to request.c.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Move some stuff to btree.c · df8e8970

Kent Overstreet authored Jul 24, 2013

With the new btree_map() functions, we don't need to export the stuff
needed for traversing the btree anymore.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>

bcache: Add btree_map() functions · 48dad8ba

Kent Overstreet authored Sep 10, 2013

Lots of stuff has been open coding its own btree traversal - which is
generally pretty simple code, but there are a few subtleties.

This adds two new functions, bch_btree_map_nodes() and
bch_btree_map_keys(), which do the traversal for you. Everything that's
open coding btree traversal now (with the exception of garbage
collection) is slowly going to be converted to these two functions;
being able to write other code at a higher level of abstraction is a
big improvement w.r.t. overall code quality.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
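
To illustrate the shape of such an interface, here is a toy userspace sketch rather than the kernel implementation. The node layout, the map_keys() signature and the callback are hypothetical; the MAP_DONE and MAP_CONTINUE names come from the bch_btree_read_async() conversion listed earlier on this page, and the numeric values given to them here are assumptions. The traversal lives in one place, and callers supply only a per-key callback that says whether to keep going.

```c
#include <assert.h>
#include <stddef.h>

#define MAP_DONE     0   /* callback asks the traversal to stop */
#define MAP_CONTINUE 1   /* callback asks for the next key */

#define MAX_KEYS 4

/* Toy node: leaves hold keys; interior nodes hold one child per slot. */
struct node {
	int level;                       /* 0 for a leaf */
	size_t nr;                       /* keys (leaf) or children (interior) */
	int keys[MAX_KEYS];
	struct node *child[MAX_KEYS];
};

typedef int (*map_keys_fn)(void *ctx, int key);

/* Visit every leaf key in order until fn returns something other than
 * MAP_CONTINUE, and pass that value back up to the caller. */
static int map_keys(struct node *n, void *ctx, map_keys_fn fn)
{
	assert(n != NULL && fn != NULL);
	for (size_t i = 0; i < n->nr; i++) {
		int ret = n->level ? map_keys(n->child[i], ctx, fn)
				   : fn(ctx, n->keys[i]);
		if (ret != MAP_CONTINUE)
			return ret;
	}
	return MAP_CONTINUE;
}

/* Example callback: stop at the first key >= *ctx and store it in *ctx. */
static int find_first_at_least(void *ctx, int key)
{
	int *target = ctx;

	if (key < *target)
		return MAP_CONTINUE;
	*target = key;
	return MAP_DONE;
}
```

A caller that wants the first key at or above some value passes find_first_at_least() with a pointer to that value; a return of MAP_CONTINUE from map_keys() then means the whole tree was walked without a match.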
