Commit c9b2ffc0 authored by Marc MERLIN, committed by Jonathan Corbet

bcache: documentation updates and corrections

Bcache documentation updates:
- Added new HOWTO/COOKBOOK section
- fixed a few typos
- /sys/block/bcache0/cache_mode is /sys/block/bcache0/bcache/cache_mode
Signed-off-by: Marc MERLIN <marc@merlins.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent ebc88ef0
Say you've got a big slow raid 6, and an X-25E or three. Wouldn't it be Say you've got a big slow raid 6, and an ssd or three. Wouldn't it be
nice if you could use them as cache... Hence bcache. nice if you could use them as cache... Hence bcache.
Wiki and git repositories are at: Wiki and git repositories are at:
...@@ -8,7 +8,7 @@ Wiki and git repositories are at: ...@@ -8,7 +8,7 @@ Wiki and git repositories are at:
It's designed around the performance characteristics of SSDs - it only allocates It's designed around the performance characteristics of SSDs - it only allocates
in erase block sized buckets, and it uses a hybrid btree/log to track cached in erase block sized buckets, and it uses a hybrid btree/log to track cached
extants (which can be anywhere from a single sector to the bucket size). It's extents (which can be anywhere from a single sector to the bucket size). It's
designed to avoid random writes at all costs; it fills up an erase block designed to avoid random writes at all costs; it fills up an erase block
sequentially, then issues a discard before reusing it. sequentially, then issues a discard before reusing it.
...@@ -55,7 +55,10 @@ immediately. Without udev, you can manually register devices like this: ...@@ -55,7 +55,10 @@ immediately. Without udev, you can manually register devices like this:
Registering the backing device makes the bcache device show up in /dev; you can Registering the backing device makes the bcache device show up in /dev; you can
now format it and use it as normal. But the first time using a new bcache now format it and use it as normal. But the first time using a new bcache
device, it'll be running in passthrough mode until you attach it to a cache. device, it'll be running in passthrough mode until you attach it to a cache.
See the section on attaching. If you are thinking about using bcache later, it is recommended to set up all your
slow devices as bcache backing devices without a cache; you can then choose to add
a caching device later.
See the 'ATTACHING' section below.
The devices show up as: The devices show up as:
...@@ -72,12 +75,14 @@ To get started: ...@@ -72,12 +75,14 @@ To get started:
mount /dev/bcache0 /mnt mount /dev/bcache0 /mnt
You can control bcache devices through sysfs at /sys/block/bcache<N>/bcache . You can control bcache devices through sysfs at /sys/block/bcache<N>/bcache .
You can also control them through /sys/fs/bcache/<cset-uuid>/ .
Cache devices are managed as sets; multiple caches per set isn't supported yet Cache devices are managed as sets; multiple caches per set isn't supported yet
but will allow for mirroring of metadata and dirty data in the future. Your new but will allow for mirroring of metadata and dirty data in the future. Your new
cache set shows up as /sys/fs/bcache/<UUID> cache set shows up as /sys/fs/bcache/<UUID>
ATTACHING: ATTACHING
---------
After your cache device and backing device are registered, the backing device After your cache device and backing device are registered, the backing device
must be attached to your cache set to enable caching. Attaching a backing must be attached to your cache set to enable caching. Attaching a backing
...@@ -105,7 +110,8 @@ but all the cached data will be invalidated. If there was dirty data in the ...@@ -105,7 +110,8 @@ but all the cached data will be invalidated. If there was dirty data in the
cache, don't expect the filesystem to be recoverable - you will have massive cache, don't expect the filesystem to be recoverable - you will have massive
filesystem corruption, though ext4's fsck does work miracles. filesystem corruption, though ext4's fsck does work miracles.
ERROR HANDLING: ERROR HANDLING
--------------
Bcache tries to transparently handle IO errors to/from the cache device without Bcache tries to transparently handle IO errors to/from the cache device without
affecting normal operation; if it sees too many errors (the threshold is affecting normal operation; if it sees too many errors (the threshold is
...@@ -127,7 +133,143 @@ the backing devices to passthrough mode. ...@@ -127,7 +133,143 @@ the backing devices to passthrough mode.
writeback mode). It currently doesn't do anything intelligent if it fails to writeback mode). It currently doesn't do anything intelligent if it fails to
read some of the dirty data, though. read some of the dirty data, though.
TROUBLESHOOTING PERFORMANCE:
HOWTO/COOKBOOK
--------------
A) Your bcache doesn't start.
Starting a bcache with a missing caching device
Registering the backing device again doesn't help: it's already registered. You just need
to force it to run without the cache:
host:~# echo /dev/sdb1 > /sys/fs/bcache/register
[ 119.844831] bcache: register_bcache() error opening /dev/sdb1: device already registered
Next, you try to register your caching device if it's present. However if it's
absent, or registration fails for some reason, you can still start your bcache
without its cache, like so:
host:/sys/block/sdb/sdb1/bcache# echo 1 > running
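The forced start above can be wrapped in a small helper. This is a minimal sketch, not part of bcache itself: the function name bcache_force_run is hypothetical, and it assumes you pass the backing device's bcache sysfs directory (for example /sys/block/sdb/sdb1/bcache).

```shell
#!/bin/sh
# Sketch: force a registered backing device to run without its cache.
# bcache_force_run is a hypothetical helper name, not a bcache tool.
# $1: the backing device's bcache sysfs directory,
#     e.g. /sys/block/sdb/sdb1/bcache
bcache_force_run() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "no bcache directory at $dir (is the device registered?)" >&2
        return 1
    fi
    # Same operation as "echo 1 > running" in the example above.
    echo 1 > "$dir/running"
}

# Example (as root):
#   bcache_force_run /sys/block/sdb/sdb1/bcache
```

The directory check only guards against a typo in the path; it does not tell you whether the caching device is present.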
B) Bcache not finding its cache and not starting
This does not work:
host:/sys/block/md5/bcache# echo 0226553a-37cf-41d5-b3ce-8b1e944543a8 > attach
[ 1933.455082] bcache: bch_cached_dev_attach() Couldn't find uuid for md5 in set
[ 1933.478179] bcache: __cached_dev_store() Can't attach 0226553a-37cf-41d5-b3ce-8b1e944543a8
[ 1933.478179] : cache set not found
In this case, the caching device was simply not registered at boot or
disappeared and came back, and needs to be (re-)registered:
host:/sys/block/md5/bcache# echo /dev/sdh2 > /sys/fs/bcache/register
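The check-then-register step can be sketched as a helper that looks for the cache set under /sys/fs/bcache before writing to the register file. The function name bcache_ensure_cache_set and its optional third argument (an alternate sysfs root, useful for dry runs) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: make sure a cache set is registered before trying to attach to it.
# bcache_ensure_cache_set is a hypothetical helper name.
# $1: cache set UUID, $2: caching device (e.g. /dev/sdh2)
# $3: optional sysfs root, defaults to /sys/fs/bcache
bcache_ensure_cache_set() {
    uuid=$1
    cache_dev=$2
    root=${3:-/sys/fs/bcache}
    if [ -d "$root/$uuid" ]; then
        echo "cache set $uuid already registered"
        return 0
    fi
    echo "cache set $uuid not found, registering $cache_dev"
    # Same operation as "echo /dev/sdh2 > /sys/fs/bcache/register" above.
    echo "$cache_dev" > "$root/register"
}

# Example (as root):
#   bcache_ensure_cache_set 0226553a-37cf-41d5-b3ce-8b1e944543a8 /dev/sdh2
```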
C) Corrupt bcache caching device crashes the kernel on startup/boot
You'll have to wipe the caching device, start the backing device without the
cache, and you can re-attach the cleaned up caching device then. This does
require booting with a kernel/rescue media where bcache is disabled
since it will otherwise try to access your device and probably crash
again before you have a chance to wipe it.
(or if you plan ahead, compile a backup kernel with bcache disabled and keep it
in your grub config for a rainy day)
If bcache is not available in the kernel, a filesystem on the backing device is
still available at an 8KiB offset. You can reach it either via a loopdev of the backing device
created with --offset 8K, or by temporarily increasing the start sector of the
partition by 16 (512-byte sectors).
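The offset arithmetic and the loop device approach can be sketched as follows. The variable and function names are hypothetical; the 8KiB offset is the one stated above, and the losetup options used (--find, --show, --read-only, --offset) are standard util-linux flags. The loop device is opened read-only here as a precaution, which is a choice of this sketch rather than a requirement.

```shell
#!/bin/sh
# Sketch: reach the filesystem behind a bcache backing device without bcache.
# Names below are hypothetical; the 8KiB offset comes from the text above.
BCACHE_DATA_OFFSET_BYTES=$((8 * 1024))
SECTOR_BYTES=512

# Number of 512-byte sectors to add to the partition start.
bcache_offset_sectors() {
    echo $((BCACHE_DATA_OFFSET_BYTES / SECTOR_BYTES))
}

# Create a read-only loop device that starts at the offset and
# print its name (e.g. /dev/loop0). $1: backing device.
bcache_loop_ro() {
    losetup --find --show --read-only \
        --offset "$BCACHE_DATA_OFFSET_BYTES" "$1"
}

# Example (as root):
#   loopdev=$(bcache_loop_ro /dev/sdb1)
#   mount -o ro "$loopdev" /mnt
```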
This is how you wipe the caching device:
host:~# wipefs -a /dev/sdh2
16 bytes were erased at offset 0x1018 (bcache)
they were: c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81
After you boot back with bcache enabled, you recreate the cache and attach it:
host:~# make-bcache -C /dev/sdh2
UUID: 7be7e175-8f4c-4f99-94b2-9c904d227045
Set UUID: 5bc072a8-ab17-446d-9744-e247949913c1
version: 0
nbuckets: 106874
block_size: 1
bucket_size: 1024
nr_in_set: 1
nr_this_dev: 0
first_bucket: 1
[ 650.511912] bcache: run_cache_set() invalidating existing data
[ 650.549228] bcache: register_cache() registered cache device sdh2
Start the backing device with its cache missing:
host:/sys/block/md5/bcache# echo 1 > running
Attach the new cache:
host:/sys/block/md5/bcache# echo 5bc072a8-ab17-446d-9744-e247949913c1 > attach
[ 865.276616] bcache: bch_cached_dev_attach() Caching md5 as bcache0 on set 5bc072a8-ab17-446d-9744-e247949913c1
D) Remove or replace a caching device
host:/sys/block/sda/sda7/bcache# echo 1 > detach
[ 695.872542] bcache: cached_dev_detach_finish() Caching disabled for sda7
host:~# wipefs -a /dev/nvme0n1p4
wipefs: error: /dev/nvme0n1p4: probing initialization failed: Device or resource busy
Oops, it's disabled but not unregistered, so it's still protected.
We need to go and unregister it:
host:/sys/fs/bcache/b7ba27a1-2398-4649-8ae3-0959f57ba128# ls -l cache0
lrwxrwxrwx 1 root root 0 Feb 25 18:33 cache0 -> ../../../devices/pci0000:00/0000:00:1d.0/0000:70:00.0/nvme/nvme0/nvme0n1/nvme0n1p4/bcache/
host:/sys/fs/bcache/b7ba27a1-2398-4649-8ae3-0959f57ba128# echo 1 > stop
kernel: [ 917.041908] bcache: cache_set_free() Cache set b7ba27a1-2398-4649-8ae3-0959f57ba128 unregistered
Now we can wipe it:
host:~# wipefs -a /dev/nvme0n1p4
/dev/nvme0n1p4: 16 bytes were erased at offset 0x00001018 (bcache): c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81
E) dmcrypt and bcache
First set up bcache unencrypted and then install dmcrypt on top of /dev/bcache<N>.
This will work faster than if you dmcrypt both the backing and caching
devices and then install bcache on top.
F) Stop/free a registered bcache to wipe and/or recreate it
(or maybe you need to free up all bcache references so that you can have fdisk
run and re-register a changed partition table, which won't work if there are any
active backing or caching devices left on it)
1) Is it present in /dev/bcache* ? (there are times where it won't be)
If so, it's easy:
host:/sys/block/bcache0/bcache# echo 1 > stop
2) But if your backing device is gone, this won't work:
host:/sys/block/bcache0# cd bcache
bash: cd: bcache: No such file or directory
In this case, you may have to unregister the dmcrypt block device that
references this bcache to free it up:
host:~# dmsetup remove oldds1
bcache: bcache_device_free() bcache0 stopped
bcache: cache_set_free() Cache set 5bc072a8-ab17-446d-9744-e247949913c1 unregistered
This causes the backing bcache to be removed from /sys/fs/bcache, and then it can
be reused.
3) In other cases, you can also look in /sys/fs/bcache/:
host:/sys/fs/bcache# ls -l */{cache?,bdev?}
lrwxrwxrwx 1 root root 0 Mar 5 09:39 0226553a-37cf-41d5-b3ce-8b1e944543a8/bdev1 -> ../../../devices/virtual/block/dm-1/bcache/
lrwxrwxrwx 1 root root 0 Mar 5 09:39 0226553a-37cf-41d5-b3ce-8b1e944543a8/cache0 -> ../../../devices/virtual/block/dm-4/bcache/
lrwxrwxrwx 1 root root 0 Mar 5 09:39 5bc072a8-ab17-446d-9744-e247949913c1/cache0 -> ../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/ata10/host9/target9:0:0/9:0:0:0/block/sdl/sdl2/bcache/
The device names will show which UUID is relevant; cd into that directory
and stop the cache:
host:/sys/fs/bcache/5bc072a8-ab17-446d-9744-e247949913c1# echo 1 > stop
This will free up bcache references and let you reuse the partition for other
purposes.
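Matching a device name to its cache set UUID, as done by eye in the listing above, can also be scripted. This is a sketch under assumptions: the function name bcache_find_set is hypothetical, and it relies on the cache<N> and bdev<N> entries being symlinks whose targets end in <device>/bcache, as the listing shows.

```shell
#!/bin/sh
# Sketch: print the cache set UUID(s) that reference a given device.
# bcache_find_set is a hypothetical helper name.
# $1: kernel device name (e.g. sdl2 or dm-1)
# $2: optional sysfs root, defaults to /sys/fs/bcache
bcache_find_set() {
    name=$1
    root=${2:-/sys/fs/bcache}
    for link in "$root"/*/cache[0-9]* "$root"/*/bdev[0-9]*; do
        # Skip unmatched glob patterns and anything that is not a symlink.
        [ -L "$link" ] || continue
        case "$(readlink "$link")" in
            */"$name"/bcache|*/"$name"/bcache/)
                # The parent directory name is the cache set UUID.
                basename "$(dirname "$link")"
                ;;
        esac
    done
}

# Example:
#   uuid=$(bcache_find_set sdl2)
#   echo 1 > "/sys/fs/bcache/$uuid/stop"    # as root
```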
TROUBLESHOOTING PERFORMANCE
---------------------------
Bcache has a bunch of config options and tunables. The defaults are intended to Bcache has a bunch of config options and tunables. The defaults are intended to
be reasonable for typical desktop and server workloads, but they're not what you be reasonable for typical desktop and server workloads, but they're not what you
...@@ -140,7 +282,7 @@ want for getting the best possible numbers when benchmarking. ...@@ -140,7 +282,7 @@ want for getting the best possible numbers when benchmarking.
maturity, but simply because in writeback mode you'll lose data if something maturity, but simply because in writeback mode you'll lose data if something
happens to your SSD) happens to your SSD)
# echo writeback > /sys/block/bcache0/cache_mode # echo writeback > /sys/block/bcache0/bcache/cache_mode
- Bad performance, or traffic not going to the SSD that you'd expect - Bad performance, or traffic not going to the SSD that you'd expect
...@@ -193,7 +335,9 @@ want for getting the best possible numbers when benchmarking. ...@@ -193,7 +335,9 @@ want for getting the best possible numbers when benchmarking.
Solution: warm the cache by doing writes, or use the testing branch (there's Solution: warm the cache by doing writes, or use the testing branch (there's
a fix for the issue there). a fix for the issue there).
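For the cache_mode setting discussed earlier in this section, a small sketch for setting the mode and reading it back may help when checking that the corrected path is the one in use. The function names are hypothetical, and the parsing assumes that reading cache_mode returns the list of modes with the active one in square brackets (for example "writethrough [writeback] writearound none"); verify that against your kernel before relying on it.

```shell
#!/bin/sh
# Sketch: set and read back the bcache cache mode.
# Function names are hypothetical. Assumes cache_mode reads back as a list
# with the active mode in square brackets.
BCACHE_MODE_FILE=/sys/block/bcache0/bcache/cache_mode

# $1: mode (e.g. writeback), $2: optional path to the cache_mode file
bcache_set_mode() {
    echo "$1" > "${2:-$BCACHE_MODE_FILE}"
}

# $1: optional path to the cache_mode file; prints the bracketed mode
bcache_current_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "${1:-$BCACHE_MODE_FILE}"
}

# Example (as root):
#   bcache_set_mode writeback
#   bcache_current_mode
```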
SYSFS - BACKING DEVICE:
SYSFS - BACKING DEVICE
----------------------
Available at /sys/block/<bdev>/bcache, /sys/block/bcache*/bcache and Available at /sys/block/<bdev>/bcache, /sys/block/bcache*/bcache and
(if attached) /sys/fs/bcache/<cset-uuid>/bdev* (if attached) /sys/fs/bcache/<cset-uuid>/bdev*
......