    Btrfs: fix -ENOSPC on block group removal · 39c2d7fa
    Filipe Manana authored
    Unlike when attempting to allocate a new block group, where we check
    that we have enough space in the system space_info to update the device
    items and insert a new chunk item in the chunk tree, we were not checking
    if the system space_info had enough space for updating the device items
    and deleting the chunk item in the chunk tree. This often led to an -ENOSPC
    error when attempting to allocate blocks for the chunk tree (during btree
    node/leaf COW operations) while updating the device items or deleting the
    chunk item, which resulted in the current transaction being aborted and
    the filesystem being turned into read-only mode.
    
    While running fstests generic/038, which stresses allocation of block
    groups and removal of unused block groups, with a large scratch device
    (750GB), this happened often, despite more than enough unallocated space,
    and resulted in the following trace:
    
    [68663.586604] WARNING: CPU: 3 PID: 1521 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x52/0x114 [btrfs]()
    [68663.600407] BTRFS: Transaction aborted (error -28)
    (...)
    [68663.730829] Call Trace:
    [68663.732585]  [<ffffffff8142fa46>] dump_stack+0x4f/0x7b
    [68663.734334]  [<ffffffff8108b6a2>] ? console_unlock+0x361/0x3ad
    [68663.739980]  [<ffffffff81045ea5>] warn_slowpath_common+0xa1/0xbb
    [68663.757153]  [<ffffffffa036ca6d>] ? __btrfs_abort_transaction+0x52/0x114 [btrfs]
    [68663.760925]  [<ffffffff81045f05>] warn_slowpath_fmt+0x46/0x48
    [68663.762854]  [<ffffffffa03b159d>] ? btrfs_update_device+0x15a/0x16c [btrfs]
    [68663.764073]  [<ffffffffa036ca6d>] __btrfs_abort_transaction+0x52/0x114 [btrfs]
    [68663.765130]  [<ffffffffa03b3638>] btrfs_remove_chunk+0x597/0x5ee [btrfs]
    [68663.765998]  [<ffffffffa0384663>] ? btrfs_delete_unused_bgs+0x245/0x296 [btrfs]
    [68663.767068]  [<ffffffffa0384676>] btrfs_delete_unused_bgs+0x258/0x296 [btrfs]
    [68663.768227]  [<ffffffff8143527f>] ? _raw_spin_unlock_irq+0x2d/0x4c
    [68663.769081]  [<ffffffffa038b109>] cleaner_kthread+0x13d/0x16c [btrfs]
    [68663.799485]  [<ffffffffa038afcc>] ? btrfs_alloc_root+0x28/0x28 [btrfs]
    [68663.809208]  [<ffffffff8105f367>] kthread+0xef/0xf7
    [68663.828795]  [<ffffffff810e603f>] ? time_hardirqs_on+0x15/0x28
    [68663.844942]  [<ffffffff8105f278>] ? __kthread_parkme+0xad/0xad
    [68663.846486]  [<ffffffff81435a88>] ret_from_fork+0x58/0x90
    [68663.847760]  [<ffffffff8105f278>] ? __kthread_parkme+0xad/0xad
    [68663.849503] ---[ end trace 798477c6d6dbaad6 ]---
    [68663.850525] BTRFS: error (device sdc) in btrfs_remove_chunk:2652: errno=-28 No space left
    
    So fix this by verifying that enough space exists in the system
    space_info, and reserving that space in the chunk block reserve, before
    attempting to delete the block group. If there is not enough space to
    perform the necessary updates and deletion in the chunk tree, allocate a
    new system chunk first. As in the block group creation case, we don't
    error out if we fail to allocate a new system chunk, since we might end
    up not needing it (no node/leaf splits happen during the COW operations
    and/or we end up not needing to COW any btree nodes or leaves because
    they were already COWed in the current transaction and their writeback
    hasn't started yet).
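
    The check described above can be illustrated with a small user-space
    model. This is only a sketch under stated assumptions, not the kernel
    code: struct fs_model, sys_space_left(), try_alloc_system_chunk() and
    check_system_space() are hypothetical names, and the byte counts are
    arbitrary. It shows the ordering (check, optionally allocate a system
    chunk, then reserve) and that a failed allocation is tolerated rather
    than reported as an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model; none of these are kernel structures. */
struct fs_model {
	uint64_t sys_total;      /* bytes in allocated system chunks */
	uint64_t sys_used;       /* bytes used or already reserved */
	uint64_t chunk_rsv;      /* bytes held in the chunk block reserve */
	uint64_t unallocated;    /* unallocated device space */
	uint64_t sys_chunk_size; /* size of one new system chunk */
};

/* Free bytes left in the modeled system space_info. */
static uint64_t sys_space_left(const struct fs_model *fs)
{
	assert(fs->sys_used <= fs->sys_total);
	return fs->sys_total - fs->sys_used;
}

/* Try to add one system chunk; false if the device has no room for it. */
static bool try_alloc_system_chunk(struct fs_model *fs)
{
	if (fs->unallocated < fs->sys_chunk_size)
		return false;
	fs->unallocated -= fs->sys_chunk_size;
	fs->sys_total += fs->sys_chunk_size;
	return true;
}

/*
 * Called before removing a block group. 'needed' is the caller's estimate
 * of the bytes required to update the device items and delete the chunk
 * item. Returns the number of bytes reserved; 0 means "proceed without a
 * reservation", which is not treated as an error here.
 */
static uint64_t check_system_space(struct fs_model *fs, uint64_t needed)
{
	if (sys_space_left(fs) < needed)
		(void)try_alloc_system_chunk(fs); /* failure is tolerated */

	if (sys_space_left(fs) < needed)
		return 0;

	fs->sys_used += needed;
	fs->chunk_rsv += needed;
	return needed;
}
```

    In this model a return value of 0 only means that no space could be set
    aside up front; the removal may still succeed if the COW operations end
    up not needing new chunk tree blocks.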

    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Chris Mason <clm@fb.com>