Commits · 23d31bd476d1d096d0f073483547872ec155ab34 · Kirill Smelkov / linux

01 Jul, 2019 33 commits

btrfs: Use newly introduced btrfs_lock_and_flush_ordered_range · 23d31bd4

Nikolay Borisov authored May 07, 2019

There several functions which open code
btrfs_lock_and_flush_ordered_range, just replace them with a call to the
function. No functional changes.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

23d31bd4

btrfs: add new helper btrfs_lock_and_flush_ordered_range · ffa87214

Nikolay Borisov authored May 07, 2019

There is a certain idiom used in multiple places in btrfs' codebase,
dealing with flushing an ordered range. Factor this in a separate
function that can be reused. Future patches will replace the existing
code with that function.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ffa87214

btrfs: remove the incorrect comment on RO fs when btrfs_run_delalloc_range() fails · 1200b51f

Qu Wenruo authored May 09, 2019

At the context of btrfs_run_delalloc_range(), we haven't started/joined
a transaction, thus even something went wrong, we can't and won't abort
transaction, thus no way to make the fs RO.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

1200b51f

btrfs: extent-tree: Add trace events for space info numbers update · 480b9b4d

Qu Wenruo authored Apr 29, 2019

Add trace event for update_bytes_pinned() and update_bytes_may_use() to
detect underflow better.

The output would be something like (only showing data part):

  ## Buffered write start, 16K total ##
  2255.954 xfs_io/860 btrfs:update_bytes_may_use:(nil)U: type=DATA old=0 diff=4096
  2257.169 sudo/860 btrfs:update_bytes_may_use:(nil)U: type=DATA old=4096 diff=4096
  2257.346 sudo/860 btrfs:update_bytes_may_use:(nil)U: type=DATA old=8192 diff=4096
  2257.542 sudo/860 btrfs:update_bytes_may_use:(nil)U: type=DATA old=12288 diff=4096

  ## Delalloc start ##
  3727.853 kworker/u8:3-e/700 btrfs:update_bytes_may_use:(nil)U: type=DATA old=16384 diff=-16384

  ## Space cache update ##
  3733.132 sudo/862 btrfs:update_bytes_may_use:(nil)U: type=DATA old=0 diff=65536
  3733.169 sudo/862 btrfs:update_bytes_may_use:(nil)U: type=DATA old=65536 diff=-65536
  3739.868 sudo/862 btrfs:update_bytes_may_use:(nil)U: type=DATA old=0 diff=65536
  3739.891 sudo/862 btrfs:update_bytes_may_use:(nil)U: type=DATA old=65536 diff=-65536

These two trace events will allow bcc tool to probe btrfs_space_info
changes and detect underflow with more details (e.g. backtrace for each
update).
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

480b9b4d

btrfs: extent-tree: Add lockdep assert when updating space info · 0185f364

Qu Wenruo authored Apr 29, 2019

Just add a safe net for btrfs_space_info member updating.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

0185f364

btrfs: read number of data stripes from map only once · cff82672

David Sterba authored May 17, 2019

There are several places that call nr_data_stripes, but this value does
not change.
Signed-off-by: David Sterba <dsterba@suse.com>

cff82672

btrfs: constify map parameter for nr_parity_stripes and nr_data_stripes · 72ad8131
David Sterba authored May 17, 2019
```
Signed-off-by: David Sterba <dsterba@suse.com>
```
72ad8131

btrfs: refactor helper for bg flags to name conversion · 158da513

David Sterba authored May 17, 2019

The helper lacks the btrfs_ prefix and the parameter is the raw
blockgroup type, so none of the callers has to do the flags -> index
conversion.
Signed-off-by: David Sterba <dsterba@suse.com>

158da513

btrfs: factor out devs_max setting in __btrfs_alloc_chunk · e3ecdb3f
David Sterba authored May 17, 2019
```
Merge the repeated code before the if-else block.
Signed-off-by: David Sterba <dsterba@suse.com>
```
e3ecdb3f

btrfs: use u8 for raid_array members · 8c3e3582

David Sterba authored May 17, 2019

The raid_attr table is now 7 * 56 = 392 bytes long, consisting of just
small numbers so we don't have to use ints. New size is 7 * 32 = 224,
saving 3 cachelines.
Signed-off-by: David Sterba <dsterba@suse.com>

8c3e3582

btrfs: factor out helper for counting data stripes · 946c9256

David Sterba authored May 17, 2019

Factor the sequence of ifs to a helper, the 'data stripes' here means
the number of stripes without redundancy and parity.
Signed-off-by: David Sterba <dsterba@suse.com>

946c9256

btrfs: use raid_attr table for btrfs_bg_type_to_factor · 44b28ada
David Sterba authored May 17, 2019
```
The factor is the number of copies.
Signed-off-by: David Sterba <dsterba@suse.com>
```
44b28ada

btrfs: use raid_attr table to find profiles for integrity lowering · 6079e12c

David Sterba authored May 17, 2019

Replace open coded list of the profiles by selecting them from the
raid_attr table. The criteria are now more explicit, we need profiles
that have more than 1 copy of the data or can reconstruct the data with
a missing device.
Signed-off-by: David Sterba <dsterba@suse.com>

6079e12c

btrfs: use raid_attr to get allowed profiles for balance conversion · 081db89b

David Sterba authored May 17, 2019

Iterate over the table and gather all allowed profiles for a given
number of devices, instead of open coding.
Signed-off-by: David Sterba <dsterba@suse.com>

081db89b

btrfs: use raid_attr in btrfs_chunk_max_errors · fc9a2ac7

David Sterba authored May 17, 2019

The number of tolerated failures is stored in the raid_attr table, use
it.
Signed-off-by: David Sterba <dsterba@suse.com>

fc9a2ac7

btrfs: use raid_attr table in get_profile_num_devs · 9fa02ac7

David Sterba authored May 17, 2019

The dev_max constraints are defined in the raid_attr table, use it
instead of open-coding it.
Signed-off-by: David Sterba <dsterba@suse.com>

9fa02ac7

btrfs: remove mapping tree structures indirection · c8bf1b67

David Sterba authored May 17, 2019

fs_info::mapping_tree is the physical<->logical mapping tree and uses
the same underlying structure as extents, but is embedded to another
structure. There are no other members and this indirection is useless.
No functional change.
Signed-off-by: David Sterba <dsterba@suse.com>

c8bf1b67

btrfs: raid56: allow the exact minimum number of devices for balance convert · 49cc180c

David Sterba authored May 17, 2019

The minimum number of devices for RAID5 is 2, though this is only a bit
expensive RAID1, and for RAID6 it's 3, which is a triple copy that works
only 3 devices.

mkfs.btrfs allows that and mounting such filesystem also works, so the
conversion via balance filters is inconsistent with the others and we
should not prevent it.
Signed-off-by: David Sterba <dsterba@suse.com>

49cc180c

btrfs: fix minimum number of chunk errors for DUP · 0ee5f8ae

David Sterba authored May 17, 2019

The list of profiles in btrfs_chunk_max_errors lists DUP as a profile
DUP able to tolerate 1 device missing. Though this profile is special
with 2 copies, it still needs the device, unlike the others.

Looking at the history of changes, thre's no clear reason why DUP is
there, functions were refactored and blocks of code merged to one
helper.

d20983b4 Btrfs: fix writing data into the seed filesystem
  - factor code to a helper

de11cc12 Btrfs: don't pre-allocate btrfs bio
  - unrelated change, DUP still in the list with max errors 1

a236aed1 Btrfs: Deal with failed writes in mirrored configurations
  - introduced the max errors, leaves DUP and RAID1 in the same group
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

0ee5f8ae

Btrfs: remove unused variables in __btrfs_unlink_inode · be9b8dfa

Liu Bo authored Sep 12, 2018

This code was first introduced in 5f39d397 ("Btrfs: Create
extent_buffer interface for large blocksizes") and the function was
named btrfs_unlink_trans. It later got renamed to __btrfs_unlink_inode
and finally commit 16cdcec7 ("btrfs: implement delayed inode items
operation") changed the way inodes are deleted and obviated the need for
those two members.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ replace changelog by Nikolay's version ]
Signed-off-by: David Sterba <dsterba@suse.com>

be9b8dfa

btrfs: Remove unused variable mode in btrfs_mount · cebf05ca

Goldwyn Rodrigues authored Oct 05, 2018

This is a leftover from 312c89fb ("btrfs: cleanup btrfs_mount()
using btrfs_mount_root()"), the mode was used for opening devices that's
not done here anymore.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

cebf05ca

btrfs: switch order of unlocks of space_info and bg in do_trimming() · 8f63a840

Su Yue authored Nov 28, 2018

In function do_trimming(), block_group->lock should be unlocked first,
as the locks should be released in the reverse order. This does not
cause problems but should follow the best practices.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>

8f63a840

btrfs: tree-checker: Check if the file extent end overflows · 4c094c33

Qu Wenruo authored May 03, 2019

Under certain conditions, we could have strange file extent item in log
tree like:

  item 18 key (69599 108 397312) itemoff 15208 itemsize 53
	extent data disk bytenr 0 nr 0
	extent data offset 0 nr 18446744073709547520 ram 18446744073709547520

The num_bytes + ram_bytes overflow 64 bit type.

For num_bytes part, we can detect such overflow along with file offset
(key->offset), as file_offset + num_bytes should never go beyond u64.

For ram_bytes part, it's about the decompressed size of the extent, not
directly related to the size.
In theory it is OK to have a large value, and put extra limitation
on RAM bytes may cause unexpected false alerts.

So in tree-checker, we only check if the file offset and num bytes
overflow.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

4c094c33

btrfs: Remove redundant assignment of tgt_device->commit_total_bytes · 2ed95d2d

Nikolay Borisov authored May 14, 2019

This is already done in btrfs_init_dev_replace_tgtdev which is the first
phase of device replace, called before doing scrub. During that time
exclusive lock is held. Additionally btrfs_fs_device::commit_total_bytes
is always set based on the size of the underlying block device which
shouldn't change once set. This makes the 2nd assignment of the variable
in the finishing phase redundant.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

2ed95d2d

btrfs: Explicitly reserve space for devreplace item · f232ab04

Nikolay Borisov authored May 14, 2019

Part of device replace involves writing an item to the device root
containing information about pending replace operations. Currently space
for this item is not being explicitly reserved so this works thanks to
presence of global reserve. While not fatal it's not a good practice.
Let's be explicit about space requirement of device replace and reserve
space when starting the transaction.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

f232ab04

btrfs: Streamline replace sem unlock in btrfs_dev_replace_start · fa19452a

Nikolay Borisov authored May 14, 2019

There are only 2 branches which goto leave label with need_unlock set
to true. Essentially need_unlock is used as a substitute for directly
calling up_write. Since the branches needing this are only 2 and their
context is not that big it's more clear to just call up_write where
required. No functional changes.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

fa19452a

btrfs: Ensure btrfs_init_dev_replace_tgtdev sees up to date values · e1e0eb43

Nikolay Borisov authored May 14, 2019

btrfs_init_dev_replace_tgtdev reads certain values from the source
device (such as commit_total_bytes) which are updated during transaction
commit. Currently this function is called before committing any pending
transaction, leading to possibly reading outdated values.

Fix this by moving the function below the transaction commit, at this
point the EXCL_OP bit it set hence once transaction is complete the
total size of the device cannot be changed (it's usually changed by
resize/remove ops which are blocked).

Fixes: 9e271ae2 ("Btrfs: kernel operation should come after user input has been verified")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

e1e0eb43

btrfs: dev-replace: Remove impossible WARN_ON · 419684b2

Nikolay Borisov authored May 14, 2019

This WARN_ON can never trigger because src_device cannot be null.
btrfs_find_device_by_devspec always returns either an error or a valid
pointer to the device. Just remove it.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

419684b2

btrfs: Reduce critical section in btrfs_init_dev_replace_tgtdev · b0d9e1ea

Nikolay Borisov authored May 14, 2019

There is no point in holding btrfs_fs_devices::device_list_mutex
while initialising fields of the not-yet-published device. Instead,
hold the mutex only when the newly initialised device is being
published. I think holding device_list_mutex here is redundant
altogether, because at this point BTRFS_FS_EXCL_OP is set which
prevents device removal/addition/balance/resize to occur.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

b0d9e1ea

btrfs: Don't opencode sync_blockdev in btrfs_init_dev_replace_tgtdev · ddb93784

Nikolay Borisov authored May 14, 2019

Using sync_blockdev makes it plain obvious what's happening. No
functional changes.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ddb93784

btrfs: fiemap: preallocate ulists for btrfs_check_shared · 5911c8fe

David Sterba authored May 15, 2019

btrfs_check_shared looks up parents of a given extent and uses ulists
for that. These are allocated and freed repeatedly. Preallocation in the
caller will avoid the overhead and also allow us to use the GFP_KERNEL
as it is happens before the extent locks are taken.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

5911c8fe

btrfs: detect fast implementation of crc32c on all architectures · 9b4e675a

David Sterba authored May 16, 2019

Currently, there's only check for fast crc32c implementation on X86,
based on the CPU flags. This is used to decide if checksumming should be
offloaded to worker threads or can be calculated by the caller.

As there are more architectures that implement a faster version of
crc32c (ARM, SPARC, s390, MIPS, PowerPC), also there are specialized hw
cards.

The detection is based on driver name, all generic C implementations
contain 'generic', while the specialized versions do not. Alternatively
the priority could be used, but this is not currently provided by the
crypto API.

The flag is set per-filesystem at mount time and used for the offloading
decisions.
Signed-off-by: David Sterba <dsterba@suse.com>

9b4e675a

btrfs: extent-tree: Refactor add_pinned_bytes() to add|sub_pinned_bytes() · 78192442

Qu Wenruo authored May 15, 2019

Instead of using @sign to determine whether we're adding or subtracting.
Even it only has 3 callers, it's still (and in fact already caused
problem in the past) confusing to use.

Refactor add_pinned_bytes() to add_pinned_bytes() and sub_pinned_bytes()
to explicitly show what we're doing.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

78192442

30 Jun, 2019 3 commits

Linux 5.2-rc7 · 6fbc7275
Linus Torvalds authored Jun 30, 2019

6fbc7275

Merge tag 'powerpc-5.2-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 39132f74

Linus Torvalds authored Jun 30, 2019

Pull powerpc fix from Michael Ellerman:
 "One fix for a regression in my commit adding KUAP (Kernel User Access
  Prevention) on Radix, which incorrectly touched the AMR in the early
  machine check handler.

  Thanks to Nicholas Piggin"

* tag 'powerpc-5.2-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s/exception: Fix machine check early corrupting AMR

39132f74

Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7c15f41e

Linus Torvalds authored Jun 30, 2019

Pull SMP fixes from Thomas Gleixner:
 "Two small changes for the cpu hotplug code:

   - Prevent out of bounds access which actually might crash the machine
     caused by a missing bounds check in the fail injection code

   - Warn about unsupported migitation mode command line arguments to
     make people aware that they typoed the paramater. Not necessarily a
     fix but quite some people tripped over that"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Fix out-of-bounds read when setting fail state
  cpu/speculation: Warn on unsupported mitigations= parameter

7c15f41e

29 Jun, 2019 4 commits

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 72825454

Linus Torvalds authored Jun 29, 2019

Pull x86 fixes from Ingo Molnar:
 "Misc fixes all over the place:

   - might_sleep() atomicity fix in the microcode loader

   - resctrl boundary condition fix

   - APIC arithmethics bug fix for frequencies >= 4.2 GHz

   - three 5-level paging crash fixes

   - two speculation fixes

   - a perf/stacktrace fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/unwind/orc: Fall back to using frame pointers for generated code
  perf/x86: Always store regs->ip in perf_callchain_kernel()
  x86/speculation: Allow guests to use SSBD even if host does not
  x86/mm: Handle physical-virtual alignment mismatch in phys_p4d_init()
  x86/boot/64: Add missing fixup_pointer() for next_early_pgt access
  x86/boot/64: Fix crash if kernel image crosses page table boundary
  x86/apic: Fix integer overflow on 10 bit left shift of cpu_khz
  x86/resctrl: Prevent possible overrun during bitmap operations
  x86/microcode: Fix the microcode load on CPU hotplug for real

72825454

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 57103eb7

Linus Torvalds authored Jun 29, 2019

Pull perf fixes from Ingo Molnar:
 "Various fixes, most of them related to bugs perf fuzzing found in the
  x86 code"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/regs: Use PERF_REG_EXTENDED_MASK
  perf/x86: Remove pmu->pebs_no_xmm_regs
  perf/x86: Clean up PEBS_XMM_REGS
  perf/x86/regs: Check reserved bits
  perf/x86: Disable extended registers for non-supported PMUs
  perf/ioctl: Add check for the sample_period value
  perf/core: Fix perf_sample_regs_user() mm check

57103eb7

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · eed7d30e

Linus Torvalds authored Jun 29, 2019

Pull irq fixes from Ingo Molnar:
 "Diverse irqchip driver fixes"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Fix command queue pointer comparison bug
  irqchip/mips-gic: Use the correct local interrupt map registers
  irqchip/ti-sci-inta: Fix kernel crash if irq_create_fwspec_mapping fail
  irqchip/irq-csky-mpintc: Support auto irq deliver to all cpus

eed7d30e

Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a7211bc9

Linus Torvalds authored Jun 29, 2019

Pull EFI fixes from Ingo Molnar:
 "Four fixes:
   - fix a kexec crash on arm64
   - fix a reboot crash on some Android platforms
   - future-proof the code for upcoming ACPI 6.2 changes
   - fix a build warning on x86"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efibc: Replace variable set function in notifier call
  x86/efi: fix a -Wtype-limits compilation warning
  efi/bgrt: Drop BGRT status field reserved bits check
  efi/memreserve: deal with memreserve entries in unmapped memory

a7211bc9