Commits · e626f37e657adbab2a7abe51480925891662a5f3 · Kirill Smelkov / linux

17 May, 2022 1 commit

nvme: split the enum used for various register constants · e626f37e

Christoph Hellwig authored May 16, 2022

Instead of having one big enum add one for each register or field.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

e626f37e
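A rough illustration of the split described in the nvme enum commit above: constants are grouped into one enum per register or field instead of a single large enum. All `demo_` names are hypothetical stand-ins, not the kernel's definitions; the offsets and bit positions follow the NVMe register layout (CC at 0x14, CSTS at 0x1c).

```c
#include <stdint.h>

/* Register offsets get their own enum (hypothetical names). */
enum demo_reg_offset {
	DEMO_REG_CAP  = 0x00,	/* controller capabilities */
	DEMO_REG_CC   = 0x14,	/* controller configuration */
	DEMO_REG_CSTS = 0x1c,	/* controller status */
};

/* Bits of the controller configuration register. */
enum demo_cc_bits {
	DEMO_CC_ENABLE = 1u << 0,
};

/* Bits of the controller status register. */
enum demo_csts_bits {
	DEMO_CSTS_RDY = 1u << 0,
	DEMO_CSTS_CFS = 1u << 1,
};

/* Returns non-zero when the controller is enabled and reports ready. */
static inline int demo_ctrl_ready(uint32_t cc, uint32_t csts)
{
	return (cc & DEMO_CC_ENABLE) && (csts & DEMO_CSTS_RDY);
}
```

Keeping each register's bits in a separate enum makes it harder to test a CC bit against a CSTS value by accident, since the enum name states which register a constant belongs to.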

16 May, 2022 8 commits

nvme-fabrics: add a request timeout helper · 93ba75c9

Chaitanya Kulkarni authored Mar 30, 2022

The RDMA and TCP transports both complete a timed-out request in the
same manner, so the code is duplicated. Add and use the helper
nvmf_complete_timed_out_request() to remove the duplicate code.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>

93ba75c9

nvme-pci: harden drive presence detect in nvme_dev_disable() · b98235d3

Stefan Roese authored May 06, 2022

On our ZynqMP system we observe that an NVMe drive that resets itself
while doing a firmware update causes a kernel crash like this:

[ 67.720772] pcieport 0000:02:02.0: pciehp: Slot(2): Link Down
[ 67.720783] pcieport 0000:02:02.0: pciehp: Slot(2): Card not present
[ 67.720795] nvme 0000:04:00.0: PME# disabled
[ 67.720849] Internal error: synchronous external abort: 96000010 [#1] PREEMPT SMP
[ 67.720853] nwl-pcie fd0e0000.pcie: Slave error

Analysis: When nvme_dev_disable() is called because of this PCIe hotplug
event, pci_is_enabled() is still true. Accessing the NVMe drive, which
is currently unavailable because it is rebooting, causes this
"synchronous external abort" on this ARM64 platform.

This patch adds the pci_device_is_present() check as well, which returns
false in this "Card not present" hot-plug case. With this change, the
NVMe driver does not try to access the NVMe registers any more and the
FW update finishes without any problems.
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

b98235d3
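The presence check described in the nvme-pci hotplug commit above can be modelled in user space roughly as follows. The struct and function names are hypothetical; the two flags stand in for what pci_is_enabled() and pci_device_is_present() would report, and the sketch is not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the PCI device state the driver consults. */
struct demo_pci_dev {
	bool enabled;	/* what pci_is_enabled() would report */
	bool present;	/* what pci_device_is_present() would report */
};

/* Registers are only safe to touch when the device is enabled AND present. */
static inline bool demo_can_access_regs(const struct demo_pci_dev *pdev)
{
	assert(pdev != NULL);
	return pdev->enabled && pdev->present;
}
```

In the "Card not present" case the device is still enabled but no longer present, so the combined check returns false and register access is skipped.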

nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tags · da427611

Smith, Kyle Miller (Nimble Kernel) authored Apr 22, 2022

In nvme_alloc_admin_tags(), admin_q can be set to an error pointer
(typically -ENOMEM) if the blk_mq_init_queue() call fails to set up the
queue, which is checked immediately after the call. However, when the
error is returned up the stack to nvme_reset_work(), it takes us to
nvme_remove_dead_ctrl()
  nvme_dev_disable()
   nvme_suspend_queue(&dev->queues[0]).

Here, we only check that admin_q is non-NULL, rather than checking that
it is neither NULL nor an error pointer, and begin quiescing a queue
that never existed, leading to a bad / NULL pointer dereference.
Signed-off-by: Kyle Smith <kyles@hpe.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

da427611
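The difference between a NULL check and an "error or NULL" check, as discussed in the nvme_alloc_admin_tags commit above, can be shown with a user-space imitation of the kernel's error-pointer convention. The `demo_` helpers are hypothetical re-creations, not the kernel macros, and the sketch guards at the point of use, which is not necessarily how the patch itself resolves the problem.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (imitates ERR_PTR()). */
static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True when the pointer lies in the top DEMO_MAX_ERRNO addresses. */
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static inline int demo_is_err_or_null(const void *p)
{
	return !p || demo_is_err(p);
}

struct demo_queue {
	int quiesced;
};

/* Returns 1 if the queue was quiesced, 0 if there was no valid queue. */
static inline int demo_suspend_queue(struct demo_queue *admin_q)
{
	if (demo_is_err_or_null(admin_q))
		return 0;	/* a plain NULL check would let an error pointer through */
	admin_q->quiesced = 1;
	return 1;
}
```

An error pointer such as the encoded -ENOMEM is non-NULL, so a bare `if (admin_q)` test would go on to dereference it.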

nvme: mark internal passthru request RQF_QUIET · 128126a7

Chaitanya Kulkarni authored Apr 19, 2022

Most of the internal passthru commands use the __nvme_submit_sync_cmd()
interface. There are a few places where we open-code the request submission:

1. nvme_keep_alive_work(struct work_struct *work)
2. nvme_timeout(struct request *req, bool reserved)
3. nvme_delete_queue(struct nvme_queue *nvmeq, u8 opcode)

Mark the internal passthru requests quiet so that we can skip the verbose
error message from nvme_log_error() in the nvme_end_req() completion path.
This is consistent with what we have in __nvme_submit_sync_cmd().
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Alan Adamson <alan.adamson@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

128126a7

nvme: remove unneeded include from constants file · da3340e7

Max Gurtovoy authored Apr 28, 2022

No usage of blkdev.h elements.
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

da3340e7

nvme: add missing status values to verbose logging · ca2d8992

Max Gurtovoy authored Apr 28, 2022

Log a few more path related status codes.
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>

ca2d8992

nvme: set dma alignment to dword · 52fde2c0

Keith Busch authored May 04, 2022

The nvme specification only requires qword alignment for segment
descriptors, and the driver already guarantees that. The spec has always
allowed user data to be dword aligned, which is what the queue's
attribute is for, so relax the alignment requirement to that value.

While we could allow byte alignment for some controllers when using
SGLs, we still need to support PRP, and that only allows dword.

Fixes: 3b2a1ebc ("nvme: set dma alignment to qword")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>

52fde2c0
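The dword versus qword requirement from the dma alignment commit above comes down to an address mask check. The sketch below uses hypothetical `demo_` names and assumes the usual sizes of 4 bytes for a dword and 8 bytes for a qword.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DWORD_MASK 0x3u	/* 4-byte alignment */
#define DEMO_QWORD_MASK 0x7u	/* 8-byte alignment */

/* Returns non-zero when addr satisfies the alignment described by mask. */
static inline int demo_is_aligned(uintptr_t addr, unsigned int mask)
{
	/* An alignment mask must be of the form 2^n - 1. */
	assert((mask & (mask + 1)) == 0);
	return (addr & mask) == 0;
}
```

A buffer at an address such as 0x1004 passes the dword check but fails the qword check, which is the kind of user buffer the relaxed limit admits.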

nvme: fix interpretation of DMRSL · 1a86924e

Tom Yan authored Apr 29, 2022

DMRSL is in units of logical blocks, while max_discard_sectors is
in units of Linux sectors.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

1a86924e
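The unit mismatch described in the DMRSL commit above can be illustrated with a small conversion helper. This is a sketch with hypothetical names; it assumes 512-byte Linux sectors and a logical block size given as a power-of-two shift.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SECTOR_SHIFT 9	/* Linux sectors are 512 bytes */

/*
 * Convert a count of logical blocks into a count of 512-byte sectors.
 * lba_shift is log2 of the logical block size (e.g. 12 for 4096 bytes).
 */
static inline uint64_t demo_lba_to_sectors(uint64_t lba, unsigned int lba_shift)
{
	assert(lba_shift >= DEMO_SECTOR_SHIFT);
	return lba << (lba_shift - DEMO_SECTOR_SHIFT);
}
```

With 4096-byte logical blocks, a limit of 1024 blocks corresponds to 8192 sectors, so using the raw value as a sector count would understate the limit by a factor of eight.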

10 May, 2022 4 commits

loop: remove most of the top-of-file boilerplate comment from the UAPI header · c23d47ab

Christoph Hellwig authored Apr 19, 2022

Just leave the SPDX marker and the copyright notice and remove the
irrelevant rest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220419063303.583106-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c23d47ab

loop: remove most of the top-of-file boilerplate comment · eb04bb15

Christoph Hellwig authored Apr 19, 2022

Remove the irrelevant changelogs and todo notes and just leave the SPDX
marker and the copyright notice.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220419063303.583106-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

eb04bb15

loop: add a SPDX header · f21e6e18

Christoph Hellwig authored Apr 19, 2022

The copyright statement says:

"Redistribution of this file is permitted under the GNU General Public
 License." and was added by Ted in 1993, at which point GPLv2 only
 was the default Linux license.

Replace it with the usual GPLv2 only SPDX header.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220419063303.583106-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f21e6e18

loop: remove loop.h · 754d9679

Christoph Hellwig authored Apr 19, 2022

Merge loop.h into loop.c as all the content is only used there.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220419063303.583106-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

754d9679

04 May, 2022 5 commits

block: null_blk: Improve device creation with configfs · 49c3b926

Damien Le Moal authored Apr 20, 2022

Currently, the directory name used to create a nullb device through
configfs is not used as the device name, potentially causing headaches for
users if devices are already created through the modprobe operation
with the nr_devices module parameter not set to 0. E.g. a user can do
"mkdir /sys/kernel/config/nullb/nullb0" to create a nullb device even
though /dev/nullb0 was already created by modprobe. In this case, the
configfs nullb device will be named nullb1, causing confusion for the
user.

Simplify this by using the configfs directory name as the nullb device
name, always, unless another nullb device is already using the same
name. E.g. if modprobe created nullb0, then:

$ mkdir /sys/kernel/config/nullb/nullb0
mkdir: cannot create directory '/sys/kernel/config/nullb/nullb0': File
exists

will be reported to the user.

To implement this, the function null_find_dev_by_name() is added to
check for the existence of a nullb device with the name used for a new
configfs device directory. nullb_group_make_item() uses this new
function to check if the directory name can be used as the disk name.
Finally, null_add_dev() is modified to use the device config item name
as the disk name for a new nullb device created using configfs.
The naming of devices created through modprobe remains unchanged.

Of note is that it is possible for a user to create through configfs a
nullb device with the same name as an existing device. E.g.

$ mkdir /sys/kernel/config/nullb/null

will successfully create the nullb device named "null" but this block
device will however not appear under /dev/ since /dev/null already
exists.
Suggested-by: Joseph Bacik <josef@toxicpanda.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220420005718.3780004-5-damien.lemoal@opensource.wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

49c3b926
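The name-collision check described in the null_blk configfs commit above can be sketched in user space as a lookup over the existing devices. The types and functions below are hypothetical simplifications (an array instead of the driver's device list), not the driver's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal view of a nullb device: only its disk name. */
struct demo_nullb {
	const char *disk_name;
};

/* Return the device that already uses this name, or NULL if none does. */
static inline const struct demo_nullb *
demo_find_dev_by_name(const struct demo_nullb *devs, size_t n, const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(devs[i].disk_name, name) == 0)
			return &devs[i];
	return NULL;
}

/* Refuse a new configfs directory whose name is already taken. */
static inline int demo_make_item(const struct demo_nullb *devs, size_t n,
				 const char *name)
{
	return demo_find_dev_by_name(devs, n, name) ? -EEXIST : 0;
}
```

Returning -EEXIST is what would surface to the user as the "File exists" error from mkdir in the example above.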

block: null_blk: Cleanup messages · db060f54

Damien Le Moal authored Apr 20, 2022

Use the pr_fmt() macro to prefix all null_blk pr_xxx() messages with
"null_blk:" to clarify which module is printing the messages. Also add
a pr_info() message in null_add_dev() to print the name of a newly
created disk.
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220420005718.3780004-4-damien.lemoal@opensource.wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

db060f54
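The prefixing technique from the null_blk message cleanup commit above relies on string-literal concatenation inside a format macro. The user-space sketch below imitates that with hypothetical `demo_` names and snprintf(); the message text is illustrative only.

```c
#include <stddef.h>
#include <stdio.h>

/* Every format string passed through this macro gains the module prefix. */
#define demo_pr_fmt(fmt) "null_blk: " fmt

/* Format a "disk created" message into buf; returns snprintf()'s result. */
static inline int demo_log_disk_created(char *buf, size_t size,
					const char *disk_name)
{
	return snprintf(buf, size, demo_pr_fmt("disk %s created\n"), disk_name);
}
```

Because the prefix is attached at compile time, individual call sites do not need to repeat the module name in their format strings.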

block: null_blk: Cleanup device creation and deletion · b3a0a73e

Damien Le Moal authored Apr 20, 2022

Introduce the null_create_dev() and null_destroy_dev() helper functions
to respectively create nullb devices on modprobe and destroy them on
rmmod. The null_destroy_dev() helper avoids duplicated code in the
null_init() and null_exit() functions for deleting devices.
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220420005718.3780004-3-damien.lemoal@opensource.wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b3a0a73e

block: null_blk: Fix code style issues · 525323d2

Damien Le Moal authored Apr 20, 2022

Fix message grammar and code style issues (brackets and indentation) in
null_init().
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220420005718.3780004-2-damien.lemoal@opensource.wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

525323d2

xen-blkback: use bdev_discard_alignment · 0000f2f7

Christoph Hellwig authored Apr 18, 2022

Use bdev_discard_alignment to calculate the correct discard alignment
offset even for partitions instead of just looking at the queue limit.

Also switch to use bdev_discard_granularity to get rid of the last direct
queue reference in xen_blkbk_discard.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-12-hch@lst.de
[axboe: fold in 'q' removal as it's now unused]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0000f2f7
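Why a partition needs its own alignment value, as the bdev_discard_alignment commits describe, can be seen from a little modular arithmetic. The sketch below is a simplified user-space model working in bytes, with hypothetical names; it is loosely patterned on this kind of calculation and is not the block layer's code.

```c
#include <stdint.h>

/*
 * Offset (in bytes) from the start of a partition to the first discard
 * granularity boundary.
 *
 * queue_alignment: offset into the whole device where granularity starts
 * granularity:     discard granularity in bytes (0 means "none")
 * part_start:      byte offset of the partition on the device (0 for the
 *                  whole device)
 */
static inline uint64_t demo_discard_alignment(uint64_t queue_alignment,
					      uint64_t granularity,
					      uint64_t part_start)
{
	uint64_t offset;

	if (!granularity)
		return 0;

	/* Where the partition start falls within a granularity-sized chunk. */
	offset = part_start % granularity;
	return (granularity + queue_alignment % granularity - offset) % granularity;
}
```

For a device with 1 MiB granularity and a queue-level alignment of 0, a partition starting 512 KiB into the device gets an alignment of 512 KiB, which a lookup of the queue limit alone would miss.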

03 May, 2022 10 commits

rnbd-srv: use bdev_discard_alignment · 18292faa

Christoph Hellwig authored Apr 18, 2022

Use bdev_discard_alignment to calculate the correct discard alignment
offset even for partitions instead of just looking at the queue limit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

18292faa

nvme: remove a spurious clear of discard_alignment · 4e7f0ece

Christoph Hellwig authored Apr 18, 2022

The nvme driver never sets a discard_alignment, so it also doesn't need
to clear it to zero.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4e7f0ece

loop: remove a spurious clear of discard_alignment · 4418bfd8

Christoph Hellwig authored Apr 18, 2022

The loop driver never sets a discard_alignment, so it also doesn't need
to clear it to zero.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4418bfd8

dasd: don't set the discard_alignment queue limit · c3f76529

Christoph Hellwig authored Apr 18, 2022

The name of the discard_alignment queue limit is a bit misleading: it
means the offset into the block device at which the discard granularity
starts. Setting it to PAGE_SIZE while the discard granularity is the
block size, which is smaller than or equal to PAGE_SIZE, as done by dasd
is mostly harmless but also useless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c3f76529

raid5: don't set the discard_alignment queue limit · 3d50d368

Christoph Hellwig authored Apr 18, 2022

The name of the discard_alignment queue limit is a bit misleading: it
means the offset into the block device at which the discard granularity
starts.
Setting it to the discard granularity as done by raid5 is mostly
harmless but also useless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3d50d368

dm-zoned: don't set the discard_alignment queue limit · 44d58370

Christoph Hellwig authored Apr 18, 2022

The name of the discard_alignment queue limit is a bit misleading: it
means the offset into the block device at which the discard granularity
starts.
Setting it to the discard granularity as done by dm-zoned is mostly
harmless but also useless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

44d58370

virtio_blk: fix the discard_granularity and discard_alignment queue limits · 62952cc5

Christoph Hellwig authored Apr 18, 2022

The name of the discard_alignment queue limit is a bit misleading: it
means the offset into the block device at which the discard granularity
starts.

On the other hand, the discard_sector_alignment field from the virtio 1.1
specification looks similar to what Linux uses as the discard granularity
(even if it is not very well described):

  "discard_sector_alignment can be used by OS when splitting a request
   based on alignment. "

And at least qemu does set it to the discard granularity.

So stop setting the discard_alignment and use the virtio
discard_sector_alignment to set the discard granularity.

Fixes: 1f23816b ("virtio_blk: add discard and write zeroes support")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

62952cc5

null_blk: don't set the discard_alignment queue limit · fb749a87

Christoph Hellwig authored Apr 18, 2022

The name of the discard_alignment queue limit is a bit misleading: it
means the offset into the block device at which the discard granularity
starts.
Setting it to the discard granularity as done by null_blk is mostly
harmless but also useless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fb749a87

nbd: don't set the discard_alignment queue limit · 4a04d517

Christoph Hellwig authored Apr 18, 2022

The name of the discard_alignment queue limit is a bit misleading: it
means the offset into the block device at which the discard granularity
starts.
Setting it to the discard granularity as done by nbd is mostly harmless
but also useless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4a04d517

ubd: don't set the discard_alignment queue limit · 07c6e92a

Christoph Hellwig authored Apr 18, 2022

The name of the discard_alignment queue limit is a bit misleading: it
means the offset into the block device at which the discard granularity
starts.
Setting it to the discard granularity as done by ubd is mostly harmless
but also useless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

07c6e92a

01 May, 2022 1 commit

aoe: Avoid flush_scheduled_work() usage · 0b8d7622

Tetsuo Handa authored Apr 19, 2022

Flushing system-wide workqueues is dangerous and will be forbidden.
Replace system_wq with local aoe_wq.

Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: https://lore.kernel.org/r/abb37616-eec9-2794-e21e-7c623085d987@I-love.SAKURA.ne.jp
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0b8d7622

28 Apr, 2022 1 commit

Merge branch 'md-next' of... · f01e49fb

Jens Axboe authored Apr 27, 2022

Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.19/drivers

Pull MD updates from Song:

"1. Improve annotation in raid5 code, by Logan Gunthorpe.
 2. Support MD_BROKEN flag in raid-1/5/10, by Mariusz Tkaczyk.
 3. Other small fixes/cleanups."

* 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  md: Replace role magic numbers with defined constants
  md/raid0: Ignore RAID0 layout if the second zone has only one device
  md/raid5: Annotate functions that hold device_lock with __must_hold
  md/raid5-ppl: Annotate with rcu_dereference_protected()
  md/raid5: Annotate rdev/replacement access when mddev_lock is held
  md/raid5: Annotate rdev/replacement accesses when nr_pending is elevated
  md/raid5: Add __rcu annotation to struct disk_info
  md/raid5: Un-nest struct raid5_percpu definition
  md/raid5: Cleanup setup_conf() error returns
  md: replace deprecated strlcpy & remove duplicated line
  md/bitmap: don't set sb values if can't pass sanity check
  md: fix an incorrect NULL check in md_reload_sb
  md: fix an incorrect NULL check in does_sb_need_changing
  raid5: introduce MD_BROKEN
  md: Set MD_BROKEN for RAID1 and RAID10

f01e49fb

26 Apr, 2022 1 commit

null-blk: save memory footprint for struct nullb_cmd · 8ba816b2

Yu Kuai authored Apr 26, 2022

A total of 16 bytes can be saved in two ways:

1) The field 'bio' is only used in bio-based mode, and the field
   'rq' is only used in mq mode. Since they are never used at the
   same time, declare a union for them.
2) The field 'bool fake_timeout' can be placed in the hole after the
   field 'error'.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20220426022133.3999006-1-yukuai3@huawei.com
Signed-off-by: Yu Kuai's patch applied by Jens Axboe <axboe@kernel.dk>

8ba816b2
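The two savings listed in the nullb_cmd commit above can be reproduced with a pair of mock structures. Everything here is a hypothetical stand-in: the pointers are plain `void *`, `demo_timer` replaces the real timer type, and the 16-byte figure assumes a typical 64-bit ABI with 8-byte pointers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an embedded timer; only its 8-byte alignment matters here. */
struct demo_timer {
	uint64_t expires;
	void (*fn)(void);
};

/* Layout before: two separate pointers, bool at the end. */
struct demo_cmd_before {
	void *rq;
	void *bio;
	unsigned int tag;
	uint8_t error;		/* followed by a padding hole */
	struct demo_timer timer;
	void *nq;
	bool fake_timeout;	/* followed by tail padding */
};

/* Layout after: rq/bio share storage, bool fills the hole after error. */
struct demo_cmd_after {
	union {
		void *rq;
		void *bio;
	};
	unsigned int tag;
	uint8_t error;
	bool fake_timeout;
	struct demo_timer timer;
	void *nq;
};

/* Number of bytes saved by the reordering on the current platform. */
static inline size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_cmd_before) - sizeof(struct demo_cmd_after);
}
```

On such a 64-bit ABI the union removes one 8-byte pointer and moving the bool removes 8 bytes of tail padding, which accounts for the 16 bytes.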

25 Apr, 2022 9 commits

md: Replace role magic numbers with defined constants · 9151ad5d

David Sloan authored Apr 21, 2022

There are several instances where magic numbers are used in md.c instead
of the defined constants in md_p.h. This patch set improves code
readability by replacing all occurrences of 0xffff, 0xfffe, and 0xfffd when
relating to md roles with their equivalent defined constant.
Signed-off-by: David Sloan <david.sloan@eideticom.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>

9151ad5d
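The readability gain from the md role constants commit above can be illustrated with a small classifier. The `DEMO_` names are hypothetical; the meaning attached to each of the three values (spare, faulty, journal) should be checked against md_p.h, and treating every other value as an active slot is a simplification.

```c
#include <stdint.h>

/* Named constants instead of bare 0xffff / 0xfffe / 0xfffd literals. */
#define DEMO_ROLE_SPARE   0xffffu
#define DEMO_ROLE_FAULTY  0xfffeu
#define DEMO_ROLE_JOURNAL 0xfffdu

enum demo_role_kind {
	DEMO_KIND_ACTIVE,
	DEMO_KIND_SPARE,
	DEMO_KIND_FAULTY,
	DEMO_KIND_JOURNAL,
};

/* Map a 16-bit role value to what it stands for. */
static inline enum demo_role_kind demo_classify_role(uint16_t role)
{
	switch (role) {
	case DEMO_ROLE_SPARE:
		return DEMO_KIND_SPARE;
	case DEMO_ROLE_FAULTY:
		return DEMO_KIND_FAULTY;
	case DEMO_ROLE_JOURNAL:
		return DEMO_KIND_JOURNAL;
	default:
		return DEMO_KIND_ACTIVE;	/* simplification: a slot index */
	}
}
```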

md/raid0: Ignore RAID0 layout if the second zone has only one device · ea23994e

Pascal Hambourg authored Apr 13, 2022

The RAID0 layout is irrelevant if all members have the same size so the
array has only one zone. It is *also* irrelevant if the array has two
zones and the second zone has only one device, for example if the array
has two members of different sizes.

So in that case it makes sense to allow assembly even when the layout is
undefined, like what is done when the array has only one zone.
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org>
Signed-off-by: Song Liu <song@kernel.org>

ea23994e

md/raid5: Annotate functions that hold device_lock with __must_hold · 4631f39f

Logan Gunthorpe authored Apr 07, 2022

A handful of functions note with a comment that the device_lock must be
held, but this is not comprehensive. Many other functions are also called
with the lock held, so add a __must_hold() annotation to each of them to
document that the lock is held.

This makes it a bit easier to analyse device_lock.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>

4631f39f

md/raid5-ppl: Annotate with rcu_dereference_protected() · 4f4ee2bf

Logan Gunthorpe authored Apr 07, 2022

To suppress the last remaining sparse warnings about accessing
rdev, add rcu_dereference_protected() calls in a couple of places
in raid5-ppl. All of these places are called under raid5_run() and
therefore run before the array has started, so they are safe.

There's no sensible check to do for the second argument of
rcu_dereference_protected() so a comment is added instead.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>

4f4ee2bf

md/raid5: Annotate rdev/replacement access when mddev_lock is held · 9aeb7f99

Logan Gunthorpe authored Apr 07, 2022

The mddev_lock should be held during raid5_remove_disk() which is when
the rdev/replacement pointers are modified. So any access to these
pointers marked __rcu should be safe whenever the mddev_lock is held.

There are numerous such accesses that currently produce sparse warnings.
Add a helper function, rdev_mdlock_deref(), that wraps
rcu_dereference_protected() in all these instances.

This annotation fixes a number of sparse warnings.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>

9aeb7f99

md/raid5: Annotate rdev/replacement accesses when nr_pending is elevated · e38b0432

Logan Gunthorpe authored Apr 07, 2022

There are a number of accesses to __rcu variables that should be safe
because nr_pending in the disk is known to be elevated.

Create a wrapper around rcu_dereference_protected() to annotate these
accesses and verify that nr_pending is non-zero.

This fixes a number of sparse warnings.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>

e38b0432

md/raid5: Add __rcu annotation to struct disk_info · b0920ede

Logan Gunthorpe authored Apr 07, 2022

rdev and replacement are protected in some circumstances with
rcu_dereference and synchronize_rcu (in raid5_remove_disk()). However,
they were not annotated with __rcu so a sparse warning is emitted for
every rcu_dereference() call.

Add the __rcu annotation and fix up the initialization with
RCU_INIT_POINTER(), all pointer modifications with rcu_assign_pointer(),
a few cases where the pointer value is tested with rcu_access_pointer(),
one case where READ_ONCE() is used instead of rcu_dereference(), and
a case in print_raid5_conf() that should have rcu_dereference() and
rcu_read_[un]lock() calls.

Additional sparse issues will be fixed up in further commits.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>

b0920ede

md/raid5: Un-nest struct raid5_percpu definition · 3d9a644c

Logan Gunthorpe authored Apr 07, 2022

Sparse reports many warnings of the form:
  drivers/md/raid5.c:1476:16: warning: dereference of noderef expression

This is because all struct raid5_percpu definitions get marked as
__percpu when really only the pointer in r5conf should have that
annotation.

Fix this by moving the definition of raid5_percpu out of the definition
of struct r5conf.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>

3d9a644c

md/raid5: Cleanup setup_conf() error returns · 8fbcba6b

Logan Gunthorpe authored Apr 07, 2022

Be more careful about the error returns. Most errors in this function
are actually ENOMEM, but it forcibly returns EIO if conf has been
allocated.

Instead return ret and ensure it is set appropriately before each goto
abort.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Song Liu <song@kernel.org>

8fbcba6b
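The "set ret before each goto abort" pattern from the setup_conf() commit above looks roughly like this in a self-contained form. The function, its parameters, and the failure conditions are hypothetical; only the error-propagation shape is taken from the commit message.

```c
#include <errno.h>
#include <stdlib.h>

struct demo_conf {
	void *stripes;
};

/*
 * Allocate and initialise a configuration object.
 * fail_stripes and fail_io are test knobs that force a failure path.
 * Returns 0 on success or a negative errno describing what went wrong.
 */
static inline int demo_setup_conf(struct demo_conf **out, int fail_stripes,
				  int fail_io)
{
	struct demo_conf *conf;
	int ret = -ENOMEM;	/* most failures below are allocation failures */

	conf = calloc(1, sizeof(*conf));
	if (!conf)
		goto abort;

	conf->stripes = fail_stripes ? NULL : malloc(64);
	if (!conf->stripes)
		goto abort;	/* ret is still -ENOMEM */

	if (fail_io) {
		ret = -EIO;	/* a non-allocation failure sets its own code */
		goto abort;
	}

	*out = conf;
	return 0;

abort:
	if (conf) {
		free(conf->stripes);
		free(conf);
	}
	return ret;
}
```

The single abort label no longer decides the error code; each failure site does, so an allocation failure is reported as -ENOMEM even after conf itself has been allocated.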