Commits · 26db5ee158510108c819aa7be6eb8c75accf85d7 · Kirill Smelkov / linux

03 Feb, 2023 21 commits

block: add a bvec_set_folio helper · 26db5ee1

Christoph Hellwig authored Feb 03, 2023

A small wrapper around bvec_set_page that takes a folio instead.
There are only two potential users for this in the tree, but the number
will grow in the future.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20230203150634.3199647-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: factor out a bvec_set_page helper · d58cdfae

Christoph Hellwig authored Feb 03, 2023

Add a helper to initialize a bvec based on a page pointer. This will help
remove various open-coded bvec initializations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20230203150634.3199647-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
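
The two helpers described above can be sketched in user space roughly as follows. This is a minimal illustration, not the kernel header: struct page, struct folio and struct bio_vec are simplified stand-ins, and the assumption that a folio can be reduced to its first page is made only for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (hypothetical layout). */
struct page { int id; };
struct folio { struct page page; };	/* assumed: a folio begins with its first page */

struct bio_vec {
	struct page *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

/* Initialize a bvec from a page pointer, a length and an offset. */
static inline void bvec_set_page(struct bio_vec *bv, struct page *page,
				 unsigned int len, unsigned int offset)
{
	assert(bv != NULL);
	bv->bv_page = page;
	bv->bv_len = len;
	bv->bv_offset = offset;
}

/* Folio variant: forwards to bvec_set_page() with the folio's first page. */
static inline void bvec_set_folio(struct bio_vec *bv, struct folio *folio,
				  unsigned int len, unsigned int offset)
{
	bvec_set_page(bv, &folio->page, len, offset);
}
```

Callers that used to assign the three fields by hand would instead make a single call such as bvec_set_page(&bv, page, len, offset).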

blk-cgroup: move the cgroup information to struct gendisk · 3f13ab7c

Christoph Hellwig authored Feb 03, 2023

cgroup information only makes sense on a live gendisk that allows
file system I/O (which includes the raw block device).  So move over
the cgroup related members.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-20-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-cgroup: pass a gendisk to blkg_lookup · 479664ce

Christoph Hellwig authored Feb 03, 2023

Pass a gendisk to blkg_lookup and use that to find the match as part
of phasing out usage of the request_queue in the blk-cgroup code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-19-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-cgroup: pass a gendisk to pd_alloc_fn · 0a0b4f79

Christoph Hellwig authored Feb 03, 2023

No need for the request_queue here, pass a gendisk and extract the
node ids from that.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-18-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-cgroup: pass a gendisk to blkcg_{de,}activate_policy · 40e4996e

Christoph Hellwig authored Feb 03, 2023

Prepare for storing the blkcg information in the gendisk instead of
the request_queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-17-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-rq-qos: store a gendisk instead of request_queue in struct rq_qos · ba91c849

Christoph Hellwig authored Feb 03, 2023

This is what about half of the users already want, and it's only going to
grow more.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-rq-qos: constify rq_qos_ops · 3963d84d

Christoph Hellwig authored Feb 03, 2023

These op vectors are constant, so mark them const.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-rq-qos: make rq_qos_add and rq_qos_del more useful · ce57b558

Christoph Hellwig authored Feb 03, 2023

Switch to passing a gendisk, make rq_qos_add initialize all required
fields, and drop the unneeded q argument from rq_qos_del.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-14-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-rq-qos: move rq_qos_add and rq_qos_del out of line · b494f9c5

Christoph Hellwig authored Feb 03, 2023

These two functions are rather large and not in a fast path, so move
them out of line.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-13-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-wbt: open code wbt_queue_depth_changed in wbt_init · 4e1d91ae

Christoph Hellwig authored Feb 03, 2023

wbt_queue_depth_changed just updates a field and calls another function.
Open code it in wbt_init, so that the local queue variable can be used
instead of the one stored in the rq_qos. This will allow delaying that
rq_qos->queue assignment in a subsequent patch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-wbt: move private information from blk-wbt.h to blk-wbt.c · 0bc65bd4

Christoph Hellwig authored Feb 03, 2023

A large part of blk-wbt.h is only used in blk-wbt.c, so move it there.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-wbt: pass a gendisk to wbt_init · 958f2965

Christoph Hellwig authored Feb 03, 2023

Pass a gendisk to wbt_init to prepare for phasing out usage of the
request_queue in the blk-cgroup code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-wbt: pass a gendisk to wbt_{enable,disable}_default · 04aad37b

Christoph Hellwig authored Feb 03, 2023

Pass a gendisk to wbt_enable_default and wbt_disable_default to
prepare for phasing out usage of the request_queue in the blk-cgroup
code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-cgroup: store a gendisk to throttle in struct task_struct · f05837ed

Christoph Hellwig authored Feb 03, 2023

Switch from a request_queue pointer and reference to a gendisk one
for the throttle information in struct task_struct.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Link: https://lore.kernel.org/r/20230203150400.3199230-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-cgroup: pin the gendisk in struct blkcg_gq · 84d7d462

Christoph Hellwig authored Feb 03, 2023

Currently each blkcg_gq holds a request_queue reference, which is what
is used in the policies.  But a lot of these interfaces will move over to
use a gendisk, so store a disk in struct blkcg_gq and hold a reference to
it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-cgroup: remove the !bdi->dev check in blkg_dev_name · 180b04d4

Christoph Hellwig authored Feb 03, 2023

bdi_dev_name already performs the same check.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-cgroup: simplify blkg freeing from initialization failure paths · 27b642b0

Christoph Hellwig authored Feb 03, 2023

There is no need to delay freeing a blkg to a workqueue when freeing it
after an initialization failure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-cgroup: improve error unwinding in blkg_alloc · 0b6f93bd

Christoph Hellwig authored Feb 03, 2023

Unwind only the previous initialization steps that happened in blkg_alloc
using goto based unwinding. This avoids the need for the !queue special
case in blkg_free and thus ensures that any blkg seen outside of
blkg_alloc is always fully constructed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
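
The goto-based unwinding mentioned above follows a common C pattern, sketched here in user space. struct obj, obj_alloc() and obj_free() are hypothetical names and not the blkg code; the fail_step parameter exists only so the error paths can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object with three sub-allocations (not the real blkg). */
struct obj {
	int *ref;
	int *stats;
	int *policy_data;
};

/*
 * Allocate and fully construct an object. fail_step (1..3) forces that
 * step to fail; 0 means no forced failure. On error, only the steps
 * that already succeeded are undone, in reverse order.
 */
static struct obj *obj_alloc(int fail_step)
{
	struct obj *o;

	assert(fail_step >= 0 && fail_step <= 3);
	o = calloc(1, sizeof(*o));
	if (!o)
		return NULL;

	o->ref = fail_step == 1 ? NULL : malloc(sizeof(*o->ref));
	if (!o->ref)
		goto out_free_obj;
	o->stats = fail_step == 2 ? NULL : malloc(sizeof(*o->stats));
	if (!o->stats)
		goto out_free_ref;
	o->policy_data = fail_step == 3 ? NULL : malloc(sizeof(*o->policy_data));
	if (!o->policy_data)
		goto out_free_stats;
	return o;

out_free_stats:
	free(o->stats);
out_free_ref:
	free(o->ref);
out_free_obj:
	free(o);
	return NULL;
}

/* The free path can assume a fully constructed object: no special cases. */
static void obj_free(struct obj *o)
{
	free(o->policy_data);
	free(o->stats);
	free(o->ref);
	free(o);
}
```

Because a partially built object never escapes obj_alloc(), obj_free() needs no checks for half-initialized state, which is the property the commit message describes.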

blk-cgroup: delay blk-cgroup initialization until add_disk · 178fa7d4

Christoph Hellwig authored Feb 03, 2023

There is no need to initialize the cgroup code before the disk is marked
live. Moving the cgroup initialization later will help to have a
fully initialized struct device in the gendisk for the cgroup code to
use in the future. Similarly tear the cgroup information down in
del_gendisk to be symmetric and because none of the cgroup tracking is
needed once non-passthrough I/O stops.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: don't call blk_throtl_stat_add for non-READ/WRITE commands · a886001c

Christoph Hellwig authored Feb 03, 2023

blk_throtl_stat_add is called from blk_stat_add explicitly, unlike the
other stats that go through q->stats->callbacks. To prepare for cgroup
data moving to the gendisk, ensure blk_throtl_stat_add is only called
for the plain READ and WRITE commands that it actually handles internally,
as blk_stat_add can also be called for passthrough commands on queues that
do not have a gendisk associated with them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Herrmann <aherrmann@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230203150400.3199230-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

02 Feb, 2023 1 commit

Merge branch 'md-next' of... · 839c717b

Jens Axboe authored Feb 02, 2023

Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.3/block

Pull MD updates from Song:

"Non-urgent fixes:
   md: don't update recovery_cp when curr_resync is ACTIVE
   md: Free writes_pending in md_stop

 Performance optimization:
   md: Change active_io to percpu"

* 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  md: use MD_RESYNC_* whenever possible
  md: Free writes_pending in md_stop
  md: Change active_io to percpu
  md: Factor out is_md_suspended helper
  md: don't update recovery_cp when curr_resync is ACTIVE

01 Feb, 2023 6 commits

md: use MD_RESYNC_* whenever possible · ed821cf8

Hou Tao authored Feb 01, 2023

Just replace magic numbers with MD_RESYNC_* enumerations.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>

md: Free writes_pending in md_stop · 07dbb135

Xiao Ni authored Jan 21, 2023

dm raid calls md_stop to stop the raid device. It needs to
free the writes_pending here.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>

md: Change active_io to percpu · 72adae23

Xiao Ni authored Jan 31, 2023

Currently active_io is an atomic counter. It counts how many I/Os are
being submitted, so it is incremented and decremented on every I/O. But
it only needs to be checked for zero when suspending the raid device. So
we can switch it from atomic to percpu to improve performance.

After switching active_io to the percpu type, we use the state of active_io
to judge whether the raid device is suspended. We also no longer need to
wake up ->sb_wait in md_handle_request; that is done in the callback
function registered when initializing active_io. The mddev->suspended
field is now only used to count how many users are trying to put the raid
device into the suspended state.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
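
The trade-off described above (cheap updates on every I/O, an expensive zero check only at suspend time) can be illustrated with a sharded counter. This is a single-threaded user-space sketch under assumed names (struct sharded_counter, NR_SLOTS); it is not the per-CPU mechanism the kernel patch uses and it omits all synchronization.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SLOTS 4	/* hypothetical number of CPUs */

/* One counter per CPU; an individual slot may go negative. */
struct sharded_counter {
	long slot[NR_SLOTS];
};

/* Hot path: touch only the local slot, no shared cache line. */
static void counter_inc(struct sharded_counter *c, int cpu)
{
	assert(cpu >= 0 && cpu < NR_SLOTS);
	c->slot[cpu]++;
}

static void counter_dec(struct sharded_counter *c, int cpu)
{
	assert(cpu >= 0 && cpu < NR_SLOTS);
	c->slot[cpu]--;
}

/* Slow path: sum all slots; only needed when checking for zero. */
static bool counter_is_zero(const struct sharded_counter *c)
{
	long sum = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		sum += c->slot[i];
	return sum == 0;
}
```

An I/O may be counted on one CPU and completed on another, so only the sum over all slots is meaningful.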

md: Factor out is_md_suspended helper · d1932913

Xiao Ni authored Jan 31, 2023

This helper function will be used in the next patch. It also makes the
code easier to understand.
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>

md: don't update recovery_cp when curr_resync is ACTIVE · 1d1f25bf

Hou Tao authored Jan 31, 2023

Don't update recovery_cp when curr_resync is MD_RESYNC_ACTIVE, otherwise
md may skip the resync of the first 3 sectors if the resync procedure is
interrupted before the first call to ->sync_request(), as shown below:

md_do_sync thread          control thread
  // setup resync
  mddev->recovery_cp = 0
  j = 0
  mddev->curr_resync = MD_RESYNC_ACTIVE

                             // e.g., set array as idle
                             set_bit(MD_RECOVERY_INTR, &mddev->recovery)
  // resync loop
  // check INTR before calling sync_request
  !test_bit(MD_RECOVERY_INTR, &mddev->recovery)

  // resync interrupted
  // update recovery_cp from 0 to 3
  // the resync of the first 3 sectors will be skipped
  mddev->recovery_cp = 3

Fixes: eac58d08 ("md: Use enum for overloaded magic numbers used by mddev->curr_resync")
Cc: stable@vger.kernel.org # 6.0+
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>
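
The problem in the timeline above is that a state marker stored in curr_resync is mistaken for a sector offset. A simplified user-space sketch follows. MD_RESYNC_ACTIVE == 3 follows from the timeline; the other enumerator names and values, struct resync_state, save_checkpoint() and the exact comparison are assumptions for illustration and not the md code.

```c
#include <assert.h>
#include <stddef.h>

/* Low values of curr_resync are state markers, not sector offsets. */
enum {
	MD_RESYNC_NONE = 0,	/* assumed */
	MD_RESYNC_YIELDED = 1,	/* assumed */
	MD_RESYNC_DELAYED = 2,	/* assumed */
	MD_RESYNC_ACTIVE = 3,	/* matches "from 0 to 3" in the timeline */
};

struct resync_state {
	unsigned long long recovery_cp;	/* checkpoint: resync restarts here */
	unsigned long long curr_resync;	/* marker or current sector */
};

/*
 * Record progress after a resync stops. With strict == 0 the marker value
 * MD_RESYNC_ACTIVE itself counts as progress (the behaviour described as
 * buggy); with strict == 1 only values above the marker do.
 */
static void save_checkpoint(struct resync_state *s, int strict)
{
	int progressed;

	assert(s != NULL);
	progressed = strict ? s->curr_resync > MD_RESYNC_ACTIVE
			    : s->curr_resync >= MD_RESYNC_ACTIVE;
	if (progressed && s->curr_resync > s->recovery_cp)
		s->recovery_cp = s->curr_resync;
}
```

With the strict comparison, an interruption before any sector has been synced leaves recovery_cp at 0, so the next resync starts from the beginning.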

loop: Improve the hw_queue_depth kernel module parameter implementation · e152a05f

Bart Van Assche authored Jan 30, 2023

Make the following minor changes which were reported by colleagues
while reviewing this code:
- Remove the parentheses from around the LOOP_DEFAULT_HW_Q_DEPTH
  definition since these are superfluous.
- Accept number formats other than decimal, e.g. hexadecimal.
- Do not set hw_queue_depth to an out-of-range value, even if that value
  won't be used.
- Use the LOOP_DEFAULT_HW_Q_DEPTH macro in the kernel module parameter
  description to prevent the description from getting out of sync.

This patch has been tested as follows:

 # modprobe -r loop
 # modprobe loop hw_queue_depth=-1
 modprobe: ERROR: could not insert 'loop': Invalid argument
 # modprobe loop hw_queue_depth=0
 modprobe: ERROR: could not insert 'loop': Invalid argument
 # modprobe loop hw_queue_depth=1; cat /sys/module/loop/parameters/hw_queue_depth
 1
 # modprobe -r loop; modprobe loop hw_queue_depth=0x10; cat /sys/module/loop/parameters/hw_queue_depth
 16
 # modprobe -r loop; modprobe loop hw_queue_depth=128; cat /sys/module/loop/parameters/hw_queue_depth
 128
 # modprobe -r loop; modprobe loop hw_queue_depth=129; cat /sys/module/loop/parameters/hw_queue_depth
 129
 # modprobe -r loop; modprobe loop hw_queue_depth=$((1<<32))
 modprobe: ERROR: could not insert 'loop': Numerical result out of range

See also commit ef44c508 ("loop: allow user to set the queue
depth").

Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20230130211347.832110-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
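
The parsing behaviour exercised by the test transcript above (base auto-detection, rejection of values below 1, a range error on overflow) can be approximated in user space as follows. parse_queue_depth() is a hypothetical name, and the sketch uses strtoll() from the C library rather than the kernel's own parsing helpers.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a queue depth. Returns 0 and stores the value in *depth on
 * success, or a negative errno value on failure. *depth is left
 * untouched on failure, so no out-of-range value is ever stored.
 */
static int parse_queue_depth(const char *s, int *depth)
{
	char *end;
	long long val;

	assert(s != NULL && depth != NULL);
	errno = 0;
	val = strtoll(s, &end, 0);	/* base 0: decimal, 0x hex, leading-0 octal */
	if (end == s || *end != '\0')
		return -EINVAL;		/* empty input or trailing garbage */
	if (errno == ERANGE || val > INT_MAX || val < INT_MIN)
		return -ERANGE;		/* does not fit in an int */
	if (val < 1)
		return -EINVAL;		/* depth must be at least 1 */
	*depth = (int)val;
	return 0;
}
```

In this sketch -EINVAL and -ERANGE correspond to the "Invalid argument" and "Numerical result out of range" messages seen in the transcript.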

31 Jan, 2023 2 commits

block: Remove mm.h from bvec.h · 2d97930d

Matthew Wilcox authored Jan 31, 2023

This was originally added for the definition of nth_page(), but we no
longer use nth_page() in this header, so we can drop the heavyweight
mm.h now.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/20230131050132.2627124-1-willy@infradead.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ublk_drv: only allow owner to open unprivileged disk · 48a90519

Ming Lei authored Jan 31, 2023

The owner of an unprivileged ublk device could be a malicious user who
deliberately grants access to the disk to other users, setting a trap
and waiting for them to fall into it.

So only allow the owner to open an unprivileged disk, even if the owner
has granted disk privileges to other users. This is reasonable given that
anyone can create a ublk disk and needs no grant from others.
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Fixes: 4093cb5a ("ublk_drv: add mechanism for supporting unprivileged ublk device")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230131040446.214583-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
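
The owner-only rule described above amounts to an identity check at open time. The sketch below is a user-space illustration with hypothetical names (struct ublk_dev_sketch, ublk_open_check()); comparing both uid and gid is an assumption, and the real driver performs other checks that are omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a ublk device. */
struct ublk_dev_sketch {
	bool unprivileged;	/* created by an unprivileged user */
	unsigned int owner_uid;
	unsigned int owner_gid;
};

/*
 * Decide whether the caller (uid, gid) may open the disk. For an
 * unprivileged device only the owner may, regardless of the file
 * permissions the owner has set on the device node.
 */
static int ublk_open_check(const struct ublk_dev_sketch *ub,
			   unsigned int uid, unsigned int gid)
{
	assert(ub != NULL);
	if (!ub->unprivileged)
		return 0;	/* this rule does not apply */
	if (uid != ub->owner_uid || gid != ub->owner_gid)
		return -EPERM;
	return 0;
}
```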

30 Jan, 2023 10 commits

block: Default to use cgroup support for BFQ · 4a6a7bc2

Ulf Hansson authored Jan 30, 2023

Assuming that both Kconfig options, BLK_CGROUP and IOSCHED_BFQ are set, we
most likely want cgroup support for BFQ too (BFQ_GROUP_IOSCHED), so let's
make it default y.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20230130121240.159456-1-ulf.hansson@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>