1. 01 May, 2024 2 commits
    • io_uring/msg_ring: cleanup posting to IOPOLL vs !IOPOLL ring · 59b28a6e
      Jens Axboe authored
      Move the posting outside of the checking and locking; it's cleaner
      that way.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: Require zeroed sqe->len on provided-buffers send · 79996b45
      Gabriel Krisman Bertazi authored
      When sending from a provided buffer, we set sr->len to be the smaller
      of the actual buffer size and sqe->len. But now that the buffer is
      disconnected from the submission request, we can get into a situation
      where the buffers and requests mismatch, and only part of a buffer
      gets sent. Assume:
      
      * buf[1]->len = 128; buf[2]->len = 256
      * sqe[1]->len = 128; sqe[2]->len = 256
      
      If sqe[1] runs first, it picks buf[1] and all is good. But if sqe[2]
      runs first, it consumes buf[1]; sqe[1] then picks buf[2], and since
      sqe[1]->len is only 128, the last half of buf[2] is never sent.
      
      While the use-case of different-length sends is arguably questionable,
      it has already caused confusion among potential users of this
      feature. Let's make the interface less tricky by forcing the length to
      come only from the buffer ring entry itself.
      
      Fixes: ac5f71a3 ("io_uring/net: add provided buffer support for IORING_OP_SEND")
      Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
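      The resulting API contract: a provided-buffer send must leave sqe->len
      zeroed and let the buffer ring entry dictate the length. A minimal
      liburing sketch, where sockfd and bgid are placeholder values:

          #include <liburing.h>

          /* Hedged sketch: the send length comes from the buffer ring entry,
           * so sqe->len stays 0. sockfd and bgid are assumptions. */
          static void queue_pb_send(struct io_uring *ring, int sockfd, int bgid)
          {
                  struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

                  io_uring_prep_send(sqe, sockfd, NULL, 0, 0); /* len must be 0 */
                  sqe->flags |= IOSQE_BUFFER_SELECT;  /* pick from provided buffers */
                  sqe->buf_group = bgid;              /* buffer ring group id */
          }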
  2. 30 Apr, 2024 2 commits
  3. 26 Apr, 2024 1 commit
  4. 25 Apr, 2024 1 commit
    • io_uring/rw: reinstate thread check for retries · 039a2e80
      Jens Axboe authored
      Allowing retries for everything is arguably the right thing to do, now
      that every command type is async read from the start. But it has exposed
      a few issues around a missing check for whether a retry is valid (which
      cca65713 exposed), and the fixup commit for that isn't necessarily 100%
      sound in terms of iov_iter state.
      
      For now, just revert these two commits. This unfortunately re-opens the
      issue that -EAGAIN can get bubbled up to userspace for some cases where
      the kernel could just sanely retry them. But until all the conditions
      around that are covered, we cannot safely enable it.
      
      This reverts commit df604d2a.
      This reverts commit cca65713.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  5. 23 Apr, 2024 3 commits
  6. 22 Apr, 2024 7 commits
    • net: add callback for setting a ubuf_info to skb · 65bada80
      Pavel Begunkov authored
      At the moment an skb can only have one ubuf_info associated with it,
      which might be a performance problem for zerocopy sends in cases like
      TCP via io_uring. Add a callback for assigning a ubuf_info to an skb;
      this way we can later implement smarter assignment, like linking
      ubuf_info structures together.
      
      Note that it's an optional callback, and it should be compatible with
      skb_zcopy_set(); that's because the net stack might decide to clone an
      skb and take another reference to the ubuf_info whenever it wishes.
      Also, a correct implementation should always be able to bind to an skb
      without a prior ubuf_info, otherwise we could end up in a situation
      where the send would not be able to progress.
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/all/b7918aadffeb787c84c9e72e34c729dc04f3a45d.1713369317.git.asml.silence@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
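      A hedged sketch of how such an optional hook might be consulted on the
      transmit path; the names follow the commit text, but treat the exact
      wiring and signatures as assumptions rather than the upstream code:

          /* Illustrative only: prefer the provider's link_skb callback when
           * set, otherwise fall back to the generic skb_zcopy_set() path. */
          static int skb_link_ubuf_info(struct sk_buff *skb, struct ubuf_info *uarg)
          {
                  if (uarg->ops->link_skb)
                          return uarg->ops->link_skb(skb, uarg);

                  skb_zcopy_set(skb, uarg, NULL); /* takes its own uarg reference */
                  return 0;
          }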
    • net: extend ubuf_info callback to ops structure · 7ab4f16f
      Pavel Begunkov authored
      We'll need to associate additional callbacks with ubuf_info, so
      introduce a structure holding the ubuf_info callbacks. Apart from the
      smarter io_uring notification management introduced in the following
      patches, it can be used to generalise msg_zerocopy_put_abort() and also
      to store ->sg_from_iter, which is currently passed in struct msghdr.
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/all/a62015541de49c0e2a8a0377a1d5d0a5aeb07016.1713369317.git.asml.silence@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
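      Based on the commit text, the ops table would look roughly like the
      sketch below; member names and signatures are assumptions inferred from
      this patch and the link_skb patch above:

          /* Hedged sketch of a callbacks table for ubuf_info. */
          struct ubuf_info_ops {
                  /* invoked when the zerocopy data has been consumed */
                  void (*complete)(struct sk_buff *skb, struct ubuf_info *uarg,
                                   bool zerocopy_success);
                  /* optional, see 65bada80 above; must stay compatible with
                   * skb_zcopy_set() */
                  int (*link_skb)(struct sk_buff *skb, struct ubuf_info *uarg);
          };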
    • io_uring/net: support bundles for recv · 2f9c9515
      Jens Axboe authored
      If IORING_OP_RECV is used with provided buffers, the caller may also set
      IORING_RECVSEND_BUNDLE to turn it into a multi-buffer recv. This grabs
      the buffers that are available and receives into them, posting a single
      completion for all of them.
      
      This can be used with multishot receive as well, or without it.
      
      Now that both send and receive support bundles, add a feature flag for
      it as well. If IORING_FEAT_RECVSEND_BUNDLE is set after registering the
      ring, then the kernel supports bundles for recv and send.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
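      A minimal liburing-style sketch of arming a bundled multishot recv;
      sockfd and bgid are placeholders, and the feature check assumes a ring
      initialised with io_uring_queue_init():

          #include <liburing.h>

          /* Hedged sketch: bundle multiple provided buffers into a single
           * recv completion. sockfd and bgid are assumptions. */
          static void queue_bundled_recv(struct io_uring *ring, int sockfd, int bgid)
          {
                  if (!(ring->features & IORING_FEAT_RECVSEND_BUNDLE))
                          return; /* kernel lacks bundle support */

                  struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

                  io_uring_prep_recv_multishot(sqe, sockfd, NULL, 0, 0);
                  sqe->flags |= IOSQE_BUFFER_SELECT;     /* use provided buffers */
                  sqe->buf_group = bgid;                 /* registered buffer ring */
                  sqe->ioprio |= IORING_RECVSEND_BUNDLE; /* multi-buffer completion */
          }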
    • io_uring/net: support bundles for send · a05d1f62
      Jens Axboe authored
      If IORING_OP_SEND is used with provided buffers, the caller may also
      set IORING_RECVSEND_BUNDLE to turn it into a multi-buffer send. The idea
      is that an application can fill outgoing buffers in a provided buffer
      group, and then arm a single send that will service them all. Once
      there are no more buffers to send, or if the requested length has
      been sent, the request posts a single completion for all the buffers.
      
      This only enables it for IORING_OP_SEND; IORING_OP_SENDMSG support is
      coming in a separate patch. However, this patch does a lot of the prep
      work that makes wiring up the sendmsg variant pretty trivial, as they
      share the prep side.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
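      The send side follows the same pattern; a hedged sketch in which the
      outgoing buffers are assumed to have been queued into the group's
      buffer ring beforehand (sockfd and bgid are placeholders):

          #include <liburing.h>

          /* Hedged sketch: one bundled send services all buffers currently
           * queued in the group. sockfd and bgid are assumptions. */
          static void queue_bundled_send(struct io_uring *ring, int sockfd, int bgid)
          {
                  struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

                  io_uring_prep_send(sqe, sockfd, NULL, 0, 0); /* sizes come from the ring */
                  sqe->flags |= IOSQE_BUFFER_SELECT;
                  sqe->buf_group = bgid;
                  sqe->ioprio |= IORING_RECVSEND_BUNDLE; /* single CQE for all buffers */
          }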
    • io_uring/kbuf: add helpers for getting/peeking multiple buffers · 35c8711c
      Jens Axboe authored
      Our provided buffer interface only allows selection of a single buffer.
      Add an API that allows getting/peeking multiple buffers at the same time.
      
      This is only implemented for the ring provided buffers. It could be added
      for the legacy provided buffers as well, but since it's strongly
      encouraged to use the new interface, let's keep it simpler and just
      provide it for the new API. The legacy interface will always just select
      a single buffer.
      
      There are two new main functions:
      
      io_buffers_select(), which selects as many buffers as it can. The
      caller supplies the iovec array, and io_buffers_select() may allocate a
      bigger array if the 'out_len' being passed in is non-zero and bigger
      than what fits in the provided iovec. Buffers grabbed with this helper
      are permanently assigned.
      
      io_buffers_peek(), which works like io_buffers_select(), except the
      buffers can be recycled, if needed. Callers using either of these
      functions should call io_put_kbufs() rather than io_put_kbuf() at
      completion time. The peek interface must be called with the ctx locked
      from peek to completion.
      
      This adds a bit of state for the request:
      
      - REQ_F_BUFFERS_COMMIT, which means that the buffers have been
        peeked and should be committed to the buffer ring head when they are
        put as part of completion. Prior to this, req->buf_list was cleared to
        NULL when committed.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
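      A hedged sketch of the peek-then-commit pattern described above, written
      as pseudocode: the struct fields and signatures are assumptions based on
      this commit message, not the exact kernel API:

          /* Illustrative pseudocode only. The ctx must stay locked from the
           * peek until completion. */
          struct iovec iovs[8];
          struct buf_sel_arg arg = {
                  .iovs = iovs,   /* caller-supplied iovec array */
                  .nr_iovs = 8,
                  .out_len = 0,   /* non-zero may allow a bigger allocation */
          };

          int nr = io_buffers_peek(req, &arg);    /* buffers remain recyclable */
          if (nr > 0) {
                  /* ... issue the I/O over arg.iovs[0..nr-1] ... */
                  io_put_kbufs(req, nr, issue_flags); /* REQ_F_BUFFERS_COMMIT set:
                                                       * commit the ring head now */
          }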
    • io_uring/net: add provided buffer support for IORING_OP_SEND · ac5f71a3
      Jens Axboe authored
      It's pretty trivial to wire up provided buffer support for the send
      side, just like how it's done on the receive side. This enables setting
      up a buffer ring that an application can use to push pending sends to,
      and then have a send pick a buffer from that ring.
      
      One of the challenges with async IO and networking sends is that you
      can get into reordering conditions if you have more than one inflight
      at the same time. Consider the following scenario where everything is
      fine:
      
      1) App queues sendA for socket1
      2) App queues sendB for socket1
      3) App does io_uring_submit()
      4) sendA is issued, completes successfully, posts CQE
      5) sendB is issued, completes successfully, posts CQE
      
      All is fine. Requests are always issued in-order, and both complete
      inline as most sends do.
      
      However, if we're flooding socket1 with sends, the following could
      also result from the same sequence:
      
      1) App queues sendA for socket1
      2) App queues sendB for socket1
      3) App does io_uring_submit()
      4) sendA is issued, socket1 is full, poll is armed for retry
      5) Space frees up in socket1, this triggers sendA retry via task_work
      6) sendB is issued, completes successfully, posts CQE
      7) sendA is retried, completes successfully, posts CQE
      
      Now we've sent sendB before sendA, which can make things unhappy. If
      both sendA and sendB had been using provided buffers, then it would look
      as follows instead:
      
      1) App queues dataA for sendA, queues sendA for socket1
      2) App queues dataB for sendB, queues sendB for socket1
      3) App does io_uring_submit()
      4) sendA is issued, socket1 is full, poll is armed for retry
      5) Space frees up in socket1, this triggers sendA retry via task_work
      6) sendB is issued, picks first buffer (dataA), completes successfully,
         posts CQE (which says "I sent dataA")
      7) sendA is retried, picks first buffer (dataB), completes successfully,
         posts CQE (which says "I sent dataB")
      
      Now we've sent the data in order, and everybody is happy.
      
      It's worth noting that this also opens the door for supporting multishot
      sends, as provided buffers would be a prerequisite for that. Those can
      trigger either when new buffers are added to the outgoing ring, or (if
      stalled due to lack of space) when space frees up in the socket.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
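      A hedged end-to-end sketch of the ordering-safe pattern above, using
      liburing's buffer ring helpers; bgid, sockfd, and the data buffers are
      placeholders:

          #include <liburing.h>

          /* Hedged sketch: push dataA then dataB into a buffer ring, then arm
           * two provided-buffer sends. Whichever send issues first picks the
           * ring head (dataA), so the data goes out in queue order. */
          static void ordered_sends(struct io_uring *ring, int sockfd, int bgid,
                                    void *dataA, unsigned lenA,
                                    void *dataB, unsigned lenB)
          {
                  int err;
                  struct io_uring_buf_ring *br =
                          io_uring_setup_buf_ring(ring, 8, bgid, 0, &err);
                  if (!br)
                          return;

                  io_uring_buf_ring_add(br, dataA, lenA, 0, io_uring_buf_ring_mask(8), 0);
                  io_uring_buf_ring_add(br, dataB, lenB, 1, io_uring_buf_ring_mask(8), 1);
                  io_uring_buf_ring_advance(br, 2);

                  for (int i = 0; i < 2; i++) {
                          struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
                          io_uring_prep_send(sqe, sockfd, NULL, 0, 0);
                          sqe->flags |= IOSQE_BUFFER_SELECT;
                          sqe->buf_group = bgid;
                  }
                  io_uring_submit(ring);
          }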
    • io_uring/net: add generic multishot retry helper · 3e747ded
      Jens Axboe authored
      This is just moving io_recv_prep_retry() higher up so it can get used
      for sends as well, and renaming it to be generically useful for both
      sends and receives.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  7. 17 Apr, 2024 3 commits
  8. 15 Apr, 2024 21 commits