Commit 28aabffa authored by Jens Axboe's avatar Jens Axboe

io_uring/sqpoll: close race on waiting for sqring entries

When an application uses SQPOLL, it must wait for the SQPOLL thread to
consume SQE entries, if it fails to get an sqe when calling
io_uring_get_sqe(). It can do so by calling io_uring_enter(2) with the
flag value of IORING_ENTER_SQ_WAIT. In liburing, this is generally done
with io_uring_sqring_wait(). There's a natural expectation that once
this call returns, a new SQE entry can be retrieved, filled out, and
submitted. However, the kernel uses the cached sq head to determine if
the SQRING is full or not. If the SQPOLL thread is currently in the
process of submitting SQE entries, it may have updated the cached sq
head, but not yet committed it to the SQ ring. Hence the kernel may find
that there are SQE entries ready to be consumed, and return successfully
to the application. If the SQPOLL thread hasn't yet committed the SQ
ring entries by the time the application returns to userspace and
attempts to get a new SQE, it will fail getting a new SQE.

Fix this by having io_sqring_full() always use the user visible SQ ring
head entry, rather than the internally cached one.

Cc: stable@vger.kernel.org # 5.10+
Link: https://github.com/axboe/liburing/discussions/1267Reported-by: default avatarBenedek Thaler <thaler@thaler.hu>
Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
parent f7c91343
...@@ -284,7 +284,14 @@ static inline bool io_sqring_full(struct io_ring_ctx *ctx) ...@@ -284,7 +284,14 @@ static inline bool io_sqring_full(struct io_ring_ctx *ctx)
{ {
struct io_rings *r = ctx->rings; struct io_rings *r = ctx->rings;
return READ_ONCE(r->sq.tail) - ctx->cached_sq_head == ctx->sq_entries; /*
* SQPOLL must use the actual sqring head, as using the cached_sq_head
* is race prone if the SQPOLL thread has grabbed entries but not yet
* committed them to the ring. For !SQPOLL, this doesn't matter, but
* since this helper is just used for SQPOLL sqring waits (or POLLOUT),
* just read the actual sqring head unconditionally.
*/
return READ_ONCE(r->sq.tail) - READ_ONCE(r->sq.head) == ctx->sq_entries;
} }
static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx) static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment