Commit 63135aa3 authored by Jens Axboe, committed by Linus Torvalds

mm: provide filemap_range_needs_writeback() helper

Patch series "Improve IOCB_NOWAIT O_DIRECT reads", v3.

An internal workload complained because it was using too much CPU, and
when I took a look, we had a lot of io_uring workers going to town.

For an async buffered read like workload, I am normally expecting _zero_
offloads to a worker thread, but this one had tons of them.  I'd drop
caches and things would look good again, but then a minute later we'd
regress back to using workers.  Turns out that every minute something
was reading parts of the device, which would add page cache for that
inode.  I put patches like these in for our kernel, and the problem was
solved.

Don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
entries for the given range.  This causes unnecessary work on the
caller's side, when the IO could have been issued totally fine without
blocking on writeback when there is none.
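
To make the behaviour change concrete, here is a minimal userspace sketch of the decision described above. It is a toy model, not kernel code: `struct range_state`, `dio_read_prep_old()` and `dio_read_prep_new()` are hypothetical names invented for illustration, and the blocking (non-NOWAIT) flush-and-wait path is simply modelled as success.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of the page cache state over the I/O range. */
struct range_state {
	bool has_page;		/* at least one page is cached in the range */
	bool needs_writeback;	/* a cached page is dirty, locked or under writeback */
};

/* Old policy: any cached page bounces a NOWAIT read back to the caller. */
static int dio_read_prep_old(const struct range_state *rs, bool nowait)
{
	if (nowait)
		return rs->has_page ? -EAGAIN : 0;
	/* A blocking caller would flush and wait here; modelled as success. */
	return 0;
}

/* New policy: only pages that actually need writeback cause -EAGAIN. */
static int dio_read_prep_new(const struct range_state *rs, bool nowait)
{
	if (nowait)
		return rs->needs_writeback ? -EAGAIN : 0;
	/* A blocking caller would flush and wait here; modelled as success. */
	return 0;
}
```

With a range that holds only clean cached pages, `dio_read_prep_old()` returns -EAGAIN for a NOWAIT read while `dio_read_prep_new()` returns 0, which is the case that was pushing reads to worker threads.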

This patch (of 3):

For O_DIRECT reads/writes, we check if we need to issue a call to
filemap_write_and_wait_range() to issue and/or wait for writeback for any
page in the given range.  The existing mechanism just checks for a page in
the range, which is suboptimal for IOCB_NOWAIT as we'll fall back to the
slow path (and need a retry) if there's just a clean page cache page in
the range.

Provide filemap_range_needs_writeback() which tries a little harder to
check if we actually need to issue and/or wait for writeback in the range.
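
The difference between the two checks can be sketched in userspace with a plain array standing in for the page cache. This is only an illustration under stated assumptions: `struct toy_page` and both `toy_range_*()` helpers are hypothetical, and the real helper walks an XArray under RCU rather than indexing an array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state; this is NOT the kernel's struct page. */
struct toy_page {
	bool present;	/* page is in the (toy) page cache */
	bool dirty;
	bool locked;
	bool writeback;
};

/* Existing-style check: is any page cached in [first, last] (inclusive)? */
static bool toy_range_has_page(const struct toy_page *pages, size_t npages,
			       size_t first, size_t last)
{
	for (size_t i = first; i <= last && i < npages; i++)
		if (pages[i].present)
			return true;
	return false;
}

/* New-style check: is any cached page dirty, locked or under writeback? */
static bool toy_range_needs_writeback(const struct toy_page *pages,
				      size_t npages, size_t first, size_t last)
{
	for (size_t i = first; i <= last && i < npages; i++) {
		const struct toy_page *p = &pages[i];

		if (!p->present)
			continue;
		if (p->dirty || p->locked || p->writeback)
			return true;
	}
	return false;
}
```

A range containing only clean, unlocked pages makes the first helper return true and the second return false, so a NOWAIT caller using the second check can proceed without a retry.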

Link: https://lkml.kernel.org/r/20210224164455.1096727-1-axboe@kernel.dk
Link: https://lkml.kernel.org/r/20210224164455.1096727-2-axboe@kernel.dk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent dce44566
@@ -2878,6 +2878,8 @@ static inline int filemap_fdatawait(struct address_space *mapping)
extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
				   loff_t lend);
extern bool filemap_range_needs_writeback(struct address_space *,
					  loff_t lstart, loff_t lend);
extern int filemap_write_and_wait_range(struct address_space *mapping,
					loff_t lstart, loff_t lend);
extern int __filemap_fdatawrite_range(struct address_space *mapping,
......
@@ -635,6 +635,49 @@ static bool mapping_needs_writeback(struct address_space *mapping)
	return mapping->nrpages;
}

/**
 * filemap_range_needs_writeback - check if range potentially needs writeback
 * @mapping: address space within which to check
 * @start_byte: offset in bytes where the range starts
 * @end_byte: offset in bytes where the range ends (inclusive)
 *
 * Find at least one page in the range supplied, usually used to check if
 * direct writing in this range will trigger a writeback. Used by O_DIRECT
 * read/write with IOCB_NOWAIT, to see if the caller needs to do
 * filemap_write_and_wait_range() before proceeding.
 *
 * Return: %true if the caller should do filemap_write_and_wait_range() before
 * doing O_DIRECT to a page in this range, %false otherwise.
 */
bool filemap_range_needs_writeback(struct address_space *mapping,
				   loff_t start_byte, loff_t end_byte)
{
	XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
	pgoff_t max = end_byte >> PAGE_SHIFT;
	struct page *page;

	if (!mapping_needs_writeback(mapping))
		return false;
	if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
	    !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
		return false;
	if (end_byte < start_byte)
		return false;

	rcu_read_lock();
	xas_for_each(&xas, page, max) {
		if (xas_retry(&xas, page))
			continue;
		if (xa_is_value(page))
			continue;
		if (PageDirty(page) || PageLocked(page) || PageWriteback(page))
			break;
	}
	rcu_read_unlock();

	return page != NULL;
}
EXPORT_SYMBOL_GPL(filemap_range_needs_writeback);

/**
 * filemap_write_and_wait_range - write out & wait on a file range
 * @mapping: the address_space for the pages
......
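
The helper turns its byte range into page indices by shifting both ends right by PAGE_SHIFT, with the end byte treated as inclusive. The arithmetic can be sketched in userspace as follows. `TOY_PAGE_SHIFT` is an assumption (4 KiB pages), and the `toy_*` function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Index of the page containing start_byte. */
static uint64_t toy_first_index(int64_t start_byte)
{
	return (uint64_t)start_byte >> TOY_PAGE_SHIFT;
}

/* Index of the page containing end_byte (end_byte is inclusive). */
static uint64_t toy_last_index(int64_t end_byte)
{
	return (uint64_t)end_byte >> TOY_PAGE_SHIFT;
}

/* Number of pages a walk over [start_byte, end_byte] would visit. */
static uint64_t toy_pages_spanned(int64_t start_byte, int64_t end_byte)
{
	/* Mirrors the early return for an inverted range. */
	if (end_byte < start_byte)
		return 0;
	return toy_last_index(end_byte) - toy_first_index(start_byte) + 1;
}
```

Under the 4 KiB assumption, the range 0..4095 stays within one page, while 0..4096 touches two, because the inclusive end byte 4096 is the first byte of the second page.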