Commit f04d16ee authored by Haibo Li's avatar Haibo Li Committed by Andrew Morton

mm/filemap.c: fix update prev_pos after one read request done

ra->prev_pos tracks the last visited byte in the previous read request. 
It is used to check whether it is sequential read in ondemand_readahead
and thus affects the readahead window.

After commit 06c04442 ("mm/filemap.c: generic_file_buffered_read() now
uses find_get_pages_contig"), update logic of prev_pos is changed.  It
updates prev_pos after each return from filemap_get_pages().  But the read
request from user may be not fully completed at this point.  The updated
prev_pos impacts the subsequent readahead window.

The real problem is performance drop of fsck_msdos between linux-5.4 and
linux-5.15(also linux-6.4).  Comparing to linux-5.4,It spends about 110%
time and read 140% pages.  The read pattern of fsck_msdos is not fully
sequential.

Simplified read pattern of fsck_msdos likes below:
1.read at page offset 0xa,size 0x1000
2.read at other page offset like 0x20,size 0x1000
3.read at page offset 0xa,size 0x4000
4.read at page offset 0xe,size 0x1000

Here is the read status on linux-6.4:
1.after read at page offset 0xa,size 0x1000
    ->page ofs 0xa go into pagecache
2.after read at page offset 0x20,size 0x1000
    ->page ofs 0x20 go into pagecache
3.read at page offset 0xa,size 0x4000
    ->filemap_get_pages read ofs 0xa from pagecache and returns
    ->prev_pos is updated to 0xb and goto next loop
    ->filemap_get_pages tends to read ofs 0xb,size 0x3000
    ->initial_readahead case in ondemand_readahead since prev_pos is
      the same as request ofs.
    ->read 8 pages while async size is 5 pages
      (PageReadahead flag at page 0xe)
4.read at page offset 0xe,size 0x1000
    ->hit page 0xe with PageReadahead flag set,double the ra_size.
      read 16 pages while async size is 16 pages
Now it reads 24 pages while actually uses 5 pages

on linux-5.4:
1.the same as 6.4
2.the same as 6.4
3.read at page offset 0xa,size 0x4000
    ->read ofs 0xa from pagecache
    ->read ofs 0xb,size 0x3000 using page_cache_sync_readahead
      read 3 pages
    ->prev_pos is updated to 0xd before generic_file_buffered_read
      returns
4.read at page offset 0xe,size 0x1000
    ->initial_readahead case in ondemand_readahead since
      request ofs-prev_pos==1
    ->read 4 pages while async size is 3 pages

Now it reads 7 pages while actually uses 5 pages.

In above demo, the initial_readahead case is triggered by offset of user
request on linux-5.4.  While it may be triggered by update logic of
prev_pos on linux-6.4.

To fix the performance drop, update prev_pos after finishing one read
request.

Link: https://lkml.kernel.org/r/20230628110220.120134-1-haibo.li@mediatek.comSigned-off-by: default avatarHaibo Li <haibo.li@mediatek.com>
Reviewed-by: default avatarJan Kara <jack@suse.cz>
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
parent de4ec376
......@@ -2632,6 +2632,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
int i, error = 0;
bool writably_mapped;
loff_t isize, end_offset;
loff_t last_pos = ra->prev_pos;
if (unlikely(iocb->ki_pos >= inode->i_sb->s_maxbytes))
return 0;
......@@ -2682,8 +2683,8 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
* When a read accesses the same folio several times, only
* mark it as accessed the first time.
*/
if (!pos_same_folio(iocb->ki_pos, ra->prev_pos - 1,
fbatch.folios[0]))
if (!pos_same_folio(iocb->ki_pos, last_pos - 1,
fbatch.folios[0]))
folio_mark_accessed(fbatch.folios[0]);
for (i = 0; i < folio_batch_count(&fbatch); i++) {
......@@ -2710,7 +2711,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
already_read += copied;
iocb->ki_pos += copied;
ra->prev_pos = iocb->ki_pos;
last_pos = iocb->ki_pos;
if (copied < bytes) {
error = -EFAULT;
......@@ -2724,7 +2725,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
} while (iov_iter_count(iter) && iocb->ki_pos < isize && !error);
file_accessed(filp);
ra->prev_pos = last_pos;
return already_read ? already_read : error;
}
EXPORT_SYMBOL_GPL(filemap_read);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment