Commit 546a1924, authored and committed by Dave Chinner

writeback: write_cache_pages doesn't terminate at nr_to_write <= 0

I noticed XFS writeback in 2.6.36-rc1 was much slower than it should have
been. Enabling writeback tracing showed:

    flush-253:16-8516  [007] 1342952.351608: wbc_writepage: bdi 253:16: towrt=1024 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516  [007] 1342952.351654: wbc_writepage: bdi 253:16: towrt=1023 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516  [000] 1342952.369520: wbc_writepage: bdi 253:16: towrt=0 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516  [000] 1342952.369542: wbc_writepage: bdi 253:16: towrt=-1 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516  [000] 1342952.369549: wbc_writepage: bdi 253:16: towrt=-2 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0

Writeback is not terminating in background writeback if ->writepage is
returning with wbc->nr_to_write == 0, resulting in sub-optimal single page
writeback on XFS.

Fix the write_cache_pages loop to terminate correctly when this situation
occurs and so prevent this sub-optimal background writeback pattern. This
improves sustained sequential buffered write performance from around
250MB/s to 750MB/s for a 100GB file on an XFS filesystem on my 8p test VM.
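
The difference between the two termination tests can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct wbc_model`, `simulate()` and the `first_call_consumes` parameter are hypothetical names introduced for illustration, and the model assumes that the first ->writepage call itself consumes part or all of the nr_to_write budget, so that it can return with nr_to_write == 0.

```c
#include <assert.h>

enum sync_mode { WB_SYNC_NONE, WB_SYNC_ALL };

/* Hypothetical, minimal stand-in for struct writeback_control. */
struct wbc_model {
	long nr_to_write;
	enum sync_mode sync_mode;
};

typedef int (*stop_fn)(struct wbc_model *wbc);

/* Old test: never fires once nr_to_write is already <= 0. */
static int old_should_stop(struct wbc_model *wbc)
{
	if (wbc->nr_to_write > 0) {
		if (--wbc->nr_to_write == 0 &&
		    wbc->sync_mode == WB_SYNC_NONE)
			return 1;
	}
	return 0;
}

/* New test: also fires when ->writepage returned with nr_to_write <= 0. */
static int new_should_stop(struct wbc_model *wbc)
{
	return --wbc->nr_to_write <= 0 &&
	       wbc->sync_mode == WB_SYNC_NONE;
}

/*
 * Walk dirty_pages pages and return how many mock ->writepage calls were
 * made before the loop stopped. The first call subtracts
 * first_call_consumes from the budget (an assumption of this model).
 */
static long simulate(stop_fn should_stop, enum sync_mode mode, long budget,
		     long dirty_pages, long first_call_consumes)
{
	struct wbc_model wbc = { .nr_to_write = budget, .sync_mode = mode };
	long calls = 0;

	assert(budget > 0 && dirty_pages >= 0);

	for (long page = 0; page < dirty_pages; page++) {
		if (page == 0)
			wbc.nr_to_write -= first_call_consumes;
		calls++;
		if (should_stop(&wbc))
			break;
	}
	return calls;
}
```

In this model, with a budget of 1024 and 5000 dirty pages, a first call that consumes the whole budget makes the old test walk all 5000 pages, while the new test stops after the first call. When ->writepage consumes nothing itself, both tests stop after 1024 calls, and under WB_SYNC_ALL neither test stops early.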

Cc:<stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
parent 4536f2ad
@@ -985,22 +985,16 @@ int write_cache_pages(struct address_space *mapping,
 					}
 				}
 
-			if (wbc->nr_to_write > 0) {
-				if (--wbc->nr_to_write == 0 &&
-				    wbc->sync_mode == WB_SYNC_NONE) {
-					/*
-					 * We stop writing back only if we are
-					 * not doing integrity sync. In case of
-					 * integrity sync we have to keep going
-					 * because someone may be concurrently
-					 * dirtying pages, and we might have
-					 * synced a lot of newly appeared dirty
-					 * pages, but have not synced all of the
-					 * old dirty pages.
-					 */
-					done = 1;
-					break;
-				}
+			/*
+			 * We stop writing back only if we are not doing
+			 * integrity sync. In case of integrity sync we have to
+			 * keep going until we have written all the pages
+			 * we tagged for writeback prior to entering this loop.
+			 */
+			if (--wbc->nr_to_write <= 0 &&
+			    wbc->sync_mode == WB_SYNC_NONE) {
+				done = 1;
+				break;
 			}
 		}
 		pagevec_release(&pvec);