Commit a50aeb40, authored by Wu Fengguang, committed by Linus Torvalds

writeback: merge for_kupdate and !for_kupdate cases

Unify the logic for kupdate and non-kupdate cases.  There won't be
starvation because the inodes requeued into b_more_io will later be
spliced _after_ the remaining inodes in b_io, hence won't stand in the way
of other inodes in the next run.
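
The ordering argument above can be sketched with a toy model. This is only
an illustration under stated assumptions: struct toy_queue and the toy_*
helpers are hypothetical names, the queue is a plain array processed from
index 0, and none of it is the kernel's list code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INODES 16

/* Toy queue of inode numbers, processed from index 0 upward. */
struct toy_queue {
	int ino[MAX_INODES];
	size_t len;
};

/* Append one inode number at the end of the queue. */
static void toy_push(struct toy_queue *q, int ino)
{
	assert(q->len < MAX_INODES);
	q->ino[q->len++] = ino;
}

/* Remove and return the next inode to process (queue must be non-empty). */
static int toy_pop(struct toy_queue *q)
{
	assert(q->len > 0);
	int ino = q->ino[0];
	for (size_t i = 1; i < q->len; i++)
		q->ino[i - 1] = q->ino[i];
	q->len--;
	return ino;
}

/* Splice everything in more_io after what is still waiting in io. */
static void toy_splice_after(struct toy_queue *more_io, struct toy_queue *io)
{
	for (size_t i = 0; i < more_io->len; i++)
		toy_push(io, more_io->ino[i]);
	more_io->len = 0;
}
```

In this model, if inode 1 uses up its slice and is pushed onto more_io
while inodes 2 and 3 are still waiting in io, the splice leaves the order
2, 3, 1, so the requeued inode does not get ahead of the others.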

This avoids unnecessary redirty_tail() calls and hence needless updates of
i_dirtied_when.  The timestamp update is undesirable because it could
later delay the inode's periodic writeback, or may exclude the inode from
the data integrity sync operation (which checks timestamp to avoid extra
work and livelock).
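
A minimal sketch of why the timestamp matters, assuming a simplified
model: struct toy_inode and the toy_* functions are hypothetical
stand-ins, the stand-in for redirty_tail() always restamps the inode
(a simplification), and counter wraparound is ignored.

```c
#include <stdbool.h>

/* Toy inode: only the field that matters for the expiry check. */
struct toy_inode {
	unsigned long dirtied_when;	/* tick at which the inode was dirtied */
};

/* Hypothetical stand-in for requeue_io(): the timestamp is left alone. */
static void toy_requeue_io(struct toy_inode *inode, unsigned long now)
{
	(void)inode;
	(void)now;
}

/* Hypothetical stand-in for redirty_tail(); simplified to always restamp. */
static void toy_redirty_tail(struct toy_inode *inode, unsigned long now)
{
	inode->dirtied_when = now;
}

/* Periodic writeback only picks inodes dirtied at or before the cutoff. */
static bool toy_expired(const struct toy_inode *inode, unsigned long cutoff)
{
	return inode->dirtied_when <= cutoff;
}
```

With an inode dirtied at tick 100 and a cutoff of 200, the requeued inode
still counts as expired, while the restamped one (now 500) does not and
would have to wait for a later periodic pass.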

===
How the redirty_tail() came about:

It is a long story.  This redirty_tail() was introduced with
wbc.more_io.  The initial patch for more_io did not have the
redirty_tail(), and when it was merged, several 100% iowait bug reports
arose:

reiserfs:
        http://lkml.org/lkml/2007/10/23/93

jfs:
        commit 29a424f2
        JFS: clear PAGECACHE_TAG_DIRTY for no-write pages

ext2:
        http://www.spinics.net/linux/lists/linux-ext4/msg04762.html

They were all old bugs hidden in various filesystems that became "visible"
with the more_io patch.  At the time, the ext2 bug was thought to be
"trivial", so it was not fixed.  Instead the following updated more_io
patch with redirty_tail() was merged:

	http://www.spinics.net/linux/lists/linux-ext4/msg04507.html

This will in general prevent 100% iowait on ext2 and possibly with other
unknown FS bugs.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Michael Rubin <mrubin@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 4ea879b9
@@ -374,45 +374,22 @@ writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
 		if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
 			/*
 			 * We didn't write back all the pages. nfs_writepages()
-			 * sometimes bales out without doing anything. Redirty
-			 * the inode; Move it from b_io onto b_more_io/b_dirty.
+			 * sometimes bales out without doing anything.
 			 */
-			/*
-			 * akpm: if the caller was the kupdate function we put
-			 * this inode at the head of b_dirty so it gets first
-			 * consideration. Otherwise, move it to the tail, for
-			 * the reasons described there. I'm not really sure
-			 * how much sense this makes. Presumably I had a good
-			 * reasons for doing it this way, and I'd rather not
-			 * muck with it at present.
-			 */
-			if (wbc->for_kupdate) {
+			inode->i_state |= I_DIRTY_PAGES;
+			if (wbc->nr_to_write <= 0) {
 				/*
-				 * For the kupdate function we move the inode
-				 * to b_more_io so it will get more writeout as
-				 * soon as the queue becomes uncongested.
+				 * slice used up: queue for next turn
 				 */
-				inode->i_state |= I_DIRTY_PAGES;
-				if (wbc->nr_to_write <= 0) {
-					/*
-					 * slice used up: queue for next turn
-					 */
-					requeue_io(inode);
-				} else {
-					/*
-					 * somehow blocked: retry later
-					 */
-					redirty_tail(inode);
-				}
+				requeue_io(inode);
 			} else {
 				/*
-				 * Otherwise fully redirty the inode so that
-				 * other inodes on this superblock will get some
-				 * writeout. Otherwise heavy writing to one
-				 * file would indefinitely suspend writeout of
-				 * all the other files.
+				 * Writeback blocked by something other than
+				 * congestion. Delay the inode for some time to
+				 * avoid spinning on the CPU (100% iowait)
+				 * retrying writeback of the dirty page/inode
+				 * that cannot be performed immediately.
 				 */
-				inode->i_state |= I_DIRTY_PAGES;
 				redirty_tail(inode);
 			}
 		} else if (inode->i_state & I_DIRTY) {
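
The behavioural change in this hunk can be summarised with a small
sketch.  It only models the branch taken when pages are still tagged
dirty; enum toy_action and both toy_*_policy() functions are hypothetical
names introduced for illustration, not kernel code.

```c
#include <stdbool.h>

enum toy_action { TOY_REQUEUE_IO, TOY_REDIRTY_TAIL };

/* Pre-patch branch structure, as read from the removed lines. */
static enum toy_action toy_old_policy(bool for_kupdate, long nr_to_write)
{
	if (for_kupdate && nr_to_write <= 0)
		return TOY_REQUEUE_IO;	/* slice used up */
	return TOY_REDIRTY_TAIL;	/* blocked, or not a kupdate call */
}

/* Post-patch branch structure: for_kupdate no longer matters. */
static enum toy_action toy_new_policy(bool for_kupdate, long nr_to_write)
{
	(void)for_kupdate;
	if (nr_to_write <= 0)
		return TOY_REQUEUE_IO;	/* slice used up */
	return TOY_REDIRTY_TAIL;	/* blocked by something else */
}
```

Under this reading the two policies differ in exactly one case: a
non-kupdate caller whose slice is used up (nr_to_write <= 0) used to get
redirty_tail() and now gets requeue_io().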