    VM: Fix nasty and subtle race in shared mmap'ed page writeback · 7658cc28
    Linus Torvalds authored
    The VM layer (on the face of it, fairly reasonably) expected that when
    it does a ->writepage() call to the filesystem, the filesystem would
    write out the full page at that point in time, especially since the VM
    had earlier marked the whole page dirty with "set_page_dirty()".
    
    But that isn't the case: ->writepage() does not write a whole page, it
    writes the parts of the page that have been explicitly marked dirty
    before, *and* that have not been written out for other reasons since
    the last time we told it they were dirty.
    
    That last caveat is the important one.
    
    _Most_ of the time that ends up being the whole page (since we had
    called "set_page_dirty()" on the page earlier), but if the filesystem
    had done any dirty flushing of its own (for example, to honor some
    internal write ordering guarantees), it might end up doing only a
    partial page IO (or none at all) when ->writepage() is actually called.
    
    That is the correct thing in general (since we actually often _want_
    only the known-dirty parts of the page to be written out), but the
    shared dirty page handling had implicitly forgotten about these details,
    and had a number of cases where it was doing just the "->writepage()"
    part, without telling the low-level filesystem that the whole page might
    have been re-dirtied as part of being mapped writably into user space.
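
To make the failure mode concrete, here is a minimal userspace sketch of it. This is a toy model, not kernel code: every name in it (toy_page, toy_set_page_dirty, toy_fs_flush_block, toy_writepage, toy_mmap_store, toy_scenario) is hypothetical, and the 4-block page with 4-byte blocks is an assumption chosen only to keep the example small. It models a filesystem that tracks dirtiness per block, flushes one block on its own, and later gets a writepage-style call after the mapping has been written to again.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_BLOCKS     4   /* blocks per page (assumed for illustration) */
#define TOY_BLOCK_SIZE 4   /* bytes per block, deliberately tiny */

/* Hypothetical stand-in for a page with per-block dirty state. */
struct toy_page {
    char mem[TOY_BLOCKS * TOY_BLOCK_SIZE];   /* page contents in memory */
    char disk[TOY_BLOCKS * TOY_BLOCK_SIZE];  /* what has reached "disk" */
    bool block_dirty[TOY_BLOCKS];            /* FS view of dirty blocks */
};

/* In the spirit of set_page_dirty(): tell the FS every block may be dirty. */
static void toy_set_page_dirty(struct toy_page *p)
{
    for (int i = 0; i < TOY_BLOCKS; i++)
        p->block_dirty[i] = true;
}

/* The FS writes one block by itself (e.g. for write ordering) and cleans it. */
static void toy_fs_flush_block(struct toy_page *p, int i)
{
    assert(i >= 0 && i < TOY_BLOCKS);
    memcpy(p->disk + i * TOY_BLOCK_SIZE, p->mem + i * TOY_BLOCK_SIZE,
           TOY_BLOCK_SIZE);
    p->block_dirty[i] = false;
}

/* In the spirit of ->writepage(): write only blocks still marked dirty. */
static int toy_writepage(struct toy_page *p)
{
    int written = 0;
    for (int i = 0; i < TOY_BLOCKS; i++) {
        if (p->block_dirty[i]) {
            toy_fs_flush_block(p, i);
            written++;
        }
    }
    return written;
}

/* A store through a writable shared mapping: memory changes, but the
 * per-block dirty state is not updated. */
static void toy_mmap_store(struct toy_page *p, int off, char c)
{
    p->mem[off] = c;
}

/* Returns true if "disk" matches memory after the writepage-style call. */
static bool toy_scenario(bool redirty_before_writepage)
{
    struct toy_page p;
    memset(&p, 0, sizeof(p));

    toy_mmap_store(&p, 0, 'A');
    toy_set_page_dirty(&p);       /* VM marks the whole page dirty */
    toy_fs_flush_block(&p, 0);    /* FS flushes block 0 on its own */
    toy_mmap_store(&p, 0, 'B');   /* user space re-dirties block 0 */
    if (redirty_before_writepage)
        toy_set_page_dirty(&p);   /* tell the FS about the re-dirtying */
    toy_writepage(&p);

    return memcmp(p.mem, p.disk, sizeof(p.mem)) == 0;
}
```

In this model, toy_scenario(false) leaves the stale 'A' on "disk" because block 0 was already considered clean, while toy_scenario(true) writes the full page. The real locking and page-table handling are far more involved than this sketch suggests.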
    
    Since most of the time the FS did actually write out the full page, we
    didn't notice this for a loong time, and this needed some really odd
    patterns to trigger.  But it caused occasional corruption with rtorrent
    and with the Debian "apt" database, because both use shared mmaps to
    update the end result.
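
For readers unfamiliar with the pattern, the following sketch shows the general kind of shared-mmap file update meant here. It is not taken from rtorrent or apt; the function name shared_mmap_roundtrip and the 256-byte limit are assumptions made for illustration, and it uses only standard POSIX calls (open, ftruncate, mmap, msync, munmap, pread).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write msg into a file through a MAP_SHARED mapping,
 * then read it back with pread(). Returns 0 if the bytes match, -1 on any
 * error or mismatch. The file is removed before returning. */
static int shared_mmap_roundtrip(const char *path, const char *msg)
{
    char buf[256];
    char *map;
    size_t len;
    int fd;
    int rc = -1;

    assert(path != NULL && msg != NULL);
    len = strlen(msg);
    if (len == 0 || len > sizeof(buf))
        return -1;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* The file must be large enough to back the mapping. */
    if (ftruncate(fd, (off_t)len) != 0)
        goto out_close;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    /* Plain stores into the mapping: this is what dirties the page. */
    memcpy(map, msg, len);

    /* Ask the kernel to write the dirty range back to the file. */
    if (msync(map, len, MS_SYNC) != 0) {
        munmap(map, len);
        goto out_close;
    }
    if (munmap(map, len) != 0)
        goto out_close;

    if (pread(fd, buf, len, 0) == (ssize_t)len && memcmp(buf, msg, len) == 0)
        rc = 0;

out_close:
    close(fd);
    unlink(path);
    return rc;
}
```

Note that the read-back here is normally served from the page cache, so this only demonstrates the access pattern. It would not by itself detect the on-disk corruption described above, which showed up only after the cached pages had been dropped or the data was read back from storage.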
    
    This fixes it. Finally. After way too much hair-pulling.
    Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
    Acked-by: Martin J. Bligh <mbligh@google.com>
    Acked-by: Martin Michlmayr <tbm@cyrius.com>
    Acked-by: Martin Johansson <martin@fatbob.nu>
    Acked-by: Ingo Molnar <mingo@elte.hu>
    Acked-by: Andrei Popa <andrei.popa@i-neo.ro>
    Cc: Hugh Dickins <hugh@veritas.com>
    Cc: Andrew Morton <akpm@osdl.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Segher Boessenkool <segher@kernel.crashing.org>
    Cc: David Miller <davem@davemloft.net>
    Cc: Arjan van de Ven <arjan@infradead.org>
    Cc: Gordon Farquharson <gordonfarquharson@gmail.com>
    Cc: Guillaume Chazarain <guichaz@yahoo.fr>
    Cc: Theodore Tso <tytso@mit.edu>
    Cc: Kenneth Chen <kenneth.w.chen@intel.com>
    Cc: Tobias Diedrich <ranma@tdiedrich.de>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
page-writeback.c 27.1 KB