Commit ebdf4de5 authored by Jan Kara, committed by Linus Torvalds

mm: migrate: fix reference check race between __find_get_block() and migration

buffer_migrate_page_norefs() can race with bh users in the following
way:

CPU1                                    CPU2
buffer_migrate_page_norefs()
  buffer_migrate_lock_buffers()
  checks bh refs
  spin_unlock(&mapping->private_lock)
                                        __find_get_block()
                                          spin_lock(&mapping->private_lock)
                                          grab bh ref
                                          spin_unlock(&mapping->private_lock)
  move page                               do bh work

This can result in various issues like lost updates to buffers (i.e.,
metadata corruption) or use-after-free issues for the old page.

This patch closes the race by holding mapping->private_lock while the
mapping is being moved to a new page.  Ordinarily, a reference can be
taken outside of the private_lock using the per-cpu BH LRU but the
references are checked and the LRU invalidated if necessary.  The
private_lock is held once the references are known so the buffer lookup
slow path will spin on the private_lock.  Between the page lock and
private_lock, it should be impossible for other references to be
acquired and updates to happen during the migration.

A user had reported data corruption issues on a distribution kernel with
a similar page migration implementation as mainline.  The data
corruption could not be reproduced with this patch applied.  A small
number of migration-intensive tests were run and no performance problems
were noted.

[mgorman@techsingularity.net: Changelog, removed tracing]
Link: http://lkml.kernel.org/r/20190718090238.GF24383@techsingularity.net
Fixes: 89cb0888 ("mm: migrate: provide buffer_migrate_page_norefs()")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org>	[5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent fa1e512f
@@ -767,12 +767,12 @@ static int __buffer_migrate_page(struct address_space *mapping,
 			}
 			bh = bh->b_this_page;
 		} while (bh != head);
-		spin_unlock(&mapping->private_lock);
 		if (busy) {
 			if (invalidated) {
 				rc = -EAGAIN;
 				goto unlock_buffers;
 			}
+			spin_unlock(&mapping->private_lock);
 			invalidate_bh_lrus();
 			invalidated = true;
 			goto recheck_buffers;
@@ -805,6 +805,8 @@ static int __buffer_migrate_page(struct address_space *mapping,
 	rc = MIGRATEPAGE_SUCCESS;
 unlock_buffers:
+	if (check_refs)
+		spin_unlock(&mapping->private_lock);
 	bh = head;
 	do {
 		unlock_buffer(bh);
...