Commit 284f69fb authored by David Rientjes's avatar David Rientjes Committed by Sasha Levin

mm, compaction: abort free scanner if split fails

[ Upstream commit a4f04f2c ]

If the memory compaction free scanner cannot successfully split a free
page (only possible due to per-zone low watermark), terminate the free
scanner rather than continuing to scan memory needlessly.  If the
watermark is insufficient for a free page of order <= cc->order, then
terminate the scanner since all future splits will also likely fail.

This prevents the compaction freeing scanner from scanning all memory on
very large zones (very noticeable for zones > 128GB, for instance) when
all splits will likely fail while holding zone->lock.

compaction_alloc() iterating a 128GB zone has been benchmarked to take
over 400ms on some systems whereas any free page isolated and ready to
be split ends up failing in split_free_page() because of the low
watermark check and thus the iteration continues.

The next time compaction occurs, the freeing scanner will likely start
at the end of the zone again since no success was made previously and we
get the same lengthy iteration until the zone is brought above the low
watermark.  All thp page faults can take >400ms in such a state without
this fix.

Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1606211820350.97086@chino.kir.corp.google.comSigned-off-by: default avatarDavid Rientjes <rientjes@google.com>
Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
parent 68385427
......@@ -480,25 +480,23 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
/* Found a free page, break it into order-0 pages */
isolated = split_free_page(page);
if (!isolated)
break;
total_isolated += isolated;
cc->nr_freepages += isolated;
for (i = 0; i < isolated; i++) {
list_add(&page->lru, freelist);
page++;
}
/* If a page was split, advance to the end of it */
if (isolated) {
cc->nr_freepages += isolated;
if (!strict &&
cc->nr_migratepages <= cc->nr_freepages) {
if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
blockpfn += isolated;
break;
}
/* Advance to the end of split page */
blockpfn += isolated - 1;
cursor += isolated - 1;
continue;
}
isolate_fail:
if (strict)
......@@ -508,6 +506,9 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
}
if (locked)
spin_unlock_irqrestore(&cc->zone->lock, flags);
/*
* There is a tiny chance that we have read bogus compound_order(),
* so be careful to not go outside of the pageblock.
......@@ -529,9 +530,6 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
if (strict && blockpfn < end_pfn)
total_isolated = 0;
if (locked)
spin_unlock_irqrestore(&cc->zone->lock, flags);
/* Update the pageblock-skip if the whole pageblock was scanned */
if (blockpfn == end_pfn)
update_pageblock_skip(cc, valid_page, total_isolated, false);
......@@ -955,6 +953,7 @@ static void isolate_freepages(struct compact_control *cc)
block_end_pfn = block_start_pfn,
block_start_pfn -= pageblock_nr_pages,
isolate_start_pfn = block_start_pfn) {
unsigned long isolated;
/*
* This can iterate a massively long zone without finding any
......@@ -979,8 +978,12 @@ static void isolate_freepages(struct compact_control *cc)
continue;
/* Found a block suitable for isolating free pages from. */
isolate_freepages_block(cc, &isolate_start_pfn,
isolated = isolate_freepages_block(cc, &isolate_start_pfn,
block_end_pfn, freelist, false);
/* If isolation failed early, do not continue needlessly */
if (!isolated && isolate_start_pfn < block_end_pfn &&
cc->nr_migratepages > cc->nr_freepages)
break;
/*
* Remember where the free scanner should restart next time,
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment