    mm: have order > 0 compaction start near a pageblock with free pages · de74f1cc
    Mel Gorman authored
    Commit 7db8889a ("mm: have order > 0 compaction start off where it
    left") introduced a caching mechanism to reduce the amount of work the
    free page scanner does in compaction.  However, it has a problem.
    Consider two processes simultaneously scanning free pages
    
    					    			C
    	Process A		M     S     			F
    			|---------------------------------------|
    	Process B		M 	FS
    
    	C is zone->compact_cached_free_pfn
    	S is cc->start_free_pfn
    	M is cc->migrate_pfn
    	F is cc->free_pfn
    
    In this diagram, Process A has just reached its migrate scanner, wrapped
    around and updated compact_cached_free_pfn accordingly.
    
    Simultaneously, Process B finishes isolating in a block and updates
    compact_cached_free_pfn again to the location of its free scanner.
    
    Process A moves to "end_of_zone - one_pageblock" and runs this check
    
                    if (cc->order > 0 && (!cc->wrapped ||
                                          zone->compact_cached_free_pfn >
                                          cc->start_free_pfn))
                            pfn = min(pfn, zone->compact_cached_free_pfn);
    
    compact_cached_free_pfn is above where it started so the free scanner
    skips almost the entire space it should have scanned.  When there are
    multiple processes compacting it can end in a situation where the entire
    zone is not being scanned at all.  Further, it is possible for two
    processes to ping-pong updates to compact_cached_free_pfn, leaving it
    at an effectively random value.
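
    As an illustration only, the effect of the quoted check can be modelled
    in userspace.  This is a sketch under assumptions, not kernel code: the
    structure names, min_ul(), old_resume_pfn() and all pfn values below are
    hypothetical, and only the condition itself is taken from the check
    quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct zone_sketch {
	unsigned long compact_cached_free_pfn;	/* C: shared by all compactors */
};

struct compact_control_sketch {
	int order;
	bool wrapped;
	unsigned long start_free_pfn;		/* S: where this scanner began */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Mirrors the quoted check: given the pfn the free scanner wants to
 * resume from, return the pfn it actually resumes from.
 */
static unsigned long old_resume_pfn(const struct zone_sketch *zone,
				    const struct compact_control_sketch *cc,
				    unsigned long pfn)
{
	if (cc->order > 0 && (!cc->wrapped ||
			      zone->compact_cached_free_pfn >
			      cc->start_free_pfn))
		pfn = min_ul(pfn, zone->compact_cached_free_pfn);
	return pfn;
}

/* Made-up numbers: Process A has wrapped and started at pfn 4096. */
void old_check_example(void)
{
	struct compact_control_sketch a = {
		.order = 3, .wrapped = true, .start_free_pfn = 4096,
	};
	/* Process B has just moved C to slightly above A's start. */
	struct zone_sketch zone = { .compact_cached_free_pfn = 4608 };
	unsigned long end_minus_block = 1048064;

	/* A is clamped to C, so it only scans 4608 down to 4096. */
	assert(old_resume_pfn(&zone, &a, end_minus_block) == 4608);
}
```

    In this model Process A intended to scan down from pfn 1048064 but
    resumes at pfn 4608, so nearly the whole range above its starting point
    is never visited.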
    
    Overall, the end result wrecks allocation success rates.
    
    There is not an obvious way around this problem without introducing new
    locking and state so this patch takes a different approach.
    
    First, it gets rid of the skip logic.  It is not clear that it matters
    if two free scanners happen to be in the same block, but with racing
    updates it is too easy for a scanner to skip over blocks it should not.
    
    Second, it updates compact_cached_free_pfn in a more limited set of
    circumstances.
    
    If a scanner has wrapped, it updates compact_cached_free_pfn to the end
    	of the zone. When a wrapped scanner isolates a page, it updates
    	compact_cached_free_pfn to point to the highest pageblock it
    	can isolate pages from.
    
    If a scanner has not wrapped when it has finished isolating pages it
    	checks if compact_cached_free_pfn is pointing to the end of the
    	zone. If so, the value is updated to point to the highest
    	pageblock that pages were isolated from. This value will not
    	be updated again until a free page scanner wraps and resets
    	compact_cached_free_pfn.
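
    The two update rules above can be sketched as a small userspace model.
    This is not the patch itself: struct zone_model, struct scan_model,
    end_free_pfn, on_wrap() and on_isolate_done() are hypothetical names,
    and requiring at least one isolated page in the non-wrapped case is an
    assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the shared zone state. */
struct zone_model {
	unsigned long end_free_pfn;		/* reset value: end of the zone */
	unsigned long compact_cached_free_pfn;
};

/* Hypothetical model of one compacting process. */
struct scan_model {
	int order;
	bool wrapped;
};

/* A free scanner wraps: reset the cached pfn to the end of the zone. */
static void on_wrap(struct zone_model *zone, struct scan_model *cc)
{
	cc->wrapped = true;
	zone->compact_cached_free_pfn = zone->end_free_pfn;
}

/*
 * A free scanner finishes an isolation pass.  high_pfn is the highest
 * pageblock it isolated pages from, isolated is the number of pages.
 */
static void on_isolate_done(struct zone_model *zone,
			    const struct scan_model *cc,
			    unsigned long isolated, unsigned long high_pfn)
{
	if (cc->order <= 0 || !isolated)
		return;

	if (cc->wrapped) {
		/* Wrapped scanners always track the highest usable block. */
		zone->compact_cached_free_pfn = high_pfn;
	} else if (zone->compact_cached_free_pfn == zone->end_free_pfn) {
		/* Not wrapped: only fill in a value that is still reset. */
		zone->compact_cached_free_pfn = high_pfn;
	}
}

void update_rules_example(void)
{
	struct zone_model zone = {
		.end_free_pfn = 1048064, .compact_cached_free_pfn = 1048064,
	};
	struct scan_model a = { .order = 3, .wrapped = false };
	struct scan_model b = { .order = 3, .wrapped = false };

	on_isolate_done(&zone, &a, 32, 900000);	/* first update wins */
	assert(zone.compact_cached_free_pfn == 900000);

	on_isolate_done(&zone, &b, 32, 500000);	/* ignored until a wrap */
	assert(zone.compact_cached_free_pfn == 900000);

	on_wrap(&zone, &b);			/* reset to end of zone */
	on_isolate_done(&zone, &b, 16, 800000);	/* wrapped scanner updates */
	assert(zone.compact_cached_free_pfn == 800000);
}
```

    In this model a non-wrapped scanner cannot drag the cached value around
    once it has been set, which removes the ping-pong between processes at
    the cost of a value that may be slightly stale.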
    
    This is not optimal and it can still race but the compact_cached_free_pfn
    will be pointing to or very near a pageblock with free pages.
    
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Reviewed-by: Rik van Riel <riel@redhat.com>
    Reviewed-by: Minchan Kim <minchan@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
compaction.c 25.5 KB