Commit fff4068c authored by Johannes Weiner, committed by Linus Torvalds

mm: page_alloc: revert NUMA aspect of fair allocation policy

Commit 81c0a2bb ("mm: page_alloc: fair zone allocator policy") meant
to bring aging fairness among zones in the system, but it was overzealous
and badly regressed basic workloads on NUMA systems.

Due to the way kswapd and the page allocator interact, we still want to
make sure that all zones in any given node are used equally for all
allocations to maximize memory utilization and prevent thrashing on the
highest zone in the node.

While the same principle applies to NUMA nodes - memory utilization is
obviously improved by spreading allocations throughout all nodes -
remote references can be costly and so many workloads prefer locality
over memory utilization.  The original change assumed that
zone_reclaim_mode would be a good enough predictor for that, but it
turned out to be as indicative as a coin flip.

Revert the NUMA aspect of the fairness until we can find a proper way to
make it configurable and agree on a sane default.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@kernel.org> # 3.12
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 8798cee2
@@ -1816,7 +1816,7 @@ static void zlc_clear_zones_full(struct zonelist *zonelist)
 static bool zone_local(struct zone *local_zone, struct zone *zone)
 {
-	return node_distance(local_zone->node, zone->node) == LOCAL_DISTANCE;
+	return local_zone->node == zone->node;
 }
 static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
@@ -1913,18 +1913,17 @@ get_page_from_freelist(gfp_t gfp_mask, nodemask_t *nodemask, unsigned int order,
 	 * page was allocated in should have no effect on the
 	 * time the page has in memory before being reclaimed.
 	 *
-	 * When zone_reclaim_mode is enabled, try to stay in
-	 * local zones in the fastpath. If that fails, the
-	 * slowpath is entered, which will do another pass
-	 * starting with the local zones, but ultimately fall
-	 * back to remote zones that do not partake in the
-	 * fairness round-robin cycle of this zonelist.
+	 * Try to stay in local zones in the fastpath. If
+	 * that fails, the slowpath is entered, which will do
+	 * another pass starting with the local zones, but
+	 * ultimately fall back to remote zones that do not
+	 * partake in the fairness round-robin cycle of this
+	 * zonelist.
 	 */
 	if (alloc_flags & ALLOC_WMARK_LOW) {
 		if (zone_page_state(zone, NR_ALLOC_BATCH) <= 0)
 			continue;
-		if (zone_reclaim_mode &&
-		    !zone_local(preferred_zone, zone))
+		if (!zone_local(preferred_zone, zone))
 			continue;
 	}
 	/*
@@ -2390,7 +2389,7 @@ static void prepare_slowpath(gfp_t gfp_mask, unsigned int order,
 	 * thrash fairness information for zones that are not
 	 * actually part of this zonelist's round-robin cycle.
 	 */
-	if (zone_reclaim_mode && !zone_local(preferred_zone, zone))
+	if (!zone_local(preferred_zone, zone))
 		continue;
 	mod_zone_page_state(zone, NR_ALLOC_BATCH,
 			    high_wmark_pages(zone) -
......
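
The behavioural change in the fastpath can be illustrated with a small userspace sketch. This is not kernel code: `struct toy_zone`, `toy_zone_local()` and `toy_pick_zone()` are hypothetical names invented for this illustration, the `alloc_batch` field merely stands in for the NR_ALLOC_BATCH counter, and the sketch assumes the ALLOC_WMARK_LOW fairness pass is the only filter in play. The `pre_patch` flag switches between the old check (gated on `zone_reclaim_mode`) and the new unconditional one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct zone; only the fields this sketch needs. */
struct toy_zone {
	int node;		/* NUMA node the zone belongs to */
	long alloc_batch;	/* stand-in for the NR_ALLOC_BATCH counter */
};

/* Mirrors the post-patch zone_local(): same node means local. */
static bool toy_zone_local(const struct toy_zone *local_zone,
			   const struct toy_zone *zone)
{
	return local_zone->node == zone->node;
}

/*
 * Walk the zonelist in order and return the index of the first zone the
 * fairness pass would accept, or -1 if none qualifies (the point at which
 * the real allocator would move on to the slowpath).
 *
 * pre_patch == true:  remote zones are skipped only if zone_reclaim_mode
 *                     is non-zero (behaviour before this commit).
 * pre_patch == false: remote zones are always skipped (after this commit).
 */
static int toy_pick_zone(const struct toy_zone *zonelist, size_t nr,
			 const struct toy_zone *preferred,
			 bool pre_patch, int zone_reclaim_mode)
{
	assert(zonelist != NULL && preferred != NULL);

	for (size_t i = 0; i < nr; i++) {
		const struct toy_zone *zone = &zonelist[i];

		/* Zone has used up its fair share for this round. */
		if (zone->alloc_batch <= 0)
			continue;

		if (pre_patch) {
			if (zone_reclaim_mode &&
			    !toy_zone_local(preferred, zone))
				continue;
		} else if (!toy_zone_local(preferred, zone)) {
			continue;
		}
		return (int)i;
	}
	return -1;
}
```

With a two-entry zonelist whose local zone has an exhausted batch and whose remote zone still has batch left, the pre-patch logic with `zone_reclaim_mode == 0` picks the remote zone, while the post-patch logic returns -1 and leaves the remote fallback to the slowpath.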