Commit f14c889d authored by Michal Hocko's avatar Michal Hocko Committed by Jiri Slaby

mm: exclude memoryless nodes from zone_reclaim

commit 70ef57e6 upstream.

We had a report about strange OOM killer strikes on a PPC machine
although there was a lot of swap free and a tons of anonymous memory
which could be swapped out.  In the end it turned out that the OOM was a
side effect of zone reclaim which wasn't unmapping and swapping out and
so the system was pushed to the OOM.  Although this sounds like a bug
somewhere in the kswapd vs.  zone reclaim vs.  direct reclaim
interaction numactl on the said hardware suggests that the zone reclaim
should not have been set in the first place:

  node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  node 0 size: 0 MB
  node 0 free: 0 MB
  node 2 cpus:
  node 2 size: 7168 MB
  node 2 free: 6019 MB
  node distances:
  node   0   2
  0:  10  40
  2:  40  10

So all the CPUs are associated with Node0 which doesn't have any memory
while Node2 contains all the available memory.  Node distances cause an
automatic zone_reclaim_mode enabling.

Zone reclaim is intended to keep the allocations local but this doesn't
make any sense on the memoryless nodes.  So let's exclude such nodes for
init_zone_allows_reclaim which evaluates zone reclaim behavior and
suitable reclaim_nodes.
Signed-off-by: default avatarMichal Hocko <mhocko@suse.cz>
Acked-by: default avatarDavid Rientjes <rientjes@google.com>
Acked-by: default avatarNishanth Aravamudan <nacc@linux.vnet.ibm.com>
Tested-by: default avatarNishanth Aravamudan <nacc@linux.vnet.ibm.com>
Acked-by: default avatarMel Gorman <mgorman@suse.de>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
parent a32d8674
......@@ -1850,7 +1850,7 @@ static void __paginginit init_zone_allows_reclaim(int nid)
{
int i;
for_each_online_node(i)
for_each_node_state(i, N_MEMORY)
if (node_distance(nid, i) <= RECLAIM_DISTANCE)
node_set(i, NODE_DATA(nid)->reclaim_nodes);
else
......@@ -4903,7 +4903,8 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
pgdat->node_id = nid;
pgdat->node_start_pfn = node_start_pfn;
init_zone_allows_reclaim(nid);
if (node_state(nid, N_MEMORY))
init_zone_allows_reclaim(nid);
#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
get_pfn_range_for_nid(nid, &start_pfn, &end_pfn);
#endif
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment