Author: Andrew Morton
From: Rik van Riel <riel@surriel.com>

In 2.6.0 both __alloc_pages() and the corresponding wakeup_kswapd()s walk all zones in the zonelist, possibly spanning multiple nodes in a low-NUMA-factor system like AMD64. Also, if lower_zone_protection is set in /proc, it is possible that kswapd never cleans out data in zones further down the zonelist, and try_to_free_pages() needs to do that instead.

However, in 2.6.0 try_to_free_pages() only frees pages in the pgdat that the first zone in the zonelist belongs to. This is probably the wrong behaviour, since both the page allocator and the kswapd wakeup free things from all zones on the zonelist.

The following patch makes try_to_free_pages() consistent with the allocator, by passing the zonelist as an argument and freeing pages from all zones in the list. I do not have any NUMA systems myself, so I have only tested it on my own little SMP box. Testing on NUMA systems may be useful, though the patch really should only have an impact in those rare cases where kswapd can't keep up with allocations...

As a side effect, the patch shrinks the kernel by 2 lines and replaces some subtle magic with a simpler array walk.
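For illustration, here is a minimal userspace sketch of the "simpler array walk" described above: reclaim is attempted against every zone in a NULL-terminated zonelist rather than only the pgdat of the first zone. Everything here is a hypothetical stand-in, not the kernel's actual code; the struct zone fields, the free_from_all_zones() helper, and the reclaim counts are all invented for the example.

    #include <stdio.h>

    /* Toy stand-in for the kernel's struct zone; fields are invented. */
    struct zone {
        const char *name;
        int reclaimable;    /* pages this toy zone can give back */
    };

    /*
     * Walk the NULL-terminated zone array, mirroring the patch's
     * approach: try to free pages from every zone on the zonelist,
     * not just from the pgdat of the first zone.
     */
    static int free_from_all_zones(struct zone **zones)
    {
        int freed = 0;
        int i;

        for (i = 0; zones[i] != NULL; i++) {
            printf("reclaimed %d pages from %s\n",
                   zones[i]->reclaimable, zones[i]->name);
            freed += zones[i]->reclaimable;
        }
        return freed;
    }

    int main(void)
    {
        struct zone highmem = { "HighMem", 12 };
        struct zone normal  = { "Normal",  30 };
        struct zone dma     = { "DMA",      4 };
        /* A zonelist spanning several zones, as on a NUMA box. */
        struct zone *zonelist[] = { &highmem, &normal, &dma, NULL };

        printf("total reclaimed: %d\n", free_from_all_zones(zonelist));
        return 0;
    }

The point of the array walk is that the loop terminates on the NULL sentinel rather than deriving a node from the first zone, so zones on other nodes further down the list get reclaimed too.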