    Multi-gen LRU: fix per-zone reclaim · 669281ee
    Kalesh Singh authored
    MGLRU has an LRU list for each zone, for each type (anon/file), in each
    generation:
    
    	long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
    
    The min_seq (oldest generation) can progress independently for each
    type but the max_seq (youngest generation) is shared for both anon and
    file. This is to maintain a common frame of reference.
    
    In order for eviction to advance the min_seq of a type, all the per-zone
    lists in the oldest generation of that type must be empty.
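
The rule above can be sketched with a toy model. This is a simplified
illustration, not the kernel's code: struct toy_lrugen, oldest_gen_empty()
and try_inc_min_seq_toy() are hypothetical names, the zone list is reduced
to four entries, and the kernel's minimum-generation check is replaced by
the assumption that min_seq must stay below max_seq.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NR_GENS   4
#define ANON_AND_FILE 2
#define MAX_NR_ZONES  4	/* simplified: DMA32, Normal, Movable, Device */

/* Hypothetical toy model; the real bookkeeping lives in mm/vmscan.c. */
struct toy_lrugen {
	long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
	unsigned long min_seq[ANON_AND_FILE];	/* oldest generation, per type */
	unsigned long max_seq;			/* youngest generation, shared */
};

/* True if every per-zone list in the oldest generation of @type is empty. */
static bool oldest_gen_empty(const struct toy_lrugen *l, int type)
{
	int gen, zone;

	assert(type >= 0 && type < ANON_AND_FILE);
	gen = (int)(l->min_seq[type] % MAX_NR_GENS);

	for (zone = 0; zone < MAX_NR_ZONES; zone++)
		if (l->nr_pages[gen][type][zone] > 0)
			return false;
	return true;
}

/* Advance min_seq of @type only when its oldest generation is fully empty. */
static bool try_inc_min_seq_toy(struct toy_lrugen *l, int type)
{
	/* Assumption: never let min_seq catch up with max_seq. */
	if (l->min_seq[type] >= l->max_seq)
		return false;
	if (!oldest_gen_empty(l, type))
		return false;
	l->min_seq[type]++;
	return true;
}
```

In this model a single non-empty zone list, even in a zone the current
reclaim request cannot touch, is enough to hold back min_seq for that type.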
    
    The eviction logic only considers pages from eligible zones for
    eviction or promotion.
    
        scan_folios() {
    	...
    	for (zone = sc->reclaim_idx; zone >= 0; zone--)  {
    	    ...
    	    sort_folio(); 	// Promote
    	    ...
    	    isolate_folio(); 	// Evict
    	}
    	...
        }
    
    Consider a system that has the movable zone configured and uses the
    default 4 generations. The current state of the system is shown below
    (only one type is illustrated, for simplicity):
    
    Type: ANON
    
    	Zone    DMA32     Normal    Movable    Device
    
    	Gen 0       0          0        4GB         0
    
    	Gen 1       0        1GB        1MB         0
    
    	Gen 2     1MB        4GB        1MB         0
    
    	Gen 3     1MB        1MB        1MB         0
    
    Now consider a GFP_KERNEL allocation request (eligible zone index <=
    Normal). evict_folios() will return without doing any work, since there
    are no pages to scan in the eligible zones of the oldest generation.
    Reclaim won't make progress until it is triggered by a ZONE_MOVABLE
    allocation request, which may not happen soon if there is a lot of free
    memory in the movable zone. This can lead to OOM kills even though there
    is 1GB of pages in the Normal zone of Gen 1 that we have not yet tried to
    reclaim.
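
The stall can be reproduced on paper with the numbers from the table above.
The sketch below is only an illustration: toy_anon_mb and scannable_mb() are
hypothetical names, sizes are in MB, and the zone enum is reduced to the
four columns of the table.

```c
#include <assert.h>

enum toy_zone { TOY_DMA32, TOY_NORMAL, TOY_MOVABLE, TOY_DEVICE, TOY_NR_ZONES };
#define TOY_NR_GENS 4

/* ANON sizes in MB, copied from the table: [gen][zone]. */
static const long toy_anon_mb[TOY_NR_GENS][TOY_NR_ZONES] = {
	/* DMA32 Normal Movable Device */
	{     0,     0,   4096,     0 },	/* Gen 0 (oldest) */
	{     0,  1024,      1,     0 },	/* Gen 1 */
	{     1,  4096,      1,     0 },	/* Gen 2 */
	{     1,     1,      1,     0 },	/* Gen 3 (youngest) */
};

/* Sum what a request with @reclaim_idx may scan in generation @gen. */
static long scannable_mb(int gen, int reclaim_idx)
{
	long total = 0;
	int zone;

	assert(gen >= 0 && gen < TOY_NR_GENS);
	assert(reclaim_idx >= 0 && reclaim_idx < TOY_NR_ZONES);

	/* Same zone walk as the scan_folios() pseudocode above. */
	for (zone = reclaim_idx; zone >= 0; zone--)
		total += toy_anon_mb[gen][zone];
	return total;
}
```

With these numbers, scannable_mb(0, TOY_NORMAL) is 0 while
scannable_mb(1, TOY_NORMAL) is 1024: the reclaimable 1GB sits one generation
behind an oldest generation that a GFP_KERNEL request cannot drain.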
    
    This issue is not seen in the conventional active/inactive LRU since
    there are no per-zone lists.
    
    If there are no (or not enough) folios to scan in the eligible zones,
    move folios from ineligible zones (zone_index > reclaim_index) to the
    next generation. This allows min_seq to progress and reclaim to proceed
    from the next generation (Gen 1).
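
A minimal sketch of the idea, not the patch itself: struct toy_lru,
eligible_mb() and skip_ineligible() are hypothetical names, sizes are in MB,
generations are plain array indices without wraparound, and the whole
ineligible list is moved in one step rather than folio by folio.

```c
#include <assert.h>
#include <stdbool.h>

enum { TOY_DMA32, TOY_NORMAL, TOY_MOVABLE, TOY_DEVICE, TOY_NR_ZONES };
#define TOY_NR_GENS 4

struct toy_lru {
	long mb[TOY_NR_GENS][TOY_NR_ZONES];	/* one type only, sizes in MB */
	int oldest;				/* index of the oldest generation */
};

/* Size of the oldest generation in zones at or below @reclaim_idx. */
static long eligible_mb(const struct toy_lru *l, int reclaim_idx)
{
	long total = 0;
	int zone;

	assert(reclaim_idx >= 0 && reclaim_idx < TOY_NR_ZONES);
	for (zone = reclaim_idx; zone >= 0; zone--)
		total += l->mb[l->oldest][zone];
	return total;
}

/*
 * If the oldest generation has nothing eligible, push the contents of the
 * ineligible zones (zone > @reclaim_idx) into the next generation so the
 * oldest generation becomes empty and can be retired.
 */
static bool skip_ineligible(struct toy_lru *l, int reclaim_idx)
{
	int zone, next = l->oldest + 1;

	if (eligible_mb(l, reclaim_idx) > 0)
		return false;		/* there is still work to do here */
	if (next >= TOY_NR_GENS)
		return false;		/* no younger generation to move into */

	for (zone = reclaim_idx + 1; zone < TOY_NR_ZONES; zone++) {
		l->mb[next][zone] += l->mb[l->oldest][zone];
		l->mb[l->oldest][zone] = 0;
	}
	l->oldest = next;		/* the toy equivalent of advancing min_seq */
	return true;
}
```

Applied to the example state, a request limited to the Normal zone moves the
4GB of Movable pages from Gen 0 into Gen 1, after which the 1GB in the
Normal zone of Gen 1 becomes eligible for scanning.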
    
    Qualcomm, Mediatek and raspberrypi [1] discovered this issue independently.
    
    [1] https://github.com/raspberrypi/linux/issues/5395
    
    Link: https://lkml.kernel.org/r/20230802025606.346758-1-kaleshsingh@google.com
    Fixes: ac35a490 ("mm: multi-gen LRU: minimal implementation")
    Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
    Reported-by: Charan Teja Kalla <quic_charante@quicinc.com>
    Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
    Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> [mediatek]
    Tested-by: Charan Teja Kalla <quic_charante@quicinc.com>
    Cc: Yu Zhao <yuzhao@google.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: Brian Geffon <bgeffon@google.com>
    Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
    Cc: Matthias Brugger <matthias.bgg@gmail.com>
    Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
    Cc: Qi Zheng <zhengqi.arch@bytedance.com>
    Cc: Steven Barrett <steven@liquorix.net>
    Cc: Suleiman Souhlal <suleiman@google.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>