    mm: page_alloc: skip regions with hugetlbfs pages when allocating 1G pages · 4d73ba5f
    Mel Gorman authored
    A bug was reported by Yuanxi Liu where allocating 1G pages at runtime is
    taking an excessive amount of time for large amounts of memory.  Further
    testing of huge page allocation showed that the cost is linear, i.e. if
    allocating 1G pages in batches of 10 then the time to allocate
    nr_hugepages from 10->20->30->etc increases linearly even though only
    10 pages are allocated at
    each step.  Profiles indicated that much of the time is spent checking the
    validity within already existing huge pages and then attempting a
    migration that fails after isolating the range, draining pages and a whole
    lot of other useless work.
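    
    A rough way to see why the cost per batch keeps growing is sketched
    below.  This is a userspace toy model, not kernel code: it assumes the
    zone is a row of 1G-aligned slots, that existing huge pages occupy the
    lowest slots, and that every occupied slot costs one expensive failed
    attempt before a free slot is reached.  The names model_failed_attempts
    and model_batch_cost are hypothetical.

```c
#include <stddef.h>

/*
 * Toy model (assumption): the first nr_existing of nr_slots 1G-aligned
 * slots already hold huge pages.  Count how many occupied slots are tried,
 * and fail, before the first free slot is reached.
 */
static size_t model_failed_attempts(size_t nr_slots, size_t nr_existing)
{
	size_t failed = 0;

	for (size_t slot = 0; slot < nr_slots; slot++) {
		if (slot >= nr_existing)
			break;		/* first free slot: allocation succeeds */
		failed++;		/* occupied slot: expensive failed attempt */
	}
	return failed;
}

/* Total failed attempts while growing the pool from 'from' to 'to' pages. */
static size_t model_batch_cost(size_t nr_slots, size_t from, size_t to)
{
	size_t total = 0;

	for (size_t n = from; n < to; n++)
		total += model_failed_attempts(nr_slots, n);
	return total;
}
```

    In this model each batch of 10 costs 100 more failed attempts than the
    previous one, which matches the shape of the reported behaviour even
    though the real per-attempt cost is far more complicated.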
    
    Commit eb14d4ee ("mm,page_alloc: drop unnecessary checks from
    pfn_range_valid_contig") removed two checks, one which ignored huge pages
    for contiguous allocations as huge pages can sometimes migrate.  While
    there may be value in migrating a 2M page to satisfy a 1G allocation, it's
    potentially expensive if the 1G allocation fails and it's pointless to try
    moving a 1G page for a new 1G allocation or scan the tail pages for valid
    PFNs.
    
    Reintroduce the PageHuge check and assume any contiguous region with
    hugetlbfs pages is unsuitable for a new 1G allocation.
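    
    The shape of the reintroduced check can be illustrated with a userspace
    model.  This is a sketch under stated assumptions rather than the kernel
    implementation: struct model_page and model_range_valid_contig are
    hypothetical stand-ins, with each field modelling the kernel predicate
    named in its comment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-page state a validity check consults. */
struct model_page {
	bool online;	/* models pfn_to_online_page() finding a page */
	int zone_id;	/* models page_zone(page) */
	bool reserved;	/* models PageReserved(page) */
	bool hugetlb;	/* models PageHuge(page) */
};

/*
 * Return true only if every page in [start, start + nr) looks usable for
 * a new contiguous allocation in zone zone_id.
 */
static bool model_range_valid_contig(const struct model_page *pages,
				     size_t start, size_t nr, int zone_id)
{
	for (size_t i = start; i < start + nr; i++) {
		const struct model_page *page = &pages[i];

		if (!page->online)
			return false;
		if (page->zone_id != zone_id)
			return false;
		if (page->reserved)
			return false;
		/* Reintroduced check: give up early on hugetlbfs pages. */
		if (page->hugetlb)
			return false;
	}
	return true;
}
```

    Rejecting the range at this point is cheap compared with isolating it
    and attempting a migration that is expected to fail.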
    
    The hpagealloc test allocates huge pages in batches and reports the
    average latency per page over time.  This test happens just after boot
    when fragmentation is not an issue.  Units are in milliseconds.
    
    hpagealloc
                                   6.3.0-rc6              6.3.0-rc6              6.3.0-rc6
                                     vanilla   hugeallocrevert-v1r1   hugeallocsimple-v1r2
    Min       Latency       26.42 (   0.00%)        5.07 (  80.82%)       18.94 (  28.30%)
    1st-qrtle Latency      356.61 (   0.00%)        5.34 (  98.50%)       19.85 (  94.43%)
    2nd-qrtle Latency      697.26 (   0.00%)        5.47 (  99.22%)       20.44 (  97.07%)
    3rd-qrtle Latency      972.94 (   0.00%)        5.50 (  99.43%)       20.81 (  97.86%)
    Max-1     Latency       26.42 (   0.00%)        5.07 (  80.82%)       18.94 (  28.30%)
    Max-5     Latency       82.14 (   0.00%)        5.11 (  93.78%)       19.31 (  76.49%)
    Max-10    Latency      150.54 (   0.00%)        5.20 (  96.55%)       19.43 (  87.09%)
    Max-90    Latency     1164.45 (   0.00%)        5.53 (  99.52%)       20.97 (  98.20%)
    Max-95    Latency     1223.06 (   0.00%)        5.55 (  99.55%)       21.06 (  98.28%)
    Max-99    Latency     1278.67 (   0.00%)        5.57 (  99.56%)       22.56 (  98.24%)
    Max       Latency     1310.90 (   0.00%)        8.06 (  99.39%)       26.62 (  97.97%)
    Amean     Latency      678.36 (   0.00%)        5.44 *  99.20%*       20.44 *  96.99%*
    
                       6.3.0-rc6   6.3.0-rc6   6.3.0-rc6
                         vanilla   revert-v1   hugeallocfix-v2
    Duration User           0.28        0.27        0.30
    Duration System       808.66       17.77       35.99
    Duration Elapsed      830.87       18.08       36.33
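    
    The percentages in the latency table appear consistent with a simple
    relative reduction against the vanilla column.  The helper below is a
    hypothetical illustration of that arithmetic, not the code used by the
    benchmark; small differences in the last digit are expected because the
    table values are themselves rounded.

```c
/* Percentage reduction of 'value' relative to 'baseline' (lower is better). */
static double improvement_pct(double baseline, double value)
{
	return (baseline - value) / baseline * 100.0;
}
```

    For example, improvement_pct(678.36, 5.44) is roughly 99.2, in line
    with the Amean row.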
    
    The vanilla kernel is poor, taking up to 1.3 seconds to allocate a huge
    page and almost 14 minutes in total to run the test.  Reverting the
    problematic commit reduces it to 8ms at worst and the patch takes 26ms.
    This patch fixes the main issue by skipping huge pages but leaves the
    page_count() check out because a page with an elevated count can
    potentially migrate.
    
    BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=217022
    Link: https://lkml.kernel.org/r/20230414141429.pwgieuwluxwez3rj@techsingularity.net
    Fixes: eb14d4ee ("mm,page_alloc: drop unnecessary checks from pfn_range_valid_contig")
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Reported-by: Yuanxi Liu <y.liu@naruida.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Reviewed-by: Oscar Salvador <osalvador@suse.de>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>