    mm/page_alloc: add page->buddy_list and page->pcp_list · bf75f200
    Mel Gorman authored
    Patch series "Drain remote per-cpu directly", v5.
    
    Some setups, notably NOHZ_FULL CPUs, may be running realtime or
    latency-sensitive applications that cannot tolerate interference due to
    per-cpu drain work queued by __drain_all_pages().  Introduce a new
    mechanism to remotely drain the per-cpu lists.  It is made possible by
    remotely taking the new per-cpu spinlock in 'struct per_cpu_pages'.
    This has two advantages: the time to drain is more predictable, and
    other unrelated tasks are not interrupted.
    
    This series has the same intent as Nicolas' series "mm/page_alloc: Remote
    per-cpu lists drain support" -- avoid interference of a high priority task
    due to a workqueue item draining per-cpu page lists.  While many workloads
    can tolerate a brief interruption, it may cause a real-time task running
    on a NOHZ_FULL CPU to miss a deadline, and, at a minimum, the draining
    is non-deterministic.
    
    Currently an IRQ-safe local_lock protects the page allocator per-cpu
    lists.  The local_lock on its own prevents migration and the IRQ disabling
    protects from corruption due to an interrupt arriving while a page
    allocation is in progress.
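
    For context, the existing arrangement looks roughly like the sketch
    below (simplified from mm/page_alloc.c, not the verbatim code):

	/* One IRQ-safe local_lock covers every CPU's pcp lists */
	struct pagesets {
		local_lock_t lock;
	};
	static DEFINE_PER_CPU(struct pagesets, pagesets) = {
		.lock = INIT_LOCAL_LOCK(lock),
	};

	/* Allocation/free fast paths bracket pcp list access with: */
	local_lock_irqsave(&pagesets.lock, flags);
	/* ... manipulate this CPU's per_cpu_pages lists ... */
	local_unlock_irqrestore(&pagesets.lock, flags);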
    
    This series adjusts the locking.  A spinlock is added to struct
    per_cpu_pages to protect the list contents while local_lock_irq is
    ultimately replaced by just the spinlock in the final patch.  This allows
    a remote CPU to safely drain a remote per-cpu list.  Follow-on work
    should allow the spin_lock_irqsave to be converted to spin_lock to
    avoid IRQs being disabled/enabled in most cases.  The follow-on patch
    will be one kernel release later as it is relatively high risk and it
    will make bisection clearer if there are any problems.
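
    The end state, in rough terms, is a per-pcp spinlock (a sketch only;
    field names abbreviated from the actual struct per_cpu_pages layout):

	struct per_cpu_pages {
		spinlock_t lock;	/* Protects the lists below */
		int count;		/* Number of pages on the lists */
		int high;		/* High watermark, emptying needed */
		int batch;		/* Chunk size for buddy add/remove */
		/* ... */
		struct list_head lists[NR_PCP_LISTS];
	};

	/* Once the final patch lands, pcp list access reduces to: */
	spin_lock_irqsave(&pcp->lock, flags);
	/* ... add/remove pages on pcp->lists ... */
	spin_unlock_irqrestore(&pcp->lock, flags);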
    
    Patch 1 is a cosmetic patch to clarify when page->lru is storing buddy pages
    	and when it is storing per-cpu pages.
    
    Patch 2 shrinks per_cpu_pages to make room for a spin lock. Strictly speaking
    	this is not necessary but it avoids per_cpu_pages consuming another
    	cache line.
    
    Patch 3 is a preparation patch to avoid code duplication.
    
    Patch 4 is a minor correction.
    
    Patch 5 uses a spin_lock to protect the per_cpu_pages contents while still
    	relying on local_lock to prevent migration, stabilise the pcp
    	lookup and prevent IRQ reentrancy.
    
    Patch 6 remote drains per-cpu pages directly instead of using a workqueue.
    
    Patch 7 uses a normal spinlock instead of local_lock for remote draining
	(a sketch of the resulting drain path follows this list).
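
    As an illustration of where patches 6 and 7 end up, the zone drain can
    take a remote CPU's pcp spinlock directly instead of queueing work on
    that CPU.  The shape is approximately the following (a sketch, not the
    final code; helper names follow the existing mm/page_alloc.c ones):

	static void drain_pages_zone(unsigned int cpu, struct zone *zone)
	{
		struct per_cpu_pages *pcp;

		pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
		if (pcp->count) {
			unsigned long flags;

			/*
			 * Safe from any CPU: the pcp spinlock, not a
			 * CPU-local lock, now protects the list contents.
			 */
			spin_lock_irqsave(&pcp->lock, flags);
			free_pcppages_bulk(zone, pcp->count, pcp, 0);
			spin_unlock_irqrestore(&pcp->lock, flags);
		}
	}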
    
    
    This patch (of 7):
    
    The page allocator uses page->lru for storing pages on either buddy or PCP
    lists.  Create page->buddy_list and page->pcp_list as a union with
    page->lru.  This is simply to clarify what type of list a page is on in
    the page allocator.
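
    The new names alias page->lru within the existing union (a trimmed
    sketch of the relevant part of struct page; other union members are
    omitted):

	struct page {
		/* ... */
		union {
			struct list_head lru;		/* pageout lists */
			/* ... other aliases of this word pair ... */
			struct list_head buddy_list;	/* free page on a buddy freelist */
			struct list_head pcp_list;	/* free page on a pcp list */
		};
		/* ... */
	};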
    
    No functional change intended.
    
    [minchan@kernel.org: fix page lru fields in macros]
    Link: https://lkml.kernel.org/r/20220624125423.6126-2-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Tested-by: Minchan Kim <minchan@kernel.org>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Tested-by: Yu Zhao <yuzhao@google.com>
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Marek Szyprowski <m.szyprowski@samsung.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>