    mm: page_isolation: prepare for hygienic freelists · fd919a85
    Johannes Weiner authored
    Page isolation currently sets MIGRATE_ISOLATE on a block, then drops
    zone->lock and scans the block for straddling buddies to split up. 
    Because this happens non-atomically wrt the page allocator, it's possible
    for allocations to get a buddy whose first block is a regular pcp
    migratetype but whose tail is isolated.  This means that in certain cases
    memory can still be allocated after isolation.  It will also trigger the
    freelist type hygiene warnings in subsequent patches.
    
    start_isolate_page_range()
      isolate_single_pageblock()
        set_migratetype_isolate(tail)
          lock zone->lock
          move_freepages_block(tail) // nop
          set_pageblock_migratetype(tail)
          unlock zone->lock
                                                         __rmqueue_smallest()
                                                           del_page_from_freelist(head)
                                                           expand(head, head_mt)
                                                             WARN(head_mt != tail_mt)
        start_pfn = ALIGN_DOWN(MAX_ORDER_NR_PAGES)
        for (pfn = start_pfn, pfn < end_pfn)
          if (PageBuddy())
            split_free_page(head)
    
    Introduce a variant of move_freepages_block() provided by the allocator
    specifically for page isolation; it moves free pages, converts the block,
    and handles the splitting of straddling buddies while holding zone->lock.
    
    The allocator knows that pageblocks and buddies are always naturally
    aligned, which means that buddies can only straddle blocks if they're
    actually >pageblock_order.  This means the search-and-split part can be
    simplified compared to what page isolation used to do.
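
    As a rough illustration of the alignment argument (not the code in
    this patch): the sketch below assumes a pageblock order of 9, and
    buddy_straddles() and large_buddy_head() are hypothetical helper names
    made up for this example.  A naturally aligned buddy of order
    <= PAGEBLOCK_ORDER always sits inside one block, so only the few orders
    above it need to be considered when looking for a straddling buddy.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for illustration only; the real pageblock_order
 * depends on architecture and configuration. */
#define PAGEBLOCK_ORDER 9UL

/*
 * Hypothetical helper: does a naturally aligned buddy starting at @pfn
 * with the given @order cover more than one pageblock?
 */
static inline bool buddy_straddles(unsigned long pfn, unsigned int order)
{
	unsigned long last = pfn + (1UL << order) - 1;

	/* Buddies are naturally aligned to their own size. */
	assert((pfn & ((1UL << order) - 1)) == 0);

	/* Compare the pageblock index of the first and last page. */
	return (pfn >> PAGEBLOCK_ORDER) != (last >> PAGEBLOCK_ORDER);
}

/*
 * Hypothetical helper: the only possible head pfn of an order-@order
 * buddy that contains @pfn, obtained by rounding down to the buddy size.
 */
static inline unsigned long large_buddy_head(unsigned long pfn,
					     unsigned int order)
{
	return pfn & ~((1UL << order) - 1);
}
```

    With these assumed values, buddy_straddles(512, 9) is false while
    buddy_straddles(0, 10) is true, and large_buddy_head(1536, 10) is 1024.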
    
    Also tighten up the page isolation code around the expectations of which
    pages can be large, and how they are freed.
    
    Based on extensive discussions with and invaluable input from Zi Yan.
    
    [hannes@cmpxchg.org: work around older gcc warning]
      Link: https://lkml.kernel.org/r/20240321142426.GB777580@cmpxchg.org
    Link: https://lkml.kernel.org/r/20240320180429.678181-10-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: "Huang, Ying" <ying.huang@intel.com>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>