    mempolicy: mmap_lock is not needed while migrating folios · 72e315f7
    Hugh Dickins authored
    mbind(2) holds down_write of current task's mmap_lock throughout
    (exclusive because it needs to set the new mempolicy on the vmas);
    migrate_pages(2) holds down_read of pid's mmap_lock throughout.
    
    They both hold mmap_lock across the internal migrate_pages(), under which
    all new page allocations (huge or small) are made.  I'm nervous about it;
    and migrate_pages() certainly does not need mmap_lock itself.  It's done
    this way for mbind(2), because its page allocator is vma_alloc_folio() or
    alloc_hugetlb_folio_vma(), both of which depend on vma and address.
    
    Now that we have alloc_pages_mpol(), depending on (refcounted) memory
    policy and interleave index, mbind(2) can be modified to use that or
    alloc_hugetlb_folio_nodemask(), and then not need mmap_lock across the
    internal migrate_pages() at all: add alloc_migration_target_by_mpol() to
    replace mbind's new_page().
    
    (After that change, alloc_hugetlb_folio_vma() is used by nothing but a
    userfaultfd functi...
hugetlb.c 215 KB