    mm, thp: close race between mremap() and split_huge_page() · dd18dbc2
    Kirill A. Shutemov authored
    It's critical for split_huge_page() (and migration) to catch and freeze
    all PMDs on rmap walk.  It gets tricky if there's concurrent fork() or
    mremap() since usually we copy/move page table entries on dup_mm() or
    move_page_tables() without rmap lock taken.  To make this work, we rely
    on the rmap walk order to not miss any entry: the walk must visit the
    destination VMA after the source one.
    
    But after switching the rmap implementation to an interval tree, it's
    not always possible to preserve the expected walk order.
    
    It works fine for dup_mm() since the new VMA has the same
    vma_start_pgoff() / vma_last_pgoff() and we explicitly insert the dst
    VMA after the src one with vma_interval_tree_insert_after().
    
    But on move_vma() the destination VMA can be merged into an adjacent
    one and, as a result, shifted left in the interval tree.  Fortunately,
    we can detect this situation and prevent the race with the rmap walk by
    moving page table entries under the rmap lock.  See commit 38a76013.
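
The detection described above can be sketched in user-space C. This is only an illustrative model, not the kernel code: `struct toy_vma` and `toy_need_rmap_locks()` are hypothetical names, and the sketch assumes the check compares the page offsets of the two VMAs, treating a destination that does not sort strictly after the source as the case where the rmap walk may reach it first.

```c
#include <assert.h>   /* only needed if you add checks like those in the note below */
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for the kernel's VMA structure. */
struct toy_vma {
    unsigned long vm_pgoff;   /* offset of the mapping, in pages */
};

/*
 * Assumed heuristic: the rmap walk is expected to visit the source VMA
 * before the destination. If the destination ends up at the same or a
 * lower page offset (for example after being merged into an adjacent
 * VMA), that order is no longer guaranteed, so the caller should move
 * the page table entries with the rmap locks held.
 */
static bool toy_need_rmap_locks(const struct toy_vma *src,
                                const struct toy_vma *dst)
{
    return dst->vm_pgoff <= src->vm_pgoff;
}
```

With a source at page offset 20, a destination merged into a VMA at offset 10 makes the function return true (take the locks), while a destination at offset 30 returns false (the usual walk order still holds).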
    
    Problem is that we...
mremap.c 15 KB