• Hugh Dickins's avatar
    mm/khugepaged: collapse_shmem() without freezing new_page · 87c460a0
    Hugh Dickins authored
    khugepaged's collapse_shmem() does almost all of its work, to assemble
    the huge new_page from 512 scattered old pages, with the new_page's
    refcount frozen to 0 (and refcounts of all old pages so far also frozen
    to 0).  Including shmem_getpage() to read in any which were out on swap,
    memory reclaim if necessary to allocate their intermediate pages, and
    copying over all the data from old to new.
    
    Imagine the frozen refcount as a spinlock held, but without any lock
    debugging to highlight the abuse: it's not good, and under serious load
    heads into lockups - speculative getters of the page are not expecting
    to spin while khugepaged is rescheduled.
    
    One can get a little further under load by hacking around elsewhere; but
    fortunately, freezing the new_page turns out to have been entirely
    unnecessary, with no hacks needed elsewhere.
    
    The huge new_page lock is already held throughout, and guards all its
    subpages as they are brought one by one into the page cache tree; and
    anything reading the data in that page, without the lock, before it has
    been marked PageUptodate, would already be in the wrong.  So simply
    eliminate the freezing of the new_page.
    
    Each of the old pages remains frozen with refcount 0 after it has been
    replaced by a new_page subpage in the page cache tree, until they are
    all unfrozen on success or failure: just as before.  They could be
    unfrozen sooner, but cause no problem once no longer visible to
    find_get_entry(), filemap_map_pages() and other speculative lookups.
    
    Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261527570.2275@eggly.anvils
    Fixes: f3f0e1d2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
    Signed-off-by: default avatarHugh Dickins <hughd@google.com>
    Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Jerome Glisse <jglisse@redhat.com>
    Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: <stable@vger.kernel.org>	[4.8+]
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    87c460a0
khugepaged.c 47.8 KB