    mm/munlock: mlock_page() munlock_page() batch by pagevec · 2fbb0c10
    Hugh Dickins authored
    A weakness of the page->mlock_count approach is the need for lruvec lock
    while holding page table lock.  That is not an overhead we would allow on
    normal pages, but I think acceptable just for pages in an mlocked area.
    But let's try to amortize the extra cost by gathering on per-cpu pagevec
    before acquiring the lruvec lock.
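
The amortization idea above can be sketched in userspace C. This is a minimal single-threaded model under stated assumptions, not the kernel code: `struct batch`, `batch_add()`, `batch_drain()`, `big_lock()` and `BATCH_SIZE` are hypothetical names, the "lock" is only a counter standing in for the lruvec lock, and the per-cpu aspect is not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE 15   /* hypothetical capacity, chosen to resemble a small pagevec */

struct item { int mlock_count; };          /* stand-in for a page */

struct batch {                             /* stand-in for a per-cpu pagevec */
    struct item *slots[BATCH_SIZE];
    size_t nr;
};

static unsigned long lock_acquisitions;    /* counts simulated lock round trips */

/* Stand-ins for taking and dropping the expensive lock. */
static void big_lock(void)   { lock_acquisitions++; }
static void big_unlock(void) { }

/* Process every queued item under a single lock acquisition. */
static void batch_drain(struct batch *b)
{
    if (b->nr == 0)
        return;
    big_lock();
    for (size_t i = 0; i < b->nr; i++)
        b->slots[i]->mlock_count++;
    big_unlock();
    b->nr = 0;
}

/* Queue one item; drain only when the batch becomes full. */
static void batch_add(struct batch *b, struct item *it)
{
    assert(b->nr < BATCH_SIZE);
    b->slots[b->nr++] = it;
    if (b->nr == BATCH_SIZE)
        batch_drain(b);
}
```

In this model, queuing 40 items costs two lock acquisitions while filling, plus one more for a final explicit drain, instead of 40.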
    
    I have an unverified conjecture that the mlock pagevec might work out
    well for delaying the mlock processing of new file pages until they have
    got off lru_cache_add()'s pagevec and on to LRU.
    
    The initialization of page->mlock_count is subject to races and awkward:
    0 or !!PageMlocked or 1?  Was it wrong even in the implementation before
    this commit, which just widens the window?  I haven't gone back to think
    it through.  Maybe someone can point out a better way to initialize it.
    
    Bringing lru_cache_add_inactive_or_unevictable()'s mlock initialization
    into mm/mlock.c has helped: mlock_new_page(), using the mlock pagevec,
    rather than lru_cache_add()'s pagevec.
    
    Experimented with various orderings: the right thing seems to be for
    mlock_page() and mlock_new_page() to TestSetPageMlocked before adding to
    pagevec, but munlock_page() to leave TestClearPageMlocked to the later
    pagevec processing.
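
The ordering described above can be illustrated with a simplified userspace model using C11 atomics. It is a sketch under assumptions, not the code in mm/mlock.c: `struct page_model`, `struct queue`, `queue_mlock()`, `queue_munlock()`, `queue_drain()` and `PG_MLOCKED` are hypothetical names, and the rule that the flag is cleared once the count reaches zero is a simplification made for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define PG_MLOCKED 1u    /* hypothetical flag bit */
#define QUEUE_SIZE 15    /* hypothetical batch capacity */

struct page_model {
    atomic_uint flags;
    int mlock_count;
};

enum op { OP_MLOCK, OP_MUNLOCK };

struct entry { struct page_model *page; enum op op; };
struct queue { struct entry e[QUEUE_SIZE]; size_t nr; };

/* Atomically set the flag; return whether it was already set. */
static bool test_and_set_mlocked(struct page_model *p)
{
    return atomic_fetch_or(&p->flags, PG_MLOCKED) & PG_MLOCKED;
}

/* Atomically clear the flag; return whether it was set before. */
static bool test_and_clear_mlocked(struct page_model *p)
{
    return atomic_fetch_and(&p->flags, ~PG_MLOCKED) & PG_MLOCKED;
}

/* Deferred processing: counts change here, and munlock clears the flag here. */
static void queue_drain(struct queue *q)
{
    for (size_t i = 0; i < q->nr; i++) {
        struct page_model *p = q->e[i].page;

        if (q->e[i].op == OP_MLOCK) {
            p->mlock_count++;
        } else {
            if (p->mlock_count > 0)
                p->mlock_count--;
            if (p->mlock_count == 0)
                test_and_clear_mlocked(p);
        }
    }
    q->nr = 0;
}

static void queue_add(struct queue *q, struct page_model *p, enum op op)
{
    q->e[q->nr].page = p;
    q->e[q->nr].op = op;
    if (++q->nr == QUEUE_SIZE)
        queue_drain(q);
}

/* mlock side: the flag is set before the page is queued. */
static void queue_mlock(struct queue *q, struct page_model *p)
{
    test_and_set_mlocked(p);
    queue_add(q, p, OP_MLOCK);
}

/* munlock side: only queue; clearing the flag waits for the drain. */
static void queue_munlock(struct queue *q, struct page_model *p)
{
    queue_add(q, p, OP_MUNLOCK);
}
```

The point of the asymmetry in this model is that a page looks mlocked as soon as mlock is requested, while it stops looking mlocked only once the queued munlock has actually been processed.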
    
    Dropped the VM_BUG_ON_PAGE(PageTail)s this time around: they have made
    their point, and the thp_nr_pages()s already contain a VM_BUG_ON_PGFLAGS()
    for that.
    
    This still leaves acquiring lruvec locks under page table lock each time
    the pagevec fills (or a THP is added): which I suppose is rather silly,
    since they sit on pagevec waiting to be processed long after page table
    lock has been dropped; but I'm disinclined to uglify the calling sequence
    until some load shows an actual problem with it (nothing wrong with
    taking lruvec lock under page table lock, just "nicer" to do it less).

    Signed-off-by: Hugh Dickins <hughd@google.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
swap.c 32.4 KB