    mm: track mapcount of large folios in single value · 05c5323b
    David Hildenbrand authored
    Let's track the mapcount of large folios in a single value.  The mapcount
    of a large folio currently corresponds to the sum of the entire mapcount
    and all page mapcounts.
    
    This sum is what we actually want to know in folio_mapcount() and it is
    also sufficient for implementing folio_mapped().
    
    With PTE-mapped THP becoming more important and more widely used, we want
    to avoid looping over all pages of a folio just to obtain the mapcount of
    large folios.  The comment "In the common case, avoid the loop when no
    pages mapped by PTE" in folio_total_mapcount() no longer holds for
    mTHP that are always mapped by PTE.
    
    Further, we are planning on using folio_mapcount() more frequently, and
    might even want to remove page mapcounts for large folios in some kernel
    configs.  Therefore, allow for reading the mapcount of large folios
    efficiently and atomically without looping over any pages.
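
    To illustrate the difference between summing per-page counters and
    reading a single value, here is a minimal user-space sketch using C11
    atomics.  It is a toy model only: struct toy_folio, its fields and all
    toy_*() helpers are hypothetical names, and it does not reproduce the
    kernel's struct folio layout or its counter encoding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_MAX_PAGES 16

/* Toy model of a large folio; every name here is hypothetical. */
struct toy_folio {
    int nr_pages;
    atomic_int entire_mapcount;              /* mappings of the whole folio */
    atomic_int page_mapcount[TOY_MAX_PAGES]; /* per-page (PTE) mappings */
    atomic_int large_mapcount;               /* single value: sum of the above */
};

static inline void toy_folio_init(struct toy_folio *f, int nr_pages)
{
    assert(nr_pages > 0 && nr_pages <= TOY_MAX_PAGES);
    f->nr_pages = nr_pages;
    atomic_init(&f->entire_mapcount, 0);
    atomic_init(&f->large_mapcount, 0);
    for (int i = 0; i < TOY_MAX_PAGES; i++)
        atomic_init(&f->page_mapcount[i], 0);
}

/* Map one page: per-page counter plus one additional atomic add. */
static inline void toy_map_page(struct toy_folio *f, int idx)
{
    assert(idx >= 0 && idx < f->nr_pages);
    atomic_fetch_add(&f->page_mapcount[idx], 1);
    atomic_fetch_add(&f->large_mapcount, 1);
}

/* Map the folio as a whole: entire counter plus one additional atomic add. */
static inline void toy_map_entire(struct toy_folio *f)
{
    atomic_fetch_add(&f->entire_mapcount, 1);
    atomic_fetch_add(&f->large_mapcount, 1);
}

/* Looping variant: O(nr_pages) reads, and not an atomic snapshot. */
static inline int toy_mapcount_by_loop(struct toy_folio *f)
{
    int sum = atomic_load(&f->entire_mapcount);

    for (int i = 0; i < f->nr_pages; i++)
        sum += atomic_load(&f->page_mapcount[i]);
    return sum;
}

/* Single-value variant: one atomic read, independent of folio size. */
static inline int toy_mapcount(struct toy_folio *f)
{
    return atomic_load(&f->large_mapcount);
}

/* "Is it mapped at all?" follows directly from the single value. */
static inline bool toy_mapped(struct toy_folio *f)
{
    return toy_mapcount(f) >= 1;
}
```

    In this model, both variants agree whenever no mapping changes
    concurrently, but only toy_mapcount() is a single atomic read.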
    
    Also maintain the mapcount for hugetlb pages for simplicity.  Use the new
    mapcount to implement folio_mapcount() and folio_mapped().  Make
    page_mapped() simply call folio_mapped().  We can now get rid of
    folio_large_is_mapped().
    
    _nr_pages_mapped is now only used in rmap code and for debugging purposes.
    Keep folio_nr_pages_mapped() around, but document that its use should be
    limited to rmap internals and debugging purposes.
    
    This change implies one additional atomic add/sub whenever
    mapping/unmapping (parts of) a large folio.
    
    As we now batch RMAP operations for PTE-mapped THP during fork(), during
    unmap/zap, and when PTE-remapping a PMD-mapped THP, and we adjust the
    large mapcount for a PTE batch only once, the added overhead in the common
    case is small.  Only when unmapping individual pages of a large folio
    (e.g., during COW) might the overhead be bigger in comparison, but it is
    essentially one additional atomic operation.
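
    The cost argument can be sketched in user space as follows.  This is an
    illustration under stated assumptions, not the rmap code: struct
    toy_folio, toy_adjust_ptes() and the large_updates instrumentation
    counter are hypothetical and exist only to count how often the single
    value is touched.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_MAX_PAGES 16

/* Toy model; all names are hypothetical. */
struct toy_folio {
    atomic_int page_mapcount[TOY_MAX_PAGES]; /* per-page (PTE) mappings */
    atomic_int large_mapcount;               /* single summed value */
    int large_updates; /* instrumentation: updates of large_mapcount
                          (plain int, single-threaded demo only) */
};

static inline void toy_folio_init(struct toy_folio *f)
{
    for (int i = 0; i < TOY_MAX_PAGES; i++)
        atomic_init(&f->page_mapcount[i], 0);
    atomic_init(&f->large_mapcount, 0);
    f->large_updates = 0;
}

/*
 * Map (delta = +1) or unmap (delta = -1) a batch of nr consecutive pages.
 * The per-page counters are adjusted one by one, but the single value is
 * adjusted only once per batch.
 */
static inline void toy_adjust_ptes(struct toy_folio *f, int first, int nr,
                                   int delta)
{
    assert(first >= 0 && nr > 0 && first + nr <= TOY_MAX_PAGES);
    assert(delta == 1 || delta == -1);

    for (int i = first; i < first + nr; i++)
        atomic_fetch_add(&f->page_mapcount[i], delta);

    atomic_fetch_add(&f->large_mapcount, nr * delta);
    f->large_updates++;
}
```

    Mapping all 16 pages as one batch updates the single value once;
    unmapping one page afterwards (a batch of size 1) costs one more
    update, which is the "one additional atomic operation" case.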
    
    Note that our refcount would overflow before the new mapcount could:
    each mapping requires a folio reference.  Extend the documentation of
    folio_mapcount().
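
    The overflow argument rests on the invariant mapcount <= refcount.  A
    minimal sketch of that invariant, assuming plain int counters and a
    hypothetical toy_try_get() that refuses to go past INT_MAX (the
    kernel's actual refcount handling is not modelled here):

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Toy counters; names and the refusal policy are assumptions. */
struct toy_counts {
    int refcount;
    int mapcount;
};

/* Take a reference unless that would overflow the refcount. */
static inline bool toy_try_get(struct toy_counts *c)
{
    if (c->refcount == INT_MAX)
        return false;
    c->refcount++;
    return true;
}

/* Each mapping first takes a reference, then bumps the mapcount. */
static inline bool toy_try_map(struct toy_counts *c)
{
    if (!toy_try_get(c))
        return false;
    /* Safe: mapcount was <= the old refcount, which was < INT_MAX. */
    c->mapcount++;
    assert(c->mapcount <= c->refcount);
    return true;
}
```

    Because every successful toy_try_map() needs a reference first, the
    refcount hits its limit no later than the mapcount would.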
    
    Link: https://lkml.kernel.org/r/20240409192301.907377-5-david@redhat.com
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
    Cc: Chris Zankel <chris@zankel.net>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: Max Filippov <jcmvbkbc@gmail.com>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Richard Chang <richardycc@google.com>
    Cc: Rich Felker <dalias@libc.org>
    Cc: Ryan Roberts <ryan.roberts@arm.com>
    Cc: Yang Shi <shy828301@gmail.com>
    Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
    Cc: Zi Yan <ziy@nvidia.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>