    mm, page_alloc: reduce page alloc/free sanity checks · 700d2e9a
    Vlastimil Babka authored
    Historically, we have performed sanity checks on all struct pages being
    allocated or freed, making sure they have no unexpected page flags or
    certain field values.  This can detect insufficient cleanup and some cases
    of use-after-free, although on its own it can't always identify the
    culprit.  The result is a warning and the "bad page" being leaked.
    
    The checks do need some cpu cycles, so in 4.7 with commits 479f854a
    ("mm, page_alloc: defer debugging checks of pages allocated from the PCP")
    and 4db7548c ("mm, page_alloc: defer debugging checks of freed pages
    until a PCP drain") they were no longer performed in the hot paths when
    allocating and freeing from pcplists, but only when pcplists are bypassed,
    refilled or drained.  For debugging purposes, with CONFIG_DEBUG_VM enabled
    the checks were instead still done in the hot paths and not when refilling
    or draining pcplists.
    
    With 4462b32c ("mm, page_alloc: more extensive free page checking with
    debug_pagealloc"), enabling debug_pagealloc also moved the sanity checks
    back to hot paths.  When both debug_pagealloc and CONFIG_DEBUG_VM are
    enabled, the checks are done both in hot paths and pcplist refill/drain.
    
    Even though the non-debug default today might seem to be a sensible
    tradeoff between overhead and ability to detect bad pages, on closer look
    it's arguably not.  As most allocations go through the pcplists, catching
    any bad pages when refilling or draining pcplists has only a small chance,
    insufficient for debugging or serious hardening purposes.  On the other
    hand the cost of the checks is concentrated in the already expensive
    drain/refill batching operations, and those are done under the often
    contended zone lock.  That was recently identified as an issue for page
    allocation and the zone lock contention reduced by moving the checks
    outside of the locked section with a patch "mm: reduce lock contention of
    pcp buffer refill", but the cost of the checks is still visible compared
    to their removal [1].  In the pcplist draining path free_pcppages_bulk()
    the checks are still done under zone->lock.
    
    Thus, remove the checks from pcplist refill and drain paths completely.
    Introduce a static key check_pages_enabled to control checks during page
    allocation and freeing (whether pcplist is used or bypassed).  The static
    key is enabled if any of the following is true:
    
    - kernel is built with CONFIG_DEBUG_VM=y (debugging)
    - debug_pagealloc or page poisoning is boot-time enabled (debugging)
    - init_on_alloc or init_on_free is boot-time enabled (hardening)
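
    The enabling condition listed above can be sketched as a small
    user-space model.  This is only an illustration under stated
    assumptions: struct boot_config and should_enable_page_checks() are
    hypothetical names, and a plain bool result stands in for the kernel's
    static key, which is patched into the code at run time rather than
    tested as an ordinary variable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the build-time and boot-time settings listed above. */
struct boot_config {
	bool config_debug_vm;	/* kernel built with CONFIG_DEBUG_VM=y */
	bool debug_pagealloc;	/* debug_pagealloc enabled at boot */
	bool page_poisoning;	/* page poisoning enabled at boot */
	bool init_on_alloc;	/* init_on_alloc enabled at boot */
	bool init_on_free;	/* init_on_free enabled at boot */
};

/* Returns true if any of the debugging or hardening options is on. */
static bool should_enable_page_checks(const struct boot_config *cfg)
{
	return cfg->config_debug_vm ||
	       cfg->debug_pagealloc || cfg->page_poisoning ||
	       cfg->init_on_alloc || cfg->init_on_free;
}

/* Usage example: a default config stays unchecked, a hardened one does not. */
static void example_usage(void)
{
	struct boot_config plain = { 0 };
	struct boot_config hardened = { .init_on_free = true };

	assert(!should_enable_page_checks(&plain));
	assert(should_enable_page_checks(&hardened));
}
```

    In this model a single hardening or debugging option is enough to turn
    the checks on, matching the "any of" reading of the list.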
    
    The resulting user visible changes:
    - no checks when draining/refilling pcplists - less overhead, with
      likely no practical reduction of ability to catch bad pages
    - no checks when bypassing pcplists in default config (no
      debugging/hardening) - less overhead etc. as above
    - on typical hardened kernels [2], checks are now performed on each page
      allocation/free (previously only when bypassing/draining/refilling
      pcplists) - the init_on_alloc/init_on_free enabled should be sufficient
      indication for preferring more costly alloc/free operations for
      hardening purposes and we shouldn't need to introduce another toggle
    - code (various wrappers) removal and simplification
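
    The first three changes can be summarized as a decision function.  The
    sketch below is a user-space model, not kernel code: enum page_path and
    checks_run() are hypothetical names, and the bool argument stands in
    for the state of the check_pages_enabled static key.

```c
#include <stdbool.h>

/* Hypothetical labels for the paths discussed in this changelog. */
enum page_path {
	PATH_ALLOC_PCP,		/* allocation served from a pcplist */
	PATH_FREE_PCP,		/* free into a pcplist */
	PATH_ALLOC_BYPASS,	/* allocation bypassing pcplists */
	PATH_FREE_BYPASS,	/* free bypassing pcplists */
	PATH_PCP_REFILL,	/* pcplist refill */
	PATH_PCP_DRAIN,		/* pcplist drain */
};

/*
 * Models whether sanity checks run on a given path after this change:
 * never on refill/drain, and on every alloc/free only when the key is on.
 */
static bool checks_run(enum page_path path, bool check_pages_enabled)
{
	switch (path) {
	case PATH_PCP_REFILL:
	case PATH_PCP_DRAIN:
		return false;
	default:
		return check_pages_enabled;
	}
}
```

    With the key off (default config), no path performs the checks; with
    the key on, every allocation and free does, regardless of whether the
    pcplist is used.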
    
    [1] https://lore.kernel.org/all/68ba44d8-6899-c018-dcb3-36f3a96e6bea@sra.uni-hannover.de/
    [2] https://lore.kernel.org/all/63ebc499.a70a0220.9ac51.29ea@mx.google.com/
    
    [akpm@linux-foundation.org: coding-style cleanups]
    [akpm@linux-foundation.org: make check_pages_enabled static]
    Link: https://lkml.kernel.org/r/20230216095131.17336-1-vbabka@suse.cz
    Reported-by: Alexander Halbuer <halbuer@sra.uni-hannover.de>
    Reported-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
page_alloc.c 269 KB