    s390: select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP · 00a34d5a
    Gerald Schaefer authored
    Enable HUGETLB_PAGE_OPTIMIZE_VMEMMAP for s390.
    
    With this, the vmemmap pages that back the struct pages for compound tail
    pages of hugetlb pages are freed, and their mappings are remapped read-only
    to the compound head page frame; see also Documentation/vm/vmemmap_dedup.rst.
    
    For 1M hugetlb pages, this results in freeing 3 of 4 vmemmap pages,
    saving 12K of memory for each 1M hugetlb page (~1.2%).
    /sys/kernel/debug/kernel_page_tables will show the impact:
    
    ---[ vmemmap Area Start ]---
    [...]
    0x0000037202d84000-0x0000037202d85000         4K PTE RW NX
    0x0000037202d85000-0x0000037202d88000        12K PTE RO NX
    
    For 2G hugetlb pages, this results in freeing 8191 of 8192 vmemmap
    pages, saving 32764K of memory for each 2G hugetlb page (~1.6%).
    /sys/kernel/debug/kernel_page_tables will show the impact:
    
    ---[ vmemmap Area Start ]---
    [...]
    0x000003720a000000-0x000003720a001000         4K PTE RW NX
    0x000003720a001000-0x000003720c000000     32764K PTE RO NX
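
    The figures above can be reproduced with a little arithmetic. The sketch
    below assumes a 4K base page and a 64-byte struct page (the struct page
    size is not stated in this commit; it is inferred from "4 vmemmap pages"
    per 1M hugetlb page). The function name is hypothetical.

```python
PAGE_SIZE = 4096        # base page size in bytes
STRUCT_PAGE_SIZE = 64   # ASSUMPTION: sizeof(struct page), inferred from the text


def vmemmap_savings(hugepage_bytes):
    """Return (total_vmemmap_pages, freed_pages, saved_kib, saved_percent)."""
    base_pages = hugepage_bytes // PAGE_SIZE
    vmemmap_bytes = base_pages * STRUCT_PAGE_SIZE
    total = vmemmap_bytes // PAGE_SIZE
    # Only the vmemmap page holding the head struct page stays backed;
    # the remaining pages are freed and remapped read-only to it.
    freed = total - 1
    saved_kib = freed * PAGE_SIZE // 1024
    saved_percent = 100.0 * freed * PAGE_SIZE / hugepage_bytes
    return total, freed, saved_kib, saved_percent


for name, size in (("1M", 1 << 20), ("2G", 2 << 30)):
    total, freed, kib, pct = vmemmap_savings(size)
    print(f"{name}: free {freed} of {total} vmemmap pages, "
          f"{kib}K saved ({pct:.1f}%)")
```

    Under these assumptions the results match the numbers quoted above:
    3 of 4 pages (12K) for 1M and 8191 of 8192 pages (32764K) for 2G.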
    
    The memory savings come with some costs:
    - The vmemmap mapping for compound hugetlb pages is no longer a PMD
      mapping, but is split into 4K PTE mappings, and it will not be coalesced
      back into a PMD mapping after hugetlb pages are freed from the pool.
      Apart from the theoretical performance impact, this also (slightly)
      reduces the net memory savings because of the additional 2K PTE page
      table allocations.
    - Workloads using "on the fly" hugetlb allocations via
      "nr_overcommit_hugepages" instead of using the hugetlb pool via
      "nr_hugepages" will suffer from considerably increased fault handling
      time, see also the description in commit 78f39084
      ("mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl").
    - Freeing hugetlb pages from the pool will require re-allocation of the
      freed struct pages, and therefore needs some memory available to the
      kernel. This might fail in memory constrained scenarios.
    - For the same reason, memory offline might fail even for ZONE_MOVABLE
      when hugetlb pages are present (but not for s390, since we do not
      support ARCH_ENABLE_HUGEPAGE_MIGRATION, and therefore cannot have
      hugetlb pages in ZONE_MOVABLE).
    - General increased complexity and overhead in kernel handling of
      compound (head) pages.
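
    To give a rough sense of how much the 2K PTE page tables from the first
    cost above eat into the savings, here is a worst-case estimate. It is a
    sketch only: it assumes a 64-byte struct page, a 1M PMD (segment) mapping
    size, and that every 1M vmemmap segment touched by a hugetlb page needs
    its own 2K PTE table that is not shared with any other hugetlb page. The
    function name is hypothetical.

```python
PAGE_SIZE = 4096         # base page size in bytes
STRUCT_PAGE_SIZE = 64    # ASSUMPTION: sizeof(struct page)
SEGMENT_SIZE = 1 << 20   # ASSUMPTION: range covered by one PMD mapping
PTE_TABLE_SIZE = 2048    # 2K PTE page table, as stated in the text


def net_savings_kib(hugepage_bytes):
    """Return (freed_kib, pte_table_kib, net_kib), worst case without sharing."""
    vmemmap_bytes = hugepage_bytes // PAGE_SIZE * STRUCT_PAGE_SIZE
    freed_bytes = vmemmap_bytes - PAGE_SIZE  # all but the head vmemmap page
    # Ceiling division: one PTE table per (partially) covered segment.
    segments = -(-vmemmap_bytes // SEGMENT_SIZE)
    table_bytes = segments * PTE_TABLE_SIZE
    return (freed_bytes // 1024,
            table_bytes // 1024,
            (freed_bytes - table_bytes) // 1024)


for name, size in (("1M", 1 << 20), ("2G", 2 << 30)):
    freed, tables, net = net_savings_kib(size)
    print(f"{name}: {freed}K freed - {tables}K PTE tables = {net}K net")
```

    For 1M hugetlb pages this is pessimistic, because the vmemmap ranges of
    several neighbouring hugetlb pages can fall into the same segment and
    then share a single PTE table.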
    
    Therefore, this feature is disabled by default and has to be enabled
    explicitly, either by adding the "hugetlb_free_vmemmap=on" kernel
    parameter, or at runtime via the "/proc/sys/vm/hugetlb_optimize_vmemmap"
    sysctl.
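
    As a small illustration of the runtime switch, the helpers below read and
    write the sysctl file named above. This is a sketch, not part of the
    commit: the helper names are hypothetical, and it assumes the file holds
    "0" (off) or "1" (on). Writing to the real file requires root.

```python
from pathlib import Path

# Path as given in the text above.
SYSCTL_PATH = "/proc/sys/vm/hugetlb_optimize_vmemmap"


def is_enabled(path=SYSCTL_PATH):
    """Return True if the sysctl file reads as a nonzero integer."""
    return int(Path(path).read_text().strip()) != 0


def set_enabled(enabled, path=SYSCTL_PATH):
    """Write 1 or 0 to the sysctl file (ASSUMPTION: values are 0 and 1)."""
    Path(path).write_text("1\n" if enabled else "0\n")
```

    Calling set_enabled(True) before growing the pool via "nr_hugepages"
    would be the intended order in this sketch, since the text describes the
    optimization as applying to hugetlb pages in the pool.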
    
    Acked-by: Heiko Carstens <hca@linux.ibm.com>
    Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
    Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>