• Luiz Capitulino's avatar
    hugetlb: prep_compound_gigantic_page(): drop __init marker · 2906dd52
    Luiz Capitulino authored
    The HugeTLB subsystem uses the buddy allocator to allocate hugepages
    during runtime.  This means that hugepages allocation during runtime is
    limited to MAX_ORDER order.  For archs supporting gigantic pages (that
    is, page sizes greater than MAX_ORDER), this in turn means that those
    pages can't be allocated at runtime.
    
    HugeTLB supports gigantic page allocation during boottime, via the boot
    allocator.  To this end the kernel provides the command-line options
    hugepagesz= and hugepages=, which can be used to instruct the kernel to
    allocate N gigantic pages during boot.
    
    For example, x86_64 supports 2M and 1G hugepages, but only 2M hugepages
    can be allocated and freed at runtime.  If one wants to allocate 1G
    gigantic pages, this has to be done at boot via the hugepagesz= and
    hugepages= command-line options.
    
    Now, gigantic page allocation at boottime has two serious problems:
    
     1. Boottime allocation is not NUMA aware. On a NUMA machine the kernel
        evenly distributes boottime allocated hugepages among nodes.
    
        For example, suppose you have a four-node NUMA machine and want
        to allocate four 1G gigantic pages at boottime. The kernel will
        allocate one gigantic page per node.
    
        On the other hand, we do have users who want to be able to specify
        which NUMA node gigantic pages should allocated from. So that they
        can place virtual machines on a specific NUMA node.
    
     2. Gigantic pages allocated at boottime can't be freed
    
    At this point it's important to observe that regular hugepages allocated
    at runtime don't have those problems.  This is so because HugeTLB
    interface for runtime allocation in sysfs supports NUMA and runtime
    allocated pages can be freed just fine via the buddy allocator.
    
    This series adds support for allocating gigantic pages at runtime.  It
    does so by allocating gigantic pages via CMA instead of the buddy
    allocator.  Releasing gigantic pages is also supported via CMA.  As this
    series builds on top of the existing HugeTLB interface, it makes gigantic
    page allocation and releasing just like regular sized hugepages.  This
    also means that NUMA support just works.
    
    For example, to allocate two 1G gigantic pages on node 1, one can do:
    
     # echo 2 > \
       /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
    
    And, to release all gigantic pages on the same node:
    
     # echo 0 > \
       /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
    
    Please, refer to patch 5/5 for full technical details.
    
    Finally, please note that this series is a follow up for a previous series
    that tried to extend the command-line options set to be NUMA aware:
    
     http://marc.info/?l=linux-mm&m=139593335312191&w=2
    
    During the discussion of that series it was agreed that having runtime
    allocation support for gigantic pages was a better solution.
    
    This patch (of 5):
    
    This function is going to be used by non-init code in a future
    commit.
    Signed-off-by: default avatarLuiz Capitulino <lcapitulino@redhat.com>
    Reviewed-by: default avatarDavidlohr Bueso <davidlohr@hp.com>
    Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Reviewed-by: default avatarZhang Yanfei <zhangyanfei@cn.fujitsu.com>
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Davidlohr Bueso <davidlohr@hp.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
    Cc: Yinghai Lu <yinghai@kernel.org>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    2906dd52
hugetlb.c 94.1 KB