• Waiman Long's avatar
    mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init · 3c0c12cc
    Waiman Long authored
    When CONFIG_KASAN is enabled on large memory SMP systems, the deferrred
    pages initialization can take a long time.  Below were the reported init
    times on a 8-socket 96-core 4TB IvyBridge system.
    
      1) Non-debug kernel without CONFIG_KASAN
         [    8.764222] node 1 initialised, 132086516 pages in 7027ms
    
      2) Debug kernel with CONFIG_KASAN
         [  146.288115] node 1 initialised, 132075466 pages in 143052ms
    
    So the page init time in a debug kernel was 20X of the non-debug kernel.
    The long init time can be problematic as the page initialization is done
    with interrupt disabled.  In this particular case, it caused the
    appearance of following warning messages as well as NMI backtraces of all
    the cores that were doing the initialization.
    
    [   68.240049] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
    [   68.241000] rcu: 	25-...0: (100 ticks this GP) idle=b72/1/0x4000000000000000 softirq=915/915 fqs=16252
    [   68.241000] rcu: 	44-...0: (95 ticks this GP) idle=49a/1/0x4000000000000000 softirq=788/788 fqs=16253
    [   68.241000] rcu: 	54-...0: (104 ticks this GP) idle=03a/1/0x4000000000000000 softirq=721/825 fqs=16253
    [   68.241000] rcu: 	60-...0: (103 ticks this GP) idle=cbe/1/0x4000000000000000 softirq=637/740 fqs=16253
    [   68.241000] rcu: 	72-...0: (105 ticks this GP) idle=786/1/0x4000000000000000 softirq=536/641 fqs=16253
    [   68.241000] rcu: 	84-...0: (99 ticks this GP) idle=292/1/0x4000000000000000 softirq=537/537 fqs=16253
    [   68.241000] rcu: 	111-...0: (104 ticks this GP) idle=bde/1/0x4000000000000000 softirq=474/476 fqs=16253
    [   68.241000] rcu: 	(detected by 13, t=65018 jiffies, g=249, q=2)
    
    The long init time was mainly caused by the call to kasan_free_pages() to
    poison the newly initialized pages.  On a 4TB system, we are talking about
    almost 500GB of memory probably on the same node.
    
    In reality, we may not need to poison the newly initialized pages before
    they are ever allocated.  So KASAN poisoning of freed pages before the
    completion of deferred memory initialization is now disabled.  Those pages
    will be properly poisoned when they are allocated or freed after deferred
    pages initialization is done.
    
    With this change, the new page initialization time became:
    
    [   21.948010] node 1 initialised, 132075466 pages in 18702ms
    
    This was still about double the non-debug kernel time, but was much
    better than before.
    
    Link: http://lkml.kernel.org/r/1544459388-8736-1-git-send-email-longman@redhat.comSigned-off-by: default avatarWaiman Long <longman@redhat.com>
    Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    3c0c12cc
page_alloc.c 230 KB