    powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations · 0bd4b96d
    David Hildenbrand authored
    Let's use alloc_contig_pages() for allocating memory and remove the
    linear mapping manually via arch_remove_linear_mapping(). Mark all pages
    PG_offline, such that they will definitely not get touched - e.g.,
    when hibernating. When freeing memory, try to revert what we did.
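
The ordering described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `model_page`, `model_init()`, `model_alloc()` and `model_free()` are hypothetical names that do not exist in the kernel, and the array merely stands in for physical memory. The comments name the kernel steps each line is meant to mirror (contiguous allocation, clearing while still mapped, PG_offline, linear-mapping removal); the real code operates on struct page and PFNs instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names exist in the kernel. */
#define MODEL_NR_PAGES 64

struct model_page {
	bool allocated;     /* owned by the caller of the modelled allocator */
	bool offline;       /* stands in for PG_offline */
	bool mapped;        /* present in the modelled linear mapping */
	unsigned char data; /* one byte of "page content" */
};

static struct model_page model_mem[MODEL_NR_PAGES];

/* Start state: every page is free, mapped and holds stale data. */
void model_init(void)
{
	for (size_t i = 0; i < MODEL_NR_PAGES; i++)
		model_mem[i] = (struct model_page){ false, false, true, 0xff };
}

/*
 * Find a free range of nr_pages pages aligned to nr_pages and prepare it
 * as trace memory. Returns the first page index, or -1 on failure.
 */
long model_alloc(size_t nr_pages)
{
	if (nr_pages == 0 || nr_pages > MODEL_NR_PAGES)
		return -1;

	for (size_t start = 0; start + nr_pages <= MODEL_NR_PAGES; start += nr_pages) {
		bool range_free = true;

		for (size_t i = start; i < start + nr_pages; i++)
			if (model_mem[i].allocated)
				range_free = false;
		if (!range_free)
			continue;

		for (size_t i = start; i < start + nr_pages; i++) {
			model_mem[i].allocated = true; /* contiguous allocation */
			model_mem[i].data = 0;         /* clear while still mapped */
			model_mem[i].offline = true;   /* nobody else may touch it */
			model_mem[i].mapped = false;   /* drop the linear mapping */
		}
		return (long)start;
	}
	return -1;
}

/* Revert model_alloc() in reverse order: remap, clear offline, free. */
int model_free(long start, size_t nr_pages)
{
	if (start < 0 || (size_t)start + nr_pages > MODEL_NR_PAGES)
		return -1;

	for (size_t i = (size_t)start; i < (size_t)start + nr_pages; i++) {
		model_mem[i].mapped = true;
		model_mem[i].offline = false;
		model_mem[i].allocated = false;
	}
	return 0;
}
```

In this model, a request fails as soon as no aligned free range of the requested size exists, and freeing restores the mapping before the pages are handed back.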
    
    The original idea was discussed in:
     https://lkml.kernel.org/r/48340e96-7e6b-736f-9e23-d3111b915b6e@redhat.com
    
    This is similar to CONFIG_DEBUG_PAGEALLOC handling on other
    architectures, whereby only single pages are unmapped from the linear
    mapping. Let's mimic what memory hot(un)plug would do with the linear
    mapping.
    
    We now need MEMORY_HOTPLUG and CONTIG_ALLOC as dependencies. Add a TODO
    that we want to use __GFP_ZERO for clearing once alloc_contig_pages()
    understands that.
    
    Tested in QEMU/TCG with 10 GiB of main memory:
      [root@localhost ~]# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
      [  105.903043][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
      [root@localhost ~]# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
      [  145.042493][ T1080] radix-mmu: Mapped 0x0000000080000000-0x00000000c0000000 with 64.0 KiB pages
      [  145.049019][ T1080] memtrace: Freed trace memory back on node 0
      [  145.333960][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
      [root@localhost ~]# echo 0x80000000 > /sys/kernel/debug/powerpc/memtrace/enable
      [  213.606916][ T1080] radix-mmu: Mapped 0x0000000080000000-0x00000000c0000000 with 64.0 KiB pages
      [  213.613855][ T1080] memtrace: Freed trace memory back on node 0
      [  214.185094][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
      [root@localhost ~]# echo 0x100000000 > /sys/kernel/debug/powerpc/memtrace/enable
      [  234.874872][ T1080] radix-mmu: Mapped 0x0000000080000000-0x0000000100000000 with 64.0 KiB pages
      [  234.886974][ T1080] memtrace: Freed trace memory back on node 0
      [  234.890153][ T1080] memtrace: Failed to allocate trace memory on node 0
      [root@localhost ~]# echo 0x40000000 > /sys/kernel/debug/powerpc/memtrace/enable
      [  259.490196][ T1080] memtrace: Allocated trace memory on node 0 at 0x0000000080000000
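
To relate the sizes written to the debugfs file to the ranges printed by radix-mmu, the arithmetic can be checked with a short userspace sketch. The helper names `range_bytes()` and `range_pages()` are made up for illustration; the addresses and the 64 KiB page size are taken from the log above.

```c
#include <stdint.h>

/* Size in bytes of the half-open physical range [start, end). */
uint64_t range_bytes(uint64_t start, uint64_t end)
{
	return end - start;
}

/* Number of pages of page_size bytes needed to cover [start, end). */
uint64_t range_pages(uint64_t start, uint64_t end, uint64_t page_size)
{
	return (end - start) / page_size;
}
```

For example, `range_bytes(0x80000000, 0xc0000000)` yields 0x40000000 (1 GiB), which matches the first request, and the 2 GiB request corresponds to the range 0x80000000-0x100000000.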
    
    I also made sure allocated memory is properly zeroed.
    
    Note 1: We currently won't be allocating from ZONE_MOVABLE - because our
    	pages are not movable. However, as we don't run with any memory
    	hot(un)plug mechanism around, we could make an exception to
    	increase the chance of allocations succeeding.
    
    Note 2: PG_reserved isn't sufficient. E.g., kernel_page_present() used
    	along with PG_reserved in hibernation code will always return
    	"true" on powerpc, resulting in the pages getting touched.
    	PG_reserved is also too generic - e.g., it indicates boot
    	allocations.
    
    Note 3: For now, we keep using memory_block_size_bytes() as minimum
    	granularity.
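
A minimal userspace sketch of the granularity check implied by Note 3 follows. It assumes the block size is a power of two; `size_is_block_aligned()` is a hypothetical helper, and the 256 MiB block size in the usage note is only an example value, not something stated in this commit. In the kernel, the block size would come from memory_block_size_bytes().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if size is a multiple of block_bytes.
 * block_bytes must be a non-zero power of two (assumption of this sketch).
 */
bool size_is_block_aligned(uint64_t size, uint64_t block_bytes)
{
	if (block_bytes == 0 || (block_bytes & (block_bytes - 1)) != 0)
		return false;

	return (size & (block_bytes - 1)) == 0;
}
```

With an assumed block size of 0x10000000 (256 MiB), a request of 0x40000000 would pass the check, while 0x48000000 would be rejected.
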
    Suggested-by: Michal Hocko <mhocko@kernel.org>
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Oscar Salvador <osalvador@suse.de>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20201111145322.15793-9-david@redhat.com