    mm: teach pfn_to_online_page() about ZONE_DEVICE section collisions · 1f90a347
    Dan Williams authored
    While pfn_to_online_page() is able to determine pfn_valid() at subsection
    granularity, it is not able to reliably determine if a given pfn is also
    online if the section mixes ZONE_{NORMAL,MOVABLE} with ZONE_DEVICE.
    This means that pfn_to_online_page() may return invalid @page objects.
    For example with a memory map like:
    
    100000000-1fbffffff : System RAM
      142000000-143002e16 : Kernel code
      143200000-143713fff : Kernel rodata
      143800000-143b15b7f : Kernel data
      144227000-144ffffff : Kernel bss
    1fc000000-2fbffffff : Persistent Memory (legacy)
      1fc000000-2fbffffff : namespace0.0
    
    This command:
    
    echo 0x1fc000000 > /sys/devices/system/memory/soft_offline_page
    
    ...succeeds when it should fail.  When it succeeds it touches an
    uninitialized page and may crash or cause other damage (see
    dissolve_free_huge_page()).
    
    While the memory map above is contrived via the memmap=ss!nn kernel
    command line option, the collision happens in practice on shipping
    platforms.  The memory controller resources that decode spans of physical
    address space are a limited resource.  One technique platform-firmware
    uses to conserve those resources is to share a decoder across 2 devices to
    keep the address range contiguous.  Unfortunately the unit of operation of
    a decoder is 64MiB while the Linux section size is 128MiB.  This results
    in situations where, without subsection hotplug, memory mappings with
    different lifetimes collide into one object that can only express one
    lifetime.
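The arithmetic behind the collision in the example memory map can be checked with a small userspace sketch. The helper names (section_nr(), aligned_to()) and the macros are hypothetical and are not kernel APIs; the 64MiB and 128MiB sizes are the ones stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sizes taken from the text above: decoder granularity vs. Linux section size. */
#define DECODER_SIZE (UINT64_C(64) << 20)   /* 64MiB */
#define SECTION_SIZE (UINT64_C(128) << 20)  /* 128MiB */

/* Hypothetical helper: index of the 128MiB section containing a physical address. */
static uint64_t section_nr(uint64_t phys)
{
	return phys / SECTION_SIZE;
}

/* Hypothetical helper: true if phys is a multiple of size. */
static bool aligned_to(uint64_t phys, uint64_t size)
{
	return (phys % size) == 0;
}
```

With these helpers, the last byte of System RAM (0x1fbffffff) and the first byte of Persistent Memory (0x1fc000000) map to the same section number, and the boundary 0x1fc000000 is aligned to the decoder size but not to the section size.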
    
    Update move_pfn_range_to_zone() to flag (SECTION_TAINT_ZONE_DEVICE) a
    section that mixes ZONE_DEVICE pfns with other online pfns.  With
    SECTION_TAINT_ZONE_DEVICE to delineate, pfn_to_online_page() can fall back
    to a slow-path check for ZONE_DEVICE pfns in an online section.  In the
    fast path online_section() for a full ZONE_DEVICE section returns false.
    
    Because the collision case is rare, and for simplicity, the
    SECTION_TAINT_ZONE_DEVICE flag is never cleared once set.
    
    [dan.j.williams@intel.com: fix CONFIG_ZONE_DEVICE=n build]
      Link: https://lkml.kernel.org/r/CAPcyv4iX+7LAgAeSqx7Zw-Zd=ZV9gBv8Bo7oTbwCOOqJoZ3+Yg@mail.gmail.com
    
    Link: https://lkml.kernel.org/r/161058500675.1840162.7887862152161279354.stgit@dwillia2-desk3.amr.corp.intel.com
    Fixes: ba72b4c8 ("mm/sparsemem: support sub-section hotplug")
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    Reported-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Reported-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Oscar Salvador <osalvador@suse.de>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: Qian Cai <cai@lca.pw>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>