    mm/memory_hotplug: introduce add_memory_driver_managed() · 7b7b2721
    David Hildenbrand authored
    Patch series "mm/memory_hotplug: Interface to add driver-managed system
    ram", v4.
    
    kexec (via kexec_load()) currently cannot properly handle memory added
    via dax/kmem, and will have similar issues with virtio-mem.  kexec-tools
    currently adds all memory to the fixed-up initial firmware memmap.  In
    case of dax/kmem, this means that - in contrast to a proper reboot - how
    that persistent memory will be used can no longer be configured by the
    kexec'd kernel.  In case of virtio-mem it will be harmful, because that
    memory might contain inaccessible pieces that require coordination with
    the hypervisor first.
    
    In both cases, we want to let the driver in the kexec'd kernel handle
    detecting and adding the memory, like during an ordinary reboot.
    Introduce add_memory_driver_managed().  More on the semantics is in patch
    #1.
    
    In the future, we might want to make this behavior configurable for
    dax/kmem - either by configuring it in the kernel (which would then also
    allow configuring kexec_file_load()) or in kexec-tools by also adding
    "System RAM (kmem)" memory from /proc/iomem to the fixed-up initial
    firmware memmap.
    
    More on the motivation can be found in [1] and [2].
    
    [1] https://lkml.kernel.org/r/20200429160803.109056-1-david@redhat.com
    [2] https://lkml.kernel.org/r/20200430102908.10107-1-david@redhat.com
    
    This patch (of 3):
    
    Some device drivers rely on the memory they manage not getting added to
    the initial (firmware) memmap as System RAM - so it's not used as initial
    System RAM by the kernel and the driver stays in control.  While this is
    the case during cold boot and after a reboot, kexec is not aware of that
    and might add such memory to the initial (firmware) memmap of the kexec
    kernel.  We need ways to teach the kernel and user space that this System
    RAM is different.
    
    For example, dax/kmem allows deciding at runtime whether persistent memory
    is to be used as System RAM.  Another future user is virtio-mem, which has
    to coordinate with its hypervisor to deal with inaccessible parts within
    memory resources.
    
    We want to let users in the kernel (esp. kexec) but also user space
    (esp. kexec-tools) know that this memory has different semantics and
    needs to be handled differently:
    1. Don't create entries in /sys/firmware/memmap/
    2. Name the memory resource "System RAM ($DRIVER)" (exposed via
       /proc/iomem) ($DRIVER might be "kmem", "virtio_mem").
    3. Flag the memory resource IORESOURCE_MEM_DRIVER_MANAGED
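
As an illustration of rule 2 from the user-space side, here is a minimal sketch of how a tool could tell driver-managed memory apart from ordinary System RAM when reading /proc/iomem. The helper name iomem_line_driver_managed() is hypothetical and is not taken from kexec-tools; it only assumes the usual "start-end : name" line layout of /proc/iomem.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical user-space helper (an assumption, not kexec-tools code):
 * inspect one /proc/iomem line and report whether its resource name has
 * the form "System RAM ($DRIVER)".  On success, the driver name (e.g.
 * "kmem") is copied into 'driver', which holds 'len' bytes.
 */
static bool iomem_line_driver_managed(const char *line, char *driver,
                                      size_t len)
{
    static const char prefix[] = "System RAM (";
    const char *name = strstr(line, " : ");
    const char *end;
    size_t n;

    if (!name)
        return false;
    name += 3;                          /* skip " : " */

    /* Plain "System RAM" and unrelated resources do not match. */
    if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
        return false;
    name += sizeof(prefix) - 1;

    end = strchr(name, ')');
    if (!end || end == name)            /* missing ')' or empty driver */
        return false;

    n = (size_t)(end - name);
    if (n >= len)                       /* driver name would not fit */
        return false;

    memcpy(driver, name, n);
    driver[n] = '\0';
    return true;
}
```

A caller would run this over each line of /proc/iomem and leave matching ranges out of the memmap it builds for the kexec kernel.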
    
    /sys/firmware/memmap/ [1] represents the "raw firmware-provided memory
    map" because "on most architectures that firmware-provided memory map is
    modified afterwards by the kernel itself".  The primary user is kexec on
    x86-64.  Since commit d96ae530 ("memory-hotplug: create
    /sys/firmware/memmap entry for new memory"), we add all hotplugged memory
    to that firmware memmap - which makes perfect sense for traditional memory
    hotplug on x86-64, where real HW will also add hotplugged DIMMs to the
    firmware memmap.  We replicate what the "raw firmware-provided memory map"
    looks like after hot(un)plug.
    
    To keep things simple, let the user provide the full resource name instead
    of only the driver name - this way, we don't have to manually
    allocate/craft strings for memory resources.  Also use the resource name
    to make decisions, to avoid passing additional flags.  In case the name
    isn't "System RAM", it's special.
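
The name-based decision described above can be sketched in plain C as follows. Both helper names are hypothetical and the checks are an assumption about what such logic could look like, not a copy of the code in memory_hotplug.c.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: only memory named exactly "System RAM" gets an
 * entry in the firmware memmap; any other name is treated as special.
 */
static bool wants_firmware_memmap_entry(const char *resource_name)
{
    return strcmp(resource_name, "System RAM") == 0;
}

/*
 * Hypothetical helper: accept only names of the form
 * "System RAM ($DRIVER)" with a non-empty $DRIVER part.
 */
static bool valid_driver_managed_name(const char *resource_name)
{
    static const char prefix[] = "System RAM (";
    size_t len;

    if (!resource_name)
        return false;

    len = strlen(resource_name);
    /* prefix (12 chars) + at least one driver char + ')' */
    if (len < sizeof(prefix) + 1)
        return false;
    if (strncmp(resource_name, prefix, sizeof(prefix) - 1) != 0)
        return false;
    return resource_name[len - 1] == ')';
}
```

Keying the decision on the name means the caller passes a single string and no extra flag argument, at the cost of a string comparison on the add path.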
    
    We don't have to worry about firmware_map_remove() on the removal path.
    If there is no entry, it will simply return with -EINVAL.
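
To illustrate why the removal path needs no special casing, here is a toy user-space model of a firmware map. The fw_map_add() and fw_map_remove() names and the fixed-size table are assumptions made for this sketch; only the "no entry means -EINVAL" behavior is taken from the text above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a firmware map entry; not the kernel's data structure. */
struct fw_entry {
    uint64_t start;
    uint64_t end;
    int used;
};

#define FW_MAX 8
static struct fw_entry fw_map[FW_MAX];

/* Record a range; returns 0 on success, -ENOMEM if the table is full. */
static int fw_map_add(uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < FW_MAX; i++) {
        if (!fw_map[i].used) {
            fw_map[i] = (struct fw_entry){ start, end, 1 };
            return 0;
        }
    }
    return -ENOMEM;
}

/*
 * Drop a range; returns -EINVAL when no matching entry exists, which is
 * what happens for driver-managed memory that was never added.
 */
static int fw_map_remove(uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < FW_MAX; i++) {
        if (fw_map[i].used && fw_map[i].start == start &&
            fw_map[i].end == end) {
            fw_map[i].used = 0;
            return 0;
        }
    }
    return -EINVAL;
}
```

In this model, removing a range that was never added is harmless: the caller gets -EINVAL back and nothing else changes.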
    
    We'll adapt dax/kmem in a follow-up patch.
    
    [1] https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-memmap
    
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
    Cc: Wei Yang <richard.weiyang@gmail.com>
    Cc: Baoquan He <bhe@redhat.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Eric Biederman <ebiederm@xmission.com>
    Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Link: http://lkml.kernel.org/r/20200508084217.9160-1-david@redhat.com
    Link: http://lkml.kernel.org/r/20200508084217.9160-3-david@redhat.com
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memory_hotplug.c 46.7 KB