    mm, hotplug: fix concurrent memory hot-add deadlock · 30467e0b
    David Rientjes authored
    There's a deadlock when concurrently hot-adding memory through the probe
    interface and switching a memory block from offline to online.
    
    When hot-adding memory via the probe interface, add_memory() first takes
    mem_hotplug_begin() and then device_lock() is later taken when registering
    the newly initialized memory block.  This creates a lock dependency of (1)
    mem_hotplug.lock (2) dev->mutex.
    
    When switching a memory block from offline to online, the write(2) first
    grabs dev->mutex in device_online(), and online_pages() then takes
    mem_hotplug_begin().
    
    This creates a lock inversion between mem_hotplug.lock and dev->mutex.
    Vitaly reports that this deadlock can happen when a kworker handling a probe
    event races with systemd-udevd switching a memory block's state.
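
The inversion described above can be modeled in user space. The sketch below is not kernel code: `order_tracker`, `run_path()` and `paths_invert()` are hypothetical names invented for this illustration, and the two enum values merely stand in for mem_hotplug.lock and dev->mutex. It records which lock was held while another was acquired and flags the case where two paths nest the same pair in opposite orders.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-ins for the two locks named in the commit message. */
enum lock_class { MEM_HOTPLUG_LOCK, DEV_MUTEX, NR_LOCK_CLASSES };

/* Hypothetical, single-context order tracker (not a kernel API). */
struct order_tracker {
	bool held[NR_LOCK_CLASSES];
	/* before[a][b] is true once b was acquired while a was held */
	bool before[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
	bool inversion;
};

static void tracker_acquire(struct order_tracker *t, enum lock_class c)
{
	assert(c < NR_LOCK_CLASSES);
	for (int h = 0; h < NR_LOCK_CLASSES; h++) {
		if (!t->held[h] || h == (int)c)
			continue;
		/* The reverse nesting was seen earlier: AB-BA inversion. */
		if (t->before[c][h])
			t->inversion = true;
		t->before[h][c] = true;
	}
	t->held[c] = true;
}

static void tracker_release(struct order_tracker *t, enum lock_class c)
{
	t->held[c] = false;
}

/* One code path that takes 'outer', then 'inner', then drops both. */
static void run_path(struct order_tracker *t,
		     enum lock_class outer, enum lock_class inner)
{
	tracker_acquire(t, outer);
	tracker_acquire(t, inner);
	tracker_release(t, inner);
	tracker_release(t, outer);
}

/* Returns true if path A and path B nest the two locks in opposite orders. */
bool paths_invert(enum lock_class a_outer, enum lock_class a_inner,
		  enum lock_class b_outer, enum lock_class b_inner)
{
	struct order_tracker t;

	memset(&t, 0, sizeof(t));
	run_path(&t, a_outer, a_inner);	/* e.g. hot-add via probe */
	run_path(&t, b_outer, b_inner);	/* e.g. offline -> online */
	return t.inversion;
}
```

Calling `paths_invert(MEM_HOTPLUG_LOCK, DEV_MUTEX, DEV_MUTEX, MEM_HOTPLUG_LOCK)` models the probe path followed by the pre-patch online path and reports an inversion; passing the same order for both paths does not.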
    
    This patch requires the state transition to take mem_hotplug_begin()
    before dev->mutex.  Hot-adding memory via the probe interface creates a
    memory block while holding mem_hotplug_begin(), so there is no way to take
    dev->mutex first in this case.
    
    online_pages() and offline_pages() are only called when transitioning
    memory block state.  We now require that mem_hotplug_begin() is taken
    before calling them -- this requires exporting mem_hotplug_begin() and
    mem_hotplug_done() to generic code.  In all hot-add and hot-remove cases,
    mem_hotplug_begin() is done prior to device_online().  This is all that is
    needed to avoid the deadlock.
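
As a rough user-space illustration of the ordering rule this patch establishes, the sketch below runs two threads that both take the hotplug lock before the device lock. It is a simplification under stated assumptions: plain pthread mutexes stand in for mem_hotplug.lock and dev->mutex, and `probe_hot_add()`, `online_block()`, `run_model()` and `ITERATIONS` are hypothetical names, not the kernel's functions. Because both paths agree on the order, neither thread can hold one lock while waiting forever on the other.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ITERATIONS 10000	/* arbitrary count chosen for this sketch */

/* Plain mutexes standing in for mem_hotplug.lock and dev->mutex. */
static pthread_mutex_t mem_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t dev_mutex = PTHREAD_MUTEX_INITIALIZER;

static int blocks_registered;	/* protected by both locks */
static int blocks_onlined;	/* protected by both locks */

/* Models hot-add via probe: hotplug lock first, then device lock. */
static void *probe_hot_add(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&mem_hotplug_lock);
		pthread_mutex_lock(&dev_mutex);
		blocks_registered++;
		pthread_mutex_unlock(&dev_mutex);
		pthread_mutex_unlock(&mem_hotplug_lock);
	}
	return NULL;
}

/* Models the patched offline -> online transition: same order as above. */
static void *online_block(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&mem_hotplug_lock);
		pthread_mutex_lock(&dev_mutex);
		blocks_onlined++;
		pthread_mutex_unlock(&dev_mutex);
		pthread_mutex_unlock(&mem_hotplug_lock);
	}
	return NULL;
}

/* Runs both paths concurrently; returns the total number of operations. */
int run_model(void)
{
	pthread_t probe, online;
	int err;

	blocks_registered = 0;
	blocks_onlined = 0;

	err = pthread_create(&probe, NULL, probe_hot_add, NULL);
	assert(err == 0);
	err = pthread_create(&online, NULL, online_block, NULL);
	assert(err == 0);

	pthread_join(probe, NULL);
	pthread_join(online, NULL);
	return blocks_registered + blocks_onlined;
}
```

Swapping the two `pthread_mutex_lock()` calls in `online_block()` would reproduce the pre-patch ordering, and the program could then hang intermittently, which is the race the commit describes.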
    
    Signed-off-by: David Rientjes <rientjes@google.com>
    Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
    Cc: "K. Y. Srinivasan" <kys@microsoft.com>
    Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
    Cc: Tang Chen <tangchen@cn.fujitsu.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
    Cc: Vladimir Davydov <vdavydov@parallels.com>
    Cc: Wang Nan <wangnan0@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>