    mm/hwpoison: fix unpoison_memory() · bf181c58
    Naoya Horiguchi authored
    After the recent soft-offline rework, error pages can be taken off the
    buddy allocator, but the existing unpoison_memory() does not properly
    undo that operation.  Moreover, due to the recent change to
    __get_hwpoison_page(), get_page_unless_zero() is hardly ever called for
    hwpoisoned pages.  So __get_hwpoison_page() is highly likely to return
    -EBUSY (meaning it failed to grab a page refcount), and unpoison just
    clears PG_hwpoison without releasing a refcount.  That does not lead to
    a critical issue like a kernel panic, but unpoisoned pages never get
    back to buddy (they are leaked permanently), which is not good.
    
    To (partially) fix this, we need to distinguish "taken off" pages from
    other types of hwpoisoned pages.  We can't use the refcount or page
    flags for this purpose, so a pseudo flag is defined by hacking the
    ->private field.  Someone might think that put_page() is enough to
    return taken-off pages to buddy, but the normal free path contains some
    operations not suitable for the current purpose, and can fire
    VM_BUG_ON().
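
The pseudo-flag idea can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct toy_page`, `TOY_MAGIC_TAKEN_OFF`, and the `toy_*` helpers are hypothetical names invented for this example, and the handling of other kinds of hwpoisoned pages is deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; NOT the kernel's layout. */
struct toy_page {
    bool hwpoison;              /* models PG_hwpoison */
    int refcount;               /* models the page refcount */
    bool in_buddy;              /* true while the free list owns the page */
    unsigned long private_data; /* models the ->private field */
};

/* Assumed marker value, chosen only for illustration. */
#define TOY_MAGIC_TAKEN_OFF 0x48575053UL

/* Model of taking an error page off the free list and marking it. */
static void toy_take_off_buddy(struct toy_page *p)
{
    p->in_buddy = false;
    p->hwpoison = true;
    p->refcount = 1;                        /* the page is now held */
    p->private_data = TOY_MAGIC_TAKEN_OFF;  /* set the pseudo flag */
}

/* A page counts as "taken off" only if it is poisoned AND carries the marker. */
static bool toy_is_taken_off(const struct toy_page *p)
{
    return p->hwpoison && p->private_data == TOY_MAGIC_TAKEN_OFF;
}

/*
 * Model of unpoison: returns true if the page went back to the free list.
 * Taken-off pages are put back directly instead of going through a
 * generic "drop one reference" path.
 */
static bool toy_unpoison(struct toy_page *p)
{
    if (!p->hwpoison)
        return false;

    if (toy_is_taken_off(p)) {
        p->private_data = 0;    /* clear the pseudo flag */
        p->refcount = 0;
        p->in_buddy = true;     /* hand the page back */
        p->hwpoison = false;
        return true;
    }

    /* Other kinds of hwpoisoned pages need their own handling (not modeled). */
    p->hwpoison = false;
    return false;
}
```

In this model, a page that went through `toy_take_off_buddy()` is recognized by `toy_is_taken_off()` and returned to the free list by `toy_unpoison()`, while a poisoned page without the marker is not. The marker lives in a field separate from the refcount and the flag bits, which is why it can tell the two cases apart.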
    
    Note that unpoison_memory() is now supposed to cancel only hwpoison
    events injected by madvise() or
    /sys/devices/system/memory/{hard,soft}_offline_page, not by MCE
    injection, so please don't try to use unpoison when testing with MCE
    injection.
    
    [lkp@intel.com: report build failure for ARCH=i386]
    
    Link: https://lkml.kernel.org/r/20211115084006.3728254-4-naoya.horiguchi@linux.dev
    Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Reviewed-by: Yang Shi <shy828301@gmail.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Ding Hui <dinghui@sangfor.com.cn>
    Cc: Tony Luck <tony.luck@intel.com>
    Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Peter Xu <peterx@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>