• Naoya Horiguchi's avatar
    mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON · 6da6b1d4
    Naoya Horiguchi authored
    After a memory error happens on a clean folio, a process unexpectedly
    receives SIGBUS when it accesses the error page.  This SIGBUS killing is
    pointless and simply degrades the level of RAS of the system, because the
    clean folio can be dropped without any data lost on memory error handling
    as we do for a clean pagecache.
    
    When memory_failure() is called on a clean folio, try_to_unmap() is called
    twice (one from split_huge_page() and one from hwpoison_user_mappings()). 
    The root cause of the issue is that pte conversion to hwpoisoned entry is
    now done in the first call of try_to_unmap() because PageHWPoison is
    already set at this point, while it's actually expected to be done in the
    second call.  This behavior disturbs the error handling operation like
    removing pagecache, which results in the malfunction described above.
    
    So convert TTU_IGNORE_HWPOISON into TTU_HWPOISON and set TTU_HWPOISON only
    when we really intend to convert pte to hwpoison entry.  This can prevent
    other callers of try_to_unmap() from accidentally converting to hwpoison
    entries.
    
    Link: https://lkml.kernel.org/r/20230221085905.1465385-1-naoya.horiguchi@linux.dev
    Fixes: a42634a6 ("readahead: Use a folio in read_pages()")
    Signed-off-by: default avatarNaoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    6da6b1d4
memory-failure.c 70 KB