    fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages · 0daa322b
    David Hildenbrand authored
    Let's avoid reading:
    
    1) Offline memory sections: the content of offline memory sections is
       stale as the memory is effectively unused by the kernel.  On s390x with
       standby memory, offline memory sections (belonging to offline storage
       increments) are not accessible.  With virtio-mem and the hyper-v
       balloon, we can have unavailable memory chunks that should not be
       accessed inside offline memory sections.  Last but not least, offline
       memory sections might contain hwpoisoned pages which we can no longer
       identify because the memmap is stale.
    
    2) PG_offline pages: logically offline pages that are documented as
       "The content of these pages is effectively stale.  Such pages should
       not be touched (read/write/dump/save) except by their owner.".
       Examples include pages inflated in a balloon or unavailable memory
       ranges inside hotplugged memory sections with virtio-mem or the hyper-v
       balloon.
    
    3) PG_hwpoison pages: Reading pages marked as hwpoisoned can be fatal.
       As documented: "Accessing is not safe since it may cause another
       machine check.  Don't touch!"
    
    Introduce is_page_hwpoison(), adding a comment that it is inherently racy
    but the best we can really do.
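
    The three checks above, together with the new helper, can be modelled in
    userspace roughly as follows.  This is a minimal sketch, not the kernel
    code: struct model_page, its fields, and the model_*() functions are
    hypothetical stand-ins for the memmap, and the head-page lookup is an
    assumption about how a poisoned huge page would be covered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a memmap entry; this is NOT struct page. */
struct model_page {
    bool section_online;           /* the containing memory section is online */
    bool offline;                  /* models PG_offline */
    bool hwpoison;                 /* models PG_hwpoison */
    const struct model_page *head; /* head of a huge page, or NULL (assumption) */
};

/*
 * Models the idea behind is_page_hwpoison(): a page counts as poisoned if
 * it, or the head of the huge page it belongs to, is marked hwpoisoned.
 * Inherently racy in a real kernel: the flag can change right after the check.
 */
static bool model_is_page_hwpoison(const struct model_page *page)
{
    if (page->hwpoison)
        return true;
    return page->head != NULL && page->head->hwpoison;
}

/* Returns true only if none of the three exclusion rules applies. */
static bool model_page_readable(const struct model_page *page)
{
    if (page == NULL || !page->section_online)
        return false;              /* 1) offline memory section */
    if (page->offline)
        return false;              /* 2) logically offline page */
    return !model_is_page_hwpoison(page); /* 3) hwpoisoned page */
}
```

    The checks are ordered so that the flags are only consulted for pages in
    online sections, mirroring the point above that the memmap of an offline
    section is stale and cannot be trusted.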
    
    Reading /proc/kcore now performs similar checks as when reading
    /proc/vmcore for kdump via makedumpfile: problematic pages are excluded.
    It's also similar to hibernation code; however, we don't skip hwpoisoned
    pages when processing pages in kernel/power/snapshot.c:saveable_page()
    yet.
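
    What "excluded" can look like to a reader of the file is sketched below.
    This is a userspace illustration only: model_read_pages(), MODEL_PAGE_SIZE
    and the excluded[] array are hypothetical, and reporting skipped pages as
    zeros is an assumption that this message does not state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u /* hypothetical page size for the model */

/*
 * Copy npages pages from src to dst.  Pages flagged in excluded[] are never
 * read from src; their slot in dst is zero-filled instead (assumption).
 * Returns the number of pages that were skipped.
 */
static size_t model_read_pages(unsigned char *dst, const unsigned char *src,
                               const bool *excluded, size_t npages)
{
    size_t skipped = 0;

    for (size_t i = 0; i < npages; i++) {
        unsigned char *out = dst + i * MODEL_PAGE_SIZE;

        if (excluded[i]) {
            /* Do not touch the source page at all. */
            memset(out, 0, MODEL_PAGE_SIZE);
            skipped++;
        } else {
            memcpy(out, src + i * MODEL_PAGE_SIZE, MODEL_PAGE_SIZE);
        }
    }
    return skipped;
}
```

    Keeping every page slot in the output, rather than dropping skipped pages,
    leaves the offsets of all other pages unchanged for the reader.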
    
    Note 1: we can race against memory offlining code, especially memory going
    offline and getting unplugged: however, we will properly tear down the
    identity mapping and handle faults gracefully when accessing this memory
    from kcore code.
    
    Note 2: we can race against drivers setting PageOffline() and turning
    memory inaccessible in the hypervisor.  We'll handle this in a follow-up
    patch.
    
    Link: https://lkml.kernel.org/r/20210526093041.8800-4-david@redhat.com
    Signed-off-by: David Hildenbrand <david@redhat.com>
    Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
    Reviewed-by: Oscar Salvador <osalvador@suse.de>
    Cc: Aili Yao <yaoaili@kingsoft.com>
    Cc: Alexey Dobriyan <adobriyan@gmail.com>
    Cc: Alex Shi <alex.shi@linux.alibaba.com>
    Cc: Haiyang Zhang <haiyangz@microsoft.com>
    Cc: Jason Wang <jasowang@redhat.com>
    Cc: Jiri Bohac <jbohac@suse.cz>
    Cc: "K. Y. Srinivasan" <kys@microsoft.com>
    Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
    Cc: "Michael S. Tsirkin" <mst@redhat.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: Roman Gushchin <guro@fb.com>
    Cc: Stephen Hemminger <sthemmin@microsoft.com>
    Cc: Steven Price <steven.price@arm.com>
    Cc: Wei Liu <wei.liu@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>