• Mike Kravetz's avatar
    hugetlbfs: fix hugetlb page migration/fault race causing SIGBUS · 4643d67e
    Mike Kravetz authored
    Li Wang discovered that LTP/move_page12 V2 sometimes triggers SIGBUS in
    the kernel-v5.2.3 testing.  This is caused by a race between hugetlb
    page migration and page fault.
    
    If a hugetlb page can not be allocated to satisfy a page fault, the task
    is sent SIGBUS.  This is normal hugetlbfs behavior.  A hugetlb fault
    mutex exists to prevent two tasks from trying to instantiate the same
    page.  This protects against the situation where there is only one
    hugetlb page, and both tasks would try to allocate.  Without the mutex,
    one would fail and SIGBUS even though the other fault would be
    successful.
    
    There is a similar race between hugetlb page migration and fault.
    Migration code will allocate a page for the target of the migration.  It
    will then unmap the original page from all page tables.  It does this
    unmap by first clearing the pte and then writing a migration entry.  The
    page table lock is held for the duration of this clear and write
    operation.  However, the beginnings of the hugetlb page fault code
    optimistically checks the pte without taking the page table lock.  If
    clear (as it can be during the migration unmap operation), a hugetlb
    page allocation is attempted to satisfy the fault.  Note that the page
    which will eventually satisfy this fault was already allocated by the
    migration code.  However, the allocation within the fault path could
    fail which would result in the task incorrectly being sent SIGBUS.
    
    Ideally, we could take the hugetlb fault mutex in the migration code
    when modifying the page tables.  However, locks must be taken in the
    order of hugetlb fault mutex, page lock, page table lock.  This would
    require significant rework of the migration code.  Instead, the issue is
    addressed in the hugetlb fault code.  After failing to allocate a huge
    page, take the page table lock and check for huge_pte_none before
    returning an error.  This is the same check that must be made further in
    the code even if page allocation is successful.
    
    Link: http://lkml.kernel.org/r/20190808000533.7701-1-mike.kravetz@oracle.com
    Fixes: 290408d4 ("hugetlb: hugepage migration core")
    Signed-off-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
    Reported-by: default avatarLi Wang <liwang@redhat.com>
    Tested-by: default avatarLi Wang <liwang@redhat.com>
    Reviewed-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Acked-by: default avatarMichal Hocko <mhocko@suse.com>
    Cc: Cyril Hrubis <chrubis@suse.cz>
    Cc: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    4643d67e
hugetlb.c 136 KB