Commit abc40bd2 authored by Mel Gorman, committed by Linus Torvalds

mm: numa: Do not mark PTEs pte_numa when splitting huge pages

This patch reverts 1ba6e0b5 ("mm: numa: split_huge_page: transfer the
NUMA type from the pmd to the pte"). If a huge page is being split due
to a protection change and the tail will be in a PROT_NONE VMA then NUMA
hinting PTEs are temporarily created in the protected VMA.

 VM_RW|VM_PROTNONE
|-----------------|
      ^
      split here

In the specific case above, it should get fixed up by change_pte_range()
but there is a window of opportunity for weirdness to happen. Similarly,
if a huge page is shrunk and split during a protection update but before
pmd_numa is cleared then a pte_numa can be left behind.

Instead of adding complexity trying to deal with these cases, this patch
does not mark PTEs NUMA when splitting a huge page. The cost is that NUMA
hinting faults will not be triggered for the split pages, which is marginal
compared with the complexity of handling the corner cases during THP split.

Cc: stable@vger.kernel.org
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent d3cb8bf6
@@ -1795,14 +1795,17 @@ static int __split_huge_page_map(struct page *page,
 		for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
 			pte_t *pte, entry;
 			BUG_ON(PageCompound(page+i));
+			/*
+			 * Note that pmd_numa is not transferred deliberately
+			 * to avoid any possibility that pte_numa leaks to
+			 * a PROT_NONE VMA by accident.
+			 */
 			entry = mk_pte(page + i, vma->vm_page_prot);
 			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 			if (!pmd_write(*pmd))
 				entry = pte_wrprotect(entry);
 			if (!pmd_young(*pmd))
 				entry = pte_mkold(entry);
-			if (pmd_numa(*pmd))
-				entry = pte_mknuma(entry);
 			pte = pte_offset_map(&_pmd, haddr);
 			BUG_ON(!pte_none(*pte));
 			set_pte_at(mm, haddr, pte, entry);
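The effect of the hunk can be illustrated with a small userspace model. This is only a sketch: the `toy_` names and flag values are hypothetical, do not match any real page-table encoding, and assume that a freshly built entry starts out young and dirty. It shows the write and young state being inherited from the huge-page entry while the NUMA hint is deliberately left behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy model; NOT the kernel's real encodings. */
#define TOY_WRITE 0x1u
#define TOY_YOUNG 0x2u
#define TOY_DIRTY 0x4u
#define TOY_NUMA  0x8u

typedef uint32_t toy_pmd_t;	/* stands in for a huge-page (pmd) entry */
typedef uint32_t toy_pte_t;	/* stands in for a small-page (pte) entry */

/* Derive the flags of one small-page entry from the huge-page entry. */
static toy_pte_t toy_split_entry(toy_pmd_t pmd, bool vma_writable)
{
	/* Assumption: a new entry starts young and dirty (mk_pte + pte_mkdirty). */
	toy_pte_t entry = TOY_YOUNG | TOY_DIRTY;

	if (vma_writable)		/* analogue of maybe_mkwrite() */
		entry |= TOY_WRITE;
	if (!(pmd & TOY_WRITE))		/* analogue of pte_wrprotect() */
		entry &= ~TOY_WRITE;
	if (!(pmd & TOY_YOUNG))		/* analogue of pte_mkold() */
		entry &= ~TOY_YOUNG;

	/* TOY_NUMA is deliberately not copied from the pmd. */
	return entry;
}

/* Fill n small-page entries, as the loop over HPAGE_PMD_NR does. */
static void toy_split(toy_pmd_t pmd, bool vma_writable,
		      toy_pte_t *ptes, size_t n)
{
	assert(ptes != NULL);
	for (size_t i = 0; i < n; i++)
		ptes[i] = toy_split_entry(pmd, vma_writable);
}
```

In this model, calling `toy_split()` with a pmd value that has `TOY_NUMA` set yields entries with the NUMA bit clear, which mirrors the patch's intent that no `pte_numa` entry can leak into a PROT_NONE VMA.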