Commit 3d0b95cd authored by Baolin Wang, committed by Andrew Morton

mm: hugetlb: considering PMD sharing when flushing cache/TLBs

This patchset fixes some cache flushing issues when PMD sharing is possible
for hugetlb pages, which were found by code inspection.  Meanwhile, Mike
found that flush_cache_page() cannot cover the whole size of a hugetlb
page on some architectures [1], so I added a new patch 3 to fix this
issue, since after some investigation I found that only try_to_unmap_one()
and try_to_migrate_one() need fixing.

[1] https://lore.kernel.org/linux-mm/064da3bb-5b4b-7332-a722-c5a541128705@oracle.com/


This patch (of 3):

When moving hugetlb page tables, the cache flushing is done in
move_page_tables() without considering shared PMDs, which may cause
cache issues on some architectures.

Thus we should move the hugetlb cache flushing into
move_hugetlb_page_tables(), taking into account the shared-PMD range
calculated by adjust_range_if_pmd_sharing_possible().  Also expand the
TLB flushing range in case of shared PMDs.

Note this was discovered via code inspection and has not caused a real
problem in practice so far.
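
For illustration, here is a minimal userspace sketch (not kernel code) of
why the flush range must be widened: shared hugetlb PMD page tables span
whole PUD_SIZE regions, so a cache/TLB flush that has to cover a shared
PMD must be rounded out to PUD-aligned boundaries.
widen_range_for_pmd_sharing() below is a hypothetical stand-in for the
kernel's adjust_range_if_pmd_sharing_possible(), which additionally checks
whether the VMA actually permits PMD sharing before widening; PUD_SIZE is
assumed to be 1 GiB as on x86-64 with 4K pages.

#include <stdio.h>

/* Assumption: 1 GiB PUD_SIZE (x86-64, 4K pages); architecture dependent. */
#define PUD_SIZE (1UL << 30)
#define PUD_MASK (~(PUD_SIZE - 1))

/*
 * Hypothetical stand-in for adjust_range_if_pmd_sharing_possible():
 * round [*start, *end) out to PUD_SIZE boundaries, the granularity at
 * which hugetlb PMD page tables can be shared between processes.
 */
static void widen_range_for_pmd_sharing(unsigned long *start,
					unsigned long *end)
{
	*start &= PUD_MASK;                          /* round start down */
	*end = (*end + PUD_SIZE - 1) & PUD_MASK;     /* round end up */
}

int main(void)
{
	/* A 4 MiB hugetlb range ... */
	unsigned long start = 0x40200000UL, end = 0x40600000UL;

	widen_range_for_pmd_sharing(&start, &end);
	/* ... is widened to the enclosing 1 GiB region: 0x40000000-0x80000000 */
	printf("flush range: 0x%lx - 0x%lx\n", start, end);
	return 0;
}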

Link: https://lkml.kernel.org/r/cover.1651056365.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/0443c8cf20db554d3ff4b439b30e0ff26c0181dd.1651056365.git.baolin.wang@linux.alibaba.com
Fixes: 550a7d60 ("mm, hugepages: add mremap() support for hugepage backed vma")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent 6366238b
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4922,10 +4922,17 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 	unsigned long old_addr_copy;
 	pte_t *src_pte, *dst_pte;
 	struct mmu_notifier_range range;
+	bool shared_pmd = false;
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, old_addr,
 				old_end);
 	adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
+	/*
+	 * In case of shared PMDs, we should cover the maximum possible
+	 * range.
+	 */
+	flush_cache_range(vma, range.start, range.end);
+
 	mmu_notifier_invalidate_range_start(&range);
 	/* Prevent race with file truncation */
 	i_mmap_lock_write(mapping);
@@ -4942,8 +4949,10 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 		 */
 		old_addr_copy = old_addr;
 
-		if (huge_pmd_unshare(mm, vma, &old_addr_copy, src_pte))
+		if (huge_pmd_unshare(mm, vma, &old_addr_copy, src_pte)) {
+			shared_pmd = true;
 			continue;
+		}
 
 		dst_pte = huge_pte_alloc(mm, new_vma, new_addr, sz);
 		if (!dst_pte)
@@ -4951,6 +4960,10 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 
 		move_huge_pte(vma, old_addr, new_addr, src_pte, dst_pte);
 	}
-	flush_tlb_range(vma, old_end - len, old_end);
+
+	if (shared_pmd)
+		flush_tlb_range(vma, range.start, range.end);
+	else
+		flush_tlb_range(vma, old_end - len, old_end);
 	mmu_notifier_invalidate_range_end(&range);
 	i_mmap_unlock_write(mapping);
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -490,12 +490,12 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 		return 0;
 
 	old_end = old_addr + len;
-	flush_cache_range(vma, old_addr, old_end);
 
 	if (is_vm_hugetlb_page(vma))
 		return move_hugetlb_page_tables(vma, new_vma, old_addr,
 						new_addr, len);
 
+	flush_cache_range(vma, old_addr, old_end);
 	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
 				old_addr, old_end);
 	mmu_notifier_invalidate_range_start(&range);