Commit 6c21e066 authored by Jann Horn, committed by Linus Torvalds

mm/mempolicy: Take VMA lock before replacing policy

mbind() calls down into vma_replace_policy() without taking the per-VMA
locks, replaces the VMA's vma->vm_policy pointer, and frees the old
policy.  That's bad; a concurrent page fault might still be using the
old policy (in vma_alloc_folio()), resulting in use-after-free.

Normally this will manifest as a use-after-free read first, but it can
result in memory corruption, including because vma_alloc_folio() can
call mpol_cond_put() on the freed policy, which conditionally changes
the policy's refcount member.

This bug is specific to CONFIG_NUMA, but it does also affect non-NUMA
systems as long as the kernel was built with CONFIG_NUMA.
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Fixes: 5e31275c ("mm: add per-VMA lock and helper functions to control it")
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent 57012c57
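
The bug follows a classic pattern: a writer swaps a shared pointer and frees the old object while readers (here, page faults holding only the per-VMA lock) may still be dereferencing it. The following userspace analogue is only an illustrative sketch, not kernel code; the struct policy, fault_thread and mbind_thread names are invented for illustration, and a pthread rwlock stands in for the per-VMA lock. It shows why the writer must take the same lock the reader path uses before publishing the new pointer and freeing the old one.

/*
 * Userspace analogue of the mbind() vs. page-fault race: a reader uses a
 * shared "policy" object while a writer replaces and frees it.  Holding the
 * write lock across the swap (as vma_start_write() arranges for the per-VMA
 * lock in the kernel) is what makes the subsequent free safe.
 * All names here are illustrative, not kernel APIs.
 */
#include <pthread.h>
#include <stdlib.h>

struct policy { int mode; };

static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
static struct policy *cur_policy;

/* Reader path, analogous to a page fault consulting vma->vm_policy. */
static void *fault_thread(void *arg)
{
	for (int i = 0; i < 100000; i++) {
		pthread_rwlock_rdlock(&lock);
		/* Safe: the writer cannot swap and free cur_policy while we hold the lock. */
		volatile int mode = cur_policy->mode;
		(void)mode;
		pthread_rwlock_unlock(&lock);
	}
	return NULL;
}

/* Writer path, analogous to vma_replace_policy(): replace the pointer, free the old object. */
static void *mbind_thread(void *arg)
{
	for (int i = 0; i < 100000; i++) {
		struct policy *new = malloc(sizeof(*new));
		new->mode = i;
		pthread_rwlock_wrlock(&lock);	/* the step the buggy code skipped */
		struct policy *old = cur_policy;
		cur_policy = new;
		pthread_rwlock_unlock(&lock);
		free(old);			/* no reader can still hold "old" here */
	}
	return NULL;
}

int main(void)
{
	cur_policy = malloc(sizeof(*cur_policy));
	cur_policy->mode = 0;

	pthread_t r, w;
	pthread_create(&r, NULL, fault_thread, NULL);
	pthread_create(&w, NULL, mbind_thread, NULL);
	pthread_join(r, NULL);
	pthread_join(w, NULL);
	free(cur_policy);
	return 0;
}

Without the wrlock in the writer, this is exactly the use-after-free described above: the reader can be mid-dereference when the writer frees the old object.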
mm/mempolicy.c

@@ -384,8 +384,10 @@ void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new)
 	VMA_ITERATOR(vmi, mm, 0);
 
 	mmap_write_lock(mm);
-	for_each_vma(vmi, vma)
+	for_each_vma(vmi, vma) {
+		vma_start_write(vma);
 		mpol_rebind_policy(vma->vm_policy, new);
+	}
 	mmap_write_unlock(mm);
 }
@@ -768,6 +770,8 @@ static int vma_replace_policy(struct vm_area_struct *vma,
 	struct mempolicy *old;
 	struct mempolicy *new;
 
+	vma_assert_write_locked(vma);
+
 	pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
 		 vma->vm_start, vma->vm_end, vma->vm_pgoff,
 		 vma->vm_ops, vma->vm_file,
@@ -1313,6 +1317,14 @@ static long do_mbind(unsigned long start, unsigned long len,
 	if (err)
 		goto mpol_out;
 
+	/*
+	 * Lock the VMAs before scanning for pages to migrate, to ensure we don't
+	 * miss a concurrently inserted page.
+	 */
+	vma_iter_init(&vmi, mm, start);
+	for_each_vma_range(vmi, vma, end)
+		vma_start_write(vma);
+
 	ret = queue_pages_range(mm, start, end, nmask,
 			  flags | MPOL_MF_INVERT, &pagelist);
@@ -1538,6 +1550,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, len,
 			break;
 		}
 
+		vma_start_write(vma);
 		new->home_node = home_node;
 		err = mbind_range(&vmi, vma, &prev, start, end, new);
 		mpol_put(new);
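
For context, the path being fixed is exercised by a workload in which one thread repeatedly calls mbind() on a range while another thread faults pages in that range. Below is a hedged sketch of such a stress workload, not a reproducer taken from the commit; the mapping size, nodemask and iteration counts are arbitrary choices, and the MPOL_* values are the uapi constants from linux/mempolicy.h.

/*
 * Sketch of a workload that stresses the mbind() vs. page-fault path fixed
 * above: one thread keeps replacing the mapping's policy, another keeps
 * touching pages so faults run concurrently.  Not a reproducer from the
 * commit; sizes, nodemask and iteration counts are arbitrary.
 */
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#define MPOL_DEFAULT	0		/* uapi values from linux/mempolicy.h */
#define MPOL_BIND	2
#define LEN		(64UL << 20)	/* 64 MiB mapping */

static char *map;

static long mbind_call(int mode, unsigned long *mask, unsigned long maxnode)
{
	/* Raw syscall so the sketch does not depend on libnuma's <numaif.h>. */
	return syscall(__NR_mbind, map, LEN, mode, mask, maxnode, 0);
}

static void *rebinder(void *arg)
{
	unsigned long node0 = 1;	/* nodemask containing only node 0 */

	for (int i = 0; i < 1000; i++) {
		mbind_call(MPOL_BIND, &node0, 2);
		mbind_call(MPOL_DEFAULT, NULL, 0);
	}
	return NULL;
}

static void *faulter(void *arg)
{
	for (int i = 0; i < 1000; i++) {
		memset(map, i, LEN);			/* touch pages, triggering faults */
		madvise(map, LEN, MADV_DONTNEED);	/* drop them so they fault again */
	}
	return NULL;
}

int main(void)
{
	map = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	pthread_t a, b;
	pthread_create(&a, NULL, rebinder, NULL);
	pthread_create(&b, NULL, faulter, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}

With the patch applied, the mbind() side write-locks each VMA before queue_pages_range() and vma_replace_policy(), so a concurrent fault either completes against the old policy before the swap or retries and sees the new one.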