• Vlastimil Babka's avatar
    mm, mmap: fix vma_merge() case 7 with vma_ops->close · fc0c8f90
    Vlastimil Babka authored
    When debugging issues with a workload using SysV shmem, Michal Hocko has
    come up with a reproducer that shows how a series of mprotect() operations
    can result in an elevated shm_nattch and thus leak of the resource.
    
    The problem is caused by wrong assumptions in vma_merge() commit
    714965ca ("mm/mmap: start distinguishing if vma can be removed in
    mergeability test").  The shmem vmas have a vma_ops->close callback that
    decrements shm_nattch, and we remove the vma without calling it.
    
    vma_merge() has thus historically avoided merging vma's with
    vma_ops->close and commit 714965ca was supposed to keep it that way. 
    It relaxed the checks for vma_ops->close in can_vma_merge_after() assuming
    that it is never called on a vma that would be a candidate for removal. 
    However, the vma_merge() code does also use the result of this check in
    the decision to remove a different vma in the merge case 7.
    
    A robust solution would be to refactor vma_merge() code in a way that the
    vma_ops->close check is only done for vma's that are actually going to be
    removed, and not as part of the preliminary checks.  That would both solve
    the existing bug, and also allow additional merges that the checks
    currently prevent unnecessarily in some cases.
    
    However to fix the existing bug first with a minimized risk, and for
    easier stable backports, this patch only adds a vma_ops->close check to
    the buggy case 7 specifically.  All other cases of vma removal are covered
    by the can_vma_merge_before() check that includes the test for
    vma_ops->close.
    
    The reproducer code, adapted from Michal Hocko's code:
    
    int main(int argc, char *argv[]) {
      int segment_id;
      size_t segment_size = 20 * PAGE_SIZE;
      char * sh_mem;
      struct shmid_ds shmid_ds;
    
      key_t key = 0x1234;
      segment_id = shmget(key, segment_size,
                          IPC_CREAT | IPC_EXCL | S_IRUSR | S_IWUSR);
      sh_mem = (char *)shmat(segment_id, NULL, 0);
    
      mprotect(sh_mem + 2*PAGE_SIZE, PAGE_SIZE, PROT_NONE);
    
      mprotect(sh_mem + PAGE_SIZE, PAGE_SIZE, PROT_WRITE);
    
      mprotect(sh_mem + 2*PAGE_SIZE, PAGE_SIZE, PROT_WRITE);
    
      shmdt(sh_mem);
    
      shmctl(segment_id, IPC_STAT, &shmid_ds);
      printf("nattch after shmdt(): %lu (expected: 0)\n", shmid_ds.shm_nattch);
    
      if (shmctl(segment_id, IPC_RMID, 0))
              printf("IPCRM failed %d\n", errno);
      return (shmid_ds.shm_nattch) ? 1 : 0;
    }
    
    Link: https://lkml.kernel.org/r/20240222215930.14637-2-vbabka@suse.cz
    Fixes: 714965ca ("mm/mmap: start distinguishing if vma can be removed in mergeability test")
    Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
    Reported-by: default avatarMichal Hocko <mhocko@suse.com>
    Reviewed-by: default avatarLorenzo Stoakes <lstoakes@gmail.com>
    Reviewed-by: default avatarLiam R. Howlett <Liam.Howlett@oracle.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    fc0c8f90
mmap.c 106 KB