    fs/proc: task_mmu.c: don't read mapcount for migration entry · 24d7275c
    Yang Shi authored
    syzbot reported the following BUG:
    
      kernel BUG at include/linux/page-flags.h:785!
      invalid opcode: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 4392 Comm: syz-executor560 Not tainted 5.16.0-rc6-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
      RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
      Call Trace:
        page_mapcount include/linux/mm.h:837 [inline]
        smaps_account+0x470/0xb10 fs/proc/task_mmu.c:466
        smaps_pte_entry fs/proc/task_mmu.c:538 [inline]
        smaps_pte_range+0x611/0x1250 fs/proc/task_mmu.c:601
        walk_pmd_range mm/pagewalk.c:128 [inline]
        walk_pud_range mm/pagewalk.c:205 [inline]
        walk_p4d_range mm/pagewalk.c:240 [inline]
        walk_pgd_range mm/pagewalk.c:277 [inline]
        __walk_page_range+0xe23/0x1ea0 mm/pagewalk.c:379
        walk_page_vma+0x277/0x350 mm/pagewalk.c:530
        smap_gather_stats.part.0+0x148/0x260 fs/proc/task_mmu.c:768
        smap_gather_stats fs/proc/task_mmu.c:741 [inline]
        show_smap+0xc6/0x440 fs/proc/task_mmu.c:822
        seq_read_iter+0xbb0/0x1240 fs/seq_file.c:272
        seq_read+0x3e0/0x5b0 fs/seq_file.c:162
        vfs_read+0x1b5/0x600 fs/read_write.c:479
        ksys_read+0x12d/0x250 fs/read_write.c:619
        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
        do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
        entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    The reproducer was reading /proc/$PID/smaps while calling MADV_FREE
    at the same time.  MADV_FREE may split a THP if it is called on only
    part of one, which may trigger the following race:
    
               CPU A                         CPU B
               -----                         -----
      smaps walk:                      MADV_FREE:
      page_mapcount()
        PageCompound()
                                       split_huge_page()
        page = compound_head(page)
        PageDoubleMap(page)
    
    By the time PageDoubleMap() is called, the page is no longer a tail
    page of a THP, so the BUG is triggered.
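
    The interleaving above can be replayed deterministically in user
    space.  The following is only a sketch under stated assumptions: the
    model_* names and struct model_page are hypothetical stand-ins, not
    kernel code, and the final check merely models the head-page
    assertion that PageDoubleMap() is assumed to perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified page model; not the kernel's struct page. */
struct model_page {
	bool is_head;            /* head page of a compound page */
	struct model_page *head; /* non-NULL while this is a tail page */
};

/* Models PageCompound(): true for head and tail pages. */
static bool model_page_compound(const struct model_page *p)
{
	return p->is_head || p->head != NULL;
}

/* Models compound_head(): a non-tail page is its own head. */
static struct model_page *model_compound_head(struct model_page *p)
{
	return p->head ? p->head : p;
}

/* Models split_huge_page(): every subpage becomes an ordinary page. */
static void model_split(struct model_page *pages, size_t n)
{
	assert(pages != NULL);
	for (size_t i = 0; i < n; i++) {
		pages[i].is_head = false;
		pages[i].head = NULL;
	}
}

/* Returns true if the modelled head-only assertion would fire. */
static bool model_race_hits_bug(bool split_in_window)
{
	struct model_page thp[4];

	thp[0].is_head = true;
	thp[0].head = NULL;
	for (size_t i = 1; i < 4; i++) {
		thp[i].is_head = false;
		thp[i].head = &thp[0];
	}

	struct model_page *page = &thp[1];  /* CPU A looks at a tail page */

	if (!model_page_compound(page))     /* CPU A: PageCompound() */
		return false;
	if (split_in_window)                /* CPU B: split_huge_page() */
		model_split(thp, 4);
	page = model_compound_head(page);   /* CPU A: compound_head() */
	return !page->is_head;              /* CPU A: PageDoubleMap() check */
}
```

    In this model, model_race_hits_bug(true) reports that the check
    fires, while model_race_hits_bug(false) does not, because without
    the split compound_head() still resolves to a real head page.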
    
    This could be fixed by elevating the refcount of the page before
    reading the mapcount, but that would prevent it from counting
    migration entries, and it seems like overkill because the race can
    only happen when the PMD is split, so all PTE entries of the tail
    pages are actually migration entries, and smaps_account() already
    treats migration entries as mapcount == 1, as Kirill pointed out.
    
    Add a new parameter to smaps_account() to indicate that the entry is
    a migration entry, and skip calling page_mapcount() in that case.
    Don't skip getting the mapcount for device private entries, since
    they do track references with mapcount.
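
    The shape of this change can be illustrated with a small user-space
    sketch.  It is not the actual patch: the sketch_* names, struct
    sketch_page and struct sketch_stats are hypothetical, and the
    shared/private split is a simplified assumption about how the
    mapcount is consumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the field needed here. */
struct sketch_page {
	int mapcount;
};

/* Hypothetical stand-in for the smaps statistics. */
struct sketch_stats {
	unsigned long private_bytes;
	unsigned long shared_bytes;
};

/* Stand-in for page_mapcount(); the racy read in the real kernel. */
static int sketch_page_mapcount(const struct sketch_page *page)
{
	return page->mapcount;
}

/*
 * The caller says whether the entry is a migration entry.  If it is,
 * the mapcount is never read and the entry is accounted as if
 * mapcount == 1.
 */
static void sketch_smaps_account(struct sketch_stats *st,
				 const struct sketch_page *page,
				 unsigned long size, bool migration)
{
	assert(st != NULL && page != NULL);

	int mapcount = migration ? 1 : sketch_page_mapcount(page);

	if (mapcount >= 2)
		st->shared_bytes += size;
	else
		st->private_bytes += size;
}
```

    With a page whose mapcount is 5, passing migration == true accounts
    the range as private without touching the mapcount, while passing
    false accounts it as shared.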
    
    Pagemap has a similar issue, although it was not reported.  Fix it
    as well.
    
    [shy828301@gmail.com: v4]
      Link: https://lkml.kernel.org/r/20220203182641.824731-1-shy828301@gmail.com
    [nathan@kernel.org: avoid unused variable warning in pagemap_pmd_range()]
      Link: https://lkml.kernel.org/r/20220207171049.1102239-1-nathan@kernel.org
    Link: https://lkml.kernel.org/r/20220120202805.3369-1-shy828301@gmail.com
    Fixes: e9b61f19 ("thp: reintroduce split_huge_page()")
    Signed-off-by: Yang Shi <shy828301@gmail.com>
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Reported-by: syzbot+1f52b3a18d5633fa7f82@syzkaller.appspotmail.com
    Acked-by: David Hildenbrand <david@redhat.com>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Alexey Dobriyan <adobriyan@gmail.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>