• Johannes Weiner's avatar
    mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page() · f84311d7
    Johannes Weiner authored
    commit 22f2ac51 upstream.
    
    Antonio reports the following crash when using fuse under memory pressure:
    
      kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: all of them
      CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu
      Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
      task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000
      RIP: shadow_lru_isolate+0x181/0x190
      Call Trace:
        __list_lru_walk_one.isra.3+0x8f/0x130
        list_lru_walk_one+0x23/0x30
        scan_shadow_nodes+0x34/0x50
        shrink_slab.part.40+0x1ed/0x3d0
        shrink_zone+0x2ca/0x2e0
        kswapd+0x51e/0x990
        kthread+0xd8/0xf0
        ret_from_fork+0x3f/0x70
    
    which corresponds to the following sanity check in the shadow node
    tracking:
    
      BUG_ON(node->count & RADIX_TREE_COUNT_MASK);
    
    The workingset code tracks radix tree nodes that exclusively contain
    shadow entries of evicted pages in them, and this (somewhat obscure)
    line checks whether there are real pages left that would interfere with
    reclaim of the radix tree node under memory pressure.
    
    While discussing ways how fuse might sneak pages into the radix tree
    past the workingset code, Miklos pointed to replace_page_cache_page(),
    and indeed there is a problem there: it properly accounts for the old
    page being removed - __delete_from_page_cache() does that - but then
    does a raw raw radix_tree_insert(), not accounting for the replacement
    page.  Eventually the page count bits in node->count underflow while
    leaving the node incorrectly linked to the shadow node LRU.
    
    To address this, make sure replace_page_cache_page() uses the tracked
    page insertion code, page_cache_tree_insert().  This fixes the page
    accounting and makes sure page-containing nodes are properly unlinked
    from the shadow node LRU again.
    
    Also, make the sanity checks a bit less obscure by using the helpers for
    checking the number of pages and shadows in a radix tree node.
    
    [mhocko@suse.com: backport for 4.4]
    Fixes: 449dd698 ("mm: keep page cache radix tree nodes in check")
    Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.orgSigned-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
    Reported-by: default avatarAntonio SJ Musumeci <trapexit@spawn.link>
    Debugged-by: default avatarMiklos Szeredi <miklos@szeredi.hu>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    f84311d7
filemap.c 72.1 KB