• Rafael Aquini's avatar
    mm: avoid reinserting isolated balloon pages into LRU lists · 117aad1e
    Rafael Aquini authored
    Isolated balloon pages can wrongly end up in LRU lists when
    migrate_pages() finishes its round without draining all the isolated
    page list.
    
    The same issue can happen when reclaim_clean_pages_from_list() tries to
    reclaim pages from an isolated page list, before migration, in the CMA
    path.  Such balloon page leak opens a race window against LRU lists
    shrinkers that leads us to the following kernel panic:
    
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
      IP: [<ffffffff810c2625>] shrink_page_list+0x24e/0x897
      PGD 3cda2067 PUD 3d713067 PMD 0
      Oops: 0000 [#1] SMP
      CPU: 0 PID: 340 Comm: kswapd0 Not tainted 3.12.0-rc1-22626-g4367597 #87
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      RIP: shrink_page_list+0x24e/0x897
      RSP: 0000:ffff88003da499b8  EFLAGS: 00010286
      RAX: 0000000000000000 RBX: ffff88003e82bd60 RCX: 00000000000657d5
      RDX: 0000000000000000 RSI: 000000000000031f RDI: ffff88003e82bd40
      RBP: ffff88003da49ab0 R08: 0000000000000001 R09: 0000000081121a45
      R10: ffffffff81121a45 R11: ffff88003c4a9a28 R12: ffff88003e82bd40
      R13: ffff88003da0e800 R14: 0000000000000001 R15: ffff88003da49d58
      FS:  0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000067d9000 CR3: 000000003ace5000 CR4: 00000000000407b0
      Call Trace:
        shrink_inactive_list+0x240/0x3de
        shrink_lruvec+0x3e0/0x566
        __shrink_zone+0x94/0x178
        shrink_zone+0x3a/0x82
        balance_pgdat+0x32a/0x4c2
        kswapd+0x2f0/0x372
        kthread+0xa2/0xaa
        ret_from_fork+0x7c/0xb0
      Code: 80 7d 8f 01 48 83 95 68 ff ff ff 00 4c 89 e7 e8 5a 7b 00 00 48 85 c0 49 89 c5 75 08 80 7d 8f 00 74 3e eb 31 48 8b 80 18 01 00 00 <48> 8b 74 0d 48 8b 78 30 be 02 00 00 00 ff d2 eb
      RIP  [<ffffffff810c2625>] shrink_page_list+0x24e/0x897
       RSP <ffff88003da499b8>
      CR2: 0000000000000028
      ---[ end trace 703d2451af6ffbfd ]---
      Kernel panic - not syncing: Fatal exception
    
    This patch fixes the issue, by assuring the proper tests are made at
    putback_movable_pages() & reclaim_clean_pages_from_list() to avoid
    isolated balloon pages being wrongly reinserted in LRU lists.
    
    [akpm@linux-foundation.org: clarify awkward comment text]
    Signed-off-by: default avatarRafael Aquini <aquini@redhat.com>
    Reported-by: default avatarLuiz Capitulino <lcapitulino@redhat.com>
    Tested-by: default avatarLuiz Capitulino <lcapitulino@redhat.com>
    Cc: Mel Gorman <mel@csn.ul.ie>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    117aad1e
vmscan.c 107 KB