    mm, page_alloc: fix core hung in free_pcppages_bulk() · 88e8ac11
    Charan Teja Reddy authored
    The following race is observed when memory blocks of the movable zone are
    repeatedly onlined and offlined, with a delay between two successive
    online operations.
    
    P1						P2
    
    Online the first memory block in
    the movable zone. The pcp struct
    values are initialized to default
    values, i.e., pcp->high = 0 &
    pcp->batch = 1.
    
    					Allocate the pages from the
    					movable zone.
    
    Try to online the second memory
    block in the movable zone; it has
    entered online_pages() but has yet
    to call zone_pcp_update().
    					This process enters the exit
    					path and thus tries to release
    					the order-0 pages to the pcp
    					lists through
    					free_unref_page_commit().
    					As pcp->high = 0, pcp->count = 1,
    					it proceeds to call
    					free_pcppages_bulk().
    Update the pcp values; the new
    pcp values are, say,
    pcp->high = 378, pcp->batch = 63.
    					Read the pcp's batch value using
    					READ_ONCE() and pass it to
    					free_pcppages_bulk(); the values
    					seen here are batch = 63,
    					pcp->count = 1.
    
    					Since the number of pages in the
    					pcp lists is less than ->batch,
    					it gets stuck in the
    					while (list_empty(list)) loop
    					with interrupts disabled, and
    					the core hangs.
    
    Avoid this by ensuring free_pcppages_bulk() is called with proper count of
    pcp list pages.
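
    As a rough illustration of the idea, the following userspace sketch
    models the pcp lists as plain counters and clamps the requested count to
    the number of pages actually present before walking the lists. It is a
    simplified model under stated assumptions, not the kernel code: struct
    pcp_model, free_bulk_model(), min_int() and NR_LISTS are hypothetical
    names introduced here, and the real loop in free_pcppages_bulk() is more
    involved.

```c
#include <assert.h>

#define NR_LISTS 3	/* hypothetical stand-in for the number of pcp lists */

/* Toy model: each "list" is only a page counter (assumption, not kernel code). */
struct pcp_model {
	int count;		/* total pages across all lists */
	int lists[NR_LISTS];	/* pages on each list */
};

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Free up to @count pages round-robin across the lists and return the
 * number of pages freed. The caller must keep pcp->count equal to the
 * sum of pcp->lists[].
 */
static int free_bulk_model(struct pcp_model *pcp, int count)
{
	int freed = 0;
	int idx = 0;

	assert(pcp->count >= 0);

	/*
	 * Clamp the request to what is really on the lists. Without this,
	 * a stale count (e.g. 63 when only 1 page is present) would leave
	 * the inner loop below searching empty lists forever.
	 */
	count = min_int(pcp->count, count);

	while (count > 0) {
		/* Terminates: count > 0 implies some list is non-empty. */
		while (pcp->lists[idx] == 0)
			idx = (idx + 1) % NR_LISTS;

		pcp->lists[idx]--;
		pcp->count--;
		count--;
		freed++;
		idx = (idx + 1) % NR_LISTS;
	}

	return freed;
}
```

    With one page on the lists and a request of 63 (the values from the race
    above), the model frees that single page and returns instead of spinning.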
    
    The mentioned race is somewhat easily reproducible without [1] because
    the pcps are not updated for the first memory block online, and thus
    there is a large enough race window for P2 between alloc+free and the
    pcp struct values update through onlining of the second memory block.
    
    With [1], the race still exists but it is very narrow as we update the pcp
    struct values for the first memory block online itself.
    
    This is not limited to the movable zone; it could also happen in cases
    with the normal zone (e.g., hotplug to a node that only has DMA memory, or
    no other memory yet).
    
    [1]: https://patchwork.kernel.org/patch/11696389/
    
    Fixes: 5f8dcc21 ("page-allocator: split per-cpu list into one-list-per-migrate-type")
    Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Acked-by: David Hildenbrand <david@redhat.com>
    Acked-by: David Rientjes <rientjes@google.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Vinayak Menon <vinmenon@codeaurora.org>
    Cc: <stable@vger.kernel.org> [2.6+]
    Link: http://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>