• Hugh Dickins's avatar
    swap: fix shmem swapping when more than 8 areas · 78ac34ad
    Hugh Dickins authored
    commit 9b15b817 upstream.
    
    Minchan Kim reports that when a system has many swap areas, and tmpfs
    swaps out to the ninth or more, shmem_getpage_gfp()'s attempts to read
    back the page cannot locate it, and the read fails with -ENOMEM.
    
    Whoops.  Yes, I blindly followed read_swap_header()'s pte_to_swp_entry(
    swp_entry_to_pte()) technique for determining maximum usable swap
    offset, without stopping to realize that that actually depends upon the
    pte swap encoding shifting swap offset to the higher bits and truncating
    it there.  Whereas our radix_tree swap encoding leaves offset in the
    lower bits: it's swap "type" (that is, index of swap area) that was
    truncated.
    
    Fix it by reducing the SWP_TYPE_SHIFT() in swapops.h, and removing the
    broken radix_to_swp_entry(swp_to_radix_entry()) from read_swap_header().
    
    This does not reduce the usable size of a swap area any further, it
    leaves it as claimed when making the original commit: no change from 3.0
    on x86_64, nor on i386 without PAE; but 3.0's 512GB is reduced to 128GB
    per swapfile on i386 with PAE.  It's not a change I would have risked
    five years ago, but with x86_64 supported for ten years, I believe it's
    appropriate now.
    
    Hmm, and what if some architecture implements its swap pte with offset
    encoded below type? That would equally break the maximum usable swap
    offset check.  Happily, they all follow the same tradition of encoding
    offset above type, but I'll prepare a check on that for next.
    Reported-and-Reviewed-and-Tested-by: default avatarMinchan Kim <minchan@kernel.org>
    Signed-off-by: default avatarHugh Dickins <hughd@google.com>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    78ac34ad
swapfile.c 63.6 KB