    powerpc/book3s64/radix: Fix boot failure with large amount of guest memory · 103a8542
    Aneesh Kumar K.V authored
    If the hypervisor doesn't support hugepages, the kernel ends up allocating a large
    number of page table pages. The early page table allocation was wrongly
    setting the max memblock limit to ppc64_rma_size with radix translation
    which resulted in boot failure as shown below.
    
    Kernel panic - not syncing:
    early_alloc_pgtable: Failed to allocate 16777216 bytes align=0x1000000 nid=-1 from=0x0000000000000000 max_addr=0xffffffffffffffff
     CPU: 0 PID: 0 Comm: swapper Not tainted 5.8.0-24.9-default+ #2
     Call Trace:
     [c0000000016f3d00] [c0000000007c6470] dump_stack+0xc4/0x114 (unreliable)
     [c0000000016f3d40] [c00000000014c78c] panic+0x164/0x418
     [c0000000016f3dd0] [c000000000098890] early_alloc_pgtable+0xe0/0xec
     [c0000000016f3e60] [c0000000010a5440] radix__early_init_mmu+0x360/0x4b4
     [c0000000016f3ef0] [c000000001099bac] early_init_mmu+0x1c/0x3c
     [c0000000016f3f10] [c00000000109a320] early_setup+0x134/0x170
    
    This happened because the kernel checked for the radix feature before the
    feature was enabled via mmu_features. As a result, the kernel applied hash
    restrictions on radix.
    
    Rework the early init code such that the kernel boots with memblock restrictions
    as imposed by hash. At that point, the kernel still hasn't finalized the
    translation it will end up using.
    
    We have three different ways of detecting radix.
    
    1. dt_cpu_ftrs_scan -> used only in case of PowerNV
    2. ibm,pa-features -> Used when we don't use dt_cpu_ftrs_scan
    3. CAS -> Where we negotiate with hypervisor about the supported translation.
    
    We look at 1 or 2 early in the boot and after that, we look at the CAS vector to
    finalize the translation the kernel will use. We also support a kernel command
    line option (disable_radix) to switch to hash.
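
The detection order described above can be sketched as a small user-space model. This is an illustration only, not kernel code: the struct, its field names, and both functions are hypothetical, and treating disable_radix and a failed CAS negotiation as simple overrides of the early result is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inputs; none of these names come from the kernel source. */
struct radix_inputs {
	bool used_dt_cpu_ftrs;   /* method 1 was used (PowerNV only) */
	bool dt_cpu_ftrs_radix;  /* radix reported by dt_cpu_ftrs_scan */
	bool pa_features_radix;  /* radix reported by ibm,pa-features */
	bool cas_negotiated;     /* method 3: a CAS negotiation took place */
	bool cas_radix;          /* hypervisor agreed to radix during CAS */
	bool disable_radix;      /* kernel command line option */
};

/* Early hint: method 1 if it was used, otherwise method 2. */
bool model_early_radix_hint(const struct radix_inputs *in)
{
	assert(in != NULL);
	return in->used_dt_cpu_ftrs ? in->dt_cpu_ftrs_radix
				    : in->pa_features_radix;
}

/* Final decision: the early hint, narrowed by the command line and by CAS. */
bool model_final_radix(const struct radix_inputs *in)
{
	bool radix = model_early_radix_hint(in);

	if (in->disable_radix)
		radix = false;          /* user asked for hash */
	if (in->cas_negotiated && !in->cas_radix)
		radix = false;          /* hypervisor did not grant radix */
	return radix;
}
```

The point of the model is that the early hint alone is not the final answer, so any decision that depends on the translation mode has to wait until the last step.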
    
    Update the memblock limit after mmu_early_init_devtree() if the kernel is going
    to use radix translation. This forces some of the memblock allocations we do before
    mmu_early_init_devtree() to be within the RMA limit.
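
A minimal user-space sketch of this ordering follows. It is not the patch itself: struct boot_model, MODEL_ALLOC_ANYWHERE, and the model_* functions are hypothetical stand-ins, and the bump allocator only approximates how memblock honours its current limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ALLOC_ANYWHERE UINT64_MAX /* stand-in for "no limit" */

struct boot_model {
	uint64_t rma_size;       /* stand-in for ppc64_rma_size */
	uint64_t memblock_limit; /* upper bound for early allocations */
	uint64_t next_free;      /* cursor of a trivial bump allocator */
};

/* Before the translation is known, boot with the hash (RMA) restriction. */
void model_early_setup(struct boot_model *b, uint64_t rma_size)
{
	assert(rma_size > 0);
	b->rma_size = rma_size;
	b->memblock_limit = rma_size;
	b->next_free = 0;
}

/* Once the translation is final, lift the limit only if radix was chosen. */
void model_mmu_early_init_devtree(struct boot_model *b, bool radix)
{
	if (radix)
		b->memblock_limit = MODEL_ALLOC_ANYWHERE;
}

/* Returns true if size bytes still fit below the current limit. */
bool model_early_alloc(struct boot_model *b, uint64_t size)
{
	if (size > b->memblock_limit - b->next_free)
		return false;
	b->next_free += size;
	return true;
}
```

In this model, an allocation made before model_mmu_early_init_devtree() is confined to the RMA, while a large page table allocation made afterwards on radix succeeds because the limit has been lifted.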
    
    Fixes: 2bfd65e4 ("powerpc/mm/radix: Add radix callbacks for early init routines")
    Reported-by: Shirisha Ganta <shiganta@in.ibm.com>
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20200828100852.426575-1-aneesh.kumar@linux.ibm.com