    ARM: p2v: reduce p2v alignment requirement to 2 MiB · 9443076e
    Ard Biesheuvel authored
    The ARM kernel's linear map starts at PAGE_OFFSET, which maps to a
    physical address (PHYS_OFFSET) that is platform specific, and is
    discovered at boot. Since we don't want to slow down translations
    between physical and virtual addresses by keeping the offset in a
    variable in memory, we implement this by patching the code performing
    the translation, and putting the offset between PAGE_OFFSET and the
    start of physical RAM directly into the instruction opcodes.
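
    As a rough illustration of the arithmetic only, the translation can be
    sketched in plain C. Note that this sketch keeps the offset in a
    variable, which is exactly what the patching scheme avoids; the names
    pv_offset, set_phys_offset() and the sketch_* helpers are hypothetical,
    and the PAGE_OFFSET value assumes a 3G/1G split without LPAE.

```c
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u  /* assumption: 3G/1G split */

/* Hypothetical stand-in for the value baked into the patched opcodes. */
static uint32_t pv_offset;

/* Record the offset once PHYS_OFFSET has been discovered at boot. */
static void set_phys_offset(uint32_t phys_offset)
{
    /* Unsigned subtraction wraps modulo 2^32, which is intended here. */
    pv_offset = phys_offset - PAGE_OFFSET;
}

/* Linear-map translation: a single add or sub of the offset. */
static uint32_t sketch_virt_to_phys(uint32_t virt)
{
    return virt + pv_offset;
}

static uint32_t sketch_phys_to_virt(uint32_t phys)
{
    return phys - pv_offset;
}
```

    With RAM starting at 0x80000000, for example, the virtual address
    0xC0001000 would translate to the physical address 0x80001000.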
    
    As we only patch up to 8 bits of offset, yielding 4 GiB >> 8 == 16 MiB
    of granularity, we have to round up PHYS_OFFSET to the next multiple if
    the start of physical RAM is not a multiple of 16 MiB. This wastes some
    physical RAM, since the memory that was skipped will now live below
    PAGE_OFFSET, making it inaccessible to the kernel.
    
    We can improve this by changing the patchable sequences and the patching
    logic to carry more bits of offset: 11 bits gives us 4 GiB >> 11 == 2 MiB
    of granularity, and so we will never waste more than that amount by
    rounding up the physical start of DRAM to the next multiple of 2 MiB.
    (Note that 2 MiB granularity guarantees that the linear mapping can be
    created efficiently, whereas less than 2 MiB may result in the linear
    mapping needing another level of page tables.)
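
    The effect of the granularity on wasted memory can be sketched as
    follows. The helper names are hypothetical, and the RAM start address
    used in the note below is made up for illustration.

```c
#include <stdint.h>

#define SZ_2M  0x00200000u   /* 4 GiB >> 11 */
#define SZ_16M 0x01000000u   /* 4 GiB >> 8  */

/*
 * Round addr up to the next multiple of align. align must be a power of
 * two, and addr + align - 1 must not overflow 32 bits.
 */
static uint32_t round_up_pow2(uint32_t addr, uint32_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Bytes of RAM skipped when PHYS_OFFSET is rounded up to align. */
static uint32_t wasted_bytes(uint32_t ram_start, uint32_t align)
{
    return round_up_pow2(ram_start, align) - ram_start;
}
```

    For a hypothetical RAM start of 0x80208000, 16 MiB granularity skips
    0xDF8000 bytes (just under 14 MiB), whereas 2 MiB granularity skips
    0x1F8000 bytes (just under 2 MiB).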
    
    This helps Zhen Lei's scenario, where the start of DRAM is known to be
    occupied. It also helps EFI boot, which relies on the firmware's page
    allocator to allocate space for the decompressed kernel as low as
    possible. And if the KASLR patches ever land for 32-bit, it will give
    us 3 more bits of randomization of the placement of the kernel inside
    the linear region.
    
    For the ARM code path, it simply comes down to using two add/sub
    instructions instead of one for the carryless version, and patching
    each of them with the correct immediate depending on the rotation
    field. For the LPAE calculation, which has to deal with a carry, it
    patches the MOVW instruction with up to 12 bits of offset (but we only
    need 11 bits anyway).
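
    To illustrate why two add/sub instructions suffice in the carryless
    case: an ARM (A32) data-processing immediate is an 8-bit value rotated
    right by twice a 4-bit rotation field, so an 11-bit offset in bits
    31:21 can be split into one immediate covering bits 31:24 and one
    covering bits 23:21. The following is only a sketch of that encoding,
    not the actual patching code (which lives in assembly); the struct and
    function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the two 12-bit immediate fields. */
struct p2v_imm_pair {
    uint32_t hi;   /* rot:imm8 covering offset bits 31:24 */
    uint32_t lo;   /* rot:imm8 covering offset bits 23:21 */
};

/* Split a 2 MiB aligned offset into two A32 modified immediates. */
static struct p2v_imm_pair split_offset(uint32_t offset)
{
    struct p2v_imm_pair p;

    /* rot field 4 means rotate right by 8: imm8 lands in bits 31:24. */
    p.hi = (4u << 8) | (offset >> 24);
    /* rot field 8 means rotate right by 16: imm8 lands in bits 23:16. */
    p.lo = (8u << 8) | ((offset >> 16) & 0xe0);
    return p;
}

/* Expand a 12-bit rot:imm8 field back into the 32-bit value it encodes. */
static uint32_t arm_imm_decode(uint32_t imm12)
{
    uint32_t imm8 = imm12 & 0xff;
    uint32_t rot = 2 * ((imm12 >> 8) & 0xf);

    return rot ? (imm8 >> rot) | (imm8 << (32 - rot)) : imm8;
}
```

    Decoding both fields and summing them gives back the original offset,
    which is what the pair of patched instructions adds or subtracts.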
    
    For the Thumb2 code path, patching more than 11 bits of displacement
    would be somewhat cumbersome, but the 11 bits we need fit nicely into
    the second word of the u16[2] opcode, so we simply update the immediate
    assignment and the left shift to create an addend of the right magnitude.
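
    As a sketch of why 11 bits fit into the second halfword: in the Thumb2
    MOVW encoding, the second halfword holds the imm3 field in bits 14:12
    and the imm8 field in bits 7:0, with the destination register in
    between. The helpers below are hypothetical and assume that the
    remaining immediate bits in the first halfword stay zero and that the
    MOVW is followed by a left shift by 21.

```c
#include <stdint.h>

/* Place the 11 offset bits into imm3:imm8 of the second MOVW halfword. */
static uint16_t patch_movw_hw2(uint16_t hw2, uint32_t offset)
{
    uint32_t v = offset >> 21;            /* 11 significant bits */
    uint32_t imm3 = (v >> 8) & 0x7;
    uint32_t imm8 = v & 0xff;

    /* Keep bit 15 and the Rd field (bits 11:8), replace the rest. */
    return (uint16_t)((hw2 & 0x8f00) | (imm3 << 12) | imm8);
}

/* Addend produced by the patched MOVW followed by a left shift by 21. */
static uint32_t movw_lsl_result(uint16_t hw2)
{
    uint32_t imm11 = ((((uint32_t)hw2 >> 12) & 0x7) << 8) | (hw2 & 0xff);

    return imm11 << 21;
}
```

    For an offset of 0x40600000 and a second halfword of 0x0600 (Rd = r6,
    chosen arbitrarily), the patched halfword is 0x2603, and shifting the
    resulting immediate left by 21 reproduces 0x40600000.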

    Suggested-by: Zhen Lei <thunder.leizhen@huawei.com>
    Acked-by: Nicolas Pitre <nico@fluxnic.net>
    Acked-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>