    arm64: lse: deal with clobbered IP registers after branch via PLT · 5be8b70a
    Ard Biesheuvel authored
    The LSE atomics implementation uses runtime patching to patch in calls
    to out-of-line non-LSE atomics implementations on cores that lack hardware
    support for LSE. To avoid paying the overhead of a function call even
    if no call ends up being made, the bl instruction is kept invisible to the
    compiler, and the out-of-line implementations preserve all registers, not
    just the ones that they are required to preserve as per the AAPCS64.
    
    However, commit fd045f6c ("arm64: add support for module PLTs") added
    support for routing branch instructions via veneers if the branch target
    offset exceeds the range of the ordinary relative branch instructions.
    Since this deals with jump and call instructions that are exposed to ELF
    relocations, the PLT code uses x16 to hold the address of the branch target
    when it performs an indirect branch-to-register, something which is
    explicitly allowed by the AAPCS64 (and ordinary compiler-generated code
    does not expect registers x16 and x17 to retain their values across a bl
    instruction).
    
    Since the LSE runtime-patched bl instructions don't adhere to the AAPCS64,
    they don't deal with this clobbering of registers x16 and x17. So add them
    to the clobber list of the asm() statements that perform the call
    instructions, and drop x16 and x17 from the list of registers that are
    callee-saved in the out-of-line non-LSE implementations.
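    
    The clobber-list change described above can be sketched roughly as
    follows. This is an illustration under assumptions, not the code from
    this commit: CALL_SITE_CLOBBERS and sketch_add_return() are hypothetical
    names, the add instruction stands in for the hidden bl, and x30 is listed
    on the assumption that a real call site also has to account for bl
    writing the link register.

```c
/*
 * Sketch only: CALL_SITE_CLOBBERS is a hypothetical name. The commit
 * message says only that the clobber list was factored into a #define.
 */
#define CALL_SITE_CLOBBERS "x16", "x17", "x30"

/* Stand-in for one runtime-patched call site: returns v + i. */
static inline long sketch_add_return(long i, long v)
{
#if defined(__aarch64__)
	__asm__ __volatile__(
	/* Mimic a PLT veneer overwriting the IP registers. */
	"	mov	x16, #0\n"
	"	mov	x17, #0\n"
	/* Stand-in for the hidden bl to the out-of-line routine. */
	"	add	%0, %0, %1\n"
	: "+r" (v)
	: "r" (i)
	/* Tell the compiler not to keep live values in these registers. */
	: CALL_SITE_CLOBBERS);
#else
	/* Other architectures: plain C so the sketch still builds. */
	v += i;
#endif
	return v;
}
```

    With x16 and x17 in the clobber list, the compiler will not allocate
    either operand to those registers or keep a live value in them across
    the statement, so a veneer that overwrites them is harmless.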
    
    In addition, since these functions now have two scratch registers
    available, they no longer need to save and restore temporary registers
    on the stack.
    
    Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    [will: factored clobber list into #define, updated Makefile comment]
    Signed-off-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>