• Anton Blanchard's avatar
    powerpc: Reimplement __get_SP() as a function not a define · bfe9a2cf
    Anton Blanchard authored
    Li Zhong points out an issue with our current __get_SP()
    implementation. If ftrace function tracing is enabled (ie -pg
    profiling using _mcount) we spill a stack frame on 64bit all the
    time.
    
    If a function calls __get_SP() and later calls a function that is
    tail call optimised, we will pop the stack frame and the value
    returned by __get_SP() is no longer valid. An example from Li can
    be found in save_stack_trace -> save_context_stack:
    
    c0000000000432c0 <.save_stack_trace>:
    c0000000000432c0:       mflr    r0
    c0000000000432c4:       std     r0,16(r1)
    c0000000000432c8:       stdu    r1,-128(r1) <-- stack frame for _mcount
    c0000000000432cc:       std     r3,112(r1)
    c0000000000432d0:       bl      <._mcount>
    c0000000000432d4:       nop
    
    c0000000000432d8:       mr      r4,r1 <-- __get_SP()
    
    c0000000000432dc:       ld      r5,632(r13)
    c0000000000432e0:       ld      r3,112(r1)
    c0000000000432e4:       li      r6,1
    
    c0000000000432e8:       addi    r1,r1,128 <-- pop stack frame
    
    c0000000000432ec:       ld      r0,16(r1)
    c0000000000432f0:       mtlr    r0
    c0000000000432f4:       b       <.save_context_stack> <-- tail call optimized
    
    save_context_stack ends up with a stack pointer below the current
    one, and it is likely to be scribbled over.
    
    Fix this by making __get_SP() a function which returns the
    callers stack frame. Also replace inline assembly which grabs
    the stack pointer in save_stack_trace and show_stack with
    __get_SP().
    
    This also fixes an issue with perf_arch_fetch_caller_regs().
    It currently unwinds the stack once, which will skip a
    valid stack frame on a leaf function. With the __get_SP() fixes
    in this patch, we never need to unwind the stack frame to get
    to the first interesting frame.
    
    We have to export __get_SP() because perf_arch_fetch_caller_regs()
    (which is used in modules) calls it from a header file.
    Reported-by: default avatarLi Zhong <zhong@linux.vnet.ibm.com>
    Signed-off-by: default avatarAnton Blanchard <anton@samba.org>
    Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
    bfe9a2cf
ppc_ksyms.c 841 Bytes