    perf arm64: Inject missing frames when using 'perf record --call-graph=fp' · b9f6fbb3
    Alexandre Truong authored
    When unwinding using frame pointers on ARM64, the return address of the
    current function may not have been pushed onto the stack when the
    function was interrupted, which makes perf show an incorrect call graph
    to the user.
    
    Consider the following example program:
    
      void leaf() {
          /* long computation */
      }
    
      void parent() {
          // (1)
          leaf();
          // (2)
      }
    
      ... could be compiled into (using gcc -fno-inline -fno-omit-frame-pointer):
    
      leaf:
          /* long computation */
          nop
          ret
      parent:
          // (1)
          stp     x29, x30, [sp, -16]!
          mov     x29, sp
          bl      leaf
          nop
          ldp     x29, x30, [sp], 16
          // (2)
          ret
    
    If the program is interrupted at (1), (2), or any point in "leaf:", the
    call graph will skip the caller of the current function. We can unwind
    using the DWARF info, check if the return address is the same as the LR
    register, and inject the missing frame into the call graph.
    
    Before this patch, the above example shows the following call graph when
    recording in "--call-graph fp" mode on ARM64:
    
      # Children      Self  Command   Shared Object     Symbol
      # ........  ........  ........  ................  ......................
      #
          99.86%    99.86%  program3  program3          [.] leaf
      	    |
      	    ---_start
      	       __libc_start_main
      	       main
      	       leaf
    
    As can be seen, the "parent" function is missing. This is especially
    problematic in "leaf" because for leaf functions the compiler may always
    omit pushing the return address onto the stack. After this patch, it
    shows the correct graph:
    
      # Children      Self  Command   Shared Object     Symbol
      # ........  ........  ........  ................  ......................
      #
          99.86%    99.86%  program3  program3          [.] leaf
      	    |
      	    ---_start
      	       __libc_start_main
      	       main
      	       parent
      	       leaf
    Reviewed-by: James Clark <james.clark@arm.com>
    Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: John Garry <john.garry@huawei.com>
    Cc: Leo Yan <leo.yan@linaro.org>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Will Deacon <will@kernel.org>
    Cc: linux-arm-kernel@lists.infradead.org
    Link: https://lore.kernel.org/r/20211217154521.80603-7-german.gomez@arm.com
    Signed-off-by: German Gomez <german.gomez@arm.com>
    [ Rename machine__normalize_is() to machine__normalized_is(), as suggested by James Clark ]
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>