• Jiri Olsa's avatar
    ftrace: Update dynamic ftrace calls only if necessary · 7f50d06b
    Jiri Olsa authored
    Currently dynamic ftrace calls are updated any time
    the ftrace_ops is un/registered. If we do  this update
    only when it's needed, we save lot of time for perf
    system wide ftrace function sampling/counting.
    
    The reason is that for system wide sampling/counting,
    perf creates event for each cpu in the system.
    
    Each event then registers separate copy of ftrace_ops,
    which ends up in FTRACE_UPDATE_CALLS updates. On servers
    with many cpus that means serious stall (240 cpus server):
    
    Counting:
      # time ./perf stat -e ftrace:function -a sleep 1
    
       Performance counter stats for 'system wide':
    
                  370,663      ftrace:function
    
              1.401427505 seconds time elapsed
    
      real    3m51.743s
      user    0m0.023s
      sys     3m48.569s
    
    Sampling:
      # time ./perf record -e ftrace:function -a sleep 1
      [ perf record: Woken up 0 times to write data ]
      Warning:
      Processed 141200 events and lost 5 chunks!
    
      [ perf record: Captured and wrote 10.703 MB perf.data (135950 samples) ]
    
      real    2m31.429s
      user    0m0.213s
      sys     2m29.494s
    
    There's no reason to do the FTRACE_UPDATE_CALLS update
    for each event in perf case, because all the ftrace_ops
    always share the same filter, so the updated calls are
    always the same.
    
    It's required that only first ftrace_ops registration
    does the FTRACE_UPDATE_CALLS update (also sometimes
    the second if the first one used the trampoline), but
    the rest can be only cheaply linked into the ftrace_ops
    list.
    
    Counting:
      # time ./perf stat -e ftrace:function -a sleep 1
    
       Performance counter stats for 'system wide':
    
                 398,571      ftrace:function
    
             1.377503733 seconds time elapsed
    
      real    0m2.787s
      user    0m0.005s
      sys     0m1.883s
    
    Sampling:
      # time ./perf record -e ftrace:function -a sleep 1
      [ perf record: Woken up 0 times to write data ]
      Warning:
      Processed 261730 events and lost 9 chunks!
    
      [ perf record: Captured and wrote 19.907 MB perf.data (256293 samples) ]
    
      real    1m31.948s
      user    0m0.309s
      sys     1m32.051s
    
    Link: http://lkml.kernel.org/r/1458138873-1553-6-git-send-email-jolsa@kernel.orgAcked-by: default avatarNamhyung Kim <namhyung@kernel.org>
    Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
    Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
    7f50d06b
ftrace.c 139 KB