    tracing/filter: Call filter predicate functions directly via a switch statement · fde59ab1
    Steven Rostedt (Google) authored
    Due to retpolines, indirect calls are much more expensive than direct
    calls. The filters use a select set of functions for their predicates.
    Instead of calling them through function pointers, create a
    filter_pred_fn_call() function that uses a switch statement to call the
    predicate functions directly. This gives almost a 10% speedup to the
    filter logic.
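
As a rough illustration of the technique described above (not the code from this commit), the sketch below tags each predicate with an enum value and dispatches through a switch, so every call site is a direct call rather than an indirect one through a function pointer. The names `pred_kind`, `struct pred`, `pred_eq_u64()`, `pred_gt_u64()`, `pred_lt_u64()` and `pred_fn_call()` are hypothetical and chosen only for this example; the kernel's actual types and predicate set differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate kinds; the real kernel enum is not shown here. */
enum pred_kind {
	PRED_EQ_U64,
	PRED_GT_U64,
	PRED_LT_U64,
};

/* Hypothetical predicate: a kind tag replaces a function pointer member. */
struct pred {
	enum pred_kind kind;
	uint64_t val;
};

static bool pred_eq_u64(const struct pred *p, uint64_t field)
{
	return field == p->val;
}

static bool pred_gt_u64(const struct pred *p, uint64_t field)
{
	return field > p->val;
}

static bool pred_lt_u64(const struct pred *p, uint64_t field)
{
	return field < p->val;
}

/*
 * Dispatch on the kind tag. Each case is a direct call, which the
 * compiler can emit without a retpoline thunk (and may inline).
 */
static bool pred_fn_call(const struct pred *p, uint64_t field)
{
	switch (p->kind) {
	case PRED_EQ_U64:
		return pred_eq_u64(p, field);
	case PRED_GT_U64:
		return pred_gt_u64(p, field);
	case PRED_LT_U64:
		return pred_lt_u64(p, field);
	}
	return false;	/* unknown kind: treat as no match */
}
```

With this shape, a filter such as "delta > 0" would be represented as `{ PRED_GT_U64, 0 }` and evaluated with `pred_fn_call(&p, delta)`. The trade-off is that adding a predicate means touching both the enum and the switch, which is acceptable when the set of predicates is small and closed.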
    
    Using the histogram benchmark:
    
    Before:
    
     # event histogram
     #
     # trigger info: hist:keys=delta:vals=hitcount:sort=delta:size=2048 if delta > 0 [active]
     #
    
    { delta:        113 } hitcount:        272
    { delta:        114 } hitcount:        840
    { delta:        118 } hitcount:        344
    { delta:        119 } hitcount:      25428
    { delta:        120 } hitcount:     350590
    { delta:        121 } hitcount:    1892484
    { delta:        122 } hitcount:    6205004
    { delta:        123 } hitcount:   11583521
    { delta:        124 } hitcount:   37590979
    { delta:        125 } hitcount:  108308504
    { delta:        126 } hitcount:  131672461
    { delta:        127 } hitcount:   88700598
    { delta:        128 } hitcount:   65939870
    { delta:        129 } hitcount:   45055004
    { delta:        130 } hitcount:   33174464
    { delta:        131 } hitcount:   31813493
    { delta:        132 } hitcount:   29011676
    { delta:        133 } hitcount:   22798782
    { delta:        134 } hitcount:   22072486
    { delta:        135 } hitcount:   17034113
    { delta:        136 } hitcount:    8982490
    { delta:        137 } hitcount:    2865908
    { delta:        138 } hitcount:     980382
    { delta:        139 } hitcount:    1651944
    { delta:        140 } hitcount:    4112073
    { delta:        141 } hitcount:    3963269
    { delta:        142 } hitcount:    1712508
    { delta:        143 } hitcount:     575941
    
    After:
    
     # event histogram
     #
     # trigger info: hist:keys=delta:vals=hitcount:sort=delta:size=2048 if delta > 0 [active]
     #
    
    { delta:        103 } hitcount:         60
    { delta:        104 } hitcount:      16966
    { delta:        105 } hitcount:     396625
    { delta:        106 } hitcount:    3223400
    { delta:        107 } hitcount:   12053754
    { delta:        108 } hitcount:   20241711
    { delta:        109 } hitcount:   14850200
    { delta:        110 } hitcount:    4946599
    { delta:        111 } hitcount:    3479315
    { delta:        112 } hitcount:   18698299
    { delta:        113 } hitcount:   62388733
    { delta:        114 } hitcount:   95803834
    { delta:        115 } hitcount:   58278130
    { delta:        116 } hitcount:   15364800
    { delta:        117 } hitcount:    5586866
    { delta:        118 } hitcount:    2346880
    { delta:        119 } hitcount:    1131091
    { delta:        120 } hitcount:     620896
    { delta:        121 } hitcount:     236652
    { delta:        122 } hitcount:     105957
    { delta:        123 } hitcount:     119107
    { delta:        124 } hitcount:      54494
    { delta:        125 } hitcount:      63856
    { delta:        126 } hitcount:      64454
    { delta:        127 } hitcount:      34818
    { delta:        128 } hitcount:      41446
    { delta:        129 } hitcount:      51242
    { delta:        130 } hitcount:      28361
    { delta:        131 } hitcount:      23926
    
    The peak went from 126ns per event before to 114ns after, and the
    fastest time went from 113ns to 103ns.
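
To relate these figures to the "almost 10%" claim, the relative reduction in per-event time can be computed as (before - after) / before. The helper below is a hypothetical illustration, not part of the commit; it only restates the arithmetic on the numbers quoted above.

```c
/*
 * Percentage reduction in per-event time when going from before_ns
 * to after_ns. Hypothetical helper used only to check the arithmetic.
 */
static double percent_reduction(double before_ns, double after_ns)
{
	return (before_ns - after_ns) / before_ns * 100.0;
}
```

For the peak, `percent_reduction(126, 114)` is 12/126, roughly 9.5%; for the fastest time, `percent_reduction(113, 103)` is 10/113, roughly 8.8%.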
    
    Link: https://lkml.kernel.org/r/20220906225529.781407172@goodmis.org
    
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
trace_events_filter.c 60.2 KB