    selftests/bpf: support stat filtering in comparison mode in veristat · d5ce4b89
    Andrii Nakryiko authored
    Add support for filtering on stat values in comparison mode, similar to
    filtering in non-comparison mode. In comparison mode, four variants of
    each stat matter for filtering: they allow filtering on either the A or
    the B side and, more importantly, on the difference between the two
    values. For the verdict stat, the value difference is a MATCH/MISMATCH
    classification. With these changes it is finally possible to easily
    check for mismatches between failure/success outcomes on two separate
    data sets, as in the example below:
    
      $ ./veristat -e file,prog,verdict,insns -C ~/baseline-results.csv ~/shortest-results.csv -f verdict_diff=mismatch
      File                                   Program                Verdict (A)  Verdict (B)  Verdict (DIFF)  Insns (A)  Insns (B)  Insns        (DIFF)
      -------------------------------------  ---------------------  -----------  -----------  --------------  ---------  ---------  -------------------
      dynptr_success.bpf.linked1.o           test_data_slice        success      failure      MISMATCH               85          0       -85 (-100.00%)
      dynptr_success.bpf.linked1.o           test_read_write        success      failure      MISMATCH             1992          0     -1992 (-100.00%)
      dynptr_success.bpf.linked1.o           test_ringbuf           success      failure      MISMATCH               74          0       -74 (-100.00%)
      kprobe_multi.bpf.linked1.o             test_kprobe            failure      success      MISMATCH                0        246      +246 (+100.00%)
      kprobe_multi.bpf.linked1.o             test_kprobe_manual     failure      success      MISMATCH                0        246      +246 (+100.00%)
      kprobe_multi.bpf.linked1.o             test_kretprobe         failure      success      MISMATCH                0        248      +248 (+100.00%)
      kprobe_multi.bpf.linked1.o             test_kretprobe_manual  failure      success      MISMATCH                0        248      +248 (+100.00%)
      kprobe_multi.bpf.linked1.o             trigger                failure      success      MISMATCH                0          2        +2 (+100.00%)
      netcnt_prog.bpf.linked1.o              bpf_nextcnt            failure      success      MISMATCH                0         56       +56 (+100.00%)
      pyperf600_nounroll.bpf.linked1.o       on_event               success      failure      MISMATCH           568128    1000001    +431873 (+76.02%)
      ringbuf_bench.bpf.linked1.o            bench_ringbuf          success      failure      MISMATCH                8          0        -8 (-100.00%)
      strobemeta.bpf.linked1.o               on_event               success      failure      MISMATCH           557149    1000001    +442852 (+79.49%)
      strobemeta_nounroll1.bpf.linked1.o     on_event               success      failure      MISMATCH            57240    1000001  +942761 (+1647.03%)
      strobemeta_nounroll2.bpf.linked1.o     on_event               success      failure      MISMATCH           501725    1000001    +498276 (+99.31%)
      strobemeta_subprogs.bpf.linked1.o      on_event               success      failure      MISMATCH            65420    1000001  +934581 (+1428.59%)
      test_map_in_map_invalid.bpf.linked1.o  xdp_noop0              success      failure      MISMATCH                2          0        -2 (-100.00%)
      test_mmap.bpf.linked1.o                test_mmap              success      failure      MISMATCH               46          0       -46 (-100.00%)
      test_verif_scale3.bpf.linked1.o        balancer_ingress       success      failure      MISMATCH           845499    1000001    +154502 (+18.27%)
      -------------------------------------  ---------------------  -----------  -----------  --------------  ---------  ---------  -------------------
    
    Note that filtering on verdict_diff=mismatch makes it extremely easy and
    fast to see any changes in verdict. The example above showcases both
    failure -> success transitions (which are generally surprising) and
    success -> failure transitions (which are expected if bugs are present).
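    
    The DIFF columns in the table above can be reproduced with a small sketch
    like the one below. It is not code from veristat.c: the names pct_diff and
    verdict_mismatch are hypothetical, the zero-baseline handling is inferred
    from the table rows (0 -> 246 is shown as +100.00%), and the result for
    two zero values is an assumption.

```c
#include <stdbool.h>

/* Relative difference of B against A, in percent.
 * With a zero baseline, plain division is undefined; the table above shows
 * such rows as +100.00%, so this sketch mirrors that. Returning 0.0 when
 * both sides are zero is an assumption. */
static double pct_diff(long a, long b)
{
	if (a == 0) {
		if (b == 0)
			return 0.0;
		return b < 0 ? -100.0 : 100.0;
	}
	return (double)(b - a) * 100.0 / (double)a;
}

/* For the verdict stat, the "difference" is a classification:
 * true means MISMATCH (one side succeeded, the other failed). */
static bool verdict_mismatch(bool a_success, bool b_success)
{
	return a_success != b_success;
}
```

    For instance, pct_diff(568128, 1000001) evaluates to roughly 76.02,
    which matches the pyperf600_nounroll row.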
    
    Because veristat allows querying relative percentage differences, the
    internal logic for comparison mode is based on floating-point numbers.
    Comparisons therefore need a bit of epsilon-based precision logic,
    deviating from the simple rules that suffice for integers.
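    
    As an illustration of what epsilon-based comparison can look like, here
    is a minimal sketch. The enum, the function name cmp_with_eps, and the
    tolerance of 1e-9 are assumptions made for this example and should not
    be read as the exact code in veristat.c.

```c
#include <stdbool.h>

/* Hypothetical operator set; names are illustrative only. */
enum cmp_op { CMP_EQ, CMP_NE, CMP_LT, CMP_LE, CMP_GT, CMP_GE };

/* Assumed tolerance; a real tool may pick a different value. */
static const double CMP_EPS = 1e-9;

/* Compare value against target, treating values closer than CMP_EPS
 * as equal so that rounding noise does not flip the result. */
static bool cmp_with_eps(double value, enum cmp_op op, double target)
{
	switch (op) {
	case CMP_EQ: return value > target - CMP_EPS && value < target + CMP_EPS;
	case CMP_NE: return value < target - CMP_EPS || value > target + CMP_EPS;
	case CMP_LT: return value < target - CMP_EPS;
	case CMP_LE: return value <= target + CMP_EPS;
	case CMP_GT: return value > target + CMP_EPS;
	case CMP_GE: return value >= target - CMP_EPS;
	}
	return false;
}
```

    With this scheme, 0.1 + 0.2 compares equal to 0.3 even though the two
    doubles differ in their last bits, which a plain == would report as
    unequal.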
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20221103055304.2904589-11-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
veristat.c 46.8 KB