    perf script: task-analyzer add csv support · fdd0f81f
    Petar Gligoric authored
    This patch adds the option to write the trace and the summary as CSV files
    to a user-specified file. Such a format simplifies further data processing.
    This is achieved by using ";" as the separator instead of spaces and by
    emitting only one header per file.
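
    As a rough illustration of why this layout is easy to consume, the sketch
    below parses such a file with Python's standard csv module only. The sample
    data is made up, and the column names "COMM" and "Runtime" are assumptions
    borrowed from the examples further down; the header actually written by the
    script may differ.

```python
import csv
import io

# Hypothetical sample in the layout described above: a single header line
# and ";" as the separator. Column names are assumptions, not the script's
# confirmed header.
SAMPLE = """COMM;Runtime
kworker/1:1;1200
bash;56000
kworker/1:1;900
"""


def read_trace(stream):
    """Return the rows of a ';'-separated trace as a list of dicts."""
    return list(csv.DictReader(stream, delimiter=";"))


rows = read_trace(io.StringIO(SAMPLE))

# Collect the runtimes of one task, converting the text fields to integers.
runtimes = [int(row["Runtime"]) for row in rows if row["COMM"] == "kworker/1:1"]
print(runtimes)
```

    For a real run, open(<file>, newline="") can be passed to read_trace()
    instead of the in-memory sample.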
    
    Other parameters are honored as in normal use of the script. Colors are
    turned off for CSV output, so the highlight option is ignored as well.
    
    Usage:
    
    Write the standard task output to a csv file:
    
      $ perf script report task-analyzer --csv <file>
    
    Write limited output to a csv file in nanoseconds:
    
      $ perf script report task-analyzer --csv <file> --ns --limit-to-tasks 1337
    
    Write summary to a csv file:
    
      $ perf script report task-analyzer --csv-summary <file>
    
    Write summary to a csv file with additional schedule information:
    
      $ perf script report task-analyzer --csv-summary <file> --summary-extended
    
    Write both the summary and the standard task output to csv files:
    
      $ perf script report task-analyzer --csv <file> --csv-summary <file-2>
    
    The following examples illustrate what is possible with the CSV output. The
    first command sequence records all scheduler switch events for 10 seconds;
    the task-analyzer then writes task information such as runtimes as CSV. A
    small Python snippet using pandas and matplotlib visualizes the runtimes of
    the most frequent task (e.g. kworker/1:1), with each runtime shown as one
    bar in a bar chart:
    
      $ perf record -e sched:sched_switch -a -- sleep 10
      $ perf script report task-analyzer --ns --csv tasks.csv
      $ cat << EOF > /tmp/freq-comm-runtimes-bar.py
        import pandas as pd
        import matplotlib.pyplot as plt
    
        df = pd.read_csv("tasks.csv", sep=';')
        most_freq_comm = df["COMM"].value_counts().idxmax()
        most_freq_runtimes = df[df["COMM"]==most_freq_comm]["Runtime"]
        plt.title(f"Runtimes for Task {most_freq_comm} in Nanoseconds")
        plt.bar(range(len(most_freq_runtimes)), most_freq_runtimes)
        plt.show()
      EOF
      $ python3 /tmp/freq-comm-runtimes-bar.py
    
    As a second example, the following script generates a pie chart of the
    accumulated runtimes of all tasks over 10 seconds of system recording:
    
      $ perf record -e sched:sched_switch -a -- sleep 10
      $ perf script report task-analyzer --csv-summary task-summary.csv
      $ cat << EOF > /tmp/accumulated-task-pie.py
        import pandas as pd
        from matplotlib.pyplot import pie, axis, show
    
        df = pd.read_csv("task-summary.csv", sep=';')
        sums = df.groupby("Comm")["Accumulated"].sum()
        axis("equal")
        pie(sums, labels=sums.index)
        show()
      EOF
      $ python3 /tmp/accumulated-task-pie.py
    
    A variety of other visualizations are possible in matplotlib and other
    environments. Of course, pandas, numpy and co. also allow easy
    statistical analysis of the data!
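
    A minimal sketch of such a statistical analysis follows. It uses only the
    Python standard library (csv and statistics) rather than pandas so that it
    runs without extra dependencies. The sample data is made up, and the column
    names "COMM" and "Runtime" are assumptions taken from the first example
    above.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical stand-in for tasks.csv; column names are assumptions.
SAMPLE = """COMM;Runtime
kworker/1:1;1000
bash;4000
kworker/1:1;3000
bash;6000
kworker/1:1;2000
"""


def runtime_stats(stream):
    """Map each task name to (count, mean, median) of its runtimes."""
    per_task = defaultdict(list)
    for row in csv.DictReader(stream, delimiter=";"):
        per_task[row["COMM"]].append(float(row["Runtime"]))
    return {
        comm: (len(vals), statistics.mean(vals), statistics.median(vals))
        for comm, vals in per_task.items()
    }


stats = runtime_stats(io.StringIO(SAMPLE))
for comm, (count, mean, median) in sorted(stats.items()):
    print(f"{comm}: n={count} mean={mean:.0f} median={median:.0f}")
```

    With pandas, a comparable result should be obtainable by grouping on the
    task name column and calling describe() on the runtime column.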
    
    Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20221206154406.41941-3-petar.gligor@gmail.com
    Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
task-analyzer.py 33.2 KB