  1. 06 May, 2023 1 commit
  2. 15 Mar, 2023 2 commits
    • perf bpf filter: Show warning for missing sample flags · 4310551b
      Namhyung Kim authored
      For a BPF filter to work properly, users need to provide appropriate
      options to enable the sample types.  Otherwise the BPF program sees
      an invalid value (i.e. always 0) and the filter does not work as intended.
      
      Show a warning message if sample types are missing, as below.
      
        $ sudo ./perf record -e cycles --filter 'addr < 100' true
        Error: cycles event does not have PERF_SAMPLE_ADDR
         Hint: please add -d option to perf record.
        failed to set filter "BPF" on event cycles with 22 (Invalid argument)
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
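The check described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `sketch_filter_req`, `sketch_check_sample_flags` and `SKETCH_SAMPLE_ADDR` are hypothetical names, and the table holds only the `addr` term and the `-d` hint that appear in the example output.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's PERF_SAMPLE_ADDR bit. */
#define SKETCH_SAMPLE_ADDR (1ULL << 3)

/* One filter term and the sample flag it depends on (illustrative only). */
struct sketch_filter_req {
	const char *term;        /* name used in the --filter expression */
	uint64_t    sample_flag; /* flag the event must have enabled */
	const char *flag_name;   /* flag name shown in the warning */
	const char *hint_option; /* perf record option that enables the flag */
};

static const struct sketch_filter_req sketch_reqs[] = {
	{ "addr", SKETCH_SAMPLE_ADDR, "PERF_SAMPLE_ADDR", "-d" },
};

/*
 * Return 0 if sample_type covers what the filter term needs, or -1 after
 * printing a warning. Terms missing from the table are accepted as-is.
 */
int sketch_check_sample_flags(const char *event, const char *term,
			      uint64_t sample_type)
{
	size_t i;

	assert(event && term);
	for (i = 0; i < sizeof(sketch_reqs) / sizeof(sketch_reqs[0]); i++) {
		const struct sketch_filter_req *req = &sketch_reqs[i];

		if (strcmp(req->term, term) != 0)
			continue;
		if (sample_type & req->sample_flag)
			return 0;
		fprintf(stderr, "Error: %s event does not have %s\n",
			event, req->flag_name);
		fprintf(stderr, " Hint: please add %s option to perf record.\n",
			req->hint_option);
		return -1;
	}
	return 0;
}
```

Calling `sketch_check_sample_flags("cycles", "addr", 0)` prints the two warning lines and returns -1, while passing a `sample_type` that includes the address bit returns 0.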
    • perf record: Record dropped sample count · 27c6f245
      Namhyung Kim authored
      When BPF filters are used, an event might drop some samples.  It would
      be nice to report how many samples were lost.  As the LOST_SAMPLES event
      can carry similar information, use it for BPF filters.
      
      To indicate that the count comes from BPF filters, add a new misc flag
      and do not display CPU load warnings.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 14 Mar, 2023 1 commit
    • perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL · a980755b
      Ian Rogers authored
      BPF skeleton support is now key to a number of perf features. Rather
      than making it so that BPF support must be enabled for the build, make
      this the default and error if the build lacks a clang and libbpf that
      are sufficient. To avoid the error and build without BPF skeletons the
      NO_BPF_SKEL=1 flag can be used. Add a build-options flag to 'perf
      version' to enable detection of the BPF skeleton support and use this
      in the offcpu shell test.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andres Freund <andres@anarazel.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Monnet <quentin@isovalent.com>
      Cc: Roberto Sassu <roberto.sassu@huawei.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
      Cc: Tom Rix <trix@redhat.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 13 Mar, 2023 3 commits
    • perf evlist: Remove nr_groups · 9d2dc632
      Ian Rogers authored
      Maintaining the number of groups during event parsing is problematic
      and, since the change to sort/regroup events, the value can only be
      computed by a linear pass over the evlist. As the value is generally
      only used in tests, rather than hold it in a variable, compute it by
      passing over the evlist when necessary.
      
      This change highlights that libpfm's counting of groups with a single
      entry disagreed with regular event parsing. The libpfm tests are
      updated accordingly.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
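Computing the group count on demand, as the nr_groups commit above describes, might look like the sketch below. The `sketch_evsel` structure is a hypothetical simplification of perf's event list, and the rule that a single-entry group does not count is an assumption drawn from the remark about libpfm, not the actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified event. leader_idx is the index of the event's
 * group leader; an event that leads itself stores its own index.
 */
struct sketch_evsel {
	size_t leader_idx;
	size_t nr_members; /* meaningful on a leader: events in its group */
};

/*
 * Count groups with one linear pass instead of keeping a counter up to
 * date during parsing. Assumption of this sketch: a leader with a single
 * member is a plain event, not a group.
 */
size_t sketch_count_groups(const struct sketch_evsel *evsels, size_t nr)
{
	size_t i, nr_groups = 0;

	assert(evsels || nr == 0);
	for (i = 0; i < nr; i++) {
		if (evsels[i].leader_idx == i && evsels[i].nr_members > 1)
			nr_groups++;
	}
	return nr_groups;
}
```

The pass costs O(n) per query, which is acceptable when, as the commit message notes, the value is mostly needed in tests.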
    • perf record: Reuse target::initial_delay · cb4b9e68
      Changbin Du authored
      This simply replaces record_opts::initial_delay with
      target::initial_delay. Nothing else is changed.
      Signed-off-by: Changbin Du <changbin.du@huawei.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hui Wang <hw.huiwang@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230302031146.2801588-3-changbin.du@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Fix "read LOST count failed" msg with sample read · 07d85ba9
      Kan Liang authored
      Hundreds of "read LOST count failed" error messages may be displayed
      when the command below is launched.
      
      perf record -e '{cpu/mem-loads-aux/,cpu/event=0xcd,umask=0x1/}:S' -a
      
      According to the commit 89e3106f ("libperf: Handle read format
      in perf_evsel__read()"), the PERF_FORMAT_GROUP is only available for
      the leader. However, the record__read_lost_samples() goes through every
      entry of an evlist, which includes both leader and member. The member
      event errors out and triggers the error message. Since there may be
      hundreds of CPUs on a server, the message will be printed hundreds of
      times, which is very annoying.
      
      The message itself is correct, but pr_err is overkill. Other error
      messages in record__read_lost_samples() are all pr_debug. To make
      the output format consistent, change the pr_err("read LOST count
      failed\n"); to pr_debug("read LOST count failed\n");.
      Users can still get the message via the -v option.
      
      Fixes: e3a23261 ("perf record: Read and inject LOST_SAMPLES events")
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230301150413.27011-1-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 15 Feb, 2023 1 commit
    • perf record: Fix segfault with --overwrite and --max-size · 91621be6
      Yang Jihong authored
      When --overwrite and --max-size options of perf record are used
      together, a segmentation fault occurs. The following is an example:
      
        # perf record -e sched:sched* --overwrite --max-size 1K -a -- sleep 1
        [ perf record: Woken up 1 times to write data ]
        perf: Segmentation fault
        Obtained 12 stack frames.
        ./perf/perf(+0x197673) [0x55f99710b673]
        /lib/x86_64-linux-gnu/libc.so.6(+0x3ef0f) [0x7fa45f3cff0f]
        ./perf/perf(+0x8eb40) [0x55f997002b40]
        ./perf/perf(+0x1f6882) [0x55f99716a882]
        ./perf/perf(+0x794c2) [0x55f996fed4c2]
        ./perf/perf(+0x7b7c7) [0x55f996fef7c7]
        ./perf/perf(+0x9074b) [0x55f99700474b]
        ./perf/perf(+0x12e23c) [0x55f9970a223c]
        ./perf/perf(+0x12e54a) [0x55f9970a254a]
        ./perf/perf(+0x7db60) [0x55f996ff1b60]
        /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe6) [0x7fa45f3b2c86]
        ./perf/perf(+0x7dfe9) [0x55f996ff1fe9]
        Segmentation fault (core dumped)
      
      backtrace of the core file is as follows:
      
        (gdb) bt
        #0  record__bytes_written (rec=0x55f99755a200 <record>) at builtin-record.c:234
        #1  record__output_max_size_exceeded (rec=0x55f99755a200 <record>) at builtin-record.c:242
        #2  record__write (map=0x0, size=12816, bf=0x55f9978da2e0, rec=0x55f99755a200 <record>) at builtin-record.c:263
        #3  process_synthesized_event (tool=tool@entry=0x55f99755a200 <record>, event=event@entry=0x55f9978da2e0, sample=sample@entry=0x0, machine=machine@entry=0x55f997893658) at builtin-record.c:618
        #4  0x000055f99716a883 in __perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=0x55f9978928b0, machine=machine@entry=0x55f997893658,
            from=from@entry=0) at util/synthetic-events.c:1895
        #5  0x000055f99716a91f in perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=<optimized out>, machine=machine@entry=0x55f997893658)
            at util/synthetic-events.c:1905
        #6  0x000055f996fed4c3 in record__synthesize (tail=tail@entry=true, rec=0x55f99755a200 <record>) at builtin-record.c:1997
        #7  0x000055f996fef7c8 in __cmd_record (argc=argc@entry=2, argv=argv@entry=0x7ffc67551260, rec=0x55f99755a200 <record>) at builtin-record.c:2802
        #8  0x000055f99700474c in cmd_record (argc=<optimized out>, argv=0x7ffc67551260) at builtin-record.c:4258
        #9  0x000055f9970a223d in run_builtin (p=0x55f997564d88 <commands+264>, argc=10, argv=0x7ffc67551260) at perf.c:330
        #10 0x000055f9970a254b in handle_internal_command (argc=10, argv=0x7ffc67551260) at perf.c:384
        #11 0x000055f996ff1b61 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:428
        #12 main (argc=<optimized out>, argv=0x7ffc67551260) at perf.c:562
      
      The reason is that record__bytes_written accesses the freed memory rec->thread_data.
      The process is as follows:
        __cmd_record
          -> record__free_thread_data
            -> zfree(&rec->thread_data)         // free rec->thread_data
          -> record__synthesize
            -> perf_event__synthesize_id_index
              -> process_synthesized_event
                -> record__write
                  -> record__bytes_written      // access rec->thread_data
      
      We add a member variable "thread_bytes_written" in the struct "record"
      to save the data size written by the threads.
      
      Fixes: 6d575816 ("perf record: Add support for limit perf output file size")
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiwei Sun <jiwei.sun@windriver.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/CAM9d7ci_TRrqBQVQNW8=GwakUr7SsZpYxaaty-S4bxF8zJWyqw@mail.gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
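The idea behind the fix above, keeping the threads' byte total in the long-lived record structure so that nothing reads freed per-thread data, can be sketched as follows. All `sketch_` names are hypothetical, and folding the totals in at free time is only one possible way to maintain the field; the commit itself may update it elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-thread state that is freed before synthesis runs. */
struct sketch_thread_data {
	uint64_t bytes_written;
};

/* Hypothetical, reduced version of the record state. */
struct sketch_record {
	uint64_t bytes_written;        /* bytes written by the main thread */
	uint64_t thread_bytes_written; /* bytes written by worker threads */
	int nr_threads;
	struct sketch_thread_data *thread_data;
};

/* Save the per-thread totals in the record, then release the array. */
void sketch_free_thread_data(struct sketch_record *rec)
{
	int i;

	assert(rec);
	if (!rec->thread_data)
		return;
	for (i = 0; i < rec->nr_threads; i++)
		rec->thread_bytes_written += rec->thread_data[i].bytes_written;
	free(rec->thread_data);
	rec->thread_data = NULL;
	rec->nr_threads = 0;
}

/* Safe at any time: never dereferences thread_data. */
uint64_t sketch_bytes_written(const struct sketch_record *rec)
{
	assert(rec);
	return rec->bytes_written + rec->thread_bytes_written;
}
```

With this layout a size-limit check can still run after the per-thread data has been released, which is the ordering shown in the call chain of the commit message.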
  6. 20 Dec, 2022 1 commit
  7. 14 Dec, 2022 2 commits
    • perf evlist: Remove group option. · 5f8f9567
      Ian Rogers authored
      The group option predates grouping events using curly braces added in
      commit 89efb029 ("perf tools: Add support to parse event group
      syntax").
      
      The --group option was retained for legacy support (in August
      2012) but keeping it adds complexity.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Eelco Chaudron <echaudro@redhat.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Shaomin Deng <dengshaomin@cdjrlc.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Timothy Hayes <timothy.hayes@arm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20221213232651.1269909-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build: Use libtraceevent from the system · 378ef0f5
      Ian Rogers authored
      Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
      line variables.
      
      If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
      build, don't compile in libtraceevent and libtracefs support.
      
      This also disables CONFIG_TRACE that controls "perf trace".
      
      CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
      HAVE_LIBTRACEEVENT is used in C code.
      
      Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
      commands kmem, kwork, lock, sched and timechart are removed.  The
      majority of commands continue to work including "perf test".
      
      Committer notes:
      
      Fixed up a tools/perf/util/Build reject and added:
      
        #include <traceevent/event-parse.h>
      
      to tools/perf/util/scripting-engines/trace-event-perl.c.
      
      Committer testing:
      
        $ rpm -qi libtraceevent-devel
        Name        : libtraceevent-devel
        Version     : 1.5.3
        Release     : 2.fc36
        Architecture: x86_64
        Install Date: Mon 25 Jul 2022 03:20:19 PM -03
        Group       : Unspecified
        Size        : 27728
        License     : LGPLv2+ and GPLv2+
        Signature   : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
        Source RPM  : libtraceevent-1.5.3-2.fc36.src.rpm
        Build Date  : Fri 15 Apr 2022 10:57:01 AM -03
        Build Host  : buildvm-x86-05.iad2.fedoraproject.org
        Packager    : Fedora Project
        Vendor      : Fedora Project
        URL         : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
        Bug URL     : https://bugz.fedoraproject.org/libtraceevent
        Summary     : Development headers of libtraceevent
        Description :
        Development headers of libtraceevent-libs
        $
      
      Default build:
      
        $ ldd ~/bin/perf | grep tracee
        	libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
        $
      
        # perf trace -e sched:* --max-events 10
             0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
             0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
             0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
             1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
             1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
             0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
             0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
             0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
             1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
             1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
        #
      
      Had to tweak tools/perf/util/setup.py to make sure the python binding
      shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
      present in CFLAGS.
      
      Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:
      
      - Make building of data-convert-bt.c depend on CONFIG_LIBTRACEEVENT=y
      
      - perf-$(CONFIG_LIBTRACEEVENT) += scripts/
      
      - bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y
      
      - The python binding needed some fixups and util/trace-event.c can't be
        built and linked with the python binding shared object, so remove it
        in tools/perf/util/setup.py and exclude it from the list of
        dependencies in the python/perf.so Makefile.perf target.
      
      Building without libtraceevent-devel installed uncovered more build
      failures:
      
      - The python binding tools/perf/util/python.c was assuming that
        traceevent/parse-events.h was always available, which was the case
        when we defaulted to using the in-kernel tools/lib/traceevent/ files,
        now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
        the other parts of it that deal with tracepoints.
      
      - We have to ifdef the rules in the Build files with
        CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
        tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when
        setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't
        detect libtraceevent-devel installed in the system. Simplifying this,
        to avoid having two ways of disabling builtin-trace.c and to not have
        CONFIG_TRACE=y when libtraceevent-devel isn't installed, is the clean
        way.
      
      From Athira:
      
      <quote>
      tools/perf/arch/powerpc/util/Build
      -perf-y += kvm-stat.o
      +perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
      </quote>
      
      Then, ditto for arm64 and s390, detected by container cross build tests.
      
      - s/390 uses test__checkevent_tracepoint() that is now only available if
        HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifdef HAVE_LIBTRACEEVENT.
      
      Also from Athira:
      
      <quote>
      With this change, I could successfully compile in these environments:
      - Without libtraceevent-devel installed
      - With libtraceevent-devel installed
      - With “make NO_LIBTRACEEVENT=1”
      </quote>
      
      Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
      consistency with other libraries detected in tools/perf/.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
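The split described in the libtraceevent commit above, where CONFIG_LIBTRACEEVENT gates the build files and HAVE_LIBTRACEEVENT gates C code, can be illustrated on the C side with the sketch below. `sketch_command_available` is a hypothetical helper, not a perf function; the command names are the ones the commit message lists as removed, plus "trace".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: report whether a perf subcommand would be compiled
 * in. Commands that rely on tracepoints exist only when the build defines
 * HAVE_LIBTRACEEVENT (for example via -DHAVE_LIBTRACEEVENT in CFLAGS).
 */
bool sketch_command_available(const char *name)
{
	static const char *const needs_traceevent[] = {
		"kmem", "kwork", "lock", "sched", "timechart", "trace",
	};
	size_t i;

	assert(name);
	for (i = 0; i < sizeof(needs_traceevent) / sizeof(needs_traceevent[0]); i++) {
		if (strcmp(name, needs_traceevent[i]) != 0)
			continue;
#ifdef HAVE_LIBTRACEEVENT
		return true;
#else
		return false;
#endif
	}
	/* Everything else keeps working without libtraceevent. */
	return true;
}
```

Compiled without the define, `sketch_command_available("sched")` is false while `sketch_command_available("record")` stays true, mirroring the statement that the majority of commands continue to work.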
  8. 05 Dec, 2022 1 commit
    • perf tools: Use dedicated non-atomic clear/set bit helpers · 49bd97c2
      Sean Christopherson authored
      Use the dedicated non-atomic helpers for {clear,set}_bit() and their
      test variants, i.e. the double-underscore versions.  Despite being
      defined in atomic.h, and despite the kernel versions being atomic in the
      kernel, tools' {clear,set}_bit() helpers aren't actually atomic.  Move
      to the double-underscore versions so that the versions that are expected
      to be atomic (for kernel developers) can be made atomic without
      affecting users that don't want atomic operations.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Cc: alexandru elisei <alexandru.elisei@arm.com>
      Cc: kvm@vger.kernel.org
      Cc: kvmarm@lists.cs.columbia.edu
      Cc: kvmarm@lists.linux.dev
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20221119013450.2643007-6-seanjc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
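For readers unfamiliar with the distinction drawn in the commit above, a non-atomic bit helper is just a plain read-modify-write on a word of the bitmap. The sketch below shows that shape under hypothetical `sketch_` names; it is not the tools' own header, and it deliberately avoids the double-underscore names used there.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Non-atomic set: a concurrent writer to the same word could be lost. */
void sketch_nonatomic_set_bit(size_t nr, unsigned long *addr)
{
	addr[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Non-atomic clear, with the same caveat as the set helper. */
void sketch_nonatomic_clear_bit(size_t nr, unsigned long *addr)
{
	addr[nr / SKETCH_BITS_PER_LONG] &= ~(1UL << (nr % SKETCH_BITS_PER_LONG));
}

/* Read a single bit. */
bool sketch_test_bit(size_t nr, const unsigned long *addr)
{
	return (addr[nr / SKETCH_BITS_PER_LONG] >> (nr % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Non-atomic test-and-set: returns the previous value of the bit. */
bool sketch_nonatomic_test_and_set_bit(size_t nr, unsigned long *addr)
{
	bool old = sketch_test_bit(nr, addr);

	sketch_nonatomic_set_bit(nr, addr);
	return old;
}
```

These helpers are fine for single-threaded bitmap bookkeeping. An atomic variant would need an atomic read-modify-write instruction or a lock, which is the cost the commit lets non-atomic users keep avoiding.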
  9. 02 Dec, 2022 1 commit
  10. 03 Nov, 2022 1 commit
  11. 27 Oct, 2022 1 commit
  12. 25 Oct, 2022 1 commit
  13. 04 Oct, 2022 7 commits
    • perf tools: Add debug messages and comments for testing · da406202
      Adrian Hunter authored
      Add debug messages to enable scripts to track aspects of 'perf record'
      behaviour. The messages will be consumed after 'perf record' has run,
      with the exception of "perf record has started" which is consequently
      flushed.
      
      Put comments so developers know which messages are also being used by test
      scripts.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/r/20220912083412.7058-11-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Fix a segfault in record__read_lost_samples() · d031a00a
      Namhyung Kim authored
      When it fails to open events, record__open() returns without setting
      session->evlist.  The function that reads lost sample counts then hits a
      segfault.  You can easily reproduce it as a normal user like:
      
        $ perf record -p 1 true
        ...
        perf: Segmentation fault
        ...
      
      Skip the function if it has no evlist.  And add more protection for evsels
      which are not properly initialized.
      
      Fixes: a49aa8a54e861af1 ("perf record: Read and inject LOST_SAMPLES events")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20220909235024.278281-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Read and inject LOST_SAMPLES events · e3a23261
      Namhyung Kim authored
      When there are lost samples, read the PERF_FORMAT_LOST count, convert it
      to a PERF_RECORD_LOST_SAMPLES event and write it to the data file at the end.
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220901195739.668604-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Update use of pthread mutex · 49c670b1
      Ian Rogers authored
      Switch to the use of mutex wrappers that provide better error checking
      for synth_lock.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexandre Truong <alexandre.truong@arm.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andres Freund <andres@anarazel.de>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: André Almeida <andrealmeid@igalia.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Colin Ian King <colin.king@intel.com>
      Cc: Dario Petrillo <dario.pk1@gmail.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Dave Marchevsky <davemarchevsky@fb.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Fangrui Song <maskray@google.com>
      Cc: Hewenliang <hewenliang4@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jason Wang <wangborong@cdjrlc.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Monnet <quentin@isovalent.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Remi Bernon <rbernon@codeweavers.com>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Tom Rix <trix@redhat.com>
      Cc: Weiguo Li <liwg06@foxmail.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: Zechuan Chen <chenzechuan1@huawei.com>
      Cc: bpf@vger.kernel.org
      Cc: llvm@lists.linux.dev
      Cc: yaowenbin <yaowenbin1@huawei.com>
      Link: https://lore.kernel.org/r/20220826164242.43412-8-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Allow multiple recording time ranges · 6657a099
      Adrian Hunter authored
      AUX area traces can produce too much data to record successfully or
      analyze subsequently. Add another means to reduce data collection by
      allowing multiple recording time ranges.
      
      This is useful, for instance, in cases where a workload produces
      predictably reproducible events in specific time ranges.
      
      Today we only have perf record -D <msecs> to start at a specific region, or
      some complicated approach using snapshot mode and external scripts sending
      signals or using the fifos. But these approaches are difficult to set up
      compared with simply having perf do it.
      
      Extend the perf record -D/--delay option to accept relative time
      stamps for start and stop, controlled by perf with the right time
      offset, for instance:
      
          perf record -e intel_pt// -D 10-20,30-40
      
      to record 10ms to 20ms into the trace and 30ms to 40ms.
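
A range list such as "10-20,30-40" could be parsed along the lines of the sketch below. This is an illustration under stated assumptions, not perf's parser: `sketch_time_range` and `sketch_parse_ranges` are hypothetical names, only the range form is handled (not the plain -D <msecs> form), and the requirement that ranges be increasing and non-overlapping is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical range type: both ends in milliseconds from the start. */
struct sketch_time_range {
	unsigned long start_ms;
	unsigned long end_ms;
};

/*
 * Parse "a-b,c-d,..." into out[]. Returns the number of ranges, or -1 on
 * malformed input, on more than max ranges, or when a range is empty or
 * starts before the previous one ended.
 */
int sketch_parse_ranges(const char *s, struct sketch_time_range *out, int max)
{
	unsigned long prev_end = 0;
	int n = 0;

	assert(s && out);
	while (*s) {
		unsigned long start, stop;
		char *end;

		if (n == max || *s < '0' || *s > '9')
			return -1;
		errno = 0;
		start = strtoul(s, &end, 10);
		if (errno || *end != '-')
			return -1;
		s = end + 1;
		if (*s < '0' || *s > '9')
			return -1;
		stop = strtoul(s, &end, 10);
		if (errno || start >= stop || start < prev_end)
			return -1;
		out[n].start_ms = start;
		out[n].end_ms = stop;
		n++;
		prev_end = stop;
		if (*end == ',' && end[1] != '\0')
			s = end + 1; /* another range follows */
		else if (*end == '\0')
			s = end;     /* end of input */
		else
			return -1;   /* trailing comma or stray character */
	}
	return n;
}
```

For the string "10-20,30-40" the function returns 2 and fills the two ranges in order; inputs such as "20-10" or "10-20," are rejected with -1.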
      
      Example:
      
       The example workload is:
      
       $ cat repeat-usleep.c
      
       #include <errno.h>
       #include <limits.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <unistd.h>
      
       int usleep(useconds_t usec);
      
       int usage(int ret, const char *msg)
       {
               if (msg)
                       fprintf(stderr, "%s\n", msg);
      
               fprintf(stderr, "Usage is: repeat-usleep <microseconds>\n");
      
               return ret;
       }
      
       int main(int argc, char *argv[])
       {
               unsigned long usecs;
               char *end_ptr;
      
               if (argc != 2)
                       return usage(1, "Error: Wrong number of arguments!");
      
               errno = 0;
               usecs = strtoul(argv[1], &end_ptr, 0);
               if (errno || *end_ptr || usecs > UINT_MAX)
                       return usage(1, "Error: Invalid argument!");
      
               while (1) {
                       int ret = usleep(usecs);
      
                       if (ret && errno != EINTR)
                               return usage(1, "Error: usleep() failed!");
               }
      
               return 0;
       }
      
       $ perf record -e intel_pt//u --delay 10-20,40-70,110-160 -- ./repeat-usleep 500
       Events disabled
       Events enabled
       Events disabled
       Events enabled
       Events disabled
       Events enabled
       Events disabled
       [ perf record: Woken up 5 times to write data ]
       [ perf record: Captured and wrote 0.204 MB perf.data ]
       Terminated
      
       A dlfilter is used to determine continuous data collection (timestamps
       less than 1ms apart):
      
       $ cat dlfilter-show-delays.c
      
       #include <perf/perf_dlfilter.h>
       #include <stdio.h>
      
       static __u64 start_time;
       static __u64 last_time;
      
       int start(void **data, void *ctx)
       {
               printf("%-17s\t%-9s\t%-6s\n", " Time", " Duration", " Delay");
               return 0;
       }
      
       int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx)
       {
               __u64 delta;
      
               if (!sample->time)
                       return 1;
               if (!last_time)
                       goto out;
               delta = sample->time - last_time;
               if (delta < 1000000)
                       goto out2;
               printf("%17.9f\t%9.1f\t%6.1f\n", start_time / 1000000000.0, (last_time - start_time) / 1000000.0, delta / 1000000.0);
       out:
               start_time = sample->time;
       out2:
               last_time = sample->time;
               return 1;
       }
      
       int stop(void *data, void *ctx)
       {
               printf("%17.9f\t%9.1f\n", start_time / 1000000000.0, (last_time - start_time) / 1000000.0);
               return 0;
       }
      
       The result shows the times roughly match the --delay option:
      
       $ perf script --itrace=qb --dlfilter dlfilter-show-delays.so
        Time                    Duration        Delay
         39215.302317300             9.7         20.5
         39215.332480217            30.4         40.9
         39215.403837717            49.8
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220824072814.16422-6-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6657a099
    • Adrian Hunter's avatar
      perf record: Change evlist->ctl_fd to use fdarray_flag__non_perf_event · feff0b61
      Adrian Hunter authored
      Patch "perf record: Fix way of handling non-perf-event pollfds" added a
      generic way to handle non-perf-event file descriptors like evlist->ctl_fd.
      Use it instead of handling evlist->ctl_fd separately.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220824072814.16422-4-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      feff0b61
    • Adrian Hunter's avatar
      perf record: Fix way of handling non-perf-event pollfds · 6562c9ac
      Adrian Hunter authored
      perf record __cmd_record() does not poll evlist pollfds. Instead it polls
      thread_data[0].pollfd. That happens whether or not threads are being used.
      
      perf record duplicates evlist mmap pollfds as needed for separate threads.
      The non-perf-event represented by evlist->ctl_fd has to be handled separately,
      which is done explicitly, duplicating it into the thread_data[0] pollfds.
      That approach neglects any other non-perf-event file descriptors. Currently
      there is also done_fd which needs the same handling.
      
      Add a new generalized approach.
      
      Add fdarray_flag__non_perf_event to identify the file descriptors that
      need the special handling. For those cases, also keep a mapping of the
      evlist pollfd index and thread pollfd index, so that the evlist revents
      can be updated.
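
      The index mapping described above can be sketched as follows. This is
      not the perf code: struct pollfd_index_map and copy_revents() are
      hypothetical names, and the sketch only assumes that each mapped entry
      pairs an evlist pollfd index with the thread pollfd index it was
      duplicated into.

```c
#include <poll.h>

/* Hypothetical mapping entry: which thread pollfd mirrors which evlist pollfd. */
struct pollfd_index_map {
	int evlist_idx;
	int thread_idx;
};

/*
 * After poll() has run on the thread's pollfd array, copy the returned
 * events back to the matching evlist entries so that code inspecting the
 * evlist pollfds sees up-to-date revents.
 */
static void copy_revents(struct pollfd *evlist_fds,
			 const struct pollfd *thread_fds,
			 const struct pollfd_index_map *map, int nr_map)
{
	int i;

	for (i = 0; i < nr_map; i++)
		evlist_fds[map[i].evlist_idx].revents =
			thread_fds[map[i].thread_idx].revents;
}
```

      Only entries listed in the map are touched, so ordinary perf event
      pollfds keep whatever revents they already had.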
      
      Although this patch adds the new handling, it does not take it into use.
      There is no functional change, but it is the precursor to a fix, so is
      marked as a fix.
      
      Fixes: 415ccb58 ("perf record: Introduce thread specific data array")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220824072814.16422-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6562c9ac
  14. 21 Sep, 2022 1 commit
  15. 08 Sep, 2022 1 commit
    • Adrian Hunter's avatar
      perf record: Fix synthesis failure warnings · faf59ec8
      Adrian Hunter authored
      Some calls to synthesis functions set err < 0 but only warn about the
      failure and continue.  However they do not set err back to zero, relying
      on subsequent code to do that.
      
      That changed with the introduction of option --synth. With --synth=no,
      the subsequent functions that set err back to zero are not called.
      
      Fix by setting err = 0 in those cases.
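
      The pattern being fixed can be sketched in isolation. This is a
      simplified illustration, not the perf record code: optional_step(),
      mandatory_step() and setup() are hypothetical stand-ins for a synthesis
      call that may only warn, a later call that used to reset err, and the
      surrounding function.

```c
#include <stdio.h>

/* Hypothetical stand-ins for an optional and a mandatory setup step. */
static int optional_step(int fail) { return fail ? -1 : 0; }
static int mandatory_step(void) { return 0; }

static int setup(int optional_fails, int run_later_steps)
{
	int err = optional_step(optional_fails);

	if (err < 0) {
		/* Failure here is tolerated: warn and carry on. */
		fprintf(stderr, "warning: optional step failed, continuing\n");
		/* Reset err explicitly instead of relying on later code to do it. */
		err = 0;
	}

	/* Later steps may be skipped entirely, as with --synth=no. */
	if (run_later_steps)
		err = mandatory_step();

	return err;
}
```

      Without the explicit reset, setup(1, 0) would return the stale negative
      value even though the failure was meant to be non-fatal.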
      
      Example:
      
       Before:
      
         $ perf record --no-bpf-event --synth=all -o /tmp/huh uname
         Couldn't synthesize bpf events.
         Linux
         [ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 0.014 MB /tmp/huh (7 samples) ]
         $ perf record --no-bpf-event --synth=no -o /tmp/huh uname
         Couldn't synthesize bpf events.
      
       After:
      
         $ perf record --no-bpf-event --synth=no -o /tmp/huh uname
         Couldn't synthesize bpf events.
         Linux
         [ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 0.014 MB /tmp/huh (7 samples) ]
      
      Fixes: 41b740b6 ("perf record: Add --synth option")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220907162458.72817-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      faf59ec8
  16. 06 Sep, 2022 1 commit
    • Athira Rajeev's avatar
      tools/perf: Fix out of bound access to cpu mask array · cbd7bfc7
      Athira Rajeev authored
      The cpu mask init code in the "record__mmap_cpu_mask_init" function accesses
      the "bits" array that is part of "struct mmap_cpu_mask".  The size of this
      array is the value from cpu__max_cpu().cpu.  This array is used to contain
      the cpumask value for each cpu. While setting the bit for each cpu, it calls
      the "set_bit" function, which accesses an index in the "bits" array.
      
      If we provide a command line option to -C which is greater than the
      number of CPUs present in the system, set_bit could access an array
      member which is outside the array. This is because there is currently
      no boundary check for the CPU. This will result in a segfault:
      
      <<>>
        ./perf record -C 12341234 ls
        Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS
        Segmentation fault (core dumped)
      <<>>
      
      Debugging with gdb, points to function flow as below:
      
      <<>>
        set_bit
        record__mmap_cpu_mask_init
        record__init_thread_default_masks
        record__init_thread_masks
        cmd_record
      <<>>
      
      Fix this by adding boundary check for the array.
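
      A boundary check of this kind can be sketched as below. This is an
      illustration only, not the patch itself: struct cpu_mask and
      cpu_mask_set() are hypothetical names, and returning -1 for an
      out-of-range CPU is an assumption made for the example.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical mask type: nbits valid bits stored in the bits[] array. */
struct cpu_mask {
	size_t nbits;
	unsigned long *bits;
};

/*
 * Set the bit for the given CPU, refusing any CPU number that would index
 * past the end of the mask instead of writing out of bounds.
 */
static int cpu_mask_set(struct cpu_mask *mask, size_t cpu)
{
	if (cpu >= mask->nbits)
		return -1;

	mask->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
	return 0;
}
```

      The caller can then turn the error into a diagnostic rather than
      crashing.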
      
      After the patch:
      
      <<>>
        ./perf record -C 12341234 ls
        Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS
        Failed to initialize parallel data streaming masks
      <<>>
      
      With this fix, if -C is given a non-existing CPU, perf
      record will fail with:
      
      <<>>
        ./perf record -C 50 ls
        Failed to initialize parallel data streaming masks
      <<>>
      Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20220905141929.7171-2-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cbd7bfc7
  17. 12 Aug, 2022 1 commit
  18. 23 Jun, 2022 4 commits
  19. 26 May, 2022 5 commits
    • Namhyung Kim's avatar
      perf record: Add cgroup support for off-cpu profiling · 685439a7
      Namhyung Kim authored
      This covers two different use cases.  The first one is cgroup
      filtering given by -G/--cgroup option which controls the off-cpu
      profiling for tasks in the given cgroups only.
      
      The other use case is cgroup sampling which is enabled by
      --all-cgroups option and it adds PERF_SAMPLE_CGROUP to the sample_type
      to set the cgroup id of the task in the sample data.
      
      Example output.
      
        $ sudo perf record -a --off-cpu --all-cgroups sleep 1
      
        $ sudo perf report --stdio -s comm,cgroup --call-graph=no
        ...
        # Samples: 144  of event 'offcpu-time'
        # Event count (approx.): 48452045427
        #
        # Children      Self  Command          Cgroup
        # ........  ........  ...............  ..........................................
        #
            61.57%     5.60%  Chrome_ChildIOT  /user.slice/user-657345.slice/user@657345.service/app.slice/...
            29.51%     7.38%  Web Content      /user.slice/user-657345.slice/user@657345.service/app.slice/...
            17.48%     1.59%  Chrome_IOThread  /user.slice/user-657345.slice/user@657345.service/app.slice/...
            16.48%     4.12%  pipewire-pulse   /user.slice/user-657345.slice/user@657345.service/session.slice/...
            14.48%     2.07%  perf             /user.slice/user-657345.slice/user@657345.service/app.slice/...
            14.30%     7.15%  CompositorTileW  /user.slice/user-657345.slice/user@657345.service/app.slice/...
            13.33%     6.67%  Timer            /user.slice/user-657345.slice/user@657345.service/app.slice/...
        ...
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20220518224725.742882-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      685439a7
    • Namhyung Kim's avatar
      perf record: Implement basic filtering for off-cpu · 10742d0c
      Namhyung Kim authored
      It should honor cpu and task filtering with -a, -C or -p, -t options.
      
      Committer testing:
      
        # perf record --off-cpu --cpu 1 perf bench sched messaging -l 1000
        # Running 'sched/messaging' benchmark:
        # 20 sender and receiver processes per group
        # 10 groups == 400 processes run
      
             Total time: 1.722 [sec]
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 1.446 MB perf.data (7248 samples) ]
        #
        # perf script | head -20
                    perf 97164 [001] 38287.696761:          1      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
                    perf 97164 [001] 38287.696764:          1      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
                    perf 97164 [001] 38287.696765:          9      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
                    perf 97164 [001] 38287.696767:        212      cycles:  ffffffffb6070176 native_write_msr+0x6 (vmlinux)
                    perf 97164 [001] 38287.696768:       5130      cycles:  ffffffffb6070176 native_write_msr+0x6 (vmlinux)
                    perf 97164 [001] 38287.696770:     123063      cycles:  ffffffffb6e0011e syscall_return_via_sysret+0x38 (vmlinux)
                    perf 97164 [001] 38287.696803:    2292748      cycles:  ffffffffb636c82d __fput+0xad (vmlinux)
                 swapper     0 [001] 38287.702852:    1927474      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
                  :97513 97513 [001] 38287.767207:    1172536      cycles:  ffffffffb612ff65 newidle_balance+0x5 (vmlinux)
                 swapper     0 [001] 38287.769567:    1073081      cycles:  ffffffffb618216d ktime_get_mono_fast_ns+0xd (vmlinux)
                  :97533 97533 [001] 38287.770962:     984460      cycles:  ffffffffb65b2900 selinux_socket_sendmsg+0x0 (vmlinux)
                  :97540 97540 [001] 38287.772242:     883462      cycles:  ffffffffb6d0bf59 irqentry_exit_to_user_mode+0x9 (vmlinux)
                 swapper     0 [001] 38287.773633:     741963      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
                  :97552 97552 [001] 38287.774539:     606680      cycles:  ffffffffb62eda0a page_add_file_rmap+0x7a (vmlinux)
                  :97556 97556 [001] 38287.775333:     502254      cycles:  ffffffffb634f964 get_obj_cgroup_from_current+0xc4 (vmlinux)
                  :97561 97561 [001] 38287.776163:     427891      cycles:  ffffffffb61b1522 cgroup_rstat_updated+0x22 (vmlinux)
                 swapper     0 [001] 38287.776854:     359030      cycles:  ffffffffb612fc5e load_balance+0x9ce (vmlinux)
                  :97567 97567 [001] 38287.777312:     330371      cycles:  ffffffffb6a8d8d0 skb_set_owner_w+0x0 (vmlinux)
                  :97566 97566 [001] 38287.777589:     311622      cycles:  ffffffffb614a7a8 native_queued_spin_lock_slowpath+0x148 (vmlinux)
                  :97512 97512 [001] 38287.777671:     307851      cycles:  ffffffffb62e0f35 find_vma+0x55 (vmlinux)
        #
        # perf record --off-cpu --cpu 4 perf bench sched messaging -l 1000
        # Running 'sched/messaging' benchmark:
        # 20 sender and receiver processes per group
        # 10 groups == 400 processes run
      
             Total time: 1.613 [sec]
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 1.415 MB perf.data (6729 samples) ]
        # perf script | head -20
                    perf 97650 [004] 38323.728036:          1      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
                    perf 97650 [004] 38323.728040:          1      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
                    perf 97650 [004] 38323.728041:          9      cycles:  ffffffffb6070174 native_write_msr+0x4 (vmlinux)
                    perf 97650 [004] 38323.728042:        208      cycles:  ffffffffb6070176 native_write_msr+0x6 (vmlinux)
                    perf 97650 [004] 38323.728044:       5026      cycles:  ffffffffb6070176 native_write_msr+0x6 (vmlinux)
                    perf 97650 [004] 38323.728046:     119970      cycles:  ffffffffb6d0bebc syscall_exit_to_user_mode+0x1c (vmlinux)
                    perf 97650 [004] 38323.728078:    2190103      cycles:            54b756 perf_tool__process_synth_event+0x16 (/home/acme/bin/perf)
                 swapper     0 [004] 38323.783357:    1593139      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
                 swapper     0 [004] 38323.785352:    1593139      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
                 swapper     0 [004] 38323.797330:    1418936      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
                 swapper     0 [004] 38323.802350:    1418936      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
                 swapper     0 [004] 38323.806333:    1418936      cycles:  ffffffffb6761378 mwait_idle_with_hints.constprop.0+0x48 (vmlinux)
                  :97996 97996 [004] 38323.807145:    1418936      cycles:      7f5db9be6917 [unknown] ([unknown])
                  :97959 97959 [004] 38323.807730:    1445074      cycles:  ffffffffb6329d36 memcg_slab_post_alloc_hook+0x146 (vmlinux)
                  :97959 97959 [004] 38323.808103:    1341584      cycles:  ffffffffb62fd90f get_page_from_freelist+0x112f (vmlinux)
                  :97959 97959 [004] 38323.808451:    1227537      cycles:  ffffffffb65b2905 selinux_socket_sendmsg+0x5 (vmlinux)
                  :97959 97959 [004] 38323.808768:    1184321      cycles:  ffffffffb6d1ba35 _raw_spin_lock_irqsave+0x15 (vmlinux)
                  :97959 97959 [004] 38323.809073:    1153017      cycles:  ffffffffb6a8d92d skb_set_owner_w+0x5d (vmlinux)
                  :97959 97959 [004] 38323.809402:    1126875      cycles:  ffffffffb6329c64 memcg_slab_post_alloc_hook+0x74 (vmlinux)
                  :97959 97959 [004] 38323.809695:    1073248      cycles:  ffffffffb6e0001d entry_SYSCALL_64+0x1d (vmlinux)
        #
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20220518224725.742882-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      10742d0c
    • Namhyung Kim's avatar
      perf record: Enable off-cpu analysis with BPF · edc41a10
      Namhyung Kim authored
      Add --off-cpu option to enable off-cpu profiling with BPF.  It uses
      a bpf_output event and renames it to "offcpu-time".  Samples will
      be synthesized at the end of the record session using data from a BPF
      map which contains the aggregated off-cpu time at context switches.
      So it needs root privilege to get the off-cpu profiling.
      
      Each sample will have a separate user stacktrace so it will skip
      kernel threads.  The sample ip will be set from the stacktrace and
      other sample data will be updated accordingly.  Currently it only
      handles some basic sample types.
      
      The sample timestamp is set to a dummy value so that these samples do not
      interfere with other events during sorting.  It starts from a very large
      initial value and is increased as each sample is processed.
      
      The good thing is that it can be used together with regular profiling like
      cpu cycles.  If you don't want that, you can use a dummy event to enable
      off-cpu profiling only.
      
      Example output:
        $ sudo perf record --off-cpu perf bench sched messaging -l 1000
      
        $ sudo perf report --stdio --call-graph=no
        # Total Lost Samples: 0
        #
        # Samples: 41K of event 'cycles'
        # Event count (approx.): 42137343851
        ...
      
        # Samples: 1K of event 'offcpu-time'
        # Event count (approx.): 587990831640
        #
        # Children      Self  Command          Shared Object       Symbol
        # ........  ........  ...............  ..................  .........................
        #
            81.66%     0.00%  sched-messaging  libc-2.33.so        [.] __libc_start_main
            81.66%     0.00%  sched-messaging  perf                [.] cmd_bench
            81.66%     0.00%  sched-messaging  perf                [.] main
            81.66%     0.00%  sched-messaging  perf                [.] run_builtin
            81.43%     0.00%  sched-messaging  perf                [.] bench_sched_messaging
            40.86%    40.86%  sched-messaging  libpthread-2.33.so  [.] __read
            37.66%    37.66%  sched-messaging  libpthread-2.33.so  [.] __write
             2.91%     2.91%  sched-messaging  libc-2.33.so        [.] __poll
        ...
      
      As you can see it spent most of off-cpu time in read and write in
      bench_sched_messaging().  The --call-graph=no was added just to make
      the output concise here.
      
      It uses perf hooks facility to control BPF program during the record
      session rather than adding new BPF/off-cpu specific calls.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20220518224725.742882-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      edc41a10
    • Adrian Hunter's avatar
      perf tools: Allow all_cpus to be a superset of user_requested_cpus · 7be1fedd
      Adrian Hunter authored
      To support collection of system-wide events with user requested CPUs,
      all_cpus must be a superset of user_requested_cpus.
      
      To allow all_cpus to be a superset of user_requested_cpus, all_cpus must
      be used instead of user_requested_cpus when dealing with the CPUs of all
      events rather than the CPUs of requested events.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Link: https://lore.kernel.org/r/20220524075436.29144-10-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7be1fedd
    • Adrian Hunter's avatar
      perf record: Use evlist__add_dummy_on_all_cpus() in record__config_text_poke() · 921e3be5
      Adrian Hunter authored
      Use evlist__add_dummy_on_all_cpus() in record__config_text_poke() in
      preparation for allowing system-wide events on all CPUs while the user
      requested events are on only user requested CPUs.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Link: https://lore.kernel.org/r/20220524075436.29144-7-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      921e3be5
  20. 05 May, 2022 1 commit
    • Ian Rogers's avatar
      perf cpumap: Switch to using perf_cpu_map API · 0255571a
      Ian Rogers authored
      Switch some raw accesses to the cpu map to using the library API. This
      can help with reference count checking. Some BPF cases switch from index
      to CPU for consistency, this shouldn't matter as the CPU map is full.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yonghong Song <yhs@fb.com>
      Link: http://lore.kernel.org/lkml/20220503041757.2365696-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0255571a
  21. 14 Apr, 2022 1 commit
  22. 01 Apr, 2022 1 commit
    • Ian Rogers's avatar
      perf evlist: Rename cpus to user_requested_cpus · 0df6ade7
      Ian Rogers authored
      evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps
      of all evsels.
      
      For non-task targets, cpus is set to be cpus requested from the command
      line, defaulting to all online cpus if no cpus are specified.
      
      For an uncore event, all_cpus may be just CPU 0 or every online CPU.
      
      This can leave all_cpus with fewer values than the cpus variable, which
      is confusing given the 'all' in the name.
      
      To try to make the behavior clearer, rename cpus to user_requested_cpus
      and add comments on the two struct variables.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20220328232648.2127340-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0df6ade7
  23. 23 Feb, 2022 1 commit