1. 03 Apr, 2020 12 commits
    • perf top: Support --group-sort-idx to change the sort order · df7deb2c
      Jin Yao authored
      'perf report' supports the option --group-sort-idx, which sorts the
      output by the event at index n in the event group.
      
      For example:
      
        perf record -e cycles,instructions,cache-misses
        perf report --group --group-sort-idx 2 --stdio
      
      The perf-report output is sorted by cache-misses.
      
      This patch adds --group-sort-idx support to perf-top.
      
      For example:
      
        perf top --group -e cycles,instructions,cache-misses --group-sort-idx 2
      
      The perf-top output is sorted by cache-misses.
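The effect of the index can be pictured with a small Python sketch. It is an illustration only, not perf code: the row layout, the symbol names and the percentages are made up, and the real tool sorts its own hist entries.

```python
# Illustrative only: rows and numbers are hypothetical, not real perf output.
def sort_by_group_index(rows, idx):
    """Sort rows by the value of the group event at position idx, largest first."""
    return sorted(rows, key=lambda row: row["overhead"][idx], reverse=True)

# Tuple order mirrors -e cycles,instructions,cache-misses (index 0, 1, 2).
rows = [
    {"symbol": "func_a", "overhead": (50.0, 30.0, 5.0)},
    {"symbol": "func_b", "overhead": (30.0, 50.0, 60.0)},
    {"symbol": "func_c", "overhead": (20.0, 20.0, 35.0)},
]

by_cycles = [r["symbol"] for r in sort_by_group_index(rows, 0)]
by_cache_misses = [r["symbol"] for r in sort_by_group_index(rows, 2)]
```

With index 2 the rows are ordered by the third event of the group, which corresponds to cache-misses in the example command.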
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200324220711.6025-1-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbols: Fix arm64 gap between kernel start and module end · 78886f3e
      Kemeng Shi authored
      While running 'perf report' in my arm64 virtual machine, this error
      message is shown:
      
      failed to process sample
      
      __symbol__inc_addr_samples(860): ENOMEM! sym->name=__this_module,
          start=0x1477100, addr=0x147dbd8, end=0x80002000, func: 0
      
      The error is reached through this call path:
      cmd_report
       __cmd_report
        perf_session__process_events
         __perf_session__process_events
          ordered_events__flush
           __ordered_events__flush
            oe->deliver (ordered_events__deliver_event)
             perf_session__deliver_event
              machines__deliver_event
               perf_evlist__deliver_sample
                tool->sample (process_sample_event)
                 hist_entry_iter__add
                  iter->add_entry_cb(hist_iter__report_callback)
                   hist_entry__inc_addr_samples
                    symbol__inc_addr_samples
                     __symbol__inc_addr_samples
                      h = annotated_source__histogram(src, evidx) (NULL)
      
      annotated_source__histogram fails through this call path:
      ...
       hist_entry__inc_addr_samples
        symbol__inc_addr_samples
         symbol__hists
          annotated_source__alloc_histograms
           src->histograms = calloc(nr_hists, sizeof_sym_hist) (failed)
      
      calloc() fails because symbol__size(sym) is too large. As shown in the error
      message (start=0x1477100, end=0x80002000), the symbol size is about 2G.
      
      This is the same problem as 'perf annotate: Fix s390 gap between kernel
      end and module start (b9c0a649)'. Perf gets symbol information from
      /proc/kallsyms in __dso__load_kallsyms. Part of /proc/kallsyms on my
      virtual machine is as follows:
       #cat /proc/kallsyms | sort
       ...
       ffff000001475080 d rpfilter_mt_reg      [ip6t_rpfilter]
       ffff000001475100 d $d   [ip6t_rpfilter]
       ffff000001475100 d __this_module        [ip6t_rpfilter]
       ffff000080080000 t _head
       ffff000080080000 T _text
       ffff000080080040 t pe_header
       ...
      
      Take the line 'ffff000001475100 d __this_module [ip6t_rpfilter]' as an example.
      The start and end of the symbol are both set to ffff000001475100 in
      dso__load_all_kallsyms. Then symbols__fixup_end sets the end of the symbol
      to the next higher address after ffff000001475100 in /proc/kallsyms, which is
      ffff000080080000 in this example. The size of the symbol then becomes about
      2G, which causes the problem.
      
      The start of the modules on my machine is
       ffff000000a62000 t $x   [dm_mod]
      
      The start of the kernel on my machine is
       ffff000080080000 t _head
      
      There is a big gap between the end of the modules and the beginning of the
      kernel if only a small amount of memory is used by modules. The last symbol
      in the modules then gets a large address range because it contains the big gap.
      
      Given that the order of the module and kernel text segments may change in
      the future, fix this by limiting the range of the last symbol in the modules
      and in the kernel to 4K on arm64.
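The sizing problem and the 4K cap can be sketched in Python. This is a simplified model under stated assumptions, not the perf implementation: the tuple layout, the `fixup_ends` helper and the way the last symbol is handled are hypothetical, and only the addresses are taken from the kallsyms excerpt in this message.

```python
SYMBOL_LIMIT = 1 << 12  # the 4K cap described in the commit message

def fixup_ends(symbols, limit=None):
    """symbols: list of (start, name, module_or_None), sorted by start.

    Returns (name, start, end) tuples. Without a limit every symbol ends
    where the next one starts. With a limit, a symbol that is followed by a
    symbol from the other side of the module/kernel boundary is capped.
    """
    out = []
    for i, (start, name, module) in enumerate(symbols):
        if i + 1 == len(symbols):
            end = start  # last symbol overall: left zero-sized in this sketch
        else:
            next_start, _, next_module = symbols[i + 1]
            crosses_boundary = (module is None) != (next_module is None)
            if limit is not None and crosses_boundary:
                end = start + limit
            else:
                end = next_start
        out.append((name, start, end))
    return out

# Addresses copied from the /proc/kallsyms excerpt in the message.
symbols = [
    (0xffff000001475080, "rpfilter_mt_reg", "ip6t_rpfilter"),
    (0xffff000001475100, "__this_module", "ip6t_rpfilter"),
    (0xffff000080080000, "_text", None),
]

unlimited = {name: end - start for name, start, end in fixup_ends(symbols)}
limited = {name: end - start for name, start, end in fixup_ends(symbols, SYMBOL_LIMIT)}
```

In the unlimited case `__this_module` spans the whole gap up to `_text`, which is the roughly 2G size that made the histogram allocation fail; with the cap it is 4096 bytes.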
      Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hewenliang <hewenliang4@huawei.com>
      Cc: Hu Shiyuan <hushiyuan@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/33fd24c4-0d5a-9d93-9b62-dffa97c992ca@huawei.com
      [ refreshed the patch on current codebase, added string.h include as strchr() is used ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build-test: Honour JOBS to override detection of number of cores · 7b1642f2
      Arnaldo Carvalho de Melo authored
      When one does:
      
        $ make -C tools/perf build-test
      
      The makefile in tools/perf/tests/ will, just like the main one, detect
      how many cores are in the system and use it with -j.
      
      Sometimes we need to override that, for instance when using icecream or
      distcc to spread the build across multiple machines. In that case, as
      with the main makefile, use:
      
        $ make JOBS=N -C tools/perf build-test
      
      Fix the tests makefile to honour that.
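The intended precedence, an explicit JOBS value first and core detection as the fallback, can be sketched in Python. The real logic lives in the makefile; the `pick_jobs` helper and the environment dictionaries below are hypothetical and only illustrate the rule.

```python
import os

def pick_jobs(env, detected_cores):
    """Return the value to pass to -j: JOBS from env if set, else the core count."""
    jobs = env.get("JOBS")
    return int(jobs) if jobs else detected_cores

detected = os.cpu_count() or 1  # cpu_count() may return None on some systems

default_jobs = pick_jobs({}, detected)                 # no override given
overridden_jobs = pick_jobs({"JOBS": "32"}, detected)  # e.g. a distcc cluster
```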
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20200330130301.GA31702@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Add --show-cgroup-events option · 160d4af9
      Namhyung Kim authored
      The --show-cgroup-events option prints CGROUP events in the output,
      like the existing options for other event types.
      
      Committer testing:
      
        [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.039 MB perf.data (487 samples) ]
        [root@seventh ~]# perf script --show-cgroup-events | grep PERF_RECORD_CGROUP -B2 -A2
                 swapper     0     0.000000: PERF_RECORD_CGROUP cgroup: 1 /
                    perf 12145 11200.440730:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                    perf 12145 11200.440733:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
        --
                  cgtest 12145 11200.440739:     193472 cycles:  ffffffffb90f6fbc commit_creds+0x1fc (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12145 11200.440790:    2691608 cycles:      7fa2cb43019b _dl_sysdep_start+0x7cb (/usr/lib64/ld-2.29.so)
                  cgtest 12145 11200.440962: PERF_RECORD_CGROUP cgroup: 83 /sub
                  cgtest 12147 11200.441054:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12147 11200.441057:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
        --
                  cgtest 12148 11200.441103:      10227 cycles:  ffffffffb9a0153d end_repeat_nmi+0x48 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12148 11200.441106:     273295 cycles:  ffffffffb99ecbc7 copy_page+0x7 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12147 11200.441133: PERF_RECORD_CGROUP cgroup: 88 /sub/cgrp1
                  cgtest 12147 11200.441143:    2788845 cycles:  ffffffffb94676c2 security_genfs_sid+0x102 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12148 11200.441162: PERF_RECORD_CGROUP cgroup: 93 /sub/cgrp2
                  cgtest 12148 11200.441182:    2669546 cycles:            401020 _init+0x20 (/wb/cgtest)
                  cgtest 12149 11200.441247:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
        [root@seventh ~]#
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-10-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf top: Add --all-cgroups option · f382842f
      Namhyung Kim authored
      The --all-cgroups option enables cgroup profiling support.  It
      tells the kernel to record CGROUP events in the ring buffer so that 'perf
      top' can identify the task/cgroup association later.
      
      Committer testing:
      
      Use:
      
        # perf top --all-cgroups -s cgroup_id,cgroup,pid
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-9-namhyung@kernel.org
      Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org
      [ Extracted the HAVE_FILE_HANDLE from the followup patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Add --all-cgroups option · 8fb4b679
      Namhyung Kim authored
      The --all-cgroups option enables cgroup profiling support.  It
      tells the kernel to record CGROUP events in the ring buffer so that perf
      report can identify the task/cgroup association later.
      
        [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.042 MB perf.data (558 samples) ]
        [root@seventh ~]# perf report --stdio -s cgroup_id,cgroup,pid
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 558  of event 'cycles'
        # Event count (approx.): 458017341
        #
        # Overhead  cgroup id (dev/inode)  Cgroup          Pid:Command
        # ........  .....................  ..........  ...............
        #
            33.15%  4/0xeffffffb           /sub           9615:looper0
            32.83%  4/0xf00002f5           /sub/cgrp2     9620:looper2
            32.79%  4/0xf00002f4           /sub/cgrp1     9619:looper1
             0.35%  4/0xf00002f5           /sub/cgrp2     9618:cgtest
             0.34%  4/0xf00002f4           /sub/cgrp1     9617:cgtest
             0.32%  4/0xeffffffb           /              9615:looper0
             0.11%  4/0xeffffffb           /sub           9617:cgtest
             0.10%  4/0xeffffffb           /sub           9618:cgtest
      
        #
        # (Tip: Sample related events with: perf record -e '{cycles,instructions}:S')
        #
        [root@seventh ~]#
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-8-namhyung@kernel.org
      Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org
      [ Extracted the HAVE_FILE_HANDLE from the followup patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Support synthesizing cgroup events · ab64069f
      Namhyung Kim authored
      Synthesize cgroup events by iterating cgroup filesystem directories.
      The cgroup event only saves the portion of cgroup path after the mount
      point and the cgroup id (which actually is a file handle).
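The directory walk described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the perf implementation: `walk_cgroups` is a hypothetical helper, a temporary directory stands in for a cgroup mount point, and `st_ino` stands in for the id, whereas the message says the real id is taken from a file handle.

```python
import os
import tempfile

def walk_cgroups(mount_point):
    """Yield (cgroup_id, path) pairs, with path relative to mount_point.

    st_ino is used as a stand-in id in this sketch.
    """
    for dirpath, dirnames, _files in os.walk(mount_point):
        dirnames.sort()  # make the traversal order deterministic
        rel = os.path.relpath(dirpath, mount_point)
        path = "/" if rel == "." else "/" + rel.replace(os.sep, "/")
        yield os.stat(dirpath).st_ino, path

# A throwaway directory tree plays the role of the cgroup filesystem.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub", "cgrp1"))
    os.makedirs(os.path.join(root, "sub", "cgrp2"))
    found = list(walk_cgroups(root))

paths = [path for _cgroup_id, path in found]
```

Only the part of each path after the mount point is kept, so the mount point itself is reported as "/".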
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-7-namhyung@kernel.org
      Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org
      [ Extracted the HAVE_FILE_HANDLE from the followup patch, added missing __maybe_unused ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Add 'cgroup' sort key · b629f3e9
      Namhyung Kim authored
      The cgroup sort key shows the cgroup membership of each task.
      Currently it shows the full path in the cgroupfs (not relative to the root
      of the cgroup namespace) since it'd be more intuitive IMHO.  Otherwise the
      root cgroups in different namespaces would all show the same name - "/".
      
      The cgroup sort key should come before cgroup_id, otherwise
      sort_dimension__add() will match it to cgroup_id, as it only matches
      against the given substring.
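Why the table order matters can be shown with a short Python sketch. It assumes that the lookup returns the first table entry whose name starts with the requested token, which is one reading of the substring matching described above; `find_dimension` and the two tables are hypothetical and are not the sort_dimension__add() code.

```python
def find_dimension(table, token):
    """Return the first name in table that starts with token (case-insensitive)."""
    token = token.lower()
    for name in table:
        if name.lower().startswith(token):
            return name
    return None

# With cgroup_id listed first, asking for "cgroup" hits the wrong entry.
wrong_order = ["pid", "cgroup_id", "cgroup"]
right_order = ["pid", "cgroup", "cgroup_id"]
```

With `right_order`, the token "cgroup" resolves to the cgroup key and "cgroup_id" still resolves to itself, because "cgroup" is not matched by the longer token.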
      
      For example, it will look like the following.  Note that the record patch
      adding --all-cgroups will come later.
      
        $ perf record -a --namespace --all-cgroups  cgtest
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.208 MB perf.data (4090 samples) ]
      
        $ perf report -s cgroup_id,cgroup,pid
        ...
        # Overhead  cgroup id (dev/inode)  Cgroup          Pid:Command
        # ........  .....................  ..........  ...............
        #
            93.96%  0/0x0                  /                 0:swapper
             1.25%  3/0xeffffffb           /               278:looper0
             0.86%  3/0xf000015f           /sub/cgrp1      280:cgtest
             0.37%  3/0xf0000160           /sub/cgrp2      281:cgtest
             0.34%  3/0xf0000163           /sub/cgrp3      282:cgtest
             0.22%  3/0xeffffffb           /sub            278:looper0
             0.20%  3/0xeffffffb           /               280:cgtest
             0.15%  3/0xf0000163           /sub/cgrp3      285:looper3
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cgroup: Maintain cgroup hierarchy · d1277aa3
      Namhyung Kim authored
      Each cgroup is kept in the perf_env's cgroup_tree, sorted by the cgroup
      id.  Hist entries that have a cgroup id can compare it directly, and later it
      can be used to find the cgroup name using this tree.
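An id-ordered lookup structure of this kind can be sketched in Python. The sketch swaps in a sorted list with binary search for the tree, so it shows the idea rather than the perf data structure; the `CgroupTree` class is hypothetical, and the ids and names are borrowed from the PERF_RECORD_CGROUP lines shown in the --show-cgroup-events commit.

```python
import bisect

class CgroupTree:
    """Sorted-list stand-in for a tree of cgroups ordered by id."""

    def __init__(self):
        self._ids = []
        self._names = []

    def insert(self, cgroup_id, name):
        i = bisect.bisect_left(self._ids, cgroup_id)
        if i < len(self._ids) and self._ids[i] == cgroup_id:
            self._names[i] = name  # id already known: update the name
        else:
            self._ids.insert(i, cgroup_id)
            self._names.insert(i, name)

    def find(self, cgroup_id):
        i = bisect.bisect_left(self._ids, cgroup_id)
        if i < len(self._ids) and self._ids[i] == cgroup_id:
            return self._names[i]
        return None

tree = CgroupTree()
for cgroup_id, name in [(93, "/sub/cgrp2"), (1, "/"), (88, "/sub/cgrp1"), (83, "/sub")]:
    tree.insert(cgroup_id, name)

name_for_88 = tree.find(88)
name_for_unknown = tree.find(2)
```

Entries only need to carry the numeric id; the name is resolved through the structure when it is displayed.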
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Basic support for CGROUP event · ba78c1c5
      Namhyung Kim authored
      Implement basic functionality to support cgroup tracking.  Each cgroup
      can be identified by its inode number, which can be read from userspace too.
      The actual cgroup processing will come in a later patch.
      Reported-by: kernel test robot <rong.a.chen@intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      [ fix perf test failure on sampling parsing ]
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add file-handle feature test · 49f550ea
      Namhyung Kim authored
      The file handle (FHANDLE) support is configurable, so some systems might not
      have it.  Add a config feature item to check for it at build time, so that the
      cgroup tracking feature is only added when it is available.
      
      Committer notes:
      
      Had to make the test use the same construct as its later use in
      synthetic-events.c, in the next patch in this series. i.e. make it be:
      
      	struct {
      		struct file_handle fh;
      		uint64_t cgroup_id;
      	} handle;
      
      To cope with:
      
          CC       /tmp/build/perf/util/cloexec.o
          CC       /tmp/build/perf/util/call-path.o
        util/synthetic-events.c:428:22: error: field 'fh' with variable sized type
              'struct file_handle' not at the end of a struct or class is a GNU
              extension [-Werror,-Wgnu-variable-sized-type-not-at-end]
                        struct file_handle fh;
                                           ^
        1 error generated.
      
      Deal with this at some point, i.e. investigate whether the right thing is to
      remove -Wgnu-variable-sized-type-not-at-end from our CFLAGS. For now, doing
      the test the same way as the later use looks more sensible.
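The memory layout that the construct relies on can be sketched with Python's struct module. This is an illustration under stated assumptions: it takes struct file_handle to be an unsigned int handle_bytes, an int handle_type and a trailing flexible byte array, so that the 64-bit cgroup_id sits where the handle bytes start. The helper names and the handle_type value are hypothetical.

```python
import struct

# Assumed layout: unsigned int handle_bytes, int handle_type, then the 64-bit
# id occupying the space of the flexible f_handle[] array.
HANDLE_FORMAT = "=IiQ"

def pack_handle(handle_bytes, handle_type, cgroup_id):
    """Build the raw bytes of the combined handle structure."""
    return struct.pack(HANDLE_FORMAT, handle_bytes, handle_type, cgroup_id)

def cgroup_id_from_handle(buf):
    """Extract the 64-bit id, checking that the handle really is 8 bytes long."""
    handle_bytes, _handle_type, cgroup_id = struct.unpack(HANDLE_FORMAT, buf)
    if handle_bytes != 8:
        raise ValueError("unexpected handle size: %d" % handle_bytes)
    return cgroup_id

# handle_type=1 is a made-up value; the id is one shown in the perf report output.
buf = pack_handle(8, 1, 0xf00002f5)
cgroup_id = cgroup_id_from_handle(buf)
```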
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org
      [ split from a larger patch, removed blank line at EOF ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf python: Include rwsem.c in the python binding · 460c3ed9
      Arnaldo Carvalho de Melo authored
      We'll need it for the cgroup patches, and it's better to have it in a
      separate patch in case we later need to revert the cgroup patches.
      
      I.e. without this we have:
      
        [root@five ~]# perf test -v python
        19: 'import perf' in python                               :
        --- start ---
        test child forked, pid 148447
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ImportError: /tmp/build/perf/python/perf.cpython-37m-x86_64-linux-gnu.so: undefined symbol: down_write
        test child finished with -1
        ---- end ----
        'import perf' in python: FAILED!
        [root@five ~]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200403123606.GC23243@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 02 Apr, 2020 1 commit
  3. 27 Mar, 2020 5 commits
    • perf/core: Add PERF_SAMPLE_CGROUP feature · 6546b19f
      Namhyung Kim authored
      The PERF_SAMPLE_CGROUP bit saves (perf_event) cgroup information in
      the sample.  It adds a 64-bit id that identifies the current cgroup; the id
      is the file handle in the cgroup file system.  Userspace should use this
      information together with the PERF_RECORD_CGROUP event to find out which
      cgroup a sample belongs to.
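How userspace might pair the two kinds of records can be sketched in Python. The event tuples and the `resolve_samples` helper are hypothetical stand-ins for the real record formats; the sketch only shows the matching by id.

```python
def resolve_samples(events):
    """Map each sample to a cgroup path using the CGROUP records seen so far.

    events: iterable of ("cgroup", id, path) or ("sample", id) tuples in time
    order. Returns one path per sample, or None when the id is unknown.
    """
    paths = {}
    resolved = []
    for event in events:
        if event[0] == "cgroup":
            _kind, cgroup_id, path = event
            paths[cgroup_id] = path
        elif event[0] == "sample":
            resolved.append(paths.get(event[1]))
    return resolved

# Hypothetical stream: two CGROUP records interleaved with three samples.
events = [
    ("cgroup", 1, "/"),
    ("sample", 1),
    ("cgroup", 83, "/sub"),
    ("sample", 83),
    ("sample", 99),  # no CGROUP record for this id
]
resolved = resolve_samples(events)
```

The last sample stays unresolved, which is the case that synthesized CGROUP events for pre-existing cgroups are meant to cover.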
      
      I put it before PERF_SAMPLE_AUX for simplicity since it just needs a
      64-bit word.  But if we want bigger samples, I can work in that
      direction too.
      
      Committer testing:
      
        $ pahole perf_sample_data | grep -w cgroup -B5 -A5
        	/* --- cacheline 4 boundary (256 bytes) was 56 bytes ago --- */
        	struct perf_regs           regs_intr;            /*   312    16 */
        	/* --- cacheline 5 boundary (320 bytes) was 8 bytes ago --- */
        	u64                        stack_user_size;      /*   328     8 */
        	u64                        phys_addr;            /*   336     8 */
        	u64                        cgroup;               /*   344     8 */
      
        	/* size: 384, cachelines: 6, members: 22 */
        	/* padding: 32 */
        };
        $
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf/core: Add PERF_RECORD_CGROUP event · 96aaab68
      Namhyung Kim authored
      To support cgroup tracking, add a CGROUP event that saves a link between
      a cgroup path and its id number.  This is needed since cgroups can go away
      by the time userspace tries to read the cgroup info (from the id) later.
      
      The attr.cgroup bit was also added to enable cgroup tracking from
      userspace.
      
      This event will be generated when a new cgroup becomes active.
      Userspace might need to synthesize those events for existing cgroups.
      
      Committer testing:
      
      From the resulting kernel, using /sys/kernel/btf/vmlinux:
      
        $ pahole perf_event_attr | grep -w cgroup -B5 -A1
        	__u64                      write_backward:1;     /*    40:27  8 */
        	__u64                      namespaces:1;         /*    40:28  8 */
        	__u64                      ksymbol:1;            /*    40:29  8 */
        	__u64                      bpf_event:1;          /*    40:30  8 */
        	__u64                      aux_output:1;         /*    40:31  8 */
        	__u64                      cgroup:1;             /*    40:32  8 */
        	__u64                      __reserved_1:31;      /*    40:33  8 */
        $
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Tejun Heo <tj@kernel.org>
      [staticize perf_event_cgroup function]
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Introduce --deltatime option · 26567ed7
      Hagen Paul Pfeifer authored
      For some kinds of analysis, deltatime output is more human friendly and
      reduces the cognitive load for further analysis.
      
      The following output demonstrates the new option "deltatime", which
      calculates the time difference relative to the previous event.
      
        $ perf script --deltatime
        test  2525 [001]     0.000000:            sdt_libev:ev_add: (5635e72a5ebd)
        test  2525 [001]     0.000091:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     1.000051: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000685:            sdt_libev:ev_add: (5635e72a5ebd)
        test  2525 [001]     0.000048:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     1.000104: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.003895:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     0.996034: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000058:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     1.000004: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000064:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     0.999934: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000056:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     0.999930: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
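The difference between a delta to the previous event and a time relative to the first event can be sketched in Python. The helpers and the absolute timestamps are hypothetical; the timestamps were picked so that the deltas reproduce the first four lines of the output above.

```python
def reltime(timestamps_ns):
    """Time of each event relative to the first event."""
    if not timestamps_ns:
        return []
    first = timestamps_ns[0]
    return [t - first for t in timestamps_ns]

def deltatime(timestamps_ns):
    """Time of each event relative to the previous event (0 for the first)."""
    if not timestamps_ns:
        return []
    deltas = [0]
    for prev, cur in zip(timestamps_ns, timestamps_ns[1:]):
        deltas.append(cur - prev)
    return deltas

def fmt(ns):
    """Format nanoseconds as seconds.microseconds."""
    return f"{ns // 1_000_000_000}.{ns % 1_000_000_000 // 1000:06d}"

# Made-up absolute timestamps in nanoseconds.
stamps = [5_000_000_000, 5_000_091_000, 6_000_142_000, 6_000_827_000]

delta_column = [fmt(d) for d in deltatime(stamps)]
rel_column = [fmt(r) for r in reltime(stamps)]
```

The relative column keeps growing, while the delta column makes the roughly one second spent between epoll_wait_enter and epoll_wait_return stand out directly.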
      
      Committer testing:
      
      So go from the default output to --reltime and then to this new --deltatime, to
      contrast the various timestamp presentation modes for a random perf.data file I
      had lying around:
      
        [root@five ~]# perf script --reltime | head
           perf 442394 [000]     0.000000:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000002:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000004:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000006:  128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000009: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000036:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000038:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000040:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000041:  224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000044: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
        [root@five ~]# perf script --deltatime | head
           perf 442394 [000]     0.000000:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000002:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000001:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000001:  128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000002: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000027:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000002:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000001:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000001:  224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000002: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
        [root@five ~]# perf script | head
           perf 442394 [000]  7600.157861:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157864:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157866:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157867:  128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157870: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157897:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157900:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157901:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157903:  224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157906: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
        [root@five ~]#
      
      Andi suggested it would be better to implement it as a new field, i.e. -F deltatime, like:
      
        [root@five ~]# perf script -F deltatime
        Invalid field requested.
      
         Usage: perf script [<options>]
            or: perf script [<options>] record <script> [<record-options>] <command>
            or: perf script [<options>] report <script> [script-args]
            or: perf script [<options>] <script> [<record-options>] <command>
            or: perf script [<options>] <top-script> [script-args]
      
            -F, --fields <str>    comma separated output fields prepend with 'type:'. +field to add and -field to remove.Valid types: hw,sw,trace,raw,synth. Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,srcline,period,iregs,uregs,brstack,brstacksym,flags,bpf-output,brstackinsn,brstackoff,callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc
        [root@five ~]#
      
      I.e. we have -F for maximum flexibility:
      
        [root@five ~]# perf script -F comm,pid,cpu,time | head
                  perf 442394 [000]  7600.157861:
                  perf 442394 [000]  7600.157864:
                  perf 442394 [000]  7600.157866:
                  perf 442394 [000]  7600.157867:
                  perf 442394 [000]  7600.157870:
                  perf 442394 [001]  7600.157897:
                  perf 442394 [001]  7600.157900:
                  perf 442394 [001]  7600.157901:
                  perf 442394 [001]  7600.157903:
                  perf 442394 [001]  7600.157906:
        [root@five ~]#
      
      But since we already have --reltime, having --deltatime as well, with the
      two documented one after the other, is sensible.
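
      As a rough illustration of what a per-sample delta means, here is a
      minimal sketch. It is not perf's implementation: sketch_deltatime() is a
      hypothetical name, and reporting 0 for the first sample is an assumption
      made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not perf code: fill out[i] with the time elapsed
 * between sample i-1 and sample i. Timestamps are in nanoseconds and are
 * assumed to be non-decreasing. The first sample has no predecessor, so
 * its delta is reported as 0 (an assumption of this sketch).
 */
static void sketch_deltatime(const uint64_t *ts_ns, size_t n, uint64_t *out)
{
	uint64_t prev = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		out[i] = i ? ts_ns[i] - prev : 0;
		prev = ts_ns[i];
	}
}
```

      Each value is relative to the previous sample, which differs from a
      --reltime style view where every timestamp is relative to the start.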
      Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lore.kernel.org/lkml/20200204173709.489161-1-hagen@jauu.net
      [ Added 'perf script' man page entry for --deltatime ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      26567ed7
    • Adrian Hunter's avatar
      perf test x86: Add CET instructions to the new instructions test · 26cec748
      Adrian Hunter authored
      Add to the "x86 instruction decoder - new instructions" test the
      following instructions:
      
      	incsspd
      	incsspq
      	rdsspd
      	rdsspq
      	saveprevssp
      	rstorssp
      	wrssd
      	wrssq
      	wrussd
      	wrussq
      	setssbsy
      	clrssbsy
      	endbr32
      	endbr64
      
      And the "notrack" prefix for indirect calls and jumps.
      
      For information about the instructions, refer to the Intel Control-flow
      Enforcement Technology Specification, May 2019 (334525-003).
      
      Committer testing:
      
        $ perf test instr
        67: x86 instruction decoder - new instructions            : Ok
        $
      
      Then use verbose mode and check one of those new instructions:
      
        $ perf test -v instr |& grep saveprevssp
        Decoded ok: f3 0f 01 ea          	saveprevssp
        Decoded ok: f3 0f 01 ea          	saveprevssp
        $
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi v. Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20200204171425.28073-3-yu-cheng.yu@intel.com
      Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      26cec748
    • Yu-cheng Yu's avatar
      x86/insn: Add Control-flow Enforcement (CET) instructions to the opcode map · 315a4af8
      Yu-cheng Yu authored
      Add the following CET instructions to the opcode map:
      
        INCSSP:
            Increment Shadow Stack pointer (SSP).
      
        RDSSP:
            Read SSP into a GPR.
      
        SAVEPREVSSP:
            Use "previous ssp" token at top of current Shadow Stack (SHSTK) to
            create a "restore token" on the previous (outgoing) SHSTK.
      
        RSTORSSP:
            Restore from a "restore token" to SSP.
      
        WRSS:
            Write to kernel-mode SHSTK (kernel-mode instruction).
      
        WRUSS:
            Write to user-mode SHSTK (kernel-mode instruction).
      
        SETSSBSY:
            Verify the "supervisor token" pointed to by MSR_IA32_PL0_SSP, set the
            token busy, and then set the Shadow Stack pointer (SSP) to the value of
            MSR_IA32_PL0_SSP.
      
        CLRSSBSY:
            Verify the "supervisor token" and clear its busy bit.
      
        ENDBR64/ENDBR32:
            Mark a valid 64/32 bit control transfer endpoint.
      
      Detailed information about the CET instructions can be found in the Intel
      Software Developer's Manual.
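
      To make the encodings concrete, the following is a minimal sketch that
      recognises three fixed 4-byte CET encodings. It is not the kernel's
      opcode-map decoder: sketch_cet_mnemonic() is a hypothetical helper, the
      saveprevssp bytes are the ones printed by 'perf test -v instr', and the
      endbr64/endbr32 bytes are the standard f3 0f 1e fa / f3 0f 1e fb
      encodings.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical lookup, not the kernel's opcode-map decoder. Only three
 * fixed 4-byte encodings are recognised; everything else returns NULL.
 */
static const char *sketch_cet_mnemonic(const unsigned char *buf, size_t len)
{
	static const struct {
		unsigned char bytes[4];
		const char *name;
	} table[] = {
		{ { 0xf3, 0x0f, 0x1e, 0xfa }, "endbr64" },
		{ { 0xf3, 0x0f, 0x1e, 0xfb }, "endbr32" },
		{ { 0xf3, 0x0f, 0x01, 0xea }, "saveprevssp" },
	};
	size_t i;

	/* Too short to hold any of the encodings above. */
	if (len < 4)
		return NULL;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (!memcmp(buf, table[i].bytes, 4))
			return table[i].name;
	}
	return NULL;
}
```

      A real decoder has to handle prefixes, ModRM and operand sizes, which is
      why the instructions are added to the opcode map rather than matched as
      fixed byte strings like this.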
      Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: Tony Luck <tony.luck@intel.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi v. Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20200204171425.28073-2-yu-cheng.yu@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      315a4af8
  4. 26 Mar, 2020 2 commits
  5. 25 Mar, 2020 1 commit
  6. 24 Mar, 2020 19 commits
    • Ravi Bangoria's avatar
      perf dso: Fix dso comparison · 0d33b343
      Ravi Bangoria authored
      Perf gets dso details from two different sources: first, from buildid
      headers in perf.data, and second, from MMAP2 samples. A dso from a buildid
      header does not have dso_id detail, and a dso from MMAP2 samples does
      not have buildid information. If details of the same dso are present
      in both places, only the filename is common.
      
      Previously, __dsos__findnew_link_by_longname_id() used to compare only
      long or short names, but Commit 0e3149f8 ("perf dso: Move dso_id
      from 'struct map' to 'struct dso'") also added a dso_id comparison.
      Because of that, perf now creates two different dso objects for the
      same file, one from the buildid header (with buildid but without dso_id)
      and a second from the MMAP2 sample (with dso_id but without buildid).
      
      This is causing issues with the archive, buildid-list, etc. subcommands.
      Fix this by comparing dso_id only when it's present. And in case a dso is
      present in the 'dsos' list without dso_id, inject the dso_id detail as well.
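
      The idea can be sketched as follows. This is a simplified illustration
      under stated assumptions, not the patch itself: struct sketch_dso_id and
      the sketch_*() helpers are hypothetical stand-ins, and treating an
      all-zero id as "not present" is an assumption of the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a dso id; the field names are assumptions. */
struct sketch_dso_id {
	uint32_t maj;
	uint32_t min;
	uint64_t ino;
	uint64_t ino_generation;
};

/* An id with every field zero is treated as "not present". */
static bool sketch_dso_id_empty(const struct sketch_dso_id *id)
{
	return !id->maj && !id->min && !id->ino && !id->ino_generation;
}

/*
 * Returns 0 when the ids should be considered equal. If either side has
 * no id, the ids do not take part in the comparison, so two entries with
 * the same file name resolve to a single object.
 */
static int sketch_dso_id_cmp(const struct sketch_dso_id *a,
			     const struct sketch_dso_id *b)
{
	if (sketch_dso_id_empty(a) || sketch_dso_id_empty(b))
		return 0;
	if (a->maj != b->maj)
		return a->maj < b->maj ? -1 : 1;
	if (a->min != b->min)
		return a->min < b->min ? -1 : 1;
	if (a->ino != b->ino)
		return a->ino < b->ino ? -1 : 1;
	if (a->ino_generation != b->ino_generation)
		return a->ino_generation < b->ino_generation ? -1 : 1;
	return 0;
}

/* Copy the id into an entry that was created without one. */
static void sketch_dso_inject_id(struct sketch_dso_id *dst,
				 const struct sketch_dso_id *src)
{
	if (sketch_dso_id_empty(dst))
		*dst = *src;
}
```

      With a comparison like this, the entry created from the buildid header
      is found again when the MMAP2 sample arrives, and the id from the sample
      can be injected into it instead of creating a second object.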
      
      Before:
      
        $ sudo ./perf buildid-list -H
        0000000000000000000000000000000000000000 /usr/bin/ls
        0000000000000000000000000000000000000000 /usr/lib64/ld-2.30.so
        0000000000000000000000000000000000000000 /usr/lib64/libc-2.30.so
      
        $ ./perf archive
        perf archive: no build-ids found
      
      After:
      
        $ ./perf buildid-list -H
        b6b1291d0cead046ed0fa5734037fa87a579adee /usr/bin/ls
        641f0c90cfa15779352f12c0ec3c7a2b2b6f41e8 /usr/lib64/ld-2.30.so
        675ace3ca07a0b863df01f461a7b0984c65c8b37 /usr/lib64/libc-2.30.so
      
        $ ./perf archive
        Now please run:
      
        $ tar xvf perf.data.tar.bz2 -C ~/.debug
      
        wherever you need to run 'perf report' on.
      
      Committer notes:
      
      Renamed is_empty_dso_id() to dso_id__empty() and inject_dso_id() to
      dso__inject_id() to keep namespacing consistent.
      
      Fixes: 0e3149f8 ("perf dso: Move dso_id from 'struct map' to 'struct dso'")
      Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20200324042424.68366-1-ravi.bangoria@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0d33b343
    • Christophe JAILLET's avatar
      perf cpumap: Fix snprintf overflow check · d74b181a
      Christophe JAILLET authored
      'snprintf' returns the number of characters which would be generated for
      the given input.
      
      If the returned value is *greater than* or equal to the buffer size, it
      means that the output has been truncated.
      
      Fix the overflow test accordingly.
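
      A minimal sketch of the corrected check, assuming a hypothetical helper
      sketch_build_path() (the name and the "<dir>/<name>" format are made up
      for illustration and are not taken from the perf sources):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build "<dir>/<name>" into buf. Returns true only
 * if the whole string, plus the terminating NUL, fit into the buffer.
 */
static bool sketch_build_path(char *buf, size_t size,
			      const char *dir, const char *name)
{
	int ret = snprintf(buf, size, "%s/%s", dir, name);

	/*
	 * ret is the length that would have been written without the size
	 * limit, not counting the NUL. A value equal to size therefore
	 * already means the last character was dropped, so the truncation
	 * test must be ">= size", not "> size".
	 */
	return ret >= 0 && (size_t)ret < size;
}
```

      For example, "/sys/cpu" is 8 characters long, so it is truncated in an
      8-byte buffer and only fits once the buffer has 9 bytes.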
      
      Fixes: 7780c25b ("perf tools: Allow ability to map cpus to nodes easily")
      Fixes: 92a7e127 ("perf cpumap: Add cpu__max_present_cpu()")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Suggested-by: David Laight <David.Laight@ACULAB.COM>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: He Zhe <zhe.he@windriver.com>
      Cc: Jan Stancek <jstancek@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kernel-janitors@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200324070319.10901-1-christophe.jaillet@wanadoo.fr
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d74b181a
    • John Garry's avatar
      perf test: Test pmu-events aliases · 956a7835
      John Garry authored
      Add creating event aliases to the pmu-events test.
      
      So currently we verify that the generated pmu-events.c is as expected for
      some test events. Now test that we generate aliases as expected for those
      events during normal operation.
      
      For that, we cycle through each HW PMU in the system, and use the test
      events to create aliases, and verify those against known, expected values.
      
      For core PMUs, we should create an alias for every event in
      test_cpu_events[].
      
      Uncore PMUs, however, need to be matched by the pmu_event.pmu member, so
      use test_uncore_events[] for them and check the match beforehand with
      pmu_uncore_alias_match().
      
      A sample run is as follows for my x86 machine:
      
        john@linux-3c19:~/linux> tools/perf/perf test -vv 10
        10: PMU events                                            :
        --- start ---
      
        ...
      
        testing PMU uncore_arb aliases: no events to match
        testing PMU cstate_pkg aliases: no events to match
        skipping testing PMU breakpoint
        testing aliases PMU uncore_cbox_1: matched event unc_cbo_xsnp_response.miss_eviction
        testing PMU uncore_cbox_1 aliases: pass
        testing PMU power aliases: no events to match
        testing aliases PMU cpu: matched event bp_l1_btb_correct
        testing aliases PMU cpu: matched event bp_l2_btb_correct
        testing aliases PMU cpu: matched event segment_reg_loads.any
        testing aliases PMU cpu: matched event dispatch_blocked.any
        testing aliases PMU cpu: matched event eist_trans
        testing PMU cpu aliases: pass
        testing PMU intel_pt aliases: no events to match
        skipping testing PMU software
        skipping testing PMU intel_bts
        testing PMU uncore_imc aliases: no events to match
        testing aliases PMU uncore_cbox_0: matched event unc_cbo_xsnp_response.miss_eviction
        testing PMU uncore_cbox_0 aliases: pass
        testing PMU cstate_core aliases: no events to match
        skipping testing PMU tracepoint
        testing PMU msr aliases: no events to match
        test child finished with 0
      Signed-off-by: John Garry <john.garry@huawei.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linuxarm@huawei.com
      Link: http://lore.kernel.org/lkml/1584442939-8911-8-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      956a7835
    • John Garry's avatar
      perf pmu: Make pmu_uncore_alias_match() public · 5b9a5000
      John Garry authored
      The perf pmu-events test will want to use pmu_uncore_alias_match(), so
      make it public.
      Signed-off-by: John Garry <john.garry@huawei.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linuxarm@huawei.com
      Link: http://lore.kernel.org/lkml/1584442939-8911-7-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5b9a5000
    • John Garry's avatar
      perf pmu: Add is_pmu_core() · d504fae9
      John Garry authored
      Add a function to decide whether a PMU is a core PMU.
      Signed-off-by: John Garry <john.garry@huawei.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linuxarm@huawei.com
      Link: http://lore.kernel.org/lkml/1584442939-8911-6-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d504fae9
    • John Garry's avatar
      perf test: Add pmu-events test · a6c925fd
      John Garry authored
      The initial test will verify that the test tables in generated pmu-events.c
      match against known, expected values.
      
      For known events added in pmu-events/arch/test, we need to add an entry
      in test_cpu_events[] or test_uncore_events[].
      
      A sample run is as follows for x86:
      
        john@linux-3c19:~/linux> tools/perf/perf test -vv 10
        10: PMU event aliases                                     :
        --- start ---
        test child forked, pid 5316
        testing event table bp_l1_btb_correct: pass
        testing event table bp_l2_btb_correct: pass
        testing event table segment_reg_loads.any: pass
        testing event table dispatch_blocked.any: pass
        testing event table eist_trans: pass
        testing event table uncore_hisi_ddrc.flux_wcmd: pass
        testing event table unc_cbo_xsnp_response.miss_eviction: pass
        test child finished with 0
        ---- end ----
        PMU event aliases: Ok
      Signed-off-by: John Garry <john.garry@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linuxarm@huawei.com
      [ Fixup test_cpu_events[] and test_uncore_events[] sentinels to initialize one of its members to NULL, fixing the build in older compilers ]
      Link: http://lore.kernel.org/lkml/1584442939-8911-5-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a6c925fd
    • John Garry's avatar
      perf pmu: Refactor pmu_add_cpu_aliases() · e45ad701
      John Garry authored
      Create pmu_add_cpu_aliases_map() from pmu_add_cpu_aliases(), so the caller
      can pass the map; the pmu-events test would use this since there would
      be no CPUID matching to a mapfile there.
      Signed-off-by: John Garry <john.garry@huawei.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linuxarm@huawei.com
      Link: http://lore.kernel.org/lkml/1584442939-8911-4-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e45ad701
    • John Garry's avatar
      perf jevents: Support test events folder · d8447808
      John Garry authored
      With the goal of supporting pmu-events test case, introduce support for
      a test events folder.
      
      These test events can be used for testing generation of pmu-event tables
      and alias creation for any arch.
      
      When running the pmu-events test case, these test events will be used as
      the platform-agnostic events, so aliases can be created per-PMU and
      validated against known expected values.
      
      To support the test events, add a "testcpu" entry in pmu_events_map[].
      The pmu-events test will be able to lookup the events map for "testcpu",
      to verify the generated tables against expected values.
      
      The resultant generated pmu-events.c will now look like the following:
      
        struct pmu_event pme_ampere_emag[] = {
        {
        	.name = "ldrex_spec",
        	.event = "event=0x6c",
        	.desc = "Exclusive operation spe...",
        	.topic = "intrinsic",
        	.long_desc = "Exclusive operation ...",
        },
        ...
        };
      
        struct pmu_event pme_test_cpu[] = {
        {
        	.name = "uncore_hisi_ddrc.flux_wcmd",
        	.event = "event=0x2",
        	.desc = "DDRC write commands. Unit: hisi_sccl,ddrc ",
        	.topic = "uncore",
        	.long_desc = "DDRC write commands",
        	.pmu = "hisi_sccl,ddrc",
        },
        {
        	.name = "unc_cbo_xsnp_response.miss_eviction",
        	.event = "umask=0x81,event=0x22",
        	.desc = "Unit: uncore_cbox A cross-core snoop resulted ...",
        	.topic = "uncore",
        	.long_desc = "A cross-core snoop resulted from L3 ...",
        	.pmu = "uncore_cbox",
        },
        {
        	.name = "eist_trans",
        	.event = "umask=0x0,period=200000,event=0x3a",
        	.desc = "Number of Enhanced Intel SpeedStep(R) ...",
        	.topic = "other",
        },
        {
        	.name = 0,
        },
        };
      
        struct pmu_events_map pmu_events_map[] = {
        ...
        {
        	.cpuid = "0x00000000500f0000",
        	.version = "v1",
        	.type = "core",
        	.table = pme_ampere_emag
        },
        ...
        {
        	.cpuid = "testcpu",
        	.version = "v1",
        	.type = "core",
        	.table = pme_test_cpu,
        },
        {
        	.cpuid = 0,
        	.version = 0,
        	.type = 0,
        	.table = 0,
        },
        };
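
      How a test might look up the "testcpu" entry in tables shaped like the
      ones above can be sketched as follows. This is an illustration under
      stated assumptions, not the pmu-events test code: the sketch_* types and
      functions are hypothetical, cut-down versions that keep only the fields
      the example needs.

```c
#include <stddef.h>
#include <string.h>

/* Cut-down stand-ins for the generated structures (assumed field subset). */
struct sketch_pmu_event {
	const char *name;
	const char *event;
};

struct sketch_pmu_events_map {
	const char *cpuid;
	const struct sketch_pmu_event *table;
};

static const struct sketch_pmu_event sketch_pme_test_cpu[] = {
	{ .name = "eist_trans", .event = "umask=0x0,period=200000,event=0x3a" },
	{ .name = NULL },	/* sentinel: ends the event table */
};

static const struct sketch_pmu_events_map sketch_map[] = {
	{ .cpuid = "testcpu", .table = sketch_pme_test_cpu },
	{ .cpuid = NULL, .table = NULL },	/* sentinel: ends the map */
};

/* Walk the map up to its sentinel and return the table for cpuid, or NULL. */
static const struct sketch_pmu_event *
sketch_find_table(const struct sketch_pmu_events_map *map, const char *cpuid)
{
	for (; map->cpuid; map++) {
		if (!strcmp(map->cpuid, cpuid))
			return map->table;
	}
	return NULL;
}

/* Count the events in a table by walking to its .name == NULL sentinel. */
static size_t sketch_count_events(const struct sketch_pmu_event *table)
{
	size_t n = 0;

	while (table && table[n].name)
		n++;
	return n;
}
```

      Both arrays rely on a zeroed last element as terminator, which is why
      the generated tables end with an all-zero entry.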
      Signed-off-by: John Garry <john.garry@huawei.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linuxarm@huawei.com
      Link: http://lore.kernel.org/lkml/1584442939-8911-3-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d8447808
    • John Garry's avatar
      perf jevents: Add some test events · c52db67a
      John Garry authored
      Add some test PMU events. The events are randomly chosen from x86 and
      arm64 JSONs. The events include CPU and uncore events.
      Signed-off-by: John Garry <john.garry@huawei.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linuxarm@huawei.com
      Link: http://lore.kernel.org/lkml/1584442939-8911-2-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c52db67a
    • Jiri Olsa's avatar
      perf tools: Unify a bit the build directory output · 7cd053d4
      Jiri Olsa authored
      Remove the extra 'SUBDIR' line from the clean and doc build output,
      because it's annoying ;-)
      
      Before:
      
        $ make clean
        ...
        SUBDIR   Documentation
        CLEAN    Documentation
      
      After:
      
        $ make clean
        ...
        CLEAN    Documentation
      
      Before:
      
        $ make doc
        BUILD:   Doing 'make -j8' parallel build
        SUBDIR   Documentation
        ASCIIDOC perf-stat.html
        ...
      
      After:
      
        $ make doc
        BUILD:   Doing 'make -j8' parallel build
        ASCIIDOC perf-stat.html
        ...
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200318204522.1200981-1-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7cd053d4
    • Arnaldo Carvalho de Melo's avatar
      tools headers uapi: Update linux/in.h copy · 29f36c16
      Arnaldo Carvalho de Melo authored
      To get the changes in:
      
        26776253 ("seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number")
      
      That ends up automatically adding the new IPPROTO_ETHERNET to the socket
      args beautifiers:
      
        $ tools/perf/trace/beauty/socket_ipproto.sh > before
      
      Apply this patch:
      
        $ tools/perf/trace/beauty/socket_ipproto.sh > after
        $ diff -u before after
        --- before	2020-03-19 11:48:36.876673819 -0300
        +++ after	2020-03-19 11:49:00.148541377 -0300
        @@ -6,6 +6,7 @@
         	[132] = "SCTP",
         	[136] = "UDPLITE",
         	[137] = "MPLS",
        +	[143] = "ETHERNET",
         	[17] = "UDP",
         	[1] = "ICMP",
         	[22] = "IDP",
        $
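
      The generated entries are C designated initializers, so a lookup over
      such a table can be sketched like this. The array and function names are
      hypothetical and this is not the beautifier code itself; the entries are
      the ones visible in the diff above.

```c
#include <stddef.h>

/* Entries copied from the diff above; the array name is hypothetical. */
static const char *const sketch_ipproto[] = {
	[1]   = "ICMP",
	[17]  = "UDP",
	[22]  = "IDP",
	[132] = "SCTP",
	[136] = "UDPLITE",
	[137] = "MPLS",
	[143] = "ETHERNET",
};

/* Return the protocol name, or NULL for values with no table entry. */
static const char *sketch_ipproto_name(unsigned int proto)
{
	if (proto >= sizeof(sketch_ipproto) / sizeof(sketch_ipproto[0]))
		return NULL;
	/* Slots not named in the initializer are implicitly NULL. */
	return sketch_ipproto[proto];
}
```

      Because the array is indexed by protocol number, the order of the
      entries in the generated output does not matter.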
      
      Addresses this tools/perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
        diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Paolo Lungaroni <paolo.lungaroni@cnit.it>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      29f36c16
    • Vijay Thakkar's avatar
      perf vendor events amd: Update Zen1 events to V2 · b5b8a7cf
      Vijay Thakkar authored
      This patch updates the PMCs for AMD Zen1 core based processors (Family
      17h; Models 0 through 2F) to be in accordance with PMCs as
      documented in the latest versions of the AMD Processor Programming
      Reference [1], [2] and [3]. Note that some events, such as FPU pipe
      assignment are missing in [1], and therefore [3] is included for full
      coverage of events.
      
      PMCs added:
      
        fpu_pipe_assignment.dual{0|1|2|3}
        fpu_pipe_assignment.total{0|1|2|3}
        ls_mab_alloc.dc_prefetcher
        ls_mab_alloc.stores
        ls_mab_alloc.loads
        bp_dyn_ind_pred
        bp_de_redirect
      
      PMC removed:
      
        ex_ret_cond_misp
      
      Cumulative counts, fpu_pipe_assignment.total and
      fpu_pipe_assignment.dual, existed in v1, but did not expose port-level
      counters.
      
      ex_ret_cond_misp has been removed because it is no longer in the latest
      versions of the PPR and, when tested on a Ryzen 3400G system, always
      seems to sample zero.
      
      [1]: Processor Programming Reference (PPR) for AMD Family 17h Models
      01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019.
      
      [2]: Processor Programming Reference (PPR) for AMD Family 17h Model 18h,
      Revision B1 Processors, 55570-B1 Rev 3.14 - Sep 26, 2019.
      
      [3]: OSRR for AMD Family 17h processors, Models 00h-2Fh, 56255 Rev 3.03 - July, 2018
      
      All of the PPRs can be found at:
      https://bugzilla.kernel.org/show_bug.cgi?id=206537
      Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
      Acked-by: Kim Phillips <kim.phillips@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jon Grimm <jon.grimm@amd.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: vijay thakkar <vijaythakkar@me.com>
      Link: http://lore.kernel.org/lkml/20200318190002.307290-4-vijaythakkar@me.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b5b8a7cf
    • Vijay Thakkar's avatar
      perf vendor events amd: Add Zen2 events · 2079f7aa
      Vijay Thakkar authored
      This patch adds PMU events for AMD Zen2 core based processors, namely,
      Matisse (model 71h), Castle Peak (model 31h) and Rome (model 2xh), as
      documented in the AMD Processor Programming Reference for Matisse [1].
      The model number regex has been set to detect all the models under
      family 17 that do not match those of Zen1, as the range is larger for
      zen2.
      
      Zen2 adds some additional counters that are not present in Zen1 and
      events for them have been added in this patch. Some counters have also
      been removed for Zen2 that were previously present in Zen1 and have been
      confirmed to always sample zero on zen2. These added/removed counters
      have been omitted for brevity but can be found here:
      https://gist.github.com/thakkarV/5b12ca5fd7488eb2c42e451e40bdd5f3
      
      Note that PPR for Zen2 [1] does not include some counters that were
      documented in the PPR for Zen1 based processors [2]. After having tested
      these counters, some of them that still work for zen2 systems have been
      preserved in the events for zen2. The counters that are omitted in [1]
      but are still measurable and non-zero on zen2 (tested on a Ryzen 3900X
      system) are the following:
      
        PMC 0x000 fpu_pipe_assignment.{total|total0|total1|total2|total3}
        PMC 0x004 fp_num_mov_elim_scal_op.*
        PMC 0x046 ls_tablewalker.*
        PMC 0x062 l2_latency.l2_cycles_waiting_on_fills
        PMC 0x063 l2_wcb_req.*
        PMC 0x06D l2_fill_pending.l2_fill_busy
        PMC 0x080 ic_fw32
        PMC 0x081 ic_fw32_miss
        PMC 0x086 bp_snp_re_sync
        PMC 0x087 ic_fetch_stall.*
        PMC 0x08C ic_cache_inval.*
        PMC 0x099 bp_tlb_rel
        PMC 0x0C7 ex_ret_brn_resync
        PMC 0x28A ic_oc_mode_switch.*
        L3PMC 0x001 l3_request_g1.*
        L3PMC 0x006 l3_comb_clstr_state.*
      
      [1]: Processor Programming Reference (PPR) for AMD Family 17h Model 71h,
      Revision B0 Processors, 56176 Rev 3.06 - Jul 17, 2019
      
      [2]: Processor Programming Reference (PPR) for AMD Family 17h Models
      01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019
      
      All of the PPRs can be found at:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=206537
      
      Here are the results of running "fpu_pipe_assignment.total" events on my
      Ryzen 3900X family 17h model 71h system:
      
      Before this patch:
      
        $> perf list *fpu_pipe_assignment*
      
      List of pre-defined events (to be used in -e):
      
      After:
      
        $> perf list *fpu_pipe_assignment*
      
        floating point:
        fpu_pipe_assignment.total
            [Total number of fp uOps]
        fpu_pipe_assignment.total0
            [Total number uOps assigned to pipe 0]
        fpu_pipe_assignment.total1
            [Total number uOps assigned to pipe 1]
        fpu_pipe_assignment.total2
            [Total number uOps assigned to pipe 2]
        fpu_pipe_assignment.total3
            [Total number uOps assigned to pipe 3]
      
        Metric Groups:
      
        $> perf stat -e fpu_pipe_assignment.total sleep 1
      
        Performance counter stats for 'sleep 1':
      
                    25,883      fpu_pipe_assignment.total
      
               1.004145868 seconds time elapsed
      
               0.001805000 seconds user
               0.000000000 seconds sys
      
      Usage tests while running Linpack in the background:
      
        $> perf stat -I1000 -e fpu_pipe_assignment.total
             1.000266796     79,313,191,516      fpu_pipe_assignment.total
             2.000809630     68,091,474,430      fpu_pipe_assignment.total
             3.001028115     52,925,023,174      fpu_pipe_assignment.total
      
        $> perf record -e fpu_pipe_assignment.total,fpu_pipe_assignment.total0 -a sleep 1
        [ perf record: Woken up 9 times to write data ]
        [ perf record: Captured and wrote 4.031 MB perf.data (64764 samples) ]
      
        $> perf report --stdio --no-header | head -30
            98.33%  xhpl             xhpl                          [.] dgemm_kernel
             0.28%  xhpl             xhpl                          [.] dtrsm_kernel_LT
             0.10%  xhpl             [kernel.kallsyms]             [k] entry_SYSCALL_64
             0.08%  xhpl             xhpl                          [.] idamax_k
             0.07%  baloo_file_extr  liblmdb.so                    [.] mdb_mid2l_insert
             0.06%  xhpl             xhpl                          [.] dgemm_itcopy
             0.06%  xhpl             xhpl                          [.] dgemm_oncopy
             0.06%  xhpl             [kernel.kallsyms]             [k] __schedule
             0.06%  xhpl             [kernel.kallsyms]             [k] syscall_trace_enter
             0.06%  xhpl             [kernel.kallsyms]             [k] native_sched_clock
             0.06%  xhpl             [kernel.kallsyms]             [k] pick_next_task_fair
             0.05%  xhpl             xhpl                          [.] blas_thread_server.llvm.15009391670273914865
             0.04%  xhpl             [kernel.kallsyms]             [k] do_syscall_64
             0.04%  xhpl             [kernel.kallsyms]             [k] yield_task_fair
             0.04%  xhpl             libpthread-2.31.so            [.] __pthread_mutex_unlock_usercnt
             0.03%  xhpl             [kernel.kallsyms]             [k] cpuacct_charge
             0.03%  xhpl             [kernel.kallsyms]             [k] syscall_return_via_sysret
             0.03%  xhpl             libc-2.31.so                  [.] __sched_yield
             0.03%  xhpl             [kernel.kallsyms]             [k] __calc_delta
      
        $> perf annotate --stdio2 dgemm_kernel | egrep '^ {0,2}[0-9]+' -B2 -A2
                        sub          $0x60,%rsp
                        mov          %rbx,(%rsp)
          0.00          mov          %rbp,0x8(%rsp)
                        mov          %r12,0x10(%rsp)
          0.00          mov          %r13,0x18(%rsp)
                        mov          %r14,0x20(%rsp)
                        mov          %r15,0x28(%rsp)
        --
                        mov          %rdi,%r13
                        mov          %rsi,0x28(%rsp)
          0.00          mov          %rdx,%r12
                        vmovsd       %xmm0,0x30(%rsp)
                        shl          $0x3,%r10
                        mov          0x28(%rsp),%rax
          0.00          xor          %rdx,%rdx
                        mov          $0x18,%rdi
                        div          %rdi
        --
                        nop
                  a0:   mov          %r12,%rax
          0.00          shl          $0x3,%rax
                        mov          %r8,%rdi
                        lea          (%r8,%rax,8),%r15
        --
                        mov          %r12,%rax
                        nop
          0.00    c0:   vmovups      (%rdi),%ymm1
          0.09          vmovups      0x20(%rdi),%ymm2
          0.02          vmovups      (%r15),%ymm3
          0.10          vmovups      %ymm1,(%rsi)
          0.07          vmovups      %ymm2,0x20(%rsi)
          0.07          vmovups      %ymm3,0x40(%rsi)
          0.06          add          $0x40,%rdi
                        add          $0x40,%r15
                        add          $0x60,%rsi
          0.00          dec          %rax
                      ↑ jne          c0
                        mov          %r9,%r15
        --
                        nop
                 110:   lea          0x80(%rsp),%rsi
          0.01          add          $0x60,%rsi
          0.03          mov          %r12,%rax
          0.00          sar          $0x3,%rax
                        cmp          $0x2,%rax
                      ↓ jl           d26
                        prefetcht0   0x200(%rdi)
          0.01          vmovups      -0x60(%rsi),%ymm1
          0.02          prefetcht0   0xa0(%rsi)
          0.00          vbroadcastsd -0x80(%rdi),%ymm0
          0.00          prefetcht0   0xe0(%rsi)
          0.03          vmovups      -0x40(%rsi),%ymm2
          0.00          prefetcht0   0x120(%rsi)
                        vmovups      -0x20(%rsi),%ymm3
                        vmulpd       %ymm0,%ymm1,%ymm4
          0.01          prefetcht0   0x160(%rsi)
                        vmulpd       %ymm0,%ymm2,%ymm8
          0.01          vmulpd       %ymm0,%ymm3,%ymm12
          0.02          prefetcht0   0x1a0(%rsi)
          0.01          vbroadcastsd -0x78(%rdi),%ymm0
                        vmulpd       %ymm0,%ymm1,%ymm5
          0.01          vmulpd       %ymm0,%ymm2,%ymm9
                        vmulpd       %ymm0,%ymm3,%ymm13
          0.01          vbroadcastsd -0x70(%rdi),%ymm0
                        vmulpd       %ymm0,%ymm1,%ymm6
          0.00          vmulpd       %ymm0,%ymm2,%ymm10
          0.00          add          $0x60,%rsi
      
        ... snip ...
      
                        nop
                65e0:   vmovddup     -0x60(%rsi),%xmm2
          0.00          vmovups      -0x80(%rdi),%xmm0
                        vmovups      -0x70(%rdi),%xmm1
          0.00          vmovddup     -0x58(%rsi),%xmm3
                        vfmadd231pd  %xmm0,%xmm2,%xmm4
          0.00          vfmadd231pd  %xmm1,%xmm2,%xmm5
          0.00          vfmadd231pd  %xmm0,%xmm3,%xmm6
          0.00          vfmadd231pd  %xmm1,%xmm3,%xmm7
          0.00          add          $0x10,%rsi
                        add          $0x20,%rdi
          0.00          dec          %rax
                      ↑ jne          65e0
                        nop
                        nop
                6620:   vmovddup     0x30(%rsp),%xmm0
          0.00          vmulpd       %xmm0,%xmm4,%xmm4
          0.00          vmulpd       %xmm0,%xmm5,%xmm5
                        vmulpd       %xmm0,%xmm6,%xmm6
                        vmulpd       %xmm0,%xmm7,%xmm7
                        vaddpd       (%r15),%xmm4,%xmm4
                        vaddpd       0x10(%r15),%xmm5,%xmm5
          0.00          vaddpd       (%r15,%r10,1),%xmm6,%xmm6
          0.00          vaddpd       0x10(%r15,%r10,1),%xmm7,%xmm7
          0.00          vmovups      %xmm4,(%r15)
                        vmovups      %xmm5,0x10(%r15)
          0.00          vmovups      %xmm6,(%r15,%r10,1)
                        vmovups      %xmm7,0x10(%r15,%r10,1)
                        add          $0x20,%r15
        --
                        lea          (%r8,%rax,8),%r8
                69d8:   mov          0x20(%rsp),%r14
          0.00          test         $0x1,%r14
                      ↓ je           6d84
                        mov          %r9,%r15
        --
                        vbroadcastsd -0x28(%rsi),%ymm3
                        vfmadd231pd  (%rdi),%ymm0,%ymm4
          0.00          vfmadd231pd  0x20(%rdi),%ymm1,%ymm5
                        vfmadd231pd  0x40(%rdi),%ymm2,%ymm6
                        vfmadd231pd  0x60(%rdi),%ymm3,%ymm7
        --
                        vmulpd       %ymm0,%ymm4,%ymm4
                        vaddpd       (%r15),%ymm4,%ymm4
          0.00          vmovups      %ymm4,(%r15)
                        add          $0x20,%r15
                        dec          %r11
        --
                        mov          %rbx,%rsp
                        mov          (%rsp),%rbx
          0.01          mov          0x8(%rsp),%rbp
                        mov          0x10(%rsp),%r12
                        mov          0x18(%rsp),%r13
      Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Kim Phillips <kim.phillips@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jon Grimm <jon.grimm@amd.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200318190002.307290-3-vijaythakkar@me.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2079f7aa
    • perf vendor events amd: Restrict model detection for zen1 based processors · c5f18e9e
      Vijay Thakkar authored
      This patch changes the previous blanket detection of AMD Family 17h
      processors to be more specific to Zen1 core based products only by
      replacing model detection regex pattern [[:xdigit:]]+ with
      ([12][0-9A-F]|[0-9A-F]), restricting to models 0 through 2f only.
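
      As an illustration only (not code from this patch), the sketch below
      checks which model numbers the new pattern accepts. It assumes the
      model is rendered as uppercase hex without a leading zero and that the
      pattern must match the whole model string. Python's re module has no
      POSIX [[:xdigit:]] class, so [0-9A-Fa-f]+ stands in for the old
      pattern; the helper name model_matches is hypothetical.

```python
import re

# Model part of the new pattern, as quoted in the commit message.
ZEN1_MODEL = r"([12][0-9A-F]|[0-9A-F])"
# Stand-in for the old [[:xdigit:]]+ pattern (assumption: equivalent class).
ANY_MODEL = r"[0-9A-Fa-f]+"


def model_matches(pattern: str, model: int) -> bool:
    """Return True if the whole uppercase-hex model string matches pattern."""
    return re.fullmatch(pattern, format(model, "X")) is not None


# Collect every 8-bit model number accepted by the new pattern.
zen1_models = [m for m in range(0x100) if model_matches(ZEN1_MODEL, m)]

print(hex(zen1_models[0]), hex(zen1_models[-1]), len(zen1_models))
print(model_matches(ZEN1_MODEL, 0x71), model_matches(ANY_MODEL, 0x71))
```

      Under these assumptions the new pattern accepts exactly models 0
      through 2f, and a model 71h part no longer matches.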
      
      This change is required to allow for the addition of separate PMU events
      for Zen2 core based models in the following patches as those belong to
      family 17h but have different PMCs. The current PMU events directory
      has also been renamed from "amdfam17h" to "amdzen1" to reflect this
      specificity.
      
      Note that although this change does not break PMU counters for existing
      zen1 based systems, it does disable the current set of counters for zen2
      based systems. Counters for zen2 have been added in the following
      patches in this patchset.
      Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
      Acked-by: Kim Phillips <kim.phillips@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jon Grimm <jon.grimm@amd.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200318190002.307290-2-vijaythakkar@me.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c5f18e9e
    • perf metricgroup: Fix printing event names of metric group with multiple... · 58fc90fd
      Kajol Jain authored
      perf metricgroup: Fix printing event names of metric group with multiple events in case of overlapping events
      
      Commit f01642e4 ("perf metricgroup: Support multiple events for
      metricgroup") introduced support for multiple events in a metric group.
      But with the current upstream, metric event names are not printed
      properly when we run multiple metric groups with overlapping
      events.

      With the current upstream version, the issue with overlapping metric
      events is that we always start the comparison logic from the start of
      the event list.  So events that already matched some metric group also
      take part in the comparison. Because of that, when we have overlapping
      events, we end up matching the current metric group's events with
      already matched ones.

      For example, on a Skylake machine we have the metrics CoreIPC and
      Instructions. Both of them need the 'inst_retired.any' event value.  As
      the events in Instructions are a subset of the events in CoreIPC, they
      end up pointing to the same 'inst_retired.any' value.

      On the Skylake platform:
      
      command:# ./perf stat -M CoreIPC,Instructions  -C 0 sleep 1
      
       Performance counter stats for 'CPU(s) 0':
      
           1,254,992,790      inst_retired.any          # 1254992790.0
                                                          Instructions
                                                        #      1.3 CoreIPC
             977,172,805      cycles
           1,254,992,756      inst_retired.any
      
             1.000802596 seconds time elapsed
      
      command:# sudo ./perf stat -M UPI,IPC sleep 1
      
         Performance counter stats for 'sleep 1':
                 948,650      uops_retired.retire_slots
                 866,182      inst_retired.any          #      0.7 IPC
                 866,182      inst_retired.any
               1,175,671      cpu_clk_unhalted.thread
      
      The patch fixes the issue by adding a new bool pointer 'evlist_used' to
      keep track of events that have already matched some group, by setting
      the corresponding entry to true.  So we skip all used events in the list
      when we start the comparison logic.  The patch also changes the
      comparison logic: in case of a match miss, we discard the whole match
      and start again with the first event id in the metric event.
      
      With this patch:
      
      On the Skylake platform:
      
      command:# ./perf stat -M CoreIPC,Instructions  -C 0 sleep 1
      
       Performance counter stats for 'CPU(s) 0':
      
               3,348,415      inst_retired.any          #      0.3 CoreIPC
              11,779,026      cycles
               3,348,381      inst_retired.any          # 3348381.0
                                                          Instructions
      
             1.001649056 seconds time elapsed
      
      command:# ./perf stat -M UPI,IPC sleep 1
      
       Performance counter stats for 'sleep 1':
      
               1,023,148      uops_retired.retire_slots #      1.1 UPI
                 924,976      inst_retired.any
                 924,976      inst_retired.any          #      0.6 IPC
               1,489,414      cpu_clk_unhalted.thread
      
             1.003064672 seconds time elapsed
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200221101121.28920-1-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      58fc90fd
    • perf stat: Align the output for interval aggregation mode · d13e9e41
      Jin Yao authored
      There is a slight misalignment in -A -I output.
      
      For example:
      
       # perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000
      
       #           time CPU                    counts unit events
            1.000440863 CPU0               1,068,388      cpu/event=cpu-cycles/
            1.000440863 CPU1                 875,954      cpu/event=cpu-cycles/
            1.000440863 CPU2               3,072,538      cpu/event=cpu-cycles/
            1.000440863 CPU3               4,026,870      cpu/event=cpu-cycles/
            1.000440863 CPU4               5,919,630      cpu/event=cpu-cycles/
            1.000440863 CPU5               2,714,260      cpu/event=cpu-cycles/
            1.000440863 CPU6               2,219,240      cpu/event=cpu-cycles/
            1.000440863 CPU7               1,299,232      cpu/event=cpu-cycles/
      
      The value of counts is not aligned with the column "counts" and
      the event name is not aligned with the column "events".
      
      With this patch, the output is:
      
       # perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000
      
       #           time CPU                    counts unit events
            1.000423009 CPU0                  997,421      cpu/event=cpu-cycles/
            1.000423009 CPU1                1,422,042      cpu/event=cpu-cycles/
            1.000423009 CPU2                  484,651      cpu/event=cpu-cycles/
            1.000423009 CPU3                  525,791      cpu/event=cpu-cycles/
            1.000423009 CPU4                1,370,100      cpu/event=cpu-cycles/
            1.000423009 CPU5                  442,072      cpu/event=cpu-cycles/
            1.000423009 CPU6                  205,643      cpu/event=cpu-cycles/
            1.000423009 CPU7                1,302,250      cpu/event=cpu-cycles/
      
      Now the output is aligned.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200218071614.25736-1-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d13e9e41
    • perf report/top TUI: Support hotkeys to let user select any event for sorting · dbddf174
      Jin Yao authored
      When performing "perf report --group", it shows the event group information
      together. A previous patch added the new option "--group-sort-idx"
      to sort the output by the event at index n in the event group.

      It would be nice if we could use a hotkey in the browser to select an
      event to sort by.
      
      For example,
      
        # perf report --group
      
       Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, ...
                              Overhead  Command    Shared Object            Symbol
        92.19%  98.68%   0.00%  93.30%  mgen       mgen                     [.] LOOP1
         3.12%   0.29%   0.00%   0.16%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x0000000000049515
         1.56%   0.03%   0.00%   0.04%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494b7
         1.56%   0.01%   0.00%   0.00%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494ce
         1.56%   0.00%   0.00%   0.00%  mgen       [kernel.kallsyms]        [k] task_tick_fair
         0.00%   0.15%   0.00%   0.04%  perf       [kernel.kallsyms]        [k] smp_call_function_single
         0.00%   0.13%   0.00%   6.08%  swapper    [kernel.kallsyms]        [k] intel_idle
         0.00%   0.03%   0.00%   0.00%  gsd-color  libglib-2.0.so.0.5600.4  [.] g_main_context_check
         0.00%   0.03%   0.00%   0.00%  swapper    [kernel.kallsyms]        [k] apic_timer_interrupt
         0.00%   0.03%   0.00%   0.00%  swapper    [kernel.kallsyms]        [k] check_preempt_curr
      
      When the user presses hotkey '3' (event index, starting from 0), the
      output is sorted by the fourth event in the group.
      
        Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, ...
                              Overhead  Command    Shared Object            Symbol
        92.19%  98.68%   0.00%  93.30%  mgen       mgen                     [.] LOOP1
         0.00%   0.13%   0.00%   6.08%  swapper    [kernel.kallsyms]        [k] intel_idle
         3.12%   0.29%   0.00%   0.16%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x0000000000049515
         0.00%   0.00%   0.00%   0.06%  swapper    [kernel.kallsyms]        [k] hrtimer_start_range_ns
         1.56%   0.03%   0.00%   0.04%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494b7
         0.00%   0.15%   0.00%   0.04%  perf       [kernel.kallsyms]        [k] smp_call_function_single
         0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] update_curr
         0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] apic_timer_interrupt
         0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] native_apic_msr_eoi_write
         0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] __update_load_avg_se
      
       v6:
       ---
       Jiri provided a good improvement to eliminate unneeded refresh.
       This improvement is added to v6.
      
       v2:
       ---
       1. Report warning at helpline when index is invalid.
       2. Report warning at helpline when it's not group event.
       3. Use "case '0' ... '9'" to refine the code
       4. Split K_RELOAD implementation to another patch.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200220013616.19916-4-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      dbddf174
    • perf report: Support a new key to reload the browser · 5e3b810a
      Jin Yao authored
      Sometimes we may need to reload the browser to update the output because
      some options have changed.

      This patch creates a new key, K_RELOAD. Once __cmd_report() returns
      K_RELOAD, the whole process is repeated: read samples from the data
      file, sort the data and display it in the browser.
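
      A minimal sketch of that repeat loop, for illustration only. The
      callbacks read_samples and browse, the function run_report and the
      string value used for K_RELOAD are hypothetical stand-ins, not
      perf's code.

```python
K_RELOAD = "K_RELOAD"  # stand-in value; perf defines its own key code


def run_report(read_samples, browse):
    """Repeat the read -> sort -> browse cycle while a reload is requested.

    Returns the last key and the number of passes that were made.
    """
    passes = 0
    while True:
        passes += 1
        entries = sorted(read_samples(), reverse=True)  # re-read and re-sort
        key = browse(entries)  # display; returns the key that ended browsing
        if key != K_RELOAD:
            return key, passes


# Simulated browser session: two reload requests, then the user quits.
keys = iter([K_RELOAD, K_RELOAD, "q"])
result = run_report(lambda: [3, 1, 2], lambda entries: next(keys))
print(result)
```

      Each K_RELOAD triggers one more full pass, so the simulated session
      above performs three passes before returning the quit key.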
      
       v5:
       ---
       1. Fix the 'make NO_SLANG=1' error. Define K_RELOAD in util/hist.h.
       2. Skip setup_sorting() in repeat path if last key is K_RELOAD.
      
       v4:
       ---
       Need to quit in perf_evsel_menu__run if key is K_RELOAD.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200220013616.19916-3-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5e3b810a
    • perf report: Allow specifying event to be used as sort key in --group output · 429a5f9d
      Jin Yao authored
      When performing "perf report --group", it shows the event group
      information together. By default, the output is sorted by the first
      event in the group.

      It would be nice to let the user select any event for sorting. This patch
      introduces a new option "--group-sort-idx" to sort the output by the
      event at index n in the event group.
      
      For example,
      
      Before:
      
        # perf report --group --stdio
      
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
        # Event count (approx.): 6451235635
        #
        #                         Overhead  Command    Shared Object            Symbol
        # ................................  .........  .......................  ...................................
        #
            92.19%  98.68%   0.00%  93.30%  mgen       mgen                     [.] LOOP1
             3.12%   0.29%   0.00%   0.16%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x0000000000049515
             1.56%   0.03%   0.00%   0.04%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494b7
             1.56%   0.01%   0.00%   0.00%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494ce
             1.56%   0.00%   0.00%   0.00%  mgen       [kernel.kallsyms]        [k] task_tick_fair
             0.00%   0.15%   0.00%   0.04%  perf       [kernel.kallsyms]        [k] smp_call_function_single
             0.00%   0.13%   0.00%   6.08%  swapper    [kernel.kallsyms]        [k] intel_idle
             0.00%   0.03%   0.00%   0.00%  gsd-color  libglib-2.0.so.0.5600.4  [.] g_main_context_check
             0.00%   0.03%   0.00%   0.00%  swapper    [kernel.kallsyms]        [k] apic_timer_interrupt
             ...
      
      After:
      
        # perf report --group --stdio --group-sort-idx 3
      
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
        # Event count (approx.): 6451235635
        #
        #                         Overhead  Command    Shared Object            Symbol
        # ................................  .........  .......................  ...................................
        #
            92.19%  98.68%   0.00%  93.30%  mgen       mgen                     [.] LOOP1
             0.00%   0.13%   0.00%   6.08%  swapper    [kernel.kallsyms]        [k] intel_idle
             3.12%   0.29%   0.00%   0.16%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x0000000000049515
             0.00%   0.00%   0.00%   0.06%  swapper    [kernel.kallsyms]        [k] hrtimer_start_range_ns
             1.56%   0.03%   0.00%   0.04%  gsd-color  libglib-2.0.so.0.5600.4  [.] 0x00000000000494b7
             0.00%   0.15%   0.00%   0.04%  perf       [kernel.kallsyms]        [k] smp_call_function_single
             0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] update_curr
             0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] apic_timer_interrupt
             0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] native_apic_msr_eoi_write
             0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] __update_load_avg_se
             0.00%   0.00%   0.00%   0.02%  mgen       [kernel.kallsyms]        [k] scheduler_tick
      
      Now the output is sorted by the fourth event in the group.
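
      The reordering can be reproduced with a small sketch, using a few
      rows from the output above. It is an illustration, not perf's
      comparison code: the function name sort_by_group_idx is
      hypothetical, and breaking ties by the remaining events in their
      original order is an assumption.

```python
# (per-event overhead percentages, symbol) taken from the example output.
rows = [
    ((92.19, 98.68, 0.00, 93.30), "LOOP1"),
    ((3.12, 0.29, 0.00, 0.16), "0x0000000000049515"),
    ((1.56, 0.03, 0.00, 0.04), "0x00000000000494b7"),
    ((1.56, 0.00, 0.00, 0.00), "task_tick_fair"),
    ((0.00, 0.15, 0.00, 0.04), "smp_call_function_single"),
    ((0.00, 0.13, 0.00, 6.08), "intel_idle"),
]


def sort_by_group_idx(rows, idx):
    """Sort rows in descending order of the overhead of the event at idx."""
    def key(row):
        overheads = row[0]
        # Primary key: the event at idx. Assumed tie-break: the remaining
        # events, compared in their original order.
        rest = overheads[:idx] + overheads[idx + 1:]
        return (overheads[idx],) + rest
    return sorted(rows, key=key, reverse=True)


for overheads, symbol in sort_by_group_idx(rows, 3):
    print("  ".join(f"{o:6.2f}%" for o in overheads), symbol)
```

      With idx 3, intel_idle (6.08%) moves up to second place, as in the
      "After" output. With idx 0 the rows keep the default order.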
      
       v7:
       ---
       Rebase to latest perf/core, no other change.
      
       v4:
       ---
       1. Update Documentation/perf-report.txt to mention that
          '--group-sort-idx' supports multiple groups with different
          numbers of events and that it should be used on grouped events.
      
       2. Update __hpp__group_sort_idx(), just return when the
          idx is out of limit.
      
       3. Return failure on symbol_conf.group_sort_idx && !session->evlist->nr_groups.
          So now the option does not need to be used together with --group.
      
       v3:
       ---
       Refine the code in __hpp__group_sort_idx().
      
       Before:
         for (i = 1; i < nr_members; i++) {
              if (i == idx) {
                      ret = field_cmp(fields_a[i], fields_b[i]);
                      if (ret)
                              goto out;
              }
         }
      
       After:
         if (idx >= 1 && idx < nr_members) {
              ret = field_cmp(fields_a[idx], fields_b[idx]);
              if (ret)
                      goto out;
         }
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200220013616.19916-2-yao.jin@linux.intel.com
      [ Renamed pair_fields_alloc() to hist_entry__new_pair() and combined decl + assignment of vars ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      429a5f9d