1. 19 Mar, 2020 1 commit
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo-5.7-20200310' of... · fdca7c14
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-5.7-20200310' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      perf stat:
      
        Jin Yao:
      
        - Show percore counts in per CPU output.
      
      perf report:
      
        Jin Yao:
      
        - Allow selecting which block info columns to report and its order.
      
        - Support color ops to print block percents in color.
      
        - Fix wrong block address comparison in block_info__cmp().
      
      perf annotate:
      
        Ravi Bangoria:
      
        - Get rid of annotation->nr_jumps, unused.
      
      expr:
      
        Jiri Olsa:
      
        - Move expr lexer to flex.
      
      llvm:
      
        Arnaldo Carvalho de Melo:
      
        - Add debug hint message about missing kernel-devel package.
      
      core:
      
        Kan Liang:
      
        - Initial patches to support the recently added PERF_SAMPLE_BRANCH_HW_INDEX
          kernel feature.
      
        - Add check for unexpected use of reserved membrs in event attr, so that in
          the future older perf tools will complain instead of silently try to process
          unknown features.
      
      libapi:
      
        Namhyung Kim:
      
        - Adopt cgroupsfs_find_mountpoint() from tools/perf/util/.
      
      libperf:
      
        Michael Petlan:
      
        - Add counting example.
      
      libtraceevent:
      
         Steven Rostedt (VMware):
      
        - Remove extra '\n' in print_event_time().
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      fdca7c14
  2. 17 Mar, 2020 3 commits
  3. 10 Mar, 2020 12 commits
    • Jin Yao's avatar
      perf block-info: Support color ops to print block percents in color · f787feff
      Jin Yao authored
      It would be nice to print the block percents with colors.
      
      This patch supports the 'Sampled Cycles%' and 'Avg Cycles%' printed in
      colors.
      
      For example,
      
      perf record -b ...
      perf report --total-cycles or perf report --total-cycles --stdio
      
      percent > 5%, colored in red
      percent > 0.5%, colored in green
      percent < 0.5%, default color
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200202141655.32053-5-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f787feff
    • Jin Yao's avatar
      perf block-info: Allow selecting which columns to report and its order · cca0cc76
      Jin Yao authored
      Currently we use a predefined array to set the block info output
      formats, it's fixed and inflexible.
      
      This patch adds two parameters "block_hpps" and "nr_hpps" in
      block_info__create_report and other static functions, in order to let
      user decide which columns to report and with specified report ordering.
      It should be more flexible.
      
      Buffers will be allocated to contain the new fmts, of course, we need to
      release them before perf exits.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200202141655.32053-4-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      cca0cc76
    • Jin Yao's avatar
      perf diff: Use __block_info__cmp() to replace block_pair_cmp() · a8a9f6dc
      Jin Yao authored
      'perf diff' uses block_pair_cmp() to compare two blocks. But
      block_info__cmp() has the similar functionality and it's a bit more
      complete.
      
      This patch removes block_pair_cmp() and uses __block_info__cmp()
      instead. __block_info__cmp() is wrapped by block_info__cmp() and it
      doesn't receives a perf_hpp_fmt parameter.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200202141655.32053-3-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a8a9f6dc
    • Jin Yao's avatar
      perf block-info: Fix wrong block address comparison in block_info__cmp() · 3e152aa9
      Jin Yao authored
      Commit 60414418 ("perf block: Cleanup and refactor block info
      functions") introduces block_info__cmp(), which compares two blocks.
      
      But the issues are:
      
      1. It should return the strcmp cmp value only if it's not 0.
      
      2. When symbol names are matched, we need to compare the addresses
         of blocks further. But it wrongly uses the symbol addresses for
         comparison.
      
      3. If the syms are both NULL, we can't consider these two blocks are
         matched.
      
      This patch fixes above 3 issues.
      
      Fixes: 60414418 ("perf block: Cleanup and refactor block info functions")
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200202141655.32053-2-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3e152aa9
    • Jiri Olsa's avatar
      perf expr: Make expr__parse() return -1 on error · d942815a
      Jiri Olsa authored
      To match the error value of the expr__find_other function, so all
      exported expr functions return the same values:
      0 on success, -1 on error.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200228093616.67125-6-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d942815a
    • Jiri Olsa's avatar
      perf expr: Straighten expr__parse()/expr__find_other() interface · 0f9b1e12
      Jiri Olsa authored
      Now that we have a flex parser we don't need to update the parsed string
      pointer, so the interface can just be passed the pointer to the
      expression instead of a pointer to pointer.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200228093616.67125-5-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0f9b1e12
    • Jiri Olsa's avatar
      perf expr: Increase EXPR_MAX_OTHER to support metrics with more than 15 variables · 58ca7076
      Jiri Olsa authored
      We have metrics that define more than 15 variables, like
      Branch_Misprediction_Cost. Increasing the allowed variables count to 20.
      
      As Andy pointed out, we can't go too high in here, because some of the
      code has O(n^2) complexity (already_seen) and we might want to do some
      other changes (like using hash tables) before increasing the maximum
      even more.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200228093616.67125-4-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      58ca7076
    • Jiri Olsa's avatar
      perf expr: Move expr lexer to flex · 26226a97
      Jiri Olsa authored
      Adding expr flex code instead of the manual parser code. So it's easily
      extensible in upcoming changes.
      
      The new flex code is in flex.l object and gets compiled like all the
      other flexers we use.  It's defined as flex reentrant parser.
      
      It's used by both expr__parse and expr__find_other interfaces by
      separating the starting point.
      
      There's no intended change of functionality ;-) the test expr is
      passing.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200228093616.67125-3-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      26226a97
    • Jiri Olsa's avatar
      perf expr: Add expr.c object · 576a65b6
      Jiri Olsa authored
      Add generic expr code into new expr.c object.
      
      The expr.c object will be mainly used in following change that will get
      rid of the manual flex code,
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200228093616.67125-2-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      576a65b6
    • Kan Liang's avatar
      perf header: Add check for unexpected use of reserved membrs in event attr · 277ce1ef
      Kan Liang authored
      The perf.data may be generated by a newer version of perf tool, which
      support new input bits in attr, e.g. new bit for branch_sample_type.
      
      The perf.data may be parsed by an older version of perf tool later.  The
      old perf tool may parse the perf.data incorrectly. There is no warning
      message for this case.
      
      Current perf header never check for unknown input bits in attr.
      
      When read the event desc from header, check the stored event attr.  The
      reserved bits, sample type, read format and branch sample type will be
      checked.
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
      Link: http://lkml.kernel.org/r/20200228163011.19358-4-kan.liang@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      277ce1ef
    • Kan Liang's avatar
      perf evsel: Support PERF_SAMPLE_BRANCH_HW_INDEX · d3f85437
      Kan Liang authored
      A new branch sample type PERF_SAMPLE_BRANCH_HW_INDEX has been introduced
      in latest kernel.
      
      Enable HW_INDEX by default in LBR call stack mode.
      
      If kernel doesn't support the sample type, switching it off.
      
      Add HW_INDEX in attr_fprintf as well. User can check whether the branch
      sample type is set via debug information or header.
      
      Committer testing:
      
      First collect some samples with LBR callchains, system wide, for a few
      seconds:
      
        # perf record --call-graph lbr -a sleep 5
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.625 MB perf.data (224 samples) ]
        #
      
      Now lets use 'perf evlist -v' to look at the branch_sample_type:
      
        # perf evlist -v
        cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES|HW_INDEX
        #
      
      So the machine has the kernel feature, and it was correctly added to
      perf_event_attr.branch_sample_type, for the default 'cycles' event.
      
      If we do it in another machine, where the kernel lacks the HW_INDEX
      feature, we get:
      
        # perf record --call-graph lbr -a sleep 2s
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.690 MB perf.data (499 samples) ]
        # perf evlist -v
        cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
        #
      
      No HW_INDEX in attr.branch_sample_type.
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
      Link: http://lore.kernel.org/lkml/20200228163011.19358-3-kan.liang@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d3f85437
    • Kan Liang's avatar
      perf tools: Add hw_idx in struct branch_stack · 42bbabed
      Kan Liang authored
      The low level index of raw branch records for the most recent branch can
      be recorded in a sample with PERF_SAMPLE_BRANCH_HW_INDEX
      branch_sample_type. Extend struct branch_stack to support it.
      
      However, if the PERF_SAMPLE_BRANCH_HW_INDEX is not applied, only nr and
      entries[] will be output by kernel. The pointer of entries[] could be
      wrong, since the output format is different with new struct
      branch_stack.  Add a variable no_hw_idx in struct perf_sample to
      indicate whether the hw_idx is output.  Add get_branch_entry() to return
      corresponding pointer of entries[0].
      
      To make dummy branch sample consistent as new branch sample, add hw_idx
      in struct dummy_branch_stack for cs-etm and intel-pt.
      
      Apply the new struct branch_stack for synthetic events as well.
      
      Extend test case sample-parsing to support new struct branch_stack.
      
      Committer notes:
      
      Renamed get_branch_entries() to perf_sample__branch_entries() to have
      proper namespacing and pave the way for this to be moved to libperf,
      eventually.
      
      Add 'static' to that inline as it is in a header.
      
      Add 'hw_idx' to 'struct dummy_branch_stack' in cs-etm.c to fix the build
      on arm64.
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
      Link: http://lore.kernel.org/lkml/20200228163011.19358-2-kan.liang@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      42bbabed
  4. 06 Mar, 2020 9 commits
  5. 05 Mar, 2020 1 commit
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Update tools's copy of linux/perf_event.h · 6339998d
      Arnaldo Carvalho de Melo authored
      To get the changes in:
      
        bbfd5e4f ("perf/core: Add new branch sample type for HW index of raw branch records")
      
      This silences this perf tools build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
        diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h
      
      This update is a prerequisite to adding support for the HW index of raw
      branch records.
      Acked-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
      Link: http://lore.kernel.org/lkml/20200304134902.GB12612@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6339998d
  6. 04 Mar, 2020 9 commits
    • Steven Rostedt (VMware)'s avatar
      tools lib traceevent: Remove extra '\n' in print_event_time() · 401d61cb
      Steven Rostedt (VMware) authored
      If the precision of print_event_time() is zero or greater than the
      timestamp, it uses a different format. But that format had an extra new
      line at the end, and caused the output to not look right:
      
      cpus=2
                 sleep-3946  [001]111264306005
      : function:             inotify_inode_queue_event
                 sleep-3946  [001]111264307158
      : function:             __fsnotify_parent
                 sleep-3946  [001]111264307637
      : function:             inotify_dentry_parent_queue_event
                 sleep-3946  [001]111264307989
      : function:             fsnotify
                 sleep-3946  [001]111264308401
      : function:             audit_syscall_exit
      
      Fixes: 38847db9 ("libtraceevent, perf tools: Changes in tep_print_event_* APIs")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20200303231852.6ab6882f@oasis.local.homeSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      401d61cb
    • Michael Petlan's avatar
      libperf: Add counting example · 76ce0265
      Michael Petlan authored
      Current libperf man pages mention file counting.c "coming with libperf package",
      however, the file is missing. Add the file then.
      
      Fixes: 81de3bf3 ("libperf: Add man pages")
      Signed-off-by: default avatarMichael Petlan <mpetlan@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LPU-Reference: 20200227194424.28210-1-mpetlan@redhat.com
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      76ce0265
    • Ravi Bangoria's avatar
      perf annotate: Get rid of annotation->nr_jumps · dabce16b
      Ravi Bangoria authored
      The 'nr_jumps' field in 'struct annotation' is not used since it's
      inception in commit 2402e4a9 ("perf annotate browser: Show 'jumpy'
      functions").  Get rid of it.
      Signed-off-by: default avatarRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Link: http://lore.kernel.org/lkml/20200204045233.474937-7-ravi.bangoria@linux.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      dabce16b
    • Arnaldo Carvalho de Melo's avatar
      perf llvm: Add debug hint message about missing kernel-devel package · 357a5d24
      Arnaldo Carvalho de Melo authored
      To help in debugging, add this extra message:
      
        detect_kbuild_dir: Couldn't find "/lib/modules/5.4.20-200.fc31.x86_64/build/include/generated/autoconf.h", missing kernel-devel package?.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      357a5d24
    • Jin Yao's avatar
      perf stat: Show percore counts in per CPU output · 1af62ce6
      Jin Yao authored
      We have supported the event modifier "percore" which sums up the event
      counts for all hardware threads in a core and show the counts per core.
      
      For example,
      
       # perf stat -e cpu/event=cpu-cycles,percore/ -a -A -- sleep 1
      
        Performance counter stats for 'system wide':
      
       S0-D0-C0                395,072      cpu/event=cpu-cycles,percore/
       S0-D0-C1                851,248      cpu/event=cpu-cycles,percore/
       S0-D0-C2                954,226      cpu/event=cpu-cycles,percore/
       S0-D0-C3              1,233,659      cpu/event=cpu-cycles,percore/
      
      This patch provides a new option "--percore-show-thread". It is used
      with event modifier "percore" together to sum up the event counts for
      all hardware threads in a core but show the counts per hardware thread.
      
      This is essentially a replacement for the any bit (which is gone in
      Icelake). Per core counts are useful for some formulas, e.g. CoreIPC.
      The original percore version was inconvenient to post process. This
      variant matches the output of the any bit.
      
      With this patch, for example,
      
       # perf stat -e cpu/event=cpu-cycles,percore/ -a -A --percore-show-thread  -- sleep 1
      
        Performance counter stats for 'system wide':
      
       CPU0               2,453,061      cpu/event=cpu-cycles,percore/
       CPU1               1,823,921      cpu/event=cpu-cycles,percore/
       CPU2               1,383,166      cpu/event=cpu-cycles,percore/
       CPU3               1,102,652      cpu/event=cpu-cycles,percore/
       CPU4               2,453,061      cpu/event=cpu-cycles,percore/
       CPU5               1,823,921      cpu/event=cpu-cycles,percore/
       CPU6               1,383,166      cpu/event=cpu-cycles,percore/
       CPU7               1,102,652      cpu/event=cpu-cycles,percore/
      
      We can see counts are duplicated in CPU pairs (CPU0/CPU4, CPU1/CPU5,
      CPU2/CPU6, CPU3/CPU7).
      
      The interval mode also works. For example,
      
       # perf stat -e cpu/event=cpu-cycles,percore/ -a -A --percore-show-thread  -I 1000
       #           time CPU                    counts unit events
            1.000425421 CPU0                 925,032      cpu/event=cpu-cycles,percore/
            1.000425421 CPU1                 430,202      cpu/event=cpu-cycles,percore/
            1.000425421 CPU2                 436,843      cpu/event=cpu-cycles,percore/
            1.000425421 CPU3               1,192,504      cpu/event=cpu-cycles,percore/
            1.000425421 CPU4                 925,032      cpu/event=cpu-cycles,percore/
            1.000425421 CPU5                 430,202      cpu/event=cpu-cycles,percore/
            1.000425421 CPU6                 436,843      cpu/event=cpu-cycles,percore/
            1.000425421 CPU7               1,192,504      cpu/event=cpu-cycles,percore/
      
      If we offline CPU5, the result is:
      
       # perf stat -e cpu/event=cpu-cycles,percore/ -a -A --percore-show-thread -- sleep 1
      
        Performance counter stats for 'system wide':
      
       CPU0               2,752,148      cpu/event=cpu-cycles,percore/
       CPU1               1,009,312      cpu/event=cpu-cycles,percore/
       CPU2               2,784,072      cpu/event=cpu-cycles,percore/
       CPU3               2,427,922      cpu/event=cpu-cycles,percore/
       CPU4               2,752,148      cpu/event=cpu-cycles,percore/
       CPU6               2,784,072      cpu/event=cpu-cycles,percore/
       CPU7               2,427,922      cpu/event=cpu-cycles,percore/
      
              1.001416041 seconds time elapsed
      
       v4:
       ---
       Ravi Bangoria reports an issue in v3. Once we offline a CPU,
       the output is not correct. The issue is we should use the cpu
       idx in print_percore_thread rather than using the cpu value.
      
       v3:
       ---
       1. Fix the interval mode output error
       2. Use cpu value (not cpu index) in config->aggr_get_id().
       3. Refine the code according to Jiri's comments.
      
       v2:
       ---
       Add the explanation in change log. This is essentially a replacement
       for the any bit. No code change.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarRavi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200214080452.26402-1-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      1af62ce6
    • Namhyung Kim's avatar
      tools lib api fs: Move cgroupsfs_find_mountpoint() · 7982a898
      Namhyung Kim authored
      Move it from tools/perf/util/cgroup.c as it can be used by other places.
      Note that cgroup filesystem is different from others since it's usually
      mounted separately (in v1) for each subsystem.
      
      I just copied the code with a little modification to pass a name of
      subsystem.
      Suggested-by: default avatarJiri Olsa <jolsa@redhat.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20200127100031.1368732-1-namhyung@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7982a898
    • Arnaldo Carvalho de Melo's avatar
    • Nick Desaulniers's avatar
      perf diff: Fix undefined string comparison spotted by clang's -Wstring-compare · c395c355
      Nick Desaulniers authored
      clang warns:
      
        util/block-info.c:298:18: error: result of comparison against a string
        literal is unspecified (use an explicit string comparison function
        instead) [-Werror,-Wstring-compare]
                if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
                                ^  ~~~~~~~~~~~~~~~
        util/block-info.c:298:51: error: result of comparison against a string
        literal is unspecified (use an explicit string comparison function
        instead) [-Werror,-Wstring-compare]
                if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
                                                                 ^  ~~~~~~~~~~~~~~~
        util/block-info.c:298:18: error: result of comparison against a string
        literal is unspecified (use an explicit string
        comparison function instead) [-Werror,-Wstring-compare]
                if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
                                ^  ~~~~~~~~~~~~~~~
        util/block-info.c:298:51: error: result of comparison against a string
        literal is unspecified (use an explicit string comparison function
        instead) [-Werror,-Wstring-compare]
                if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
                                                                 ^  ~~~~~~~~~~~~~~~
        util/map.c:434:15: error: result of comparison against a string literal
        is unspecified (use an explicit string comparison function instead)
        [-Werror,-Wstring-compare]
                        if (srcline != SRCLINE_UNKNOWN)
                                    ^  ~~~~~~~~~~~~~~~
      
      Reviewer Notes:
      
      Looks good to me. Some more context:
      https://clang.llvm.org/docs/DiagnosticsReference.html#wstring-compare
      The spec says:
      J.1 Unspecified behavior
      The following are unspecified:
      .. Whether two string literals result in distinct arrays (6.4.5).
      Signed-off-by: default avatarNick Desaulniers <nick.desaulniers@gmail.com>
      Reviewed-by: default avatarIan Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Keeping <john@metanate.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: clang-built-linux@googlegroups.com
      Link: https://github.com/ClangBuiltLinux/linux/issues/900
      Link: http://lore.kernel.org/lkml/20200223193456.25291-1-nick.desaulniers@gmail.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c395c355
    • Ingo Molnar's avatar
      Merge tag 'perf-urgent-for-mingo-5.6-20200303' of... · b95b4d5e
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo-5.6-20200303' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      perf symbols:
      
        Arnaldo Carvalho de Melo:
      
        - Don't try to find a vmlinux file when looking for kernel modules,
          fixing symbol resolution in systems with compressed kernel modules.
      
      perf env:
      
        Arnaldo Carvalho de Melo:
      
        - Do not return pointers to local variables, fixing valid warning from
          gcc 10 for corner case that stops the build due to -Werror.
      
      perf tests:
      
        Arnaldo Carvalho de Melo:
      
        - Make global variable static in the bp_account entry to fix build
          with gcc 10.
      
      perf parse-events:
      
        Arnaldo Carvalho de Melo:
      
        - Use asprintf() instead of strncpy() to read tracepoint files, addressing
          compiler warning that stops the build as we use -Werror.
      
      perf bench:
      
        Arnaldo Carvalho de Melo:
      
        - Share some global variables to fix build with gcc 10.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b95b4d5e
  7. 03 Mar, 2020 3 commits
    • Linus Torvalds's avatar
      Merge tag '5.6-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 8b614cb8
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Five small cifs/smb3 fixes, two for stable (one for a reconnect
        problem and the other fixes a use case when renaming an open file)"
      
      * tag '5.6-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Use #define in cifs_dbg
        cifs: fix rename() by ensuring source handle opened with DELETE bit
        cifs: add missing mount option to /proc/mounts
        cifs: fix potential mismatch of UNC paths
        cifs: don't leak -EAGAIN for stat() during reconnect
      8b614cb8
    • Arnaldo Carvalho de Melo's avatar
      perf symbols: Don't try to find a vmlinux file when looking for kernel modules · b5c09518
      Arnaldo Carvalho de Melo authored
      The dso->kernel value is now set to everything that is in
      machine->kmaps, but that was being used to decide if vmlinux lookup is
      needed, which ended up making that lookup be made for kernel modules,
      that now have dso->kernel set, leading to these kinds of warnings when
      running on a machine with compressed kernel modules, like fedora:31:
      
        [root@five ~]# perf record -F 10000 -a sleep 2
        [ perf record: Woken up 1 times to write data ]
        lzma: fopen failed on vmlinux: 'No such file or directory'
        lzma: fopen failed on /boot/vmlinux: 'No such file or directory'
        lzma: fopen failed on /boot/vmlinux-5.5.5-200.fc31.x86_64: 'No such file or directory'
        lzma: fopen failed on /usr/lib/debug/boot/vmlinux-5.5.5-200.fc31.x86_64: 'No such file or directory'
        lzma: fopen failed on /lib/modules/5.5.5-200.fc31.x86_64/build/vmlinux: 'No such file or directory'
        lzma: fopen failed on vmlinux: 'No such file or directory'
        lzma: fopen failed on /boot/vmlinux: 'No such file or directory'
        lzma: fopen failed on /boot/vmlinux-5.5.5-200.fc31.x86_64: 'No such file or directory'
        lzma: fopen failed on /usr/lib/debug/boot/vmlinux-5.5.5-200.fc31.x86_64: 'No such file or directory'
        lzma: fopen failed on /lib/modules/5.5.5-200.fc31.x86_64/build/vmlinux: 'No such file or directory'
        lzma: fopen failed on vmlinux: 'No such file or directory'
        lzma: fopen failed on /boot/vmlinux: 'No such file or directory'
        lzma: fopen failed on /boot/vmlinux-5.5.5-200.fc31.x86_64: 'No such file or directory'
        lzma: fopen failed on /usr/lib/debug/boot/vmlinux-5.5.5-200.fc31.x86_64: 'No such file or directory'
        lzma: fopen failed on /lib/modules/5.5.5-200.fc31.x86_64/build/vmlinux: 'No such file or directory'
        lzma: fopen failed on vmlinux: 'No such file or directory'
        lzma: fopen failed on /boot/vmlinux: 'No such file or directory'
        lzma: fopen failed on /boot/vmlinux-5.5.5-200.fc31.x86_64: 'No such file or directory'
        lzma: fopen failed on /usr/lib/debug/boot/vmlinux-5.5.5-200.fc31.x86_64: 'No such file or directory'
        lzma: fopen failed on /lib/modules/5.5.5-200.fc31.x86_64/build/vmlinux: 'No such file or directory'
        lzma: fopen failed on vmlinux: 'No such file or directory'
        lzma: fopen failed on /boot/vmlinux: 'No such file or directory'
        lzma: fopen failed on /boot/vmlinux-5.5.5-200.fc31.x86_64: 'No such file or directory'
        lzma: fopen failed on /usr/lib/debug/boot/vmlinux-5.5.5-200.fc31.x86_64: 'No such file or directory'
        lzma: fopen failed on /lib/modules/5.5.5-200.fc31.x86_64/build/vmlinux: 'No such file or directory'
        [ perf record: Captured and wrote 1.024 MB perf.data (1366 samples) ]
        [root@five ~]#
      
      This happens when collecting the buildid, when we find samples for
      kernel modules, fix it by checking if the looked up DSO is a kernel
      module by other means.
      
      Fixes: 02213cec ("perf maps: Mark module DSOs with kernel type")
      Tested-by: default avatarJiri Olsa <jolsa@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200302191007.GD10335@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b5c09518
    • Arnaldo Carvalho de Melo's avatar
      perf bench: Share some global variables to fix build with gcc 10 · e4d9b04b
      Arnaldo Carvalho de Melo authored
      Noticed with gcc 10 (fedora rawhide) that those variables were not being
      declared as static, so end up with:
      
        ld: /tmp/build/perf/bench/epoll-wait.o:/git/perf/tools/perf/bench/epoll-wait.c:93: multiple definition of `end'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
        ld: /tmp/build/perf/bench/epoll-wait.o:/git/perf/tools/perf/bench/epoll-wait.c:93: multiple definition of `start'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
        ld: /tmp/build/perf/bench/epoll-wait.o:/git/perf/tools/perf/bench/epoll-wait.c:93: multiple definition of `runtime'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
        ld: /tmp/build/perf/bench/epoll-ctl.o:/git/perf/tools/perf/bench/epoll-ctl.c:38: multiple definition of `end'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
        ld: /tmp/build/perf/bench/epoll-ctl.o:/git/perf/tools/perf/bench/epoll-ctl.c:38: multiple definition of `start'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
        ld: /tmp/build/perf/bench/epoll-ctl.o:/git/perf/tools/perf/bench/epoll-ctl.c:38: multiple definition of `runtime'; /tmp/build/perf/bench/futex-hash.o:/git/perf/tools/perf/bench/futex-hash.c:40: first defined here
        make[4]: *** [/git/perf/tools/build/Makefile.build:145: /tmp/build/perf/bench/perf-in.o] Error 1
      
      Prefix those with bench__ and add them to bench/bench.h, so that we can
      share those on the tools needing to access those variables from signal
      handlers.
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20200303155811.GD13702@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e4d9b04b
  8. 02 Mar, 2020 2 commits
    • Arnaldo Carvalho de Melo's avatar
      perf parse-events: Use asprintf() instead of strncpy() to read tracepoint files · 7125f204
      Arnaldo Carvalho de Melo authored
      Make the code more compact by using asprintf() instead of malloc()+strncpy() which also uses
      less memory and avoids these warnings with gcc 10:
      
          CC       /tmp/build/perf/util/cloexec.o
        In file included from /usr/include/string.h:495,
                         from util/parse-events.h:12,
                         from util/parse-events.c:18:
        In function ‘strncpy’,
            inlined from ‘tracepoint_id_to_path’ at util/parse-events.c:271:5:
        /usr/include/bits/string_fortified.h:106:10: error: ‘__builtin_strncpy’ offset [275, 511] from the object at ‘sys_dirent’ is out of the bounds of referenced subobject ‘d_name’ with type ‘char[256]’ at offset 19 [-Werror=array-bounds]
          106 |   return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
              |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        In file included from /usr/include/dirent.h:61,
                         from util/parse-events.c:5:
        util/parse-events.c: In function ‘tracepoint_id_to_path’:
        /usr/include/bits/dirent.h:33:10: note: subobject ‘d_name’ declared here
           33 |     char d_name[256];  /* We must not include limits.h! */
              |          ^~~~~~
        In file included from /usr/include/string.h:495,
                         from util/parse-events.h:12,
                         from util/parse-events.c:18:
        In function ‘strncpy’,
            inlined from ‘tracepoint_id_to_path’ at util/parse-events.c:273:5:
        /usr/include/bits/string_fortified.h:106:10: error: ‘__builtin_strncpy’ offset [275, 511] from the object at ‘evt_dirent’ is out of the bounds of referenced subobject ‘d_name’ with type ‘char[256]’ at offset 19 [-Werror=array-bounds]
          106 |   return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
              |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        In file included from /usr/include/dirent.h:61,
                         from util/parse-events.c:5:
        util/parse-events.c: In function ‘tracepoint_id_to_path’:
        /usr/include/bits/dirent.h:33:10: note: subobject ‘d_name’ declared here
           33 |     char d_name[256];  /* We must not include limits.h! */
              |          ^~~~~~
          CC       /tmp/build/perf/util/call-path.o
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20200302145535.GA28183@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      7125f204
    • Arnaldo Carvalho de Melo's avatar
      perf env: Do not return pointers to local variables · ebcb9464
      Arnaldo Carvalho de Melo authored
      It is possible to return a pointer to a local variable when looking up
      the architecture name for the running system and no normalization is
      done on that value, i.e. we may end up returning the uts.machine local
      variable.
      
      While this doesn't happen on most arches, as normalization takes place,
      lets fix this by making that a static variable and optimize it a bit by
      not always running uname(), only the first time.
      
      Noticed in fedora rawhide running with:
      
        [perfbuilder@a5ff49d6e6e4 ~]$ gcc --version
        gcc (GCC) 10.0.1 20200216 (Red Hat 10.0.1-0.8)
      Reported-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ebcb9464