1. 10 Jan, 2018 4 commits
  2. 08 Jan, 2018 20 commits
    • Jiri Olsa's avatar
      perf script: Add support to display sample misc field · 28a0b398
      Jiri Olsa authored
      Adding support to display sample misc field in form
      of letter for each bit:
      
        # perf script -F +misc ...
         sched-messaging  1414 K     28690.636582:       4590 cycles ...
         sched-messaging  1407 U     28690.636600:     325620 cycles ...
         sched-messaging  1414 K     28690.636608:      19473 cycles ...
        misc field  __________/
      
      The misc bits are assigned to following letters:
      
        PERF_RECORD_MISC_KERNEL        K
        PERF_RECORD_MISC_USER          U
        PERF_RECORD_MISC_HYPERVISOR    H
        PERF_RECORD_MISC_GUEST_KERNEL  G
        PERF_RECORD_MISC_GUEST_USER    g
        PERF_RECORD_MISC_MMAP_DATA*    M
        PERF_RECORD_MISC_COMM_EXEC     E
        PERF_RECORD_MISC_SWITCH_OUT    S
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180107160356.28203-9-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      28a0b398
    • Jiri Olsa's avatar
      perf: Update PERF_RECORD_MISC_* comment for perf_event_header::misc bit 13 · 972c1488
      Jiri Olsa authored
      The perf_event_header::misc bit 13 is shared on different events and
      next patch is adding yet another bit 13 user.  Updating the comment to
      make it more structured and clear which events use bit 13.
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20180107160356.28203-8-jolsa@kernel.org
      [ Update the tools/include/uapi/linux copy ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      972c1488
    • Jiri Olsa's avatar
      perf: Return empty callchain instead of NULL · 99e818cc
      Jiri Olsa authored
      It simplifies the code a bit, because we dump the callchain
      Link: http://lkml.kernel.org/n/tip-uqp7qd6aif47g39glnbu95yl@git.kernel.org
      even if it's empty. With 'empty' callchain we can remove
      all the NULL-checking code paths.
      
      Original-patch-from: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20180107160356.28203-7-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      99e818cc
    • Jiri Olsa's avatar
      perf: Make perf_callchain function static · 8cf7e0e2
      Jiri Olsa authored
      And move it to core.c, because there's no caller of this function other
      than the one in core.c
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180107160356.28203-6-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      8cf7e0e2
    • Jiri Olsa's avatar
      perf: Add sample_id to PERF_RECORD_ITRACE_START event comment · 81df978c
      Jiri Olsa authored
      Adding missing sample_id line into PERF_RECORD_ITRACE_START
      event comment.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180107160356.28203-5-jolsa@kernel.org
      [ Update the tools/include/uapi/linux copy ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      81df978c
    • Jiri Olsa's avatar
      perf: Allocate context task_ctx_data for child event · 313ccb96
      Jiri Olsa authored
      Currently we use perf_event_context::task_ctx_data to save and restore
      the LBR status when the task is scheduled out and in.
      
      We don't allocate it for child contexts, which results in shorter task's
      LBR stack, because we don't save the history from previous run and start
      over every time we schedule the task in.
      
      I made a test to generate samples with LBR call stack and got higher
      numbers on bigger chain depths:
      
                                  before:     after:
        LBR call chain: nr: 1       60561     498127
        LBR call chain: nr: 2           0          0
        LBR call chain: nr: 3      107030       2172
        LBR call chain: nr: 4      466685      62758
        LBR call chain: nr: 5     2307319     878046
        LBR call chain: nr: 6       48713     495218
        LBR call chain: nr: 7        1040       4551
        LBR call chain: nr: 8         481        172
        LBR call chain: nr: 9         878        120
        LBR call chain: nr: 10       2377       6698
        LBR call chain: nr: 11      28830     151487
        LBR call chain: nr: 12      29347     339867
        LBR call chain: nr: 13          4         22
        LBR call chain: nr: 14          3         53
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: 4af57ef2 ("perf: Add pmu specific data for perf task context")
      Link: http://lkml.kernel.org/r/20180107160356.28203-4-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      313ccb96
    • Jiri Olsa's avatar
      perf tools: Display perf_event_attr::namespaces debug info · db9fc765
      Jiri Olsa authored
      Display namespaces bit in -vv debug display:
      
        $ perf record -vv --namespaces ...
        ...
        perf_event_attr:
          size                             112
          ...
          namespaces                       1
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180107160356.28203-3-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      db9fc765
    • Jiri Olsa's avatar
      perf tools: Enable LIBBABELTRACE by default · 24787afb
      Jiri Olsa authored
      There's no reason anymore to treat babel trace in a special way, because
      a) we no longer display its state b) the needed babeltrace library is
      now out and well adopted among distros.
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180107160356.28203-2-jolsa@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      24787afb
    • Jin Yao's avatar
      perf script: Support time percent and multiple time ranges · 2ab046cd
      Jin Yao authored
      perf script has a --time option to limit the time range of output.  It
      only supports absolute time.
      
      Now this option is extended to support multiple time ranges and support
      the percent of time.
      
      For example:
      
      1. Select the first and second 10% time slices:
      
         perf script --time 10%/1,10%/2
      
      2. Select from 0% to 10% and 30% to 40% slices:
      
         perf script --time 0%-10%,30%-40%
      
      Changelog:
      
      v6: Fix the merge issue with latest perf/core branch.
          No functional changes.
      
      v5: Add checking of first/last sample time to detect if it's recorded
          in perf.data. If it's not recorded, returns error message to user.
      
      v4: Remove perf_time__skip_sample, only uses perf_time__ranges_skip_sample
      
      v3: Since the definitions of first_sample_time/last_sample_time
          are moved from perf_session to perf_evlist so change the
          related code.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-7-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      2ab046cd
    • Jin Yao's avatar
      perf report: Support time percent and multiple time ranges · 5b969bc7
      Jin Yao authored
      perf report has a --time option to limit the time range of output.  It
      only supports absolute time.
      
      Now this option is extended to support multiple time ranges and support
      the percent of time.
      
      For example:
      
      1. Select the first and second 10% time slices:
      
      perf report --time 10%/1,10%/2
      
      2. Select from 0% to 10% and 30% to 40% slices:
      
      perf report --time 0%-10%,30%-40%
      
      Changelog:
      
      v6: Fix the merge issue with latest perf/core branch.
          No functional changes.
      
      v5: Add checking of first/last sample time to detect if it's recorded
          in perf.data. If it's not recorded, returns error message to user.
      
      v4: Remove perf_time__skip_sample, only uses perf_time__ranges_skip_sample
      
      v3: Since the definitions of first_sample_time/last_sample_time
          are moved from perf_session to perf_evlist so change the
          related code.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-6-git-send-email-yao.jin@linux.intel.com
      [ Add missing colons at end of examples in the man page ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5b969bc7
    • Jin Yao's avatar
      perf tools: Create function to perform multiple time range checking · 9a9b8b4b
      Jin Yao authored
      Previous patch supports the multiple time range.
      
      For example, select the first and second 10% time slices.
      perf report --time 10%/1,10%/2
      
      We need a function to check if a timestamp is in the ranges of
      [0, 10%) and [10%, 20%].
      
      Note that it includes the last element in [10%, 20%] but it doesn't
      include the last element in [0, 10%). It's to avoid the overlap.
      
      This patch implments a new function perf_time__ranges_skip_sample
      for this checking.
      
      Change log:
      
      v4: Let perf_time__ranges_skip_sample be compatible with
          perf_time__skip_sample when only one time range.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-5-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9a9b8b4b
    • Jin Yao's avatar
      perf tools: Create function to parse time percent · 13a70f35
      Jin Yao authored
      Current perf report/script/... have a --time option to limit the time
      range of output. But right now it only supports absolute time, add
      support for time percentage.
      
      For example:
      
      1. Select the second 10% time slice
         perf report --time 10%/2
      
      2. Select from 0% to 10% time slice
         perf report --time 0%-10%
      
      It also support the multiple time ranges.
      
      3. Select the first and second 10% time slices
         perf report --time 10%/1,10%/2
      
      4. Select from 0% to 10% and 30% to 40% slices
         perf report --time 0%-10%,30%-40%
      
      Changelog:
      
      v4: An issue is found. Following passes.
          perf script --time 10%/10x12321xsdfdasfdsafdsafdsa
      
          Now it uses strtol to replace atoi.
      
      Committer notes:
      
      This just puts in place the infrastructure, so the examples in this cset
      comment will only work later, after more patches in this series are
      applied.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-4-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      13a70f35
    • Jin Yao's avatar
      perf record: Record the first and last sample time in the header · 68588baf
      Jin Yao authored
      In the default 'perf record' configuration, all samples are processed,
      to create the HEADER_BUILD_ID table. So it's very easy to get the
      first/last samples and save the time to perf file header via the
      function write_sample_time().
      
      Later, at post processing time, perf report/script will fetch the time
      from perf file header.
      
      Committer testing:
      
        # perf record -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 2.099 MB perf.data (1101 samples) ]
        [root@jouet home]# perf report --header | grep "time of "
        # time of first sample : 22947.909226
        # time of last sample : 22948.910704
        #
        # perf report -D | grep PERF_RECORD_SAMPLE\(
        0 22947909226101 0x20bb68 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa21b1af3 period: 1 addr: 0
        0 22947909229928 0x20bb98 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa200d204 period: 1 addr: 0
        <SNIP>
        3 22948910397351 0x219360 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 28251/28251: 0xffffffffa22071d8 period: 169518 addr: 0
        0 22948910652380 0x20f120 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 198807 addr: 0
        2 22948910704034 0x2172d0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 88111 addr: 0
        #
      
      Changelog:
      
      v7: Just update the patch description according to Arnaldo's suggestion.
      
      v6: Currently '--buildid-all' is not enabled at default. So the walking
          on all samples is the default operation. There is no big overhead
          to calculate the timestamp boundary in process_sample_event handler
          once we already go through all samples. So the timestamp boundary
          calculation is enabled by default when '--buildid-all' is not enabled.
      
          While if '--buildid-all' is enabled, we creates a new option
          "--timestamp-boundary" for user to decide if it enables the
          timestamp boundary calculation.
      
      v5: There is an issue that the sample walking can only work when
          '--buildid-all' is not enabled. So we need to let the walking
          be able to work even if '--buildid-all' is enabled and let the
          processing skips the dso hit marking for this case.
      
          At first, I want to provide a new option "--record-time-boundaries".
          While after consideration, I think a new option is not very
          necessary.
      
      v3: Remove the definitions of first_sample_time and last_sample_time
          from struct record and directly save them in perf_evlist.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-3-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      68588baf
    • Jin Yao's avatar
      perf header: Add infrastructure to record first and last sample time · 6011518d
      Jin Yao authored
      perf report/script/... have a --time option to limit the time range of
      output. That's very useful to slice large traces, e.g. when processing
      the output of perf script for some analysis.
      
      But right now --time only supports absolute time. Also there is no fast
      way to get the start/end times of a given trace except for looking at
      it.  This makes it hard to e.g. only decode the first half of the trace,
      which is useful for parallelization of scripts
      
      Another problem is that perf records are variable size and there is no
      synchronization mechanism. So the only way to find the last sample
      reliably would be to walk all samples. But we want to avoid that in perf
      report/...  because it is already quite expensive. That is why storing
      the first sample time and last sample time in perf record is better.
      
      This patch creates a new header feature type HEADER_SAMPLE_TIME and
      related ops. Save the first sample time and the last sample time to the
      feature section in perf file header. That will be done when, for
      instance, processing build-ids, where we already have to process all
      samples to create the build-id table, take advantage of that to further
      amortize that processing by storing HEADER_SAMPLE_TIME to make 'perf
      report/script' faster when using --time.
      
      Committer testing:
      
      After this patch is applied the header is written with zeroes, we need
      the next patch, for "perf record" to actually write the timestamps:
      
        # perf report -D | grep PERF_RECORD_SAMPLE\(
        22501155244406 0x44f0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xffffffffa21be8c5 period: 1 addr: 0
        <SNIP>
        22501155793625 0x4a30 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xffffffffa21ffd50 period: 2828043 addr: 0
        # perf report --header | grep "time of "
        # time of first sample : 0.000000
        # time of last sample : 0.000000
        #
      
      Changelog:
      
      v7: 1. Rebase to latest perf/core branch.
      
          2. Add following clarification in patch description according to
             Arnaldo's suggestion.
      
             "That will be done when, for instance, processing build-ids,
      	where we already have to process all samples to create the
      	build-id table, take advantage of that to further amortize
      	that processing by storing HEADER_SAMPLE_TIME to make
      	'perf report/script' faster when using --time."
      
      v4: Use perf script time style for timestamp printing. Also add with
          the printing of sample duration.
      
      v3: Remove the definitions of first_sample_time/last_sample_time from
          perf_session. Just define them in perf_evlist
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512738826-2628-2-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6011518d
    • Jin Yao's avatar
      perf report: Fix a no annotate browser displayed issue · 40c39e30
      Jin Yao authored
      When enabling '-b' option in perf record, for example,
      
        perf record -b ...
        perf report
      
      and then browsing the annotate browser from perf report (press 'A'), it
      would fail (annotate browser can't be displayed).
      
      It's because the '.add_entry_cb' op of struct report is overwritten by
      hist_iter__branch_callback() in builtin-report.c. But this function doesn't do
      something like mapping symbols and sources. So next, do_annotate() will return
      directly.
      
              notes = symbol__annotation(act->ms.sym);
              if (!notes->src)
                      return 0;
      
      This patch adds the lost code to hist_iter__branch_callback (refer to
      hist_iter__report_callback).
      
      v2:
      
      Fix a crash bug when perform 'perf report --stdio'.
      
      The reason is that we init the symbol annotation only in browser mode, it
      doesn't allocate/init resources for stdio mode.
      
      So now in hist_iter__branch_callback(), it will return directly if it's not in
      browser mode.
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1514284963-18587-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      40c39e30
    • Jin Yao's avatar
      perf report: Fix a wrong offset issue when using /proc/kcore · 935f5a9d
      Jin Yao authored
      When a valid vmlinux is not found, 'perf report' falls back to look at
      /proc/kcore. In this case, it will report the impossible large offset.
      
      For example:
      
        # perf record -b -e cycles:k find /etc/ > /dev/null
        # perf report --stdio --branch-history
      
          22.77%  _vm_normal_page+18446603336221188162
                  |
                  ---page_remove_rmap +18446603336221188324
                     page_remove_rmap +18446603336221188487 (cycles:5)
                     unlock_page_memcg +18446603336221188096
                     page_remove_rmap +18446603336221188327 (cycles:1)
      
      The issue is the value which is passed to parameter 'addr' in
      __get_srcline() is the objdump address. It's not correct if we calculate
      the offset by using 'addr - sym->start'.
      
      This patch creates a new parameter 'ip' in __get_srcline(). It is not
      converted to objdump address.
      
      With this patch, the perf report output is:
      
          22.77%  _vm_normal_page+66
                  |
                  ---page_remove_rmap +228
                     page_remove_rmap +391 (cycles:5)
                     unlock_page_memcg +0
                     page_remove_rmap +231 (cycles:1)
                     page_remove_rmap +236
      
      Committer testing:
      
      Make sure you get any valid vmlinux out of the way, using '-v' on the
      'perf report' case and deleting it from places where perf searches them,
      like your kernel build dir and the build-id cache, in ~/.debug/.
      Reported-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarJin Yao <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1514564812-17344-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      935f5a9d
    • Wang Nan's avatar
      perf tools: Fix compile error with libunwind x86 · 44df1afd
      Wang Nan authored
      Fix a compile error:
      
       ...
         CC       util/libunwind/x86_32.o
       In file included from util/libunwind/x86_32.c:33:0:
       util/libunwind/../../arch/x86/util/unwind-libunwind.c: In function 'libunwind__x86_reg_id':
       util/libunwind/../../arch/x86/util/unwind-libunwind.c:110:11: error: 'EINVAL' undeclared (first use in this function)
          return -EINVAL;
                  ^
       util/libunwind/../../arch/x86/util/unwind-libunwind.c:110:11: note: each undeclared identifier is reported only once for each function it appears in
       mv: cannot stat 'util/libunwind/.x86_32.o.tmp': No such file or directory
       make[4]: *** [util/libunwind/x86_32.o] Error 1
       make[3]: *** [util] Error 2
       make[2]: *** [libperf-in.o] Error 2
       make[1]: *** [sub-make] Error 2
       make: *** [all] Error 2
      
      It happens when libunwind-x86 feature is detected.
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20171206015040.114574-1-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      44df1afd
    • Arnaldo Carvalho de Melo's avatar
      perf test bpf: Hook on epoll_pwait() · e0337f4f
      Arnaldo Carvalho de Melo authored
      The 'perf test bpf' was hooking a eBPF program on the SyS_epoll_wait()
      kernel function, that was what the epoll_wait() glibc function ended up
      calling, but since at least glibc 2.26, the one that comes with, for
      instance, Fedora 27, glibc ends up calling SyS_epoll_pwait() when
      epoll_wait() is used, causing this 'perf test' entry to fail.
      
      So switch to using epoll_pwait() and hook the eBPF program to the
      SyS_epoll_pwait() kernel function to make it work on a wider range of
      glibc and kernel versions.
      Tested-by: default avatarWang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-zynvquy63er8s5mrgsz65pto@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e0337f4f
    • Arnaldo Carvalho de Melo's avatar
      perf test bpf: Use designated struct field initializers · 13cb2d0f
      Arnaldo Carvalho de Melo authored
      To follow standard practice in the kernel sources, documenting the
      initialization better and helping quickly finding the value for some
      field in a struct with many entries.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-syn3hz9hz7ukxlxbx5x6hv20@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      13cb2d0f
    • Arnaldo Carvalho de Melo's avatar
      perf test bpf: Improve message about expected samples · 6703c977
      Arnaldo Carvalho de Melo authored
      When failing on one of the BPF tests we were just stating:
      
        BPF filter result incorrect
      
      Add some more info to help figuring out the problem:
      
       BPF filter result incorrect, expected 56, got 0 samples
      
      This came out while investigating this failure, first seen after
      updating the kernel to the 4.15.0-rc6 tag:
      
        [root@jouet ~]# perf test bpf
        39: BPF filter               :
        39.1: Basic BPF filtering    : FAILED!
        39.2: BPF pinning            : Skip
        39.3: BPF prologue generation: Skip
        39.4: BPF relocation checker : Skip
        [root@jouet ~]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-403npu7daupv6b2bmxliv5pk@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6703c977
  3. 06 Jan, 2018 3 commits
    • Ingo Molnar's avatar
      perf/x86/msr: Clean up the code · 9128d3ed
      Ingo Molnar authored
      Recent changes made a bit of an inconsistent mess out of arch/x86/events/msr.c,
      fix it:
      
       - re-align the initialization tables to be vertically aligned and readable again
      
       - harmonize comment style in terms of punctuation, capitalization and spelling
      
       - use curly braces for multi-condition branches
      
       - remove extra newlines
      
       - simplify the code a bit
      
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1515169132-3980-1-git-send-email-eranian@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9128d3ed
    • Stephane Eranian's avatar
      perf/x86/msr: Add support for MSR_IA32_THERM_STATUS · 9ae21dd6
      Stephane Eranian authored
      This patch adds support for the Digital Readout provided by the
      IA32_THERM_STATUS MSR (0x19C) on Intel X86 processors. The readout
      shows the number of degrees Celcius to the TCC (critical temperature)
      supported by the processor. Thus, the larger, the better.
      
      The perf_event support is provided via the msr PMU. The new
      logical event is called cpu_thermal_margin. It comes with a unit and
      snapshot files. The event shows the current temprature distance (margin).
      It is not an accumulating event. The unit is degrees C. The event
      is provided per logical CPU to make things simpler but it is the
      same for both hyper-threads sharing a physical core.
      
      $ perf stat -I 1000 -a -A -e msr/cpu_thermal_margin/
      
      This will print the temperature for all logical CPUs.
                   time CPU                counts unit events
           1.000123741 CPU0                    38 C    msr/cpu_thermal_margin/
           1.000161837 CPU1                    37 C    msr/cpu_thermal_margin/
           1.000187906 CPU2                    36 C    msr/cpu_thermal_margin/
           1.000189046 CPU3                    39 C    msr/cpu_thermal_margin/
           1.000283044 CPU4                    40 C    msr/cpu_thermal_margin/
           1.000344297 CPU5                    40 C    msr/cpu_thermal_margin/
           1.000365832 CPU6                    39 C    msr/cpu_thermal_margin/
           ...
      
      In case the temperature margin cannot be read, the reported value would be -1.
      
      Works on all processors supporting the Digital Readout (dtherm in cpuinfo)
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1515169132-3980-1-git-send-email-eranian@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9ae21dd6
    • Ingo Molnar's avatar
      b6815f35
  4. 05 Jan, 2018 13 commits
    • Linus Torvalds's avatar
      Merge tag 'for-4.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 89876f27
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "We have two more fixes for 4.15, both aimed for stable.
      
        The leak fix is obvious, the second patch fixes a bug revealed by the
        refcount API, when it behaves differently than previous atomic_t and
        reports refs going from 0 to 1 in one case"
      
      * tag 'for-4.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes
        btrfs: Fix flush bio leak
      89876f27
    • Linus Torvalds's avatar
      Merge tag 'xfs-4.15-fixes-10' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 12e971b6
      Linus Torvalds authored
      Pull XFS fixes from Darrick Wong:
       "I have just a few fixes for bugs and resource cleanup problems this
        week:
      
         - Fix resource cleanup of failed quota initialization
      
         - Fix integer overflow problems wrt s_maxbytes"
      
      * tag 'xfs-4.15-fixes-10' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: fix s_maxbytes overflow problems
        xfs: quota: check result of register_shrinker()
        xfs: quota: fix missed destroy of qi_tree_lock
      12e971b6
    • Linus Torvalds's avatar
      Merge tag 'mfd-fixes-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · f842839c
      Linus Torvalds authored
      Pull MFD fix from Lee Jones:
       "Late bugfix to plug a leak in rtsx_pcr"
      
      * tag 'mfd-fixes-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
        mfd: rtsx: Release IRQ during shutdown
      f842839c
    • Linus Torvalds's avatar
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · abb7099d
      Linus Torvalds authored
      Pull  more x86 pti fixes from Thomas Gleixner:
       "Another small stash of fixes for fallout from the PTI work:
      
         - Fix the modules vs. KASAN breakage which was caused by making
           MODULES_END depend of the fixmap size. That was done when the cpu
           entry area moved into the fixmap, but now that we have a separate
           map space for that this is causing more issues than it solves.
      
         - Use the proper cache flush methods for the debugstore buffers as
           they are mapped/unmapped during runtime and not statically mapped
           at boot time like the rest of the cpu entry area.
      
         - Make the map layout of the cpu_entry_area consistent for 4 and 5
           level paging and fix the KASLR vaddr_end wreckage.
      
         - Use PER_CPU_EXPORT for per cpu variable and while at it unbreak
           nvidia gfx drivers by dropping the GPL export. The subject line of
           the commit tells it the other way around, but I noticed that too
           late.
      
         - Fix the ASM alternative macros so they can be used in the middle of
           an inline asm block.
      
         - Rename the BUG_CPU_INSECURE flag to BUG_CPU_MELTDOWN so the attack
           vector is properly identified. The Spectre mitigations will come
           with their own bug bits later"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN
        x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm
        x86/tlb: Drop the _GPL from the cpu_tlbstate export
        x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers
        x86/kaslr: Fix the vaddr_end mess
        x86/mm: Map cpu_entry_area at the same place on 4/5 level
        x86/mm: Set MODULES_END to 0xffffffffff000000
      abb7099d
    • Linus Torvalds's avatar
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b03acc4c
      Linus Torvalds authored
      Pull EFI updates from Thomas Gleixner:
      
       - A fix for a add_efi_memmap parameter regression which ensures that
         the parameter is parsed before it is used.
      
       - Reinstate the virtual capsule mapping as the cached copy turned out
         to break Quark and other things
      
       - Remove Matt Fleming as EFI co-maintainer. He stepped back a few days
         ago. Thanks Matt for all your great work!
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        MAINTAINERS: Remove Matt Fleming as EFI co-maintainer
        efi/capsule-loader: Reinstate virtual capsule mapping
        x86/efi: Fix kernel param add_efi_memmap regression
      b03acc4c
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3eac6903
      Linus Torvalds authored
      Pull s390 fixes from Martin Schwidefsky:
       "Four bug fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/dasd: fix wrongly assigned configuration data
        s390: fix preemption race in disable_sacf_uaccess
        s390/sclp: disable FORTIFY_SOURCE for early sclp code
        s390/pci: handle insufficient resources during dma tlb flush
      3eac6903
    • Linus Torvalds's avatar
      Merge tag 'for-linus-4.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 925cbd7e
      Linus Torvalds authored
      Pull xen fix from Juergen Gross:
       "One minor fix adjusting the kmalloc flags in the new pvcalls driver
        added in rc1"
      
      * tag 'for-linus-4.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/pvcalls: use GFP_ATOMIC under spin lock
      925cbd7e
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 64648a5f
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "This fixes the following issues:
      
         - racy use of ctx->rcvused in af_alg
      
         - algif_aead crash in chacha20poly1305
      
         - freeing bogus pointer in pcrypt
      
         - build error on MIPS in mpi
      
         - memory leak in inside-secure
      
         - memory overwrite in inside-secure
      
         - NULL pointer dereference in inside-secure
      
         - state corruption in inside-secure
      
         - build error without CRYPTO_GF128MUL in chelsio
      
         - use after free in n2"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: inside-secure - do not use areq->result for partial results
        crypto: inside-secure - fix request allocations in invalidation path
        crypto: inside-secure - free requests even if their handling failed
        crypto: inside-secure - per request invalidation
        lib/mpi: Fix umul_ppmm() for MIPS64r6
        crypto: pcrypt - fix freeing pcrypt instances
        crypto: n2 - cure use after free
        crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t
        crypto: chacha20poly1305 - validate the digest size
        crypto: chelsio - select CRYPTO_GF128MUL
      64648a5f
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · d8887f1c
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "9 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mailmap: update Mark Yao's email address
        userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails
        mm/sparse.c: wrong allocation for mem_section
        mm/zsmalloc.c: include fs.h
        mm/debug.c: provide useful debugging information for VM_BUG
        kernel/exit.c: export abort() to modules
        mm/mprotect: add a cond_resched() inside change_pmd_range()
        kernel/acct.c: fix the acct->needcheck check in check_free_space()
        mm: check pfn_valid first in zero_resv_unavail
      d8887f1c
    • Thomas Gleixner's avatar
      x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN · de791821
      Thomas Gleixner authored
      Use the name associated with the particular attack which needs page table
      isolation for mitigation.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
      Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
      Cc: Jiri Koshina <jikos@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Andi Lutomirski  <luto@amacapital.net>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Greg KH <gregkh@linux-foundation.org>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801051525300.1724@nanos
      de791821
    • David Woodhouse's avatar
      x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm · b9e705ef
      David Woodhouse authored
      Where an ALTERNATIVE is used in the middle of an inline asm block, this
      would otherwise lead to the following instruction being appended directly
      to the trailing ".popsection", and a failed compile.
      
      Fixes: 9cebed42 ("x86, alternative: Use .pushsection/.popsection")
      Signed-off-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: ak@linux.intel.com
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180104143710.8961-8-dwmw@amazon.co.uk
      b9e705ef
    • Sinan Kaya's avatar
      mfd: rtsx: Release IRQ during shutdown · 107b7d9f
      Sinan Kaya authored
      'Commit cc27b735 ("PCI/portdrv: Turn off PCIe services during
      shutdown")' revealed a resource leak in rtsx_pci driver during shutdown.
      
      Issue shows up as a warning during shutdown as follows:
      
      remove_proc_entry: removing non-empty directory 'irq/17', leaking at least
      'rtsx_pci'
      WARNING: CPU: 0 PID: 1578 at fs/proc/generic.c:572
      remove_proc_entry+0x11d/0x130
      Modules linked in <long list but none that are out-of-tree>
      ...
      Call Trace:
      unregister_irq_proc
      free_desc
      irq_free_descs
      mp_unmap_irq
      acpi_unregister_gsi_apic
      acpi_pci_irq_disable
      do_pci_disable_device
      pci_disable_device
      device_shutdown
      kernel_restart
      Sys_reboot
      
      Even though rtsx_pci driver implements a shutdown callback, it is not
      releasing the interrupt that it registered during probe. This is causing
      the ACPI layer to complain that the shared IRQ is in use while freeing
      IRQ.
      
      This code releases the IRQ to prevent resource leak and eliminate the
      warning.
      
      Fixes: cc27b735 ("PCI/portdrv: Turn off PCIe services during shutdown")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=198141Reported-by: default avatarChris Clayton <chris2553@googlemail.com>
      Signed-off-by: default avatarSinan Kaya <okaya@codeaurora.org>
      Reviewed-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      107b7d9f
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-for-v4.15-rc7' of git://people.freedesktop.org/~airlied/linux · 5866bec2
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Just collecting some fixes to finish my hoildays :-).
      
        A few fixes for i915 (one documentation build fix), one ttm fix, one
        AMD display fix, one omapdrm fix, and a set of armada fixes from
        Russell.
      
        All seem pretty small, you can now return to your latest security news
        site"
      
      * tag 'drm-fixes-for-v4.15-rc7' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: Apply Display WA #1183 on skl, kbl, and cfl
        drm/ttm: check the return value of kzalloc
        drm/amd/display: call set csc_default if enable adjustment is false
        docs: fix, intel_guc_loader.c has been moved to intel_guc_fw.c
        omapdrm/dss/hdmi4_cec: fix interrupt handling
        documentation/gpu/i915: fix docs build error after file rename
        drm/i915: Put all non-blocking modesets onto an ordered wq
        drm/i915: Disable DC states around GMBUS on GLK
        drm/i915/psr: Fix register name mess up.
        drm/armada: fix YUV planar format framebuffer offsets
        drm/armada: improve efficiency of armada_drm_plane_calc_addrs()
        drm/armada: fix UV swap code
        drm/armada: fix SRAM powerdown
        drm/armada: fix leak of crtc structure
      5866bec2