1. 03 Jun, 2016 1 commit
  2. 29 May, 2016 1 commit
  3. 27 May, 2016 3 commits
    • Wang Nan's avatar
      perf ctf: Convert invalid chars in a string before set value · 5ea5888b
      Wang Nan authored
      We observed some crazy apps on Android set their comm to unprintable
      string. For example:
      
        # cat /proc/10607/task/*/comm
        tencent.qqmusic
        ...
        Binder_2
        日志输出线  <-- Chinese word 'log output thread'
        WifiManager
        ...
      
      'perf data convert' fails to convert perf.data with such string to CTF format.
      
      For example:
      
        # cat << EOF > ./badguy.c
        #include <sys/prctl.h>
        int main(int argc, char *argv[])
        {
               prctl(PR_SET_NAME, "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf");
               while(1)
                       sleep(1);
               return 0;
        }
        EOF
        # gcc ./badguy.c
        # perf record -e sched:* ./a.out
        # perf data convert --to-ctf ./bad.ctf
        CTF stream 4 flush failed
        [ perf data convert: Converted 'perf.data' into CTF data './bad.ctf' ]
        [ perf data convert: Converted and wrote 0.008 MB (78 samples)  ]
        # babeltrace ./bad.ctf/
        [error] Packet size (18446744073709551615 bits) is larger than remaining file size (262144 bits).
        [error] Stream index creation error.
        [error] Open file stream error.
        [warning] [Context] Cannot open_trace of format ctf at path ./bad.ctf.
        [warning] [Context] cannot open trace "./bad.ctf" from ./bad.ctf/ for reading.
        [error] Cannot open any trace for reading.
      
        [error] opening trace "./bad.ctf/" for reading.
      
        [error] none of the specified trace paths could be opened.
      
      This patch converts unprintable characters to hexadecimal word.
      
      After applying this patch the above test works correctly:
      
        # ~/perf data convert --to-ctf ./good.ctf
        [ perf data convert: Converted 'perf.data' into CTF data './good.ctf' ]
        [ perf data convert: Converted and wrote 0.008 MB (78 samples) ]
        # babeltrace ./good.ctf
        ..
        [23:14:35.491665268] (+0.000001100) sched:sched_wakeup: { cpu_id = 4 }, { perf_ip = 0xFFFFFFFF810AEF33, perf_tid = 0, perf_pid = 0, perf_id = 5123, perf_period = 1, common_type = 270, common_flags = 45, common_preempt_count = 4, common_pid = 0, comm = "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf", pid = 1057, prio = 120, success = 1, target_cpu = 4 }
        [23:14:35.491666230] (+0.000000962) sched:sched_wakeup: { cpu_id = 4 }, { perf_ip = 0xFFFFFFFF810AEF33, perf_tid = 0, perf_pid = 0, perf_id = 5122, perf_period = 1, common_type = 270, common_flags = 45, common_preempt_count = 4, common_pid = 0, comm = "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf", pid = 1057, prio = 120, success = 1, target_cpu = 4 }
        ..
      
      Committer note:
      
      To build perf with libabeltrace, use:
      
        $ mkdir -p /tmp/build/perf
        $ make LIBBABELTRACE=1 LIBBABELTRACE_DIR=/usr/local O=/tmp/build/perf -C tools/perf install-bin
      
      Or equivalent (no O=, fixup LIBBABELTRACE_DIR, etc).
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1464348951-179595-1-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5ea5888b
    • Wang Nan's avatar
      perf record: Fix crash when kptr is restricted · 3dc6c1d5
      Wang Nan authored
      Before this patch, a simple 'perf record' could fail if kptr_restrict is
      set to 1 (for normal user) or 2 (for root):
      
        # perf record ls
        WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
        check /proc/sys/kernel/kptr_restrict.
      
        Samples in kernel functions may not be resolved if a suitable vmlinux
        file is not found in the buildid cache or in the vmlinux path.
      
        Samples in kernel modules won't be resolved at all.
      
        If some relocation was applied (e.g. kexec) symbols may be misresolved
        even with a suitable vmlinux or kallsyms file.
      
        Segmentation fault (core dumped)
      
      This patch skips perf_event__synthesize_kernel_mmap() when kptr is not
      available.
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Fixes: 45e90056 ("perf machine: Do not bail out if not managing to read ref reloc symbol")
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1464081688-167940-2-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3dc6c1d5
    • Wang Nan's avatar
      perf symbols: Check kptr_restrict for root · 38272dc4
      Wang Nan authored
      If kptr_restrict is set to 2, even root is not allowed to see pointers.
      This patch checks kptr_restrict even if euid == 0. For root, report
      error if kptr_restrict is 2.
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1464081688-167940-1-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      38272dc4
  4. 25 May, 2016 1 commit
  5. 24 May, 2016 1 commit
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo-20160523' of... · 0c9f790f
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-20160523' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/core improvements from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
      - Add "srcline_from" and "srcline_to" branch sort keys to 'perf top' and
        'perf report' (Andi Kleen)
      
      Infrastructure changes:
      
      - Make 'perf trace' auto-attach fd->name and ptr->name beautifiers based
        on the name of syscall arguments, this way new syscalls that have
        'const char * (path,pathname,filename)' will use the fd->name beautifier
        (vfs_getname perf probe, if in place) and the 'fd->name' (vfs_getname
        or via /proc/PID/fd/) (Arnaldo Carvalho de Melo)
      
      - Infrastructure to read from a ring buffer in backward write mode (Wang Nan)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      0c9f790f
  6. 23 May, 2016 7 commits
  7. 20 May, 2016 15 commits
  8. 19 May, 2016 1 commit
  9. 18 May, 2016 1 commit
  10. 17 May, 2016 9 commits
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Separate accounting of contexts and real addresses in a stack trace · a29d5c9b
      Arnaldo Carvalho de Melo authored
      The perf_sample->ip_callchain->nr value includes all the entries in the
      ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
      while what the user expects is that what is in the kernel.perf_event_max_stack
      sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
      honoured in terms of IP addresses in the stack trace.
      
      So match the kernel support and validate chain->nr taking into account
      both kernel.perf_event_max_stack and kernel.perf_event_max_contexts_per_stack.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-mgx0jpzfdq4uq4abfa40byu0@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a29d5c9b
    • Arnaldo Carvalho de Melo's avatar
      perf core: Separate accounting of contexts and real addresses in a stack trace · c85b0334
      Arnaldo Carvalho de Melo authored
      The perf_sample->ip_callchain->nr value includes all the entries in the
      ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
      while what the user expects is that what is in the kernel.perf_event_max_stack
      sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
      honoured in terms of IP addresses in the stack trace.
      
      So allocate a bunch of extra entries for contexts, and do the accounting
      via perf_callchain_entry_ctx struct members.
      
      A new sysctl, kernel.perf_event_max_contexts_per_stack is also
      introduced for investigating possible bugs in the callchain
      implementation by some arch.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c85b0334
    • Arnaldo Carvalho de Melo's avatar
      perf core: Add perf_callchain_store_context() helper · 3e4de4ec
      Arnaldo Carvalho de Melo authored
      We need have different helpers to account how many contexts we have in
      the sample and for real addresses, so do it now as a prep patch, to
      ease review.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-q964tnyuqrxw5gld18vizs3c@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3e4de4ec
    • Arnaldo Carvalho de Melo's avatar
      perf core: Add a 'nr' field to perf_event_callchain_context · 3b1fff08
      Arnaldo Carvalho de Melo authored
      We will use it to count how many addresses are in the entry->ip[] array,
      excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
      return the number of entries specified by the user via the relevant
      sysctl, kernel.perf_event_max_contexts, or via the per event
      perf_event_attr.sample_max_stack knob.
      
      This way we keep the perf_sample->ip_callchain->nr meaning, that is the
      number of entries, be it real addresses or PERF_CONTEXT_ entries, while
      honouring the max_stack knobs, i.e. the end result will be max_stack
      entries if we have at least that many entries in a given stack trace.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3b1fff08
    • Arnaldo Carvalho de Melo's avatar
      perf core: Pass max stack as a perf_callchain_entry context · cfbcf468
      Arnaldo Carvalho de Melo authored
      This makes perf_callchain_{user,kernel}() receive the max stack
      as context for the perf_callchain_entry, instead of accessing
      the global sysctl_perf_event_max_stack.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      cfbcf468
    • Arnaldo Carvalho de Melo's avatar
      perf core: Generalize max_stack sysctl handler · a831100a
      Arnaldo Carvalho de Melo authored
      So that it can be used for other stack related knobs, such as the
      upcoming one to tweak the max number of of contexts per stack sample.
      
      In all those cases we can only change the value if there are no perf
      sessions collecting stacks, so they need to grab that mutex, etc.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-8t3fk94wuzp8m2z1n4gc0s17@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a831100a
    • Masami Hiramatsu's avatar
      perf symbols: Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE · 0a77582f
      Masami Hiramatsu authored
      Instead of using a raw string, use DSO__NAME_KALLSYMS and
      DSO__NAME_KCORE macros for kallsyms and kcore.
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20160515031935.4017.50971.stgit@devboxSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0a77582f
    • Namhyung Kim's avatar
      perf stat: Use cpu-clock event for cpu targets · a1f3d567
      Namhyung Kim authored
      Currently 'perf stat' always counts task-clock event by default.  But
      it's somewhat confusing for system-wide targets (especially with 'sleep
      N' as the 'sleep' task just sleeps and doesn't use cputime).  Changing
      to cpu-clock event instead for that case makes more sense IMHO.
      
      Before:
        # perf stat -a sleep 0.1
      
         Performance counter stats for 'system wide':
      
              403.038603      task-clock (msec)     #    4.001 CPUs utilized
                     150      context-switches      #    0.372 K/sec
                       7      cpu-migrations        #    0.017 K/sec
                      71      page-faults           #    0.176 K/sec
              23,705,169      cycles                #    0.059 GHz
              15,888,166      instructions          #    0.67  insn per cycle
               3,326,078      branches              #    8.253 M/sec
                  87,643      branch-misses         #    2.64% of all branches
      
             0.100737009 seconds time elapsed
      
        #
      
      After:
      
        # perf stat -a sleep 0.1
      
         Performance counter stats for 'system wide':
      
              404.271182      cpu-clock (msec)      #    4.000 CPUs utilized
                     143      context-switches      #    0.354 K/sec
                      13      cpu-migrations        #    0.032 K/sec
                      73      page-faults           #    0.181 K/sec
              22,119,220      cycles                #    0.055 GHz
              13,622,065      instructions          #    0.62  insn per cycle
               2,918,769      branches              #    7.220 M/sec
                  85,033      branch-misses         #    2.91% of all branches
      
             0.101073089 seconds time elapsed
      
        #
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1463119263-5569-3-git-send-email-namhyung@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a1f3d567
    • Namhyung Kim's avatar
      perf stat: Update runtime using cpu-clock event · daf4f478
      Namhyung Kim authored
      Currently only the task-clock event updates the runtime_nsec so it
      cannot show the metric when using cpu-clock events.  However cpu clock
      works basically same as task-clock, so no need to not update the runtime
      IMHO.
      
      Before:
      
        # perf stat -a -e cpu-clock,context-switches,page-faults,cycles sleep 0.1
      
          Performance counter stats for 'system wide':
      
               1217.759506      cpu-clock (msec)
                        93      context-switches
                        61      page-faults
                18,958,022      cycles
      
               0.101393794 seconds time elapsed
      
      After:
      
         Performance counter stats for 'system wide':
      
               1220.471884      cpu-clock (msec)          #   12.013 CPUs utilized
                       118      context-switches          #    0.097 K/sec
                        59      page-faults               #    0.048 K/sec
                17,941,247      cycles                    #    0.015 GHz
      
               0.101594777 seconds time elapsed
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1463119263-5569-2-git-send-email-namhyung@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      daf4f478