1. 26 Nov, 2010 2 commits
    • perf bench: Add feature that measures the performance of the arch/x86/lib/memcpy_64.S memcpy routines via 'perf bench mem' · ea7872b9
      Hitoshi Mitake authored
      
      This patch ports arch/x86/lib/memcpy_64.S to perf bench mem
      memcpy so that memcpy() can be benchmarked in userland, in a
      tricky and dirty way.
      
      util/include/asm/cpufeature.h, util/include/asm/dwarf2.h, and
      util/include/linux/linkage.h are mostly dummy files with small
      wrappers, so that we are able to include memcpy_64.S
      unmodified.
      Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: h.mitake@gmail.com
      Cc: Miao Xie <miaox@cn.fujitsu.com>
      Cc: Ma Ling <ling.ma@intel.com>
      Cc: Zhao Yakui <yakui.zhao@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      LKML-Reference: <1290668693-27068-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf bench: Print both of prefaulted and no prefaulted results by default · 49ce8fc6
      Hitoshi Mitake authored
      After applying this patch, perf bench mem memcpy prints both the
      prefaulted and the non-prefaulted memcpy() scores.

      New options --no-prefault and --only-prefault are added
      to print a single result, mainly for scripting.
      
      Usage example:
      
       | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB
       | # Running mem/memcpy benchmark...
       | # Copying 500MB Bytes ...
       |
       |      634.969014 MB/Sec
       |        4.828062 GB/Sec (with prefault)
       | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB --only-prefault
       | # Running mem/memcpy benchmark...
       | # Copying 500MB Bytes ...
       |
       |        4.705192 GB/Sec (with prefault)
       | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB --no-prefault
       | # Running mem/memcpy benchmark...
       | # Copying 500MB Bytes ...
       |
       |      642.725568 MB/Sec
      Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: h.mitake@gmail.com
      Cc: Miao Xie <miaox@cn.fujitsu.com>
      Cc: Ma Ling <ling.ma@intel.com>
      Cc: Zhao Yakui <yakui.zhao@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      LKML-Reference: <1290668693-27068-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 20 Nov, 2010 1 commit
    • perf stat: Change and clean up sys_perf_event_open error handling · d9cf837e
      Corey Ashford authored
      This patch makes several changes to "perf stat":
      
      - "perf stat" will no longer go ahead and run the application when one or
      more of the specified events could not be opened.
      - Use error() and die() instead of pr_err() so that the output is more
      consistent with "perf top" and "perf record".
      - Handle permission errors in a more robust way, and in a similar way to
      "perf record" and "perf top".
      
      In addition, the sys_perf_event_open() error handling of "perf top" and "perf
      record" is made more consistent, and the following phrase is now printed when
      an event doesn't open (with something other than an access or permission error):
      
      "/bin/dmesg may provide additional information."
      
      This is added because kernel code doesn't have a good way of expressing
      detailed errors to user space, so its only avenue is to use printk's.  However,
      many users may not think of looking at dmesg to find out why an event is being
      rejected.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <fweisbec@gmail.com>
      Cc: Ian Munsie <ianmunsi@au1.ibm.com>
      Cc: Michael Ellerman <michaele@au1.ibm.com>
      LKML-Reference: <1290217044-26293-1-git-send-email-cjashfor@linux.vnet.ibm.com>
      Signed-off-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 19 Nov, 2010 3 commits
    • perf tools: Change my maintainer address · 4aafd3f7
      Arnaldo Carvalho de Melo authored
      Also remove old snail mail address from CREDITS, moved years ago.
      
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Remove hardcoded include paths for elfutils · a7112397
      Robert Morell authored
      This change removes the use of hardcoded absolute "/usr/include/elfutils" paths
      from the perf build.  The problem with hardcoded paths is that they cannot be
      overridden by $prefix or by -I in CFLAGS (e.g., for cross-compiling purposes).
      
      Instead, just include the "elfutils/" subdirectory as a relative path when
      files are needed from that directory.
      
      Tested by building perf:
      - Cross-compiled for ARM on x86_64
      - Built natively on x86_64
      - Built on x86_64 with /usr/include/elfutils moved to another location
        and manually included in CFLAGS
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      LKML-Reference: <1289945793-31441-1-git-send-email-rmorell@nvidia.com>
      Signed-off-by: Robert Morell <rmorell@nvidia.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Add no-aggregation mode to -a · f5b4a9c3
      Stephane Eranian authored
      This patch adds a new -A option to perf stat. If specified then perf stat does
      not aggregate counts across all monitored CPUs in system-wide mode, i.e., when
      using -a. This option is not supported in per-thread mode.
      
      Being able to get a per-cpu breakdown is useful to detect imbalances between
      CPUs when running a uniform workload that spans all monitored CPUs.
      
      The second version corrects the missing cpumap[] support, so that it works when
      the -C option is used.
      
      The third version fixes a missing cpumap[] in print_counter() and removes a
      stray patch in builtin-trace.c.
      
      Examples on a 4-way system:
      
      # perf stat -a   -e cycles,instructions -- sleep 1
       Performance counter stats for 'sleep 1':
               9592808135  cycles
               3490380006  instructions             #      0.364 IPC
              1.001584632  seconds time elapsed
      
      # perf stat -a -A -e cycles,instructions -- sleep 1
       Performance counter stats for 'sleep 1':
      CPU0            2398163767  cycles
      CPU1            2398180817  cycles
      CPU2            2398217115  cycles
      CPU3            2398247483  cycles
      CPU0             872282046  instructions             #      0.364 IPC
      CPU1             873481776  instructions             #      0.364 IPC
      CPU2             872638127  instructions             #      0.364 IPC
      CPU3             872437789  instructions             #      0.364 IPC
              1.001556052  seconds time elapsed
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      LKML-Reference: <4ce257b5.1e07e30a.7b6b.3aa9@mx.google.com>
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 18 Nov, 2010 10 commits
    • Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core · ae51ce90
      Ingo Molnar authored
    • tracing: Remove useless syscall ftrace_event_call declaration · 423478cd
      Frederic Weisbecker authored
      It is defined right after, which makes the declaration completely
      useless.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
    • tracing: Allow syscall trace events for non privileged users · 53cf810b
      Frederic Weisbecker authored
      As with the raw syscall events, individual syscall events won't
      leak system-wide information in task-bound tracing. Allow
      non-privileged users to use them in such a workflow.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
    • tracing: Allow raw syscall trace events for non privileged users · fe554203
      Frederic Weisbecker authored
      This allows non privileged users to use the raw syscall trace events
      for task bound tracing in perf.
      
      It is safe because raw syscall trace events don't leak system-wide
      information.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
    • tracing: New macro to set up initial event flags value · 1ed0c597
      Frederic Weisbecker authored
      This introduces the new TRACE_EVENT_FLAGS() macro, which sets
      up the initial flags value of a trace event.

      The macro simply follows the definition of a trace event and
      takes the event name and the flag value as parameters:
      
      TRACE_EVENT(my_event, .....
      ....
      );
      
      TRACE_EVENT_FLAGS(my_event, 1)
      
      This will set up 1 as the initial my_event->flags value.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
    • tracing: New flag to allow non privileged users to use a trace event · 61c32659
      Frederic Weisbecker authored
      This adds a new internal trace event flag that allows an event to be
      used in perf by non-privileged users in the case of task-bound tracing.

      This is desired for syscall tracepoints because, unlike some other
      tracepoints, they don't leak global system information.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
    • x86: Eliminate bp argument from the stack tracing routines · 9c0729dc
      Soeren Sandmann Pedersen authored
      The various stack tracing routines take a 'bp' argument in which the
      caller is supposed to provide the base pointer to use, or 0 if it doesn't
      have one. Since bp is garbage whenever CONFIG_FRAME_POINTER is not
      defined, this means all callers in principle should either always pass
      0, or be conditional on CONFIG_FRAME_POINTER.
      
      However, there are only really three use cases for stack tracing:
      
      (a) Trace the current task, including IRQ stack if any
      (b) Trace the current task, but skip IRQ stack
      (c) Trace some other task
      
      In all cases, if CONFIG_FRAME_POINTER is not defined, bp should just
      be 0.  If it _is_ defined, then
      
      - in case (a) bp should be gotten directly from the CPU's register, so
        the caller should pass NULL for regs,
      
      - in case (b) the caller should pass the IRQ registers to
        dump_trace(),
      
      - in case (c) bp should be gotten from the top of the task's stack, so
        the caller should pass NULL for regs.
      
      Hence, the bp argument is not necessary because the combination of
      task and regs is sufficient to determine an appropriate value for bp.
      
      This patch introduces a new inline function stack_frame(task, regs)
      that computes the desired bp. This function is then called from the
      two versions of dump_stack().
      Signed-off-by: Soren Sandmann <ssp@redhat.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <m3oc9rop28.fsf@dhcp-100-3-82.bos.redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • x86, nmi_watchdog: Remove all stub function calls from old nmi_watchdog · 072b198a
      Don Zickus authored
      Now that the bulk of the old nmi_watchdog is gone, remove all
      the stub variables and hooks associated with it.
      
      This touches lots of files mainly because of how the io_apic
      nmi_watchdog was implemented.  Now that the io_apic nmi_watchdog
      is forever gone, remove all its fingers.
      
      Most of this code was not being exercised by virtue of
      nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything
      too risky here.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: fweisbec@gmail.com
      Cc: gorcunov@openvz.org
      LKML-Reference: <1289578944-28564-3-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, nmi_watchdog: Remove the old nmi_watchdog · 5f2b0ba4
      Don Zickus authored
      Now that we have a new nmi_watchdog that is more generic and
      sits on top of the perf subsystem, we really do not need the old
      nmi_watchdog any more.
      
      In addition, the old nmi_watchdog doesn't really work if you are
      using the default clocksource, hpet.  The old nmi_watchdog code
      relied on local apic interrupts to determine if the cpu is still
      alive.  With hpet as the clocksource, these interrupts don't
      increment any more and the old nmi_watchdog triggers false
      positives.
      
      This piece removes the old nmi_watchdog code and stubs out any
      variables and function calls.  The stubs are the same ones used
      by the new nmi_watchdog code, so it should be well tested.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: fweisbec@gmail.com
      Cc: gorcunov@openvz.org
      LKML-Reference: <1289578944-28564-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'tip/perf/urgent-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent · a89d4bd0
      Ingo Molnar authored
  5. 16 Nov, 2010 1 commit
  6. 15 Nov, 2010 23 commits