1. 20 May, 2010 5 commits
    • Frederic Weisbecker's avatar
      perf: Fix forgotten preempt_enable by nested writers · acd35a46
      Frederic Weisbecker authored
      A writer that gets a reference to the buffer handle disables
      preemption. When we put that reference, we check if we are
      the outer most writer and if not, we simply return and defer
      the head update to the outer most writer. The problem here
      is that preemption is only reenabled by the outer most, that
      produces preemption count imbalance for every nested writer
      that exit.
      
      So just don't forget to always re-enable preemption when we
      put the buffer reference, whoever we are.
      
      Fixes lots of sleeping in atomic warnings, visible with lock
      events recording.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
      acd35a46
    • Ingo Molnar's avatar
      Merge branch 'perf/urgent' of... · dfacc4d6
      Ingo Molnar authored
      Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
      dfacc4d6
    • Frederic Weisbecker's avatar
      perf: Fix unaligned accesses while fetching trace values · 85cb68b2
      Frederic Weisbecker authored
      Accessing trace values of an 8 size may end up in a segfault
      on archs that can't deal with misaligned access, which is the
      case for sparc 64. This is because PERF_SAMPLE_RAW are aligned
      to 4 and not to 8.
      
      Fix this on the macros that get the values of 8 size.
      
      This fixes segfaults on perf tools in sparc 64.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: David Miller <davem@davemloft.net>
      85cb68b2
    • Frederic Weisbecker's avatar
      perf: Comply with new rcu checks API · 49f135ed
      Frederic Weisbecker authored
      The software events hlist doesn't fully comply with the new
      rcu checks api.
      
      We need to consider three different sides that access the hlist:
      
      - the hlist allocation/release side. This side happens when an
        events is created or released, accesses to the hlist are
        serialized under the cpuctx mutex.
      
      - the events insertion/removal in the hlist. This side is always
        serialized against the above one. The hlist is always present
        during such operations. This side happens when a software event
        is scheduled in/out. The serialization that ensures the software
        event is really attached to the context is made under the
        ctx->lock.
      
      - events triggering. This is the read side, it can happen
        concurrently with any update side.
      
      This patch deals with them one by one and anticipates with the
      separate rcu mem space patches in preparation.
      
      This patch fixes various annoying rcu warnings.
      Reported-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      49f135ed
    • Tom Zanussi's avatar
      perf: Use read() instead of lseek() in trace_event_read.c:skip() · cbb5cf7f
      Tom Zanussi authored
      This is a small fix for a problem affecting live-mode, introduced
      recently:
      
      root@tropicana:~# perf trace rwtop
      perf trace started with Perl
      script /root/libexec/perf-core/scripts/perl/rwtop.pl
      
        Fatal: did not read header event
      
      commit d00a47cc added a skip()
      function to skip over e.g. header_page, but this doesn't work for
      live mode.  This patch re-implements skip() to use read() instead of
      lseek() to fix that.
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1273032130.6383.28.camel@tropicana>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      cbb5cf7f
  2. 19 May, 2010 10 commits
    • Arnaldo Carvalho de Melo's avatar
      perf session: Make read_build_id routines look at the host_machine too · f869097e
      Arnaldo Carvalho de Melo authored
      The changes made to support host and guest machines in a session, that
      started when the 'perf kvm' tool was introduced ended up introducing a
      bug where the host_machine was not having its DSOs traversed for
      build-id processing.
      
      Fix it by moving some methods to the right classes and considering the
      host_machine when processing build-ids.
      Reported-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Reported-by: default avatarStephane Eranian <eranian@google.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f869097e
    • Arnaldo Carvalho de Melo's avatar
      perf symbols: Don't try to read the build-id twice · f6e1467d
      Arnaldo Carvalho de Melo authored
      In __dsos__read_build_ids if the dso already had its build-id read,
      don't try again.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f6e1467d
    • Cyrill Gorcunov's avatar
      perf, x86: P4 PMU -- add missing bit in CCCR mask · ce7f1545
      Cyrill Gorcunov authored
      Should be there for the sake of RAW events.
      Signed-off-by: default avatarCyrill Gorcunov <gorcunov@openvz.org>
      CC: Lin Ming <ming.m.lin@intel.com>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100518212439.354345151@openvz.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ce7f1545
    • Cyrill Gorcunov's avatar
      perf, x86: P4_pmu_schedule_events -- use smp_processor_id instead of raw_ · 9d36dfcf
      Cyrill Gorcunov authored
      This snippet somehow escaped the commit:
      
       | commit 137351e0
       | Author: Cyrill Gorcunov <gorcunov@openvz.org>
       | Date:   Sat May 8 15:25:52 2010 +0400
       |
       |    x86, perf: P4 PMU -- protect sensible procedures from preemption
      
      so bring it eventually back. It helps to catch
      preemption issue (if there will be, rule of thumb --
      don't use raw_ if you can).
      Signed-off-by: default avatarCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100518212439.167259349@openvz.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      9d36dfcf
    • Cyrill Gorcunov's avatar
      perf, x86: P4 PMU -- do a real check for ESCR address being in hash · 623aab89
      Cyrill Gorcunov authored
      To prevent from clashes in future code modifications
      do a real check for ESCR address being in hash. At
      moment the callers are known to pass sane values but
      better to be on a safe side.
      
      And comment fix.
      Signed-off-by: default avatarCyrill Gorcunov <gorcunov@openvz.org>
      CC: Lin Ming <ming.m.lin@intel.com>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100518212439.004503600@openvz.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      623aab89
    • Arnaldo Carvalho de Melo's avatar
      perf tools: remove xstrndup, xmalloc, xzalloc · 151f85a4
      Arnaldo Carvalho de Melo authored
      All the functions that call this can handle the equivalent, non
      panic'ing wrapped routines.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      151f85a4
    • Arnaldo Carvalho de Melo's avatar
      perf probe: Don't call die() · 8a7ddad8
      Arnaldo Carvalho de Melo authored
      Functions that were calling xzalloc also returned -1 when, for other
      reasons, it could fail, and the calleds are coping with failures, so
      stop using die() and xzalloc().
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      8a7ddad8
    • Arnaldo Carvalho de Melo's avatar
      perf probe: Fix some error exit paths · b448c4b6
      Arnaldo Carvalho de Melo authored
      That could leave filedescriptors open and leak memory. Also stop using
      xmalloc, use malloc and handle results just like other error cases in
      the same routine that used it.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b448c4b6
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Remove some unused functions · a41794cd
      Arnaldo Carvalho de Melo authored
      Without the bloated cplus_demangle from binutils, i.e building with:
      
      $ make NO_DEMANGLE=1 O=~acme/git/build/perf -j3 -C tools/perf/ install
      
      Before:
      
         text	   data	    bss	    dec	    hex	filename
       471851	  29280	4025056	4526187	 45106b	/home/acme/bin/perf
      
      After:
      
      [acme@doppio linux-2.6-tip]$ size ~/bin/perf
         text	   data	    bss	    dec	    hex	filename
       446886	  29232	4008576	4484694	 446e56	/home/acme/bin/perf
      
      So its a 5.3% size reduction in code, but the interesting part is in the git
      diff --stat output:
      
       19 files changed, 20 insertions(+), 1909 deletions(-)
      
      If we ever need some of the things we got from git but weren't using, we just
      have to go to the git repo and get fresh, uptodate source code bits.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a41794cd
    • Stephane Eranian's avatar
      perf stat: add perf stat -B to pretty print large numbers · 5af52b51
      Stephane Eranian authored
      It is hard to read very large numbers so provide an option to perf stat
      to separate thousands using a separator. The patch leverages the locale
      support of stdio. You need to set your LC_NUMERIC appropriately, for
      instance LC_NUMERIC=en_US.UTF8. You need to pass -B to activate this
      feature. This way existing scripts parsing the output do not need to be
      changed. Here is an example.
      
      $ perf stat noploop 2
      noploop for 2 seconds
      
       Performance counter stats for 'noploop 2':
      
              1998.347031  task-clock-msecs         #      0.998 CPUs
                       61  context-switches         #      0.000 M/sec
                        0  CPU-migrations           #      0.000 M/sec
                      118  page-faults              #      0.000 M/sec
            4,138,410,900  cycles                   #   2070.917 M/sec  (scaled from 70.01%)
            2,062,650,268  instructions             #      0.498 IPC    (scaled from 70.01%)
            2,057,653,466  branches                 #   1029.678 M/sec  (scaled from 70.01%)
                   40,267  branch-misses            #      0.002 %      (scaled from 30.04%)
            2,055,961,348  cache-references         #   1028.831 M/sec  (scaled from 30.03%)
                   53,725  cache-misses             #      0.027 M/sec  (scaled from 30.02%)
      
              2.001393933  seconds time elapsed
      
      $ perf stat -B  noploop 2
      noploop for 2 seconds
      
       Performance counter stats for 'noploop 2':
      
              1998.297883  task-clock-msecs         #      0.998 CPUs
                       59  context-switches         #      0.000 M/sec
                        0  CPU-migrations           #      0.000 M/sec
                      119  page-faults              #      0.000 M/sec
            4,131,380,160  cycles                   #   2067.450 M/sec  (scaled from 70.01%)
            2,059,096,507  instructions             #      0.498 IPC    (scaled from 70.01%)
            2,054,681,303  branches                 #   1028.216 M/sec  (scaled from 70.01%)
                   25,650  branch-misses            #      0.001 %      (scaled from 30.05%)
            2,056,283,014  cache-references         #   1029.017 M/sec  (scaled from 30.03%)
                   47,097  cache-misses             #      0.024 M/sec  (scaled from 30.02%)
      
              2.001391016  seconds time elapsed
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <4bf28fe8.914ed80a.01ca.fffff5f5@mx.google.com>
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5af52b51
  3. 18 May, 2010 25 commits