1. 13 Mar, 2010 2 commits
    • Masami Hiramatsu's avatar
      perf probe: Fix need_dwarf flag if lazy matching is used · fc6ceea0
      Masami Hiramatsu authored
      Set need_dwarf if lazy matching pattern is specified, because
      lazy matching requires real source path for which we must use
      debuginfo.
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      LKML-Reference: <20100312232224.2017.54550.stgit@localhost6.localdomain6>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      fc6ceea0
    • Masami Hiramatsu's avatar
      perf probe: Fix probe_point buffer overrun · 594087a0
      Masami Hiramatsu authored
      Fix probe_point array-size overrun problem. In some cases (e.g.
      inline function), one user-specified probe-point can be
      translated to many probe address, and it overruns pre-defined
      array-size. This also removes redundant MAX_PROBES macro
      definition.
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      Cc: systemtap <systemtap@sources.redhat.com>
      Cc: DLE <dle-develop@lists.sourceforge.net>
      Cc: <stable@kernel.org>
      LKML-Reference: <20100312232217.2017.45017.stgit@localhost6.localdomain6>
      [ Note that only root can create new probes. Eventually we should remove
        the MAX_PROBES limit, but that is a larger patch not eligible to
        perf/urgent treatment. ]
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      594087a0
  2. 11 Mar, 2010 8 commits
    • Arnaldo Carvalho de Melo's avatar
      perf record: Don't try to find buildids in a zero sized file · 9f591fd7
      Arnaldo Carvalho de Melo authored
      Fixing this symptom:
      
       [acme@mica linux-2.6-tip]$ perf record -a -f
         Fatal: Permission error - are you root?
      
       Bus error
       [acme@mica linux-2.6-tip]$
      
      I.e. if for some reason no data is collected, in this case a non
      root user trying to do systemwide profiling, no data will be
      collected, and then we end up trying to mmap a zero sized file
      and access the file header, b00m.
      Reported-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: <stable@kernel.org>
      LKML-Reference: <1268333592-30872-1-git-send-email-acme@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      9f591fd7
    • Xiao Guangrong's avatar
      perf: export perf_trace_regs and perf_arch_fetch_caller_regs · 639fe4b1
      Xiao Guangrong authored
      Export perf_trace_regs and perf_arch_fetch_caller_regs since module will
      use these.
      Signed-off-by: default avatarXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      [ use EXPORT_PER_CPU_SYMBOL_GPL() ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4B989C1B.2090407@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      639fe4b1
    • Peter Zijlstra's avatar
      perf, x86: Fix hw_perf_enable() event assignment · 45e16a68
      Peter Zijlstra authored
      What happens is that we schedule badly like:
      
      <...>-1987  [019]   280.252808: x86_pmu_start: event-46/1300c0: idx: 0
      <...>-1987  [019]   280.252811: x86_pmu_start: event-47/1300c0: idx: 1
      <...>-1987  [019]   280.252812: x86_pmu_start: event-48/1300c0: idx: 2
      <...>-1987  [019]   280.252813: x86_pmu_start: event-49/1300c0: idx: 3
      <...>-1987  [019]   280.252814: x86_pmu_start: event-50/1300c0: idx: 32
      <...>-1987  [019]   280.252825: x86_pmu_stop: event-46/1300c0: idx: 0
      <...>-1987  [019]   280.252826: x86_pmu_stop: event-47/1300c0: idx: 1
      <...>-1987  [019]   280.252827: x86_pmu_stop: event-48/1300c0: idx: 2
      <...>-1987  [019]   280.252828: x86_pmu_stop: event-49/1300c0: idx: 3
      <...>-1987  [019]   280.252829: x86_pmu_stop: event-50/1300c0: idx: 32
      <...>-1987  [019]   280.252834: x86_pmu_start: event-47/1300c0: idx: 1
      <...>-1987  [019]   280.252834: x86_pmu_start: event-48/1300c0: idx: 2
      <...>-1987  [019]   280.252835: x86_pmu_start: event-49/1300c0: idx: 3
      <...>-1987  [019]   280.252836: x86_pmu_start: event-50/1300c0: idx: 32
      <...>-1987  [019]   280.252837: x86_pmu_start: event-51/1300c0: idx: 32 *FAIL*
      
      This happens because we only iterate the n_running events in the first
      pass, and reset their index to -1 if they don't match to force a
      re-assignment.
      
      Now, in our RR example, n_running == 0 because we fully unscheduled, so
      event-50 will retain its idx==32, even though in scheduling it will have
      gotten idx=0, and we don't trigger the re-assign path.
      
      The easiest way to fix this is the below patch, which simply validates
      the full assignment in the second pass.
      Reported-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1268311069.5037.31.camel@laptop>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      45e16a68
    • Peter Zijlstra's avatar
      perf, ppc: Fix compile error due to new cpu notifiers · 85cfabbc
      Peter Zijlstra authored
      Fix:
      
        arch/powerpc/kernel/perf_event.c:1334: error: 'power_pmu_notifier' undeclared (first use in this function)
        arch/powerpc/kernel/perf_event.c:1334: error: (Each undeclared identifier is reported only once
        arch/powerpc/kernel/perf_event.c:1334: error: for each function it appears in.)
        arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'power_pmu_notifier'
        arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'register_cpu_notifier'
      
      Due to commit 3f6da390 (perf: Rework and fix the arch CPU-hotplug hooks).
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      85cfabbc
    • John Kacur's avatar
      perf: Make the install relative to DESTDIR if specified · 7ae5f213
      John Kacur authored
      Without this change, the install path is relative to
      prefix/DESTDIR where prefix is automatically set to $HOME.
      
      This can produce unexpected results. For example:
      
        make -C tools/perf DESTDIR=/home/jkacur/tmp install-man
      
      creates the directory:		/home/jkacur/home/jkacur/tmp/share/...
      instead of the expected:	/home/jkacur/tmp/share/...
      Signed-off-by: default avatarJohn Kacur <jkacur@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Kyle McMartin <kyle@redhat.com>
      Cc: <stable@kernel.org>
      LKML-Reference: <1268312220-12880-1-git-send-email-jkacur@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      7ae5f213
    • Masami Hiramatsu's avatar
      kprobes: Calculate the index correctly when freeing the out-of-line execution slot · 83ff56f4
      Masami Hiramatsu authored
      From : Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      
      When freeing the instruction slot, the arithmetic to calculate
      the index of the slot in the page needs to account for the total
      size of the instruction on the various architectures.
      
      Calculate the index correctly when freeing the out-of-line
      execution slot.
      Reported-by: default avatarSachin Sant <sachinp@in.ibm.com>
      Reported-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarAnanth N Mavinakayanahalli <ananth@in.ibm.com>
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <4B9667AB.9050507@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      83ff56f4
    • Paul Mackerras's avatar
      perf tools: Fix sparse CPU numbering related bugs · a12b51c4
      Paul Mackerras authored
      At present, the perf subcommands that do system-wide monitoring
      (perf stat, perf record and perf top) don't work properly unless
      the online cpus are numbered 0, 1, ..., N-1.  These tools ask
      for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN)
      and then try to create events for cpus 0, 1, ..., N-1.
      
      This creates problems for systems where the online cpus are
      numbered sparsely.  For example, a POWER6 system in
      single-threaded mode (i.e. only running 1 hardware thread per
      core) will have only even-numbered cpus online.
      
      This fixes the problem by reading the /sys/devices/system/cpu/online
      file to find out which cpus are online.  The code that does that is in
      tools/perf/util/cpumap.[ch], and consists of a read_cpu_map()
      function that sets up a cpumap[] array and returns the number of
      online cpus.  If /sys/devices/system/cpu/online can't be read or
      can't be parsed successfully, it falls back to using sysconf to
      ask how many cpus are online and sets up an identity map in cpumap[].
      
      The perf record, perf stat and perf top code then calls
      read_cpu_map() in the system-wide monitoring case (instead of
      sysconf) and uses cpumap[] to get the cpu numbers to pass to
      perf_event_open.
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <20100310093609.GA3959@brick.ozlabs.ibm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a12b51c4
    • Paul Mackerras's avatar
      perf_event: Fix oops triggered by cpu offline/online · 220b140b
      Paul Mackerras authored
      Anton Blanchard found that he could reliably make the kernel hit a
      BUG_ON in the slab allocator by taking a cpu offline and then online
      while a system-wide perf record session was running.
      
      The reason is that when the cpu comes up, we completely reinitialize
      the ctx field of the struct perf_cpu_context for the cpu.  If there is
      a system-wide perf record session running, then there will be a struct
      perf_event that has a reference to the context, so its refcount will
      be 2.  (The perf_event has been removed from the context's group_entry
      and event_entry lists by perf_event_exit_cpu(), but that doesn't
      remove the perf_event's reference to the context and doesn't decrement
      the context's refcount.)
      
      When the cpu comes up, perf_event_init_cpu() gets called, and it calls
      __perf_event_init_context() on the cpu's context.  That resets the
      refcount to 1.  Then when the perf record session finishes and the
      perf_event is closed, the refcount gets decremented to 0 and the
      context gets kfreed after an RCU grace period.  Since the context
      wasn't kmalloced -- it's part of a per-cpu variable -- bad things
      happen.
      
      In fact we don't need to completely reinitialize the context when the
      cpu comes up.  It's sufficient to initialize the context once at boot,
      but we need to do it for all possible cpus.
      
      This moves the context initialization to happen at boot time.  With
      this, we don't trash the refcount and the context never gets kfreed,
      and we don't hit the BUG_ON.
      Reported-by: default avatarAnton Blanchard <anton@samba.org>
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Tested-by: default avatarAnton Blanchard <anton@samba.org>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      220b140b
  3. 10 Mar, 2010 27 commits
    • Frederic Weisbecker's avatar
      perf: Drop the obsolete profile naming for trace events · 97d5a220
      Frederic Weisbecker authored
      Drop the obsolete "profile" naming used by perf for trace events.
      Perf can now do more than simple events counting, so generalize
      the API naming.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      97d5a220
    • Frederic Weisbecker's avatar
      perf: Take a hot regs snapshot for trace events · c530665c
      Frederic Weisbecker authored
      We are taking a wrong regs snapshot when a trace event triggers.
      Either we use get_irq_regs(), which gives us the interrupted
      registers if we are in an interrupt, or we use task_pt_regs()
      which gives us the state before we entered the kernel, assuming
      we are lucky enough to be no kernel thread, in which case
      task_pt_regs() returns the initial set of regs when the kernel
      thread was started.
      
      What we want is different. We need a hot snapshot of the regs,
      so that we can get the instruction pointer to record in the
      sample, the frame pointer for the callchain, and some other
      things.
      
      Let's use the new perf_fetch_caller_regs() for that.
      
      Comparison with perf record -e lock: -R -a -f -g
      Before:
      
              perf  [kernel]                   [k] __do_softirq
                     |
                     --- __do_softirq
                        |
                        |--55.16%-- __open
                        |
                         --44.84%-- __write_nocancel
      
      After:
      
                  perf  [kernel]           [k] perf_tp_event
                     |
                     --- perf_tp_event
                        |
                        |--41.07%-- lock_acquire
                        |          |
                        |          |--39.36%-- _raw_spin_lock
                        |          |          |
                        |          |          |--7.81%-- hrtimer_interrupt
                        |          |          |          smp_apic_timer_interrupt
                        |          |          |          apic_timer_interrupt
      
      The old case was producing unreliable callchains. Now having
      right frame and instruction pointers, we have the trace we
      want.
      
      Also syscalls and kprobe events already have the right regs,
      let's use them instead of wasting a retrieval.
      
      v2: Follow the rename perf_save_regs() -> perf_fetch_caller_regs()
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Archs <linux-arch@vger.kernel.org>
      c530665c
    • Frederic Weisbecker's avatar
      perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot · 5331d7b8
      Frederic Weisbecker authored
      Events that trigger overflows by interrupting a context can
      use get_irq_regs() or task_pt_regs() to retrieve the state
      when the event triggered. But this is not the case for some
      other class of events like trace events as tracepoints are
      executed in the same context than the code that triggered
      the event.
      
      It means we need a different api to capture the regs there,
      namely we need a hot snapshot to get the most important
      informations for perf: the instruction pointer to get the
      event origin, the frame pointer for the callchain, the code
      segment for user_mode() tests (we always use __KERNEL_CS as
      trace events always occur from the kernel) and the eflags
      for further purposes.
      
      v2: rename perf_save_regs to perf_fetch_caller_regs as per
      Masami's suggestion.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Archs <linux-arch@vger.kernel.org>
      5331d7b8
    • Frederic Weisbecker's avatar
      perf/x86-64: Use frame pointer to walk on irq and process stacks · 61e67fb9
      Frederic Weisbecker authored
      We were using the frame pointer based stack walker on every
      contexts in x86-32, but not in x86-64 where we only use the
      seven-league boots on the exception stacks.
      
      Use it also on irq and process stacks. This utterly accelerate
      the captures.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      61e67fb9
    • Frederic Weisbecker's avatar
      lockdep: Move lock events under lockdep recursion protection · db2c4c77
      Frederic Weisbecker authored
      There are rcu locked read side areas in the path where we submit
      a trace event. And these rcu_read_(un)lock() trigger lock events,
      which create recursive events.
      
      One pair in do_perf_sw_event:
      
      __lock_acquire
            |
            |--96.11%-- lock_acquire
            |          |
            |          |--27.21%-- do_perf_sw_event
            |          |          perf_tp_event
            |          |          |
            |          |          |--49.62%-- ftrace_profile_lock_release
            |          |          |          lock_release
            |          |          |          |
            |          |          |          |--33.85%-- _raw_spin_unlock
      
      Another pair in perf_output_begin/end:
      
      __lock_acquire
            |--23.40%-- perf_output_begin
            |          |          __perf_event_overflow
            |          |          perf_swevent_overflow
            |          |          perf_swevent_add
            |          |          perf_swevent_ctx_event
            |          |          do_perf_sw_event
            |          |          perf_tp_event
            |          |          |
            |          |          |--55.37%-- ftrace_profile_lock_acquire
            |          |          |          lock_acquire
            |          |          |          |
            |          |          |          |--37.31%-- _raw_spin_lock
      
      The problem is not that much the trace recursion itself, as we have a
      recursion protection already (though it's always wasteful to recurse).
      But the trace events are outside the lockdep recursion protection, then
      each lockdep event triggers a lock trace, which will trigger two
      other lockdep events. Here the recursive lock trace event won't
      be taken because of the trace recursion, so the recursion stops there
      but lockdep will still analyse these new events:
      
      To sum up, for each lockdep events we have:
      
      	lock_*()
      	     |
                   trace lock_acquire
                        |
                        ----- rcu_read_lock()
                        |          |
                        |          lock_acquire()
                        |          |
                        |          trace_lock_acquire() (stopped)
                        |          |
      		  |          lockdep analyze
                        |
                        ----- rcu_read_unlock()
                                   |
                                   lock_release
                                   |
                                   trace_lock_release() (stopped)
                                   |
                                   lockdep analyze
      
      And you can repeat the above two times as we have two rcu read side
      sections when we submit an event.
      
      This is fixed in this patch by moving the lock trace event under
      the lockdep recursion protection.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      db2c4c77
    • Arnaldo Carvalho de Melo's avatar
      perf report: Print the map table just after samples for which no map was found · 65f2ed2b
      Arnaldo Carvalho de Melo authored
      If -vv is used just the map table will be printed, -vvv will
      print the symbol table too, with it we can see that we have a
      bug where some samples are not being resolved to a map when we
      get them in the perf.data stream, but after we have it all
      processed, we can find the right map, some reordering probably
      is happening.
      
      Upcoming patches will provide ways to ask for most PERF_SAMPLE_
      conditional samples to be taken for !PERF_RECORD_SAMPLE events
      too, then we'll be able to ask for PERF_SAMPLE_TIME and
      PERF_SAMPLE_CPU to help diagnose this.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268161097-17761-1-git-send-email-acme@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      65f2ed2b
    • Eric B Munson's avatar
      perf report: Add multiple event support · cbbc79a5
      Eric B Munson authored
      Perf report does not handle multiple events being reported, even
      though perf record stores them properly on disk.  This patch
      addresses that issue by adding the logic to perf report to use
      the event stream id that is saved by record and the new data
      structures to seperate the event streams and report them
      individually.
      Signed-off-by: default avatarEric B Munson <ebmunson@us.ibm.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1267804269-22660-6-git-send-email-acme@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      cbbc79a5
    • Eric B Munson's avatar
      perf session: Change perf_session post processing functions to take histogram tree · eefc465c
      Eric B Munson authored
      Now that report can store historgrams for multiple events we
      need to be able to do the post processing work for each
      histogram. This patch changes the post processing functions so
      that they can be called individually for each event's histogram.
      Signed-off-by: default avatarEric B Munson <ebmunson@us.ibm.com>
      [ Guarantee bisectabilty by fixing up builtin-report.c ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1267804269-22660-5-git-send-email-acme@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      eefc465c
    • Eric B Munson's avatar
      perf session: Add storage for seperating event types in report · cb8f0939
      Eric B Munson authored
      This patch adds the structures necessary to count each event
      type independently in perf report.
      Signed-off-by: default avatarEric B Munson <ebmunson@us.ibm.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1267804269-22660-4-git-send-email-acme@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      cb8f0939
    • Eric B Munson's avatar
      perf session: Change add_hist_entry to take the tree root instead of session · d403d0ac
      Eric B Munson authored
      In order to minimize the impact of storing multiple events in a
      report this function will now take the root of the histogram
      tree so that the logic for selecting the proper tree can be
      inserted before the call.
      Signed-off-by: default avatarEric B Munson <ebmunson@us.ibm.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1267804269-22660-3-git-send-email-acme@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      d403d0ac
    • Eric B Munson's avatar
      perf record: Add ID and to recorded event data when recording multiple events · 8907fd60
      Eric B Munson authored
      Currently perf record does not write the ID or the to disk for
      events. This doesn't allow report to tell if an event stream
      contains one or more types of events.  This patch adds this
      entry to the list of data that record will write to disk if more
      than one event was requested.
      Signed-off-by: default avatarEric B Munson <ebmunson@us.ibm.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1267804269-22660-2-git-send-email-acme@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      8907fd60
    • Arnaldo Carvalho de Melo's avatar
      perf probe: Add missing variable initialization · accd3cc4
      Arnaldo Carvalho de Melo authored
      cc1: warnings being treated as errors
       util/probe-finder.c: In function 'find_line_range':
       util/probe-finder.c:172: warning: 'src' may be used
       uninitialized in this function make: *** [util/probe-finder.o]
       Error 1
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1267804269-22660-1-git-send-email-acme@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      accd3cc4
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Don't trow away old map slices not overlapped by new maps · 12245509
      Arnaldo Carvalho de Melo authored
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1267800842-22324-1-git-send-email-acme@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      12245509
    • Peter Zijlstra's avatar
      perf: Provide better condition for event rotation · d4944a06
      Peter Zijlstra authored
      Try to avoid useless rotation and PMU disables.
      
      [ Could be improved by keeping a nr_runnable count to better account
        for the < PERF_STAT_INACTIVE counters ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      d4944a06
    • Peter Zijlstra's avatar
      perf, x86: Fix double enable calls · f3d46b2e
      Peter Zijlstra authored
      hw_perf_enable() would enable already enabled events.
      
      This causes problems with code that assumes that ->enable/->disable calls
      are balanced (like the LBR code does).
      
      What happens is that events that were already running and left in place
      would get enabled again.
      
      Avoid this by only enabling new events that match their previous
      assignment.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      f3d46b2e
    • Peter Zijlstra's avatar
      perf, x86: Fix double disable calls · 19925ce7
      Peter Zijlstra authored
      hw_perf_enable() would disable events that were not yet enabled.
      
      This causes problems with code that assumes that ->enable/->disable calls
      are balanced (like the LBR code does).
      
      What happens is that we disable newly added counters that match their
      previous assignment, even though they are not yet programmed on the
      hardware.
      
      Avoid this by only doing the first pass over the existing events.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      19925ce7
    • Peter Zijlstra's avatar
      perf, x86: Properly account n_added · 356e1f2e
      Peter Zijlstra authored
      Make sure n_added is properly accounted so that we can rely on the value
      to reflect the number of added counters. This is needed if its going to
      be used for more than a boolean check.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      356e1f2e
    • Peter Zijlstra's avatar
      perf, x86: Avoid double disable on throttle vs ioctl(PERF_IOC_DISABLE) · 71e2d282
      Peter Zijlstra authored
      Calling ioctl(PERF_EVENT_IOC_DISABLE) on a thottled counter would result
      in a double disable, cure this by using x86_pmu_{start,stop} for
      throttle/unthrottle and teach x86_pmu_stop() to check ->active_mask.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      71e2d282
    • Peter Zijlstra's avatar
      perf, x86: Fix x86_pmu_start · c08053e6
      Peter Zijlstra authored
      pmu::start should undo pmu::stop, make it so.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      c08053e6
    • Peter Zijlstra's avatar
      perf, x86: Use unlocked bitops · 34538ee7
      Peter Zijlstra authored
      There is no concurrency on these variables, so don't use LOCK'ed ops.
      
      As to the intel_pmu_handle_irq() status bit clean, nobody uses that so
      remove it all together.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <20100304140100.240023029@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      34538ee7
    • Peter Zijlstra's avatar
      perf, x86: Change x86_pmu.{enable,disable} calling convention · aff3d91a
      Peter Zijlstra authored
      Pass the full perf_event into the x86_pmu functions so that those may
      make use of more than the hw_perf_event, and while doing this, remove the
      superfluous second argument.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <20100304140100.165166129@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      aff3d91a
    • Peter Zijlstra's avatar
      perf, x86: Remove superfluous arguments to x86_perf_event_update() · cc2ad4ba
      Peter Zijlstra authored
      The second and third argument to x86_perf_event_update() are superfluous
      since they are simple expressions of the first argument. Hence remove
      them.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <20100304140100.089468871@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      cc2ad4ba
    • Peter Zijlstra's avatar
      perf, x86: Remove superfluous arguments to x86_perf_event_set_period() · 07088edb
      Peter Zijlstra authored
      The second and third argument to x86_perf_event_set_period() are
      superfluous since they are simple expressions of the first argument.
      Hence remove them.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <20100304140100.006500906@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      07088edb
    • Peter Zijlstra's avatar
      perf, x86, Do not user perf_disable from NMI context · 3fb2b8dd
      Peter Zijlstra authored
      Explicitly use intel_pmu_{disable,enable}_all() in intel_pmu_handle_irq()
      to avoid the NMI race conditions in perf_{disable,enable}
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      3fb2b8dd
    • Peter Zijlstra's avatar
      perf: Optimize perf_disable · 32975a4f
      Peter Zijlstra authored
      Currently we always call hw_perf_disable(), even if its already disabled,
      this seems superflous, esp. since it cannot be made NMI safe (see further
      patches).
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      32975a4f
    • Peter Zijlstra's avatar
      perf: Rework and fix the arch CPU-hotplug hooks · 3f6da390
      Peter Zijlstra authored
      Remove the hw_perf_event_*() hotplug hooks in favour of per PMU hotplug
      notifiers. This has the advantage of reducing the static weak interface
      as well as exposing all hotplug actions to the PMU.
      
      Use this to fix x86 hotplug usage where we did things in ONLINE which
      should have been done in UP_PREPARE or STARTING.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <20100305154128.736225361@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      3f6da390
    • Peter Zijlstra's avatar
      perf: Provide generic perf_sample_data initialization · dc1d628a
      Peter Zijlstra authored
      This makes it easier to extend perf_sample_data and fixes a bug on arm
      and sparc, which failed to set ->raw to NULL, which can cause crashes
      when combined with PERF_SAMPLE_RAW.
      
      It also optimizes PowerPC and tracepoint, because the struct
      initialization is forced to zero out the whole structure.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarJean Pihet <jpihet@mvista.com>
      Reviewed-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Jamie Iles <jamie.iles@picochip.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: stable@kernel.org
      LKML-Reference: <20100304140100.315416040@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      dc1d628a
  4. 09 Mar, 2010 2 commits
  5. 08 Mar, 2010 1 commit