1. 17 May, 2009 3 commits
    • Ingo Molnar's avatar
      perf_counter: fix threaded task exit · 0203026b
      Ingo Molnar authored
      Flushing counters in __exit_signal() with irqs disabled is not
      a good idea as perf_counter_exit_task() acquires mutexes. So
      flush it before acquiring the tasklist lock.
      
      (Note, we still need a fix for when the PID has been unhashed.)
      
      [ Impact: fix crash with inherited counters ]
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      0203026b
    • Peter Zijlstra's avatar
      perf_counter: Fix counter inheritance · 856d56b9
      Peter Zijlstra authored
      Srivatsa Vaddagiri reported that a Java workload triggers this
      warning in kernel/exit.c:
      
         WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
      
      Add the inherited counter propagation on self-detach, this could
      cause counter leaks and incomplete stats in threaded code like
      the below:
      
        #include <pthread.h>
        #include <unistd.h>
      
        void *thread(void *arg)
        {
                sleep(5);
                return NULL;
        }
      
        void main(void)
        {
                pthread_t thr;
                pthread_create(&thr, NULL, thread, NULL);
        }
      Reported-by: default avatarSrivatsa Vaddagiri <vatsa@in.ibm.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      856d56b9
    • Peter Zijlstra's avatar
      perf_counter: Fix inheritance cleanup code · 8bc20959
      Peter Zijlstra authored
      Clean up code that open-coded the list_{add,del}_counter() code in
      __perf_counter_exit_task() which consequently diverged. This could
      lead to software counter crashes.
      
      Also, fold the ctx->nr_counter inc/dec into those functions and clean
      up some of the related code.
      
      [ Impact: fix potential sw counter crash, cleanup ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      8bc20959
  2. 15 May, 2009 20 commits
    • Paul Mackerras's avatar
      perf_counter: powerpc: supply more precise information on counter overflow events · 0bbd0d4b
      Paul Mackerras authored
      This uses values from the MMCRA, SIAR and SDAR registers on
      powerpc to supply more precise information for overflow events,
      including a data address when PERF_RECORD_ADDR is specified.
      
      Since POWER6 uses different bit positions in MMCRA from earlier
      processors, this converts the struct power_pmu limited_pmc5_6
      field, which only had 0/1 values, into a flags field and
      defines bit values for its previous use (PPMU_LIMITED_PMC5_6)
      and a new flag (PPMU_ALT_SIPR) to indicate that the processor
      uses the POWER6 bit positions rather than the earlier
      positions.  It also adds definitions in reg.h for the new and
      old positions of the bit that indicates that the SIAR and SDAR
      values come from the same instruction.
      
      For the data address, the SDAR value is supplied if we are not
      doing instruction sampling.  In that case there is no guarantee
      that the address given in the PERF_RECORD_ADDR subrecord will
      correspond to the instruction whose address is given in the
      PERF_RECORD_IP subrecord.
      
      If instruction sampling is enabled (e.g. because this counter
      is counting a marked instruction event), then we only supply
      the SDAR value for the PERF_RECORD_ADDR subrecord if it
      corresponds to the instruction whose address is in the
      PERF_RECORD_IP subrecord.  Otherwise we supply 0.
      
      [ Impact: support more PMU hardware features on PowerPC ]
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <18955.37028.48861.555309@drongo.ozlabs.ibm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      0bbd0d4b
    • Paul Mackerras's avatar
      perf_counter: allow arch to supply event misc flags and instruction pointer · 9d23a90a
      Paul Mackerras authored
      At present the values we put in overflow events for the misc
      flags indicating processor mode and the instruction pointer are
      obtained using the standard user_mode() and
      instruction_pointer() functions. Those functions tell you where
      the performance monitor interrupt was taken, which might not be
      exactly where the counter overflow occurred, for example
      because interrupts were disabled at the point where the
      overflow occurred, or because the processor had many
      instructions in flight and chose to complete some more
      instructions beyond the one that caused the counter overflow.
      
      Some architectures (e.g. powerpc) can supply more precise
      information about where the counter overflow occurred and the
      processor mode at that point.  This introduces new functions,
      perf_misc_flags() and perf_instruction_pointer(), which arch
      code can override to provide more precise information if
      available.  They have default implementations which are
      identical to the existing code.
      
      This also adds a new misc flag value,
      PERF_EVENT_MISC_HYPERVISOR, for the case where a counter
      overflow occurred in the hypervisor.  We encode the processor
      mode in the 2 bits previously used to indicate user or kernel
      mode; the values for user and kernel mode are unchanged and
      hypervisor mode is indicated by both bits being set.
      
      [ Impact: generalize perfcounter core facilities ]
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <18956.1272.818511.561835@cargo.ozlabs.ibm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      9d23a90a
    • Paul Mackerras's avatar
      perf_counter: powerpc: use u64 for event codes internally · ef923214
      Paul Mackerras authored
      Although the perf_counter API allows 63-bit raw event codes,
      internally in the powerpc back-end we had been using 32-bit
      event codes.  This expands them to 64 bits so that we can add
      bits for specifying threshold start/stop events and instruction
      sampling modes later.
      
      This also corrects the return value of can_go_on_limited_pmc;
      we were returning an event code rather than just a 0/1 value in
      some circumstances. That didn't particularly matter while event
      codes were 32-bit, but now that event codes are 64-bit it
      might, so this fixes it.
      
      [ Impact: extend PowerPC perfcounter interfaces from u32 to u64 ]
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <18955.36874.472452.353104@drongo.ozlabs.ibm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ef923214
    • Peter Zijlstra's avatar
      perf_counter: frequency based adaptive irq_period, 32-bit fix · 2e569d36
      Peter Zijlstra authored
      fix:
      
        kernel/built-in.o: In function `perf_counter_alloc':
        perf_counter.c:(.text+0x7ddc7): undefined reference to `__udivdi3'
      
      [ Impact: build fix on 32-bit systems ]
      Reported-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <1242394667.6642.1887.camel@laptop>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      2e569d36
    • Peter Zijlstra's avatar
      perf top: update to use the new freq interface · f5456a6b
      Peter Zijlstra authored
      Provide perf top -F as alternative to -c.
      
      [ Impact: new 'perf top' feature ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <20090515132018.707922166@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      f5456a6b
    • Peter Zijlstra's avatar
      perf_counter: frequency based adaptive irq_period · 60db5e09
      Peter Zijlstra authored
      Instead of specifying the irq_period for a counter, provide a target interrupt
      frequency and dynamically adapt the irq_period to match this frequency.
      
      [ Impact: new perf-counter attribute/feature ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <20090515132018.646195868@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      60db5e09
    • Peter Zijlstra's avatar
      perf_counter: per user mlock gift · 789f90fc
      Peter Zijlstra authored
      Instead of a per-process mlock gift for perf-counters, use a
      per-user gift so that there is less of a DoS potential.
      
      [ Impact: allow less worst-case unprivileged memory consumption ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <20090515132018.496182835@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      789f90fc
    • Peter Zijlstra's avatar
      perf_counter: remove perf_disable/enable exports · 548e1ddf
      Peter Zijlstra authored
      Now that ACPI idle doesn't use it anymore, remove the exports.
      
      [ Impact: remove dead code/data ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <20090515132018.429826617@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      548e1ddf
    • Ingo Molnar's avatar
      perf stat: handle Ctrl-C · 58d7e993
      Ingo Molnar authored
      Before this change, if a long-running perf stat workload was Ctrl-C-ed,
      the utility exited without displaying statistics.
      
      After the change, the Ctrl-C gets propagated into the workload (and
      causes its early exit there), but perf stat itself will still continue
      to run and will display counter results.
      
      This is useful to run open-ended workloads, let them run for
      a while, then Ctrl-C them to get the stats.
      
      [ Impact: extend perf stat with new functionality ]
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      58d7e993
    • Ingo Molnar's avatar
      perf_counter: Remove ACPI quirk · 251e8e3c
      Ingo Molnar authored
      We had a disable/enable around acpi_idle_do_entry() due to an erratum
      in an early prototype CPU i had access to. That erratum has been fixed
      in the BIOS so remove the quirk.
      
      The quirk also kept us from profiling interrupts that hit the ACPI idle
      instruction - so this is an improvement as well, beyond a cleanup and
      a micro-optimization.
      
      [ Impact: improve profiling scope, cleanup, micro-optimization ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      251e8e3c
    • Ingo Molnar's avatar
      perf_counter: x86: Protect against infinite loops in intel_pmu_handle_irq() · 9029a5e3
      Ingo Molnar authored
      intel_pmu_handle_irq() can lock up in an infinite loop if the hardware
      does not allow the acking of irqs. Alas, this happened in testing so
      make this robust and emit a warning if it happens in the future.
      
      Also, clean up the IRQ handlers a bit.
      
      [ Impact: improve perfcounter irq/nmi handling robustness ]
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      9029a5e3
    • Ingo Molnar's avatar
      perf_counter: x86: Disallow interval of 1 · 1c80f4b5
      Ingo Molnar authored
      On certain CPUs i have observed a stuck PMU if interval was set to
      1 and NMIs were used. The PMU had PMC0 set in MSR_CORE_PERF_GLOBAL_STATUS,
      but it was not possible to ack it via MSR_CORE_PERF_GLOBAL_OVF_CTRL,
      and the NMI loop got stuck infinitely.
      
      [ Impact: fix rare hangs during high perfcounter load ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      1c80f4b5
    • Peter Zijlstra's avatar
      perf_counter: x86: Robustify interrupt handling · a4016a79
      Peter Zijlstra authored
      Two consecutive NMIs could daze and confuse the machine when the
      first would handle the overflow of both counters.
      
      [ Impact: fix false-positive syslog messages under multi-session profiling ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a4016a79
    • Peter Zijlstra's avatar
      perf_counter: Rework the perf counter disable/enable · 9e35ad38
      Peter Zijlstra authored
      The current disable/enable mechanism is:
      
      	token = hw_perf_save_disable();
      	...
      	/* do bits */
      	...
      	hw_perf_restore(token);
      
      This works well, provided that the use nests properly. Except we don't.
      
      x86 NMI/INT throttling has non-nested use of this, breaking things. Therefore
      provide a reference counter disable/enable interface, where the first disable
      disables the hardware, and the last enable enables the hardware again.
      
      [ Impact: refactor, simplify the PMU disable/enable logic ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      9e35ad38
    • Peter Zijlstra's avatar
      perf_counter: x86: Fix up the amd NMI/INT throttle · 962bf7a6
      Peter Zijlstra authored
      perf_counter_unthrottle() restores throttle_ctrl, buts its never set.
      Also, we fail to disable all counters when throttling.
      
      [ Impact: fix rare stuck perf-counters when they are throttled ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      962bf7a6
    • Peter Zijlstra's avatar
      perf_counter: Fix perf_output_copy() WARN to account for overflow · 53020fe8
      Peter Zijlstra authored
      The simple reservation test in perf_output_copy() failed to take
      unsigned int overflow into account, fix this.
      
      [ Impact: fix false positive warning with more than 4GB of profiling data ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      53020fe8
    • Peter Zijlstra's avatar
      perf_counter: x86: Allow unpriviliged use of NMIs · a026dfec
      Peter Zijlstra authored
      Apply sysctl_perf_counter_priv to NMIs. Also, fail the counter
      creation instead of silently down-grading to regular interrupts.
      
      [ Impact: allow wider perf-counter usage ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a026dfec
    • Ingo Molnar's avatar
      perf_counter: x86: Fix throttling · f5a5a2f6
      Ingo Molnar authored
      If counters are disabled globally when a perfcounter IRQ/NMI hits,
      and if we throttle in that case, we'll promote the '0' value to
      the next lapic IRQ and disable all perfcounters at that point,
      permanently ...
      
      Fix it.
      
      [ Impact: fix hung perfcounters under load ]
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      f5a5a2f6
    • Peter Zijlstra's avatar
      perf_counter: x86: More accurate counter update · ec3232bd
      Peter Zijlstra authored
      Take the counter width into account instead of assuming 32 bits.
      
      In particular Nehalem has 44 bit wide counters, and all
      arithmetics should happen on a 44-bit signed integer basis.
      
      [ Impact: fix rare event imprecision, warning message on Nehalem ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ec3232bd
    • Arnaldo Carvalho de Melo's avatar
      perf record: Allow specifying a pid to record · 1a853e36
      Arnaldo Carvalho de Melo authored
      Allow specifying a pid instead of always fork+exec'ing a command.
      
      Because the PERF_EVENT_COMM and PERF_EVENT_MMAP events happened before
      we connected, we must synthesize them so that 'perf report' can get what
      it needs.
      
      [ Impact: add new command line option ]
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <20090515015046.GA13664@ghostprotocols.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      1a853e36
  3. 13 May, 2009 1 commit
  4. 12 May, 2009 1 commit
    • Paul Mackerras's avatar
      perf_counter: call hw_perf_save_disable/restore around group_sched_in · e758a33d
      Paul Mackerras authored
      I noticed that when enabling a group via the PERF_COUNTER_IOC_ENABLE
      ioctl on the group leader, the counters weren't enabled and counting
      immediately on return from the ioctl, but did start counting a little
      while later (presumably after a context switch).
      
      The reason was that __perf_counter_enable calls group_sched_in which
      calls hw_perf_group_sched_in, which on powerpc assumes that the caller
      has called hw_perf_save_disable already.  Until commit 46d686c6
      ("perf_counter: put whole group on when enabling group leader") it was
      true that all callers of group_sched_in had called
      hw_perf_save_disable first, and the powerpc hw_perf_group_sched_in
      relies on that (there isn't an x86 version).
      
      This fixes the problem by putting calls to hw_perf_save_disable /
      hw_perf_restore around the calls to group_sched_in and
      counter_sched_in in __perf_counter_enable.  Having the calls to
      hw_perf_save_disable/restore around the counter_sched_in call is
      harmless and makes this call consistent with the other call sites
      of counter_sched_in, which have all called hw_perf_save_disable first.
      
      [ Impact: more precise counter group disable/enable functionality ]
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <18953.25733.53359.147452@cargo.ozlabs.ibm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      e758a33d
  5. 11 May, 2009 4 commits
    • Paul Mackerras's avatar
      perf_counter: call atomic64_set for counter->count · 615a3f1e
      Paul Mackerras authored
      A compile warning triggered because we are calling
      atomic_set(&counter->count). But since counter->count
      is an atomic64_t, we have to use atomic64_set.
      
      So the count can be set short, resulting in the reset ioctl
      only resetting the low word.
      
      [ Impact: clear counter properly during the reset ioctl ]
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <18951.48285.270311.981806@drongo.ozlabs.ibm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      615a3f1e
    • Paul Mackerras's avatar
      perf_counter: don't count scheduler ticks as context switches · a08b159f
      Paul Mackerras authored
      The context-switch software counter gives inflated values at present
      because each scheduler tick and each process-wide counter
      enable/disable prctl gets counted as a context switch.
      
      This happens because perf_counter_task_tick, perf_counter_task_disable
      and perf_counter_task_enable all call perf_counter_task_sched_out,
      which calls perf_swcounter_event to record a context switch event.
      
      This fixes it by introducing a variant of perf_counter_task_sched_out
      with two underscores in front for internal use within the perf_counter
      code, and makes perf_counter_task_{tick,disable,enable} call it.  This
      variant doesn't record a context switch event, and takes a struct
      perf_counter_context *.  This adds the new variant rather than
      changing the behaviour or interface of perf_counter_task_sched_out
      because that is called from other code.
      
      [ Impact: fix inflated context-switch event counts ]
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <18951.48034.485580.498953@drongo.ozlabs.ibm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a08b159f
    • Paul Mackerras's avatar
      perf_counter: Put whole group on when enabling group leader · 6751b71e
      Paul Mackerras authored
      Currently, if you have a group where the leader is disabled and there
      are siblings that are enabled, and then you enable the leader, we only
      put the leader on the PMU, and not its enabled siblings.  This is
      incorrect, since the enabled group members should be all on or all off
      at any given point.
      
      This fixes it by adding a call to group_sched_in in
      __perf_counter_enable in the case where we're enabling a group leader.
      
      To avoid the need for a forward declaration this also moves
      group_sched_in up before __perf_counter_enable.  The actual content of
      group_sched_in is unchanged by this patch.
      
      [ Impact: fix bug in counter enable code ]
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <18951.34946.451546.691693@drongo.ozlabs.ibm.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      6751b71e
    • Mike Galbraith's avatar
      perf_counter, x86: clean up throttling printk · 88233923
      Mike Galbraith authored
      s/PERFMON/perfcounters for perfcounter interrupt throttling warning.
      
      'perfmon' is the CPU feature name that is Intel-only, while we do
      throttling in a generic way.
      
      [ Impact: cleanup ]
      Signed-off-by: default avatarMike Galbraith <efault@gmx.de>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      88233923
  6. 10 May, 2009 1 commit
  7. 09 May, 2009 1 commit
  8. 08 May, 2009 4 commits
    • Peter Zijlstra's avatar
      perf_counter: add PERF_RECORD_CPU · f370e1e2
      Peter Zijlstra authored
      Allow recording the CPU number the event was generated on.
      
      RFC: this leaves a u32 as reserved, should we fill in the
           node_id() there, or leave this open for future extention,
           as userspace can already easily do the cpu->node mapping
           if needed.
      
      [ Impact: extend perfcounter output record format ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <20090508170029.008627711@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      f370e1e2
    • Peter Zijlstra's avatar
      perf_counter: add PERF_RECORD_CONFIG · a85f61ab
      Peter Zijlstra authored
      Much like CONFIG_RECORD_GROUP records the hw_event.config to
      identify the values, allow to record this for all counters.
      
      [ Impact: extend perfcounter output record format ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <20090508170028.923228280@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a85f61ab
    • Peter Zijlstra's avatar
      perf_counter: rework ioctl()s · 3df5edad
      Peter Zijlstra authored
      Corey noticed that ioctl()s on grouped counters didn't work on
      the whole group. This extends the ioctl() interface to take a
      second argument that is interpreted as a flags field. We then
      provide PERF_IOC_FLAG_GROUP to toggle the behaviour.
      
      Having this flag gives the greatest flexibility, allowing you
      to individually enable/disable/reset counters in a group, or
      all together.
      
      [ Impact: fix group counter enable/disable semantics ]
      Reported-by: default avatarCorey Ashford <cjashfor@linux.vnet.ibm.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20090508170028.837558214@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      3df5edad
    • Peter Zijlstra's avatar
      perf_counter: optimize perf_counter_task_tick() · 7fc23a53
      Peter Zijlstra authored
      perf_counter_task_tick() does way too much work to find out
      there's nothing to do. Provide an easy short-circuit for the
      normal case where there are no counters on the system.
      
      [ Impact: micro-optimization ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <20090508170028.750619201@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      7fc23a53
  9. 06 May, 2009 1 commit
  10. 05 May, 2009 4 commits