1. 26 Jun, 2013 2 commits
    • perf/x86: Fix shared register mutual exclusion enforcement · 2f7f73a5
      Stephane Eranian authored
      This patch fixes a problem with the shared registers mutual
      exclusion code and incremental event scheduling by the
      generic perf_event code.
      
      There was a bug whereby the mutual exclusion on the shared
      registers was not enforced because of incremental scheduling
      abort due to event constraints. As an example on Intel
      Nehalem, consider the following events:
      
      group1= L1D_CACHE_LD:E_STATE,OFFCORE_RESPONSE_0:PF_RFO,L1D_CACHE_LD:I_STATE
      group2= L1D_CACHE_LD:I_STATE
      
      The L1D_CACHE_LD event can only be measured by 2 counters. Yet, there
      are 3 instances here. The first group can be scheduled and is committed.
      Then, the generic code tries to schedule group2 and this fails (because
      there is no counter left to support the 3rd instance of L1D_CACHE_LD).
      But in the x86_schedule_events() error path, put_event_constraints() is
      invoked on ALL the events and not just the ones that just failed. That
      causes the "lock" on the shared offcore_response MSR to be released. Yet
      the first group is actually scheduled and is exposed to reprogramming of
      that shared MSR by the sibling HT thread. In other words, there is no
      guarantee on what is measured.
      
      This patch fixes the problem by tagging committed events with the
      PERF_X86_EVENT_COMMITTED tag. In the error path of x86_schedule_events(),
      only the events NOT tagged have their constraints released. The tag
      is eventually removed when the event is descheduled.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20130620164254.GA3556@quad
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
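
The tagging scheme described in this commit can be modelled in a few lines of C. This is a minimal sketch, not the kernel code: `struct event`, `EVENT_COMMITTED`, `holds_shared_reg` and the three helper functions are hypothetical stand-ins for the real perf_event structures, the PERF_X86_EVENT_COMMITTED tag and the constraint-release path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the PERF_X86_EVENT_COMMITTED tag. */
#define EVENT_COMMITTED 0x1u

/* Simplified model of a scheduled event (not the kernel's struct). */
struct event {
    unsigned int flags;     /* EVENT_COMMITTED once its group is scheduled */
    bool holds_shared_reg;  /* models the "lock" on the shared MSR */
};

/* Tag the first n events as belonging to successfully scheduled groups. */
static void commit_events(struct event *ev, size_t n)
{
    for (size_t i = 0; i < n; i++)
        ev[i].flags |= EVENT_COMMITTED;
}

/* Error path: release constraints only for events that are NOT tagged. */
static void release_uncommitted(struct event *ev, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ev[i].flags & EVENT_COMMITTED)
            continue;                   /* committed: keep the lock */
        ev[i].holds_shared_reg = false; /* failed attempt: drop the lock */
    }
}

/* Descheduling removes the tag again. */
static void deschedule_event(struct event *e)
{
    e->flags &= ~EVENT_COMMITTED;
}
```

With group1 committed (three events) and group2 failing (one event), calling `release_uncommitted()` over all four events leaves the shared-register lock of group1 intact, which is the behaviour the commit message asks for.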
    • perf/x86/intel: Support full width counting · 069e0c3c
      Andi Kleen authored
      Recent Intel CPUs like Haswell and IvyBridge have a new
      alternative MSR range for perfctrs that allows writing the full
      counter width. Enable this range if the hardware reports it
      using a new capability bit.
      
      Currently the perf code queries CPUID to get the counter width,
      and sign extends the counter values as needed. The traditional
      PERFCTR MSRs always limit writes to 32 bits, even though the counter
      internally is larger (usually 48 bits on recent CPUs).
      
      When the new capability is set, use the alternative range, which
      does not have these restrictions.
      
      This lowers the overhead of perf stat slightly because it has to
      take fewer interrupts to accumulate the counter value. On Haswell
      it also avoids some problems with TSX aborting when the end of
      the counter range is reached.
      
      ( See the patch "perf/x86/intel: Avoid checkpointed counters
        causing excessive TSX aborts" for more details. )
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1372173153-20215-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
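
To see why a wider writable range means fewer interrupts, the arithmetic can be sketched as below. The helper names are hypothetical, and the model assumes the counter is loaded with the negated period and counts up to overflow, so one of the writable bits is spent on the sign. The 48-bit figure is only the "usual" width mentioned above, not a guarantee for any given CPU.

```c
#include <stdint.h>

/*
 * Largest period that can be programmed when only `writable_bits` bits of
 * the counter can be written (model assumption: one bit goes to the sign).
 * Valid for 2 <= writable_bits <= 63.
 */
static uint64_t max_period(unsigned int writable_bits)
{
    return (UINT64_C(1) << (writable_bits - 1)) - 1;
}

/* Overflow interrupts needed to accumulate `total` events, rounded up. */
static uint64_t overflow_interrupts(uint64_t total, uint64_t period)
{
    return total / period + (total % period != 0);
}
```

Under these assumptions, counting 2^40 events takes 513 overflow interrupts with a 32-bit writable range and a single one with a 48-bit range.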
  2. 23 Jun, 2013 3 commits
    • x86: Add NMI duration tracepoints · 0c4df02d
      Dave Hansen authored
      This patch has been invaluable in my adventures finding
      issues in the perf NMI handler.  I'm as big a fan of
      printk() as anybody is, but using printk() in NMIs is
      deadly when they're happening frequently.
      
      Even hacking in trace_printk() ended up eating enough
      CPU to throw off some of the measurements I was making.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: acme@ghostprotocols.net
      Cc: Dave Hansen <dave@sr71.net>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Drop sample rate when sampling is too slow · 14c63f17
      Dave Hansen authored
      This patch keeps track of how long perf's NMI handler is taking,
      and also calculates how many samples perf can take per second.  If
      the sample length times the expected max number of samples
      exceeds a configurable threshold, it drops the sample rate.
      
      This way, we don't have a runaway sampling process eating up the
      CPU.
      
      This patch can tend to drop the sample rate down to a level where
      perf doesn't work very well.  *BUT* the alternative is that my
      system hangs because it spends all of its time handling NMIs.
      
      I'll take a busted performance tool over an entire system that's
      busted and undebuggable any day.
      
      BTW, my suspicion is that there's still an underlying bug here.
      Using the HPET instead of the TSC is definitely a contributing
      factor, but I suspect there are some other things going on.
      But, I can't go dig down on a bug like that with my machine
      hanging all the time.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: acme@ghostprotocols.net
      Cc: Dave Hansen <dave@sr71.net>
      [ Prettified it a bit. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
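
The throttling rule in this commit (sample length times expected number of samples, compared against a configurable threshold) might look roughly like the sketch below. Everything named here is hypothetical: `struct sampler`, its fields and `throttle_if_too_slow()` are illustrative, and halving the rate is an assumption of this sketch, since the message only says the rate is dropped.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one sampling source. */
struct sampler {
    uint64_t avg_sample_ns;     /* average time spent handling one sample */
    uint64_t samples_per_sec;   /* current maximum sample rate */
    uint64_t budget_ns_per_sec; /* configurable threshold: ns allowed per second */
};

/*
 * Returns 1 when the time budget was exceeded, 0 otherwise. When exceeded,
 * the rate is halved (an assumption of this sketch), but never below 1.
 */
static int throttle_if_too_slow(struct sampler *s)
{
    if (s->avg_sample_ns * s->samples_per_sec <= s->budget_ns_per_sec)
        return 0;
    if (s->samples_per_sec > 1)
        s->samples_per_sec /= 2;
    return 1;
}
```

For example, with 5000 ns per sample, 100000 samples per second and a budget of 250 ms per second, the first call lowers the rate to 50000 and the second call finds the load within budget.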
    • x86: Warn when NMI handlers take large amounts of time · 2ab00456
      Dave Hansen authored
      I have a system which is causing all kinds of problems.  It has
      8 NUMA nodes, and lots of cores that can fight over cachelines.
      If things are not working _perfectly_, then NMIs can take longer
      than expected.
      
      If we get too many of them backed up to each other, we can
      easily end up in a situation where we are doing nothing *but*
      running NMIs.  The biggest problem, though, is that this happens
      _silently_.  You might be lucky to get an hrtimer warning, but
      most of the time the system simply hangs.
      
      This patch should at least give us some warning before we fall
      off the cliff.  The warnings look like this:
      
      	nmi_handle: perf_event_nmi_handler() took: 26095071 ns
      
      The message is triggered whenever we notice the longest NMI
      we've seen to date.  You can always view and reset this value
      via the debugfs interface if you like.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: paulus@samba.org
      Cc: acme@ghostprotocols.net
      Cc: Dave Hansen <dave@sr71.net>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
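
The "warn only on a new record" behaviour described in this commit can be sketched in userspace C as follows. `longest_ns` and `note_nmi_duration()` are hypothetical names; the real patch keeps its record in the kernel and exposes it through debugfs, which this sketch does not model beyond a plain variable that can be reset to zero.

```c
#include <stdint.h>
#include <stdio.h>

/* Longest handler duration seen so far (hypothetical; resettable to 0). */
static uint64_t longest_ns;

/*
 * Record one handler run. Prints a warning and returns 1 only when this
 * run is the longest seen to date; otherwise returns 0 silently.
 */
static int note_nmi_duration(const char *handler, uint64_t duration_ns)
{
    if (duration_ns <= longest_ns)
        return 0;
    longest_ns = duration_ns;
    fprintf(stderr, "nmi_handle: %s() took: %llu ns\n",
            handler, (unsigned long long)duration_ns);
    return 1;
}
```

Because only new maxima are reported, a burst of slow NMIs produces at most a handful of messages rather than one per interrupt.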
  3. 20 Jun, 2013 9 commits
  4. 19 Jun, 2013 16 commits
  5. 31 May, 2013 1 commit
    • Merge tag 'perf-core-for-mingo' of... · afb71193
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      perf/core improvements and fixes:
      
       * Reset SIGTERM handler in workload child process, fix from David Ahern.
      
       * Handle death by SIGTERM in 'perf record', fix from David Ahern.
      
       * Fix printing of perf_event_paranoid message, from David Ahern.
      
       * Handle realloc failures in 'perf kvm', from David Ahern.
      
       * Fix divide by 0 in variance, from David Ahern.
      
       * Save parent pid in thread struct, from David Ahern.
      
       * Handle JITed code in shared memory, from Andi Kleen.
      
       * Makefile reorganization, prep work for Kconfig patches, from Jiri Olsa.
      
       * Fixes for 'perf diff', from Jiri Olsa.
      
       * Add automated make test suite, from Jiri Olsa.
      
       * 'perf tests' fixes from Jiri Olsa.
      
       * Remove some unused struct members, from Jiri Olsa.
      
       * Add missing liblk.a dependency for python/perf.so, fix from Jiri Olsa.
      
       * Respect CROSS_COMPILE in liblk.a, from Rabin Vincent.
      
       * Expand definition of sysfs format attribute, from Michael Ellerman.
      
       * No need to do locking when adding hists in perf report, only 'top'
         needs that, from Namhyung Kim.
      
       * Sorting improvements, from Namhyung Kim.
      
       * Fix alignment of symbol column in the hists browser (top, report)
         when -v is given, from Namhyung Kim.
      
       * Add --percent-limit option to 'top' and 'report', from Namhyung Kim.
      
       * Fix 'perf top' -E option behavior, from Namhyung Kim.
      
       * Fix bug in isupper() and islower(), from Sukadev Bhattiprolu.
      
       * Fix compile errors in bp_signal 'perf test', from Sukadev Bhattiprolu.
      
       * Make Power7 CPI stack events available in sysfs, from Sukadev Bhattiprolu.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 30 May, 2013 9 commits