1. 05 May, 2010 1 commit
  2. 04 May, 2010 4 commits
    • Ingo Molnar's avatar
    • Anton Blanchard's avatar
      perf: Fix performance issue with perf report · 02bf60aa
      Anton Blanchard authored
      On a large machine we spend a lot of time in perf_header__find_attr when
      running perf report.
      
      If we are parsing a file without PERF_SAMPLE_ID then for each sample we call
      perf_header__find_attr and loop through all counter IDs, never finding a match.
      As the machine gets larger there are more per cpu counters and we spend an
      awful lot of time in there.
      
      The patch below initialises each sample id to -1ULL and checks for this in
      perf_header__find_attr. We may need to do something more intelligent eventually
      (eg a hash lookup from counter id to attr) but this at least fixes the most
      common usage of perf report.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Acked-by: default avatarEric B Munson <ebmunson@us.ibm.com>
      LKML-Reference: <20100504111915.GB14636@kryten>
      Signed-off-by: default avatarAnton Blanchard <anton@samba.org>
      --
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      02bf60aa
    • Arnaldo Carvalho de Melo's avatar
      perf inject: Add missing bits · 11d232ec
      Arnaldo Carvalho de Melo authored
      New commands need to have Documentation and be added to command-list.txt
      so that they can appear when 'perf' is called withouth any subcommand:
      
      [root@doppio linux-2.6-tip]# perf
      
       usage: perf [--version] [--help] COMMAND [ARGS]
      
       The most commonly used perf commands are:
         annotate        Read perf.data (created by perf record) and display annotated code
         archive         Create archive with object files with build-ids found in perf.data file
         bench           General framework for benchmark suites
         buildid-cache   Manage build-id cache.
         buildid-list    List the buildids in a perf.data file
         diff            Read two perf.data files and display the differential profile
         inject          Filter to augment the events stream with additional information
         kmem            Tool to trace/measure kernel memory(slab) properties
         kvm             Tool to trace/measure kvm guest os
         list            List all symbolic event types
         lock            Analyze lock events
         probe           Define new dynamic tracepoints
         record          Run a command and record its profile into perf.data
         report          Read perf.data (created by perf record) and display the profile
         sched           Tool to trace/measure scheduler properties (latencies)
         stat            Run a command and gather performance counter statistics
         test            Runs sanity tests.
         timechart       Tool to visualize total system behavior during a workload
         top             System profiling tool.
         trace           Read perf.data (created by perf record) and display trace output
      
       See 'perf help COMMAND' for more information on a specific command.
      
      [root@doppio linux-2.6-tip]#
      
      The new 'perf inject' command hadn't so it wasn't appearing on that list.
      
      Also fix the long option, that should have no spaces in it, rename the faulty one
      to be '--build-ids', instead of '--inject build-ids'.
      Reported-by: default avatarIngo Molnar <mingo@elte.hu>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      11d232ec
    • Frederic Weisbecker's avatar
      hw_breakpoints: Fix percpu build failure · 777d0411
      Frederic Weisbecker authored
      Fix this build error:
      
         kernel/hw_breakpoint.c:58:1: error: pasting "__pcpu_scope_" and "*" does not give a valid preprocessing token
      
      It happens if CONFIG_DEBUG_FORCE_WEAK_PER_CPU, because we concatenate
      someting with the name and we have the "*" in the name.
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      LKML-Reference: <20100503133942.GA5497@nowhere>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      777d0411
  3. 03 May, 2010 2 commits
    • Tom Zanussi's avatar
      perf: record TRACE_INFO only if using tracepoints and SAMPLE_RAW · 63e0c771
      Tom Zanussi authored
      The current perf code implicitly assumes SAMPLE_RAW means tracepoints
      are being used, but doesn't check for that.  It happily records the
      TRACE_INFO even if SAMPLE_RAW is used without tracepoints, but when the
      perf data is read it won't go any further when it finds TRACE_INFO but
      no tracepoints, and displays misleading errors.
      
      This adds a check for both in perf-record, and won't record TRACE_INFO
      unless both are true.  This at least allows perf report -D to dump raw
      events, and avoids triggering a misleading error condition in perf
      trace.  It doesn't actually enable the non-tracepoint raw events to be
      displayed in perf trace, since perf trace currently only deals with
      tracepoint events.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1272865861.7932.16.camel@tropicana>
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      63e0c771
    • Ingo Molnar's avatar
      Merge branch 'perf/core' of... · 0806ebd9
      Ingo Molnar authored
      Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
      0806ebd9
  4. 02 May, 2010 5 commits
    • Arnaldo Carvalho de Melo's avatar
      perf inject: Refactor read_buildid function · 090f7204
      Arnaldo Carvalho de Melo authored
      Into two functions, one that actually reads the build_id for the dso if
      it wasn't already read, and another taht will inject the event if the
      build_id is available.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      090f7204
    • Arnaldo Carvalho de Melo's avatar
      perf record: Don't exit in live mode when no tracepoints are enabled · 2c9faa06
      Arnaldo Carvalho de Melo authored
      With this I was able to actually test Tom Zanussi's two previous patches
      in my usual perf testing ways, i.e. without any tracepoints activated.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      2c9faa06
    • Tom Zanussi's avatar
      perf: add perf-inject builtin · 454c407e
      Tom Zanussi authored
      Currently, perf 'live mode' writes build-ids at the end of the
      session, which isn't actually useful for processing live mode events.
      
      What would be better would be to have the build-ids sent before any of
      the samples that reference them, which can be done by processing the
      event stream and retrieving the build-ids on the first hit.  Doing
      that in perf-record itself, however, is off-limits.
      
      This patch introduces perf-inject, which does the same job while
      leaving perf-record untouched.  Normal mode perf still records the
      build-ids at the end of the session as it should, but for live mode,
      perf-inject can be injected in between the record and report steps
      e.g.:
      
      perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -
      
      perf-inject reads a perf-record event stream and repipes it to stdout.
      At any point the processing code can inject other events into the
      event stream - in this case build-ids (-b option) are read and
      injected as needed into the event stream.
      
      Build-ids are just the first user of perf-inject - potentially
      anything that needs userspace processing to augment the trace stream
      with additional information could make use of this facility.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      454c407e
    • Tom Zanussi's avatar
      perf/live: don't synthesize build ids at the end of a live mode trace · 789688fa
      Tom Zanussi authored
      It doesn't really make sense to record the build ids at the end of a
      live mode session - live mode samples need that information during the
      trace rather than at the end.
      
      Leave event__synthesize_build_id() in place, however; we'll still be
      using that to synthesize build ids in a more timely fashion in a
      future patch.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1272696080-16435-2-git-send-email-tzanussi@gmail.com>
      Signed-off-by: default avatarTom Zanussi <tzanussi@gmail.com>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      789688fa
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Don't use code surrounded by __KERNEL__ · fb72014d
      Arnaldo Carvalho de Melo authored
      We need to refactor code to be explicitely shared by the kernel and at
      least the tools/ userspace programs, so, till we do that, copy the bare
      minimum bitmap/bitops code needed by tools/perf.
      Reported-by: default avatar"H. Peter Anvin" <hpa@zytor.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      fb72014d
  5. 01 May, 2010 7 commits
    • Frederic Weisbecker's avatar
      hw-breakpoints: Get the number of available registers on boot dynamically · feef47d0
      Frederic Weisbecker authored
      The breakpoint generic layer assumes that archs always know in advance
      the static number of address registers available to host breakpoints
      through the HBP_NUM macro.
      
      However this is not true for every archs. For example Arm needs to get
      this information dynamically to handle the compatiblity between
      different versions.
      
      To solve this, this patch proposes to drop the static HBP_NUM macro
      and let the arch provide the number of available slots through a
      new hw_breakpoint_slots() function. For archs that have
      CONFIG_HAVE_MIXED_BREAKPOINTS_REGS selected, it will be called once
      as the number of registers fits for instruction and data breakpoints
      together.
      For the others it will be called first to get the number of
      instruction breakpoint registers and another time to get the
      data breakpoint registers, the targeted type is given as a
      parameter of hw_breakpoint_slots().
      Reported-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarPaul Mundt <lethal@linux-sh.org>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      feef47d0
    • Frederic Weisbecker's avatar
      hw-breakpoints: Handle breakpoint weight in allocation constraints · f93a2054
      Frederic Weisbecker authored
      Depending on their nature and on what an arch supports, breakpoints
      may consume more than one address register. For example a simple
      absolute address match usually only requires one address register.
      But an address range match may consume two registers.
      
      Currently our slot allocation constraints, that tend to reflect the
      limited arch's resources, always consider that a breakpoint consumes
      one slot.
      
      Then provide a way for archs to tell us the weight of a breakpoint
      through a new hw_breakpoint_weight() helper. This weight will be
      computed against the generic allocation constraints instead of
      a constant value.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarPaul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      f93a2054
    • Frederic Weisbecker's avatar
      hw-breakpoints: Separate constraint space for data and instruction breakpoints · 0102752e
      Frederic Weisbecker authored
      There are two outstanding fashions for archs to implement hardware
      breakpoints.
      
      The first is to separate breakpoint address pattern definition
      space between data and instruction breakpoints. We then have
      typically distinct instruction address breakpoint registers
      and data address breakpoint registers, delivered with
      separate control registers for data and instruction breakpoints
      as well. This is the case of PowerPc and ARM for example.
      
      The second consists in having merged breakpoint address space
      definition between data and instruction breakpoint. Address
      registers can host either instruction or data address and
      the access mode for the breakpoint is defined in a control
      register. This is the case of x86 and Super H.
      
      This patch adds a new CONFIG_HAVE_MIXED_BREAKPOINTS_REGS config
      that archs can select if they belong to the second case. Those
      will have their slot allocation merged for instructions and
      data breakpoints.
      
      The others will have a separate slot tracking between data and
      instruction breakpoints.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarPaul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      0102752e
    • Frederic Weisbecker's avatar
      hw-breakpoints: Change/Enforce some breakpoints policies · b2812d03
      Frederic Weisbecker authored
      The current policies of breakpoints in x86 and SH are the following:
      
      - task bound breakpoints can only break on userspace addresses
      - cpu wide breakpoints can only break on kernel addresses
      
      The former rule prevents ptrace breakpoints to be set to trigger on
      kernel addresses, which is good. But as a side effect, we can't
      breakpoint on kernel addresses for task bound breakpoints.
      
      The latter rule simply makes no sense, there is no reason why we
      can't set breakpoints on userspace while performing cpu bound
      profiles.
      
      We want the following new policies:
      
      - task bound breakpoint can set userspace address breakpoints, with
      no particular privilege required.
      - task bound breakpoints can set kernelspace address breakpoints but
      must be privileged to do that.
      - cpu bound breakpoints can do what they want as they are privileged
      already.
      
      To implement these new policies, this patch checks if we are dealing
      with a kernel address breakpoint, if so and if the exclude_kernel
      parameter is set, we tell the user that the breakpoint is invalid,
      which makes a good generic ptrace protection.
      If we don't have exclude_kernel, ensure the user has the right
      privileges as kernel breakpoints are quite sensitive (risk of
      trap recursion attacks and global performance impacts).
      
      [ Paul Mundt: keep addr space check for sh signal delivery and fix
        double function declaration]
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarPaul Mundt <lethal@linux-sh.org>
      b2812d03
    • Frederic Weisbecker's avatar
      hw-breakpoints: Check disabled breakpoints again · 87e9b202
      Frederic Weisbecker authored
      We stopped checking disabled breakpoints because we weren't
      allowing breakpoints on NULL addresses. And gdb tends to set
      NULL addresses on inactive breakpoints.
      
      But refusing NULL addresses was actually a regression that has
      been fixed now. There is no reason anymore to not validate
      inactive breakpoint settings.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarPaul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      87e9b202
    • Frederic Weisbecker's avatar
      hw-breakpoints: Tag ptrace breakpoint as exclude_kernel · 73266fc1
      Frederic Weisbecker authored
      Tag ptrace breakpoints with the exclude_kernel attribute set. This
      will make it easier to set generic policies on breakpoints, when it
      comes to ensure nobody unpriviliged try to breakpoint on the kernel.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarPaul Mundt <lethal@linux-sh.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: K. Prasad <prasad@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      73266fc1
    • Frederic Weisbecker's avatar
      perf: Fix warning while reading ring buffer headers · d00a47cc
      Frederic Weisbecker authored
      commit e9e94e3b
      "perf trace: Ignore "overwrite" field if present in
      /events/header_page" makes perf trace launching spurious warnings
      about unexpected tokens read:
      
      	Warning: Error: expected type 6 but read 4
      
      This change tries to handle the overcommit field in the header_page
      file whenever this field is present or not.
      
      The problem is that if this field is not present, we try to find it
      and give up in the middle of the line when we realize we are actually
      dealing with another field, which is the "data" one. And this failure
      abandons the file pointer in the middle of the "data" description
      line:
      
      	field: u64 timestamp;	offset:0;	size:8;	signed:0;
      	field: local_t commit;	offset:8;	size:8;	signed:1;
      	field: char data;	offset:16;	size:4080;	signed:1;
                            ^^^
                            Here
      
      What happens next is that we want to read this line to parse the data
      field, but we fail because the pointer is not in the beginning of the
      line.
      
      We could probably fix that by rewinding the pointer. But in fact we
      don't care much about these headers that only concern the ftrace
      ring-buffer. We don't use them from perf.
      
      Just skip this part of perf.data, but don't remove it from recording
      to stay compatible with olders perf.data
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      d00a47cc
  6. 30 Apr, 2010 10 commits
  7. 29 Apr, 2010 11 commits