1. 06 May, 2009 2 commits
    • ring-buffer: add benchmark and tester · 5092dbc9
      Steven Rostedt authored
      This patch adds code that can benchmark the ring buffer as well as
      test it. This code can be compiled into the kernel (not recommended)
      or as a module.
      
      A separate ring buffer is used so as not to interfere with other users, like
      ftrace. It creates a producer and a consumer (with an option to disable
      creation of the consumer) and will run for 10 seconds, then sleep for
      10 seconds, and then repeat.
      
      While running, the producer will write 10-byte payloads into the ring
      buffer, each containing just the current CPU number. The reader will
      continually try to read the buffer, alternating between reading
      event by event and reading full pages.
      
      The output is a pr_info, thus it will fill up the syslogs.
      
        Starting ring buffer hammer
        End ring buffer hammer
        Time:     9000349 (usecs)
        Overruns: 12578640
        Read:     5358440  (by events)
        Entries:  0
        Total:    17937080
        Missed:   0
        Hit:      17937080
        Entries per millisec: 1993
        501 ns per entry
        Sleeping for 10 secs
        Starting ring buffer hammer
        End ring buffer hammer
        Time:     9936350 (usecs)
        Overruns: 0
        Read:     28146644  (by pages)
        Entries:  74
        Total:    28146718
        Missed:   0
        Hit:      28146718
        Entries per millisec: 2832
        353 ns per entry
        Sleeping for 10 secs
      
      Time:      is the time the test ran
      Overruns:  the number of events that were overwritten and not read
      Read:      the number of events read (either by pages or events)
      Entries:   the number of entries left in the buffer
                       (the by pages will only read full pages)
      Total:     Entries + Read + Overruns
      Missed:    the number of entries that failed to write
      Hit:       the number of entries that were written
      
      The above example shows that it takes ~353 nanosecs per entry when
      there is a reader, reading by pages (and no overruns).
      
      The event by event reader slowed the producer down to 501 nanosecs.
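
The derived figures in the sample output can be reproduced with plain integer arithmetic. The sketch below is not taken from the benchmark source: the struct and function names are hypothetical, and the truncating divisions are an assumption inferred from the printed numbers (9000349 usecs truncates to 9000 msecs, 17937080 / 9000 = 1993, 1000000 / 1993 = 501).

```c
/* Hypothetical container for the counters the benchmark prints. */
struct rb_bench_result {
	unsigned long long time_usecs;
	unsigned long long overruns;
	unsigned long long read;
	unsigned long long entries;
	unsigned long long hit;
};

/* Total: Entries + Read + Overruns */
static unsigned long long rb_bench_total(const struct rb_bench_result *r)
{
	return r->entries + r->read + r->overruns;
}

/* Entries written per whole millisecond of run time (truncating). */
static unsigned long long rb_bench_entries_per_ms(const struct rb_bench_result *r)
{
	unsigned long long msecs = r->time_usecs / 1000;

	return msecs ? r->hit / msecs : 0;
}

/* Average cost of one entry in nanoseconds (truncating). */
static unsigned long long rb_bench_ns_per_entry(const struct rb_bench_result *r)
{
	unsigned long long per_ms = rb_bench_entries_per_ms(r);

	return per_ms ? 1000000ULL / per_ms : 0;
}
```

Feeding in the "by events" run (Time 9000349, Overruns 12578640, Read 5358440, Entries 0, Hit 17937080) gives a total equal to Hit, 1993 entries per millisec and 501 ns per entry, matching the output above.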
      
      [ Impact: see how changes to the ring buffer affect stability and performance ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: move big if statement down · aa20ae84
      Steven Rostedt authored
      In the hot path of the ring buffer "__rb_reserve_next" there's a big
      if statement that does not even return back to the work flow.
      
      	code;
      
      	if (cross to next page) {
      
      		[ lots of code ]
      
      		return;
      	}
      
      	more code;
      
      The condition is even the unlikely path, although we do not denote it
      with an unlikely because gcc is fine with it. The condition is true when
      the write crosses a page boundary, and we need to start at a new page.
      
      Having this if statement makes it hard to read, but calling another
      function to do the work is also not appropriate, because we are using a lot
      of variables that were set before the if statement, and we do not want to
      send them as parameters.
      
      This patch changes it to a goto:
      
      	code;
      
      	if (cross to next page)
      		goto next_page;
      
      	more code;
      
      	return;
      
      next_page:
      
      	[ lots of code ]
      
      This makes the code easier to understand, and a bit more obvious.
      
      The output from gcc is practically identical. For some reason, gcc decided
      to use different registers when I switched it to a goto. But other than that,
      the logic is the same.
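
As a compilable illustration of the same shape, here is a toy reservation function. It is only a sketch: `toy_buffer`, `toy_reserve` and `TOY_PAGE_BYTES` are made-up names, the page size is arbitrary, and the real `__rb_reserve_next` does far more work on the slow path.

```c
#include <stddef.h>

#define TOY_PAGE_BYTES 64	/* hypothetical, tiny page size for illustration */

struct toy_buffer {
	size_t page;	/* index of the page being written */
	size_t write;	/* write offset within that page */
};

/*
 * Reserve len bytes (len <= TOY_PAGE_BYTES is assumed) and return the
 * offset within the current page where the event starts.
 */
static size_t toy_reserve(struct toy_buffer *b, size_t len)
{
	size_t tail = b->write + len;

	/* Rare case: the write would cross the page boundary. */
	if (tail > TOY_PAGE_BYTES)
		goto next_page;

	/* Hot path stays short and reads straight through. */
	b->write = tail;
	return tail - len;

next_page:
	/* Slow path: stands in for "[ lots of code ]". */
	b->page++;
	b->write = len;
	return 0;
}
```

The slow path still sees every local variable set up before the branch (`tail`, `b`, `len`), which is the reason given above for not moving it into a separate function.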
      
      [ Impact: easier to read code ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 05 May, 2009 9 commits
    • tracing: use proper export symbol for tracing api · 94487d6d
      Steven Rostedt authored
      When adding the EXPORT_SYMBOL to some of the tracing API, I accidentally
      used EXPORT_SYMBOL instead of EXPORT_SYMBOL_GPL. This patch fixes
      that mistake.
      
      [ Impact: export the tracing code only for GPL modules ]
      Reported-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: use .sched.text, not .text.sched in recordmcount.pl · 31b6e76e
      Tim Abbott authored
      The only references in the kernel to the .text.sched section are in
      recordmcount.pl.  Since the code it has is intended to be example code
      it should refer to real kernel sections.  So change it to .sched.text
      instead.
      
      [ Impact: consistency in comments ]
      Signed-off-by: Tim Abbott <tabbott@mit.edu>
      LKML-Reference: <1241136371-10768-1-git-send-email-tabbott@mit.edu>
      Acked-by: Sam Ravnborg <sam@ravnborg.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: disable writers when resetting buffers · 41ede23e
      Steven Rostedt authored
      As a precaution, it is best to disable writing to the ring buffers
      when resetting them.
      
      [ Impact: prevent weird things if write happens during reset ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: have read page swap increment counter with page entries · afbab76a
      Steven Rostedt authored
      In the swap page ring buffer code that is used by the ftrace splice code,
      we scan the page to increment the counter of entries read.
      
      Since the number of entries is already recorded in the page, we simply
      need to add it.
      
      [ Impact: speed up reading page from ring buffer ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: record page entries in buffer page descriptor · 778c55d4
      Steven Rostedt authored
      Currently, when the ring buffer writer overflows the buffer and must
      write over unconsumed data, we increment the overrun counter by
      reading the entries on the page we are about to overwrite. This reads
      the entries one by one.
      
      This is not very efficient. This patch adds another entry counter
      into each buffer page descriptor that keeps track of the number of
      entries on the page. Now on overwrite, the overrun counter simply
      needs to add the number of entries that is on the page it is about
      to overwrite.
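
A minimal sketch of the idea follows. The descriptors and helper names are hypothetical simplifications, not the kernel's structures. Each page carries its own entry count, so an overwrite costs one addition instead of a walk over the events.

```c
/* Hypothetical, simplified descriptors; field names are illustrative. */
struct toy_page {
	unsigned long entries;	/* events currently stored on this page */
};

struct toy_cpu_buffer {
	unsigned long overrun;	/* events lost to overwriting */
};

/* Called for every event written to a page. */
static void toy_page_add_event(struct toy_page *page)
{
	page->entries++;
}

/* Called when the writer reclaims a page that was never read. */
static void toy_page_overwrite(struct toy_cpu_buffer *cpu, struct toy_page *page)
{
	cpu->overrun += page->entries;	/* one addition, no per-event scan */
	page->entries = 0;
}
```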
      
      [ Impact: speed up of ring buffer in overwrite mode ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: convert cpu buffer entries to local_t · e4906eff
      Steven Rostedt authored
      The entries counter in cpu buffer is not atomic. It can be updated by
      other interrupts or from another CPU (readers).
      
      But making entries into "atomic_t" causes an atomic operation that can
      hurt performance. Instead we convert it to a local_t that will increment
      a counter with a local CPU atomic operation (if the arch supports it).
      
      Instead of fighting with readers and overwrites that decrement the counter,
      I added a "read" counter. Every time a reader reads an entry it is
      incremented.
      
      We already have an overrun counter, and with that, the entries counter, and
      the read counter, we can calculate the total number of entries in the
      buffer with:
      
        (entries - overrun) - read
      
      As long as the total number of entries in the ring buffer is less than
      the word size, this will work. But since the entries counter was previously
      a long, this is no different than what we had before.
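
The bookkeeping can be sketched as follows. This is a plain C stand-in with hypothetical names and ordinary `unsigned long` fields; the patch itself uses `local_t` counters, which are not modelled here. The key point is that all three counters only ever grow, and the live count is derived on demand.

```c
/* Hypothetical counters; the real code keeps these per CPU buffer. */
struct toy_counters {
	unsigned long entries;	/* every event ever written */
	unsigned long overrun;	/* events overwritten before being read */
	unsigned long read;	/* events consumed by a reader */
};

/* Number of events currently in the buffer: (entries - overrun) - read */
static unsigned long toy_entries_in_buffer(const struct toy_counters *c)
{
	return (c->entries - c->overrun) - c->read;
}
```

For example, 17937080 written, 12578640 overwritten and 5358440 read leaves 0 events in the buffer, while 28146718 written, 0 overwritten and 28146644 read leaves 74.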
      
      Thanks to Andrew Morton for pointing out in the first version that
      atomic_t does not replace unsigned long. I switched to atomic_long_t
      even though it is signed. A negative count is most likely a bug.
      
      [ Impact: keep accurate count of cpu buffer entries ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: export stats of ring buffers to userspace · c8d77183
      Steven Rostedt authored
      This patch adds stats to the ftrace ring buffers:
      
       # cat /debugfs/tracing/per_cpu/cpu0/stats
       entries: 42360
       overrun: 30509326
       commit overrun: 0
       nmi dropped: 0
      
      Where entries is the total number of data entries in the buffer.
      
      overrun is the number of entries that were not consumed and were
      overwritten by the writer.
      
      commit overrun is the number of entries dropped due to nested writers
      wrapping the buffer before the initial writer finished the commit.
      
      nmi dropped is the number of entries dropped due to the ring buffer
      lock being held when an nmi was going to write to the ring buffer.
      Note, this field will be meaningless and will go away when the ring
      buffer becomes lockless.
      
      [ Impact: let userspace know what is happening in the ring buffers ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: add counters for commit overrun and nmi dropped entries · f0d2c681
      Steven Rostedt authored
      The WARN_ON in the ring buffer when a commit is preempted and the
      buffer is filled by preceding writes can happen in normal operations.
      The WARN_ON makes it look like a bug. Moreover, because it does not
      stop tracing and calls printk, which can also recurse, it is prone
      to deadlock (the WARN_ON is not in a position to recurse).
      
      This patch removes the WARN_ON and replaces it with a counter that
      can be retrieved by a tracer. This counter is called commit_overrun.
      
      While at it, I added a nmi_dropped counter to count any time an NMI entry
      is dropped because the NMI could not take the spinlock.
      
      [ Impact: prevent deadlock by printing normal case warning ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: export symbols · d6ce96da
      Steven Rostedt authored
      I'm adding a module to do a series of tests on the ring buffer as well
      as benchmarks. This module needs to have more of the ring buffer API
      exported. There's nothing wrong with reading the ring buffer from a
      module.
      
      [ Impact: allow modules to read pages from the ring buffer ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  3. 01 May, 2009 3 commits
  4. 29 Apr, 2009 11 commits
    • tracing: fix build failure on s390 · a0e39ed3
      Heiko Carstens authored
      "tracing: create automated trace defines" causes this compile error on s390,
      as reported by Sachin Sant against linux-next:
      
       kernel/built-in.o: In function `__do_softirq':
       (.text+0x1c680): undefined reference to `__tracepoint_softirq_entry'
      
      This happens because the definitions of the softirq tracepoints were moved
      from kernel/softirq.c to kernel/irq/handle.c. Since s390 doesn't support
      generic hardirqs, handle.c doesn't get compiled and the definitions are
      missing.
      
      So move the tracepoints to softirq.c again.
      
      [ Impact: fix build failure on s390 ]
      Reported-by: Sachin Sant <sachinp@in.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      LKML-Reference: <20090429135139.5fac79b8@osiris.boeblingen.de.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/filters: a better event parser · 8b372562
      Tom Zanussi authored
      Replace the current event parser hack with a better one.  Filters are
      no longer specified predicate by predicate, but all at once and can
      use parens and any of the following operators:
      
      numeric fields:
      
      ==, !=, <, <=, >, >=
      
      string fields:
      
      ==, !=
      
      predicates can be combined with the logical operators:
      
      &&, ||
      
      examples:
      
      echo "common_preempt_count > 4" > filter
      
      echo "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
      
      If there was an error, the erroneous string along with an error
      message can be seen by looking at the filter e.g.:
      
      ((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
      ^
      parse_error: Field not found
      
      Currently the caret for an error always appears at the beginning of
      the filter; a real position should be used, but the error message
      should be useful even without it.
      
      To clear a filter, '0' can be written to the filter file.
      
      Filters can also be set or cleared for a complete subsystem by writing
      the same filter as would be written to an individual event to the
      filter file at the root of the subsystem.  Note however, that if any
      event in the subsystem lacks a field specified in the filter being
      set, the set will fail and all filters in the subsystem are
      automatically cleared.  This change from the previous version was made
      because using only the fields that happen to exist for a given event
      would most likely result in a meaningless filter.
      
      Because the logical operators are now implemented as predicates, the
      maximum number of predicates in a filter was increased from 8 to 16.
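
To make the semantics of the second example filter concrete, the sketch below expresses it as an ordinary C predicate. It illustrates what the filter is meant to accept and is not the parser or the kernel's predicate evaluation; `toy_event` and `toy_filter_match` are hypothetical names.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical event record with the two fields the filter refers to. */
struct toy_event {
	int sig;		/* numeric field: ==, !=, <, <=, >, >= apply */
	const char *comm;	/* string field: only == and != apply */
};

/* ((sig >= 10 && sig < 15) || sig == 17) && comm != bash */
static bool toy_filter_match(const struct toy_event *ev)
{
	return ((ev->sig >= 10 && ev->sig < 15) || ev->sig == 17) &&
	       strcmp(ev->comm, "bash") != 0;
}
```

An event with `sig` 12 from `sshd` passes, while `sig` 17 from `bash` and `sig` 15 from `sshd` are both rejected.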
      
      [ Impact: add new, extended trace-filter implementation ]
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905899.6416.121.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/filters: distinguish between signed and unsigned fields · a118e4d1
      Tom Zanussi authored
      The new filter comparison ops need to be able to distinguish between
      signed and unsigned field types, so add an is_signed flag/param to the
      event field struct/trace_define_fields().  Also define a simple macro,
      is_signed_type() to determine the signedness at compile time, used in the
      trace macros.  If the is_signed_type() macro won't work with a specific
      type, a new slightly modified version of TRACE_FIELD() called
      TRACE_FIELD_SIGN(), allows the signedness to be set explicitly.
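
One common way to write such a compile-time check is to cast -1 to the type and test whether the result is negative. The sketch below assumes that form; the patch's exact definition is not shown here, and `toy_field` / `TOY_FIELD` are hypothetical stand-ins for the event field struct and the trace macros.

```c
/* Casting -1 to an unsigned type wraps to a large positive value. */
#define is_signed_type(type)	(((type)(-1)) < 0)

/* Hypothetical field description carrying the signedness flag. */
struct toy_field {
	const char *name;
	int is_signed;
};

/* Hypothetical helper: record name and signedness from the declared type. */
#define TOY_FIELD(type, item)	{ #item, is_signed_type(type) }

static const struct toy_field toy_fields[] = {
	TOY_FIELD(int, sig),
	TOY_FIELD(unsigned long, ip),
};
```

Because the macro expands to a constant expression, it can be used in static initializers like the table above, with no run-time cost.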
      
      [ Impact: extend trace-filter code for new feature ]
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905893.6416.120.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/filters: move preds into event_filter object · 30e673b2
      Tom Zanussi authored
      Create a new event_filter object, and move the pred-related members
      out of the call and subsystem objects and into the filter object - the
      details of the filter implementation don't need to be exposed in the
      call and subsystem in any case, and it will also help make the new
      parser implementation a little cleaner.
      
      [ Impact: refactor trace-filter code to prepare for new features ]
      Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: fweisbec@gmail.com
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <1240905887.6416.119.camel@tropicana>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: x86, mmiotrace: only register for die notifier when tracer active · 0f9a623d
      Stuart Bennett authored
      Follow up to afcfe024 in Linus' tree
      ("x86: mmiotrace: quieten spurious warning message")
      Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
      Acked-by: Pekka Paalanen <pq@iki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1240946271-7083-5-git-send-email-stuart@freedesktop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: x86, mmiotrace: refactor clearing/restore of page presence · 46e91d00
      Stuart Bennett authored
      * change function names to clear_* from set_*: in reality we only clear
        and restore page presence, and never unconditionally set present.
        Using clear_*({true, false}, ...) is therefore more honest than
        set_*({false, true}, ...)
      
      * upgrade presence storage to pteval_t: doing user-space tracing will
        require saving and manipulation of the _PAGE_PROTNONE bit, in addition
        to the existing _PAGE_PRESENT changes, and having multiple bools stored
        and passed around does not seem optimal
      
      [ Impact: refactor, clean up mmiotrace code ]
      Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
      Acked-by: Pekka Paalanen <pq@iki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1240946271-7083-4-git-send-email-stuart@freedesktop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: x86, mmiotrace: code consistency/legibility improvement · 0492e1bb
      Stuart Bennett authored
      kmmio_probe being *p and kmmio_fault_page being sometimes *f and
      sometimes *p is not helpful.
      
      [ Impact: cleanup ]
      Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
      Acked-by: Pekka Paalanen <pq@iki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1240946271-7083-3-git-send-email-stuart@freedesktop.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer: fix printk output · 7d7d2b80
      Steven Rostedt authored
      The warning output in trace_recursive_lock uses %d for a long when
      it should be %ld.
      
      [ Impact: fix compile warning ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: have splice only copy full pages · f2957f1f
      Steven Rostedt authored
      Splice works with pages, so it is much more efficient to use an entire
      page than to copy bits over several pages.
      
      Using logdev to trace the internals of the splice mechanism, I was
      able to see that splice can be very aggressive. When tracing is
      occurring, and the reader caught up to the writer, and the writer
      is on the reader page, the reader will copy what is there into the
      splice page. Splice may iterate over several pages and if the
      writer is still writing to the page, the reader will keep copying
      bits to new pages to pass to userspace.
      
      This patch changes it to only pass data to userspace if the page
      is full (the writer has left the page). This has a small side effect
      that splice can not read a partial page, and must wait for the
      page to fill. This should not be an issue. If tracing has stopped,
      then a use of "read" will still read all of the page.
      
      [ Impact: better performance for ring buffer splice code ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: only add splice page if entries exist · 93459c6c
      Steven Rostedt authored
      The splice code allocates a page even when the ring buffer is empty.
      It detects the ring buffer being empty when it fails to copy
      anything from the ring buffer into the page.
      
      This patch adds a check to see if there is anything in the ring buffer
      before allocating a page.
      
      Thanks to logdev for letting me trace the tracer to find this.
      
      [ Impact: speed up due to removing unnecessary allocation ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: fix ref count in splice pages · 5beae6ef
      Steven Rostedt authored
      The pages allocated for the splice binary buffer did not initialize
      the ref count correctly. This caused pages not to be freed, resulting
      in a drastic memory leak.
      
      Thanks to logdev I was able to trace the tracer to find where the leak
      was.
      
      [ Impact: stop memory leak when using splice ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  5. 28 Apr, 2009 1 commit
    • tracing: convert ftrace_dump spinlocks to raw · cd891ae0
      Steven Rostedt authored
      ftrace_dump is used for printing out the contents of the ftrace ring buffer
      to the console on failure. Currently it uses a spinlock to synchronize
      the output from multiple failures on different CPUs. This spin lock
      currently is a normal spinlock and can cause issues with lockdep and
      lock tracing.
      
      This patch converts it to raw since it is for error handling only.
      The lock is local to the ftrace_dump and is not used by any other
      infrastructure.
      
      [ Impact: prevent ftrace_dump from locking up by internal tracing ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  6. 26 Apr, 2009 1 commit
    • tracing/events: make modules have their own file_operations structure · 701970b3
      Steven Rostedt authored
      For proper module reference counting, the file_operations that modules use
      must have the "owner" field set to the module. Unfortunately, the trace events
      use shared file_operations. The same file_operations are used by both
      the kernel core and all modules.
      
      This patch makes the modules allocate their own file_operations and
      copies the functions from the core kernel. This allows those file
      operations to be owned by the module.
      
      Care is taken to free this code on module unload.
      
      Thanks to Greg KH for reminding me that file_operations must be owned
      by the module to have reference counting take place.
      
      [ Impact: fix modular tracepoints / potential crash ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
  7. 25 Apr, 2009 1 commit
    • tracing/events: reuse trace event ids after overflow · 060fa5c8
      Steven Rostedt authored
      With modules being able to add trace events, and the max trace event
      counter being 16 bits (65536), we can overflow the counter easily
      with a simple while loop adding and removing modules that contain
      trace events.
      
      This patch links together the registered trace events and on overflow
      searches for available trace event ids. It will still fail if
      over 65536 events are registered, but considering that a typical
      kernel only has 22000 functions, 65000 events should be sufficient.
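
The search for a reusable id can be sketched like this. It assumes the registered ids are kept sorted in ascending order and uses an array rather than the linked list described above. The names and the id limits are hypothetical and not taken from the patch.

```c
#define TOY_FIRST_ID	1	/* hypothetical lowest assignable id */
#define TOY_MAX_ID	65535	/* hypothetical largest 16-bit id */

/*
 * ids[] holds the registered event ids in ascending order.
 * Returns the lowest unused id, or 0 if every id is taken.
 */
static int toy_find_free_id(const int *ids, int count)
{
	int expect = TOY_FIRST_ID;

	for (int i = 0; i < count; i++) {
		if (ids[i] != expect)
			break;	/* found a gap left by an unregistered event */
		expect++;
	}

	return expect <= TOY_MAX_ID ? expect : 0;
}
```

With ids 1, 2 and 4 registered, the search returns 3, the slot freed by a removed module.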
      Reported-by: Li Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  8. 24 Apr, 2009 9 commits
    • tracing: remove deprecated TRACE_FORMAT · b8e65554
      Steven Rostedt authored
      The TRACE_FORMAT macro has been deprecated by the TRACE_EVENT macro.
      There are no more users. All new users must use the TRACE_EVENT macro.
      
      [ Impact: remove old functionality ]
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/irq: convert irq traces to use TRACE_EVENT macro · 160031b5
      Steven Rostedt authored
      The TRACE_FORMAT will soon be deprecated. This patch converts it to
      the TRACE_EVENT macro.
      
      Note, this change should also speed up the tracing.
      
      [ Impact: remove a user of deprecated TRACE_FORMAT ]
      
      Cc: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/lockdep: convert lockdep to use TRACE_EVENT macro · 39517091
      Steven Rostedt authored
      The TRACE_FORMAT will soon be deprecated. This patch converts it to
      the TRACE_EVENT macro.
      
      Note, this change should also speed up the tracing.
      
      [ Impact: remove a user of deprecated TRACE_FORMAT ]
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring_buffer: compressed event header · 334d4169
      Lai Jiangshan authored
      RB_MAX_SMALL_DATA = 28 bytes is too small for most tracers; it wastes
      a 'u32' to save the actual length for events whose data size is > 28.
      
      This fix uses a compressed event header and enlarges RB_MAX_SMALL_DATA.
      
      [ Impact: saves about 0%-12.5%(depends on tracer) memory in ring_buffer ]
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <49F13189.3090000@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: fix cut and paste macro error · c2518c43
      Steven Rostedt authored
      In case a module uses the TRACE_EVENT macro for creating automated
      events in ftrace, it may choose to use a different file name
      than the defined system name, or choose to use a different path than
      the default "include/trace/events" include path.
      
      If this is done, then before including trace/define_trace.h the
      header would define either "TRACE_INCLUDE_FILE" for the file
      name or "TRACE_INCLUDE_PATH" for the include path.
      
      If it does not define these, then define_trace.h defines them
      instead. If define_trace.h defines them, then it should also
      undefine them before exiting. To do this, a marker macro is used
      to note this:
      
       #ifndef TRACE_INCLUDE_FILE
       # define TRACE_INCLUDE_FILE TRACE_SYSTEM
       # define UNDEF_TRACE_INCLUDE_FILE
       #endif
      
      [...]
      
       #ifdef UNDEF_TRACE_INCLUDE_FILE
       # undef TRACE_INCLUDE_FILE
       # undef UNDEF_TRACE_INCLUDE_FILE
       #endif
      
      The UNDEF_TRACE_INCLUDE_FILE acts as a CPP variable to know to undef
      the TRACE_INCLUDE_FILE before leaving define_trace.h.
      
      Unfortunately, due to cut and paste errors, the macros between
      FILE and PATH got mixed up.
      
      [ Impact: undef TRACE_INCLUDE_FILE and/or TRACE_INCLUDE_PATH when needed ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • x86: use native register access for native tlb flushing · d7285c6b
      Chris Wright authored
      Currently these are paravirtualized, but it doesn't appear that any
      callers rely on this (no pv_ops backends are using native_tlb and
      overriding cr3/4 access).
      
      [ Impact: fix lockdep warning with paravirt and function tracer ]
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      LKML-Reference: <20090423172138.GR3036@sequoia.sous-sol.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: add size checks for exported ftrace internal structures · 75db37d2
      Steven Rostedt authored
      The events exported by TRACE_EVENT are automated and are guaranteed
      to be correct when used.
      
      The internal ftrace structures on the other hand are more manually
      exported. These require the ftrace maintainer to make sure they
      are up to date.
      
      This patch adds a size check to help flag when a type changes in
      an internal ftrace data structure, and the update needs to be reflected
      in the export.
      
      If an export is incorrect, then the only harm is that the user space
      tools will not know how to correctly read the internal structures of
      ftrace.
      
      [ Impact: help prevent inconsistent ftrace format print outs ]
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing: increase size of number of possible events · 89ec0dee
      Steven Rostedt authored
      With the new event tracing registration, we must increase the number
      of events that can be registered. Currently the type field is only
      one byte, which leaves us only 256 possible events.
      
      Since we do not save the CPU number in the tracer anymore (it is determined
      by the per cpu ring buffer that is used) we have an extra byte to use.
      
      This patch increases the size of type from 1 byte (256 events) to
      2 bytes (65,536 events).
      
      It also adds a WARN_ON_ONCE if we exceed that limit.
      
      [ Impact: allow more than 255 events ]
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
    • tracing/wakeup: move access to wakeup_cpu into spinlock · 9be24414
      Steven Rostedt authored
      The code had the following outside the lock:
      
              if (next != wakeup_task)
                      return;
      
              pc = preempt_count();
      
              /* The task we are waiting for is waking up */
              data = wakeup_trace->data[wakeup_cpu];
      
      On initialization, wakeup_task is NULL and wakeup_cpu -1. This code
      is not under a lock. If wakeup_task is set on another CPU as that
      task is waking up, we can see the wakeup_task before wakeup_cpu is
      set. If we read wakeup_cpu while it is still -1 then we will have
      a bad data pointer.
      
      This patch moves the reading of wakeup_cpu within the protection of
      the spinlock used to protect the writing of wakeup_cpu and wakeup_task.
      
      [ Impact: remove possible race causing invalid pointer dereference ]
      Reported-by: Maneesh Soni <maneesh@in.ibm.com>
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
  9. 22 Apr, 2009 3 commits
    • tracing/events: protect __get_str() · 6a74aa40
      Frederic Weisbecker authored
      The __get_str() macro is used as part of code expressions, so its body
      should be wrapped in parentheses.
      
      [ Impact: make macro definition more robust ]
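
To see why the parentheses matter, compare two hypothetical stand-in macros (these are not the kernel's `__get_str()` definition). Without the outer parentheses, an operator applied to the macro binds to the cast pointer before the offset is added.

```c
/* Hypothetical stand-ins, differing only in the outer parentheses. */
#define TOY_GET_STR_BAD(entry, off)	(char *)(entry) + (off)
#define TOY_GET_STR(entry, off)		((char *)(entry) + (off))

/* Expands to *(char *)(entry) + (off): first byte plus off, not byte off. */
static char toy_char_at_bad(void *entry, int off)
{
	return (char)(*TOY_GET_STR_BAD(entry, off));
}

/* Expands to *((char *)(entry) + (off)): the byte at offset off. */
static char toy_char_at(void *entry, int off)
{
	return *TOY_GET_STR(entry, off);
}
```

For a buffer holding "hello" and an offset of 2, the protected macro yields 'l', while the unprotected one yields 'h' + 2, which is 'j'.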
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • tracing/lock: provide lock_acquired event support for dynamic size string · 7e7ca9a2
      Frederic Weisbecker authored
      Now that we can support dynamically sized strings, make the lock tracing
      able to use them, making it safe against module removal and consuming
      only the amount of memory needed for each lock name.
      
      Changes in v2:
      adapt to the __ending_string() updates and the opening_string() removal.
      
      [ Impact: protect lock tracer against module removal ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
    • tracing/events: provide string with undefined size support · 9cbf1176
      Frederic Weisbecker authored
      This patch provides the support for dynamic size strings on
      event tracing.
      
      The key concept is to use a structure with an ending char array field of
      undefined size and use such ability to allocate the minimal size on the
      ring buffer to make one or more string entries fit inside, as opposed
      to fixed-length strings with an upper bound.
      
      The strings themselves are represented using fields which have an offset
      value from the beginning of the entry.
      
      This patch provides three new macros:
      
      __string(item, src)
      
      This one declares a string to the structure inside TP_STRUCT__entry.
      You need to provide the name of the string field and the source that will
      be copied inside.
      This will also add the dynamic size of the string needed for the ring
      buffer entry allocation.
      A stack allocated structure is used to temporarily store the offset
      of each string, avoiding double calls to strlen() on each event
      insertion.
      
      __get_str(field)
      
      This one will give you a pointer to the string you have created. This
      is an abstract helper to resolve the absolute address given the field
      name which is a relative address from the beginning of the trace_structure.
      
      __assign_str(dst, src)
      
      Use this macro to automatically perform the string copy from src to
      dst. src must be a variable to assign and dst is the name of a __string
      field.
      
      Example on how to use it:
      
      TRACE_EVENT(my_event,
      	TP_PROTO(char *src1, char *src2),
      
      	TP_ARGS(src1, src2),
      	TP_STRUCT__entry(
      		__string(str1, src1)
      		__string(str2, src2)
      	),
      	TP_fast_assign(
      		__assign_str(str1, src1);
      		__assign_str(str2, src2);
      	),
      	TP_printk("%s %s", __get_str(str1), __get_str(str2))
      )
      
      Of course you can mix-up any __field or __array inside this
      TRACE_EVENT. The position of the __string or __assign_str
      doesn't matter.
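
Outside the kernel, the offset-based layout can be sketched in plain C. This is an illustration under stated assumptions: `toy_entry`, `toy_entry_new` and `toy_get_str` are hypothetical, `malloc()` stands in for the ring buffer reservation, and the real macros generate the equivalent code per event.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry: one offset per string, then a trailing char array. */
struct toy_entry {
	unsigned short str1_off;	/* offset of str1 from the start of the entry */
	unsigned short str2_off;	/* offset of str2 from the start of the entry */
	char data[];			/* string bytes, sized at allocation time */
};

/* Allocate exactly enough room for both strings and record their offsets. */
static struct toy_entry *toy_entry_new(const char *src1, const char *src2)
{
	size_t len1 = strlen(src1) + 1;	/* each length is computed once */
	size_t len2 = strlen(src2) + 1;
	struct toy_entry *e = malloc(sizeof(*e) + len1 + len2);

	if (!e)
		return NULL;

	e->str1_off = (unsigned short)offsetof(struct toy_entry, data);
	e->str2_off = (unsigned short)(e->str1_off + len1);
	memcpy((char *)e + e->str1_off, src1, len1);
	memcpy((char *)e + e->str2_off, src2, len2);
	return e;
}

/* Resolve a field name to an absolute address, in the spirit of __get_str(). */
#define toy_get_str(e, field)	((char *)(e) + (e)->field##_off)
```

Because only offsets are stored, the entry stays valid if it is copied to another address, which an absolute pointer would not survive.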
      
      Changes in v2:
      
      Address the suggestion of Steven Rostedt: drop the opening_string() macro
      and redefine __ending_string() to get the size of the string to be copied
      instead of overwriting the whole ring buffer allocation.
      
      Changes in v3:
      
      Address other suggestions of Steven Rostedt and Peter Zijlstra with
      some changes: drop the __ending_string and the need to have only one
      string field.
      Use offsets instead of absolute addresses.
      
      [ Impact: allow more compact memory usage for string tracing ]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>