1. 25 Nov, 2009 2 commits
    • tracing: Separate raw syscall from syscall tracer · b8007ef7
      Lai Jiangshan authored
      The current syscall tracer mixes raw syscalls and real syscalls.
      
      echo 1 > events/syscalls/enable
      And we get these from the output:
      
      (XXXX stands in for the "grep-20914 [001] 588211.446347" style prefix in the lines below)
      
      XXXX: sys_read(fd: 3, buf: 80609a8, count: 7000)
      XXXX: sys_enter: NR 3 (3, 80609a8, 7000, a, 1000, bfce8ef8)
      XXXX: sys_read -> 0x138
      XXXX: sys_exit: NR 3 = 312
      XXXX: sys_read(fd: 3, buf: 8060ae0, count: 7000)
      XXXX: sys_enter: NR 3 (3, 8060ae0, 7000, a, 1000, bfce8ef8)
      XXXX: sys_read -> 0x138
      XXXX: sys_exit: NR 3 = 312
      
      There are two drawbacks here:
      A) Two almost identical records are saved in the ring buffer
         when a syscall enters or exits (four records for every syscall).
         This wastes precious space in the ring buffer.
      B) The lines containing "sys_enter/sys_exit" produce
         hardly any useful information in the output (no labels).
      
      The user can work around these drawbacks with:
      echo 1 > events/syscalls/enable
      echo 0 > events/syscalls/sys_enter/enable
      echo 0 > events/syscalls/sys_exit/enable
      
      But this is not user friendly, so we separate the raw syscall
      events from the syscall tracer.
      
      After this fix is applied:
      syscall tracer's output (echo 1 > events/syscalls/enable):
      
      XXXX: sys_read(fd: 3, buf: bfe87d88, count: 200)
      XXXX: sys_read -> 0x200
      XXXX: sys_fstat64(fd: 3, statbuf: bfe87c98)
      XXXX: sys_fstat64 -> 0x0
      XXXX: sys_close(fd: 3)
      
      raw syscall tracer's output (echo 1 > events/raw_syscalls/enable):
      
      XXXX: sys_enter: NR 175 (0, bf92bf18, bf92bf98, 8, b748cff4, bf92bef8)
      XXXX: sys_exit: NR 175 = 0
      XXXX: sys_enter: NR 175 (2, bf92bf98, 0, 8, b748cff4, bf92bef8)
      XXXX: sys_exit: NR 175 = 0
      XXXX: sys_enter: NR 3 (9, bf927f9c, 4000, b77e2518, b77dce60, bf92bff8)
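      
      The split works by giving the raw sys_enter/sys_exit tracepoints their
      own event system, so they show up under events/raw_syscalls/ instead of
      events/syscalls/. A minimal sketch of the idea (the header layout and
      TP_ fields below are illustrative, not the literal patch):
      
      /* include/trace/events/syscalls.h (sketch) */
      #undef TRACE_SYSTEM
      #define TRACE_SYSTEM raw_syscalls       /* these events get their own system */
      
      #include <linux/tracepoint.h>
      #include <asm/syscall.h>                /* syscall_get_arguments() */
      
      /*
       * With TRACE_SYSTEM set to raw_syscalls, sys_enter/sys_exit are
       * created under events/raw_syscalls/, leaving events/syscalls/
       * to the labeled per-syscall events (sys_read, sys_close, ...).
       */
      TRACE_EVENT(sys_enter,
              TP_PROTO(struct pt_regs *regs, long id),
              TP_ARGS(regs, id),
              TP_STRUCT__entry(
                      __field(long, id)
                      __array(unsigned long, args, 6)
              ),
              TP_fast_assign(
                      __entry->id = id;
                      syscall_get_arguments(current, regs, 0, 6, __entry->args);
              ),
              TP_printk("NR %ld (%lx, %lx, %lx, %lx, %lx, %lx)",
                        __entry->id,
                        __entry->args[0], __entry->args[1], __entry->args[2],
                        __entry->args[3], __entry->args[4], __entry->args[5])
      );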
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4AEFC37C.5080609@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer-benchmark: Add parameters to set producer/consumer priorities · 7ac07434
      Steven Rostedt authored
      Running the ring-buffer-benchmark's threads at the lowest priority may
      work well for keeping it in the background, but it is not appropriate
      for the benchmarks.
      
      This patch adds 4 parameters to the module:
      
        consumer_fifo
        consumer_nice
        producer_fifo
        producer_nice
      
      By default the consumer and producer still run at nice +19.
      
      If the *_fifo options are set, they will override the *_nice values.
      
       modprobe ring_buffer_benchmark consumer_nice=0 producer_fifo=10
      
      The above will set the consumer thread to a nice value of 0, and
      the producer thread to a RT SCHED_FIFO priority of 10.
      
      Note: this patch also fixes a bug where calling set_user_nice() on the
      consumer thread would oops the kernel when the "disable_reader"
      parameter is set (in that case no consumer thread exists).
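      
      A minimal sketch of the parameter handling and the NULL-consumer guard
      (the set_priorities()/set_thread_prio() helpers and the producer/consumer
      task pointers are illustrative assumptions, not the literal patch):
      
      static int producer_nice = 19;  /* default: stay in the background */
      static int consumer_nice = 19;
      static int producer_fifo = -1;  /* -1 means unset: fall back to *_nice */
      static int consumer_fifo = -1;
      
      module_param(producer_nice, int, 0644);
      module_param(consumer_nice, int, 0644);
      module_param(producer_fifo, int, 0644);
      module_param(consumer_fifo, int, 0644);
      
      static void set_thread_prio(struct task_struct *task, int fifo, int nice)
      {
              /* A *_fifo setting overrides the corresponding *_nice value. */
              if (fifo >= 0) {
                      struct sched_param param = { .sched_priority = fifo };
      
                      sched_setscheduler(task, SCHED_FIFO, &param);
              } else {
                      set_user_nice(task, nice);
              }
      }
      
      static void set_priorities(void)
      {
              set_thread_prio(producer, producer_fifo, producer_nice);
              /*
               * With disable_reader set there is no consumer thread at
               * all; unconditionally calling set_user_nice() on the NULL
               * task pointer is what oopsed the kernel.
               */
              if (consumer)
                      set_thread_prio(consumer, consumer_fifo, consumer_nice);
      }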
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 23 Nov, 2009 2 commits
    • tracing, function tracer: Clean up strstrip() usage · 457dc928
      Ingo Molnar authored
      Clean up strstrip() usage - which also addresses this build warning:
      
        kernel/trace/ftrace.c: In function 'ftrace_pid_write':
        kernel/trace/ftrace.c:3004: warning: ignoring return value of 'strstrip', declared with attribute warn_unused_result
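      
      strstrip() returns a pointer past any leading whitespace, so the clean-up
      is to use that return value rather than discarding it. A sketch of the
      pattern (the surrounding ftrace_pid_write() buffer handling, and the use
      of strict_strtol() for parsing, are assumptions for illustration):
      
      char buf[64];
      long val;
      int ret;
      
      /* copy_from_user() and NUL termination elided */
      
      /*
       * Before: a bare strstrip(buf); discarded the returned pointer,
       * which is exactly what warn_unused_result complains about.
       * Feed the stripped string straight into the parser instead:
       */
      ret = strict_strtol(strstrip(buf), 10, &val);
      if (ret < 0)
              return ret;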
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer benchmark: Run producer/consumer threads at nice +19 · 98e4833b
      Ingo Molnar authored
      The ring-buffer benchmark threads run at nice 0 by default, using
      up a lot of CPU time and slowing down the system:
      
         PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        1024 root      20   0     0    0    0 D 95.3  0.0   4:01.67 rb_producer
        1023 root      20   0     0    0    0 R 93.5  0.0   2:54.33 rb_consumer
       21569 mingo     40   0 14852 1048  772 R  3.6  0.1   0:00.05 top
           1 root      40   0  4080  928  668 S  0.0  0.0   0:23.98 init
      
      Renice them to +19 to make them less intrusive.
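      
      In-kernel, the renice is one set_user_nice() call per thread; a sketch,
      assuming the benchmark's producer and consumer task pointers:
      
      /* Weakest timeshare level: the threads still make progress but
       * no longer crowd out interactive tasks (see the top output). */
      set_user_nice(producer, 19);
      set_user_nice(consumer, 19);    /* oopses if disable_reader left
                                         consumer NULL; fixed by the
                                         25 Nov commit above */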
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 19 Nov, 2009 1 commit
  4. 17 Nov, 2009 3 commits
  5. 12 Nov, 2009 2 commits
    • tracing: do not disable interrupts for trace_clock_local · 8b2a5dac
      Steven Rostedt authored
      Disabling interrupts in trace_clock_local() imposes quite a performance
      hit on the recording of traces. Using perf top we see:
      
      ------------------------------------------------------------------------------
         PerfTop:     244 irqs/sec  kernel:100.0% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   2842.00 - 40.4% : trace_clock_local
                   1043.00 - 14.8% : rb_reserve_next_event
                    784.00 - 11.1% : ring_buffer_lock_reserve
                    600.00 -  8.5% : __rb_reserve_next
                    579.00 -  8.2% : rb_end_commit
                    440.00 -  6.3% : ring_buffer_unlock_commit
                    290.00 -  4.1% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    155.00 -  2.2% : debug_smp_processor_id
                    117.00 -  1.7% : trace_recursive_unlock
                    103.00 -  1.5% : ring_buffer_event_data
                     28.00 -  0.4% : do_gettimeofday
                     22.00 -  0.3% : _spin_unlock_irq
                     14.00 -  0.2% : native_read_tsc
                     11.00 -  0.2% : getnstimeofday
      
      Here trace_clock_local accounts for 40% of the tracing time, and the
      time to record a trace according to ring_buffer_benchmark is 210ns.
      After converting the interrupt disabling to preemption disabling,
      perf top shows:
      
      ------------------------------------------------------------------------------
         PerfTop:    1084 irqs/sec  kernel:99.9% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   1277.00 - 16.8% : native_read_tsc
                   1148.00 - 15.1% : rb_reserve_next_event
                    896.00 - 11.8% : ring_buffer_lock_reserve
                    688.00 -  9.1% : __rb_reserve_next
                    664.00 -  8.8% : rb_end_commit
                    563.00 -  7.4% : ring_buffer_unlock_commit
                    508.00 -  6.7% : _spin_unlock_irq
                    365.00 -  4.8% : debug_smp_processor_id
                    321.00 -  4.2% : trace_clock_local
                    303.00 -  4.0% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    273.00 -  3.6% : native_sched_clock
                    122.00 -  1.6% : trace_recursive_unlock
                    113.00 -  1.5% : sched_clock
                    101.00 -  1.3% : ring_buffer_event_data
                     53.00 -  0.7% : tick_nohz_stop_sched_tick
      
      Now trace_clock_local drops from 40% to only 4% of the total time,
      and the time to record a trace goes from 210ns down to 179ns
      (a 31ns improvement).
      
      I talked with Peter Zijlstra about the impact that sched_clock may have
      without having interrupts disabled, and he told me that if a timer interrupt
      comes in, sched_clock may report a wrong time.
      
      Balancing a rarely incorrect timestamp against a ~15% performance
      boost, I'll take the performance boost.
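      
      A sketch of the conversion in trace_clock_local() (the notrace preempt
      variants keep the clock read itself from being traced; the exact body
      is illustrative):
      
      u64 notrace trace_clock_local(void)
      {
              u64 clock;
      
              /*
               * Before: local_irq_save()/local_irq_restore() bracketed
               * the sched_clock() read. Disabling only preemption is much
               * cheaper; the trade-off, as noted above, is that an
               * interrupt landing inside the read can make sched_clock()
               * report a slightly wrong time.
               */
              preempt_disable_notrace();
              clock = sched_clock();
              preempt_enable_notrace();
      
              return clock;
      }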
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: Add multiple iterations between benchmark timestamps · a6f0eb6a
      Steven Rostedt authored
      The ring_buffer_benchmark does a gettimeofday after every write to the
      ring buffer in its measurements. This adds the overhead of the call
      to gettimeofday to the measurements and does not give an accurate picture
      of the length of time it takes to record a trace.
      
      This was first noticed with perf top:
      
      ------------------------------------------------------------------------------
         PerfTop:     679 irqs/sec  kernel:99.9% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   1673.00 - 27.8% : trace_clock_local
                    806.00 - 13.4% : do_gettimeofday
                    590.00 -  9.8% : rb_reserve_next_event
                    554.00 -  9.2% : native_read_tsc
                    431.00 -  7.2% : ring_buffer_lock_reserve
                    365.00 -  6.1% : __rb_reserve_next
                    355.00 -  5.9% : rb_end_commit
                    322.00 -  5.4% : getnstimeofday
                    268.00 -  4.5% : ring_buffer_unlock_commit
                    262.00 -  4.4% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    113.00 -  1.9% : read_tsc
                     91.00 -  1.5% : debug_smp_processor_id
                     69.00 -  1.1% : trace_recursive_unlock
                     66.00 -  1.1% : ring_buffer_event_data
                     25.00 -  0.4% : _spin_unlock_irq
      
      With this, each write to the ring buffer measured at 310ns.
      
      This patch adds a new module parameter called "write_interval", which
      defaults to 50. This is the number of writes performed between
      timestamps. After this patch, perf top shows:
      
      ------------------------------------------------------------------------------
         PerfTop:     244 irqs/sec  kernel:100.0% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   2842.00 - 40.4% : trace_clock_local
                   1043.00 - 14.8% : rb_reserve_next_event
                    784.00 - 11.1% : ring_buffer_lock_reserve
                    600.00 -  8.5% : __rb_reserve_next
                    579.00 -  8.2% : rb_end_commit
                    440.00 -  6.3% : ring_buffer_unlock_commit
                    290.00 -  4.1% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    155.00 -  2.2% : debug_smp_processor_id
                    117.00 -  1.7% : trace_recursive_unlock
                    103.00 -  1.5% : ring_buffer_event_data
                     28.00 -  0.4% : do_gettimeofday
                     22.00 -  0.3% : _spin_unlock_irq
                     14.00 -  0.2% : native_read_tsc
                     11.00 -  0.2% : getnstimeofday
      
      do_gettimeofday dropped from 13% usage to a mere 0.4% (using the
      default interval of 50), and the measured time per write went from
      310ns to 210ns. That's 100ns (roughly a third) of overhead that the
      gettimeofday call was introducing.
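      
      A sketch of the amortization (the parameter name follows this message;
      the locals start_tv, end_tv, i and the buffer pointer, plus the
      reserve/commit loop body, are assumptions for illustration):
      
      static unsigned int write_interval = 50;
      module_param(write_interval, uint, 0644);
      MODULE_PARM_DESC(write_interval,
                       "Number of ring buffer writes between timestamp pairs");
      
      /* In the producer loop: one gettimeofday pair per write_interval
       * writes, so the timestamp overhead is spread over many events. */
      do_gettimeofday(&start_tv);
      for (i = 0; i < write_interval; i++) {
              struct ring_buffer_event *event;
      
              event = ring_buffer_lock_reserve(buffer, sizeof(u64));
              if (event) {
                      *(u64 *)ring_buffer_event_data(event) = i;
                      ring_buffer_unlock_commit(buffer, event);
              }
      }
      do_gettimeofday(&end_tv);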
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  6. 02 Nov, 2009 3 commits
  7. 29 Oct, 2009 9 commits
  8. 24 Oct, 2009 4 commits
  9. 22 Oct, 2009 8 commits
  10. 21 Oct, 2009 6 commits