1. 27 Feb, 2019 1 commit
  2. 09 Sep, 2018 1 commit
  3. 24 Aug, 2018 1 commit
  4. 09 Aug, 2018 1 commit
    • ring_buffer: tracing: Inherit the tracing setting to next ring buffer · a26030a6
      Masami Hiramatsu authored
      commit 73c8d894 upstream.
      
      Maintain the tracing on/off setting of the ring_buffer when switching
      to the trace buffer snapshot.
      
      Taking a snapshot is done by swapping in the backup ring buffer
      (max_tr_buffer). But since the tracing on/off setting is stored per
      ring buffer, swapping the buffers can also change that setting. This
      causes a strange result like the one below:
      
        /sys/kernel/debug/tracing # cat tracing_on
        1
        /sys/kernel/debug/tracing # echo 0 > tracing_on
        /sys/kernel/debug/tracing # cat tracing_on
        0
        /sys/kernel/debug/tracing # echo 1 > snapshot
        /sys/kernel/debug/tracing # cat tracing_on
        1
        /sys/kernel/debug/tracing # echo 1 > snapshot
        /sys/kernel/debug/tracing # cat tracing_on
        0
      
      We don't touch tracing_on, but the snapshot changes the tracing_on
      setting each time. This is surprising, because the user doesn't know
      that each ring buffer stores its own tracing-enable state and that the
      snapshot is done by swapping ring buffers.
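
      The idea of the fix can be modelled with a small userspace sketch (the
      struct and function names below are invented for illustration and are
      not the kernel's): each ring buffer carries its own record-on flag, and
      the snapshot swap copies that flag to the buffer becoming current, so
      the user-visible tracing_on value survives the swap.

        #include <stdbool.h>
        #include <stdio.h>

        /* hypothetical model: each ring buffer remembers its own on/off state */
        struct ring_buffer { bool record_on; };

        struct trace_array {
            struct ring_buffer *trace_buffer;  /* the live buffer */
            struct ring_buffer *max_buffer;    /* the snapshot (backup) buffer */
        };

        static void take_snapshot(struct trace_array *tr)
        {
            struct ring_buffer *tmp = tr->trace_buffer;

            tr->trace_buffer = tr->max_buffer;
            tr->max_buffer = tmp;

            /* the fix: the buffer becoming live inherits the previous setting */
            tr->trace_buffer->record_on = tr->max_buffer->record_on;
        }

        int main(void)
        {
            struct ring_buffer live = { .record_on = false }; /* echo 0 > tracing_on */
            struct ring_buffer snap = { .record_on = true };
            struct trace_array tr = { .trace_buffer = &live, .max_buffer = &snap };

            take_snapshot(&tr);
            printf("tracing_on after snapshot: %d\n", tr.trace_buffer->record_on); /* 0 */
            return 0;
        }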
      
      Link: http://lkml.kernel.org/r/153149929558.11274.11730609978254724394.stgit@devbox
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Cc: Hiraku Toyooka <hiraku.toyooka@cybertrust.co.jp>
      Cc: stable@vger.kernel.org
      Fixes: debdd57f ("tracing: Make a snapshot feature available from userspace")
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      [ Updated commit log and comment in the code ]
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  5. 02 Jan, 2018 3 commits
  6. 20 Dec, 2017 1 commit
  7. 05 Oct, 2017 2 commits
  8. 27 Sep, 2017 2 commits
  9. 30 Aug, 2017 1 commit
  10. 27 Jul, 2017 1 commit
    • tracing: Fix kmemleak in instance_rmdir · 919e4811
      Chunyu Hu authored
      commit db9108e0 upstream.
      
      Hit a kmemleak warning when executing instance_rmdir: it forgot to
      release the memory of tracing_cpumask. With this fix, the warning does
      not appear any more.
      
      unreferenced object 0xffff93a8dfaa7c18 (size 8):
        comm "mkdir", pid 1436, jiffies 4294763622 (age 9134.308s)
        hex dump (first 8 bytes):
          ff ff ff ff ff ff ff ff                          ........
        backtrace:
          [<ffffffff88b6567a>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff8861ea41>] __kmalloc_node+0xf1/0x280
          [<ffffffff88b505d3>] alloc_cpumask_var_node+0x23/0x30
          [<ffffffff88b5060e>] alloc_cpumask_var+0xe/0x10
          [<ffffffff88571ab0>] instance_mkdir+0x90/0x240
          [<ffffffff886e5100>] tracefs_syscall_mkdir+0x40/0x70
          [<ffffffff886565c9>] vfs_mkdir+0x109/0x1b0
          [<ffffffff8865b1d0>] SyS_mkdir+0xd0/0x100
          [<ffffffff88403857>] do_syscall_64+0x67/0x150
          [<ffffffff88b710e7>] return_from_SYSCALL_64+0x0/0x6a
          [<ffffffffffffffff>] 0xffffffffffffffff
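
      A rough userspace model of the leak pattern (all names here are
      illustrative stand-ins, not the kernel code): whatever the create path
      allocates for the instance, including the cpumask obtained via
      alloc_cpumask_var(), has to be released again in the remove path, which
      is what was missing.

        #include <stdlib.h>

        struct instance {
            unsigned long *tracing_cpumask;  /* allocated at mkdir time */
        };

        static struct instance *instance_create(void)
        {
            struct instance *tr = malloc(sizeof(*tr));

            if (tr)
                tr->tracing_cpumask = calloc(1, sizeof(unsigned long));
            return tr;
        }

        static void instance_remove(struct instance *tr)
        {
            if (!tr)
                return;
            free(tr->tracing_cpumask);  /* the missing free that kmemleak reported */
            free(tr);
        }

        int main(void)
        {
            instance_remove(instance_create());
            return 0;
        }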
      
      Link: http://lkml.kernel.org/r/1500546969-12594-1-git-send-email-chuhu@redhat.com
      
      Fixes: ccfe9e42 ("tracing: Make tracing_cpumask available for all instances")
      Signed-off-by: Chunyu Hu <chuhu@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  11. 21 Jul, 2017 1 commit
  12. 27 Apr, 2017 1 commit
  13. 21 Apr, 2017 1 commit
    • ftrace: Fix function pid filter on instances · 7da0f8e5
      Namhyung Kim authored
      commit d879d0b8 upstream.
      
      When the function tracer has a pid filter, it adds a probe to sched_switch
      to track whether the current task can be ignored.  The probe checks
      ftrace_ignore_pid from the current tr to filter tasks.  But it fails to
      delete the probe when removing an instance, which can cause a crash due
      to the invalid tr pointer (use-after-free).
      
      This is easily reproducible with the following:
      
        # cd /sys/kernel/debug/tracing
        # mkdir instances/buggy
        # echo $$ > instances/buggy/set_ftrace_pid
        # rmdir instances/buggy
      
        ============================================================================
        BUG: KASAN: use-after-free in ftrace_filter_pid_sched_switch_probe+0x3d/0x90
        Read of size 8 by task kworker/0:1/17
        CPU: 0 PID: 17 Comm: kworker/0:1 Tainted: G    B           4.11.0-rc3  #198
        Call Trace:
         dump_stack+0x68/0x9f
         kasan_object_err+0x21/0x70
         kasan_report.part.1+0x22b/0x500
         ? ftrace_filter_pid_sched_switch_probe+0x3d/0x90
         kasan_report+0x25/0x30
         __asan_load8+0x5e/0x70
         ftrace_filter_pid_sched_switch_probe+0x3d/0x90
         ? fpid_start+0x130/0x130
         __schedule+0x571/0xce0
         ...
      
      To fix it, use ftrace_clear_pids() to unregister the probe.  As
      instance_rmdir() has already updated the ftrace code, it can just free
      the filter safely.
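
      The shape of the bug can be sketched in plain C (illustrative names
      only; unregister_probe() below merely stands in for the role of
      ftrace_clear_pids()): a callback registered on behalf of an instance
      keeps a pointer to that instance, so tearing the instance down without
      unregistering the callback first leaves a dangling pointer for the next
      invocation to dereference. The fix is to unregister before freeing.

        #include <stdio.h>
        #include <stdlib.h>

        struct instance { int filtered_pid; };

        /* stand-in for the sched_switch probe and its private data */
        static void (*probe_fn)(struct instance *);
        static struct instance *probe_data;

        static void pid_filter_probe(struct instance *tr)
        {
            printf("filtering on pid %d\n", tr->filtered_pid);  /* UAF if tr was freed */
        }

        static void register_probe(struct instance *tr)
        {
            probe_fn = pid_filter_probe;
            probe_data = tr;
        }

        static void unregister_probe(void)
        {
            probe_fn = NULL;
            probe_data = NULL;
        }

        static void instance_rmdir(struct instance *tr)
        {
            unregister_probe();  /* the fix: drop the probe before freeing the instance */
            free(tr);
        }

        int main(void)
        {
            struct instance *tr = malloc(sizeof(*tr));

            tr->filtered_pid = 42;
            register_probe(tr);
            instance_rmdir(tr);
            if (probe_fn)          /* without the fix this would use freed memory */
                probe_fn(probe_data);
            return 0;
        }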
      
      Link: http://lkml.kernel.org/r/20170417024430.21194-2-namhyung@kernel.org
      
      Fixes: 0c8916c3 ("tracing: Add rmdir to remove multibuffer instances")
      Cc: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  14. 15 Mar, 2017 1 commit
    • fs: Better permission checking for submounts · d3381fab
      Eric W. Biederman authored
      commit 93faccbb upstream.
      
      To support unprivileged users mounting filesystems, two permission
      checks have to be performed: a test to see if the user is allowed to
      create a mount in the mount namespace, and a test to see if
      the user is allowed to access the specified filesystem.
      
      The automount case is special in that mounting the original filesystem
      grants permission to mount the sub-filesystems to any user who
      happens to stumble across their mountpoint and satisfies the
      ordinary filesystem permission checks.
      
      Attempting to handle the automount case by using override_creds
      almost works.  It preserves the idea that permission to mount
      the original filesystem is permission to mount the sub-filesystem.
      Unfortunately, using override_creds messes up the filesystem's
      ordinary permission checks.
      
      Solve this by being explicit that a mount is a submount by introducing
      vfs_submount, and using it where appropriate.
      
      vfs_submount uses a new mo...
  15. 25 Sep, 2016 2 commits
  16. 12 Sep, 2016 1 commit
  17. 02 Sep, 2016 1 commit
    • tracing: Added hardware latency tracer · e7c15cd8
      Steven Rostedt (Red Hat) authored
      
      The hardware latency tracer has been in the PREEMPT_RT patch for some time.
      It is used to detect possible SMIs or any other hardware interruptions that
      the kernel is unaware of. Note that NMIs may also be detected, which is
      useful to have recorded as well.
      
      The logic is pretty simple. It creates a thread that spins on a
      single CPU for a specified amount of time (width) within a periodic window
      (window). These numbers may be adjusted by their corresponding names in
      
         /sys/kernel/tracing/hwlat_detector/
      
      The defaults are window = 1000000 us (1 second)
                       width  =  500000 us (1/2 second)
      
      The loop consists of:
      
      	t1 = trace_clock_local();
      	t2 = trace_clock_local();
      
      Where trace_clock_local() is a variant of sched_clock().
      
      The difference t2 - t1 is recorded as the "inner" timestamp, and
      t1 - prev_t2 is recorded as the "outer" timestamp. If either of
      these differences is greater than the time denoted in
      /sys/kernel/tracing/tracing_thresh then it records the event.
      
      When this tracer is started, and tracing_thresh is zero, it changes to the
      default threshold of 10 us.
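
      The measurement loop is easy to model in userspace (a sketch only:
      clock_gettime() stands in for trace_clock_local(), the 10 us threshold
      and 0.5 s width are taken from the defaults above, and the window
      bookkeeping is omitted): take two back-to-back timestamps, treat
      t2 - t1 as the inner gap and t1 - prev_t2 as the outer gap, and report
      whenever either exceeds the threshold.

        #include <stdint.h>
        #include <stdio.h>
        #include <time.h>

        static uint64_t now_ns(void)   /* stand-in for trace_clock_local() */
        {
            struct timespec ts;

            clock_gettime(CLOCK_MONOTONIC, &ts);
            return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
        }

        int main(void)
        {
            const uint64_t thresh_ns = 10 * 1000;         /* 10 us default threshold */
            const uint64_t width_ns  = 500000ull * 1000;  /* spin for the 0.5 s width */
            uint64_t start = now_ns(), prev_t2 = 0;

            while (now_ns() - start < width_ns) {
                uint64_t t1 = now_ns();
                uint64_t t2 = now_ns();
                uint64_t inner = t2 - t1;                    /* gap inside one sample */
                uint64_t outer = prev_t2 ? t1 - prev_t2 : 0; /* gap between samples */

                /* note: the printf itself shows up as the next outer gap */
                if (inner > thresh_ns || outer > thresh_ns)
                    printf("latency: inner=%lluns outer=%lluns\n",
                           (unsigned long long)inner, (unsigned long long)outer);
                prev_t2 = t2;
            }
            return 0;
        }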
      
      The hwlat tracer in the PREEMPT_RT patch was originally written by
      Jon Masters. I have modified it quite a bit and turned it into a
      tracer.
      Based-on-code-by: Jon Masters <jcm@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  18. 23 Aug, 2016 1 commit
  19. 05 Jul, 2016 2 commits
  20. 23 Jun, 2016 1 commit
    • tracing: Skip more functions when doing stack tracing of events · be54f69c
      Steven Rostedt (Red Hat) authored
       # echo 1 > options/stacktrace
       # echo 1 > events/sched/sched_switch/enable
       # cat trace
                <idle>-0     [002] d..2  1982.525169: <stack trace>
       => save_stack_trace
       => __ftrace_trace_stack
       => trace_buffer_unlock_commit_regs
       => event_trigger_unlock_commit
       => trace_event_buffer_commit
       => trace_event_raw_event_sched_switch
       => __schedule
       => schedule
       => schedule_preempt_disabled
       => cpu_startup_entry
       => start_secondary
      
      The above shows that we are seeing 6 functions before ever making it to the
      caller of the sched_switch event.
      
       # echo stacktrace > events/sched/sched_switch/trigger
       # cat trace
                <idle>-0     [002] d..3  2146.335208: <stack trace>
       => trace_event_buffer_commit
       => trace_event_raw_event_sched_switch
       => __schedule
       => schedule
       => schedule_preempt_disabled
       => cpu_startup_entry
       => start_secondary
      
      The stacktrace trigger isn't as bad, because it adds its own skip to the
      stack tracing, but it still shows two extra functions.
      
      One issue is that if the stacktrace passes its own "regs" then there should
      be no addition to the skip, as the regs will not include the functions being
      called. This was an issue that was fixed by commit 7717c6be ("tracing:
      Fix stacktrace skip depth in trace_buffer_unlock_commit_regs()"), as adding
      the skip number for kprobes made the probes not have any stack at all.
      
      But since this is only an issue when regs is being used, a skip should be
      added if regs is NULL. Now we have:
      
       # echo 1 > options/stacktrace
       # echo 1 > events/sched/sched_switch/enable
       # cat trace
                <idle>-0     [000] d..2  1297.676333: <stack trace>
       => __schedule
       => schedule
       => schedule_preempt_disabled
       => cpu_startup_entry
       => rest_init
       => start_kernel
       => x86_64_start_reservations
       => x86_64_start_kernel
      
       # echo stacktrace > events/sched/sched_switch/trigger
       # cat trace
                <idle>-0     [002] d..3  1370.759745: <stack trace>
       => __schedule
       => schedule
       => schedule_preempt_disabled
       => cpu_startup_entry
       => start_secondary
      
      And kprobes are not touched.
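
      The skip count is simply "how many of the innermost frames belong to
      the tracing machinery rather than to the traced code". The same idea in
      a small glibc example (backtrace() here is only an illustration, not
      the kernel API; build with -O0 -rdynamic so frames and symbol names
      survive):

        #include <execinfo.h>
        #include <stdio.h>
        #include <stdlib.h>

        #define MAX_FRAMES 32

        /* print the current stack, skipping the innermost 'skip' frames */
        static void print_stack(int skip)
        {
            void *frames[MAX_FRAMES];
            int n = backtrace(frames, MAX_FRAMES);
            char **names = backtrace_symbols(frames, n);

            if (!names)
                return;
            for (int i = skip; i < n; i++)
                printf(" => %s\n", names[i]);
            free(names);
        }

        /* stand-in for tracing helpers that should not show up in the trace */
        static void trace_helper(void)
        {
            print_stack(2);   /* skip print_stack() and trace_helper() itself */
        }

        int main(void)
        {
            trace_helper();   /* only main() and below should be printed */
            return 0;
        }
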
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  21. 20 Jun, 2016 5 commits
  22. 03 May, 2016 1 commit
    • tracing: Use temp buffer when filtering events · 0fc1b09f
      Steven Rostedt (Red Hat) authored
      
      Filtering of events requires the data to be written to the ring buffer
      before it can be decided whether to filter it or not. This is because the parameters of
      the filter are based on the result that is written to the ring buffer and
      not on the parameters that are passed into the trace functions.
      
      The ftrace ring buffer is optimized for writing into the ring buffer and
      committing. The discard procedure, used when the filter decides the event
      should be discarded, is much more heavyweight. Thus, using a temporary
      buffer when filtering events can speed things up drastically.
      
      Without a temp buffer we have:
      
       # trace-cmd start -p nop
       # perf stat -r 10 hackbench 50
             0.790706626 seconds time elapsed ( +-  0.71% )
      
       # trace-cmd start -e all
       # perf stat -r 10 hackbench 50
             1.566904059 seconds time elapsed ( +-  0.27% )
      
       # trace-cmd start -e all -f 'common_preempt_count==20'
       # perf stat -r 10 hackbench 50
             1.690598511 seconds time elapsed ( +-  0.19% )
      
       # trace-cmd start -e all -f 'common_preempt_count!=20'
       # perf stat -r 10 hackbench 50
             1.707486364 seconds time elapsed ( +-  0.30% )
      
      The first run above is without any tracing, just to get a baseline figure.
      hackbench takes ~0.79 seconds to run on the system.
      
      The second run enables tracing all events where nothing is filtered. This
      increases the time by 100% and hackbench takes 1.57 seconds to run.
      
      The third run filters all events where the preempt count will equal "20"
      (this should never happen), so all events are discarded. This takes 1.69
      seconds to run. This is 10% slower than just committing the events!
      
      The last run enables all events and filters where the filter will commit all
      events, and this takes 1.70 seconds to run. The filtering overhead is
      approximately 10%. Thus, discarding and committing an event from the ring
      buffer may take about the same time.
      
      With this patch, the numbers change:
      
       # trace-cmd start -p nop
       # perf stat -r 10 hackbench 50
             0.778233033 seconds time elapsed ( +-  0.38% )
      
       # trace-cmd start -e all
       # perf stat -r 10 hackbench 50
             1.582102692 seconds time elapsed ( +-  0.28% )
      
       # trace-cmd start -e all -f 'common_preempt_count==20'
       # perf stat -r 10 hackbench 50
             1.309230710 seconds time elapsed ( +-  0.22% )
      
       # trace-cmd start -e all -f 'common_preempt_count!=20'
       # perf stat -r 10 hackbench 50
             1.786001924 seconds time elapsed ( +-  0.20% )
      
      The first run is again the base with no tracing.
      
      The second run is all tracing with no filtering. It is a little slower, but
      that may be well within the noise.
      
      The third run shows that discarding all events only took 1.3 seconds. This
      is a speed up of 23%! The discard is much faster than even the commit.
      
      The one downside is shown in the last run. Events that are not discarded by
      the filter take longer to add; this is due to the extra copy of the
      event.
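
      The mechanism can be sketched in a few lines of C (a model only; the
      event layout, the filter, and the buffer are all invented for
      illustration): the fully formed event is built in a temporary buffer
      first, the filter runs on that copy, and only events that pass are
      copied into the ring buffer, so a discarded event never touches the
      ring buffer at all.

        #include <stdbool.h>
        #include <stdio.h>
        #include <string.h>

        struct event { int common_preempt_count; int pid; };

        /* illustrative ring buffer: commit is just an append */
        static struct event ring[1024];
        static size_t ring_len;

        static bool filter_match(const struct event *e)
        {
            return e->common_preempt_count != 20;  /* the filter from the example above */
        }

        /* write into a temp buffer first; commit to the ring buffer only if it passes */
        static void trace_event(const struct event *fields)
        {
            struct event temp = *fields;   /* temp buffer holds the finished event */

            if (!filter_match(&temp))
                return;                    /* cheap discard: nothing hit the ring buffer */
            memcpy(&ring[ring_len++], &temp, sizeof(temp));  /* the extra copy on commit */
        }

        int main(void)
        {
            trace_event(&(struct event){ .common_preempt_count = 0,  .pid = 1 });
            trace_event(&(struct event){ .common_preempt_count = 20, .pid = 2 });
            printf("committed %zu of 2 events\n", ring_len);
            return 0;
        }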
      
      Cc: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  23. 29 Apr, 2016 5 commits
  24. 27 Apr, 2016 1 commit
  25. 26 Apr, 2016 2 commits