• Steven Rostedt (VMware)'s avatar
    tracing: Restructure trace_clock_global() to never block · aafe104a
    Steven Rostedt (VMware) authored
    It was reported that a fix to the ring buffer recursion detection would
    cause a hung machine when performing suspend / resume testing. The
    following backtrace was extracted from debugging that case:
    
    Call Trace:
     trace_clock_global+0x91/0xa0
     __rb_reserve_next+0x237/0x460
     ring_buffer_lock_reserve+0x12a/0x3f0
     trace_buffer_lock_reserve+0x10/0x50
     __trace_graph_return+0x1f/0x80
     trace_graph_return+0xb7/0xf0
     ? trace_clock_global+0x91/0xa0
     ftrace_return_to_handler+0x8b/0xf0
     ? pv_hash+0xa0/0xa0
     return_to_handler+0x15/0x30
     ? ftrace_graph_caller+0xa0/0xa0
     ? trace_clock_global+0x91/0xa0
     ? __rb_reserve_next+0x237/0x460
     ? ring_buffer_lock_reserve+0x12a/0x3f0
     ? trace_event_buffer_lock_reserve+0x3c/0x120
     ? trace_event_buffer_reserve+0x6b/0xc0
     ? trace_event_raw_event_device_pm_callback_start+0x125/0x2d0
     ? dpm_run_callback+0x3b/0xc0
     ? pm_ops_is_empty+0x50/0x50
     ? platform_get_irq_byname_optional+0x90/0x90
     ? trace_device_pm_callback_start+0x82/0xd0
     ? dpm_run_callback+0x49/0xc0
    
    With the following RIP:
    
    RIP: 0010:native_queued_spin_lock_slowpath+0x69/0x200
    
    Since the fix to the recursion detection would allow a single recursion to
    happen while tracing, this lead to the trace_clock_global() taking a spin
    lock and then trying to take it again:
    
    ring_buffer_lock_reserve() {
      trace_clock_global() {
        arch_spin_lock() {
          queued_spin_lock_slowpath() {
            /* lock taken */
            (something else gets traced by function graph tracer)
              ring_buffer_lock_reserve() {
                trace_clock_global() {
                  arch_spin_lock() {
                    queued_spin_lock_slowpath() {
                    /* DEAD LOCK! */
    
    Tracing should *never* block, as it can lead to strange lockups like the
    above.
    
    Restructure the trace_clock_global() code to instead of simply taking a
    lock to update the recorded "prev_time" simply use it, as two events
    happening on two different CPUs that calls this at the same time, really
    doesn't matter which one goes first. Use a trylock to grab the lock for
    updating the prev_time, and if it fails, simply try again the next time.
    If it failed to be taken, that means something else is already updating
    it.
    
    Link: https://lkml.kernel.org/r/20210430121758.650b6e8a@gandalf.local.home
    
    Cc: stable@vger.kernel.org
    Tested-by: default avatarKonstantin Kharlamov <hi-angel@yandex.ru>
    Tested-by: default avatarTodd Brandt <todd.e.brandt@linux.intel.com>
    Fixes: b02414c8 ("ring-buffer: Fix recursion protection transitions between interrupt context") # started showing the problem
    Fixes: 14131f2f ("tracing: implement trace_clock_*() APIs") # where the bug happened
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212761Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
    aafe104a
trace_clock.c 4.22 KB