    ring-buffer: Zero out time extend if it is nested and not absolute · 097350d1
    Steven Rostedt (VMware) authored
    Currently the ring buffer gives a delta of zero to any event that happens in
    an interrupt that preempts another event. (Hopefully we can change this
    soon.) This is to deal with the races of lockless, nesting writers updating
    deltas against a global counter.
    
    With the addition of absolute time stamps, the time extend stopped following
    this rule. A time extend happens when two events are more than 2^27
    nanoseconds apart (about 134 milliseconds), as the delta time field in each
    event is only 27 bits. When that happens, a time extend is injected that has
    59 bits of nanoseconds to use (about 18 years). But if an interrupt triggers
    while the event that follows such a gap is being written, the interrupt's
    event sees the same 2^27 difference and injects a time extend of its own. A
    recent change made the time extend logic not take the nesting into account,
    so two time extend deltas can be recorded, moving the time stamp much
    further ahead than the current time. This all gets reset when the ring
    buffer moves to the next page, but that can cause time to appear to go
    backwards.
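
The effect described above can be sketched with a small model. This is not the kernel code: every name below (`MODEL_DELTA_BITS`, `model_needs_time_extend()`, `model_replay()` and so on) is hypothetical, and the model assumes a reader that simply adds each time extend delta it finds to a running time stamp.

```c
#include <stdint.h>

#define MODEL_DELTA_BITS   27   /* width of the per-event delta field */
#define MODEL_EXTEND_BITS  59   /* width available to a time extend   */
#define MODEL_DELTA_MAX    ((1ULL << MODEL_DELTA_BITS) - 1)

/* A delta that does not fit in 27 bits needs a time extend. */
int model_needs_time_extend(uint64_t delta_ns)
{
    return delta_ns > MODEL_DELTA_MAX;
}

/* Whole 365-day years covered by a 59-bit nanosecond counter. */
uint64_t model_extend_range_years(void)
{
    const uint64_t ns_per_year = 365ULL * 24 * 60 * 60 * 1000000000ULL;

    return (1ULL << MODEL_EXTEND_BITS) / ns_per_year;
}

/*
 * Replay what a reader would compute when an outer event and one nested
 * (interrupt) event both observe the same gap since the last event.
 *
 * zero_nested == 0: both writers record the full gap (the reported bug).
 * zero_nested != 0: the nested writer records a delta of zero.
 */
uint64_t model_replay(uint64_t last_ns, uint64_t gap_ns, int zero_nested)
{
    uint64_t ts = last_ns;

    if (!model_needs_time_extend(gap_ns))
        return ts + gap_ns;           /* fits in the event, nested delta is 0 */

    ts += gap_ns;                     /* outer event's time extend  */
    ts += zero_nested ? 0 : gap_ns;   /* nested event's time extend */
    return ts;
}
```

With a gap of 306454770 ns, `model_replay(last, gap, 0)` lands a full extra gap ahead of `model_replay(last, gap, 1)`, which is the kind of overshoot that is only corrected when the next page supplies a fresh absolute time stamp.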
    
    This was observed in a trace-cmd recording, and since the data is saved in a
    file, with trace-cmd report --debug, it was possible to see that this indeed
    did happen!
    
      bash-52501   110d... 81778.908247: sched_switch:         bash:52501 [120] S ==> swapper/110:0 [120] [12770284:0x2e8:64]
      <idle>-0     110d... 81778.908757: sched_switch:         swapper/110:0 [120] R ==> bash:52501 [120] [509947:0x32c:64]
     TIME EXTEND: delta:306454770 length:0
      bash-52501   110.... 81779.215212: sched_swap_numa:      src_pid=52501 src_tgid=52388 src_ngid=52501 src_cpu=110 src_nid=2 dst_pid=52509 dst_tgid=52388 dst_ngid=52501 dst_cpu=49 dst_nid=1 [0:0x378:48]
     TIME EXTEND: delta:306458165 length:0
      bash-52501   110dNh. 81779.521670: sched_wakeup:         migration/110:565 [0] success=1 CPU:110 [0:0x3b4:40]
    
    and at the next page, caused the time to go backwards:
    
      bash-52504   110d... 81779.685411: sched_switch:         bash:52504 [120] S ==> swapper/110:0 [120] [8347057:0xfb4:64]
    CPU:110 [SUBBUFFER START] [81779379165886:0x1320000]
      <idle>-0     110dN.. 81779.379166: sched_wakeup:         bash:52504 [120] success=1 CPU:110 [0:0x10:40]
      <idle>-0     110d... 81779.379167: sched_switch:         swapper/110:0 [120] R ==> bash:52504 [120] [1168:0x3c:64]
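
As a rough cross-check of the output above, the size of the backwards step can be computed from the last event of the old page and the time stamp in the SUBBUFFER START line. The helper name is hypothetical, and the event time is taken as printed, so it is only accurate to the microsecond.

```c
#include <stdint.h>

/*
 * How far the last event of the old page sits ahead of the new page's
 * starting time stamp. A positive result means time appears to go
 * backwards at the page boundary.
 */
int64_t model_backward_jump_ns(uint64_t last_event_ns, uint64_t page_start_ns)
{
    return (int64_t)(last_event_ns - page_start_ns);
}
```

Calling `model_backward_jump_ns(81779685411000ULL, 81779379165886ULL)` gives 306245114 ns. That is close to the second time extend delta in the trace (306458165 ns), which fits the reading that one of the two time extends was counted in error, although this interpretation is an inference from the numbers rather than something the trace states directly.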
    
    Link: https://lkml.kernel.org/r/20200622151815.345d1bf5@oasis.local.home
    
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Cc: stable@vger.kernel.org
    Fixes: dc4e2801 ("ring-buffer: Redefine the unimplemented RINGBUF_TYPE_TIME_STAMP")
    Reported-by: Julia Lawall <julia.lawall@inria.fr>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>