1. 25 Jun, 2021 11 commits
    • trace/osnoise: Support hotplug operations · c8895e27
      Daniel Bristot de Oliveira authored
      Enable and disable the osnoise/timerlat threads during CPU hotplug online
      and offline operations, respectively.
      
      Link: https://lore.kernel.org/linux-doc/20210621134636.5b332226@oasis.local.home/
      Link: https://lkml.kernel.org/r/39f98590b3caeb3c32f09526214058efe0e9272a.1624372313.git.bristot@redhat.com
      
      Cc: Phil Auld <pauld@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
      Cc: Clark Willaims <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • trace/hwlat: Support hotplug operations · ba998f7d
      Daniel Bristot de Oliveira authored
      Enable and disable the hwlat thread during CPU hotplug online
      and offline operations, respectively.
      
      Link: https://lore.kernel.org/linux-doc/20210621134636.5b332226@oasis.local.home/
      Link: https://lkml.kernel.org/r/52012d25ea35491a0f8088b947864d8df8e25157.1624372313.git.bristot@redhat.com
      
      Cc: Phil Auld <pauld@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
      Cc: Clark Willaims <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • trace/hwlat: Protect kdata->kthread with get/put_online_cpus · 039a602d
      Daniel Bristot de Oliveira authored
      In preparation for hotplug support, protect kdata->kthread
      with get/put_online_cpus() to avoid concurrency with hotplug
      operations.
      
      Link: https://lore.kernel.org/linux-doc/20210621134636.5b332226@oasis.local.home/
      Link: https://lkml.kernel.org/r/8bdb2a56f46abfd301d6fffbf43448380c09a6f5.1624372313.git.bristot@redhat.com
      
      Cc: Phil Auld <pauld@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
      Cc: Clark Willaims <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • trace: Add timerlat tracer · a955d7ea
      Daniel Bristot de Oliveira authored
      The timerlat tracer aims to help preemptive kernel developers
      find sources of wakeup latency in real-time threads. Like cyclictest,
      the tracer sets a periodic timer that wakes up a thread. The thread then
      computes a *wakeup latency* value as the difference between the *current
      time* and the *absolute time* at which the timer was set to expire. The main
      goal of timerlat is to trace in a way that helps kernel developers.
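
      As a rough user-space illustration of the latency computation described
      above (not the tracer's actual code), the sketch below subtracts the
      programmed absolute expiration time from the time observed at wakeup.
      The function names wakeup_latency_ns() and next_expiry_ns() are
      hypothetical and exist only for this example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: wakeup latency in nanoseconds, computed as the
 * difference between the time observed at wakeup (now_ns) and the
 * absolute time at which the timer was programmed to expire.
 */
int64_t wakeup_latency_ns(uint64_t now_ns, uint64_t expected_ns)
{
	return (int64_t)(now_ns - expected_ns);
}

/*
 * Hypothetical helper: a periodic timer keeps an absolute deadline and
 * advances it by one period after each activation, so latency does not
 * accumulate from one activation to the next.
 */
uint64_t next_expiry_ns(uint64_t expected_ns, uint64_t period_ns)
{
	return expected_ns + period_ns;
}
```

      With an expiration programmed at 1000000 ns and a wakeup observed at
      1000932 ns, wakeup_latency_ns() yields 932 ns.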
      
      Usage
      
      Write the ASCII text "timerlat" into the current_tracer file of the
      tracing system (generally mounted at /sys/kernel/tracing).
      
      For example:
      
              [root@f32 ~]# cd /sys/kernel/tracing/
              [root@f32 tracing]# echo timerlat > current_tracer
      
      It is possible to follow the trace by reading the trace file:
      
        [root@f32 tracing]# cat trace
        # tracer: timerlat
        #
        #                              _-----=> irqs-off
        #                             / _----=> need-resched
        #                            | / _---=> hardirq/softirq
        #                            || / _--=> preempt-depth
        #                            || /
        #                            ||||             ACTIVATION
        #         TASK-PID      CPU# ||||   TIMESTAMP    ID            CONTEXT                LATENCY
        #            | |         |   ||||      |         |                  |                       |
                <idle>-0       [000] d.h1    54.029328: #1     context    irq timer_latency       932 ns
                 <...>-867     [000] ....    54.029339: #1     context thread timer_latency     11700 ns
                <idle>-0       [001] dNh1    54.029346: #1     context    irq timer_latency      2833 ns
                 <...>-868     [001] ....    54.029353: #1     context thread timer_latency      9820 ns
                <idle>-0       [000] d.h1    54.030328: #2     context    irq timer_latency       769 ns
                 <...>-867     [000] ....    54.030330: #2     context thread timer_latency      3070 ns
                <idle>-0       [001] d.h1    54.030344: #2     context    irq timer_latency       935 ns
                 <...>-868     [001] ....    54.030347: #2     context thread timer_latency      4351 ns
      
      The tracer creates a per-cpu kernel thread with real-time priority that
      prints two lines at every activation. The first is the *timer latency*
      observed at the *hardirq* context before the activation of the thread.
      The second is the *timer latency* observed by the thread, which is the
      same level that cyclictest reports. The ACTIVATION ID field
      serves to relate the *irq* execution to its respective *thread* execution.
      
      The irq/thread split is important to clarify in which context
      an unexpectedly high value originates. The *irq* context can be
      delayed by hardware-related actions, such as SMIs, NMIs, and IRQs,
      or by a thread masking interrupts. Once the timer fires, the delay
      can also be influenced by blocking caused by threads, for example by
      postponing the scheduler execution via preempt_disable(), by the
      scheduler execution itself, or by masking interrupts. Threads can
      also be delayed by interference from other threads and IRQs.
      
      The timerlat tracer can also take advantage of the osnoise: trace events.
      For example:
      
              [root@f32 ~]# cd /sys/kernel/tracing/
              [root@f32 tracing]# echo timerlat > current_tracer
              [root@f32 tracing]# echo osnoise > set_event
              [root@f32 tracing]# echo 25 > osnoise/stop_tracing_total_us
              [root@f32 tracing]# tail -10 trace
                   cc1-87882   [005] d..h...   548.771078: #402268 context    irq timer_latency      1585 ns
                   cc1-87882   [005] dNLh1..   548.771082: irq_noise: local_timer:236 start 548.771077442 duration 4597 ns
                   cc1-87882   [005] dNLh2..   548.771083: irq_noise: reschedule:253 start 548.771083017 duration 56 ns
                   cc1-87882   [005] dNLh2..   548.771086: irq_noise: call_function_single:251 start 548.771083811 duration 2048 ns
                   cc1-87882   [005] dNLh2..   548.771088: irq_noise: call_function_single:251 start 548.771086814 duration 1495 ns
                   cc1-87882   [005] dNLh2..   548.771091: irq_noise: call_function_single:251 start 548.771089194 duration 1558 ns
                   cc1-87882   [005] dNLh2..   548.771094: irq_noise: call_function_single:251 start 548.771091719 duration 1932 ns
                   cc1-87882   [005] dNLh2..   548.771096: irq_noise: call_function_single:251 start 548.771094696 duration 1050 ns
                   cc1-87882   [005] d...3..   548.771101: thread_noise:      cc1:87882 start 548.771078243 duration 10909 ns
            timerlat/5-1035    [005] .......   548.771103: #402268 context thread timer_latency     25960 ns
      
      For further information see: Documentation/trace/timerlat-tracer.rst
      
      Link: https://lkml.kernel.org/r/71f18efc013e1194bcaea1e54db957de2b19ba62.1624372313.git.bristot@redhat.com
      
      Cc: Phil Auld <pauld@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
      Cc: Clark Willaims <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • trace: Add osnoise tracer · bce29ac9
      Daniel Bristot de Oliveira authored
      In the context of high-performance computing (HPC), the Operating System
      Noise (*osnoise*) refers to the interference experienced by an application
      due to activities inside the operating system. In the context of Linux,
      NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the
      system. Moreover, hardware-related jobs can also cause noise, for example,
      via SMIs.
      
      The osnoise tracer leverages the hwlat_detector by running a similar
      loop with preemption, SoftIRQs and IRQs enabled, thus allowing all
      the sources of *osnoise* during its execution. Using the same approach
      as hwlat, osnoise takes note of the entry and exit point of any
      source of interference, increasing a per-cpu interference counter. The
      osnoise tracer also saves an interference counter for each source of
      interference. The interference counter for NMI, IRQs, SoftIRQs, and
      threads is increased anytime the tool observes these interferences' entry
      events. When a noise happens without any interference from the operating
      system level, the hardware noise counter increases, pointing to a
      hardware-related noise. In this way, osnoise can account for any
      source of interference. At the end of the period, the osnoise tracer
      prints the sum of all noise, the max single noise, the percentage of CPU
      available for the thread, and the counters for the noise sources.
      
      Usage
      
      Write the ASCII text "osnoise" into the current_tracer file of the
      tracing system (generally mounted at /sys/kernel/tracing).
      
      For example::
      
              [root@f32 ~]# cd /sys/kernel/tracing/
              [root@f32 tracing]# echo osnoise > current_tracer
      
      It is possible to follow the trace by reading the trace file::
      
              [root@f32 tracing]# cat trace
              # tracer: osnoise
              #
              #                                _-----=> irqs-off
              #                               / _----=> need-resched
              #                              | / _---=> hardirq/softirq
              #                              || / _--=> preempt-depth                            MAX
              #                              || /                                             SINGLE     Interference counters:
              #                              ||||               RUNTIME      NOISE   % OF CPU  NOISE    +-----------------------------+
              #           TASK-PID      CPU# ||||   TIMESTAMP    IN US       IN US  AVAILABLE  IN US     HW    NMI    IRQ   SIRQ THREAD
              #              | |         |   ||||      |           |             |    |            |      |      |      |      |      |
                         <...>-859     [000] ....    81.637220: 1000000        190  99.98100       9     18      0   1007     18      1
                         <...>-860     [001] ....    81.638154: 1000000        656  99.93440      74     23      0   1006     16      3
                         <...>-861     [002] ....    81.638193: 1000000       5675  99.43250     202      6      0   1013     25     21
                         <...>-862     [003] ....    81.638242: 1000000        125  99.98750      45      1      0   1011     23      0
                         <...>-863     [004] ....    81.638260: 1000000       1721  99.82790     168      7      0   1002     49     41
                         <...>-864     [005] ....    81.638286: 1000000        263  99.97370      57      6      0   1006     26      2
                         <...>-865     [006] ....    81.638302: 1000000        109  99.98910      21      3      0   1006     18      1
                         <...>-866     [007] ....    81.638326: 1000000       7816  99.21840     107      8      0   1016     39     19
      
      In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the
      tracer prints a message at the end of each period for each CPU that is
      running an osnoise/CPU thread. The osnoise specific fields report:
      
       - The RUNTIME IN US reports the amount of time in microseconds that
         the osnoise thread kept looping reading the time.
       - The NOISE IN US reports the sum of noise in microseconds observed
         by the osnoise tracer during the associated runtime.
       - The % OF CPU AVAILABLE reports the percentage of CPU available for
         the osnoise thread during the runtime window.
       - The MAX SINGLE NOISE IN US reports the maximum single noise observed
         during the runtime window.
       - The Interference counters display how many times each of the
         respective interferences happened during the runtime window.
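
      The relation between RUNTIME IN US, NOISE IN US and % OF CPU AVAILABLE
      in the sample output can be sketched as follows. This is an illustrative
      user-space calculation, not the tracer's code. The function name
      cpu_available_e5() is hypothetical; it returns the percentage scaled by
      10^5 so that only integer arithmetic is needed.

```c
#include <stdint.h>

/*
 * Hypothetical helper: percentage of CPU available to the measuring
 * thread, scaled by 10^5 (so 9998100 stands for 99.98100 %).
 *
 * Assumes runtime_us is small enough that the multiplication below
 * does not overflow 64 bits (true for any runtime up to many days).
 */
uint64_t cpu_available_e5(uint64_t runtime_us, uint64_t noise_us)
{
	if (runtime_us == 0 || noise_us > runtime_us)
		return 0;

	/* (runtime - noise) / runtime * 100, with five decimal places. */
	return (runtime_us - noise_us) * 10000000ULL / runtime_us;
}
```

      For the first row of the sample, a runtime of 1000000 us with 190 us of
      noise gives 9998100, matching the 99.98100 shown in the trace.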
      
      Note that the example above shows a high number of HW noise samples.
      The reason is that this sample was taken on a virtual machine,
      and the host interference is detected as a hardware interference.
      
      Tracer options
      
      The tracer has a set of options inside the osnoise directory:
      
       - osnoise/cpus: CPUs on which an osnoise thread will execute.
       - osnoise/period_us: the period of the osnoise thread.
       - osnoise/runtime_us: how long an osnoise thread will look for noise.
       - osnoise/stop_tracing_us: stop the system tracing if a single noise
         higher than the configured value happens. Writing 0 disables this
         option.
       - osnoise/stop_tracing_total_us: stop the system tracing if total noise
         higher than the configured value happens. Writing 0 disables this
         option.
       - tracing_threshold: the minimum delta between two time() reads to be
         considered as noise, in us. When set to 0, the default value will
         be used, which is currently 5 us.
      
      Additional Tracing
      
      In addition to the tracer, a set of tracepoints was added to
      facilitate the identification of the osnoise source.
      
       - osnoise:sample_threshold: printed anytime a noise is higher than
         the configurable tolerance_ns.
       - osnoise:nmi_noise: noise from NMI, including the duration.
       - osnoise:irq_noise: noise from an IRQ, including the duration.
       - osnoise:softirq_noise: noise from a SoftIRQ, including the
         duration.
       - osnoise:thread_noise: noise from a thread, including the duration.
      
      Note that all the values are *net values*. For example, if another
      thread preempts the osnoise thread while osnoise is running, a
      thread_noise duration starts at that point. If an IRQ then takes place,
      preempting the thread_noise, an irq_noise starts. When the IRQ ends its
      execution, it computes its duration, and this duration is subtracted from
      the thread_noise, so as to avoid double accounting of the
      IRQ execution. This logic is valid for all sources of noise.
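
      One way to picture the net-value accounting is the small user-space
      sketch below. It is an illustration under assumptions, not the tracer's
      implementation: struct noise_ctx, noise_enter() and noise_exit() are
      hypothetical names, and the sketch tracks the time consumed by nested
      sources in a separate field so that it can be subtracted on exit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-source accounting state. */
struct noise_ctx {
	uint64_t start_ns;   /* when this source started running */
	uint64_t nested_ns;  /* time taken by sources that preempted it */
};

void noise_enter(struct noise_ctx *c, uint64_t now_ns)
{
	c->start_ns = now_ns;
	c->nested_ns = 0;
}

/*
 * Returns the net duration of the source that is ending. If it had
 * preempted another source (outer), its whole window is charged to
 * outer->nested_ns, so the outer source does not account it again.
 */
uint64_t noise_exit(struct noise_ctx *c, struct noise_ctx *outer,
		    uint64_t now_ns)
{
	uint64_t gross = now_ns - c->start_ns;

	if (outer)
		outer->nested_ns += gross;

	return gross - c->nested_ns;
}
```

      If a thread starts at 100 ns, an IRQ runs from 300 ns to 500 ns, and the
      thread ends at 1000 ns, the IRQ reports 200 ns and the thread reports
      700 ns rather than 900 ns.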
      
      Here is one example of the usage of these tracepoints::
      
             osnoise/8-961     [008] d.h.  5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns
             osnoise/8-961     [008] dNh.  5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns
           migration/8-54      [008] d...  5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns
             osnoise/8-961     [008] ....  5789.858413: sample_threshold: start 5789.858404555 duration 8723 ns interferences 2
      
      In this example, a noise sample of 8 microseconds was reported in the last
      line, pointing to two interferences. Looking backward in the trace, the
      two previous entries were about the migration thread running after a
      timer IRQ execution. The first event is not part of the noise because
      it took place one millisecond before.
      
      It is worth noting that the sum of the durations reported in the
      tracepoints is smaller than the eight us reported in the sample_threshold.
      The reason lies in the overhead of the entry and exit code that runs
      before and after any interference execution. This justifies the dual
      approach: measuring thread and tracing.
      
      Link: https://lkml.kernel.org/r/e649467042d60e7b62714c9c6751a56299d15119.1624372313.git.bristot@redhat.com
      
      Cc: Phil Auld <pauld@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
      Cc: Clark Willaims <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      [
        Made the following functions static:
         trace_irqentry_callback()
         trace_irqexit_callback()
         trace_intel_irqentry_callback()
         trace_intel_irqexit_callback()
      
        Added to include/trace.h:
         osnoise_arch_register()
         osnoise_arch_unregister()
      
        Fixed define logic for LATENCY_FS_NOTIFY
      Reported-by: kernel test robot <lkp@intel.com>
      ]
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Add LATENCY_FS_NOTIFY to define if latency_fsnotify() is defined · 6880c987
      Steven Rostedt (VMware) authored
      With the coming addition of the osnoise tracer, the configs needed to
      include latency_fsnotify() have become more complex. To keep the
      declaration in the header file the same as in the C file, put the
      logic needed to define it in one place, which defines LATENCY_FS_NOTIFY
      for use in the C code.
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • trace: Add __print_ns_to_secs() and __print_ns_without_secs() helpers · 62de4f29
      Steven Rostedt authored
      To have nanosecond output displayed in a more human-readable format, it's
      nicer to convert it to a seconds format (XXX.YYYYYYYYY). The problem is that
      to do so, the numbers must be divided by NSEC_PER_SEC, and the remainder
      taken too. But as these numbers are 64 bit, this cannot be done simply
      with the '/' and '%' operators, but must use do_div() instead.
      
      Instead of performing the expensive do_div() in the hot path of the
      tracepoint, it is more efficient to perform it during the output phase. But
      passing in do_div() can confuse the parser, and do_div() doesn't work
      exactly like a normal C function. It modifies the number in place, and we
      don't want to modify the actual values in the ring buffer.
      
      Two helper functions are now created:
      
        __print_ns_to_secs() and __print_ns_without_secs()
      
      They both take a value of nanoseconds, and the former will return that
      number divided by NSEC_PER_SEC, and the latter will mod it with NSEC_PER_SEC
      giving a way to print a nice human readable format:
      
       __print_fmt("time=%llu.%09u",
      	__print_ns_to_secs(REC->nsec_val),
      	__print_ns_without_secs(REC->nsec_val))
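
      The split itself can be tried out in user space, where plain '/' and '%'
      are fine on 64-bit values. The sketch below only mirrors the arithmetic
      of the two helpers; ns_to_secs(), ns_without_secs() and format_ns() are
      hypothetical stand-ins and are not the kernel helpers themselves.

```c
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_SEC 1000000000ULL

/* Whole seconds contained in a nanosecond value. */
uint64_t ns_to_secs(uint64_t ns)
{
	return ns / NSEC_PER_SEC;
}

/* Nanoseconds left over once the whole seconds are removed. */
uint32_t ns_without_secs(uint64_t ns)
{
	return (uint32_t)(ns % NSEC_PER_SEC);
}

/* Format ns as "seconds.nanoseconds", zero-padding the fraction. */
int format_ns(char *buf, size_t len, uint64_t ns)
{
	return snprintf(buf, len, "%llu.%09u",
			(unsigned long long)ns_to_secs(ns),
			(unsigned int)ns_without_secs(ns));
}
```

      For example, 548771077442 ns formats as "548.771077442". The value
      itself is never modified, which is the property the helpers need
      because the data lives in the ring buffer.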
      
      Link: https://lkml.kernel.org/r/e503b903045496c4ccde52843e1e318b422f7a56.1624372313.git.bristot@redhat.com
      
      Cc: Phil Auld <pauld@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
      Cc: Clark Willaims <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • trace/hwlat: Remove printk from sampling loop · aa892f8c
      Daniel Bristot de Oliveira authored
      hwlat has some time operation checks in the sample loop, and it
      currently uses pr_err (printk) to report them. The problem is that
      this can lead the system to an unresponsive state due to an overflow of
      printk messages. This problem can be mitigated by writing the error
      message to the trace buffer.
      
      Remove the printk messages from the sampling loop, switching them to
      messages in the trace buffer.
      
      No functional change.
      
      Link: https://lkml.kernel.org/r/9d77c34869748aa105e965c769d24642914eea3a.1624372313.git.bristot@redhat.com
      
      Cc: Phil Auld <pauld@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
      Cc: Clark Willaims <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • trace/hwlat: Use trace_min_max_param for width and window params · f27a1c9e
      Daniel Bristot de Oliveira authored
      Use the trace_min_max_param to reduce code duplication.
      
      No functional change.
      
      Link: https://lkml.kernel.org/r/b91accd5a7c6c14ea02d3379aae974ba22b47dd6.1624372313.git.bristot@redhat.com
      
      Cc: Phil Auld <pauld@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
      Cc: Clark Willaims <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • trace: Add a generic function to read/write u64 values from tracefs · bc87cf0a
      Daniel Bristot de Oliveira authored
      The hwlat detector and (in preparation for) the osnoise/timerlat tracers
      have a set of u64 parameters that the user can read/write via tracefs.
      For instance, we have hwlat_detector's window and width.
      
      To reduce the code duplication, hwlat's window and width share the same
      read function. However, they do not share the write functions because
      they do different parameter checks. For instance, the width needs to
      be smaller than the window, while the window needs to be larger
      than the width. The same pattern repeats on osnoise/timerlat, and
      a large portion of the code was devoted to the write function.
      
      Despite having different checks, the write functions have the same
      structure:
      
         read a user-space buffer
         take the lock that protects the value
         check for minimum and maximum acceptable values
            save the value
         release the lock
         return success or error
      
      To reduce the code duplication also in the write functions, this patch
      provides a generic read and write implementation for u64 values that
      need to be within some minimum and/or maximum parameters, while
      (potentially) being protected by a lock.
      
      To use this interface, the structure trace_min_max_param needs to be
      filled:
      
       struct trace_min_max_param {
               struct mutex    *lock;
               u64             *val;
               u64             *min;
               u64             *max;
       };
      
      The desired value is stored in the variable pointed to by *val. If *min
      points to a minimum acceptable value, it will be checked during the
      write operation. Likewise, if *max points to a maximum allowable value,
      it will be checked during the write operation. Finally, if *lock points
      to a mutex, it will be taken at the beginning of the operation and
      released at the end.
      
      The definition of a trace_min_max_param needs to be passed as the
      (private) *data for tracefs_create_file(), and the trace_min_max_fops
      (added by this patch) as the *fops file_operations.
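
      The write path described above can be approximated in user space as
      follows. This is a sketch under assumptions, not the kernel code:
      struct min_max_param and min_max_write() are hypothetical names, a
      pthread mutex stands in for the kernel mutex, and the input is a plain
      C string rather than a user-space buffer.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space analogue of the parameter description. */
struct min_max_param {
	pthread_mutex_t *lock;  /* optional: protects *val */
	uint64_t *val;          /* where the accepted value is stored */
	uint64_t *min;          /* optional: minimum acceptable value */
	uint64_t *max;          /* optional: maximum acceptable value */
};

/* Parse buf as a decimal u64 and store it if it passes the checks. */
int min_max_write(struct min_max_param *p, const char *buf)
{
	char *end;
	unsigned long long v;
	int err = 0;

	/* Read the input. */
	errno = 0;
	v = strtoull(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;

	/* Take the lock that protects the value, if there is one. */
	if (p->lock)
		pthread_mutex_lock(p->lock);

	/* Check the optional minimum and maximum. */
	if (p->min && v < *p->min)
		err = -EINVAL;
	if (p->max && v > *p->max)
		err = -EINVAL;

	/* Save the value only when every check passed. */
	if (!err)
		*p->val = v;

	if (p->lock)
		pthread_mutex_unlock(p->lock);

	return err;
}
```

      Because min and max are pointers, one parameter can be bounded by the
      current value of another. For instance, pointing max at the window
      variable keeps the width from ever exceeding the window.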
      
      Link: https://lkml.kernel.org/r/3e35760a7c8b5c55f16ae5ad5fc54a0e71cbe647.1624372313.git.bristot@redhat.com
      
      Cc: Phil Auld <pauld@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
      Cc: Clark Willaims <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • trace/hwlat: Implement the per-cpu mode · f46b1652
      Daniel Bristot de Oliveira authored
      Implement the per-cpu mode, in which a sampling thread is created for
      each cpu in the "cpus" (and tracing_mask).
      
      The per-cpu mode has the potential to speed up the hwlat detection by
      running on multiple CPUs at the same time, at the cost of higher cpu
      usage with irqs disabled. Use with care.
      
      [
        Changed get_cpu_data() to static.
      Reported-by: kernel test robot <lkp@intel.com>
      ]
      
      Link: https://lkml.kernel.org/r/ec06d0ab340e8460d293772faba19ad8a5c371aa.1624372313.git.bristot@redhat.com
      
      Cc: Phil Auld <pauld@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Kate Carcia <kcarcia@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
      Cc: Clark Willaims <williams@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  2. 24 Jun, 2021 4 commits
  3. 17 Jun, 2021 2 commits
    • tracing: Have ftrace_dump_on_oops kernel parameter take numbers · 2db7ab6b
      Steven Rostedt (VMware) authored
      The kernel parameter for ftrace_dump_on_oops can take a single assignment.
      That is, it can be:
      
        ftrace_dump_on_oops or ftrace_dump_on_oops=orig_cpu
      
      But the content in the sysctl file is a number.
      
       0 for disabled
       1 for ftrace_dump_on_oops (all CPUs)
       2 for ftrace_dump_on_oops (orig CPU)
      
      Allow the kernel command line to take a number as well to match the sysctl
      numbers.
      
      That is:
      
        ftrace_dump_on_oops=1 is the same as ftrace_dump_on_oops
      
      and
      
        ftrace_dump_on_oops=2 is the same as ftrace_dump_on_oops=orig_cpu
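
      The mapping above can be sketched as a small user-space parser. This is
      an illustration only and not the kernel's setup code: the enum names and
      parse_dump_on_oops() are hypothetical, and accepting "0" as "disabled"
      is an assumption taken from the sysctl table above.

```c
#include <string.h>

/* Hypothetical names for the three sysctl values listed above. */
enum dump_mode {
	DUMP_DISABLED = 0,
	DUMP_ALL_CPUS = 1,
	DUMP_ORIG_CPU = 2,
};

/*
 * value is the text after '=' on the command line, or NULL/"" when the
 * parameter was given without an assignment. Returns a dump_mode, or -1
 * for an unrecognized value.
 */
int parse_dump_on_oops(const char *value)
{
	if (!value || !*value || strcmp(value, "1") == 0)
		return DUMP_ALL_CPUS;

	if (strcmp(value, "orig_cpu") == 0 || strcmp(value, "2") == 0)
		return DUMP_ORIG_CPU;

	/* Assumption: "0" mirrors the sysctl value for disabled. */
	if (strcmp(value, "0") == 0)
		return DUMP_DISABLED;

	return -1;
}
```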
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Add tp_printk_stop_on_boot option · f3860136
      Steven Rostedt (VMware) authored
      Add a kernel command line option that disables printing of events to
      console at late_initcall_sync(). This is useful when needing to see
      specific events written to console on boot up, but not wanting it when
      user space starts, as user space may make the console so noisy that the
      system becomes inoperable.
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  4. 10 Jun, 2021 17 commits
  5. 08 Jun, 2021 4 commits
  6. 06 Jun, 2021 2 commits