1. 24 Nov, 2022 6 commits
  2. 15 Nov, 2022 4 commits
  3. 27 Oct, 2022 2 commits
    • perf: Optimize perf_tp_event() · 571f97f7
      Ravi Bangoria authored
      Use the event group trees to iterate only perf_tracepoint events.
      
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    • perf: Rewrite core context handling · bd275681
      Peter Zijlstra authored
      There have been various issues and limitations with the way perf uses
      (task) contexts to track events. Most notable is the single hardware
      PMU task context, which has resulted in a number of yucky things (both
      proposed and merged).
      
      Notably:
       - HW breakpoint PMU
       - ARM big.little PMU / Intel ADL PMU
       - Intel Branch Monitoring PMU
       - AMD IBS PMU
       - S390 cpum_cf PMU
       - PowerPC trace_imc PMU
      
      *Current design:*
      
      Currently we have per-task and per-CPU perf_event_contexts:
      
        task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context
             ^                                 |    ^     |           ^
             `---------------------------------'    |     `--> pmu ---'
                                                    v           ^
                                               perf_event ------'
      
      Each task has an array of pointers to a perf_event_context. Each
      perf_event_context has a direct relation to a PMU and a group of
      events for that PMU. The task-related perf_event_contexts have a
      pointer back to that task.
      
      Each PMU has a per-cpu pointer to a per-cpu perf_cpu_context, which
      includes a perf_event_context, which again has a direct relation to
      that PMU, and a group of events for that PMU.
      
      The perf_cpu_context also tracks which task context is currently
      associated with that CPU and includes a few other things like the
      hrtimer for rotation etc.
      
      Each perf_event is then associated with its PMU and one
      perf_event_context.
      
      *Proposed design:*
      
      The new design proposed by this patch reduces this to a single task
      context and a single CPU context, but adds some intermediate data
      structures:
      
        task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context
             ^                           |   ^ ^
             `---------------------------'   | |
                                             | |    perf_cpu_pmu_context <--.
                                             | `----.    ^                  |
                                             |      |    |                  |
                                             |      v    v                  |
                                             | ,--> perf_event_pmu_context  |
                                             | |                            |
                                             | |                            |
                                             v v                            |
                                        perf_event ---> pmu ----------------'
      
      With the new design, perf_event_context will hold all events for all
      pmus in the (respective pinned/flexible) rbtrees. This can be achieved
      by adding the pmu to the rbtree key:
      
        {cpu, pmu, cgroup, group_index}
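
The ordering implied by this key can be sketched as a lexicographic comparison. This is a minimal userspace sketch, not the kernel's comparator: `struct group_key`, its field types, and `group_key_cmp()` are hypothetical names chosen for illustration, and it assumes the fields are compared in the order listed in the key.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative stand-in for the sort key {cpu, pmu, cgroup, group_index}.
 * Field names and types are assumptions, not the kernel's definitions.
 */
struct group_key {
    int       cpu;
    uintptr_t pmu;          /* PMU identity, e.g. its address */
    uint64_t  cgroup_id;    /* hypothetical numeric cgroup identifier */
    uint64_t  group_index;  /* insertion order within the tree */
};

/* Three-way compare of two unsigned values: returns -1, 0 or 1. */
static int cmp_u64(uint64_t a, uint64_t b)
{
    return (a > b) - (a < b);
}

/*
 * Lexicographic compare: cpu first, then pmu, then cgroup, then
 * group_index. All events of one (cpu, pmu) pair therefore end up
 * adjacent in the tree, so a walk can be limited to a single pmu.
 */
int group_key_cmp(const struct group_key *a, const struct group_key *b)
{
    int r;

    assert(a && b);

    if (a->cpu != b->cpu)
        return a->cpu < b->cpu ? -1 : 1;

    r = cmp_u64(a->pmu, b->pmu);
    if (r)
        return r;

    r = cmp_u64(a->cgroup_id, b->cgroup_id);
    if (r)
        return r;

    return cmp_u64(a->group_index, b->group_index);
}
```

Placing pmu directly after cpu is what lets one tree serve every pmu: a lookup on the (cpu, pmu) prefix lands on a contiguous run of that pmu's events.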
      
      Each perf_event_context carries a list of perf_event_pmu_context,
      which is used to hold per-pmu-per-context state. For example, it
      keeps track of the currently active events for that pmu, a
      pmu-specific task_ctx_data, and a flag telling whether rotation is
      required.
      
      Additionally, perf_cpu_pmu_context is used to hold per-pmu-per-cpu
      state, such as the hrtimer details that drive event rotation, a
      pointer to the perf_event_pmu_context of the currently running task,
      and some other ancillary information.
      
      Each perf_event is associated with its pmu, perf_event_context and
      perf_event_pmu_context.
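
The relationships described above can be sketched in userspace C. This is a simplified illustration under stated assumptions, not the kernel's definitions: the struct names follow the commit message, but the fields shown, the singly linked list, and the helpers `get_pmu_ctx()` and `attach_event()` are hypothetical. The per-CPU side (perf_cpu_pmu_context) is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct pmu { const char *name; };

struct perf_event_context;

/* Per-pmu-per-context state (fields are illustrative assumptions). */
struct perf_event_pmu_context {
    struct pmu                    *pmu;
    struct perf_event_context     *ctx;
    struct perf_event_pmu_context *next;      /* link in the ctx list */
    unsigned int                   nr_events; /* events using this pmu */
};

/* One context holds the events of all pmus plus a list of pmu contexts. */
struct perf_event_context {
    struct perf_event_pmu_context *pmu_ctx_list;
};

/* Each event points at its pmu, its context and its pmu context. */
struct perf_event {
    struct pmu                    *pmu;
    struct perf_event_context     *ctx;
    struct perf_event_pmu_context *pmu_ctx;
};

/* Hypothetical helper: find the pmu context for (ctx, pmu) or create it. */
struct perf_event_pmu_context *
get_pmu_ctx(struct perf_event_context *ctx, struct pmu *pmu)
{
    struct perf_event_pmu_context *epc;

    assert(ctx && pmu);

    for (epc = ctx->pmu_ctx_list; epc; epc = epc->next)
        if (epc->pmu == pmu)
            return epc;

    epc = calloc(1, sizeof(*epc));
    if (!epc)
        return NULL;

    epc->pmu = pmu;
    epc->ctx = ctx;
    epc->next = ctx->pmu_ctx_list;
    ctx->pmu_ctx_list = epc;
    return epc;
}

/* Hypothetical helper: wire an event to its three owners. 0 on success. */
int attach_event(struct perf_event *event, struct perf_event_context *ctx,
                 struct pmu *pmu)
{
    struct perf_event_pmu_context *epc = get_pmu_ctx(ctx, pmu);

    if (!epc)
        return -1;

    event->pmu = pmu;
    event->ctx = ctx;
    event->pmu_ctx = epc;
    epc->nr_events++;
    return 0;
}
```

In this sketch two events on the same pmu share one perf_event_pmu_context, while an event on a different pmu gets its own entry in the same perf_event_context, which mirrors the single-context layout in the diagram.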
      
      Further optimizations to the current implementation are possible.
      For example, ctx_resched() can be optimized to reschedule only the
      events of a single pmu.
      
      Much thanks to Ravi for picking this up and pushing it towards
      completion.

      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20221008062424.313-1-ravi.bangoria@amd.com
  4. 23 Oct, 2022 9 commits
  5. 22 Oct, 2022 19 commits