1. 04 May, 2022 6 commits
  2. 22 Apr, 2022 1 commit
    • signal: Deliver SIGTRAP on perf event asynchronously if blocked · 78ed93d7
      Marco Elver authored
      With SIGTRAP on perf events, we have encountered termination of
      processes due to user space attempting to block delivery of SIGTRAP.
      Consider this case:
      
          <set up SIGTRAP on a perf event>
          ...
          sigset_t s;
          sigemptyset(&s);
          sigaddset(&s, SIGTRAP | <and others>);
          sigprocmask(SIG_BLOCK, &s, ...);
          ...
          <perf event triggers>
      
      When the perf event triggers, while SIGTRAP is blocked, force_sig_perf()
      will force the signal, but revert back to the default handler, thus
      terminating the task.
      
      This makes sense for error conditions, but not so much for explicitly
      requested monitoring. However, the expectation is still that signals
      generated by perf events are synchronous, which will no longer be the
      case if the signal is blocked and delivered later.
      
      To give user space the ability to clearly distinguish synchronous from
      asynchronous signals, introduce siginfo_t::si_perf_flags and
      TRAP_PERF_FLAG_ASYNC (opted for flags in case more binary information is
      required in future).
      
      The resolution to the problem is then to (a) no longer force the signal
      (avoiding the terminations), but (b) tell user space via si_perf_flags
      if the signal was synchronous or not, so that such signals can be
      handled differently (e.g. let user space decide to ignore or consider
      the data imprecise).
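
A minimal sketch of how user space might tell the two cases apart. The enum and function names are hypothetical; the fallback defines assume the Linux UAPI values for TRAP_PERF and TRAP_PERF_FLAG_ASYNC and are there only so the sketch builds against older headers.

```c
#include <assert.h>
#include <signal.h>

/* Fallbacks for older userspace headers (assumed to match the Linux UAPI). */
#ifndef TRAP_PERF
#define TRAP_PERF 6
#endif
#ifndef TRAP_PERF_FLAG_ASYNC
#define TRAP_PERF_FLAG_ASYNC (1u << 0)
#endif

/* Hypothetical classification used only in this sketch. */
enum perf_trap_kind {
    PERF_TRAP_NONE,  /* SIGTRAP was not generated by a perf event */
    PERF_TRAP_SYNC,  /* delivered at the point where the event fired */
    PERF_TRAP_ASYNC  /* was blocked and delivered later; data may be imprecise */
};

/* `code` and `perf_flags` would come from siginfo_t in an SA_SIGINFO handler. */
static enum perf_trap_kind classify_perf_trap(int code, unsigned int perf_flags)
{
    if (code != TRAP_PERF)
        return PERF_TRAP_NONE;
    return (perf_flags & TRAP_PERF_FLAG_ASYNC) ? PERF_TRAP_ASYNC
                                               : PERF_TRAP_SYNC;
}
```

In a real handler the arguments would be info->si_code and info->si_perf_flags, assuming headers recent enough to expose the new field.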
      
      The alternative of making the kernel ignore SIGTRAP on perf events if
      the signal is blocked may work for some use cases, but likely causes
      issues in others that then have to revert back to interception of
      sigprocmask() (which we want to avoid). [ A concrete example: when using
      breakpoint perf events to track data-flow, in a region of code where
      signals are blocked, data-flow can no longer be tracked accurately.
      When a relevant asynchronous signal is received after unblocking the
      signal, the data-flow tracking logic needs to know its state is
      imprecise. ]
      
      Fixes: 97ba62b2 ("perf: Add support for SIGTRAP on perf events")
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Marco Elver <elver@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Tested-by: Dmitry Vyukov <dvyukov@google.com>
      Link: https://lore.kernel.org/r/20220404111204.935357-1-elver@google.com
  3. 05 Apr, 2022 10 commits
    • perf/x86: Unify format of events sysfs show · 7bebfe9d
      Yang Jihong authored
      The sysfs show formats of the files in /sys/devices/cpu/events/ are not
      unified: some end with "\n" and some do not. Modify the sysfs show format
      of events defined by EVENT_ATTR_STR to end with "\n".
      
      Before:
        $ ls /sys/devices/cpu/events/* | xargs -i sh -c 'echo -n "{}: "; cat -A {}; echo'
        branch-instructions: event=0xc4$
      
        branch-misses: event=0xc5$
      
        bus-cycles: event=0x3c,umask=0x01$
      
        cache-misses: event=0x2e,umask=0x41$
      
        cache-references: event=0x2e,umask=0x4f$
      
        cpu-cycles: event=0x3c$
      
        instructions: event=0xc0$
      
        ref-cycles: event=0x00,umask=0x03$
      
        slots: event=0x00,umask=0x4
        topdown-bad-spec: event=0x00,umask=0x81
        topdown-be-bound: event=0x00,umask=0x83
        topdown-fe-bound: event=0x00,umask=0x82
        topdown-retiring: event=0x00,umask=0x80
      
      After:
        $ ls /sys/devices/cpu/events/* | xargs -i sh -c 'echo -n "{}: "; cat -A {}; echo'
        /sys/devices/cpu/events/branch-instructions: event=0xc4$
      
        /sys/devices/cpu/events/branch-misses: event=0xc5$
      
        /sys/devices/cpu/events/bus-cycles: event=0x3c,umask=0x01$
      
        /sys/devices/cpu/events/cache-misses: event=0x2e,umask=0x41$
      
        /sys/devices/cpu/events/cache-references: event=0x2e,umask=0x4f$
      
        /sys/devices/cpu/events/cpu-cycles: event=0x3c$
      
        /sys/devices/cpu/events/instructions: event=0xc0$
      
        /sys/devices/cpu/events/ref-cycles: event=0x00,umask=0x03$
      
        /sys/devices/cpu/events/slots: event=0x00,umask=0x4$
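
The change can be pictured with a small user-space sketch. show_event_str() is a hypothetical stand-in for the show callback behind EVENT_ATTR_STR, not the kernel function itself.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a sysfs show callback: writes one event
 * string into `page` and returns the number of characters produced. */
static int show_event_str(char *page, size_t size, const char *event_str)
{
    /* Always end the attribute with a newline so every file reads alike. */
    return snprintf(page, size, "%s\n", event_str);
}
```

With the newline appended unconditionally, `cat -A` shows a trailing `$` for every event file.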
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20220324031957.135595-1-yangjihong1@huawei.com
    • perf/x86/amd: Add idle hooks for branch sampling · d5616bac
      Stephane Eranian authored
      On AMD Fam19h Zen3, the branch sampling (BRS) feature must be disabled before
      entering low power and re-enabled (if it was active) when returning from low
      power. Otherwise, the NMI may be held up for too long and cause
      problems. Stopping BRS will cause the NMI to be delivered if it was held up.
      
      Define a perf_amd_brs_lopwr_cb() callback to stop/restart BRS.  The callback
      is protected by a jump label which is enabled only when AMD BRS is detected.
      In all other cases, the callback is never called.
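
A user-space model of the stop/restart behaviour described above. All names are hypothetical, and the `detected` flag merely stands in for the jump label that gates the callback.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for this sketch only. */
struct brs_state {
    bool detected; /* stands in for the jump label being enabled */
    bool active;   /* BRS is in use by an event on this CPU */
    bool running;  /* BRS is currently enabled in hardware */
};

/* lopwr_in is true before entering low power, false after leaving it. */
static void brs_lopwr_cb(struct brs_state *s, bool lopwr_in)
{
    if (!s->active)
        return;             /* nothing to stop or restart */
    s->running = !lopwr_in; /* stop on entry, restart on exit */
}

static void idle_transition(struct brs_state *s, bool lopwr_in)
{
    /* The hook is skipped entirely unless BRS was detected. */
    if (s->detected)
        brs_lopwr_cb(s, lopwr_in);
}
```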
      Signed-off-by: Stephane Eranian <eranian@google.com>
      [peterz: static_call() and build fixes]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-10-eranian@google.com
    • ACPI: Add perf low power callback · 2a606a18
      Stephane Eranian authored
      Add an optional callback needed by some PMU features, e.g., AMD
      BRS, to give a chance to the perf_events code to change its state before
      a CPU goes to low power and after it comes back.
      
      The callback is a no-op when the PERF_NEEDS_LOPWR_CB flag is not set.
      This flag must be set in arch specific perf_event.h header whenever needed.
      When not set, there is no impact on the ACPI code.
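
One common way to get this "no impact when unset" property is a compile-time stub. The sketch below assumes that pattern; the function name, the counter, and the idle-path helper are illustrative rather than taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

static int lopwr_calls; /* counts real callback invocations, illustration only */

#ifdef PERF_NEEDS_LOPWR_CB
/* An architecture that sets the flag supplies a real implementation. */
static void perf_lopwr_cb(bool lopwr_in)
{
    (void)lopwr_in;
    lopwr_calls++;
}
#else
/* Flag not set: the callback compiles down to nothing. */
static inline void perf_lopwr_cb(bool lopwr_in)
{
    (void)lopwr_in;
}
#endif

/* Hypothetical stand-in for the idle path that brackets low power. */
static void enter_and_leave_low_power(void)
{
    perf_lopwr_cb(true);
    /* ... the CPU would sit in low power here ... */
    perf_lopwr_cb(false);
}
```

Built without -DPERF_NEEDS_LOPWR_CB, the calls in enter_and_leave_low_power() do nothing and the counter stays at zero.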
      Signed-off-by: Stephane Eranian <eranian@google.com>
      [peterz: build fix]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-9-eranian@google.com
    • perf/x86/amd: Make Zen3 branch sampling opt-in · cc37e520
      Stephane Eranian authored
      Add a kernel config option CONFIG_PERF_EVENTS_AMD_BRS
      to make the support for AMD Zen3 Branch Sampling (BRS) an opt-in
      compile time option.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-8-eranian@google.com
    • perf/x86/amd: Add AMD branch sampling period adjustment · ba2fe750
      Stephane Eranian authored
      Add code to adjust the sampling event period when used with the Branch
      Sampling feature (BRS). Given the depth of the BRS (16), the period is
      reduced by that depth such that in the best case scenario, BRS saturates at
      the desired sampling period. In practice, though, the processor may execute
      more branches. Given a desired period P and a depth D, the kernel programs
      the actual period at P - D. After P - D occurrences of the sampling event, the
      counter overflows. It then may take X branches (skid) before the NMI is
      caught and held by the hardware and BRS activates. Then, after D branches,
      BRS saturates and the NMI is delivered.  With no skid, the effective period
      would be (P - D) + D = P. In practice, however, it will likely be (P - D) +
      X + D. There is no way to eliminate X or predict X.
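
The arithmetic above can be written out as a short sketch. The function names are hypothetical, and leaving a period of D or less unadjusted is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define BRS_DEPTH 16u /* depth D stated in the commit message */

/* Period actually programmed for a desired period P (P - D).
 * Periods not greater than D are returned unchanged in this sketch. */
static uint64_t brs_programmed_period(uint64_t desired)
{
    return desired > BRS_DEPTH ? desired - BRS_DEPTH : desired;
}

/* Effective period (P - D) + X + D for a skid of X branches. */
static uint64_t brs_effective_period(uint64_t desired, uint64_t skid)
{
    assert(desired > BRS_DEPTH); /* BRS needs a period above its depth */
    return brs_programmed_period(desired) + skid + BRS_DEPTH;
}
```

For P = 1000037 the programmed period is 1000021. With no skid the effective period is P again, and a skid of 5 branches gives 1000042.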
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-7-eranian@google.com
    • perf/x86/amd: Enable branch sampling priv level filtering · 8910075d
      Stephane Eranian authored
      The AMD Branch Sampling feature does not provide hardware filtering by
      privilege level. The associated PMU counter does, but branch sampling
      itself does not. Given how BRS operates, there is a possibility that BRS
      captures kernel-level branches even though the event is programmed to count
      only at the user level.
      
      Implement a workaround in software by removing the branches which belong to
      the wrong privilege level. The privilege level is evaluated on the target of
      the branch and not the source so as to be compatible with other architectures.
      As a consequence of this patch, the number of entries in the
      PERF_RECORD_BRANCH_STACK buffer may be fewer than the maximum (16).  It could
      even be zero. Another consequence is that consecutive entries in the branch
      stack may not reflect the actual code path and may have discontinuities where
      kernel branches were suppressed. But this is no different than what happens
      on other architectures.
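
A sketch of target-based filtering as described above. The struct and function names are hypothetical, and the address test assumes an x86-64 style layout in which kernel addresses have the top bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct branch { uint64_t from, to; };

/* Assumption: kernel addresses have the top bit set (x86-64 style split). */
static bool is_kernel_addr(uint64_t addr)
{
    return (addr >> 63) != 0;
}

/* Keep only branches whose *target* is at a wanted privilege level.
 * Compacts the array in place and returns the new entry count. */
static size_t filter_branches(struct branch *br, size_t nr,
                              bool want_user, bool want_kernel)
{
    size_t out = 0;

    for (size_t i = 0; i < nr; i++) {
        bool kernel = is_kernel_addr(br[i].to);

        if (kernel ? want_kernel : want_user)
            br[out++] = br[i];
    }
    return out;
}
```

Because only the target is checked, a branch returning from the kernel to user space is kept for a user-only event, while the branch that entered the kernel is dropped. This is where the discontinuities come from.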
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-6-eranian@google.com
    • perf/x86/amd: Add branch-brs helper event for Fam19h BRS · 44175993
      Stephane Eranian authored
      Add a pseudo event called branch-brs to help use the AMD Fam19h
      Branch Sampling feature (BRS). BRS samples taken branches, so it is best used
      when sampling on a retired taken branch event (0xc4) which is what BRS
      captures.  Instead of trying to remember the event code or actual event name,
      users can simply do:
      
      $ perf record -b -e cpu/branch-brs/ -c 1000037 .....
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-5-eranian@google.com
    • perf/x86/amd: Add AMD Fam19h Branch Sampling support · ada54345
      Stephane Eranian authored
      Add support for the AMD Fam19h 16-deep branch sampling feature as
      described in the AMD PPR Fam19h Model 01h Revision B1.  This is a model
      specific extension. It is not an architected AMD feature.
      
      The Branch Sampling (BRS) operates with a 16-deep saturating buffer in MSR
      registers. There is no branch type filtering. All control flow changes are
      captured. BRS relies on specific programming of the core PMU of Fam19h.  In
      particular, the following requirements must be met:
       - the sampling period must be greater than 16 (the BRS depth)
       - the sampling period must use fixed mode, not frequency mode
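
The two requirements can be expressed as a small validity check. The struct below is a hypothetical, simplified view of the relevant event attributes, not the kernel's perf_event_attr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BRS_DEPTH 16u /* BRS depth stated in the commit message */

/* Hypothetical, simplified view of the attributes that matter here. */
struct sample_cfg {
    bool     freq;          /* true: frequency mode, false: fixed period */
    uint64_t sample_period; /* only meaningful in fixed mode */
};

/* Returns true when the configuration satisfies both BRS requirements. */
static bool brs_config_ok(const struct sample_cfg *cfg)
{
    if (cfg->freq)
        return false;                   /* frequency mode is not supported */
    return cfg->sample_period > BRS_DEPTH; /* must exceed the BRS depth */
}
```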
      
      BRS interacts with the NMI interrupt as well. Because enabling BRS is
      expensive, it is only activated after P event occurrences, where P is the
      desired sampling period.  At P occurrences of the event, the counter
      overflows, the CPU catches the interrupt, activates BRS for 16 branches until
      it saturates, and then delivers the NMI to the kernel.  Between the overflow
      and the time BRS activates, more branches may be executed, skewing the period.
      All along, the sampling event keeps counting. The skid may be attenuated by
      reducing the sampling period by 16 (subsequent patch).
      
      BRS is integrated into perf_events seamlessly via the same
      PERF_RECORD_BRANCH_STACK sample format. BRS generates perf_branch_entry
      records in the sampling buffer. No prediction information is supported. The
      branches are stored in reverse order of execution.  The most recent branch is
      the first entry in each record.
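
Since each record lists the most recent branch first, a consumer that wants execution order has to walk it backwards. A minimal sketch, with a hypothetical struct that keeps only the from/to addresses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct branch { uint64_t from, to; };

/* Copy a record (newest branch first) into `out` in execution order
 * (oldest branch first). `out` must have room for `nr` entries. */
static void to_execution_order(const struct branch *rec, size_t nr,
                               struct branch *out)
{
    for (size_t i = 0; i < nr; i++)
        out[i] = rec[nr - 1 - i];
}
```

After reordering, the target of one entry is typically at or just before the source of the next, which makes the code path easier to follow.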
      
      No modification to the perf tool is necessary.
      
      BRS can be used with any sampling event. However, it is recommended to use
      the RETIRED_BRANCH_INSTRUCTIONS event because it matches what the BRS
      captures.
      
      $ perf record -b -c 1000037 -e cpu/event=0xc2,name=ret_br_instructions/ test
      
      $ perf report -D
      56531696056126 0x193c000 [0x1a8]: PERF_RECORD_SAMPLE(IP, 0x2): 18122/18230: 0x401d24 period: 1000037 addr: 0
      ... branch stack: nr:16
      .....  0: 0000000000401d24 -> 0000000000401d5a 0 cycles      0
      .....  1: 0000000000401d5c -> 0000000000401d24 0 cycles      0
      .....  2: 0000000000401d22 -> 0000000000401d5c 0 cycles      0
      .....  3: 0000000000401d5e -> 0000000000401d22 0 cycles      0
      .....  4: 0000000000401d20 -> 0000000000401d5e 0 cycles      0
      .....  5: 0000000000401d3e -> 0000000000401d20 0 cycles      0
      .....  6: 0000000000401d42 -> 0000000000401d3e 0 cycles      0
      .....  7: 0000000000401d3c -> 0000000000401d42 0 cycles      0
      .....  8: 0000000000401d44 -> 0000000000401d3c 0 cycles      0
      .....  9: 0000000000401d3a -> 0000000000401d44 0 cycles      0
      ..... 10: 0000000000401d46 -> 0000000000401d3a 0 cycles      0
      ..... 11: 0000000000401d38 -> 0000000000401d46 0 cycles      0
      ..... 12: 0000000000401d48 -> 0000000000401d38 0 cycles      0
      ..... 13: 0000000000401d36 -> 0000000000401d48 0 cycles      0
      ..... 14: 0000000000401d4a -> 0000000000401d36 0 cycles      0
      ..... 15: 0000000000401d34 -> 0000000000401d4a 0 cycles      0
       ... thread: test:18230
       ...... dso: test
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-4-eranian@google.com
    • x86/cpufeatures: Add AMD Fam19h Branch Sampling feature · a77d41ac
      Stephane Eranian authored
      Add a CPU feature flag for the AMD Fam19h Branch Sampling feature, reported
      as bit 31 of EBX in CPUID leaf 0x80000008.
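
A sketch of the bit test from user space. The macro and function names are hypothetical; only the leaf number and bit position come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID_LEAF_BRS 0x80000008u /* leaf named in the commit message */
#define BRS_EBX_BIT    31          /* bit position within EBX */

/* Returns true if the EBX value from CPUID_LEAF_BRS advertises BRS. */
static bool ebx_has_brs(uint32_t ebx)
{
    return ((ebx >> BRS_EBX_BIT) & 1u) != 0;
}
```

On x86 with GCC or Clang, the EBX value could be obtained with __get_cpuid() from <cpuid.h>; that call is left out here so the sketch stays portable.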
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-3-eranian@google.com
    • perf/core: Add perf_clear_branch_entry_bitfields() helper · bfe4daf8
      Stephane Eranian authored
      Make it simpler to reset all the info fields on the
      perf_branch_entry by adding a helper inline function.
      
      The goal is to centralize the initialization to avoid missing
      a field in case more are added.
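
The idea of a single reset helper can be sketched as follows. The struct is a hypothetical, cut-down layout; the real perf_branch_entry carries more info fields.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical layout for illustration only. */
struct branch_entry {
    uint64_t from;
    uint64_t to;
    unsigned int mispred   : 1;
    unsigned int predicted : 1;
    unsigned int cycles    : 16;
};

/* One place that resets every info bitfield. A newly added field only
 * needs to be cleared here instead of at each call site. */
static inline void clear_branch_entry_bitfields(struct branch_entry *br)
{
    br->mispred = 0;
    br->predicted = 0;
    br->cycles = 0;
}
```

The addresses are left untouched; only the info bitfields are reset.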
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322221517.2510440-2-eranian@google.com
  4. 03 Apr, 2022 8 commits
  5. 02 Apr, 2022 15 commits