    powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC · 2c9ac51b
    Athira Rajeev authored
    Running the perf fuzzer showed the following message in dmesg:
      "Can't find PMC that caused IRQ"
    
    This means a PMU exception happened, but none of the PMCs (Performance
    Monitor Counters) were found to have overflown. There are some corner
    cases that clear the PMCs after the PMI gets masked. In such cases, the
    perf interrupt handler will not find the active PMC values that caused
    the overflow, which leads to this message on replay.
    
    Case 1: A PMU interrupt happens during replay of other interrupts and
    the counter values get cleared by PMU callbacks before replay:

    During replay of interrupts like timer, __do_irq() and doorbell
    exception, we conditionally enable interrupts via may_hard_irq_enable().
    This can create a window in which a PMI is generated. Since the irq soft
    mask is set to ALL_DISABLED, the PMI gets masked here. IPIs could then
    run before the perf interrupt is replayed, and the PMU events could be
    deleted or stopped. This changes the PMU SPR values and resets the
    counters. Snippet of ftrace log showing PMU callbacks invoked in
    __do_irq():
    
      <idle>-0 [051] dns. 132025441306354: __do_irq <-call_do_irq
      <idle>-0 [051] dns. 132025441306430: irq_enter <-__do_irq
      <idle>-0 [051] dns. 132025441306503: irq_enter_rcu <-__do_irq
      <idle>-0 [051] dnH. 132025441306599: xive_get_irq <-__do_irq
      <<>>
      <idle>-0 [051] dnH. 132025441307770: generic_smp_call_function_single_interrupt <-smp_ipi_demux_relaxed
      <idle>-0 [051] dnH. 132025441307839: flush_smp_call_function_queue <-smp_ipi_demux_relaxed
      <idle>-0 [051] dnH. 132025441308057: _raw_spin_lock <-event_function
      <idle>-0 [051] dnH. 132025441308206: power_pmu_disable <-perf_pmu_disable
      <idle>-0 [051] dnH. 132025441308337: power_pmu_del <-event_sched_out
      <idle>-0 [051] dnH. 132025441308407: power_pmu_read <-power_pmu_del
      <idle>-0 [051] dnH. 132025441308477: read_pmc <-power_pmu_read
      <idle>-0 [051] dnH. 132025441308590: isa207_disable_pmc <-power_pmu_del
      <idle>-0 [051] dnH. 132025441308663: write_pmc <-power_pmu_del
      <idle>-0 [051] dnH. 132025441308787: power_pmu_event_idx <-perf_event_update_userpage
      <idle>-0 [051] dnH. 132025441308859: rcu_read_unlock_strict <-perf_event_update_userpage
      <idle>-0 [051] dnH. 132025441308975: power_pmu_enable <-perf_pmu_enable
      <<>>
      <idle>-0 [051] dnH. 132025441311108: irq_exit <-__do_irq
      <idle>-0 [051] dns. 132025441311319: performance_monitor_exception <-replay_soft_interrupts
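
    The sequence in the trace above can be modelled with a small userspace
    sketch. This is only an illustration of the race, not kernel code: the
    struct and the sim_* functions are hypothetical stand-ins, and the
    overflow test (top bit of a 32-bit counter set) is an assumption of the
    sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-CPU state; not the kernel's structures. */
struct sim_cpu {
    uint32_t pmc;          /* a single performance monitor counter */
    bool     pmi_pending;  /* models the pending-PMI bit kept in the paca */
};

/* Assumption of this sketch: a counter has overflown once its top bit is set. */
static bool sim_pmc_overflown(uint32_t val)
{
    return (int32_t)val < 0;
}

/* A PMI arrives while soft-masked: it is only recorded for later replay. */
static void sim_masked_pmi(struct sim_cpu *cpu)
{
    cpu->pmi_pending = true;
}

/* An IPI deletes the event before the replay: the counter is reset. */
static void sim_event_del(struct sim_cpu *cpu)
{
    cpu->pmc = 0;
}

/*
 * Replay of the pending PMI. Returns true if the handler finds the
 * counter that caused the interrupt, false in the situation that
 * produces the "Can't find PMC that caused IRQ" message.
 */
static bool sim_replay_finds_pmc(struct sim_cpu *cpu)
{
    cpu->pmi_pending = false;
    return sim_pmc_overflown(cpu->pmc);
}
```

    Calling sim_masked_pmi(), then sim_event_del(), then
    sim_replay_finds_pmc() on a CPU whose counter had overflown makes the
    replay return false, because the counter was reset while the PMI was
    still pending.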
    
    Case 2: PMIs are masked during local_* operations, for example
    local_add(). If the local_add() operation happens within a
    local_irq_save() section, replay of the PMI happens during
    local_irq_restore(). Similar to case 1, this can also create a window
    before replay where PMU events get deleted or stopped.
    
    Fix it by updating the PMU callback function power_pmu_disable() to
    check for a pending perf interrupt. If there is an overflown PMC and a
    pending perf interrupt indicated in the paca, clear the PMI bit in the
    paca to drop that sample. The PMI bit is cleared in power_pmu_disable()
    because disable is invoked before any event gets deleted/stopped. With
    this fix, if more than one event is running in the PMU, there is a
    chance that we clear the PMI bit for an event which is not getting
    deleted/stopped. The other events may still remain active. Hence, to
    make sure we don't drop a valid sample in such cases, another check is
    added in power_pmu_enable(). This checks whether there is an overflown
    PMC among the active events and, if so, sets the PMI bit again. Two new
    helper functions are introduced to clear/set the PMI, i.e.
    clear_pmi_irq_pending() and set_pmi_irq_pending(). A helper function
    pmi_irq_pending() is introduced to give a warning if the PMI bit is
    pending in the paca but no PMC has overflown.
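
    The disable/enable logic described above can be sketched in userspace
    as follows. This is a simplified model under stated assumptions, not
    the code in core-book3s.c: struct sk_pmu and all sk_* names are
    hypothetical, the paca bit is modelled as a plain bool, and a counter
    is assumed to have overflown when its top bit is set.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SK_NUM_PMCS 2

/* Hypothetical model of the PMU state; not the kernel's structures. */
struct sk_pmu {
    uint32_t pmc[SK_NUM_PMCS];     /* counter values */
    bool     active[SK_NUM_PMCS];  /* is an event using this counter? */
    bool     pmi_pending;          /* models the PMI bit in the paca */
};

/* True if any counter that still has an active event has overflown. */
static bool sk_any_active_overflown(const struct sk_pmu *pmu)
{
    for (int i = 0; i < SK_NUM_PMCS; i++)
        if (pmu->active[i] && (int32_t)pmu->pmc[i] < 0)
            return true;
    return false;
}

/* Reports a pending PMI and warns if no overflown counter backs it. */
static bool sk_pmi_irq_pending(const struct sk_pmu *pmu)
{
    if (!pmu->pmi_pending)
        return false;
    if (!sk_any_active_overflown(pmu))
        fprintf(stderr, "sketch: PMI pending but no PMC overflown\n");
    return true;
}

/* Disable runs before any event is deleted: drop the pending sample. */
static void sk_pmu_disable(struct sk_pmu *pmu)
{
    if (pmu->pmi_pending && sk_any_active_overflown(pmu))
        pmu->pmi_pending = false;
}

/* Deleting an event resets its counter. */
static void sk_pmu_del(struct sk_pmu *pmu, int idx)
{
    pmu->active[idx] = false;
    pmu->pmc[idx] = 0;
}

/* Enable: if a remaining active counter has overflown, set the bit again. */
static void sk_pmu_enable(struct sk_pmu *pmu)
{
    if (sk_any_active_overflown(pmu))
        pmu->pmi_pending = true;
}
```

    With two overflown counters, deleting one between sk_pmu_disable() and
    sk_pmu_enable() leaves the pending bit set for the surviving event.
    With only the deleted counter overflown, the bit stays clear and no
    replay happens.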
    
    Also, there are corner cases which result in performance monitor
    interrupts being triggered during power_pmu_disable(). This happens
    because the PMXE bit is not cleared along with the other MMCR0 bits
    in the pmu_disable. Such PMIs could leave the PMU running and could
    trigger a PMI again, which will set the MMCR0 PMAO bit. This could lead
    to spurious interrupts in some corner cases. For example, a timer after
    power_pmu_del() re-enables interrupts and triggers a PMI again since
    the PMAO bit is still set, but the handler fails to find a valid
    overflow since the PMC was cleared in power_pmu_del(). Fix that by
    disabling PMXE along with the other MMCR0 bits in power_pmu_disable().
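
    As a rough illustration of the MMCR0 change, the sketch below computes
    the value written on disable. It is not the kernel code: the SK_*
    names are hypothetical, the bit positions are placeholders chosen for
    the sketch, SK_MMCR0_OTHER stands in for whichever other bits the
    disable path already clears, and setting a freeze-counters bit on
    disable is an assumption of the sketch.

```c
#include <stdint.h>

/* Placeholder bit positions for this sketch only. */
#define SK_MMCR0_FC     (UINT64_C(1) << 31)  /* freeze counters (assumed) */
#define SK_MMCR0_PMXE   (UINT64_C(1) << 26)  /* PMU exception enable */
#define SK_MMCR0_OTHER  (UINT64_C(1) << 20)  /* stand-in for other cleared bits */

/*
 * Value to write to MMCR0 when disabling the PMU: freeze the counters
 * and clear PMXE together with the other bits, so that no new PMI can
 * be raised once the disable has been done.
 */
static uint64_t sk_mmcr0_for_disable(uint64_t mmcr0)
{
    mmcr0 |= SK_MMCR0_FC;
    mmcr0 &= ~(SK_MMCR0_OTHER | SK_MMCR0_PMXE);
    return mmcr0;
}
```

    Bits outside the mask are left untouched, so the sketch only changes
    what the disable path is described as changing.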
    
    We can't just replay the PMI at any time. Hence this approach is
    preferred over replaying the PMI before resetting the overflown PMC.
    The patch also documents in core-book3s a race condition which can
    trigger these PMC messages during the idle path on PowerNV.
    
    Fixes: f442d004 ("powerpc/64s: Add support to mask perf interrupts and replay them")
    Reported-by: Nageswara R Sastry <nasastry@in.ibm.com>
    Suggested-by: Nicholas Piggin <npiggin@gmail.com>
    Suggested-by: Madhavan Srinivasan <maddy@linux.ibm.com>
    Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
    Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
    Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
    [mpe: Make pmi_irq_pending() return bool, reflow/reword some comments]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/1626846509-1350-2-git-send-email-atrajeev@linux.vnet.ibm.com