Commit 119e3eef authored by Will Deacon

Merge branch 'for-next/perf' into for-next/core

* for-next/perf: (33 commits)
  perf: arm-ni: Fix an NULL vs IS_ERR() bug
  perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
  MAINTAINERS: List Arm interconnect PMUs as supported
  perf: Add driver for Arm NI-700 interconnect PMU
  dt-bindings/perf: Add Arm NI-700 PMU
  perf/arm-cmn: Improve format attr printing
  perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
  perf/arm-cmn: Support CMN S3
  dt-bindings: perf: arm-cmn: Add CMN S3
  perf/arm-cmn: Refactor DTC PMU register access
  perf/arm-cmn: Make cycle counts less surprising
  perf/arm-cmn: Improve build-time assertion
  perf/arm-cmn: Ensure dtm_idx is big enough
  perf/arm-cmn: Fix CCLA register offset
  perf/arm-cmn: Refactor node ID handling. Again.
  drivers/perf: hisi_pcie: Export supported Root Ports [bdf_min, bdf_max]
  drivers/perf: hisi_pcie: Fix TLP headers bandwidth counting
  drivers/perf: hisi_pcie: Record hardware counts correctly
  drivers/perf: arm_spe: Use perf_allow_kernel() for permissions
  perf/dwc_pcie: Add support for QCOM vendor devices
  ...
parents c2c94023 2e091a80
====================================
Arm Network-on Chip Interconnect PMU
====================================
NI-700 and friends implement a distinct PMU for each clock domain within the
interconnect. Correspondingly, the driver exposes multiple PMU devices named
arm_ni_<x>_cd_<y>, where <x> is an (arbitrary) instance identifier and <y> is
the clock domain ID within that particular instance. If multiple NI instances
exist within a system, the PMU devices can be correlated with the underlying
hardware instance via sysfs parentage.
Each PMU exposes base event aliases for the interface types present in its clock
domain. These require qualifying with the "eventid" and "nodeid" parameters
to specify the event code to count and the interface at which to count it
(per the configured hardware ID as reflected in the xxNI_NODE_INFO register).
The exception is the "cycles" alias for the PMU cycle counter, which is encoded
with the PMU node type and needs no further qualification.
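
As a rough illustration of the naming and qualification scheme described above, the following C sketch builds a PMU device name of the form arm_ni_<x>_cd_<y> and an event string in the "pmu/event,param=value/" form accepted by perf. The helper names are hypothetical, and the alias is passed in by the caller because the aliases actually exposed depend on the interface types present in the clock domain. This is not driver code.

```c
#include <stdio.h>

/* Hypothetical helper: format "arm_ni_<x>_cd_<y>" for instance x, clock domain y. */
static int ni_pmu_name(char *buf, size_t len, unsigned int inst, unsigned int cd)
{
	return snprintf(buf, len, "arm_ni_%u_cd_%u", inst, cd);
}

/*
 * Hypothetical helper: qualify a base event alias with the "eventid" and
 * "nodeid" parameters. The alias string is supplied by the caller.
 */
static int ni_event_spec(char *buf, size_t len, const char *pmu,
			 const char *alias, unsigned int eventid,
			 unsigned int nodeid)
{
	return snprintf(buf, len, "%s/%s,eventid=0x%x,nodeid=0x%x/",
			pmu, alias, eventid, nodeid);
}

/* The "cycles" alias needs no further qualification. */
static int ni_cycles_spec(char *buf, size_t len, const char *pmu)
{
	return snprintf(buf, len, "%s/cycles/", pmu);
}
```

The resulting strings are meant to be passed to "perf stat -e"; the event ID and node ID values themselves have to come from the hardware documentation and the xxNI_NODE_INFO configuration.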
......@@ -46,16 +46,16 @@ Some of the events only exist for specific configurations.
DesignWare Cores (DWC) PCIe PMU Driver
=======================================
This driver adds PMU devices for each PCIe Root Port named based on the BDF of
This driver adds PMU devices for each PCIe Root Port named based on the SBDF of
the Root Port. For example,
30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
0001:30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
the PMU device name for this Root Port is dwc_rootport_3018.
the PMU device name for this Root Port is dwc_rootport_13018.
The DWC PCIe PMU driver registers a perf PMU driver, which provides
description of available events and configuration options in sysfs, see
/sys/bus/event_source/devices/dwc_rootport_{bdf}.
/sys/bus/event_source/devices/dwc_rootport_{sbdf}.
The "format" directory describes format of the config fields of the
perf_event_attr structure. The "events" directory provides configuration
......@@ -66,16 +66,16 @@ The "perf list" command shall list the available events from sysfs, e.g.::
$# perf list | grep dwc_rootport
<...>
dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
dwc_rootport_13018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
<...>
dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event]
dwc_rootport_13018/rx_memory_read,lane=?/ [Kernel PMU event]
Time Based Analysis Event Usage
-------------------------------
Example usage of counting PCIe RX TLP data payload (Units of bytes)::
$# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
$# perf stat -a -e dwc_rootport_13018/Rx_PCIe_TLP_Data_Payload/
The average RX/TX bandwidth can be calculated using the following formula:
......@@ -88,7 +88,7 @@ Lane Event Usage
Each lane has the same event set and to avoid generating a list of hundreds
of events, the user needs to specify the lane ID explicitly, e.g.::
$# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/
$# perf stat -a -e dwc_rootport_13018/rx_memory_read,lane=4/
The driver does not support sampling, therefore "perf record" will not
work. Per-task (without "-a") perf sessions are not supported.
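
The Root Port naming used in this document can be checked with a short C sketch. It assumes the conventional PCI encoding (segment/domain in bits 31:16, bus in bits 15:8, device in bits 7:3, function in bits 2:0), which reproduces the example above: 0001:30:03.0 gives 0x13018. The helper names are hypothetical and this is not the driver's own code.

```c
#include <stdint.h>
#include <stdio.h>

/* Pack segment (domain), bus, device and function into one SBDF value. */
static uint32_t pci_sbdf(uint16_t domain, uint8_t bus, uint8_t dev, uint8_t fn)
{
	return ((uint32_t)domain << 16) | ((uint32_t)bus << 8) |
	       ((uint32_t)(dev & 0x1f) << 3) | (uint32_t)(fn & 0x7);
}

/* Format the PMU device name as the SBDF printed in hexadecimal. */
static int dwc_rootport_name(char *buf, size_t len, uint32_t sbdf)
{
	return snprintf(buf, len, "dwc_rootport_%x", (unsigned int)sbdf);
}
```

With segment 0 the same calculation yields 0x3018, i.e. the older BDF-only name dwc_rootport_3018, so the two naming schemes coincide on single-segment systems.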
......@@ -28,7 +28,9 @@ The "identifier" sysfs file allows users to identify the version of the
PMU hardware device.
The "bus" sysfs file allows users to get the bus number of Root Ports
monitored by PMU.
monitored by PMU. Furthermore users can get the Root Ports range in
[bdf_min, bdf_max] from "bdf_min" and "bdf_max" sysfs attributes
respectively.
Example usage of perf::
......
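
A minimal C sketch of how the [bdf_min, bdf_max] range might be used from userspace: given the strings read from the "bdf_min" and "bdf_max" attributes, check whether a Root Port's BDF is covered by this PMU. The exact text format of the attributes is an assumption here (anything strtoul() can parse with base 0, such as "0x3000"), and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Parse a value as read from a "bdf_min" or "bdf_max" attribute.
 * Base 0 accepts "0x"-prefixed hex as well as decimal; parsing stops
 * at a trailing newline. The attribute format is an assumption.
 */
static unsigned long parse_bdf(const char *s)
{
	return strtoul(s, NULL, 0);
}

/* True if bdf lies within the inclusive range [bdf_min, bdf_max]. */
static bool bdf_in_range(unsigned long bdf, const char *min_str,
			 const char *max_str)
{
	return bdf >= parse_bdf(min_str) && bdf <= parse_bdf(max_str);
}
```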
......@@ -16,6 +16,7 @@ Performance monitor support
starfive_starlink_pmu
arm-ccn
arm-cmn
arm-ni
xgene-pmu
arm_dsu_pmu
thunderx2-pmu
......
......@@ -16,6 +16,7 @@ properties:
- arm,cmn-600
- arm,cmn-650
- arm,cmn-700
- arm,cmn-s3
- arm,ci-700
reg:
......
# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
%YAML 1.2
---
$id: http://devicetree.org/schemas/perf/arm,ni.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Arm NI (Network-on-Chip Interconnect) Performance Monitors
maintainers:
- Robin Murphy <robin.murphy@arm.com>
properties:
compatible:
const: arm,ni-700
reg:
items:
- description: Complete configuration register space
interrupts:
minItems: 1
maxItems: 32
description: Overflow interrupts, one per clock domain, in order of domain ID
required:
- compatible
- reg
- interrupts
additionalProperties: false
......@@ -1738,6 +1738,17 @@ F: drivers/mtd/maps/physmap-versatile.*
F: drivers/power/reset/arm-versatile-reboot.c
F: drivers/soc/versatile/
ARM INTERCONNECT PMU DRIVERS
M: Robin Murphy <robin.murphy@arm.com>
S: Supported
F: Documentation/admin-guide/perf/arm-cmn.rst
F: Documentation/admin-guide/perf/arm-ni.rst
F: Documentation/devicetree/bindings/perf/arm,cmn.yaml
F: Documentation/devicetree/bindings/perf/arm,ni.yaml
F: drivers/perf/arm-cmn.c
F: drivers/perf/arm-ni.c
F: tools/perf/pmu-events/arch/arm64/arm/cmn/
ARM KOMEDA DRM-KMS DRIVER
M: Liviu Dudau <liviu.dudau@arm.com>
S: Supported
......
......@@ -127,6 +127,12 @@ static inline u32 read_pmuver(void)
return (dfr0 >> 24) & 0xf;
}
static inline bool pmuv3_has_icntr(void)
{
/* FEAT_PMUv3_ICNTR not accessible for 32-bit */
return false;
}
static inline void write_pmcr(u32 val)
{
write_sysreg(val, PMCR);
......@@ -152,6 +158,13 @@ static inline u64 read_pmccntr(void)
return read_sysreg(PMCCNTR);
}
static inline void write_pmicntr(u64 val) {}
static inline u64 read_pmicntr(void)
{
return 0;
}
static inline void write_pmcntenset(u32 val)
{
write_sysreg(val, PMCNTENSET);
......@@ -177,6 +190,13 @@ static inline void write_pmccfiltr(u32 val)
write_sysreg(val, PMCCFILTR);
}
static inline void write_pmicfiltr(u64 val) {}
static inline u64 read_pmicfiltr(void)
{
return 0;
}
static inline void write_pmovsclr(u32 val)
{
write_sysreg(val, PMOVSR);
......
......@@ -33,6 +33,14 @@ static inline void write_pmevtypern(int n, unsigned long val)
PMEVN_SWITCH(n, WRITE_PMEVTYPERN);
}
#define RETURN_READ_PMEVTYPERN(n) \
return read_sysreg(pmevtyper##n##_el0)
static inline unsigned long read_pmevtypern(int n)
{
PMEVN_SWITCH(n, RETURN_READ_PMEVTYPERN);
return 0;
}
static inline unsigned long read_pmmir(void)
{
return read_cpuid(PMMIR_EL1);
......@@ -46,6 +54,14 @@ static inline u32 read_pmuver(void)
ID_AA64DFR0_EL1_PMUVer_SHIFT);
}
static inline bool pmuv3_has_icntr(void)
{
u64 dfr1 = read_sysreg(id_aa64dfr1_el1);
return !!cpuid_feature_extract_unsigned_field(dfr1,
ID_AA64DFR1_EL1_PMICNTR_SHIFT);
}
static inline void write_pmcr(u64 val)
{
write_sysreg(val, pmcr_el0);
......@@ -71,22 +87,32 @@ static inline u64 read_pmccntr(void)
return read_sysreg(pmccntr_el0);
}
static inline void write_pmcntenset(u32 val)
static inline void write_pmicntr(u64 val)
{
write_sysreg_s(val, SYS_PMICNTR_EL0);
}
static inline u64 read_pmicntr(void)
{
return read_sysreg_s(SYS_PMICNTR_EL0);
}
static inline void write_pmcntenset(u64 val)
{
write_sysreg(val, pmcntenset_el0);
}
static inline void write_pmcntenclr(u32 val)
static inline void write_pmcntenclr(u64 val)
{
write_sysreg(val, pmcntenclr_el0);
}
static inline void write_pmintenset(u32 val)
static inline void write_pmintenset(u64 val)
{
write_sysreg(val, pmintenset_el1);
}
static inline void write_pmintenclr(u32 val)
static inline void write_pmintenclr(u64 val)
{
write_sysreg(val, pmintenclr_el1);
}
......@@ -96,12 +122,27 @@ static inline void write_pmccfiltr(u64 val)
write_sysreg(val, pmccfiltr_el0);
}
static inline void write_pmovsclr(u32 val)
static inline u64 read_pmccfiltr(void)
{
return read_sysreg(pmccfiltr_el0);
}
static inline void write_pmicfiltr(u64 val)
{
write_sysreg_s(val, SYS_PMICFILTR_EL0);
}
static inline u64 read_pmicfiltr(void)
{
return read_sysreg_s(SYS_PMICFILTR_EL0);
}
static inline void write_pmovsclr(u64 val)
{
write_sysreg(val, pmovsclr_el0);
}
static inline u32 read_pmovsclr(void)
static inline u64 read_pmovsclr(void)
{
return read_sysreg(pmovsclr_el0);
}
......
......@@ -1330,12 +1330,12 @@ void kvm_arch_vcpu_load_debug_state_flags(struct kvm_vcpu *vcpu);
void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
#ifdef CONFIG_KVM
void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
void kvm_clr_pmu_events(u32 clr);
void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr);
void kvm_clr_pmu_events(u64 clr);
bool kvm_set_pmuserenr(u64 val);
#else
static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
static inline void kvm_clr_pmu_events(u32 clr) {}
static inline void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr) {}
static inline void kvm_clr_pmu_events(u64 clr) {}
static inline bool kvm_set_pmuserenr(u64 val)
{
return false;
......
......@@ -403,7 +403,6 @@
#define SYS_PMCNTENCLR_EL0 sys_reg(3, 3, 9, 12, 2)
#define SYS_PMOVSCLR_EL0 sys_reg(3, 3, 9, 12, 3)
#define SYS_PMSWINC_EL0 sys_reg(3, 3, 9, 12, 4)
#define SYS_PMSELR_EL0 sys_reg(3, 3, 9, 12, 5)
#define SYS_PMCEID0_EL0 sys_reg(3, 3, 9, 12, 6)
#define SYS_PMCEID1_EL0 sys_reg(3, 3, 9, 12, 7)
#define SYS_PMCCNTR_EL0 sys_reg(3, 3, 9, 13, 0)
......
......@@ -233,7 +233,7 @@ void kvm_pmu_vcpu_init(struct kvm_vcpu *vcpu)
int i;
struct kvm_pmu *pmu = &vcpu->arch.pmu;
for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++)
for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++)
pmu->pmc[i].idx = i;
}
......@@ -260,7 +260,7 @@ void kvm_pmu_vcpu_destroy(struct kvm_vcpu *vcpu)
{
int i;
for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++)
for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++)
kvm_pmu_release_perf_event(kvm_vcpu_idx_to_pmc(vcpu, i));
irq_work_sync(&vcpu->arch.pmu.overflow_work);
}
......@@ -291,7 +291,7 @@ void kvm_pmu_enable_counter_mask(struct kvm_vcpu *vcpu, u64 val)
if (!(kvm_vcpu_read_pmcr(vcpu) & ARMV8_PMU_PMCR_E) || !val)
return;
for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++) {
for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++) {
struct kvm_pmc *pmc;
if (!(val & BIT(i)))
......@@ -323,7 +323,7 @@ void kvm_pmu_disable_counter_mask(struct kvm_vcpu *vcpu, u64 val)
if (!kvm_vcpu_has_pmu(vcpu) || !val)
return;
for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++) {
for (i = 0; i < KVM_ARMV8_PMU_MAX_COUNTERS; i++) {
struct kvm_pmc *pmc;
if (!(val & BIT(i)))
......@@ -910,10 +910,10 @@ u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm)
struct arm_pmu *arm_pmu = kvm->arch.arm_pmu;
/*
* The arm_pmu->num_events considers the cycle counter as well.
* Ignore that and return only the general-purpose counters.
* The arm_pmu->cntr_mask considers the fixed counter(s) as well.
* Ignore those and return only the general-purpose counters.
*/
return arm_pmu->num_events - 1;
return bitmap_weight(arm_pmu->cntr_mask, ARMV8_PMU_MAX_GENERAL_COUNTERS);
}
static void kvm_arm_set_pmu(struct kvm *kvm, struct arm_pmu *arm_pmu)
......
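
To illustrate the switch from num_events to cntr_mask in kvm_arm_pmu_get_max_counters(), here is a standalone userspace sketch. It assumes the index layout implied elsewhere in this diff: general-purpose counters at indices 0-30, the cycle counter at 31 and the instruction counter at 32. The macro and function names are hypothetical, and a plain 64-bit mask stands in for the kernel bitmap.

```c
#include <stdint.h>

#define SKETCH_MAX_GENERAL_COUNTERS	31	/* indices 0-30 */
#define SKETCH_CYCLE_IDX		31
#define SKETCH_INSTR_IDX		32

/*
 * Count only the general-purpose counters present in the mask,
 * ignoring the fixed cycle and instruction counters.
 */
static unsigned int count_general_counters(uint64_t cntr_mask)
{
	unsigned int i, n = 0;

	for (i = 0; i < SKETCH_MAX_GENERAL_COUNTERS; i++)
		if (cntr_mask & (UINT64_C(1) << i))
			n++;
	return n;
}
```

Unlike "num_events - 1", counting set bits below the fixed-counter indices stays correct when more than one fixed counter is present.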
......@@ -5,6 +5,8 @@
*/
#include <linux/kvm_host.h>
#include <linux/perf_event.h>
#include <linux/perf/arm_pmu.h>
#include <linux/perf/arm_pmuv3.h>
static DEFINE_PER_CPU(struct kvm_pmu_events, kvm_pmu_events);
......@@ -35,7 +37,7 @@ struct kvm_pmu_events *kvm_get_pmu_events(void)
* Add events to track that we may want to switch at guest entry/exit
* time.
*/
void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
void kvm_set_pmu_events(u64 set, struct perf_event_attr *attr)
{
struct kvm_pmu_events *pmu = kvm_get_pmu_events();
......@@ -51,7 +53,7 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
/*
* Stop tracking events
*/
void kvm_clr_pmu_events(u32 clr)
void kvm_clr_pmu_events(u64 clr)
{
struct kvm_pmu_events *pmu = kvm_get_pmu_events();
......@@ -62,79 +64,32 @@ void kvm_clr_pmu_events(u32 clr)
pmu->events_guest &= ~clr;
}
#define PMEVTYPER_READ_CASE(idx) \
case idx: \
return read_sysreg(pmevtyper##idx##_el0)
#define PMEVTYPER_WRITE_CASE(idx) \
case idx: \
write_sysreg(val, pmevtyper##idx##_el0); \
break
#define PMEVTYPER_CASES(readwrite) \
PMEVTYPER_##readwrite##_CASE(0); \
PMEVTYPER_##readwrite##_CASE(1); \
PMEVTYPER_##readwrite##_CASE(2); \
PMEVTYPER_##readwrite##_CASE(3); \
PMEVTYPER_##readwrite##_CASE(4); \
PMEVTYPER_##readwrite##_CASE(5); \
PMEVTYPER_##readwrite##_CASE(6); \
PMEVTYPER_##readwrite##_CASE(7); \
PMEVTYPER_##readwrite##_CASE(8); \
PMEVTYPER_##readwrite##_CASE(9); \
PMEVTYPER_##readwrite##_CASE(10); \
PMEVTYPER_##readwrite##_CASE(11); \
PMEVTYPER_##readwrite##_CASE(12); \
PMEVTYPER_##readwrite##_CASE(13); \
PMEVTYPER_##readwrite##_CASE(14); \
PMEVTYPER_##readwrite##_CASE(15); \
PMEVTYPER_##readwrite##_CASE(16); \
PMEVTYPER_##readwrite##_CASE(17); \
PMEVTYPER_##readwrite##_CASE(18); \
PMEVTYPER_##readwrite##_CASE(19); \
PMEVTYPER_##readwrite##_CASE(20); \
PMEVTYPER_##readwrite##_CASE(21); \
PMEVTYPER_##readwrite##_CASE(22); \
PMEVTYPER_##readwrite##_CASE(23); \
PMEVTYPER_##readwrite##_CASE(24); \
PMEVTYPER_##readwrite##_CASE(25); \
PMEVTYPER_##readwrite##_CASE(26); \
PMEVTYPER_##readwrite##_CASE(27); \
PMEVTYPER_##readwrite##_CASE(28); \
PMEVTYPER_##readwrite##_CASE(29); \
PMEVTYPER_##readwrite##_CASE(30)
/*
* Read a value direct from PMEVTYPER<idx> where idx is 0-30
* or PMCCFILTR_EL0 where idx is ARMV8_PMU_CYCLE_IDX (31).
* or PMxCFILTR_EL0 where idx is 31-32.
*/
static u64 kvm_vcpu_pmu_read_evtype_direct(int idx)
{
switch (idx) {
PMEVTYPER_CASES(READ);
case ARMV8_PMU_CYCLE_IDX:
return read_sysreg(pmccfiltr_el0);
default:
WARN_ON(1);
}
if (idx == ARMV8_PMU_CYCLE_IDX)
return read_pmccfiltr();
else if (idx == ARMV8_PMU_INSTR_IDX)
return read_pmicfiltr();
return 0;
return read_pmevtypern(idx);
}
/*
* Write a value direct to PMEVTYPER<idx> where idx is 0-30
* or PMCCFILTR_EL0 where idx is ARMV8_PMU_CYCLE_IDX (31).
* or PMxCFILTR_EL0 where idx is 31-32.
*/
static void kvm_vcpu_pmu_write_evtype_direct(int idx, u32 val)
{
switch (idx) {
PMEVTYPER_CASES(WRITE);
case ARMV8_PMU_CYCLE_IDX:
write_sysreg(val, pmccfiltr_el0);
break;
default:
WARN_ON(1);
}
if (idx == ARMV8_PMU_CYCLE_IDX)
write_pmccfiltr(val);
else if (idx == ARMV8_PMU_INSTR_IDX)
write_pmicfiltr(val);
else
write_pmevtypern(idx, val);
}
/*
......@@ -145,7 +100,7 @@ static void kvm_vcpu_pmu_enable_el0(unsigned long events)
u64 typer;
u32 counter;
for_each_set_bit(counter, &events, 32) {
for_each_set_bit(counter, &events, ARMPMU_MAX_HWEVENTS) {
typer = kvm_vcpu_pmu_read_evtype_direct(counter);
typer &= ~ARMV8_PMU_EXCLUDE_EL0;
kvm_vcpu_pmu_write_evtype_direct(counter, typer);
......@@ -160,7 +115,7 @@ static void kvm_vcpu_pmu_disable_el0(unsigned long events)
u64 typer;
u32 counter;
for_each_set_bit(counter, &events, 32) {
for_each_set_bit(counter, &events, ARMPMU_MAX_HWEVENTS) {
typer = kvm_vcpu_pmu_read_evtype_direct(counter);
typer |= ARMV8_PMU_EXCLUDE_EL0;
kvm_vcpu_pmu_write_evtype_direct(counter, typer);
......@@ -176,7 +131,7 @@ static void kvm_vcpu_pmu_disable_el0(unsigned long events)
void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu)
{
struct kvm_pmu_events *pmu;
u32 events_guest, events_host;
u64 events_guest, events_host;
if (!kvm_arm_support_pmu_v3() || !has_vhe())
return;
......@@ -197,7 +152,7 @@ void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu)
void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
{
struct kvm_pmu_events *pmu;
u32 events_guest, events_host;
u64 events_guest, events_host;
if (!kvm_arm_support_pmu_v3() || !has_vhe())
return;
......
......@@ -18,6 +18,7 @@
#include <linux/printk.h>
#include <linux/uaccess.h>
#include <asm/arm_pmuv3.h>
#include <asm/cacheflush.h>
#include <asm/cputype.h>
#include <asm/debug-monitors.h>
......@@ -887,7 +888,7 @@ static u64 reset_pmevtyper(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
static u64 reset_pmselr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
{
reset_unknown(vcpu, r);
__vcpu_sys_reg(vcpu, r->reg) &= ARMV8_PMU_COUNTER_MASK;
__vcpu_sys_reg(vcpu, r->reg) &= PMSELR_EL0_SEL_MASK;
return __vcpu_sys_reg(vcpu, r->reg);
}
......@@ -979,7 +980,7 @@ static bool access_pmselr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
else
/* return PMSELR.SEL field */
p->regval = __vcpu_sys_reg(vcpu, PMSELR_EL0)
& ARMV8_PMU_COUNTER_MASK;
& PMSELR_EL0_SEL_MASK;
return true;
}
......@@ -1047,8 +1048,8 @@ static bool access_pmu_evcntr(struct kvm_vcpu *vcpu,
if (pmu_access_event_counter_el0_disabled(vcpu))
return false;
idx = __vcpu_sys_reg(vcpu, PMSELR_EL0)
& ARMV8_PMU_COUNTER_MASK;
idx = SYS_FIELD_GET(PMSELR_EL0, SEL,
__vcpu_sys_reg(vcpu, PMSELR_EL0));
} else if (r->Op2 == 0) {
/* PMCCNTR_EL0 */
if (pmu_access_cycle_counter_el0_disabled(vcpu))
......@@ -1098,7 +1099,7 @@ static bool access_pmu_evtyper(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 1) {
/* PMXEVTYPER_EL0 */
idx = __vcpu_sys_reg(vcpu, PMSELR_EL0) & ARMV8_PMU_COUNTER_MASK;
idx = SYS_FIELD_GET(PMSELR_EL0, SEL, __vcpu_sys_reg(vcpu, PMSELR_EL0));
reg = PMEVTYPER0_EL0 + idx;
} else if (r->CRn == 14 && (r->CRm & 12) == 12) {
idx = ((r->CRm & 3) << 3) | (r->Op2 & 7);
......
......@@ -2029,6 +2029,31 @@ Sysreg FAR_EL1 3 0 6 0 0
Field 63:0 ADDR
EndSysreg
Sysreg PMICNTR_EL0 3 3 9 4 0
Field 63:0 ICNT
EndSysreg
Sysreg PMICFILTR_EL0 3 3 9 6 0
Res0 63:59
Field 58 SYNC
Field 57:56 VS
Res0 55:32
Field 31 P
Field 30 U
Field 29 NSK
Field 28 NSU
Field 27 NSH
Field 26 M
Res0 25
Field 24 SH
Field 23 T
Field 22 RLK
Field 21 RLU
Field 20 RLH
Res0 19:16
Field 15:0 evtCount
EndSysreg
Sysreg PMSCR_EL1 3 0 9 9 0
Res0 63:8
Field 7:6 PCT
......@@ -2153,6 +2178,11 @@ Field 4 P
Field 3:0 ALIGN
EndSysreg
Sysreg PMSELR_EL0 3 3 9 12 5
Res0 63:5
Field 4:0 SEL
EndSysreg
SysregFields CONTEXTIDR_ELx
Res0 63:32
Field 31:0 PROCID
......
......@@ -48,6 +48,13 @@ config ARM_CMN
Support for PMU events monitoring on the Arm CMN-600 Coherent Mesh
Network interconnect.
config ARM_NI
tristate "Arm NI-700 PMU support"
depends on ARM64 || COMPILE_TEST
help
Support for PMU events monitoring on the Arm NI-700 Network-on-Chip
interconnect and family.
config ARM_PMU
depends on ARM || ARM64
bool "ARM PMU framework"
......
......@@ -3,6 +3,7 @@ obj-$(CONFIG_ARM_CCI_PMU) += arm-cci.o
obj-$(CONFIG_ARM_CCN) += arm-ccn.o
obj-$(CONFIG_ARM_CMN) += arm-cmn.o
obj-$(CONFIG_ARM_DSU_PMU) += arm_dsu_pmu.o
obj-$(CONFIG_ARM_NI) += arm-ni.o
obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
obj-$(CONFIG_ARM_PMUV3) += arm_pmuv3.o
......
......@@ -400,7 +400,7 @@ static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data)
}
/* clear common counter intr status */
clr_status = FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1);
clr_status = FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status);
writel(clr_status,
drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR);
}
......
......@@ -47,46 +47,79 @@
* implementations, we'll have to introduce per cpu-type tables.
*/
enum m1_pmu_events {
M1_PMU_PERFCTR_UNKNOWN_01 = 0x01,
M1_PMU_PERFCTR_CPU_CYCLES = 0x02,
M1_PMU_PERFCTR_INSTRUCTIONS = 0x8c,
M1_PMU_PERFCTR_UNKNOWN_8d = 0x8d,
M1_PMU_PERFCTR_UNKNOWN_8e = 0x8e,
M1_PMU_PERFCTR_UNKNOWN_8f = 0x8f,
M1_PMU_PERFCTR_UNKNOWN_90 = 0x90,
M1_PMU_PERFCTR_UNKNOWN_93 = 0x93,
M1_PMU_PERFCTR_UNKNOWN_94 = 0x94,
M1_PMU_PERFCTR_UNKNOWN_95 = 0x95,
M1_PMU_PERFCTR_UNKNOWN_96 = 0x96,
M1_PMU_PERFCTR_UNKNOWN_97 = 0x97,
M1_PMU_PERFCTR_UNKNOWN_98 = 0x98,
M1_PMU_PERFCTR_UNKNOWN_99 = 0x99,
M1_PMU_PERFCTR_UNKNOWN_9a = 0x9a,
M1_PMU_PERFCTR_UNKNOWN_9b = 0x9b,
M1_PMU_PERFCTR_UNKNOWN_9c = 0x9c,
M1_PMU_PERFCTR_UNKNOWN_9f = 0x9f,
M1_PMU_PERFCTR_UNKNOWN_bf = 0xbf,
M1_PMU_PERFCTR_UNKNOWN_c0 = 0xc0,
M1_PMU_PERFCTR_UNKNOWN_c1 = 0xc1,
M1_PMU_PERFCTR_UNKNOWN_c4 = 0xc4,
M1_PMU_PERFCTR_UNKNOWN_c5 = 0xc5,
M1_PMU_PERFCTR_UNKNOWN_c6 = 0xc6,
M1_PMU_PERFCTR_UNKNOWN_c8 = 0xc8,
M1_PMU_PERFCTR_UNKNOWN_ca = 0xca,
M1_PMU_PERFCTR_UNKNOWN_cb = 0xcb,
M1_PMU_PERFCTR_UNKNOWN_f5 = 0xf5,
M1_PMU_PERFCTR_UNKNOWN_f6 = 0xf6,
M1_PMU_PERFCTR_UNKNOWN_f7 = 0xf7,
M1_PMU_PERFCTR_UNKNOWN_f8 = 0xf8,
M1_PMU_PERFCTR_UNKNOWN_fd = 0xfd,
M1_PMU_PERFCTR_LAST = M1_PMU_CFG_EVENT,
M1_PMU_PERFCTR_RETIRE_UOP = 0x1,
M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE = 0x2,
M1_PMU_PERFCTR_L1I_TLB_FILL = 0x4,
M1_PMU_PERFCTR_L1D_TLB_FILL = 0x5,
M1_PMU_PERFCTR_MMU_TABLE_WALK_INSTRUCTION = 0x7,
M1_PMU_PERFCTR_MMU_TABLE_WALK_DATA = 0x8,
M1_PMU_PERFCTR_L2_TLB_MISS_INSTRUCTION = 0xa,
M1_PMU_PERFCTR_L2_TLB_MISS_DATA = 0xb,
M1_PMU_PERFCTR_MMU_VIRTUAL_MEMORY_FAULT_NONSPEC = 0xd,
M1_PMU_PERFCTR_SCHEDULE_UOP = 0x52,
M1_PMU_PERFCTR_INTERRUPT_PENDING = 0x6c,
M1_PMU_PERFCTR_MAP_STALL_DISPATCH = 0x70,
M1_PMU_PERFCTR_MAP_REWIND = 0x75,
M1_PMU_PERFCTR_MAP_STALL = 0x76,
M1_PMU_PERFCTR_MAP_INT_UOP = 0x7c,
M1_PMU_PERFCTR_MAP_LDST_UOP = 0x7d,
M1_PMU_PERFCTR_MAP_SIMD_UOP = 0x7e,
M1_PMU_PERFCTR_FLUSH_RESTART_OTHER_NONSPEC = 0x84,
M1_PMU_PERFCTR_INST_ALL = 0x8c,
M1_PMU_PERFCTR_INST_BRANCH = 0x8d,
M1_PMU_PERFCTR_INST_BRANCH_CALL = 0x8e,
M1_PMU_PERFCTR_INST_BRANCH_RET = 0x8f,
M1_PMU_PERFCTR_INST_BRANCH_TAKEN = 0x90,
M1_PMU_PERFCTR_INST_BRANCH_INDIR = 0x93,
M1_PMU_PERFCTR_INST_BRANCH_COND = 0x94,
M1_PMU_PERFCTR_INST_INT_LD = 0x95,
M1_PMU_PERFCTR_INST_INT_ST = 0x96,
M1_PMU_PERFCTR_INST_INT_ALU = 0x97,
M1_PMU_PERFCTR_INST_SIMD_LD = 0x98,
M1_PMU_PERFCTR_INST_SIMD_ST = 0x99,
M1_PMU_PERFCTR_INST_SIMD_ALU = 0x9a,
M1_PMU_PERFCTR_INST_LDST = 0x9b,
M1_PMU_PERFCTR_INST_BARRIER = 0x9c,
M1_PMU_PERFCTR_UNKNOWN_9f = 0x9f,
M1_PMU_PERFCTR_L1D_TLB_ACCESS = 0xa0,
M1_PMU_PERFCTR_L1D_TLB_MISS = 0xa1,
M1_PMU_PERFCTR_L1D_CACHE_MISS_ST = 0xa2,
M1_PMU_PERFCTR_L1D_CACHE_MISS_LD = 0xa3,
M1_PMU_PERFCTR_LD_UNIT_UOP = 0xa6,
M1_PMU_PERFCTR_ST_UNIT_UOP = 0xa7,
M1_PMU_PERFCTR_L1D_CACHE_WRITEBACK = 0xa8,
M1_PMU_PERFCTR_LDST_X64_UOP = 0xb1,
M1_PMU_PERFCTR_LDST_XPG_UOP = 0xb2,
M1_PMU_PERFCTR_ATOMIC_OR_EXCLUSIVE_SUCC = 0xb3,
M1_PMU_PERFCTR_ATOMIC_OR_EXCLUSIVE_FAIL = 0xb4,
M1_PMU_PERFCTR_L1D_CACHE_MISS_LD_NONSPEC = 0xbf,
M1_PMU_PERFCTR_L1D_CACHE_MISS_ST_NONSPEC = 0xc0,
M1_PMU_PERFCTR_L1D_TLB_MISS_NONSPEC = 0xc1,
M1_PMU_PERFCTR_ST_MEMORY_ORDER_VIOLATION_NONSPEC = 0xc4,
M1_PMU_PERFCTR_BRANCH_COND_MISPRED_NONSPEC = 0xc5,
M1_PMU_PERFCTR_BRANCH_INDIR_MISPRED_NONSPEC = 0xc6,
M1_PMU_PERFCTR_BRANCH_RET_INDIR_MISPRED_NONSPEC = 0xc8,
M1_PMU_PERFCTR_BRANCH_CALL_INDIR_MISPRED_NONSPEC = 0xca,
M1_PMU_PERFCTR_BRANCH_MISPRED_NONSPEC = 0xcb,
M1_PMU_PERFCTR_L1I_TLB_MISS_DEMAND = 0xd4,
M1_PMU_PERFCTR_MAP_DISPATCH_BUBBLE = 0xd6,
M1_PMU_PERFCTR_L1I_CACHE_MISS_DEMAND = 0xdb,
M1_PMU_PERFCTR_FETCH_RESTART = 0xde,
M1_PMU_PERFCTR_ST_NT_UOP = 0xe5,
M1_PMU_PERFCTR_LD_NT_UOP = 0xe6,
M1_PMU_PERFCTR_UNKNOWN_f5 = 0xf5,
M1_PMU_PERFCTR_UNKNOWN_f6 = 0xf6,
M1_PMU_PERFCTR_UNKNOWN_f7 = 0xf7,
M1_PMU_PERFCTR_UNKNOWN_f8 = 0xf8,
M1_PMU_PERFCTR_UNKNOWN_fd = 0xfd,
M1_PMU_PERFCTR_LAST = M1_PMU_CFG_EVENT,
/*
* From this point onwards, these are not actual HW events,
* but attributes that get stored in hw->config_base.
*/
M1_PMU_CFG_COUNT_USER = BIT(8),
M1_PMU_CFG_COUNT_KERNEL = BIT(9),
M1_PMU_CFG_COUNT_USER = BIT(8),
M1_PMU_CFG_COUNT_KERNEL = BIT(9),
};
/*
......@@ -96,46 +129,45 @@ enum m1_pmu_events {
* counters had strange affinities.
*/
static const u16 m1_pmu_event_affinity[M1_PMU_PERFCTR_LAST + 1] = {
[0 ... M1_PMU_PERFCTR_LAST] = ANY_BUT_0_1,
[M1_PMU_PERFCTR_UNKNOWN_01] = BIT(7),
[M1_PMU_PERFCTR_CPU_CYCLES] = ANY_BUT_0_1 | BIT(0),
[M1_PMU_PERFCTR_INSTRUCTIONS] = BIT(7) | BIT(1),
[M1_PMU_PERFCTR_UNKNOWN_8d] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_8e] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_8f] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_90] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_93] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_94] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_95] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_96] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_97] = BIT(7),
[M1_PMU_PERFCTR_UNKNOWN_98] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_99] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_9a] = BIT(7),
[M1_PMU_PERFCTR_UNKNOWN_9b] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_9c] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_9f] = BIT(7),
[M1_PMU_PERFCTR_UNKNOWN_bf] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c0] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c1] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c4] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c5] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c6] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_c8] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_ca] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_cb] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_f5] = ONLY_2_4_6,
[M1_PMU_PERFCTR_UNKNOWN_f6] = ONLY_2_4_6,
[M1_PMU_PERFCTR_UNKNOWN_f7] = ONLY_2_4_6,
[M1_PMU_PERFCTR_UNKNOWN_f8] = ONLY_2_TO_7,
[M1_PMU_PERFCTR_UNKNOWN_fd] = ONLY_2_4_6,
[0 ... M1_PMU_PERFCTR_LAST] = ANY_BUT_0_1,
[M1_PMU_PERFCTR_RETIRE_UOP] = BIT(7),
[M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE] = ANY_BUT_0_1 | BIT(0),
[M1_PMU_PERFCTR_INST_ALL] = BIT(7) | BIT(1),
[M1_PMU_PERFCTR_INST_BRANCH] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_BRANCH_CALL] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_BRANCH_RET] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_BRANCH_TAKEN] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_BRANCH_INDIR] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_BRANCH_COND] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_INT_LD] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_INT_ST] = BIT(7),
[M1_PMU_PERFCTR_INST_INT_ALU] = BIT(7),
[M1_PMU_PERFCTR_INST_SIMD_LD] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_SIMD_ST] = ONLY_5_6_7,
[M1_PMU_PERFCTR_INST_SIMD_ALU] = BIT(7),
[M1_PMU_PERFCTR_INST_LDST] = BIT(7),
[M1_PMU_PERFCTR_INST_BARRIER] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_9f] = BIT(7),
[M1_PMU_PERFCTR_L1D_CACHE_MISS_LD_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_L1D_CACHE_MISS_ST_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_L1D_TLB_MISS_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_ST_MEMORY_ORDER_VIOLATION_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_BRANCH_COND_MISPRED_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_BRANCH_INDIR_MISPRED_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_BRANCH_RET_INDIR_MISPRED_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_BRANCH_CALL_INDIR_MISPRED_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_BRANCH_MISPRED_NONSPEC] = ONLY_5_6_7,
[M1_PMU_PERFCTR_UNKNOWN_f5] = ONLY_2_4_6,
[M1_PMU_PERFCTR_UNKNOWN_f6] = ONLY_2_4_6,
[M1_PMU_PERFCTR_UNKNOWN_f7] = ONLY_2_4_6,
[M1_PMU_PERFCTR_UNKNOWN_f8] = ONLY_2_TO_7,
[M1_PMU_PERFCTR_UNKNOWN_fd] = ONLY_2_4_6,
};
static const unsigned m1_pmu_perf_map[PERF_COUNT_HW_MAX] = {
PERF_MAP_ALL_UNSUPPORTED,
[PERF_COUNT_HW_CPU_CYCLES] = M1_PMU_PERFCTR_CPU_CYCLES,
[PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INSTRUCTIONS,
/* No idea about the rest yet */
[PERF_COUNT_HW_CPU_CYCLES] = M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE,
[PERF_COUNT_HW_INSTRUCTIONS] = M1_PMU_PERFCTR_INST_ALL,
};
/* sysfs definitions */
......@@ -154,8 +186,8 @@ static ssize_t m1_pmu_events_sysfs_show(struct device *dev,
PMU_EVENT_ATTR_ID(name, m1_pmu_events_sysfs_show, config)
static struct attribute *m1_pmu_event_attrs[] = {
M1_PMU_EVENT_ATTR(cycles, M1_PMU_PERFCTR_CPU_CYCLES),
M1_PMU_EVENT_ATTR(instructions, M1_PMU_PERFCTR_INSTRUCTIONS),
M1_PMU_EVENT_ATTR(cycles, M1_PMU_PERFCTR_CORE_ACTIVE_CYCLE),
M1_PMU_EVENT_ATTR(instructions, M1_PMU_PERFCTR_INST_ALL),
NULL,
};
......@@ -400,7 +432,7 @@ static irqreturn_t m1_pmu_handle_irq(struct arm_pmu *cpu_pmu)
regs = get_irq_regs();
for (idx = 0; idx < cpu_pmu->num_events; idx++) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, M1_PMU_NR_COUNTERS) {
struct perf_event *event = cpuc->events[idx];
struct perf_sample_data data;
......@@ -560,7 +592,7 @@ static int m1_pmu_init(struct arm_pmu *cpu_pmu, u32 flags)
cpu_pmu->reset = m1_pmu_reset;
cpu_pmu->set_event_filter = m1_pmu_set_event_filter;
cpu_pmu->num_events = M1_PMU_NR_COUNTERS;
bitmap_set(cpu_pmu->cntr_mask, 0, M1_PMU_NR_COUNTERS);
cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = &m1_pmu_events_attr_group;
cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = &m1_pmu_format_attr_group;
return 0;
......
......@@ -522,7 +522,7 @@ static void armpmu_enable(struct pmu *pmu)
{
struct arm_pmu *armpmu = to_arm_pmu(pmu);
struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
bool enabled = !bitmap_empty(hw_events->used_mask, ARMPMU_MAX_HWEVENTS);
/* For task-bound events we may be called on other CPUs */
if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
......@@ -742,7 +742,7 @@ static void cpu_pm_pmu_setup(struct arm_pmu *armpmu, unsigned long cmd)
struct perf_event *event;
int idx;
for (idx = 0; idx < armpmu->num_events; idx++) {
for_each_set_bit(idx, armpmu->cntr_mask, ARMPMU_MAX_HWEVENTS) {
event = hw_events->events[idx];
if (!event)
continue;
......@@ -772,7 +772,7 @@ static int cpu_pm_pmu_notify(struct notifier_block *b, unsigned long cmd,
{
struct arm_pmu *armpmu = container_of(b, struct arm_pmu, cpu_pm_nb);
struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
bool enabled = !bitmap_empty(hw_events->used_mask, armpmu->num_events);
bool enabled = !bitmap_empty(hw_events->used_mask, ARMPMU_MAX_HWEVENTS);
if (!cpumask_test_cpu(smp_processor_id(), &armpmu->supported_cpus))
return NOTIFY_DONE;
......@@ -924,8 +924,9 @@ int armpmu_register(struct arm_pmu *pmu)
if (ret)
goto out_destroy;
pr_info("enabled with %s PMU driver, %d counters available%s\n",
pmu->name, pmu->num_events,
pr_info("enabled with %s PMU driver, %d (%*pb) counters available%s\n",
pmu->name, bitmap_weight(pmu->cntr_mask, ARMPMU_MAX_HWEVENTS),
ARMPMU_MAX_HWEVENTS, &pmu->cntr_mask,
has_nmi ? ", using NMIs" : "");
kvm_host_pmu_init(pmu);
......
......@@ -59,7 +59,7 @@ static int pmu_parse_percpu_irq(struct arm_pmu *pmu, int irq)
static bool pmu_has_irq_affinity(struct device_node *node)
{
return !!of_find_property(node, "interrupt-affinity", NULL);
return of_property_present(node, "interrupt-affinity");
}
static int pmu_parse_irq_affinity(struct device *dev, int i)
......
......@@ -41,7 +41,7 @@
/*
* Cache if the event is allowed to trace Context information.
* This allows us to perform the check, i.e, perfmon_capable(),
* This allows us to perform the check, i.e, perf_allow_kernel(),
* in the context of the event owner, once, during the event_init().
*/
#define SPE_PMU_HW_FLAGS_CX 0x00001
......@@ -50,7 +50,7 @@ static_assert((PERF_EVENT_FLAG_ARCH & SPE_PMU_HW_FLAGS_CX) == SPE_PMU_HW_FLAGS_C
static void set_spe_event_has_cx(struct perf_event *event)
{
if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && !perf_allow_kernel(&event->attr))
event->hw.flags |= SPE_PMU_HW_FLAGS_CX;
}
......@@ -745,9 +745,8 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
set_spe_event_has_cx(event);
reg = arm_spe_event_to_pmscr(event);
if (!perfmon_capable() &&
(reg & (PMSCR_EL1_PA | PMSCR_EL1_PCT)))
return -EACCES;
if (reg & (PMSCR_EL1_PA | PMSCR_EL1_PCT))
return perf_allow_kernel(&event->attr);
return 0;
}
......
......@@ -64,6 +64,7 @@ enum armv6_counters {
ARMV6_CYCLE_COUNTER = 0,
ARMV6_COUNTER0,
ARMV6_COUNTER1,
ARMV6_NUM_COUNTERS
};
/*
......@@ -254,7 +255,7 @@ armv6pmu_handle_irq(struct arm_pmu *cpu_pmu)
*/
armv6_pmcr_write(pmcr);
for (idx = 0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV6_NUM_COUNTERS) {
struct perf_event *event = cpuc->events[idx];
struct hw_perf_event *hwc;
......@@ -391,7 +392,8 @@ static void armv6pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->start = armv6pmu_start;
cpu_pmu->stop = armv6pmu_stop;
cpu_pmu->map_event = armv6_map_event;
cpu_pmu->num_events = 3;
bitmap_set(cpu_pmu->cntr_mask, 0, ARMV6_NUM_COUNTERS);
}
static int armv6_1136_pmu_init(struct arm_pmu *cpu_pmu)
......
......@@ -649,24 +649,12 @@ static struct attribute_group armv7_pmuv2_events_attr_group = {
/*
* Perf Events' indices
*/
#define ARMV7_IDX_CYCLE_COUNTER 0
#define ARMV7_IDX_COUNTER0 1
#define ARMV7_IDX_COUNTER_LAST(cpu_pmu) \
(ARMV7_IDX_CYCLE_COUNTER + cpu_pmu->num_events - 1)
#define ARMV7_MAX_COUNTERS 32
#define ARMV7_COUNTER_MASK (ARMV7_MAX_COUNTERS - 1)
#define ARMV7_IDX_CYCLE_COUNTER 31
#define ARMV7_IDX_COUNTER_MAX 31
/*
* ARMv7 low level PMNC access
*/
/*
* Perf Event to low level counters mapping
*/
#define ARMV7_IDX_TO_COUNTER(x) \
(((x) - ARMV7_IDX_COUNTER0) & ARMV7_COUNTER_MASK)
/*
* Per-CPU PMNC: config reg
*/
......@@ -725,19 +713,17 @@ static inline int armv7_pmnc_has_overflowed(u32 pmnc)
static inline int armv7_pmnc_counter_valid(struct arm_pmu *cpu_pmu, int idx)
{
return idx >= ARMV7_IDX_CYCLE_COUNTER &&
idx <= ARMV7_IDX_COUNTER_LAST(cpu_pmu);
return test_bit(idx, cpu_pmu->cntr_mask);
}
static inline int armv7_pmnc_counter_has_overflowed(u32 pmnc, int idx)
{
return pmnc & BIT(ARMV7_IDX_TO_COUNTER(idx));
return pmnc & BIT(idx);
}
static inline void armv7_pmnc_select_counter(int idx)
{
u32 counter = ARMV7_IDX_TO_COUNTER(idx);
asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (counter));
asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (idx));
isb();
}
......@@ -787,29 +773,25 @@ static inline void armv7_pmnc_write_evtsel(int idx, u32 val)
static inline void armv7_pmnc_enable_counter(int idx)
{
u32 counter = ARMV7_IDX_TO_COUNTER(idx);
asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r" (BIT(counter)));
asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r" (BIT(idx)));
}
static inline void armv7_pmnc_disable_counter(int idx)
{
u32 counter = ARMV7_IDX_TO_COUNTER(idx);
asm volatile("mcr p15, 0, %0, c9, c12, 2" : : "r" (BIT(counter)));
asm volatile("mcr p15, 0, %0, c9, c12, 2" : : "r" (BIT(idx)));
}
static inline void armv7_pmnc_enable_intens(int idx)
{
u32 counter = ARMV7_IDX_TO_COUNTER(idx);
asm volatile("mcr p15, 0, %0, c9, c14, 1" : : "r" (BIT(counter)));
asm volatile("mcr p15, 0, %0, c9, c14, 1" : : "r" (BIT(idx)));
}
static inline void armv7_pmnc_disable_intens(int idx)
{
u32 counter = ARMV7_IDX_TO_COUNTER(idx);
asm volatile("mcr p15, 0, %0, c9, c14, 2" : : "r" (BIT(counter)));
asm volatile("mcr p15, 0, %0, c9, c14, 2" : : "r" (BIT(idx)));
isb();
/* Clear the overflow flag in case an interrupt is pending. */
asm volatile("mcr p15, 0, %0, c9, c12, 3" : : "r" (BIT(counter)));
asm volatile("mcr p15, 0, %0, c9, c12, 3" : : "r" (BIT(idx)));
isb();
}
......@@ -853,15 +835,12 @@ static void armv7_pmnc_dump_regs(struct arm_pmu *cpu_pmu)
asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (val));
pr_info("CCNT =0x%08x\n", val);
for (cnt = ARMV7_IDX_COUNTER0;
cnt <= ARMV7_IDX_COUNTER_LAST(cpu_pmu); cnt++) {
for_each_set_bit(cnt, cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX) {
armv7_pmnc_select_counter(cnt);
asm volatile("mrc p15, 0, %0, c9, c13, 2" : "=r" (val));
pr_info("CNT[%d] count =0x%08x\n",
ARMV7_IDX_TO_COUNTER(cnt), val);
pr_info("CNT[%d] count =0x%08x\n", cnt, val);
asm volatile("mrc p15, 0, %0, c9, c13, 1" : "=r" (val));
pr_info("CNT[%d] evtsel=0x%08x\n",
ARMV7_IDX_TO_COUNTER(cnt), val);
pr_info("CNT[%d] evtsel=0x%08x\n", cnt, val);
}
}
#endif
......@@ -958,7 +937,7 @@ static irqreturn_t armv7pmu_handle_irq(struct arm_pmu *cpu_pmu)
*/
regs = get_irq_regs();
for (idx = 0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMPMU_MAX_HWEVENTS) {
struct perf_event *event = cpuc->events[idx];
struct hw_perf_event *hwc;
......@@ -1027,7 +1006,7 @@ static int armv7pmu_get_event_idx(struct pmu_hw_events *cpuc,
* For anything other than a cycle counter, try and use
* the events counters
*/
for (idx = ARMV7_IDX_COUNTER0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX) {
if (!test_and_set_bit(idx, cpuc->used_mask))
return idx;
}
......@@ -1073,7 +1052,7 @@ static int armv7pmu_set_event_filter(struct hw_perf_event *event,
static void armv7pmu_reset(void *info)
{
struct arm_pmu *cpu_pmu = (struct arm_pmu *)info;
u32 idx, nb_cnt = cpu_pmu->num_events, val;
u32 idx, val;
if (cpu_pmu->secure_access) {
asm volatile("mrc p15, 0, %0, c1, c1, 1" : "=r" (val));
......@@ -1082,7 +1061,7 @@ static void armv7pmu_reset(void *info)
}
/* The counter and interrupt enable registers are unknown at reset. */
for (idx = ARMV7_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMPMU_MAX_HWEVENTS) {
armv7_pmnc_disable_counter(idx);
armv7_pmnc_disable_intens(idx);
}
......@@ -1161,20 +1140,22 @@ static void armv7pmu_init(struct arm_pmu *cpu_pmu)
static void armv7_read_num_pmnc_events(void *info)
{
int *nb_cnt = info;
int nb_cnt;
struct arm_pmu *cpu_pmu = info;
/* Read the nb of CNTx counters supported from PMNC */
*nb_cnt = (armv7_pmnc_read() >> ARMV7_PMNC_N_SHIFT) & ARMV7_PMNC_N_MASK;
nb_cnt = (armv7_pmnc_read() >> ARMV7_PMNC_N_SHIFT) & ARMV7_PMNC_N_MASK;
bitmap_set(cpu_pmu->cntr_mask, 0, nb_cnt);
/* Add the CPU cycles counter */
*nb_cnt += 1;
set_bit(ARMV7_IDX_CYCLE_COUNTER, cpu_pmu->cntr_mask);
}
static int armv7_probe_num_events(struct arm_pmu *arm_pmu)
{
return smp_call_function_any(&arm_pmu->supported_cpus,
armv7_read_num_pmnc_events,
&arm_pmu->num_events, 1);
arm_pmu, 1);
}
static int armv7_a8_pmu_init(struct arm_pmu *cpu_pmu)
......@@ -1524,7 +1505,7 @@ static void krait_pmu_reset(void *info)
{
u32 vval, fval;
struct arm_pmu *cpu_pmu = info;
u32 idx, nb_cnt = cpu_pmu->num_events;
u32 idx;
armv7pmu_reset(info);
......@@ -1538,7 +1519,7 @@ static void krait_pmu_reset(void *info)
venum_post_pmresr(vval, fval);
/* Reset PMxEVNCTCR to sane default */
for (idx = ARMV7_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX) {
armv7_pmnc_select_counter(idx);
asm volatile("mcr p15, 0, %0, c9, c15, 0" : : "r" (0));
}
......@@ -1562,7 +1543,7 @@ static int krait_event_to_bit(struct perf_event *event, unsigned int region,
* Lower bits are reserved for use by the counters (see
* armv7pmu_get_event_idx() for more info)
*/
bit += ARMV7_IDX_COUNTER_LAST(cpu_pmu) + 1;
bit += bitmap_weight(cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX);
return bit;
}
......@@ -1845,7 +1826,7 @@ static void scorpion_pmu_reset(void *info)
{
u32 vval, fval;
struct arm_pmu *cpu_pmu = info;
u32 idx, nb_cnt = cpu_pmu->num_events;
u32 idx;
armv7pmu_reset(info);
......@@ -1860,7 +1841,7 @@ static void scorpion_pmu_reset(void *info)
venum_post_pmresr(vval, fval);
/* Reset PMxEVNCTCR to sane default */
for (idx = ARMV7_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX) {
armv7_pmnc_select_counter(idx);
asm volatile("mcr p15, 0, %0, c9, c15, 0" : : "r" (0));
}
......@@ -1883,7 +1864,7 @@ static int scorpion_event_to_bit(struct perf_event *event, unsigned int region,
* Lower bits are reserved for use by the counters (see
* armv7pmu_get_event_idx() for more info)
*/
bit += ARMV7_IDX_COUNTER_LAST(cpu_pmu) + 1;
bit += bitmap_weight(cpu_pmu->cntr_mask, ARMV7_IDX_COUNTER_MAX);
return bit;
}
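The ARMv7 hunks above replace the num_events count with the cntr_mask bitmap: event counters occupy bits 0..N-1 and the cycle counter moves to bit 31. The following is a minimal user-space sketch of that layout, not kernel code. The names prefixed with sketch_ are hypothetical, and a plain uint32_t stands in for the kernel bitmap.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors ARMV7_IDX_CYCLE_COUNTER (31) from the new code. */
#define SKETCH_CYCLE_IDX 31u

/*
 * Build a counter mask the way armv7_read_num_pmnc_events() now does:
 * set bits 0..nb_cnt-1 for the event counters, then set bit 31 for the
 * cycle counter.
 */
static uint32_t sketch_build_cntr_mask(unsigned int nb_cnt)
{
	uint32_t mask = 0;

	/* At most 31 event counters fit below the cycle counter bit. */
	assert(nb_cnt <= SKETCH_CYCLE_IDX);

	if (nb_cnt)
		mask = (UINT32_C(1) << nb_cnt) - 1;
	mask |= UINT32_C(1) << SKETCH_CYCLE_IDX;
	return mask;
}

/*
 * Count event counters only, i.e. the set bits below the cycle index.
 * This is the quantity bitmap_weight(mask, ARMV7_IDX_COUNTER_MAX) yields.
 */
static unsigned int sketch_event_counters(uint32_t mask)
{
	unsigned int idx, n = 0;

	for (idx = 0; idx < SKETCH_CYCLE_IDX; idx++)
		if (mask & (UINT32_C(1) << idx))
			n++;
	return n;
}
```

With this layout, a loop bounded by ARMV7_IDX_COUNTER_MAX (31) visits only the event counters, while a loop bounded by ARMPMU_MAX_HWEVENTS (32 on 32-bit Arm) also visits the cycle counter. That would explain why the reset and IRQ paths use the larger bound.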
......
......@@ -53,6 +53,8 @@ enum xscale_counters {
XSCALE_COUNTER2,
XSCALE_COUNTER3,
};
#define XSCALE1_NUM_COUNTERS 3
#define XSCALE2_NUM_COUNTERS 5
static const unsigned xscale_perf_map[PERF_COUNT_HW_MAX] = {
PERF_MAP_ALL_UNSUPPORTED,
......@@ -168,7 +170,7 @@ xscale1pmu_handle_irq(struct arm_pmu *cpu_pmu)
regs = get_irq_regs();
for (idx = 0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, XSCALE1_NUM_COUNTERS) {
struct perf_event *event = cpuc->events[idx];
struct hw_perf_event *hwc;
......@@ -364,7 +366,8 @@ static int xscale1pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->start = xscale1pmu_start;
cpu_pmu->stop = xscale1pmu_stop;
cpu_pmu->map_event = xscale_map_event;
cpu_pmu->num_events = 3;
bitmap_set(cpu_pmu->cntr_mask, 0, XSCALE1_NUM_COUNTERS);
return 0;
}
......@@ -500,7 +503,7 @@ xscale2pmu_handle_irq(struct arm_pmu *cpu_pmu)
regs = get_irq_regs();
for (idx = 0; idx < cpu_pmu->num_events; ++idx) {
for_each_set_bit(idx, cpu_pmu->cntr_mask, XSCALE2_NUM_COUNTERS) {
struct perf_event *event = cpuc->events[idx];
struct hw_perf_event *hwc;
......@@ -719,7 +722,8 @@ static int xscale2pmu_init(struct arm_pmu *cpu_pmu)
cpu_pmu->start = xscale2pmu_start;
cpu_pmu->stop = xscale2pmu_stop;
cpu_pmu->map_event = xscale_map_event;
cpu_pmu->num_events = 5;
bitmap_set(cpu_pmu->cntr_mask, 0, XSCALE2_NUM_COUNTERS);
return 0;
}
......
......@@ -107,6 +107,7 @@ struct dwc_pcie_vendor_id {
static const struct dwc_pcie_vendor_id dwc_pcie_vendor_ids[] = {
{.vendor_id = PCI_VENDOR_ID_ALIBABA },
{.vendor_id = PCI_VENDOR_ID_QCOM },
{} /* terminator */
};
......@@ -556,10 +557,10 @@ static int dwc_pcie_register_dev(struct pci_dev *pdev)
{
struct platform_device *plat_dev;
struct dwc_pcie_dev_info *dev_info;
u32 bdf;
u32 sbdf;
bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
plat_dev = platform_device_register_data(NULL, "dwc_pcie_pmu", bdf,
sbdf = (pci_domain_nr(pdev->bus) << 16) | PCI_DEVID(pdev->bus->number, pdev->devfn);
plat_dev = platform_device_register_data(NULL, "dwc_pcie_pmu", sbdf,
pdev, sizeof(*pdev));
if (IS_ERR(plat_dev))
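The hunk above widens the platform device ID from a 16-bit BDF to an SBDF that also carries the PCI domain (segment) number in the upper bits. The following user-space sketch shows the resulting encoding. The sketch_ names are hypothetical, and the devid arithmetic assumes the usual PCI_DEVID() layout of bus in bits 15:8 and devfn in bits 7:0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Pack domain, bus and devfn as the new dwc_pcie_register_dev() does. */
static uint32_t sketch_sbdf(uint16_t domain, uint8_t bus, uint8_t devfn)
{
	uint32_t devid = ((uint32_t)bus << 8) | devfn; /* PCI_DEVID() layout */

	return ((uint32_t)domain << 16) | devid;
}

/*
 * Format the PMU name as the probe path does ("dwc_rootport_%x").
 * Returns the snprintf() result, i.e. the untruncated name length.
 */
static int sketch_pmu_name(char *buf, size_t len, uint32_t sbdf)
{
	assert(buf != NULL);
	return snprintf(buf, len, "dwc_rootport_%x", sbdf);
}
```

Two root ports with the same bus and devfn but different domains now map to different IDs (for example 0x3018 and 0x13018), which presumably is what keeps the platform device and PMU names from colliding on multi-segment systems.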
......@@ -611,15 +612,15 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
struct pci_dev *pdev = plat_dev->dev.platform_data;
struct dwc_pcie_pmu *pcie_pmu;
char *name;
u32 bdf, val;
u32 sbdf, val;
u16 vsec;
int ret;
vsec = pci_find_vsec_capability(pdev, pdev->vendor,
DWC_PCIE_VSEC_RAS_DES_ID);
pci_read_config_dword(pdev, vsec + PCI_VNDR_HEADER, &val);
bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x", bdf);
sbdf = plat_dev->id;
name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x", sbdf);
if (!name)
return -ENOMEM;
......@@ -650,7 +651,7 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
ret = cpuhp_state_add_instance(dwc_pcie_pmu_hp_state,
&pcie_pmu->cpuhp_node);
if (ret) {
pci_err(pdev, "Error %d registering hotplug @%x\n", ret, bdf);
pci_err(pdev, "Error %d registering hotplug @%x\n", ret, sbdf);
return ret;
}
......@@ -663,7 +664,7 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
ret = perf_pmu_register(&pcie_pmu->pmu, name, -1);
if (ret) {
pci_err(pdev, "Error %d registering PMU @%x\n", ret, bdf);
pci_err(pdev, "Error %d registering PMU @%x\n", ret, sbdf);
return ret;
}
ret = devm_add_action_or_reset(&plat_dev->dev, dwc_pcie_unregister_pmu,
......@@ -726,7 +727,6 @@ static struct platform_driver dwc_pcie_pmu_driver = {
static int __init dwc_pcie_pmu_init(void)
{
struct pci_dev *pdev = NULL;
bool found = false;
int ret;
for_each_pci_dev(pdev) {
......@@ -738,11 +738,7 @@ static int __init dwc_pcie_pmu_init(void)
pci_dev_put(pdev);
return ret;
}
found = true;
}
if (!found)
return -ENODEV;
ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
"perf/dwc_pcie_pmu:online",
......
......@@ -141,6 +141,22 @@ static ssize_t bus_show(struct device *dev, struct device_attribute *attr, char
}
static DEVICE_ATTR_RO(bus);
static ssize_t bdf_min_show(struct device *dev, struct device_attribute *attr, char *buf)
{
struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev));
return sysfs_emit(buf, "%#04x\n", pcie_pmu->bdf_min);
}
static DEVICE_ATTR_RO(bdf_min);
static ssize_t bdf_max_show(struct device *dev, struct device_attribute *attr, char *buf)
{
struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev));
return sysfs_emit(buf, "%#04x\n", pcie_pmu->bdf_max);
}
static DEVICE_ATTR_RO(bdf_max);
static struct hisi_pcie_reg_pair
hisi_pcie_parse_reg_value(struct hisi_pcie_pmu *pcie_pmu, u32 reg_off)
{
......@@ -208,7 +224,7 @@ static void hisi_pcie_pmu_writeq(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset,
static u64 hisi_pcie_pmu_get_event_ctrl_val(struct perf_event *event)
{
u64 port, trig_len, thr_len, len_mode;
u64 reg = HISI_PCIE_INIT_SET;
u64 reg = 0;
/* Config HISI_PCIE_EVENT_CTRL according to event. */
reg |= FIELD_PREP(HISI_PCIE_EVENT_M, hisi_pcie_get_real_event(event));
......@@ -452,10 +468,24 @@ static void hisi_pcie_pmu_set_period(struct perf_event *event)
struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
u64 orig_cnt, cnt;
orig_cnt = hisi_pcie_pmu_read_counter(event);
local64_set(&hwc->prev_count, HISI_PCIE_INIT_VAL);
hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_CNT, idx, HISI_PCIE_INIT_VAL);
hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EXT_CNT, idx, HISI_PCIE_INIT_VAL);
/*
* The counter maybe unwritable if the target event is unsupported.
* Check this by comparing the counts after setting the period. If
* the counts stay unchanged after setting the period then update
* the hwc->prev_count correctly. Otherwise the final counts user
* get maybe totally wrong.
*/
cnt = hisi_pcie_pmu_read_counter(event);
if (orig_cnt == cnt)
local64_set(&hwc->prev_count, cnt);
}
static void hisi_pcie_pmu_enable_counter(struct hisi_pcie_pmu *pcie_pmu, struct hw_perf_event *hwc)
......@@ -749,6 +779,8 @@ static const struct attribute_group hisi_pcie_pmu_format_group = {
static struct attribute *hisi_pcie_pmu_bus_attrs[] = {
&dev_attr_bus.attr,
&dev_attr_bdf_max.attr,
&dev_attr_bdf_min.attr,
NULL
};
......
......@@ -10,7 +10,7 @@
#include <linux/perf_event.h>
#include <linux/perf/arm_pmuv3.h>
#define ARMV8_PMU_CYCLE_IDX (ARMV8_PMU_MAX_COUNTERS - 1)
#define KVM_ARMV8_PMU_MAX_COUNTERS 32
#if IS_ENABLED(CONFIG_HW_PERF_EVENTS) && IS_ENABLED(CONFIG_KVM)
struct kvm_pmc {
......@@ -19,14 +19,14 @@ struct kvm_pmc {
};
struct kvm_pmu_events {
u32 events_host;
u32 events_guest;
u64 events_host;
u64 events_guest;
};
struct kvm_pmu {
struct irq_work overflow_work;
struct kvm_pmu_events events;
struct kvm_pmc pmc[ARMV8_PMU_MAX_COUNTERS];
struct kvm_pmc pmc[KVM_ARMV8_PMU_MAX_COUNTERS];
int irq_num;
bool created;
bool irq_level;
......
......@@ -17,10 +17,14 @@
#ifdef CONFIG_ARM_PMU
/*
* The ARMv7 CPU PMU supports up to 32 event counters.
* The Armv7 and Armv8.8 or less CPU PMU supports up to 32 event counters.
* The Armv8.9/9.4 CPU PMU supports up to 33 event counters.
*/
#ifdef CONFIG_ARM
#define ARMPMU_MAX_HWEVENTS 32
#else
#define ARMPMU_MAX_HWEVENTS 33
#endif
/*
* ARM PMU hw_event flags
*/
......@@ -96,7 +100,7 @@ struct arm_pmu {
void (*stop)(struct arm_pmu *);
void (*reset)(void *);
int (*map_event)(struct perf_event *event);
int num_events;
DECLARE_BITMAP(cntr_mask, ARMPMU_MAX_HWEVENTS);
bool secure_access; /* 32-bit ARM only */
#define ARMV8_PMUV3_MAX_COMMON_EVENTS 0x40
DECLARE_BITMAP(pmceid_bitmap, ARMV8_PMUV3_MAX_COMMON_EVENTS);
......
......@@ -6,8 +6,9 @@
#ifndef __PERF_ARM_PMUV3_H
#define __PERF_ARM_PMUV3_H
#define ARMV8_PMU_MAX_COUNTERS 32
#define ARMV8_PMU_COUNTER_MASK (ARMV8_PMU_MAX_COUNTERS - 1)
#define ARMV8_PMU_MAX_GENERAL_COUNTERS 31
#define ARMV8_PMU_CYCLE_IDX 31
#define ARMV8_PMU_INSTR_IDX 32 /* Not accessible from AArch32 */
/*
* Common architectural and microarchitectural event numbers.
......@@ -227,8 +228,10 @@
*/
#define ARMV8_PMU_OVSR_P GENMASK(30, 0)
#define ARMV8_PMU_OVSR_C BIT(31)
#define ARMV8_PMU_OVSR_F BIT_ULL(32) /* arm64 only */
/* Mask for writable bits is both P and C fields */
#define ARMV8_PMU_OVERFLOWED_MASK (ARMV8_PMU_OVSR_P | ARMV8_PMU_OVSR_C)
#define ARMV8_PMU_OVERFLOWED_MASK (ARMV8_PMU_OVSR_P | ARMV8_PMU_OVSR_C | \
ARMV8_PMU_OVSR_F)
/*
* PMXEVTYPER: Event selection reg
......
......@@ -1602,13 +1602,7 @@ static inline int perf_is_paranoid(void)
return sysctl_perf_event_paranoid > -1;
}
static inline int perf_allow_kernel(struct perf_event_attr *attr)
{
if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
return -EACCES;
return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
}
int perf_allow_kernel(struct perf_event_attr *attr);
static inline int perf_allow_cpu(struct perf_event_attr *attr)
{
......
......@@ -13351,6 +13351,15 @@ const struct perf_event_attr *perf_event_attrs(struct perf_event *event)
return &event->attr;
}
int perf_allow_kernel(struct perf_event_attr *attr)
{
if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
return -EACCES;
return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
}
EXPORT_SYMBOL_GPL(perf_allow_kernel);
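perf_allow_kernel() returns 0 when access is allowed and a negative errno otherwise, which is why the SPE driver tests it with a leading "!" and can return its result directly from event_init. The following user-space sketch models that contract. The sketch_ name and the explicit parameters are hypothetical: in the kernel, the paranoid level, the capability check and the security hook result come from global state rather than arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Model of the decision in perf_allow_kernel():
 *  - paranoid:    stands in for sysctl_perf_event_paranoid
 *  - capable:     stands in for perfmon_capable()
 *  - lsm_result:  stands in for the security_perf_event_open() result
 *                 (0 on success, negative errno on denial)
 */
static int sketch_allow_kernel(int paranoid, bool capable, int lsm_result)
{
	/* The modelled hook never returns a positive value. */
	assert(lsm_result <= 0);

	if (paranoid > 1 && !capable)
		return -EACCES;

	return lsm_result;
}
```

A caller that wants a boolean, as set_spe_event_has_cx() does, negates the result; a caller that wants an error code, as arm_spe_pmu_event_init() does, returns it unchanged. Moving the function out of line and exporting it presumably lets modular drivers call it without needing the sysctl variable itself.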
/*
* Inherit an event from parent task to child task.
*
......