Commit 51effa6d authored by Catalin Marinas's avatar Catalin Marinas

Merge branch 'for-next/perf' into for-next/core

- Support for additional PMU topologies on HiSilicon platforms
- Support for CCN-512 interconnect PMU
- Support for AXI ID filtering in the IMX8 DDR PMU
- Support for the CCPI2 uncore PMU in ThunderX2
- Driver cleanup to use devm_platform_ioremap_resource()

* for-next/perf:
  drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platform
  perf/imx_ddr: Dump AXI ID filter info to userspace
  docs/perf: Add AXI ID filter capabilities information
  perf/imx_ddr: Add driver for DDR PMU in i.MX8MPlus
  perf/imx_ddr: Add enhanced AXI ID filter support
  bindings: perf: imx-ddr: Add new compatible string
  docs/perf: Add explanation for DDR_CAP_AXI_ID_FILTER_ENHANCED quirk
  arm64: perf: Simplify the ARMv8 PMUv3 event attributes
  drivers/perf: Add CCPI2 PMU support in ThunderX2 UNCORE driver.
  Documentation: perf: Update documentation for ThunderX2 PMU uncore driver
  Documentation: Add documentation for CCN-512 DTS binding
  perf: arm-ccn: Enable stats for CCN-512 interconnect
  perf/smmuv3: use devm_platform_ioremap_resource() to simplify code
  perf/arm-cci: use devm_platform_ioremap_resource() to simplify code
  perf/arm-ccn: use devm_platform_ioremap_resource() to simplify code
  perf: xgene: use devm_platform_ioremap_resource() to simplify code
  perf: hisi: use devm_platform_ioremap_resource() to simplify code
parents c1c9ea63 8703317a
......@@ -17,7 +17,8 @@ The "format" directory describes format of the config (event ID) and config1
(AXI filtering) fields of the perf_event_attr structure, see /sys/bus/event_source/
devices/imx8_ddr0/format/. The "events" directory describes the events types
hardware supported that can be used with perf tool, see /sys/bus/event_source/
devices/imx8_ddr0/events/.
devices/imx8_ddr0/events/. The "caps" directory describes filter features implemented
in DDR PMU, see /sys/bus/events_source/devices/imx8_ddr0/caps/.
e.g.::
perf stat -a -e imx8_ddr0/cycles/ cmd
perf stat -a -e imx8_ddr0/read/,imx8_ddr0/write/ cmd
......@@ -25,9 +26,12 @@ devices/imx8_ddr0/events/.
AXI filtering is only used by CSV modes 0x41 (axid-read) and 0x42 (axid-write)
to count reading or writing matches filter setting. Filter setting is various
from different DRAM controller implementations, which is distinguished by quirks
in the driver.
in the driver. You also can dump info from userspace, filter in "caps" directory
indicates whether PMU supports AXI ID filter or not; enhanced_filter indicates
whether PMU supports enhanced AXI ID filter or not. Value 0 for un-supported, and
value 1 for supported.
* With DDR_CAP_AXI_ID_FILTER quirk.
* With DDR_CAP_AXI_ID_FILTER quirk(filter: 1, enhanced_filter: 0).
Filter is defined with two configuration parts:
--AXI_ID defines AxID matching value.
--AXI_MASKING defines which bits of AxID are meaningful for the matching.
......@@ -50,3 +54,8 @@ in the driver.
axi_id to monitor a specific id, rather than having to specify axi_mask.
e.g.::
perf stat -a -e imx8_ddr0/axid-read,axi_id=0x12/ cmd, which will monitor ARID=0x12
* With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk(filter: 1, enhanced_filter: 1).
This is an extension to the DDR_CAP_AXI_ID_FILTER quirk which permits
counting the number of bytes (as opposed to the number of bursts) from DDR
read and write transactions concurrently with another set of data counters.
......@@ -3,24 +3,26 @@ Cavium ThunderX2 SoC Performance Monitoring Unit (PMU UNCORE)
=============================================================
The ThunderX2 SoC PMU consists of independent, system-wide, per-socket
PMUs such as the Level 3 Cache (L3C) and DDR4 Memory Controller (DMC).
PMUs such as the Level 3 Cache (L3C), DDR4 Memory Controller (DMC) and
Cavium Coherent Processor Interconnect (CCPI2).
The DMC has 8 interleaved channels and the L3C has 16 interleaved tiles.
Events are counted for the default channel (i.e. channel 0) and prorated
to the total number of channels/tiles.
The DMC and L3C support up to 4 counters. Counters are independently
programmable and can be started and stopped individually. Each counter
can be set to a different event. Counters are 32-bit and do not support
an overflow interrupt; they are read every 2 seconds.
The DMC and L3C support up to 4 counters, while the CCPI2 supports up to 8
counters. Counters are independently programmable to different events and
can be started and stopped individually. None of the counters support an
overflow interrupt. DMC and L3C counters are 32-bit and read every 2 seconds.
The CCPI2 counters are 64-bit and assumed not to overflow in normal operation.
PMU UNCORE (perf) driver:
The thunderx2_pmu driver registers per-socket perf PMUs for the DMC and
L3C devices. Each PMU can be used to count up to 4 events
simultaneously. The PMUs provide a description of their available events
and configuration options under sysfs, see
/sys/devices/uncore_<l3c_S/dmc_S/>; S is the socket id.
L3C devices. Each PMU can be used to count up to 4 (DMC/L3C) or up to 8
(CCPI2) events simultaneously. The PMUs provide a description of their
available events and configuration options under sysfs, see
/sys/devices/uncore_<l3c_S/dmc_S/ccpi2_S/>; S is the socket id.
The driver does not support sampling, therefore "perf record" will not
work. Per-task perf sessions are also not supported.
......
......@@ -6,6 +6,7 @@ Required properties:
"arm,ccn-502"
"arm,ccn-504"
"arm,ccn-508"
"arm,ccn-512"
- reg: (standard registers property) physical address and size
(16MB) of the configuration registers block
......
......@@ -5,6 +5,7 @@ Required properties:
- compatible: should be one of:
"fsl,imx8-ddr-pmu"
"fsl,imx8m-ddr-pmu"
"fsl,imx8mp-ddr-pmu"
- reg: physical address and size
......
This diff is collapsed.
......@@ -1642,7 +1642,6 @@ static struct cci_pmu *cci_pmu_alloc(struct device *dev)
static int cci_pmu_probe(struct platform_device *pdev)
{
struct resource *res;
struct cci_pmu *cci_pmu;
int i, ret, irq;
......@@ -1650,8 +1649,7 @@ static int cci_pmu_probe(struct platform_device *pdev)
if (IS_ERR(cci_pmu))
return PTR_ERR(cci_pmu);
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
cci_pmu->base = devm_ioremap_resource(&pdev->dev, res);
cci_pmu->base = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(cci_pmu->base))
return -ENOMEM;
......
......@@ -1477,8 +1477,7 @@ static int arm_ccn_probe(struct platform_device *pdev)
ccn->dev = &pdev->dev;
platform_set_drvdata(pdev, ccn);
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
ccn->base = devm_ioremap_resource(ccn->dev, res);
ccn->base = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(ccn->base))
return PTR_ERR(ccn->base);
......@@ -1537,6 +1536,7 @@ static int arm_ccn_remove(struct platform_device *pdev)
static const struct of_device_id arm_ccn_match[] = {
{ .compatible = "arm,ccn-502", },
{ .compatible = "arm,ccn-504", },
{ .compatible = "arm,ccn-512", },
{},
};
MODULE_DEVICE_TABLE(of, arm_ccn_match);
......
......@@ -727,7 +727,7 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu *smmu_pmu)
static int smmu_pmu_probe(struct platform_device *pdev)
{
struct smmu_pmu *smmu_pmu;
struct resource *res_0, *res_1;
struct resource *res_0;
u32 cfgr, reg_size;
u64 ceid_64[2];
int irq, err;
......@@ -764,8 +764,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
/* Determine if page 1 is present */
if (cfgr & SMMU_PMCG_CFGR_RELOC_CTRS) {
res_1 = platform_get_resource(pdev, IORESOURCE_MEM, 1);
smmu_pmu->reloc_base = devm_ioremap_resource(dev, res_1);
smmu_pmu->reloc_base = devm_platform_ioremap_resource(pdev, 1);
if (IS_ERR(smmu_pmu->reloc_base))
return PTR_ERR(smmu_pmu->reloc_base);
} else {
......
......@@ -45,7 +45,8 @@
static DEFINE_IDA(ddr_ida);
/* DDR Perf hardware feature */
#define DDR_CAP_AXI_ID_FILTER 0x1 /* support AXI ID filter */
#define DDR_CAP_AXI_ID_FILTER 0x1 /* support AXI ID filter */
#define DDR_CAP_AXI_ID_FILTER_ENHANCED 0x3 /* support enhanced AXI ID filter */
struct fsl_ddr_devtype_data {
unsigned int quirks; /* quirks needed for different DDR Perf core */
......@@ -57,9 +58,14 @@ static const struct fsl_ddr_devtype_data imx8m_devtype_data = {
.quirks = DDR_CAP_AXI_ID_FILTER,
};
static const struct fsl_ddr_devtype_data imx8mp_devtype_data = {
.quirks = DDR_CAP_AXI_ID_FILTER_ENHANCED,
};
static const struct of_device_id imx_ddr_pmu_dt_ids[] = {
{ .compatible = "fsl,imx8-ddr-pmu", .data = &imx8_devtype_data},
{ .compatible = "fsl,imx8m-ddr-pmu", .data = &imx8m_devtype_data},
{ .compatible = "fsl,imx8mp-ddr-pmu", .data = &imx8mp_devtype_data},
{ /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, imx_ddr_pmu_dt_ids);
......@@ -78,6 +84,61 @@ struct ddr_pmu {
int id;
};
enum ddr_perf_filter_capabilities {
PERF_CAP_AXI_ID_FILTER = 0,
PERF_CAP_AXI_ID_FILTER_ENHANCED,
PERF_CAP_AXI_ID_FEAT_MAX,
};
static u32 ddr_perf_filter_cap_get(struct ddr_pmu *pmu, int cap)
{
u32 quirks = pmu->devtype_data->quirks;
switch (cap) {
case PERF_CAP_AXI_ID_FILTER:
return !!(quirks & DDR_CAP_AXI_ID_FILTER);
case PERF_CAP_AXI_ID_FILTER_ENHANCED:
quirks &= DDR_CAP_AXI_ID_FILTER_ENHANCED;
return quirks == DDR_CAP_AXI_ID_FILTER_ENHANCED;
default:
WARN(1, "unknown filter cap %d\n", cap);
}
return 0;
}
static ssize_t ddr_perf_filter_cap_show(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct ddr_pmu *pmu = dev_get_drvdata(dev);
struct dev_ext_attribute *ea =
container_of(attr, struct dev_ext_attribute, attr);
int cap = (long)ea->var;
return snprintf(buf, PAGE_SIZE, "%u\n",
ddr_perf_filter_cap_get(pmu, cap));
}
#define PERF_EXT_ATTR_ENTRY(_name, _func, _var) \
(&((struct dev_ext_attribute) { \
__ATTR(_name, 0444, _func, NULL), (void *)_var \
}).attr.attr)
#define PERF_FILTER_EXT_ATTR_ENTRY(_name, _var) \
PERF_EXT_ATTR_ENTRY(_name, ddr_perf_filter_cap_show, _var)
static struct attribute *ddr_perf_filter_cap_attr[] = {
PERF_FILTER_EXT_ATTR_ENTRY(filter, PERF_CAP_AXI_ID_FILTER),
PERF_FILTER_EXT_ATTR_ENTRY(enhanced_filter, PERF_CAP_AXI_ID_FILTER_ENHANCED),
NULL,
};
static struct attribute_group ddr_perf_filter_cap_attr_group = {
.name = "caps",
.attrs = ddr_perf_filter_cap_attr,
};
static ssize_t ddr_perf_cpumask_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
......@@ -175,9 +236,40 @@ static const struct attribute_group *attr_groups[] = {
&ddr_perf_events_attr_group,
&ddr_perf_format_attr_group,
&ddr_perf_cpumask_attr_group,
&ddr_perf_filter_cap_attr_group,
NULL,
};
static bool ddr_perf_is_filtered(struct perf_event *event)
{
return event->attr.config == 0x41 || event->attr.config == 0x42;
}
static u32 ddr_perf_filter_val(struct perf_event *event)
{
return event->attr.config1;
}
static bool ddr_perf_filters_compatible(struct perf_event *a,
struct perf_event *b)
{
if (!ddr_perf_is_filtered(a))
return true;
if (!ddr_perf_is_filtered(b))
return true;
return ddr_perf_filter_val(a) == ddr_perf_filter_val(b);
}
static bool ddr_perf_is_enhanced_filtered(struct perf_event *event)
{
unsigned int filt;
struct ddr_pmu *pmu = to_ddr_pmu(event->pmu);
filt = pmu->devtype_data->quirks & DDR_CAP_AXI_ID_FILTER_ENHANCED;
return (filt == DDR_CAP_AXI_ID_FILTER_ENHANCED) &&
ddr_perf_is_filtered(event);
}
static u32 ddr_perf_alloc_counter(struct ddr_pmu *pmu, int event)
{
int i;
......@@ -209,27 +301,17 @@ static void ddr_perf_free_counter(struct ddr_pmu *pmu, int counter)
static u32 ddr_perf_read_counter(struct ddr_pmu *pmu, int counter)
{
return readl_relaxed(pmu->base + COUNTER_READ + counter * 4);
}
static bool ddr_perf_is_filtered(struct perf_event *event)
{
return event->attr.config == 0x41 || event->attr.config == 0x42;
}
struct perf_event *event = pmu->events[counter];
void __iomem *base = pmu->base;
static u32 ddr_perf_filter_val(struct perf_event *event)
{
return event->attr.config1;
}
static bool ddr_perf_filters_compatible(struct perf_event *a,
struct perf_event *b)
{
if (!ddr_perf_is_filtered(a))
return true;
if (!ddr_perf_is_filtered(b))
return true;
return ddr_perf_filter_val(a) == ddr_perf_filter_val(b);
/*
* return bytes instead of bursts from ddr transaction for
* axid-read and axid-write event if PMU core supports enhanced
* filter.
*/
base += ddr_perf_is_enhanced_filtered(event) ? COUNTER_DPCR1 :
COUNTER_READ;
return readl_relaxed(base + counter * 4);
}
static int ddr_perf_event_init(struct perf_event *event)
......
......@@ -243,8 +243,6 @@ MODULE_DEVICE_TABLE(acpi, hisi_ddrc_pmu_acpi_match);
static int hisi_ddrc_pmu_init_data(struct platform_device *pdev,
struct hisi_pmu *ddrc_pmu)
{
struct resource *res;
/*
* Use the SCCL_ID and DDRC channel ID to identify the
* DDRC PMU, while SCCL_ID is in MPIDR[aff2].
......@@ -263,8 +261,7 @@ static int hisi_ddrc_pmu_init_data(struct platform_device *pdev,
/* DDRC PMUs only share the same SCCL */
ddrc_pmu->ccl_id = -1;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
ddrc_pmu->base = devm_ioremap_resource(&pdev->dev, res);
ddrc_pmu->base = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(ddrc_pmu->base)) {
dev_err(&pdev->dev, "ioremap failed for ddrc_pmu resource\n");
return PTR_ERR(ddrc_pmu->base);
......
......@@ -234,7 +234,6 @@ static int hisi_hha_pmu_init_data(struct platform_device *pdev,
struct hisi_pmu *hha_pmu)
{
unsigned long long id;
struct resource *res;
acpi_status status;
status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
......@@ -256,8 +255,7 @@ static int hisi_hha_pmu_init_data(struct platform_device *pdev,
/* HHA PMUs only share the same SCCL */
hha_pmu->ccl_id = -1;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
hha_pmu->base = devm_ioremap_resource(&pdev->dev, res);
hha_pmu->base = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(hha_pmu->base)) {
dev_err(&pdev->dev, "ioremap failed for hha_pmu resource\n");
return PTR_ERR(hha_pmu->base);
......
......@@ -233,7 +233,6 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
struct hisi_pmu *l3c_pmu)
{
unsigned long long id;
struct resource *res;
acpi_status status;
status = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
......@@ -259,8 +258,7 @@ static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
return -EINVAL;
}
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
l3c_pmu->base = devm_ioremap_resource(&pdev->dev, res);
l3c_pmu->base = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(l3c_pmu->base)) {
dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n");
return PTR_ERR(l3c_pmu->base);
......
......@@ -15,6 +15,7 @@
#include <linux/errno.h>
#include <linux/interrupt.h>
#include <asm/cputype.h>
#include <asm/local64.h>
#include "hisi_uncore_pmu.h"
......@@ -338,8 +339,10 @@ void hisi_uncore_pmu_disable(struct pmu *pmu)
/*
* Read Super CPU cluster and CPU cluster ID from MPIDR_EL1.
* If multi-threading is supported, CCL_ID is the low 3-bits in MPIDR[Aff2]
* and SCCL_ID is the upper 5-bits of Aff2 field; if not, SCCL_ID
* If multi-threading is supported, On Huawei Kunpeng 920 SoC whose cpu
* core is tsv110, CCL_ID is the low 3-bits in MPIDR[Aff2] and SCCL_ID
* is the upper 5-bits of Aff2 field; while for other cpu types, SCCL_ID
* is in MPIDR[Aff3] and CCL_ID is in MPIDR[Aff2], if not, SCCL_ID
* is in MPIDR[Aff2] and CCL_ID is in MPIDR[Aff1].
*/
static void hisi_read_sccl_and_ccl_id(int *sccl_id, int *ccl_id)
......@@ -347,12 +350,19 @@ static void hisi_read_sccl_and_ccl_id(int *sccl_id, int *ccl_id)
u64 mpidr = read_cpuid_mpidr();
if (mpidr & MPIDR_MT_BITMASK) {
int aff2 = MPIDR_AFFINITY_LEVEL(mpidr, 2);
if (sccl_id)
*sccl_id = aff2 >> 3;
if (ccl_id)
*ccl_id = aff2 & 0x7;
if (read_cpuid_part_number() == HISI_CPU_PART_TSV110) {
int aff2 = MPIDR_AFFINITY_LEVEL(mpidr, 2);
if (sccl_id)
*sccl_id = aff2 >> 3;
if (ccl_id)
*ccl_id = aff2 & 0x7;
} else {
if (sccl_id)
*sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 3);
if (ccl_id)
*ccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
}
} else {
if (sccl_id)
*sccl_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
......
This diff is collapsed.
......@@ -1282,25 +1282,21 @@ static int acpi_pmu_probe_active_mcb_mcu_l3c(struct xgene_pmu *xgene_pmu,
struct platform_device *pdev)
{
void __iomem *csw_csr, *mcba_csr, *mcbb_csr;
struct resource *res;
unsigned int reg;
res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
csw_csr = devm_ioremap_resource(&pdev->dev, res);
csw_csr = devm_platform_ioremap_resource(pdev, 1);
if (IS_ERR(csw_csr)) {
dev_err(&pdev->dev, "ioremap failed for CSW CSR resource\n");
return PTR_ERR(csw_csr);
}
res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
mcba_csr = devm_ioremap_resource(&pdev->dev, res);
mcba_csr = devm_platform_ioremap_resource(pdev, 2);
if (IS_ERR(mcba_csr)) {
dev_err(&pdev->dev, "ioremap failed for MCBA CSR resource\n");
return PTR_ERR(mcba_csr);
}
res = platform_get_resource(pdev, IORESOURCE_MEM, 3);
mcbb_csr = devm_ioremap_resource(&pdev->dev, res);
mcbb_csr = devm_platform_ioremap_resource(pdev, 3);
if (IS_ERR(mcbb_csr)) {
dev_err(&pdev->dev, "ioremap failed for MCBB CSR resource\n");
return PTR_ERR(mcbb_csr);
......@@ -1332,13 +1328,11 @@ static int acpi_pmu_v3_probe_active_mcb_mcu_l3c(struct xgene_pmu *xgene_pmu,
struct platform_device *pdev)
{
void __iomem *csw_csr;
struct resource *res;
unsigned int reg;
u32 mcb0routing;
u32 mcb1routing;
res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
csw_csr = devm_ioremap_resource(&pdev->dev, res);
csw_csr = devm_platform_ioremap_resource(pdev, 1);
if (IS_ERR(csw_csr)) {
dev_err(&pdev->dev, "ioremap failed for CSW CSR resource\n");
return PTR_ERR(csw_csr);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment