perf vendor events intel: Refresh skylakex metrics

Update the skylakex events from 1.28 to 1.29 and metrics to TMA version 4.5. Generation was done using https://github.com/intel/perfmon.

Notable changes: TMA info metrics are renamed from their node name to be lower case and prefixed by tma_info_, MetricThreshold expressions are added, "Sample with" documentation is added to many TMA metrics, and smi_cost and transaction metric groups are added, replicating existing hard-coded metrics in stat-shadow.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Link: https://lore.kernel.org/r/20230219092848.639226-30-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
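
The renaming rule in the commit message (lower-case the TMA node name and prefix it with tma_info_) can be illustrated with a minimal sketch. The helper name `tma_info_name` and the sample node names are hypothetical and chosen only for illustration; the real names come from the perfmon generation scripts and the generated skx-metrics.json.

```python
def tma_info_name(node_name: str) -> str:
    """Apply the rename described in the commit message.

    Hypothetical helper: lower-case the TMA info node name and
    prefix it with "tma_info_".
    """
    return "tma_info_" + node_name.lower()


if __name__ == "__main__":
    # Illustrative node names only; not taken from this diff.
    for node in ("CoreIPC", "Load_Miss_Real_Latency"):
        print(node, "->", tma_info_name(node))
```

Scripts that request metrics by their old node names would presumably need the same mapping applied before passing them to `perf stat -M`.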

Commit 100ee7c3 authored Feb 19, 2023 by Ian Rogers Committed by Arnaldo Carvalho de Melo Feb 19, 2023
8 changed files
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -25,7 +25,7 @@ GenuineIntel-6-2A,v18,sandybridge,core
 GenuineIntel-6-(8F|CF),v1.11,sapphirerapids,core
 GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
 GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v54,skylake,core
-GenuineIntel-6-55-[01234],v1.28,skylakex,core
+GenuineIntel-6-55-[01234],v1.29,skylakex,core
 GenuineIntel-6-86,v1.20,snowridgex,core
 GenuineIntel-6-8[CD],v1.08,tigerlake,core
 GenuineIntel-6-2C,v3,westmereep-dp,core
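
As a rough illustration of what the mapfile.csv hunk above changes, the sketch below looks up an event-table version from a CPU identifier. It assumes the first column is a regular expression matched against a vendor-family-model-stepping string such as "GenuineIntel-6-55-4"; the `lookup` helper and the use of a full-string match are simplifications for illustration, not perf's actual matching code.

```python
import csv
import io
import re

# Three rows copied from the post-change side of the hunk above.
MAPFILE_EXCERPT = """\
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v54,skylake,core
GenuineIntel-6-55-[01234],v1.29,skylakex,core
GenuineIntel-6-86,v1.20,snowridgex,core
"""


def lookup(cpuid: str):
    """Return (version, directory) for the first row whose pattern
    matches the whole cpuid string, or None if no row matches.

    Hypothetical helper; full-string matching is an assumption.
    """
    for pattern, version, directory, _kind in csv.reader(io.StringIO(MAPFILE_EXCERPT)):
        if re.fullmatch(pattern, cpuid):
            return version, directory
    return None


if __name__ == "__main__":
    print(lookup("GenuineIntel-6-55-4"))
    print(lookup("GenuineIntel-6-55-7"))
```

Under these assumptions, model 0x55 with steppings 0 through 4 resolves to the skylakex directory at v1.29, while other steppings of the same model fall through to a different row or to no match.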

--- a/tools/perf/pmu-events/arch/x86/skylakex/cache.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/cache.json
@@ -234,20 +234,22 @@
        "UMask": "0x4f"
    },
    {
-        "BriefDescription": "All retired load instructions.",
+        "BriefDescription": "Retired load instructions.",
        "Data_LA": "1",
        "EventCode": "0xD0",
        "EventName": "MEM_INST_RETIRED.ALL_LOADS",
        "PEBS": "1",
+        "PublicDescription": "Counts all retired load instructions. This event accounts for SW prefetch instructions of PREFETCHNTA or PREFETCHT0/1/2 or PREFETCHW.",
        "SampleAfterValue": "2000003",
        "UMask": "0x81"
    },
    {
-        "BriefDescription": "All retired store instructions.",
+        "BriefDescription": "Retired store instructions.",
        "Data_LA": "1",
        "EventCode": "0xD0",
        "EventName": "MEM_INST_RETIRED.ALL_STORES",
        "PEBS": "1",
+        "PublicDescription": "Counts all retired store instructions.",
        "SampleAfterValue": "2000003",
        "UMask": "0x82"
    },
@@ -484,7 +486,7 @@
        "UMask": "0x80"
    },
    {
-        "BriefDescription": "Cacheable and noncachaeble code read requests",
+        "BriefDescription": "Cacheable and non-cacheable code read requests",
        "EventCode": "0xB0",
        "EventName": "OFFCORE_REQUESTS.DEMAND_CODE_RD",
        "PublicDescription": "Counts both cacheable and non-cacheable code read requests.",

--- a/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
@@ -322,7 +322,7 @@
        "UMask": "0x4"
    },
    {
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy",
        "CounterMask": "1",
        "EventCode": "0x79",
        "EventName": "IDQ.MS_CYCLES",
@@ -331,7 +331,7 @@
        "UMask": "0x30"
    },
    {
-        "BriefDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "BriefDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy",
        "CounterMask": "1",
        "EventCode": "0x79",
        "EventName": "IDQ.MS_DSB_CYCLES",
@@ -340,7 +340,7 @@
        "UMask": "0x10"
    },
    {
-        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy",
        "EventCode": "0x79",
        "EventName": "IDQ.MS_MITE_UOPS",
        "PublicDescription": "Counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ.",
@@ -358,7 +358,7 @@
        "UMask": "0x30"
    },
    {
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequencer (MS) is busy",
        "EventCode": "0x79",
        "EventName": "IDQ.MS_UOPS",
        "PublicDescription": "Counts the total number of uops delivered by the Microcode Sequencer (MS). Any instruction over 4 uops will be delivered by the MS. Some instructions such as transcendentals may additionally generate uops from the MS.",

--- a/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
@@ -93,6 +93,22 @@
        "SampleAfterValue": "400009",
        "UMask": "0x10"
    },
+    {
+        "BriefDescription": "Speculative and retired mispredicted macro conditional branches",
+        "EventCode": "0x89",
+        "EventName": "BR_MISP_EXEC.ALL_BRANCHES",
+        "PublicDescription": "This event counts both taken and not taken speculative and retired mispredicted branch instructions.",
+        "SampleAfterValue": "200003",
+        "UMask": "0xff"
+    },
+    {
+        "BriefDescription": "Speculative mispredicted indirect branches",
+        "EventCode": "0x89",
+        "EventName": "BR_MISP_EXEC.INDIRECT",
+        "PublicDescription": "Counts speculatively miss-predicted indirect branches at execution time. Counts for indirect near CALL or JMP instructions (RET excluded).",
+        "SampleAfterValue": "200003",
+        "UMask": "0xe4"
+    },
    {
        "BriefDescription": "All mispredicted macro branch instructions retired.",
        "EventCode": "0xC5",

--- a/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
--- a/tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json
@@ -1952,7 +1952,7 @@
        "EventCode": "0x81",
        "EventName": "UNC_M_WPQ_OCCUPANCY",
        "PerPkg": "1",
-        "PublicDescription": "Counts the number of entries in the Write Pending Queue (WPQ) at each cycle.  This can then be used to calculate both the average queue occupancy (in conjunction with the number of cycles not empty) and the average latency (in conjunction with the number of allocations).  The WPQ is used to schedule writes out to the memory controller and to track the requests.  Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC (memory controller).  They deallocate after being issued to DRAM.  Write requests themselves are able to complete (from the perspective of the rest of the system) as soon they have 'posted' to the iMC.  This is not to be confused with actually performing the write to DRAM.  Therefore, the average latency for this queue is actually not useful for deconstruction intermediate write latencies.  So, we provide filtering based on if the request has posted or not.  By using the 'not posted' filter, we can track how long writes spent in the iMC before completions were sent to the HA.  The 'posted' filter, on the other hand, provides information about how much queueing is actually happenning in the iMC for writes before they are actually issued to memory.  High average occupancies will generally coincide with high write major mode counts. Is there a filter of sorts???",
+        "PublicDescription": "Counts the number of entries in the Write Pending Queue (WPQ) at each cycle.  This can then be used to calculate both the average queue occupancy (in conjunction with the number of cycles not empty) and the average latency (in conjunction with the number of allocations).  The WPQ is used to schedule writes out to the memory controller and to track the requests.  Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC (memory controller).  They deallocate after being issued to DRAM.  Write requests themselves are able to complete (from the perspective of the rest of the system) as soon they have 'posted' to the iMC.  This is not to be confused with actually performing the write to DRAM.  Therefore, the average latency for this queue is actually not useful for deconstruction intermediate write latencies.  So, we provide filtering based on if the request has posted or not.  By using the 'not posted' filter, we can track how long writes spent in the iMC before completions were sent to the HA.  The 'posted' filter, on the other hand, provides information about how much queueing is actually happening in the iMC for writes before they are actually issued to memory.  High average occupancies will generally coincide with high write major mode counts. Is there a filter of sorts???",
        "Unit": "iMC"
    },
    {

--- a/tools/perf/pmu-events/arch/x86/skylakex/uncore-other.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/uncore-other.json
@@ -804,7 +804,7 @@
        "Unit": "CHA"
    },
    {
-        "BriefDescription": "Counts the number of Allocate/Update to HitMe Cache; Deallocate HtiME$ on Reads without RspFwdI*",
+        "BriefDescription": "Counts the number of Allocate/Update to HitMe Cache; Deallocate HitME$ on Reads without RspFwdI*",
        "EventCode": "0x61",
        "EventName": "UNC_CHA_HITME_UPDATE.DEALLOCATE",
        "PerPkg": "1",
@@ -3321,7 +3321,7 @@
        "EventCode": "0x5C",
        "EventName": "UNC_CHA_SNOOP_RESP.RSP_FWD_WB",
        "PerPkg": "1",
-        "PublicDescription": "Counts when a transaction with the opcode type Rsp*Fwd*WB Snoop Response was received which indicates the data was written back to its home socket, and the cacheline was forwarded to the requestor socket.  This snoop response is only used in &gt;= 4 socket systems.  It is used when a snoop HITM's in a remote caching agent and it directly forwards data to a requestor, and simultaneously returns data to its home socket to be written back to memory.",
+        "PublicDescription": "Counts when a transaction with the opcode type Rsp*Fwd*WB Snoop Response was received which indicates the data was written back to its home socket, and the cacheline was forwarded to the requestor socket.  This snoop response is only used in >= 4 socket systems.  It is used when a snoop HITM's in a remote caching agent and it directly forwards data to a requestor, and simultaneously returns data to its home socket to be written back to memory.",
        "UMask": "0x20",
        "Unit": "CHA"
    },
@@ -4039,20 +4039,22 @@
        "Unit": "CHA"
    },
    {
-        "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_CRD",
+        "BriefDescription": "TOR Occupancy : CRds issued by iA Cores that Hit the LLC",
        "EventCode": "0x36",
        "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_CRD",
        "Filter": "config1=0x40233",
        "PerPkg": "1",
+        "PublicDescription": "TOR Occupancy : CRds issued by iA Cores that Hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent.     Does not include addressless requests such as locks and interrupts.",
        "UMask": "0x11",
        "Unit": "CHA"
    },
    {
-        "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_DRD",
+        "BriefDescription": "TOR Occupancy : DRds issued by iA Cores that Hit the LLC",
        "EventCode": "0x36",
        "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_DRD",
        "Filter": "config1=0x40433",
        "PerPkg": "1",
+        "PublicDescription": "TOR Occupancy : DRds issued by iA Cores that Hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent.     Does not include addressless requests such as locks and interrupts.",
        "UMask": "0x11",
        "Unit": "CHA"
    },
@@ -4075,20 +4077,22 @@
        "Unit": "CHA"
    },
    {
-        "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_LlcPrefRFO",
+        "BriefDescription": "TOR Occupancy : LLCPrefRFO issued by iA Cores that hit the LLC",
        "EventCode": "0x36",
        "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_LlcPrefRFO",
        "Filter": "config1=0x4b033",
        "PerPkg": "1",
+        "PublicDescription": "TOR Occupancy : LLCPrefRFO issued by iA Cores that hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent.     Does not include addressless requests such as locks and interrupts.",
        "UMask": "0x11",
        "Unit": "CHA"
    },
    {
-        "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_RFO",
+        "BriefDescription": "TOR Occupancy : RFOs issued by iA Cores that Hit the LLC",
        "EventCode": "0x36",
        "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_RFO",
        "Filter": "config1=0x40033",
        "PerPkg": "1",
+        "PublicDescription": "TOR Occupancy : RFOs issued by iA Cores that Hit the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent.     Does not include addressless requests such as locks and interrupts.",
        "UMask": "0x11",
        "Unit": "CHA"
    },
@@ -4102,20 +4106,22 @@
        "Unit": "CHA"
    },
    {
-        "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_CRD",
+        "BriefDescription": "TOR Occupancy : CRds issued by iA Cores that Missed the LLC",
        "EventCode": "0x36",
        "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_CRD",
        "Filter": "config1=0x40233",
        "PerPkg": "1",
+        "PublicDescription": "TOR Occupancy : CRds issued by iA Cores that Missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent.     Does not include addressless requests such as locks and interrupts.",
        "UMask": "0x21",
        "Unit": "CHA"
    },
    {
-        "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD",
+        "BriefDescription": "TOR Occupancy : DRds issued by iA Cores that Missed the LLC",
        "EventCode": "0x36",
        "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD",
        "Filter": "config1=0x40433",
        "PerPkg": "1",
+        "PublicDescription": "TOR Occupancy : DRds issued by iA Cores that Missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent.     Does not include addressless requests such as locks and interrupts.",
        "UMask": "0x21",
        "Unit": "CHA"
    },
@@ -4138,20 +4144,22 @@
        "Unit": "CHA"
    },
    {
-        "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LlcPrefRFO",
+        "BriefDescription": "TOR Occupancy : LLCPrefRFO issued by iA Cores that missed the LLC",
        "EventCode": "0x36",
        "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LlcPrefRFO",
        "Filter": "config1=0x4b033",
        "PerPkg": "1",
+        "PublicDescription": "TOR Occupancy : LLCPrefRFO issued by iA Cores that missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent.     Does not include addressless requests such as locks and interrupts.",
        "UMask": "0x21",
        "Unit": "CHA"
    },
    {
-        "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO",
+        "BriefDescription": "TOR Occupancy : RFOs issued by iA Cores that Missed the LLC",
        "EventCode": "0x36",
        "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO",
        "Filter": "config1=0x40033",
        "PerPkg": "1",
+        "PublicDescription": "TOR Occupancy : RFOs issued by iA Cores that Missed the LLC : For each cycle, this event accumulates the number of valid entries in the TOR that match qualifications specified by the subevent.     Does not include addressless requests such as locks and interrupts.",
        "UMask": "0x21",
        "Unit": "CHA"
    },
@@ -11909,7 +11917,7 @@
        "Unit": "IIO"
    },
    {
-        "BriefDescription": "UNC_IIO_NOTHING",
+        "BriefDescription": "Counting disabled",
        "EventName": "UNC_IIO_NOTHING",
        "PerPkg": "1",
        "Unit": "IIO"
@@ -15507,7 +15515,7 @@
        "EventCode": "0xC",
        "EventName": "UNC_I_TxS_REQUEST_OCCUPANCY",
        "PerPkg": "1",
-        "PublicDescription": "Accumultes the number of outstanding outbound requests from the IRP to the switch (towards the devices).  This can be used in conjuection with the allocations event in order to calculate average latency of outbound requests.",
+        "PublicDescription": "Accumulates the number of outstanding outbound requests from the IRP to the switch (towards the devices).  This can be used in conjunction with the allocations event in order to calculate average latency of outbound requests.",
        "Unit": "IRP"
    },
    {
@@ -16013,35 +16021,35 @@
        "Unit": "M2M"
    },
    {
-        "BriefDescription": "Number of reads in which direct to Intel UPI transactions were overridden",
+        "BriefDescription": "Number of reads in which direct to Intel(R) UPI transactions were overridden",
        "EventCode": "0x28",
        "EventName": "UNC_M2M_DIRECT2UPI_NOT_TAKEN_CREDITS",
        "PerPkg": "1",
-        "PublicDescription": "Counts reads in which direct to Intel Ultra Path Interconnect (UPI) transactions (which would have bypassed the CHA) were overridden",
+        "PublicDescription": "Counts reads in which direct to Intel(R) Ultra Path Interconnect (UPI) transactions (which would have bypassed the CHA) were overridden",
        "Unit": "M2M"
    },
    {
-        "BriefDescription": "Cycles when direct to Intel UPI was disabled",
+        "BriefDescription": "Cycles when direct to Intel(R) UPI was disabled",
        "EventCode": "0x27",
        "EventName": "UNC_M2M_DIRECT2UPI_NOT_TAKEN_DIRSTATE",
        "PerPkg": "1",
-        "PublicDescription": "Counts cycles when the ability to send messages direct to the Intel Ultra Path Interconnect (bypassing the CHA) was disabled",
+        "PublicDescription": "Counts cycles when the ability to send messages direct to the Intel(R) Ultra Path Interconnect (bypassing the CHA) was disabled",
        "Unit": "M2M"
    },
    {
-        "BriefDescription": "Messages sent direct to the Intel UPI",
+        "BriefDescription": "Messages sent direct to the Intel(R) UPI",
        "EventCode": "0x26",
        "EventName": "UNC_M2M_DIRECT2UPI_TAKEN",
        "PerPkg": "1",
-        "PublicDescription": "Counts when messages were sent direct to the Intel Ultra Path Interconnect (bypassing the CHA)",
+        "PublicDescription": "Counts when messages were sent direct to the Intel(R) Ultra Path Interconnect (bypassing the CHA)",
        "Unit": "M2M"
    },
    {
-        "BriefDescription": "Number of reads that a message sent direct2 Intel UPI was overridden",
+        "BriefDescription": "Number of reads that a message sent direct2 Intel(R) UPI was overridden",
        "EventCode": "0x29",
        "EventName": "UNC_M2M_DIRECT2UPI_TXN_OVERRIDE",
        "PerPkg": "1",
-        "PublicDescription": "Counts when a read message that was sent direct to the Intel Ultra Path Interconnect (bypassing the CHA) was overridden",
+        "PublicDescription": "Counts when a read message that was sent direct to the Intel(R) Ultra Path Interconnect (bypassing the CHA) was overridden",
        "Unit": "M2M"
    },
    {
@@ -20785,7 +20793,7 @@
        "EventCode": "0x61",
        "EventName": "UNC_M3UPI_RxC_CRD_OCC.FLITS_IN_FIFO",
        "PerPkg": "1",
-        "PublicDescription": "Occupancy of m3upi ingress -&gt; upi link layer bgf; packets (flits) in fifo",
+        "PublicDescription": "Occupancy of m3upi ingress -> upi link layer bgf; packets (flits) in fifo",
        "UMask": "0x2",
        "Unit": "M3UPI"
    },
@@ -20794,7 +20802,7 @@
        "EventCode": "0x61",
        "EventName": "UNC_M3UPI_RxC_CRD_OCC.FLITS_IN_PATH",
        "PerPkg": "1",
-        "PublicDescription": "Occupancy of m3upi ingress -&gt; upi link layer bgf; packets (flits) in path (i.e. pipe to fifo or fifo)",
+        "PublicDescription": "Occupancy of m3upi ingress -> upi link layer bgf; packets (flits) in path (i.e. pipe to fifo or fifo)",
        "UMask": "0x4",
        "Unit": "M3UPI"
    },
@@ -21180,7 +21188,7 @@
        "Unit": "M3UPI"
    },
    {
-        "BriefDescription": "Flit Gen - Header 1; Acumullate",
+        "BriefDescription": "Flit Gen - Header 1; Accumulate",
        "EventCode": "0x53",
        "EventName": "UNC_M3UPI_RxC_FLIT_GEN_HDR1.ACCUM",
        "PerPkg": "1",
@@ -21851,7 +21859,7 @@
        "Unit": "M3UPI"
    },
    {
-        "BriefDescription": "Remote VNA Credits; Level &lt; 1",
+        "BriefDescription": "Remote VNA Credits; Level < 1",
        "EventCode": "0x5B",
        "EventName": "UNC_M3UPI_RxC_VNA_CRD.LT1",
        "PerPkg": "1",
@@ -21860,7 +21868,7 @@
        "Unit": "M3UPI"
    },
    {
-        "BriefDescription": "Remote VNA Credits; Level &lt; 4",
+        "BriefDescription": "Remote VNA Credits; Level < 4",
        "EventCode": "0x5B",
        "EventName": "UNC_M3UPI_RxC_VNA_CRD.LT4",
        "PerPkg": "1",
@@ -21869,7 +21877,7 @@
        "Unit": "M3UPI"
    },
    {
-        "BriefDescription": "Remote VNA Credits; Level &lt; 5",
+        "BriefDescription": "Remote VNA Credits; Level < 5",
        "EventCode": "0x5B",
        "EventName": "UNC_M3UPI_RxC_VNA_CRD.LT5",
        "PerPkg": "1",
@@ -24401,7 +24409,7 @@
        "EventCode": "0x29",
        "EventName": "UNC_M3UPI_UPI_PREFETCH_SPAWN",
        "PerPkg": "1",
-        "PublicDescription": "Count cases where flow control queue that sits between the Intel Ultra Path Interconnect (UPI) and the mesh spawns a prefetch to the iMC (Memory Controller)",
+        "PublicDescription": "Count cases where flow control queue that sits between the Intel(R) Ultra Path Interconnect (UPI) and the mesh spawns a prefetch to the iMC (Memory Controller)",
        "Unit": "M3UPI"
    },
    {
@@ -24756,11 +24764,11 @@
        "Unit": "M2M"
    },
    {
-        "BriefDescription": "Clocks of the Intel Ultra Path Interconnect (UPI)",
+        "BriefDescription": "Clocks of the Intel(R) Ultra Path Interconnect (UPI)",
        "EventCode": "0x1",
        "EventName": "UNC_UPI_CLOCKTICKS",
        "PerPkg": "1",
-        "PublicDescription": "Counts clockticks of the fixed frequency clock controlling the Intel Ultra Path Interconnect (UPI).  This clock runs at1/8th the 'GT/s' speed of the UPI link.  For example, a  9.6GT/s  link will have a fixed Frequency of 1.2 Ghz.",
+        "PublicDescription": "Counts clockticks of the fixed frequency clock controlling the Intel(R) Ultra Path Interconnect (UPI).  This clock runs at1/8th the 'GT/s' speed of the UPI link.  For example, a  9.6GT/s  link will have a fixed Frequency of 1.2 Ghz.",
        "Unit": "UPI LL"
    },
    {
@@ -24782,11 +24790,11 @@
        "Unit": "UPI LL"
    },
    {
-        "BriefDescription": "Data Response packets that go direct to Intel UPI",
+        "BriefDescription": "Data Response packets that go direct to Intel(R) UPI",
        "EventCode": "0x12",
        "EventName": "UNC_UPI_DIRECT_ATTEMPTS.D2U",
        "PerPkg": "1",
-        "PublicDescription": "Counts Data Response (DRS) packets that attempted to go direct to Intel Ultra Path Interconnect (UPI) bypassing the CHA .",
+        "PublicDescription": "Counts Data Response (DRS) packets that attempted to go direct to Intel(R) Ultra Path Interconnect (UPI) bypassing the CHA .",
        "UMask": "0x2",
        "Unit": "UPI LL"
    },
@@ -24855,11 +24863,11 @@
        "Unit": "UPI LL"
    },
    {
-        "BriefDescription": "Cycles Intel UPI is in L1 power mode (shutdown)",
+        "BriefDescription": "Cycles Intel(R) UPI is in L1 power mode (shutdown)",
        "EventCode": "0x21",
        "EventName": "UNC_UPI_L1_POWER_CYCLES",
        "PerPkg": "1",
-        "PublicDescription": "Counts cycles when the Intel Ultra Path Interconnect (UPI) is in L1 power mode.  L1 is a mode that totally shuts down the UPI link.  Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another, this event only coutns when both links are shutdown.",
+        "PublicDescription": "Counts cycles when the Intel(R) Ultra Path Interconnect (UPI) is in L1 power mode.  L1 is a mode that totally shuts down the UPI link.  Link power states are per link and per direction, so for example the Tx direction could be in one state while Rx was in another, this event only coutns when both links are shutdown.",
        "Unit": "UPI LL"
    },
    {
@@ -25021,11 +25029,11 @@
        "Unit": "UPI LL"
    },
    {
-        "BriefDescription": "Cycles the Rx of the Intel UPI is in L0p power mode",
+        "BriefDescription": "Cycles the Rx of the Intel(R) UPI is in L0p power mode",
        "EventCode": "0x25",
        "EventName": "UNC_UPI_RxL0P_POWER_CYCLES",
        "PerPkg": "1",
-        "PublicDescription": "Counts cycles when the receive side (Rx) of the Intel Ultra Path Interconnect(UPI) is in L0p power mode. L0p is a mode where we disable 60% of the UPI lanes, decreasing our bandwidth in order to save power.",
+        "PublicDescription": "Counts cycles when the receive side (Rx) of the Intel(R) Ultra Path Interconnect(UPI) is in L0p power mode. L0p is a mode where we disable 60% of the UPI lanes, decreasing our bandwidth in order to save power.",
        "Unit": "UPI LL"
    },
    {
@@ -25218,7 +25226,7 @@
        "EventCode": "0x8",
        "EventName": "UNC_UPI_RxL_CRC_LLR_REQ_TRANSMIT",
        "PerPkg": "1",
-        "PublicDescription": "Number of LLR Requests were transmitted.  This should generally be &lt;= the number of CRC errors detected.  If multiple errors are detected before the Rx side receives a LLC_REQ_ACK from the Tx side, there is no need to send more LLR_REQ_NACKs.",
+        "PublicDescription": "Number of LLR Requests were transmitted.  This should generally be <= the number of CRC errors detected.  If multiple errors are detected before the Rx side receives a LLC_REQ_ACK from the Tx side, there is no need to send more LLR_REQ_NACKs.",
        "Unit": "UPI LL"
    },
    {
@@ -25250,7 +25258,7 @@
        "EventCode": "0x3",
        "EventName": "UNC_UPI_RxL_FLITS.ALL_DATA",
        "PerPkg": "1",
-        "PublicDescription": "Counts valid data FLITs  (80 bit FLow control unITs: 64bits of data) received from any of the 3 Intel Ultra Path Interconnect (UPI) Receive Queue slots on this UPI unit.",
+        "PublicDescription": "Counts valid data FLITs  (80 bit FLow control unITs: 64bits of data) received from any of the 3 Intel(R) Ultra Path Interconnect (UPI) Receive Queue slots on this UPI unit.",
        "UMask": "0xf",
        "Unit": "UPI LL"
    },
@@ -25259,7 +25267,7 @@
        "EventCode": "0x3",
        "EventName": "UNC_UPI_RxL_FLITS.ALL_NULL",
        "PerPkg": "1",
-        "PublicDescription": "Counts null FLITs (80 bit FLow control unITs) received from any of the 3 Intel Ultra Path Interconnect (UPI) Receive Queue slots on this UPI unit.",
+        "PublicDescription": "Counts null FLITs (80 bit FLow control unITs) received from any of the 3 Intel(R) Ultra Path Interconnect (UPI) Receive Queue slots on this UPI unit.",
        "UMask": "0x27",
        "Unit": "UPI LL"
    },
@@ -25583,11 +25591,11 @@
        "Unit": "UPI LL"
    },
    {
-        "BriefDescription": "Cycles in which the Tx of the Intel Ultra Path Interconnect (UPI) is in L0p power mode",
+        "BriefDescription": "Cycles in which the Tx of the Intel(R) Ultra Path Interconnect (UPI) is in L0p power mode",
        "EventCode": "0x27",
        "EventName": "UNC_UPI_TxL0P_POWER_CYCLES",
        "PerPkg": "1",
-        "PublicDescription": "Counts cycles when the transmit side (Tx) of the Intel Ultra Path Interconnect(UPI) is in L0p power mode. L0p is a mode where we disable 60% of the UPI lanes, decreasing our bandwidth in order to save power.",
+        "PublicDescription": "Counts cycles when the transmit side (Tx) of the Intel(R) Ultra Path Interconnect (UPI) is in L0p power mode. L0p is a mode where we disable 60% of the UPI lanes, decreasing our bandwidth in order to save power.",
        "Unit": "UPI LL"
    },
    {
@@ -25759,7 +25767,7 @@
        "EventCode": "0x41",
        "EventName": "UNC_UPI_TxL_BYPASSED",
        "PerPkg": "1",
-        "PublicDescription": "Counts incoming FLITs (FLow control unITs) which bypassed the TxL(transmit) FLIT buffer and pass directly out the UPI Link. Generally, when data is transmitted across the Intel Ultra Path Interconnect (UPI), it will bypass the TxQ and pass directly to the link.  However, the TxQ will be used in L0p (Low Power) mode and (Link Layer Retry) LLR  mode, increasing latency to transfer out to the link.",
+        "PublicDescription": "Counts incoming FLITs (FLow control unITs) which bypassed the TxL(transmit) FLIT buffer and pass directly out the UPI Link. Generally, when data is transmitted across the Intel(R) Ultra Path Interconnect (UPI), it will bypass the TxQ and pass directly to the link.  However, the TxQ will be used in L0p (Low Power) mode and (Link Layer Retry) LLR  mode, increasing latency to transfer out to the link.",
        "Unit": "UPI LL"
    },
    {
@@ -25767,7 +25775,7 @@
        "EventCode": "0x2",
        "EventName": "UNC_UPI_TxL_FLITS.ALL_DATA",
        "PerPkg": "1",
-        "PublicDescription": "Counts valid data FLITs (80 bit FLow control unITs: 64bits of data) transmitted (TxL) via any of the 3 Intel Ultra Path Interconnect (UPI) slots on this UPI unit.",
+        "PublicDescription": "Counts valid data FLITs (80 bit FLow control unITs: 64bits of data) transmitted (TxL) via any of the 3 Intel(R) Ultra Path Interconnect (UPI) slots on this UPI unit.",
        "UMask": "0xf",
        "Unit": "UPI LL"
    },
@@ -25776,7 +25784,7 @@
        "EventCode": "0x2",
        "EventName": "UNC_UPI_TxL_FLITS.ALL_NULL",
        "PerPkg": "1",
-        "PublicDescription": "Counts null FLITs (80 bit FLow control unITs) transmitted via any of the 3 Intel Ulra Path Interconnect (UPI) slots on this UPI unit.",
+        "PublicDescription": "Counts null FLITs (80 bit FLow control unITs) transmitted via any of the 3 Intel(R) Ultra Path Interconnect (UPI) slots on this UPI unit.",
        "UMask": "0x27",
        "Unit": "UPI LL"
    },
@@ -26127,7 +26135,7 @@
        "EventCode": "0x2",
        "EventName": "UPI_DATA_BANDWIDTH_TX",
        "PerPkg": "1",
-        "PublicDescription": "Counts valid data FLITs (80 bit FLow control unITs: 64bits of data) transmitted (TxL) via any of the 3 Intel Ultra Path Interconnect (UPI) slots on this UPI unit.",
+        "PublicDescription": "Counts valid data FLITs (80 bit FLow control unITs: 64bits of data) transmitted (TxL) via any of the 3 Intel(R) Ultra Path Interconnect (UPI) slots on this UPI unit.",
        "ScaleUnit": "7.11E-06Bytes",
        "UMask": "0xf",
        "Unit": "UPI LL"
--- a/tools/perf/pmu-events/arch/x86/skylakex/uncore-power.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/uncore-power.json
@@ -143,7 +143,7 @@
        "EventCode": "0x80",
        "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C0",
        "PerPkg": "1",
-        "PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State.  It can be used by itself to get the average number of cores in that C-state with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
+        "PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State.  It can be used by itself to get the average number of cores in that C-state with thresholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
        "Unit": "PCU"
    },
    {
@@ -151,7 +151,7 @@
        "EventCode": "0x80",
        "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C3",
        "PerPkg": "1",
-        "PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State.  It can be used by itself to get the average number of cores in that C-state with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
+        "PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State.  It can be used by itself to get the average number of cores in that C-state with thresholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
        "Unit": "PCU"
    },
    {
@@ -159,7 +159,7 @@
        "EventCode": "0x80",
        "EventName": "UNC_P_POWER_STATE_OCCUPANCY.CORES_C6",
        "PerPkg": "1",
-        "PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State.  It can be used by itself to get the average number of cores in that C-state with threshholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
+        "PublicDescription": "This is an occupancy event that tracks the number of cores that are in the chosen C-State.  It can be used by itself to get the average number of cores in that C-state with thresholding to generate histograms, or with other PCU events and occupancy triggering to capture other details.",
        "Unit": "PCU"
    },
    {