Commit 04e32e70 authored by Thomas Richter's avatar Thomas Richter Committed by Khalid Elmously

s390/cpum_sf: Avoid SBD overflow condition in irq handler

BugLink: https://bugs.launchpad.net/bugs/1859640

[ Upstream commit 0539ad0b ]

The s390 CPU Measurement sampling facility has an overflow condition
which fires when all entries in a SBD are used.
The measurement alert interrupt is triggered and reads out all samples
in this SDB. It then tests the successor SDB, if this SBD is not full,
the interrupt handler does not read any samples at all from this SDB
The design waits for the hardware to fill this SBD and then trigger
another meassurement alert interrupt.

This scheme works nicely until
an perf_event_overflow() function call discards the sample due to
a too high sampling rate.
The interrupt handler has logic to read out a partially filled SDB
when the perf event overflow condition in linux common code is met.
This causes the CPUM sampling measurement hardware and the PMU
device driver to operate on the same SBD's trailer entry.
This should not happen.

This can be seen here using this trace:
   cpumsf_pmu_add: tear:0xb5286000
   hw_perf_event_update: sdbt 0xb5286000 full 1 over 0 flush_all:0
   hw_perf_event_update: sdbt 0xb5286008 full 0 over 0 flush_all:0
        above shows 1. interrupt
   hw_perf_event_update: sdbt 0xb5286008 full 1 over 0 flush_all:0
   hw_perf_event_update: sdbt 0xb5286008 full 0 over 0 flush_all:0
        above shows 2. interrupt
	... this goes on fine until...
   hw_perf_event_update: sdbt 0xb5286068 full 1 over 0 flush_all:0
   perf_push_sample1: overflow
      one or more samples read from the IRQ handler are rejected by
      perf_event_overflow() and the IRQ handler advances to the next SDB
      and modifies the trailer entry of a partially filled SDB.
   hw_perf_event_update: sdbt 0xb5286070 full 0 over 0 flush_all:1
      timestamp: 14:32:52.519953

Next time the IRQ handler is called for this SDB the trailer entry shows
an overflow count of 19 missed entries.
   hw_perf_event_update: sdbt 0xb5286070 full 1 over 19 flush_all:1
      timestamp: 14:32:52.970058

Remove access to a follow on SDB when event overflow happened.
Signed-off-by: default avatarThomas Richter <tmricht@linux.ibm.com>
Signed-off-by: default avatarVasily Gorbik <gor@linux.ibm.com>
Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
Signed-off-by: default avatarConnor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: default avatarKhalid Elmously <khalid.elmously@canonical.com>
parent 3e613449
...@@ -1294,12 +1294,6 @@ static void hw_perf_event_update(struct perf_event *event, int flush_all) ...@@ -1294,12 +1294,6 @@ static void hw_perf_event_update(struct perf_event *event, int flush_all)
*/ */
if (flush_all && done) if (flush_all && done)
break; break;
/* If an event overflow happened, discard samples by
* processing any remaining sample-data-blocks.
*/
if (event_overflow)
flush_all = 1;
} }
/* Account sample overflows in the event hardware structure */ /* Account sample overflows in the event hardware structure */
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment