Commit 4243cd53 authored by Chris Wilson

drm/i915/gt: Sanitize GT first

We see that if the HW doesn't actually sleep, the HW may eat the poison
we set in its write-only HWSP during sanitize:

  intel_gt_resume.part.8: 0000:00:02.0
  __gt_unpark: 0000:00:02.0
  gt_sanitize: 0000:00:02.0 force:yes
  process_csb: 0000:00:02.0 vcs0: cs-irq head=5, tail=90
  process_csb: 0000:00:02.0 vcs0: csb[0]: status=0x5a5a5a5a:0x5a5a5a5a
  assert_pending_valid: Nothing pending for promotion!

The CS TAIL pointer should have been reset by reset_csb_pointers(), so
in this case it is likely that we have read back from the CPU cache and
so we must clflush our control over that page. In doing so, push the
sanitisation to the start of the GT sequence so that our poisoning is
assuredly before we start talking to the HW.
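As a rough illustration of why the log above shows status=0x5a5a5a5a:0x5a5a5a5a, the following userspace sketch fills a simulated status buffer with a 0x5a byte pattern and checks which entries still read back as pure poison. It is not driver code: the struct layout, the entry count, and every function name here are hypothetical, chosen only for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POISON_BYTE  0x5a         /* assumption: the byte pattern seen in the log */
#define POISON_DWORD 0x5a5a5a5au  /* four poison bytes read back as one dword */
#define CSB_ENTRIES  12           /* hypothetical entry count for the illustration */

/* Hypothetical stand-in for one context-status entry: two dwords. */
struct csb_entry {
	uint32_t status_lo;
	uint32_t status_hi;
};

/* Fill the simulated status buffer with the poison pattern. */
static void poison_status_page(struct csb_entry *csb, size_t count)
{
	assert(csb != NULL);
	memset(csb, POISON_BYTE, count * sizeof(*csb));
}

/* An entry that reads back as pure poison has not been overwritten since. */
static bool csb_entry_is_poison(const struct csb_entry *e)
{
	return e->status_lo == POISON_DWORD && e->status_hi == POISON_DWORD;
}

/* Count how many of the first `count` entries still hold only poison. */
static size_t count_poisoned(const struct csb_entry *csb, size_t count)
{
	size_t n = 0;

	for (size_t i = 0; i < count; i++)
		n += csb_entry_is_poison(&csb[i]);
	return n;
}
```

A reader that consumes an entry for which csb_entry_is_poison() is true is looking at the sanitize pattern rather than an event reported by the hardware, which is the situation the trace suggests.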

References: https://gitlab.freedesktop.org/drm/intel/-/issues/1794
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200427084000.10999-1-chris@chris-wilson.co.uk
parent 2759e395
@@ -198,11 +198,12 @@ int intel_gt_resume(struct intel_gt *gt)
 	 * Only the kernel contexts should remain pinned over suspend,
 	 * allowing us to fixup the user contexts on their first pin.
 	 */
+	gt_sanitize(gt, true);
 	intel_gt_pm_get(gt);
 	intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
 	intel_rc6_sanitize(&gt->rc6);
-	gt_sanitize(gt, true);
 	if (intel_gt_is_wedged(gt)) {
 		err = -EIO;
 		goto out_fw;
......
@@ -3931,6 +3931,9 @@ static void execlists_sanitize(struct intel_engine_cs *engine)
 	 * reset the value in the HWSP.
 	 */
 	intel_timeline_reset_seqno(engine->kernel_context->timeline);
+	/* And scrub the dirty cachelines for the HWSP */
+	clflush_cache_range(engine->status_page.addr, PAGE_SIZE);
 }
 static void enable_error_interrupt(struct intel_engine_cs *engine)
......
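To make the effect of the added clflush_cache_range() call more concrete, here is a userspace approximation of flushing every cache line that covers a buffer. It is a sketch under stated assumptions, not the kernel helper: the function name is hypothetical, the 64-byte line size is assumed rather than queried from the CPU, and the flush instructions are only issued when the compiler targets x86 with SSE2 (elsewhere the loop merely counts lines).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence */
#endif

#define CACHELINE_SIZE 64u  /* assumption: common x86 cache-line size */

/*
 * Hypothetical userspace sketch: flush each cache line overlapping
 * [addr, addr + size) and return the number of lines visited.
 */
static size_t flush_cache_range_sketch(const void *addr, size_t size)
{
	/* Round the start down to a cache-line boundary. */
	uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE_SIZE - 1);
	uintptr_t end = (uintptr_t)addr + size;
	size_t lines = 0;

	assert(addr != NULL);
#if defined(__SSE2__)
	_mm_mfence();  /* order earlier stores before the flushes */
#endif
	for (; p < end; p += CACHELINE_SIZE) {
#if defined(__SSE2__)
		_mm_clflush((const void *)p);  /* write back and invalidate one line */
#endif
		lines++;
	}
#if defined(__SSE2__)
	_mm_mfence();  /* wait for the flushes before continuing */
#endif
	return lines;
}
```

For a line-aligned 4096-byte page this visits 64 lines; flushing after the poison has been written means a later read of that page fetches from memory instead of returning a stale cached copy.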