Commit 96d3e0e1 authored by John Harrison

drm/i915/guc: Make hangcheck work with GuC virtual engines

The serial number tracking of engines happens at the backend of
request submission and was expecting to only be given physical
engines. However, in GuC submission mode, the decomposition of virtual
to physical engines does not happen in i915. Instead, requests are
submitted to their virtual engine mask all the way through to the
hardware (i.e. to GuC). This would mean that the heartbeat code
thinks the physical engines are idle due to the serial number not
incrementing, which in turn means hangcheck does not work for
GuC virtual engines.

This patch updates the tracking to decompose virtual engines into
their physical constituents and tracks the request against each. This
is not entirely accurate as the GuC will only be issuing the request
to one physical engine. However, it is the best that i915 can do given
that it has no knowledge of the GuC's scheduling decisions.

The downside of this is that all physical engines constituting a GuC
virtual engine will be periodically unparked (even when just a single
context is executing) in order to be pinged with a heartbeat request.
However, the power and performance cost of this is not expected to be
measurable (due to the low frequency of heartbeat pulses) and it is
considered an easier option than trying to make changes to GuC firmware.

v2:
 (Tvrtko)
  - Update commit message
  - Have default behavior if no vfunc present
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-3-matthew.brost@intel.com
parent 55612025
@@ -382,6 +382,8 @@ struct intel_engine_cs {
	void (*park)(struct intel_engine_cs *engine);
	void (*unpark)(struct intel_engine_cs *engine);
	void (*bump_serial)(struct intel_engine_cs *engine);
	void (*set_default_submission)(struct intel_engine_cs *engine);
	const struct intel_context_ops *cops;
@@ -1492,6 +1492,15 @@ static void guc_release(struct intel_engine_cs *engine)
	lrc_fini_wa_ctx(engine);
}

static void virtual_guc_bump_serial(struct intel_engine_cs *engine)
{
	struct intel_engine_cs *e;
	intel_engine_mask_t tmp, mask = engine->mask;

	for_each_engine_masked(e, engine->gt, mask, tmp)
		e->serial++;
}

static void guc_default_vfuncs(struct intel_engine_cs *engine)
{
	/* Default vfuncs which can be overridden by each engine. */
@@ -1835,6 +1844,7 @@ guc_create_virtual(struct intel_engine_cs **siblings, unsigned int count)
	ve->base.cops = &virtual_guc_context_ops;
	ve->base.request_alloc = guc_request_alloc;
	ve->base.bump_serial = virtual_guc_bump_serial;
	ve->base.submit_request = guc_submit_request;
@@ -669,7 +669,11 @@ bool __i915_request_submit(struct i915_request *request)
					request->ring->vaddr + request->postfix);

	trace_i915_request_execute(request);
	if (engine->bump_serial)
		engine->bump_serial(engine);
	else
		engine->serial++;

	result = true;

	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
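
The pattern in the hunks above (an optional per-engine hook with a default fallback, plus a virtual-engine hook that fans out over a mask) can be illustrated outside the kernel. The following is a minimal userspace sketch, not i915 code: `struct gt`, `struct engine`, `MAX_ENGINES`, `virtual_bump_serial()` and `submit()` are hypothetical stand-ins, and the plain bit-test loop only approximates what `for_each_engine_masked()` does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENGINES 8	/* hypothetical limit, for this sketch only */

struct engine;

/* Hypothetical stand-in for the GT: a table of physical engines by id. */
struct gt {
	struct engine *engines[MAX_ENGINES];
};

/* Hypothetical stand-in for struct intel_engine_cs. */
struct engine {
	struct gt *gt;
	uint32_t mask;		/* one bit per physical engine covered */
	unsigned long serial;	/* bumped whenever a request is submitted */
	void (*bump_serial)(struct engine *engine);	/* optional hook */
};

/*
 * Virtual-engine hook: the submitter cannot know which physical engine
 * will run the request, so bump the serial of every engine in the mask.
 */
static void virtual_bump_serial(struct engine *ve)
{
	unsigned int id;

	assert(ve->gt != NULL);

	for (id = 0; id < MAX_ENGINES; id++) {
		struct engine *e = ve->gt->engines[id];

		if ((ve->mask & (1u << id)) && e != NULL)
			e->serial++;
	}
}

/* Submission backend: use the hook if present, else the default bump. */
static void submit(struct engine *engine)
{
	if (engine->bump_serial)
		engine->bump_serial(engine);
	else
		engine->serial++;
}
```

In this sketch, calling `submit()` on a virtual engine whose mask covers two physical engines increments both physical serials and leaves the virtual engine's own serial untouched. Calling it on a physical engine with no hook set takes the default path and increments only that engine's serial.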