• Chris Wilson's avatar
    drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ · e8861964
    Chris Wilson authored
    Having introduced per-context seqno, we now have a means to identity
    progress across the system without feel of rollback as befell the
    global_seqno. That is we can program a MI_SEMAPHORE_WAIT operation in
    advance of submission safe in the knowledge that our target seqno and
    address is stable.
    
    However, since we are telling the GPU to busy-spin on the target address
    until it matches the signaling seqno, we only want to do so when we are
    sure that busy-spin will be completed quickly. To achieve this we only
    submit the request to HW once the signaler is itself executing (modulo
    preemption causing us to wait longer), and we only do so for default and
    above priority requests (so that idle priority tasks never themselves
    hog the GPU waiting for others).
    
    As might be reasonably expected, HW semaphores excel in inter-engine
    synchronisation microbenchmarks (where the 3x reduced latency / increased
    throughput more than offset the power cost of spinning on a second ring)
    and have significant improvement (can be up to ~10%, most see no change)
    for single clients that utilize multiple engines (typically media players
    and transcoders), without regressing multiple clients that can saturate
    the system or changing the power envelope dramatically.
    
    v3: Drop the older NEQ branch, now we pin the signaler's HWSP anyway.
    v4: Tell the world and include it as part of scheduler caps.
    
    Testcase: igt/gem_exec_whisper
    Testcase: igt/benchmarks/gem_wsim
    Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-3-chris@chris-wilson.co.uk
    e8861964
intel_ringbuffer.h 30.9 KB