Commit 1255501d authored by Chris Wilson's avatar Chris Wilson

drm/i915: Embrace the race in busy-ioctl

Daniel Vetter proposed a new challenge to the serialisation inside the
busy-ioctl that exposed a flaw that could result in us reporting the
wrong engine as being busy. If the request is reallocated as we test
its busyness and then reassigned to this object by another thread, we
would not notice that the test itself was incorrect.

We are faced with a choice of using __i915_gem_active_get_request_rcu()
to first acquire a reference to the request preventing the race, or to
acknowledge the race and accept the limitations upon the accuracy of the
busy flags. Note that we guarantee that we never falsely report the
object as idle (providing userspace itself doesn't race), and so the
most important use of the busy-ioctl and its guarantees are fulfilled.
Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: default avatarMika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471337440-16777-1-git-send-email-chris@chris-wilson.co.uk
parent 1544c42e
...@@ -3807,49 +3807,68 @@ static __always_inline unsigned int ...@@ -3807,49 +3807,68 @@ static __always_inline unsigned int
__busy_set_if_active(const struct i915_gem_active *active, __busy_set_if_active(const struct i915_gem_active *active,
unsigned int (*flag)(unsigned int id)) unsigned int (*flag)(unsigned int id))
{ {
/* For more discussion about the barriers and locking concerns, struct drm_i915_gem_request *request;
* see __i915_gem_active_get_rcu().
*/
do {
struct drm_i915_gem_request *request;
unsigned int id;
request = rcu_dereference(active->request);
if (!request || i915_gem_request_completed(request))
return 0;
id = request->engine->exec_id; request = rcu_dereference(active->request);
if (!request || i915_gem_request_completed(request))
return 0;
/* Check that the pointer wasn't reassigned and overwritten. /* This is racy. See __i915_gem_active_get_rcu() for an in detail
* * discussion of how to handle the race correctly, but for reporting
* In __i915_gem_active_get_rcu(), we enforce ordering between * the busy state we err on the side of potentially reporting the
* the first rcu pointer dereference (imposing a * wrong engine as being busy (but we guarantee that the result
* read-dependency only on access through the pointer) and * is at least self-consistent).
* the second lockless access through the memory barrier *
* following a successful atomic_inc_not_zero(). Here there * As we use SLAB_DESTROY_BY_RCU, the request may be reallocated
* is no such barrier, and so we must manually insert an * whilst we are inspecting it, even under the RCU read lock as we are.
* explicit read barrier to ensure that the following * This means that there is a small window for the engine and/or the
* access occurs after all the loads through the first * seqno to have been overwritten. The seqno will always be in the
* pointer. * future compared to the intended, and so we know that if that
* * seqno is idle (on whatever engine) our request is idle and the
* It is worth comparing this sequence with * return 0 above is correct.
* raw_write_seqcount_latch() which operates very similarly. *
* The challenge here is the visibility of the other CPU * The issue is that if the engine is switched, it is just as likely
* writes to the reallocated request vs the local CPU ordering. * to report that it is busy (but since the switch happened, we know
* Before the other CPU can overwrite the request, it will * the request should be idle). So there is a small chance that a busy
* have updated our active->request and gone through a wmb. * result is actually the wrong engine.
* During the read here, we want to make sure that the values *
* we see have not been overwritten as we do so - and we do * So why don't we care?
* that by serialising the second pointer check with the writes *
* on other other CPUs. * For starters, the busy ioctl is a heuristic that is by definition
* * racy. Even with perfect serialisation in the driver, the hardware
* The corresponding write barrier is part of * state is constantly advancing - the state we report to the user
* rcu_assign_pointer(). * is stale.
*/ *
smp_rmb(); * The critical information for the busy-ioctl is whether the object
if (request == rcu_access_pointer(active->request)) * is idle as userspace relies on that to detect whether its next
return flag(id); * access will stall, or if it has missed submitting commands to
} while (1); * the hardware allowing the GPU to stall. We never generate a
* false-positive for idleness, thus busy-ioctl is reliable at the
* most fundamental level, and we maintain the guarantee that a
* busy object left to itself will eventually become idle (and stay
* idle!).
*
* We allow ourselves the leeway of potentially misreporting the busy
* state because that is an optimisation heuristic that is constantly
* in flux. Being quickly able to detect the busy/idle state is much
* more important than accurate logging of exactly which engines were
* busy.
*
* For accuracy in reporting the engine, we could use
*
* result = 0;
* request = __i915_gem_active_get_rcu(active);
* if (request) {
* if (!i915_gem_request_completed(request))
* result = flag(request->engine->exec_id);
* i915_gem_request_put(request);
* }
*
* but that still remains susceptible to both hardware and userspace
* races. So we accept making the result of that race slightly worse,
* given the rarity of the race and its low impact on the result.
*/
return flag(READ_ONCE(request->engine->exec_id));
} }
static __always_inline unsigned int static __always_inline unsigned int
...@@ -3897,11 +3916,12 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, ...@@ -3897,11 +3916,12 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
* retired and freed. We take a local copy of the pointer, * retired and freed. We take a local copy of the pointer,
* but before we add its engine into the busy set, the other * but before we add its engine into the busy set, the other
* thread reallocates it and assigns it to a task on another * thread reallocates it and assigns it to a task on another
* engine with a fresh and incomplete seqno. * engine with a fresh and incomplete seqno. Guarding against
* * that requires careful serialisation and reference counting,
* So after we lookup the engine's id, we double check that * i.e. using __i915_gem_active_get_request_rcu(). We don't,
* the active request is the same and only then do we add it * instead we expect that if the result is busy, which engines
* into the busy set. * are busy is not completely reliable - we only guarantee
* that the object was busy.
*/ */
rcu_read_lock(); rcu_read_lock();
......
...@@ -855,7 +855,16 @@ struct drm_i915_gem_busy { ...@@ -855,7 +855,16 @@ struct drm_i915_gem_busy {
* having flushed any pending activity), and a non-zero return that * having flushed any pending activity), and a non-zero return that
* the object is still in-flight on the GPU. (The GPU has not yet * the object is still in-flight on the GPU. (The GPU has not yet
* signaled completion for all pending requests that reference the * signaled completion for all pending requests that reference the
* object.) * object.) An object is guaranteed to become idle eventually (so
* long as no new GPU commands are executed upon it). Due to the
* asynchronous nature of the hardware, an object reported
* as busy may become idle before the ioctl is completed.
*
* Furthermore, if the object is busy, which engine is busy is only
* provided as a guide. There are race conditions which prevent the
* report of which engines are busy from being always accurate.
* However, the converse is not true. If the object is idle, the
* result of the ioctl, that all engines are idle, is accurate.
* *
* The returned dword is split into two fields to indicate both * The returned dword is split into two fields to indicate both
* the engines on which the object is being read, and the * the engines on which the object is being read, and the
...@@ -878,6 +887,11 @@ struct drm_i915_gem_busy { ...@@ -878,6 +887,11 @@ struct drm_i915_gem_busy {
* execution engines, e.g. multiple media engines, which are * execution engines, e.g. multiple media engines, which are
* mapped to the same identifier in the EXECBUFFER2 ioctl and * mapped to the same identifier in the EXECBUFFER2 ioctl and
* so are not separately reported for busyness. * so are not separately reported for busyness.
*
* Caveat emptor:
* Only the boolean result of this query is reliable; that is whether
* the object is idle or busy. The report of which engines are busy
* should be only used as a heuristic.
*/ */
__u32 busy; __u32 busy;
}; };
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment