    drm/i915: Fix spurious -EIO/SIGBUS on wedged gpus · 7abb690a
    Daniel Vetter authored
    Chris Wilson noticed that since
    
    commit 1f83fee0 [v3.9]
    Author: Daniel Vetter <daniel.vetter@ffwll.ch>
    Date:   Thu Nov 15 17:17:22 2012 +0100
    
        drm/i915: clear up wedged transitions
    
    X can again get -EIO when it does not expect it and, even worse, get
    a SIGBUS when accessing gtt mmaps. The established ABI is that we
    _only_ return an -EIO from execbuf - all other ioctls should just
    work. And since the reset code moves all bos out of gpu domains and
    clears out all the last_seqno/ring tracking there really shouldn't be
    any reason for non-execbuf code to ever touch the hw and see an -EIO.
    
    After some extensive discussions we've noticed that these spurious -EIO
    are caused by i915_gem_wait_for_error:
    
    http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg20540.html
    
    That is easy to fix by returning 0 instead of -EIO, since grabbing the
    dev->struct_mutex does not yet mean that we actually want to touch the
    hw. And so there is no reason at all to fail with -EIO.
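
The rule above can be sketched as a small user-space model. This is not
the driver code: struct model_gpu and every model_* function are
hypothetical names made up for illustration, and the "before" variant
only reflects the behaviour this message describes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; not the kernel's structure. */
struct model_gpu {
    bool terminally_wedged;
};

/* Before the fix (as described above): taking the lock fails on a wedged GPU. */
int model_lock_before(const struct model_gpu *gpu)
{
    assert(gpu != NULL);
    return gpu->terminally_wedged ? -EIO : 0;
}

/* After the fix: taking the lock alone never reports -EIO. */
int model_lock_after(const struct model_gpu *gpu)
{
    assert(gpu != NULL);
    return 0;
}

/* A path that only needs the lock, e.g. a non-execbuf ioctl or a gtt mmap fault. */
int model_lock_only_path(const struct model_gpu *gpu,
                         int (*lock)(const struct model_gpu *))
{
    assert(lock != NULL);
    return lock(gpu);
}

/* Execbuf wants to touch the hardware, so it still reports the wedged state. */
int model_execbuf(const struct model_gpu *gpu)
{
    int ret = model_lock_after(gpu);

    if (ret)
        return ret;
    return gpu->terminally_wedged ? -EIO : 0;
}
```

With a wedged model_gpu, model_lock_only_path() returns -EIO when given
model_lock_before and 0 when given model_lock_after, while
model_execbuf() keeps returning -EIO, which matches the established ABI.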
    
    But that's not the entire story, since often (at least it's easily
    googleable) dmesg indicates that the reset fails and we declare the
    gpu wedged. Then, quite a bit later X wakes up with the "Timed out
    waiting for the gpu reset to complete" DRM_ERROR message in
    wait_for_error and brings down the desktop with an -EIO/SIGBUS.
    
    So clearly we're missing a wakeup somewhere, since the gpu reset just
    doesn't take 10 seconds to complete. And indeed we do handle the
    terminally wedged state wrong.
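
One way to picture the missing wakeup is a waiter whose exit condition
never becomes true once the gpu is declared wedged. The sketch below is a
user-space model under a stated assumption: the wedged state still looks
like "reset in progress" to the waiter. The enum, the model_* helpers and
the tick-based wait loop are all hypothetical and are not the kernel's
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reset states for the model. */
enum model_state { MODEL_IDLE, MODEL_RESETTING, MODEL_WEDGED };

/* Assumption: a wedged gpu still reads as "reset in progress". */
bool model_reset_in_progress(enum model_state s)
{
    return s != MODEL_IDLE;
}

bool model_terminally_wedged(enum model_state s)
{
    return s == MODEL_WEDGED;
}

/* Exit condition that ignores the wedged state: never true once wedged. */
bool model_exit_cond_old(enum model_state s)
{
    return !model_reset_in_progress(s);
}

/* Exit condition that also lets waiters go once the gpu is wedged. */
bool model_exit_cond_new(enum model_state s)
{
    return !model_reset_in_progress(s) || model_terminally_wedged(s);
}

/*
 * Poll the condition for up to timeout_ticks while the state stays fixed.
 * Returns the remaining ticks when the condition holds, or 0 on timeout.
 */
int model_wait(bool (*cond)(enum model_state), enum model_state s,
               int timeout_ticks)
{
    assert(cond != NULL);
    while (timeout_ticks > 0) {
        if (cond(s))
            return timeout_ticks;
        timeout_ticks--;
    }
    return 0;
}
```

In this model, waiting on MODEL_WEDGED with model_exit_cond_old runs into
the timeout, whereas model_exit_cond_new returns immediately. A reset that
is genuinely still running (MODEL_RESETTING) keeps the waiter blocked in
both variants.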
    
    Fix this all up.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=63921
    References: https://bugs.freedesktop.org/show_bug.cgi?id=64073
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
    Cc: Damien Lespiau <damien.lespiau@intel.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
i915_gem.c 111 KB