    drm/i915: Decouple GPU error reporting from ring initialisation · 372fbb8e
    Chris Wilson authored
    Currently we report through our error state only the rings that have
    been initialised (as detected by ring->obj). This check is done after
    the GPU reset and ring re-initialisation, which means that the software
    state may not be the same as when we captured the hardware error and we
    may not print out any of the vital information for debugging the hang.
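
    The problem described above can be illustrated with a minimal sketch.
    It assumes a hypothetical per-ring "valid" flag recorded at capture
    time, so that reporting consults only the snapshot and never the live
    ring state. All struct and function names below are illustrative
    stand-ins, not the driver's actual definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NUM_RINGS 3 /* hypothetical ring count for this sketch */

/* Hypothetical stand-in for the live software state of a ring. */
struct ring {
	void *obj;         /* non-NULL while the ring is initialised */
	unsigned int head; /* stand-in for a captured hardware register */
};

/* Hypothetical per-ring snapshot taken when the hang is detected. */
struct ring_error {
	bool valid;        /* recorded at capture, never re-derived later */
	unsigned int head;
};

struct error_state {
	struct ring_error ring[NUM_RINGS];
};

/* Capture: the only place where the live ring state is inspected. */
void capture_error(struct error_state *err, const struct ring *rings)
{
	for (int i = 0; i < NUM_RINGS; i++) {
		err->ring[i].valid = rings[i].obj != NULL;
		err->ring[i].head = err->ring[i].valid ? rings[i].head : 0;
	}
}

/*
 * Report: consults only the snapshot, so a reset that changes the live
 * ring state between capture and report cannot hide any ring.
 * Returns the number of rings printed.
 */
int report_error(FILE *out, const struct error_state *err)
{
	int reported = 0;

	for (int i = 0; i < NUM_RINGS; i++) {
		if (!err->ring[i].valid)
			continue;
		fprintf(out, "ring %d: HEAD 0x%08x\n", i, err->ring[i].head);
		reported++;
	}
	return reported;
}
```

    With this shape, tearing down or re-initialising the rings after
    capture_error() has run does not change what report_error() prints.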
    
    This (and the implied object leak) is a regression from
    
    commit 3d57e5bd
    Author: Ben Widawsky <ben@bwidawsk.net>
    Date:   Mon Oct 14 10:01:36 2013 -0700
    
        drm/i915: Do a fuller init after reset
    
    Note that we are already starting to get bug reports with incomplete
    error states from 3.13, which also hampers debugging userspace driver
    issues.
    
    v2: Prevent a NULL dereference on 830gm/845g after a GPU reset where
        the scratch obj may be NULL.
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Ben Widawsky <ben@bwidawsk.net>
    Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
    References: https://bugs.freedesktop.org/show_bug.cgi?id=74094
    Cc: stable@vger.kernel.org # please don't delay since it's a
    vital support/debug feature for the intel gfx stack in general
    Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    [danvet: Add a bit of fluff to make it clear we need this expedited in
    stable.]
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
i915_gpu_error.c 28.2 KB