1. 20 Mar, 2012 3 commits
    • Daniel Vetter's avatar
      drm/i915: implement SNB workaround for lazy global gtt · 149c8407
      Daniel Vetter authored
      PIPE_CONTROL on snb needs global gtt mappings in place to workaround a
      hw gotcha. No other commands need such a workaround. Luckily we can
      detect a PIPE_CONTROL commands easily because they have a write_domain
      = I915_GEM_DOMAIN_INSTRUCTION (and nothing else has that).
      
      v2: Binding the target of such a reloc into the global gtt actually
      works instead of binding the source, which is rather pointless ...
      
      v3: Kill a superflous has_global_gtt_mapping assignement noticed by
      Chris Wilson.
      Reviewed-and-tested-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      149c8407
    • Daniel Vetter's avatar
      drm/i915: bind objects to the global gtt only when needed · 74898d7e
      Daniel Vetter authored
      And track the existence of such a binding similar to the aliasing
      ppgtt case. Speeds up binding/unbinding in the common case where we
      only need a ppgtt binding (which is accessed in a cpu coherent fashion
      by the gpu) and no gloabl gtt binding (which needs uc writes for the
      ptes).
      
      This patch just puts the required tracking in place.
      
      v2: Check that global gtt mappings exist in the error_state capture
      code (with Chris Wilson's llc reloc patches batchbuffers are no longer
      relocated as mappable in all situations, so this matters). Suggested
      by Chris Wilson.
      
      v3: Adapted to Chris' latest llc-reloc patches.
      
      v4: Fix a bug in the i915 error state capture code noticed by Chris
      Wilson.
      Reviewed-and-tested-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-Off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      74898d7e
    • Daniel Vetter's avatar
      drm/i915: split out dma mapping from global gtt bind/unbind functions · 74163907
      Daniel Vetter authored
      Note that there's a functional change buried in this patch wrt the ilk
      dmar workaround: We now only idle the gpu while tearing down the dmar
      mappings, not while clearing the gtt. Keeping the current semantics
      would have made for some really ugly code and afaik the issue is only
      with the dmar unmapping that needs a fully idle gpu.
      Reviewed-and-tested-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-Off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      74163907
  2. 18 Mar, 2012 7 commits
  3. 02 Mar, 2012 2 commits
  4. 01 Mar, 2012 2 commits
  5. 29 Feb, 2012 8 commits
  6. 27 Feb, 2012 8 commits
  7. 23 Feb, 2012 1 commit
  8. 16 Feb, 2012 4 commits
  9. 15 Feb, 2012 4 commits
    • Chris Wilson's avatar
      drm/i915/lvds: Always use the presence pin for LVDS on PCH · f3cfcba6
      Chris Wilson authored
      With the introduction of the PCH, we gained an LVDS presence pin but we
      continued to use the existing logic that asserted that LVDS was only
      supported on certain mobile chipsets. However, there are desktop
      IronLake systems with LVDS attached which we fail to detect. So for PCH,
      trust the LVDS presence pin and quirk all the lying manufacturers.
      Tested-by: default avatarDaniel Woff <wolff.daniel@gmail.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43171Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: default avatarEugeni Dodonov <eugeni.dodonov@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      f3cfcba6
    • Chris Wilson's avatar
      drm/i915: Record the position of the request upon error · ee4f42b1
      Chris Wilson authored
      So that we can tally the request against the command sequence in the
      ringbuffer, or merely jump to the interesting locations.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      ee4f42b1
    • Chris Wilson's avatar
      drm/i915: Record the in-flight requests at the time of a hang · 52d39a21
      Chris Wilson authored
      Being able to tally the list of outstanding requests with the sequence
      of commands in the ringbuffer is often useful evidence with respect to
      driver corruption.
      
      Note that since this is the umpteenth per-ring data structure to be added
      to the error state, I've coallesced the nearby loops (the ringbuffer and
      batchbuffer) into a single structure along with the list of requests.  A
      later task would be to refactor the ring register state into the same
      structure.
      
      v2: Fix pretty printing of requests so that they are parsed correctly by
      intel_error_decode and use the 0x%08x format for seqno for consistency
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      52d39a21
    • Chris Wilson's avatar
      drm/i915: Record the tail at each request and use it to estimate the head · a71d8d94
      Chris Wilson authored
      By recording the location of every request in the ringbuffer, we know
      that in order to retire the request the GPU must have finished reading
      it and so the GPU head is now beyond the tail of the request. We can
      therefore provide a conservative estimate of where the GPU is reading
      from in order to avoid having to read back the ring buffer registers
      when polling for space upon starting a new write into the ringbuffer.
      
      A secondary effect is that this allows us to convert
      intel_ring_buffer_wait() to use i915_wait_request() and so consolidate
      upon the single function to handle the complicated task of waiting upon
      the GPU. A necessary precaution is that we need to make that wait
      uninterruptible to match the existing conditions as all the callers of
      intel_ring_begin() have not been audited to handle ERESTARTSYS
      correctly.
      
      By using a conservative estimate for the head, and always processing all
      outstanding requests first, we prevent a race condition between using
      the estimate and direct reads of I915_RING_HEAD which could result in
      the value of the head going backwards, and the tail overflowing once
      again. We are also careful to mark any request that we skip over in
      order to free space in ring as consumed which provides a
      self-consistency check.
      
      Given sufficient abuse, such as a set of unthrottled GPU bound
      cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on
      Sandy Bridge (i5-2520m):
        firefox-paintball  18927ms -> 15646ms: 1.21x speedup
        firefox-fishtank   12563ms -> 11278ms: 1.11x speedup
      which is a mild consolation for the performance those traces achieved from
      exploiting the buggy autoreported head.
      
      v2: Add a few more comments and make request->tail a conservative
      estimate as suggested by Daniel Vetter.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      [danvet: resolve conflicts with retirement defering and the lack of
      the autoreport head removal (that will go in through -fixes).]
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      a71d8d94
  10. 14 Feb, 2012 1 commit