Commits · 85345517fe6d4de27b0d6ca19fef9d28ac947c4a · Kirill Smelkov / linux

13 Nov, 2010 1 commit

drm/i915: Retire any pending operations on the old scanout when switching · 85345517

Chris Wilson authored 14 years ago

An old and oft-reported bug is that of the GPU hanging on a
MI_WAIT_FOR_EVENT following a mode switch. The cause is that the GPU is
waiting on a scanline counter on an inactive pipe, and so waits for a
very long time until eventually the user reboots his machine.

We can prevent this either by moving the WAIT into the kernel and
thereby incurring considerable cost on every swapbuffers, or by waiting
for the GPU to retire the last batch that accesses the framebuffer
before installing a new one. As mode switches are much rarer than swap
buffers, this looks like an easy choice.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28964
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29252
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org

85345517

08 Nov, 2010 1 commit

drm/i915: Avoid might_fault during pwrite whilst holding our mutex · b47b30cc

Chris Wilson authored 14 years ago

... and so prevent a potential circular reference:

  [ INFO: possible circular locking dependency detected ]
  2.6.37-rc1-uwe1+ #4
  -------------------------------------------------------
  Xorg/1401 is trying to acquire lock:
   (&mm->mmap_sem){++++++}, at: [<c01e4ddb>] might_fault+0x4b/0xa0

  but task is already holding lock:
   (&dev->struct_mutex){+.+.+.}, at: [<f869c3ac>]
  i915_mutex_lock_interruptible+0x3c/0x60 [i915]

  which lock already depends on the new lock.

When the locking around the pwrite ioctl was simplified, I did not spot
that the phys path never took any locks and so we introduced this
potential circular reference.
Reported-by: Uwe Helm <uwe.helm@googlemail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

b47b30cc

01 Nov, 2010 1 commit

drm/i915: Apply big hammer to serialise buffer access between rings · c6afd658

Chris Wilson authored 14 years ago

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org

c6afd658

28 Oct, 2010 1 commit

drm/i915: Flush read-only buffers from the active list upon idle as well · 395b70be

Chris Wilson authored 14 years ago

It is possible for the active list to only contain a read-only buffer so
that the ring->gpu_write_list remains empty. This leads to an
inconsistency between i915_gpu_is_active() and i915_gpu_idle() causing
an infinite spin during the shrinker and an assertion failure that
i915_gpu_idle() does indeed flush all buffers from the active lists.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

395b70be

26 Oct, 2010 1 commit

mm: stack based kmap_atomic() · 3e4d3af5

Peter Zijlstra authored 14 years ago

Keep the current interface but ignore the KM_type and use a stack based
approach.

The advantage is that we get rid of crappy code like:

	#define __KM_PTE			\
		(in_nmi() ? KM_NMI_PTE : 	\
		 in_irq() ? KM_IRQ_PTE :	\
		 KM_PTE0)

and in general can stop worrying about what context we're in and what kmap
slots might be appropriate for that.

The downside is that FRV kmap_atomic() gets more expensive.

For now we use a CPP trick suggested by Andrew:

  #define kmap_atomic(page, args...) __kmap_atomic(page)

to avoid having to touch all kmap_atomic() users in a single patch.

[ not compiled on:
  - mn10300: the arch doesn't actually build with highmem to begin with ]

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c]
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3e4d3af5
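
The stack-based idea described in this commit can be illustrated with a minimal user-space sketch. It is not the kernel implementation: the names `sketch_map`, `sketch_unmap`, `sketch_map_compat` and `NR_SLOTS` are hypothetical, a single global array stands in for the per-CPU slots, and preemption handling is left out. The compatibility macro uses the C99 `...` form rather than the GNU `args...` form quoted in the message.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 4	/* hypothetical slot count for this sketch */

static const void *slots[NR_SLOTS];	/* stands in for per-CPU mapping slots */
static int slot_top;			/* index of the next free slot */

/* Push: take the next free slot, regardless of the calling context. */
static int sketch_map(const void *page)
{
	assert(slot_top < NR_SLOTS);
	slots[slot_top] = page;
	return slot_top++;
}

/* Pop: mappings must be released in reverse order of acquisition. */
static void sketch_unmap(int slot)
{
	assert(slot == slot_top - 1);
	slots[--slot_top] = NULL;
}

/*
 * Same idea as the CPP trick in the commit message: legacy callers may
 * still pass a KM_type-style argument, which is simply discarded.
 */
#define sketch_map_compat(page, ...) sketch_map(page)
```

Because slots are handed out in stack order, a caller no longer has to pick a slot that matches its context; the cost is that unmapping has to happen in strict LIFO order, which the assertion in `sketch_unmap()` checks.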

24 Oct, 2010 1 commit

drm/i915: Move gpu_write_list to per-ring · 64193406

Chris Wilson authored 14 years ago

... to prevent flush processing of an idle (or even absent) ring.

This fixes a regression during suspend from 87acb0a5.
Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net>
Tested-by: Peter Clifton <pcjc2@cam.ac.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

64193406

23 Oct, 2010 1 commit

drm/i915: Invalidate the to-ring, flush the old-ring when updating domains · b6651458

Chris Wilson authored 14 years ago

When the object has been written to by the gpu it remains on the ring
until its flush has been retired. However, when the object is moving to
the ring and the associated cache needs to be invalidated, we need to
perform the flush on the target ring, not the one it came from (which is
NULL in the reported case and so the flush was entirely absent).
Reported-by: Peter Clifton <pcjc2@cam.ac.uk>
Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

b6651458

22 Oct, 2010 1 commit

drm/i915: Fix flushing regression from · 878a3c37

Chris Wilson authored 14 years ago

Whilst moving the code around in 9af90d19, I dropped the or'ing in of
new write domains which would zero out the write domain for a render
target if reused as a source later in the batch. This meant that
we might drop a required flush before reading from the render target.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31043
Reported-by: xunx.fang@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

878a3c37

21 Oct, 2010 1 commit

drm/i915: Enable SandyBridge blitter ring · 549f7365

Chris Wilson authored 14 years ago

Based on an original patch by Zhenyu Wang, this initializes the BLT ring for
SandyBridge and enables support for user execbuffers.

Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

549f7365

20 Oct, 2010 4 commits

drm/i915: Copy the updated reloc->presumed_offset back to the user · b5dc608c

Chris Wilson authored 14 years ago

If the userspace driver is using a constant relocation array with a
static buffer, they will pass the same relocation array back to the
kernel. So we *do* need to update the presumed offset value in those
relocations to reflect the current object so that they remain correct
with future batchbuffers and we avoid the necessity of having to suspend
execution and perform redundant relocations.

Fixes the regression introduced by 12f889c for applications using
absolute addressing on trees of buffers (i.e. the current consumers of
libdrm_intel.so).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30996
Reported-by: Wang, Jinjin <jinjin.wang@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

b5dc608c

drm/i915: Track objects in global active list (as well as per-ring) · 69dc4987

Chris Wilson authored 14 years ago

To handle retirements, we need per-ring tracking of active objects.
To handle evictions, we need global tracking of active objects.

As we enable more rings, rebuilding the global list from the individual
per-ring lists quickly grows tiresome and overly complicated. Tracking the
active objects in two lists is the lesser of two evils.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

69dc4987
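
A minimal sketch of the two-list bookkeeping, assuming a small intrusive list in the style of the kernel's `list_head`. All names here (`struct node`, `struct sketch_obj`, `obj_mark_active`, `obj_retire`) are hypothetical and only illustrate one object being linked into a per-ring list and a global list at the same time.

```c
#include <assert.h>
#include <stddef.h>

struct node { struct node *prev, *next; };

static void list_init(struct node *h) { h->prev = h->next = h; }

static void list_add_tail(struct node *n, struct node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct node *n)
{
	assert(n->prev && n->next);	/* node must have been linked */
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

static size_t list_len(const struct node *h)
{
	size_t len = 0;
	for (const struct node *p = h->next; p != h; p = p->next)
		len++;
	return len;
}

struct sketch_obj {
	struct node ring_link;	/* membership of one ring's active list */
	struct node mm_link;	/* membership of the global active list */
};

/* Becoming active links the object into both lists at once. */
static void obj_mark_active(struct sketch_obj *o, struct node *ring_active,
			    struct node *global_active)
{
	list_add_tail(&o->ring_link, ring_active);
	list_add_tail(&o->mm_link, global_active);
}

/* Retiring removes it from both, keeping the two views consistent. */
static void obj_retire(struct sketch_obj *o)
{
	list_del(&o->ring_link);
	list_del(&o->mm_link);
}
```

Retirement walks only the ring's own list, eviction walks only the global one, and neither has to be rebuilt from the other.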

drm/i915: Simplify most HAS_BSD() checks · 87acb0a5

Chris Wilson authored 14 years ago

... by always initialising the empty ringbuffer, so that it is then always
safe to check whether it is active.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

87acb0a5

drm/i915: cache the last object lookup during pin_and_relocate() · 9af90d19

Chris Wilson authored 14 years ago

The most frequent relocation within a batchbuffer is a contiguous sequence
of vertex buffer relocations, for which we can virtually eliminate the
drm_gem_object_lookup() overhead by caching the last handle to object
translation.

In doing so we refactor the pin and relocate retry loop out of
do_execbuffer into its own helper function and so improve the error
paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

9af90d19
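
The caching idea can be sketched in a few lines, under the assumption that consecutive relocations often name the same handle. The types and functions below (`struct sketch_object`, `slow_lookup`, `cached_lookup`) are hypothetical stand-ins, not the driver's code, and reference counting of the cached object is deliberately omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_object { uint32_t handle; };

static struct sketch_object table[] = { {1}, {2}, {3} };
static unsigned int slow_lookups;	/* counts trips through the slow path */

/* Stand-in for an expensive handle-to-object translation. */
static struct sketch_object *slow_lookup(uint32_t handle)
{
	slow_lookups++;
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (table[i].handle == handle)
			return &table[i];
	return NULL;
}

struct lookup_cache {
	uint32_t handle;
	struct sketch_object *obj;	/* NULL means the cache is empty */
};

/* Reuse the previous translation when the handle repeats. */
static struct sketch_object *cached_lookup(struct lookup_cache *cache,
					   uint32_t handle)
{
	assert(cache != NULL);
	if (cache->obj && cache->handle == handle)
		return cache->obj;

	struct sketch_object *obj = slow_lookup(handle);
	if (obj) {
		cache->handle = handle;
		cache->obj = obj;
	}
	return obj;
}
```

For a run of relocations against handles 2, 2, 2, 3, 3 this performs two slow lookups instead of five.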

19 Oct, 2010 7 commits

drm/i915: Do interruptible mutex lock first to avoid locking for unreference · 1d7cfea1

Chris Wilson authored 14 years ago

One of the primary consumers of the i915 driver is X, a large
signal-driven application. Frequently when writing into the buffers, there is a
pending signal which causes us not to take the interruptible lock but
then we need to take that same lock around the object unreference. By
rearranging the code to do the interruptible lock as the first check, we
can avoid the frequent additional locking around the unreference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

1d7cfea1

drm/i915: rearrange mutex acquisition for pread · 4f27b75d

Chris Wilson authored 14 years ago

... to avoid the double acquisition along fast[er] paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

4f27b75d

drm/i915: Rearrange acquisition of mutex during pwrite · fbd5a26d

Chris Wilson authored 14 years ago

... to avoid reacquiring it to drop the object reference count on
exit. Note we have to make sure we now drop (and reacquire) the lock
around acquiring the mm semaphore on the slow paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

fbd5a26d

drm/i915: Attempt to prefault user pages for pread/pwrite · b5e4feb6

Chris Wilson authored 14 years ago

... in the hope that it makes the atomic fast paths more likely.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

b5e4feb6

drm/i915: Avoid taking the mutex for dropping the refcnt upon creation · 202f2fef

Chris Wilson authored 14 years ago

After allocating a handle for the fresh object, we know that we can
safely drop the refcnt without triggering a free so we do not need the
mutex. Strangely, this mutex acquisition is the one that appears on
driver profiles.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

202f2fef

drm/i915: Perform relocations in CPU domain [if in CPU domain] · f0c43d9b

Chris Wilson authored 14 years ago

Avoid an early eviction of the batch buffer into the uncached GTT
domain, and so do the relocation fixup in cacheable memory.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

f0c43d9b

drm/i915: Avoid vmallocing a buffer for the relocations · 2549d6c2

Chris Wilson authored 14 years ago

... perform an access validation check up front instead and copy them in
on-demand, during i915_gem_object_pin_and_relocate(). As around 20% of
the CPU overhead may be spent inside vmalloc for the relocation entries
when submitting an execbuffer [for x11perf -aa10text], the savings are
considerable and result in around a 10% throughput increase [for glyphs].
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

2549d6c2

07 Oct, 2010 1 commit

drm/i915: Wait for pending flips on the GPU · e59f2bac

Chris Wilson authored 14 years ago

Currently, if a batch buffer refers to an object with a pending flip,
then we sleep until that pending flip is completed (unpinned and
signalled). This is so that a flip can be queued and the user can
continue rendering to the backbuffer oblivious to whether the buffer is
still pinned as the scan out. (The kernel arbitrating at the last moment
to stall the batch and wait until the buffer is unpinned and replaced as
the front buffer.)

As we only have a queue depth of 1, we can simply wait for the current
pending flip to complete and continue rendering. We can achieve this
with a single WAIT_FOR_EVENT command inserted into the ring buffer prior
to executing the batch, *without* stalling the client.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

e59f2bac

04 Oct, 2010 1 commit

drm/i915: Skip pread/pwrite if size to copy is 0. · 35b62a89

Chris Wilson authored 14 years ago

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

35b62a89

03 Oct, 2010 2 commits

drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow · 7dcd2499

Chris Wilson authored 14 years ago

... and do the same for pread.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org

7dcd2499
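
The kind of overflow this title refers to can be shown with a small sketch. It assumes 64-bit unsigned offsets and sizes supplied by userspace; `range_fits` and `range_fits_naive` are hypothetical helpers written for illustration, not the driver's actual check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the byte range [offset, offset + size)
 * fits inside an object of obj_size bytes.
 */
static bool range_fits(uint64_t obj_size, uint64_t offset, uint64_t size)
{
	if (offset > obj_size)
		return false;
	/* obj_size - offset cannot underflow here, and no sum is formed. */
	return size <= obj_size - offset;
}

/* Naive form, shown only for contrast: the unsigned sum can wrap. */
static bool range_fits_naive(uint64_t obj_size, uint64_t offset, uint64_t size)
{
	return offset + size <= obj_size;
}

/* Self-check showing where the two forms disagree. */
static void bounds_check_demo(void)
{
	const uint64_t huge = UINT64_MAX - 4;

	assert(range_fits(4096, 0, 4096));
	assert(!range_fits(4096, 1, 4096));
	assert(!range_fits(4096, 8, huge));	 /* correctly rejected */
	assert(range_fits_naive(4096, 8, huge)); /* sum wraps to 3, wrongly accepted */
}
```

Rephrasing the comparison so that only a subtraction known not to underflow is performed removes the need to reason about wrap-around at all.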

drm/i915: Sanity check pread/pwrite · ce9d419d

Chris Wilson authored 14 years ago

Move the access control up from the fast paths, which are no longer
universally taken first, up into the caller. This then duplicates some
sanity checking along the slow paths, but is much simpler.
Tracked as CVE-2010-2962.
Reported-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org

ce9d419d

02 Oct, 2010 2 commits

drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code · 929f49bf

Julia Lawall authored 14 years ago

Extend the error handling code with operations found in other nearby error
handling code.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
statement S1,S2,S3;
constant C1,C2,C3;
@@

*if (...)
 {... S1 return -C1;}
...
*if (...)
 {... when != S1
    return -C2;}
...
*if (...)
 {... S1 return -C3;}
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org

929f49bf

drm/i915: Don't mask the return code whilst relocating. · 1cdf7fef

Chris Wilson authored 14 years ago

The return from move_to_gtt_domain() may indicate a pending signal which
needs to be handled as opposed to an actual error, for instance, so report
the original return value rather than forcing an EINVAL.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

1cdf7fef
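
A minimal sketch of the difference between masking and propagating a return code. The function names are hypothetical, and `-EINTR` is used as a user-space stand-in for whatever value the kernel returns when a signal is pending.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a step that a pending signal may interrupt. */
static int sketch_move_to_domain(int interrupted)
{
	return interrupted ? -EINTR : 0;
}

/*
 * Masking: every failure becomes -EINVAL, so the caller cannot tell a
 * retryable interruption from a genuinely bad argument.
 */
static int relocate_masked(int interrupted)
{
	if (sketch_move_to_domain(interrupted))
		return -EINVAL;
	return 0;
}

/* Propagating: the original code is handed back unchanged. */
static int relocate_propagated(int interrupted)
{
	int ret = sketch_move_to_domain(interrupted);

	if (ret)
		return ret;
	return 0;
}

/* Self-check contrasting the two behaviours. */
static void return_code_demo(void)
{
	assert(relocate_masked(0) == 0);
	assert(relocate_propagated(0) == 0);
	assert(relocate_masked(1) == -EINVAL);	   /* cause is lost */
	assert(relocate_propagated(1) == -EINTR); /* cause is preserved */
}
```

With the propagated form, a caller can restart the operation after handling the signal instead of reporting a spurious invalid-argument error.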

01 Oct, 2010 3 commits

drm/i915: Clear fence registers on GPU reset · 069efc1d

Chris Wilson authored 14 years ago

When the GPU is reset, the fence registers are invalidated, so release
the objects and clear them out.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

069efc1d

drm/i915: Force the domain to CPU on unbinding whilst wedged. · 812ed492

Chris Wilson authored 14 years ago

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30083
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

812ed492

drm: Move the GTT accounting to i915 · 73aa808f

Chris Wilson authored 14 years ago

Only drm/i915 does the bookkeeping that makes the information useful,
and the information maintained is driver specific, so move it out of the
core and into its single user.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>

73aa808f

30 Sep, 2010 4 commits

drm/gem: handlecount isn't really a kref so don't make it one. · 29d08b3e

Dave Airlie authored 14 years ago

There were lots of places being inconsistent since handle count
looked like a kref but it really wasn't.

Fix this by just making handle count an atomic on the object,
and have it increase the normal object kref.

Now i915/radeon/nouveau drivers can drop the normal reference on
userspace object creation, and have the handle hold it.

This patch fixes a memory leak or corruption on unload, because
the driver had no way of knowing if a handle had been actually
added for this object, and the fbcon object needed to know this
to clean itself up properly.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>

29d08b3e
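
The relationship between the two counters can be sketched with C11 atomics. This is an illustration under simplifying assumptions, not the DRM core code: the structure and function names are hypothetical, a `freed` flag replaces the actual free, and any extra cleanup tied to the last handle going away is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_gem_object {
	atomic_int refcount;	 /* lifetime of the object */
	atomic_int handle_count; /* number of userspace handles */
	bool freed;		 /* set instead of freeing, for the sketch */
};

static void obj_init(struct sketch_gem_object *o)
{
	atomic_init(&o->refcount, 1);	/* the creation reference */
	atomic_init(&o->handle_count, 0);
	o->freed = false;
}

static void obj_get(struct sketch_gem_object *o)
{
	atomic_fetch_add(&o->refcount, 1);
}

static void obj_put(struct sketch_gem_object *o)
{
	int old = atomic_fetch_sub(&o->refcount, 1);

	assert(old > 0);
	if (old == 1)
		o->freed = true;
}

/* Each handle bumps the plain counter and holds one object reference. */
static void handle_create(struct sketch_gem_object *o)
{
	atomic_fetch_add(&o->handle_count, 1);
	obj_get(o);
}

static void handle_delete(struct sketch_gem_object *o)
{
	atomic_fetch_sub(&o->handle_count, 1);
	obj_put(o);
}
```

After `handle_create()`, the creator can drop its own reference with `obj_put()` and the object stays alive until the last handle is deleted.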

drm/i915: Remove redundant deletion of obj->gpu_write_list · f394940b

Chris Wilson authored 14 years ago

At that point as the object is no longer in any GPU write domain it must
not be on the list, so the list_del() is redundant.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

f394940b

drm/i915: Make get/put pages static · 5cdf5881

Chris Wilson authored 14 years ago

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

5cdf5881

drm/i915/debug: Convert i915_verify_active() to scan all lists · 23bc5982

Chris Wilson authored 14 years ago

... and check more regularly.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

23bc5982

29 Sep, 2010 3 commits

drm/i915: Avoid blocking the kworker thread on a stuck mutex · 891b48cf

Chris Wilson authored 14 years ago

Just reschedule the retire requests again if the device is currently
busy. The request list will be pruned along other paths so will never
grow unbounded and so we can afford to miss the occasional pruning.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

891b48cf
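
The "try the lock, otherwise come back later" pattern can be sketched as follows. A C11 atomic flag stands in for the device mutex and a counter stands in for re-queuing the work item; all names are hypothetical and none of this is the driver's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool device_locked;	/* stands in for the device mutex */
static unsigned int retired;		/* times the work actually ran */
static unsigned int rescheduled;	/* times it deferred itself instead */

static bool sketch_trylock(void)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&device_locked, &expected, true);
}

static void sketch_unlock(void)
{
	assert(atomic_load(&device_locked));
	atomic_store(&device_locked, false);
}

/* Returns true if the work ran, false if it asked to be queued again. */
static bool retire_work(void)
{
	if (!sketch_trylock()) {
		rescheduled++;	/* real code would re-queue the delayed work */
		return false;
	}
	retired++;		/* prune completed requests here */
	sketch_unlock();
	return true;
}
```

The worker never sleeps on the lock; a missed pass is acceptable because, as the message notes, the request list is also pruned along other paths.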

drm/i915/debug: Remove default WATCH_BUF · 3d2a812a

Chris Wilson authored 14 years ago

Replaced by tracepoints.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

3d2a812a

drm/i915/debug: Remove defunct WATCH_LRU · 97d1ebaf

Chris Wilson authored 14 years ago

This has bitrotted through disuse and has been superseded by tracing and debugfs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

97d1ebaf

28 Sep, 2010 1 commit

Revert "drm/i915: Drop ring->lazy_request" · a56ba56c

Chris Wilson authored 14 years ago

With multiple rings generating requests independently, the outstanding
requests must also be tracked independently.
Reported-by: Wang Jinjin <jinjin.wang@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30380
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

a56ba56c

26 Sep, 2010 2 commits

drm/i915: Ensure that the mode change flushing is currently uninterruptible · ced270fa

Chris Wilson authored 14 years ago

Introduced by 48b956c5, I had thought I had already fixed this. Oh well.
Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

ced270fa

drm/i915: Convert the file mutex into a spinlock · 1c25595f

Chris Wilson authored 14 years ago

Daniel Vetter pointed out that in this case it would be clearer and
cleaner to use a spinlock instead of a mutex to protect the per-file
request list manipulation. Make it so.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

1c25595f

25 Sep, 2010 1 commit

drm/i915: Make the mutex_lock interruptible on ioctl paths · 76c1dec1

Chris Wilson authored 14 years ago

... and combine it with the wedged completion handler.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

76c1dec1