Commits · 595e1eeb26d3fcf2e39b494f067ebb5eb3c77e08 · Kirill Smelkov / linux

10 Apr, 2015 40 commits

drm/i915: Remove vestigal DRI1 ring quiescing code · 595e1eeb

Chris Wilson authored Apr 07, 2015

After the removal of DRI1, all access to the rings are through requests
and so we can always be sure that there is a request to wait upon to
free up available space. The fallback code only existed so that we could
quiesce the GPU following unmediated access by DRI1.

v2: Rebase
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

595e1eeb

drm/i915: Use the global runtime-pm wakelock for a busy GPU for execlists · 4bb1bedb

Chris Wilson authored Apr 07, 2015

When we submit a request to the GPU, we first take the rpm wakelock, and
only release it once the GPU has been idle for a small period of time
after all requests have been complete. This means that we are sure no
new interrupt can arrive whilst we do not hold the rpm wakelock and so
can drop the individual get/put around every single request inside
execlists.

Note: to close one potential issue we should mark the GPU as busy
earlier in __i915_add_request.

To elaborate: The issue is that we emit the irq signalling sequence
before we grab the rpm reference, which means we could miss the
resulting interrupt (since that's not set up when suspended). The only
bad side effect is a missed interrupt, gt mmio writes automatically
wake up the hw itself. But otoh we have an umbrella rpm reference for
the entirety of execbuf, as long as that's there we're covered.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Explain a bit more about the add_request issue, which after
some irc chatting with Chris turns out to not be an issue really.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

4bb1bedb

drm/i915: Use simpler form of spin_lock_irq(execlist_lock) · b5eba372

Chris Wilson authored Apr 07, 2015

We can use the simpler spinlock form to disable interrupts as we are
always outside of an irq/softirq handler.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

b5eba372

drm/i915: Fix the VBT child device parsing for BSW · 90e4f159

Ville Syrjälä authored Mar 25, 2015

Recent BSW VBT has a VBT child device size 37 bytes instead of the 33
bytes our code assumes. This means we fail to parse the VBT and thus
fail to detect eDP ports properly and just register them as DP ports
instead.

Fix it up by using the reported child device size from the VBT instead
of assuming it matches out struct defintions.

The latest spec I have shows that the child device size should be 36
bytes for rev >= 195, however on my BSW the size is actually 37 bytes.
And our current struct definition is 33 bytes.

Feels like the entire VBT parses would need to be rewritten to handle
changes in the layout better, but for now I've decided to do just the
bare minimum to get my eDP port back.

Cc: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

90e4f159

drm/i915: Use complete address space in true PPGTT · a4e0bedc

Michel Thierry authored Apr 08, 2015

True PPGTT is capable of having a full address space, even if the system
has less allocated memory.

Note that aliasing PPGTT always aliases the GGTT and thus should remain
of the same size.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

a4e0bedc

drm/i915/gen8: Dynamic page table allocations · d7b2633d

Michel Thierry authored Apr 08, 2015

This finishes off the dynamic page tables allocations, in the legacy 3
level style that already exists. Most everything has already been setup
to this point, the patch finishes off the enabling by setting the
appropriate function pointers.

In LRC mode, contexts need to know the PDPs when they are populated. With
dynamic page table allocations, these PDPs may not exist yet. Check if
PDPs have been allocated and use the scratch page if they do not exist yet.

Before submission, update the PDPs in the logic ring context as PDPs
have been allocated.

v2: Update aliasing/true ppgtt allocate/teardown/clear functions for
gen 6 & 7.

v3: Rebase.

v4: Remove BUG() from ppgtt_unbind_vma, but keep checking that either
teardown_va_range or clear_range functions exist (Daniel).

v5: Similar to gen6, in init, gen8_ppgtt_clear_range call is only needed
for aliasing ppgtt. Zombie tracking was originally added for teardown
function and is no longer required.

v6: Update err_out case in gen8_alloc_va_range (missed from lastest
rebase).

v7: Rebase after s/page_tables/page_table/.

v8: Updated scratch_pt check after scratch flag was removed in previous
patch.

v9: Note that lrc mode needs to be updated to support init state without
any PDP.

v10: Unmap correct page_table in gen8_alloc_va_range's error case,  clean-up
gen8_aliasing_ppgtt_init (remove duplicated map), and initialize PTs
during page table allocation.

v11: Squashed LRC enabling commit, otherwise LRC mode would be left broken
until it was updated to handle the init case without any PDP.

v12: Do not overallocate new_pts bitmap, make alloc_gen8_temp_bitmaps
static and don't abuse of inline functions. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

d7b2633d

drm/i915/gen8: begin bitmap tracking · 33c8819f

Michel Thierry authored Apr 08, 2015

Like with gen6/7, we can enable bitmap tracking with all the
preallocations to make sure things actually don't blow up.

v2: Rebased to match changes from previous patches.
v3: Without teardown logic, rely on used_pdpes and used_pdes when
freeing page tables.
v4: Rebased after s/page_tables/page_table/.
v5: Rebased after page table generalizations.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

33c8819f

drm/i915/gen8: Split out mappings · e5815a2e

Michel Thierry authored Apr 08, 2015

When we do dynamic page table allocations for gen8, we'll need to have
more control over how and when we map page tables, similar to gen6.
In particular, DMA mappings for page directories/tables occur at allocation
time.

This patch adds the functionality and calls it at init, which should
have no functional change.

The PDPEs are still a special case for now. We'll need a function for
that in the future as well.

v2: Handle renamed unmap_and_free_page functions.
v3: Updated after teardown_va logic was removed.
v4: Rebase after s/page_tables/page_table/.
v5: No longer allocate all PDPs in GEN8+ systems with less than 4GB of
memory, and update populate_lr_context to handle this new case (proper
tracking will be added later in the patch series).
v6: Assign lrc page directory pointer addresses using a macro. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

e5815a2e

drm/i915: Extract PPGTT param from page_directory alloc · c488dbba

Michel Thierry authored Apr 08, 2015

This will be useful for when we move to 48b addressing, and the PDP isn't
the root of the page table structure.

v2: Rebase after changes for Gen8+ systems with less than 4GB of memory.
v3: Rebase after Mika's code review.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

c488dbba

drm/i915: num_pd_pages/num_pd_entries isn't useful · 09942c65

Michel Thierry authored Apr 08, 2015

These values are never quite useful for dynamic allocations of the page
tables. Getting rid of them will help prevent later confusion.

v2: Updated to use unmap_and_free_pd functions.
v3: Updated gen8_ppgtt_free after teardown logic was removed.
v4: Rebase after s/page_tables/page_table/.
v5: Keep allocating all page directories in GEN8+ systems with less
than 4GB of memory. Updated gen6_for_all_pdes.
v6: Prevent (harmless) out of range access in gen6_for_all_pdes.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

09942c65

drm/i915/gen8: Update pdp switch and point unused PDPs to scratch page · 7cb6d7ac

Michel Thierry authored Apr 08, 2015

One important part of this patch is we now write a scratch page
directory into any unused PDP descriptors. This matters for 2 reasons,
first, we're not allowed to just use 0, or an invalid pointer, and second,
we must wipe out any previous contents from the last context.

The latter point only matters with full PPGTT. The former point only
effect platforms with less than 4GB memory.

v2: Updated commit message to point that we must set unused PDPs to the
scratch page.

v3: Unmap scratch_pd in gen8_ppgtt_free.

v4: Initialize scratch_pd. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

7cb6d7ac

drm/i915/gen8: pagetable allocation rework · 5441f0cb

Michel Thierry authored Apr 08, 2015

Start using gen8_for_each_pde macro to allocate page tables.

v2: teardown_va_range references removed.
v3: Rebase after s/page_tables/page_table/.
v4: Keep setting up page tables for all page directories in systems with
less than 4GB of memory.
v5: Also initialize the page tables. (Mika)
v6: Initialize all page tables, including the extra ones from systems
with less than 4GB of memory. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

5441f0cb

drm/i915/gen8: page directories rework allocation · 69876bed

Michel Thierry authored Apr 08, 2015

Start using gen8_for_each_pdpe macro to allocate the page directories.

Similar to PTs, while setting up a page directory, make all entries of
the  pd point to the scratch pd before mapping (and make all its entries
point to the scratch page); this is to be safe in case of out of bound
access or  proactive prefetch. Systems without LLC require an explicit
flush.

v2: Rebased after s/free_pt_*/unmap_and_free_pt/ change.
v3: Rebased after teardown va range logic was removed.
v4: Keep setting up all page directories for systems with less than 4GB
of memory.
v5: Initialize PDs. (Mika)
v6: Initialize also the extra PDs from systems with less than 4GB of
memory. (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

69876bed

drm/i915/gen8: Add dynamic allocation macros and helper functions · 9271d959

Michel Thierry authored Apr 08, 2015

Similar to gen6, we will use for_each_pde/for_each_pdpe
and pte/pde/pdpe_index to iterate over these new structures.

v2: Match trace_i915_va_teardown params
v3: Multiple rebases.
v4: Updated to use unmap_and_free_pt.
v5: teardown_va_range logic no longer needed.
v6: Rebase after s/page_tables/page_table/.
v7: Renamed commit to match what it does now (it was "Use dynamic
allocation idioms on free").
v8: Prevent (harmless) out of range access in gen8_for_each_pde and
gen8_for_each_pdpe_e.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: s/BUG/WARN/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

9271d959

drm/i915/gen8: Initialize page tables · 5a8e9943

Michel Thierry authored Apr 08, 2015

Similar to gen6, while setting up a page table, make all entries of the
pt point to the scratch page before mapping; this is to be safe in case
of out of bound access or proactive prefetch.

Systems without LLC require an explicit flush.

v2: Expanded commit text and fixed indentation (Mika)
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

5a8e9943

drm/i915: Remove unnecessary gen8_ppgtt_unmap_pages · 9c57f070

Michel Thierry authored Apr 08, 2015

We are already unmapping them in gen8_ppgtt_free. This function became
redundant since commit 06fda602
("drm/i915: Create page table allocators").

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

9c57f070

drm/i915: Remove _entry from PPGTT page structures · ec565b3c

Michel Thierry authored Apr 08, 2015

Lets try to keep this consistent:

Page Directory Pointer (PDP).
Page Directory (PD), also known as page directory pointer entries.
Page Table (PT), also known as page directory entries.

s/struct i915_page_table_entry/struct i915_page_table/
s/struct i915_page_directory_entry/struct i915_page_directory/
s/struct i915_page_directory_pointer_entry/struct
i915_page_directory_pointer/
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

ec565b3c

drm/i915: Record ring->start address in error state · 94f8cf10

Chris Wilson authored Apr 07, 2015

This is mostly useful for execlists where the rings switch between
contexts (and so checking that the ring's start register matches the
context is important).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

94f8cf10

drm/i915: Suppress empty lines from debugfs/i915_gem_objects · b0da1b79

Chris Wilson authored Apr 07, 2015

This is just so that I don't have to read about the batch pool on
systems that are not using it! Rather than using a newline between the
kernel clients and userspace clients, just distinguish the internal
allocations with a '[k]'
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

b0da1b79

drm/i915: Include active flag when describing objects in debugfs · 481a3d43

Chris Wilson authored Apr 07, 2015

Since we use obj->active as a hint in many places throughout the code,
knowing its state in debugfs is extremely useful.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

481a3d43

drm/i915: Split batch pool into size buckets · 8d9d5744

Chris Wilson authored Apr 07, 2015

Now with the trimmed memcpy before the command parser, we try to
allocate many different sizes of batches, predominantly one or two
pages. We can therefore speed up searching for a good sized batch by
keeping the objects of buckets of roughly the same size.

v2: Add a comment about bucket sizes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

8d9d5744

drm/i915: Free batch pool when idle · 35c94185

Chris Wilson authored Apr 07, 2015

At runtime, this helps ensure that the batch pools are kept trim and
fast. Then at suspend, this releases memory that we do not need to
restore. It also ties into the oom-notifier to ensure that we recover as
much kernel memory as possible during OOM.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

35c94185

drm/i915: Split the batch pool by engine · 06fbca71

Chris Wilson authored Apr 07, 2015

I woke up one morning and found 50k objects sitting in the batch pool
and every search seemed to iterate the entire list... Painting the
screen in oils would provide a more fluid display.

One issue with the current design is that we only check for retirements
on the current ring when preparing to submit a new batch. This means
that we can have thousands of "active" batches on another ring that we
have to walk over. The simplest way to avoid that is to split the pools
per ring and then our LRU execution ordering will also ensure that the
inactive buffers remain at the front.

v2: execlists still requires duplicate code.
v3: execlists requires more duplicate code
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

06fbca71

drm/i915: Tidy batch pool logic · de4e783a

Chris Wilson authored Apr 07, 2015

Move the madvise logic out of the execbuffer main path into the
relatively rare allocation path, making the execbuffer manipulation less
fragile.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

de4e783a

drm/i915: Split i915_gem_batch_pool into its own header · ed9ddd25

Chris Wilson authored Apr 07, 2015

In the next patch, I want to use the structure elsewhere and so require
it defined earlier. Rather than move the definition to an earlier location
where it feels very odd, place it in its own header file.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

ed9ddd25

drm/i915: Re-enable RPS wait-boosting for all engines · 7c27f525

Chris Wilson authored Apr 07, 2015

This reverts commit ec5cc0f9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Jun 12 10:28:55 2014 +0100

    drm/i915: Restrict GPU boost to the RCS engine

The premise that media/blitter workloads are not affected by boosting is
patently false with a trip through igt. The question that remains is
what exactly is going wrong with the media workload that prompted this?
Hopefully that would be fixed by the missing agressive downclocking, in
addition to the extra restrictions imposed on how frequent a process is
allowed to boost.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll>
Acked-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

7c27f525

drm/i915: Deminish contribution of wait-boosting from clients · 1854d5ca

Chris Wilson authored Apr 07, 2015

With boosting for missed pageflips, we have a much stronger indication
of when we need to (temporarily) boost GPU frequency to ensure smooth
delivery of frames. So now only allow each client to perform one RPS boost
in each period of GPU activity due to stalling on results.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

1854d5ca

drm/i915: Boost GPU frequency if we detect outstanding pageflips · 6ad790c0

Chris Wilson authored Apr 07, 2015

If we hit a vblank and see that have a pageflip queue but not yet
processed, ensure that the GPU is running at maximum in order to clear
the backlog. Pageflips are only queued for the following vblank, if we
miss it, there will be a visible stutter. Boosting the GPU frequency
doesn't prevent us from missing the target vblank, but it should help
the subsequent frames hitting theirs.

v2: Reorder vblank vs flip-complete so that we only check for a missed
flip after processing the completion events, and avoid spurious boosts.

v3: Rename missed_vblank
v4: Rebase
v5: Cancel the outstanding work in runtime suspend
v6: Rebase
v7: Rebase required fixing
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Deepak S<deepak.s@linux.intel.com>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

6ad790c0

drm/i915: Fix computation of last_adjustment for RPS autotuning · edcf284b

Chris Wilson authored Apr 07, 2015

The issue is that by computing the last_adj value after applying the
clamping, we can end up with a bogus value for feeding into the next RPS
autotuning step.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

edcf284b

drm/i915: Agressive downclocking on Baytrail · 8fb55197

Chris Wilson authored Apr 07, 2015

Reuse the same reclocking strategy for Baytail as on its bigger brethren,
Sandybridge and Ivybridge. In particular, this makes the device quicker
to reclock (both up and down) though the tendency now is to downclock
more aggressively to compensate for the RPS boosts.

v2: Rebase
v3: Exclude Cherrytrail as Deepak was concerned that the increased
number of register writes would wake the common powerwell too often.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

8fb55197

drm/i915: Fix the flip synchronisation to consider mmioflips · cf5d8a46

Chris Wilson authored Apr 07, 2015

Currently we emit semaphore synchronisation as if we were going to flip
using the target CS engine, but we then change our minds and do the flip
using the CPU. Consequently we write instructions to the ring but never
use them - even to the point of filling that ring up entirely and never
submitting a request.

The wrinkle in the ointment is that we have to tell a white lie to
pin-to-display for it to skip the synchronisation for mmioflips as we
will create a task specifically for that slow synchronisation. An oddity
of note is the discrepancy in requests that we tell to pin-display to
serialise to and that we then eventually wait upon. This is due to a
limitation in the i915_gem_object_sync() routine that will be lifted
later.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

cf5d8a46

drm/i915: Cache last obj->pages location for i915_gem_object_get_page() · ee286370

Chris Wilson authored Apr 07, 2015

The biggest user of i915_gem_object_get_page() is the relocation
processing during execbuffer. Typically userspace passes in a set of
relocations in sorted order. Sadly, we alternate between relocations
increasing from the start of the buffers, and relocations decreasing
from the end. However the majority of consecutive lookups will still be
in the same page. We could cache the start of the last sg chain, however
for most callers, the entire sgl is inside a single chain and so we see
no improve from the extra layer of caching.

v2: Avoid the double increment inside unlikely()

References: https://bugs.freedesktop.org/show_bug.cgi?id=88308Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

ee286370

drm/i915/skl: Implement WaDisableVFUnitClockGating · f9fc42f4

Damien Lespiau authored Feb 26, 2015

Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

f9fc42f4

drm/i915/skl: Fix stepping check for a couple of W/As · 669506e7

Damien Lespiau authored Feb 26, 2015

Both WaDisableSDEUnitClockGating and WaSetGAPSunitClckGateDisable are
needed on B0 as well.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

669506e7

drm/i915: Do not set L3-LLC Coherency bit in ctx descriptor · 51847fb9

Arun Siluvery authored Apr 07, 2015

According to Spec this is a reserved bit for Gen9+ and should not be set.

Change-Id: I0215fb7057b94139b7a2f90ecc7a0201c0c93ad4
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

51847fb9

drm/i915: use kref_put_mutex in i915_gem_request_unreference__unlocked · b833bb61

Maarten Lankhorst authored Apr 07, 2015

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

b833bb61

drm/i915: Don't use staged config in intel_mst_pre_enable_dp() · 9b4fd8f2

Ander Conselvan de Oliveira authored Apr 02, 2015

For the conversion to atomic. The pre_enable() hooks are called as part
of the crtc enable sequence, at which point the staged config was
already made effective. Furthermore, the function actually changes
hardware state, so it should anyway deal with current and not staged
config.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

9b4fd8f2

drm/i915: Don't use staged config in check_encoder_cloning() · 98a221da

Ander Conselvan de Oliveira authored Apr 02, 2015

Reduce dependency on the staged config by using the atomic state
instead.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

98a221da

drm/i915: Don't use staged config in check_digital_port_conflicts() · 5448a00d

Ander Conselvan de Oliveira authored Apr 02, 2015

Reduce dependency on the staged config by using the atomic state
instead.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

5448a00d

drm/i915: Remove intel_crtc->new_config · d69ba09b

Ander Conselvan de Oliveira authored Apr 02, 2015

It's not needed anymore, now that all the users were converted to using
an atomic state.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

d69ba09b