Commits · 6f0ea9e212b36fe831f104ab2ac7582b9741600a · Kirill Smelkov / linux

05 Mar, 2014 33 commits

drm/i915: assert we're not runtime suspended when accessing registers · 6f0ea9e2

Paulo Zanoni authored Feb 21, 2014

I could swear this was already happening in the current code...

Also, put the reads and writes in a generic place, so we don't forget
it again when we add runtime PM support to new platforms.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

6f0ea9e2
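
The idea of putting the check in one generic place can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct fake_dev, check_not_suspended(), reg_read() and reg_write() are hypothetical names, not the driver's actual API, and the "registers" are an ordinary array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's private state. */
struct fake_dev {
    bool suspended;        /* true while "runtime suspended" */
    unsigned int warnings; /* number of times the check fired */
    uint32_t regs[16];     /* pretend register space */
};

/* Warn (rather than crash) when registers are touched while suspended. */
void check_not_suspended(struct fake_dev *dev)
{
    if (dev->suspended) {
        dev->warnings++;
        fprintf(stderr, "WARN: device suspended\n");
    }
}

/*
 * Every read and write funnels through these two helpers, so the check
 * cannot be forgotten when support for a new platform is added.
 */
uint32_t reg_read(struct fake_dev *dev, unsigned int idx)
{
    check_not_suspended(dev);
    return dev->regs[idx % 16];
}

void reg_write(struct fake_dev *dev, unsigned int idx, uint32_t val)
{
    check_not_suspended(dev);
    dev->regs[idx % 16] = val;
}
```

With this shape, a caller that forgets to wake the device still gets its value, but the warning counter records the mistake.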

drm/i915: assert force wake is disabled when we runtime suspend · e998c40f

Paulo Zanoni authored Feb 21, 2014

Just to be sure...
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

e998c40f

drm/i915: call assert_device_not_suspended at gen6_force_wake_work · b2ec142c

Paulo Zanoni authored Feb 21, 2014

Because we shouldn't be runtime suspended when forcewake is supposed
to be enabled.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
[danvet: Update commit message - no WARN expected since the bugfix for
issues hit with this assert is already in. And resolve conflicts with
the change from worker to timer for the delayed fw release.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

b2ec142c

drm/i915: kill dev_priv->pc8.gpu_idle · 86c4ec0d

Paulo Zanoni authored Feb 21, 2014

Since the addition of dev_priv->mm.busy, there's no more need for
dev_priv->pc8.gpu_idle, so kill it.

Notice that when you remove gpu_idle, hsw_package_c8_gpu_idle and
hsw_package_c8_gpu_busy become identical to hsw_enable_package_c8 and
hsw_disable_package_c8, so just use them.

Also, when we boot the machine, dev_priv->mm.busy initially considers
the machine as idle. This is opposed to dev_priv->pc8.gpu_idle, which
considered it busy. So dev_priv->pc8.disable_count has to be
initialized to 1 now.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

86c4ec0d

drm/i915: get/put runtime PM in more places at i915_debugfs.c · 36623ef8

Paulo Zanoni authored Feb 21, 2014

These are places where we read (not write) registers while we're
runtime suspended.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

36623ef8

drm/i915: get runtime PM while trying to detect CRT · c19a0df2

Paulo Zanoni authored Feb 21, 2014

Otherwise we'll read registers that return 0xffffffff, trigger some
WARNs, think CRT is actually connected (because certain bits are 1),
and fail the drm-resources-equal testcase!

Tested on a SNB machine with runtime PM support (which is not upstream
yet, but is already on my public tree at freedesktop.org, and will
hopefully eventually become upstream).

Testcase: igt/pm_pc8/drm-resources-equal
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

c19a0df2

drm/i915: put runtime PM only when we actually release force_wake · 6d88064e

Paulo Zanoni authored Feb 21, 2014

When we call gen6_gt_force_wake_put we don't actually put force_wake,
we just schedule gen6_force_wake_work through mod_delayed_work, and
that will eventually release force_wake.

The problem is that we call intel_runtime_pm_put directly at
gen6_gt_force_wake_put, so most of the time we put our runtime PM
reference before the delayed work happens, so we may runtime suspend
while force_wake is still supposed to be enabled if the graphics
autosuspend_delay_ms is too small.

Now the nice thing about the current code is that after it triggers
the delayed work function it gets a refcount, and it only triggers the
delayed work function if refcount is zero. This guarantees that when
we schedule the function, it will run before we try to schedule it
again, which simplifies the problem and allows for the current
solution to work properly (hopefully!).

v2: - Keep the VLV refcounts balanced (Jesse)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

6d88064e
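
A minimal userspace model of the scheme described above, assuming a single counter for forcewake users and a single counter for runtime PM references. The names (struct fw_state, fw_get(), fw_put(), fw_deferred_release()) are hypothetical, and the delayed work is represented by a function the caller invokes by hand.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; not the driver's real structures. */
struct fw_state {
    int forcewake_count;  /* users holding forcewake, plus the deferred ref */
    int runtime_pm_refs;  /* runtime PM references held on behalf of forcewake */
    bool hw_awake;        /* whether the hardware is being kept awake */
    bool release_pending; /* deferred release has been scheduled */
};

void fw_get(struct fw_state *s)
{
    s->runtime_pm_refs++;
    if (s->forcewake_count++ == 0)
        s->hw_awake = true;
}

void fw_put(struct fw_state *s)
{
    bool deferred = false;

    if (--s->forcewake_count == 0) {
        /* Keep one reference for the deferred release to drop later. */
        s->forcewake_count++;
        s->release_pending = true;
        deferred = true;
    }

    /* Only drop runtime PM here if nothing was deferred. */
    if (!deferred)
        s->runtime_pm_refs--;
}

/* Stands in for the delayed work function. */
void fw_deferred_release(struct fw_state *s)
{
    if (!s->release_pending)
        return;
    s->release_pending = false;

    if (--s->forcewake_count == 0)
        s->hw_awake = false;

    /* The runtime PM reference is dropped only once the release has run. */
    s->runtime_pm_refs--;
}
```

In this model the runtime PM count cannot reach zero while a deferred release is still outstanding, which is the property the commit message is after.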

drm/i915: put runtime PM only at the end of intel_mark_idle · bb4cdd53

Paulo Zanoni authored Feb 21, 2014

Because intel_mark_idle still touches some registers: it needs the
machine to be awake. If you set both the autosuspend and PC8 delays to
zero, you can get a "Device suspended" WARN when gen6_rps_idle touches
registers.

This is not easy to reproduce, but happens once in a while when
running pm_pc8.

Testcase: igt/pm_pc8
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

bb4cdd53

drm/i915: Don't ban default context when stop_rings!=0 · ccc7bed0

Ville Syrjälä authored Feb 21, 2014

If we've explicitly stopped the rings for testing purposes, don't ban
the default context. Fixes kms_flip hang tests.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

ccc7bed0

drm/i915: print connector mode list in display_info · f103fc7d

Jesse Barnes authored Feb 20, 2014

Useful for bug reports.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

f103fc7d

drm/i915: Update VBT data structures to have MIPI block enhancements · ea9a6baf

Shobhit Kumar authored Feb 28, 2014

MIPI Block #52 provides configuration details for the MIPI panel,
including dphy settings as per panel and tcon specs.

Block #53 gives information on panel enable sequences

v2: Address review comments from Jani
    - Move panel ids from intel_dsi.h to intel_bios.h
    - bdb_mipi_config structure improvements for cleaner code
    - Adding units for the pps delays, all in ms
    - change data structure to be cleaner and simpler

v3: Corrected the unit for pps delays as 100us
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

ea9a6baf

drm/i915: Convert the forcewake worker into a timer func · 8232644c

Chris Wilson authored Mar 05, 2014

We don't want to suffer scheduling delay when turning off the GPU after
waking it up to touch registers. Ideally, we only want to keep the GPU
awake for the register access sequence, with a single forcewake dance on
the first access and release immediately after the last. We set a timer
on the first access so that we only dance once and on the next scheduler
tick, we drop the forcewake again.

This moves the cleanup routine from the common i915 workqueue to a timer
func so that we don't anger powertop, and drop the forcewake again
quicker.

v2: Enable the deferred force_wake_put for regular register reads as
    well.
v3: Beautification and make sure we disable forcewake when shutting
    down.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

8232644c

drm/i915/bdw: Add FBC support · 8f94d24b

Ben Widawsky authored Feb 20, 2014

This got lost when we shuffled around our internal branch and
GEN7_FEATURES macro. There were no HW changes to support FBC, so we just
need to set the flag.

v2: Don't allow FBC for any pipe but A on platforms with DDI. (Paulo)

Cc: Daisy Sun <daisy.sun@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

8f94d24b

drm/i915: Accurately track when we mark the hardware as idle/busy · f62a0076

Chris Wilson authored Feb 21, 2014

We currently call intel_mark_idle() too often, as we do so as a
side-effect of processing the request queue. However, the calls to
intel_mark_idle() are expected to be paired with a call to
intel_mark_busy() (or else we try to idle the hardware by accessing
registers that are already disabled). Make the idle/busy tracking
explicit to prevent the multiple calls.

v2: We can drop some of the complexity in __i915_add_request() as
queue_delayed_work() already behaves as we want (not requeuing the item
if it is already in the queue) and mark_busy/mark_idle imply that the
idle task is inactive.

v3: We do still need to cancel the pending idle task so that it is sent
again after the current busy load completes (not in the middle of it).
Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

f62a0076
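
The explicit idle/busy pairing can be pictured with a single flag that makes repeated calls harmless. This is a sketch with hypothetical names (struct gpu_state, mark_busy(), mark_idle()); the counters exist only so the effect is observable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the busy flag. */
struct gpu_state {
    bool busy;            /* explicit idle/busy tracking */
    int busy_transitions; /* times the hardware was actually marked busy */
    int idle_transitions; /* times the hardware was actually idled */
};

void mark_busy(struct gpu_state *s)
{
    if (s->busy)
        return; /* already busy: nothing to do */
    s->busy = true;
    s->busy_transitions++;
}

void mark_idle(struct gpu_state *s)
{
    if (!s->busy)
        return; /* never marked busy: do not touch the hardware again */
    s->busy = false;
    s->idle_transitions++;
}
```

Calling mark_idle() twice in a row only idles the hardware once, so an unpaired call no longer reaches registers that are already disabled.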

drm/i915: Fix forcewake counts for gen8 · e9dbd2b2

Mika Kuoppala authored Feb 18, 2014

Sometimes generic driver code gets forcewake explicitly by
gen6_gt_force_wake_get(), which checks forcewake_count before accessing
hardware. However, register access through the gen8_write functions calls the
low level hw accessors directly, ignoring the forcewake_count. This
leads to nested forcewake get from hardware, in ring init and possibly
elsewhere, causing forcewake ack clear errors and/or hangs.

Fix this by checking the forcewake count also in gen8_write

v2: Read side doesn't care about shadowed registers,
    Remove __needs_put funkiness from gen8_write. (Ville)
    Improved commit message.

References: https://bugs.freedesktop.org/show_bug.cgi?id=74007
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

e9dbd2b2
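
The nesting problem and its fix can be illustrated with a small model in which the write path consults the same counter as the explicit get/put pair. All names here (struct fw_dev, explicit_get(), explicit_put(), reg_write_fw(), the needs_forcewake parameter) are hypothetical and chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: counts how often the hardware handshake happens. */
struct fw_dev {
    int forcewake_count; /* explicit holders of forcewake */
    int hw_gets;         /* hardware-level forcewake requests */
    int hw_puts;         /* hardware-level forcewake releases */
};

void explicit_get(struct fw_dev *d)
{
    if (d->forcewake_count++ == 0)
        d->hw_gets++;
}

void explicit_put(struct fw_dev *d)
{
    if (--d->forcewake_count == 0)
        d->hw_puts++;
}

/* Write path: only do the handshake if nobody already holds forcewake. */
void reg_write_fw(struct fw_dev *d, bool needs_forcewake)
{
    bool dance = needs_forcewake && d->forcewake_count == 0;

    if (dance)
        d->hw_gets++;
    /* ... the actual register write would happen here ... */
    if (dance)
        d->hw_puts++;
}
```

A write issued between explicit_get() and explicit_put() therefore causes no second hardware request, which avoids the nested get described above.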

drm/i915: move hsw power domain comment to its right place · 93c73e8c

Imre Deak authored Feb 18, 2014

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

93c73e8c

drm/i915: use power domain api to check vga power state · 04098753

Imre Deak authored Feb 18, 2014

This way we can reuse the check on other platforms too. Also factor out
a version of the function that doesn't check if the power is on; we'll
need to call this from within the power domain framework.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

04098753

drm/i915: switch order of power domain init wrt. irq install · e13192f6

Imre Deak authored Feb 18, 2014

On VLV at least, the display IRQ register access and functionality
depend on its power well being on, so move the power domain HW init
before we install the IRQs.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

e13192f6

drm/i915: use drm_i915_private everywhere in the power domain api · da7e29bd

Imre Deak authored Feb 18, 2014

The power domains framework is internal to the i915 driver, so pass
drm_i915_private instead of drm_device to its functions.

Also remove a dangling intel_set_power_well() declaration.

No functional change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

da7e29bd

drm/i915: ignore bios output config if not all outputs are on · 7e696e4c

Daniel Vetter authored Mar 04, 2014

Both Ville and QA rather immediately complained that with the new
initial_config logic from Jesse not all outputs get enabled. Since the
fbdev emulation pretty much tries to always enable as many outputs as
possible (it even has hotplug handling and all that) fall back if more
outputs could have been enabled.

v2: Fix up my confusion about what enabled means - it's passed from
the fbdev helper, we need to check for a non-zero connector->encoder
link. Spotted by Ville.

v3: Add some debug output as requested by Jesse for debugging fallback
issues.

Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75552
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

7e696e4c

drm/i915: s/any_enabled/!fallback/ in fbdev_initial_config · 7c2bb531

Daniel Vetter authored Mar 04, 2014

It started as a simple check whether anything is lit up, but now it's
used to drive the general fallback logic to the default output
configuration selector in the helper library. So rename it for more
clarity.

Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

7c2bb531

drm/i915: Reject changes of fb base when we have a flip pending · 7d5e3799

Chris Wilson authored Mar 04, 2014

This should be impossible due to the wait for outstanding flips that the
caller is meant to perform prior to updating the scanout base. Paranoia
tells me to check anyway.

References: https://bugs.freedesktop.org/show_bug.cgi?id=75502
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

7d5e3799

Revert "drm/i915: enable HiZ Raw Stall Optimization on IVB" · 22721343

Chris Wilson authored Mar 04, 2014

This reverts commit 116f2b6d.

This optimization causes widespread corruption in games, and even in
glxgears, on my ivb:gt1. The corruption appears like z-fighting of
overlapping polygons in the HiZ buffer.

The observation ties in very closely with the description of the
optimization disabled by default on IVB:

"The Hierarchical Z RAW Stall Optimization allows non-overlapping
polygons in the same 8x4 pixel/sample area to be processed without
stalling waiting for the earlier ones to write to Hierarchical Z
buffer."

No reason is given for why it is disabled by default; usually such
optimizations are disabled because they are incomplete. However, there is no
indication whether this is a gt1-only issue either. Before considering
reenabling this optimization, I would first suggest reproducing the
corruption in piglit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75623
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chia-I Wu <olv@lunarg.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

22721343

drm/i915: re-add locking around hw state readout · 8b687df4

Jesse Barnes authored Feb 21, 2014

To silence locking complaints.  This was a rebase failure on my part in

commit fa9fa083
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Tue Feb 11 15:28:56 2014 -0800

    drm/i915: read out hw state earlier v2
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

8b687df4

drm/i915: honor forced connector modes v2 · ef34ab89

Jesse Barnes authored Feb 20, 2014

In the move over to use BIOS connector configs, we lost the ability to
force a specific set of connectors on or off.  Try to remedy that by
dropping back to the old behavior if we detect a hard coded connector
config.

v2: don't deref connector state for disabled connectors (Jesse)
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

ef34ab89

drm/i915: Revert workaround for disabling L3 cache aging on IVB · 1af8452f

Chris Wilson authored Feb 14, 2014

In commit e4e0c058
Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
Date:   Wed Feb 8 12:53:50 2012 -0800

    drm/i915: gen7: Implement an L3 caching workaround.

the L3 cache aging was disabled. This was part of a shotgun response
to a number of GPU hang bugs, but there appears to be no documentation
to suggest that disabling the L3 cache age was ever required (to prevent
the GPU hangs).

Restoring the L3 cache age is a minor performance win of around 2%
on IVB:GT2. (Note that this value seems to be consistent across a number
of tests and so appears to be above the usual noise.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

1af8452f

drm/i915: Revert workaround for disabling L3 cache aging on BYT · 47e74f0f

Sinclair Yeh authored Feb 19, 2014

V2:  edit the commit message to contain more info
The W/A spreadsheet says this is still required, but the b-spec says
it's not for BYT-T.  So the documentation is not clear.  However,
our experience with the other SKUs of BYT-I/M on Android and Linux
suggests that setting this bit actually causes GPU hang for certain
OGL benchmark applications.

Removing this bit completely resolves the GPU hangs.
Signed-off-by: Sinclair Yeh <sinclair.yeh@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

47e74f0f

drm/i915/bdw: Kill ppgtt->num_pt_pages · 5abbcca3

Ben Widawsky authored Feb 21, 2014

With the original PPGTT implementation if the number of PDPs was not a
power of two, the number of pages for the page tables would end up being
rounded up. The code actually had a bug here afaict, but this is a
theoretical bug as I don't believe this can actually occur with the
current code/HW.

With the rework of the page table allocations, there is no longer a
distinction between number of page table pages, and number of page
directory entries. To avoid confusion, kill the redundant (and newer)
struct member.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

5abbcca3

drm/i915: Split GEN6 PPGTT initialization up · b146520f

Ben Widawsky authored Feb 19, 2014

Simply to match the GEN8 style of PPGTT initialization, split up the
allocations and mappings. Unlike GEN8, we skip a separate dma_addr_t
allocation function, as it is much simpler pre-gen8.

With this code it would be easy to make a more general PPGTT
initialization function with per GEN alloc/map/etc. or use a common
helper, similar to the ringbuffer code. I don't see a benefit to doing
this just yet, but who knows...
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

b146520f

drm/i915: Split GEN6 PPGTT cleanup · a00d825d

Ben Widawsky authored Feb 19, 2014

This cleanup is similar to the GEN8 cleanup (though less necessary).
Having everything split will make cleaning the initialization path error
paths easier to understand.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

a00d825d

drm/i915: Update i915_gem_gtt.c copyright · c4ac524c

Ben Widawsky authored Feb 19, 2014

I keep meaning to do this... by now almost the entire file has been
written by an Intel employee (including Daniel post-2010).
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

c4ac524c

Revert "drm/i915/bdw: Limit GTT to 2GB" · 7907f45b

Ben Widawsky authored Feb 19, 2014

This reverts commit 3a2ffb65.

Now that the code is fixed to use smaller allocations, it should be safe
to let the full GGTT be used on BDW.

The testcase for this is anything which uses more than half of the GTT,
thus eclipsing the old limit.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

7907f45b

drm/i915/bdw: Reorganize PT allocations · 7ad47cf2

Ben Widawsky authored Feb 20, 2014

The previous allocation mechanism would get 2 contiguous allocations,
one for the page directories, and one for the page tables. As each page
table is 1 page, and there are 512 of these per page directory, this
goes to 2MB. An unfriendly request at best. Worse still, our HW now
supports 4 page directories, and a 2MB allocation is not allowed.

In order to fix this, this patch attempts to split up each page table
allocation into a single, discrete allocation. There is nothing really
fancy about the patch itself, it just has to manage an extra pointer
indirection, and have a fancier bit of logic to free up the pages.

To accommodate some of the added complexity, two new helpers are
introduced to allocate, and free the page table pages.

NOTE: I really wanted to split the way we do allocations, and the way in
which we identify the page table/page directory being used. I found
splitting this functionality up to be too unwieldy. I apologize in
advance to the reviewer. I'd recommend looking at the result, rather
than the diff.

v2/NOTE2: This patch predated commit:
6f1cc993
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 31 15:50:31 2013 +0000

    drm/i915: Avoid dereference past end of page arr

It fixed the same issue as that patch, but because of the limbo state of
PPGTT, Chris's patch was merged instead. The excess churn is a result of
my using my original patch, which has my preferred naming. Primarily
act_* is changed to which_*, but it's mostly the same otherwise. I've
kept the convention Chris used for the pte wrap (I had something
slightly different, and broken - but fixable)

v3: Rename which_p[..]e to drop which_ (Chris)
Remove BUG_ON in inner loop (Chris)
Redo the pde/pdpe wrap logic (Chris)

v4: s/1MB/2MB in commit message (Imre)
Plug leaking gen8_pt_pages in both the error path, as well as general
free case (Imre)

v5: Rename leftover "which_" variables (Imre)
Add the pde = 0 wrap that was missed from v3 (Imre)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squash in fixup from Ben.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

7ad47cf2
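
The 2MB figure follows from simple arithmetic, sketched below. The 4096-byte page size is an assumption of this sketch (the commit message only says each page table is 1 page), and the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_BYTES 4096u
/* From the commit message: 512 page tables per page directory. */
#define SKETCH_PT_PER_PD 512u

/* Bytes needed if all page tables for num_pd directories were contiguous. */
uint64_t contiguous_pt_bytes(unsigned int num_pd)
{
    return (uint64_t)num_pd * SKETCH_PT_PER_PD * SKETCH_PAGE_BYTES;
}

/* Number of separate one-page allocations after splitting them up. */
unsigned int discrete_pt_allocations(unsigned int num_pd)
{
    return num_pd * SKETCH_PT_PER_PD;
}
```

Under these assumptions one page directory needs 512 * 4096 bytes = 2 MiB of page tables in one piece, while the split approach asks for 512 independent single-page allocations instead.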

04 Mar, 2014 7 commits

drm/i915: Make clear/insert vfuncs args absolute · 782f1495

Ben Widawsky authored Feb 20, 2014

This patch converts insert_entries and clear_range, both functions which
are specific to the VM. These functions tend to encapsulate the gen
specific PTE writes. Passing absolute addresses to the insert_entries,
and clear_range will help make the logic clearer within the functions as
to what's going on. Currently, all callers simply do the appropriate
page shift, which IMO, ends up looking weird with an upcoming change for
the gen8 page table allocations.

Up until now, the PPGTT was a funky 2 level page table. GEN8 changes
this to look more like a 3 level page table, and to that extent we need
a significant amount more memory simply for the page tables. To address
this, the allocations will be split up in finer amounts.

v2: Replace size_t with uint64_t (Chris, Imre)

v3: Fix size in gen8_ppgtt_init (Ben)
Fix Size in i915_gem_suspend_gtt_mappings/restore (Imre)

Reviewed-by: Imre Deak <imre.deak@intel.com> (v2)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

782f1495
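
A sketch of what passing absolute values means in practice: the caller hands over a byte offset and byte length, and the conversion to entry indices happens inside the function rather than at every call site. The 12-bit page shift and the names (struct pte_span, span_from_bytes()) are assumptions of this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages, hence a shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical result type: which entries a byte range covers. */
struct pte_span {
    uint64_t first_entry; /* index of the first PTE in the range */
    uint64_t num_entries; /* number of PTEs in the range */
};

/* Convert an absolute byte range into PTE indices in one place. */
struct pte_span span_from_bytes(uint64_t start, uint64_t length)
{
    struct pte_span span;

    span.first_entry = start >> SKETCH_PAGE_SHIFT;
    span.num_entries = length >> SKETCH_PAGE_SHIFT;
    return span;
}
```

Using a 64-bit type for start and length mirrors the v2 note above about replacing size_t with uint64_t.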

drm/i915/bdw: Split ppgtt initialization up · bf2b4ed2

Ben Widawsky authored Feb 19, 2014

Like cleanup in an earlier patch, the code becomes much more readable,
and easier to extend if we extract out helper functions for the various
stages of init.

Note that with this patch it becomes really simple, and tempting to begin
using the 'goto out' idiom with explicit free/fini semantics. I've
kept the error path as similar as possible to the cleanup() function to
make sure cleanup is as robust as possible

v2: Remove comment "NB:From here on, ppgtt->base.cleanup() should
function properly"
Update commit message to reflect above

v3: Rebased on top of bugfixes found in the previous patch by Imre
Moved number of pd pages assertion to the proper place (Imre)

v4:
Allocate dma address space for num_pd_pages, not num_pd_entries (Ben)
Don't use gen8_pt_dma_addr after free on error path (Imre)
With new fix from v4 of the previous patch.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

bf2b4ed2

drm/i915/bdw: Reorganize PPGTT init · f3a964b9

Ben Widawsky authored Feb 19, 2014

Create 3 clear stages in PPGTT init. This will help make upcoming
changes more readable. The 3 stages are allocation, dma mapping, and
writing the P[DT]Es.

One nice benefit of the patch is that it makes 2 very clear error
points, allocation, and mapping, and avoids having to do any handling
after writing PTEs (something which was likely buggy before). This
simplified error handling I suspect will be helpful when we move to
deferred/dynamic page table allocation and mapping.

The patch also attempts to break up some of the steps into more
logical reviewable chunks, particularly when we free.

v2: Don't call cleanup on the error path since that takes down the
drm_mm and list entry, which aren't setup at this point.

v3: Fixes addressing Imre's comments from:
<1392821989.19792.13.camel@intelbox>

Don't do dynamic allocation for the page table DMA addresses. I can't
remember why I did it in the first place. This addresses one of Imre's
other issues.

Fix error path leak of page tables.

v4: Fix the fix of the error path leak. Original fix still leaked page
tables. (Imre)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

f3a964b9
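
The three-stage shape, with failure possible only in the first two stages, can be sketched in userspace C. Everything here is hypothetical: malloc() stands in for both page allocation and DMA mapping, and the fail parameter exists only to exercise the error paths.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the PPGTT state. */
struct sketch_ppgtt {
    void *pages;         /* stage 1: allocation */
    void *dma;           /* stage 2: "dma mapping" (modelled with malloc) */
    int entries_written; /* stage 3: set once the entries are written */
};

/* Failure injection, used only to demonstrate the two error points. */
enum fail_at { FAIL_NONE, FAIL_ALLOC, FAIL_MAP };

int sketch_ppgtt_init(struct sketch_ppgtt *p, enum fail_at fail)
{
    p->pages = NULL;
    p->dma = NULL;
    p->entries_written = 0;

    /* Stage 1: allocation (first error point). */
    p->pages = (fail == FAIL_ALLOC) ? NULL : malloc(64);
    if (!p->pages)
        return -1;

    /* Stage 2: mapping (second error point). */
    p->dma = (fail == FAIL_MAP) ? NULL : malloc(64);
    if (!p->dma)
        goto err_free_pages;

    /* Stage 3: writing entries cannot fail, so no unwinding follows it. */
    p->entries_written = 1;
    return 0;

err_free_pages:
    free(p->pages);
    p->pages = NULL;
    return -1;
}

void sketch_ppgtt_fini(struct sketch_ppgtt *p)
{
    free(p->dma);
    free(p->pages);
    p->dma = NULL;
    p->pages = NULL;
    p->entries_written = 0;
}
```

Because nothing can fail after the entries are written, the error handling only ever has to undo allocation, which keeps the unwind path short.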

drm/i915/bdw: Free PPGTT struct · b18b6bde

Ben Widawsky authored Feb 20, 2014

GEN8 never freed the PPGTT struct. As GEN8 doesn't use full PPGTT, the
leak is small and only found on a module reload, i.e. I don't think this
needs to go to stable.

v2: The very naive, kfree in gen8 ppgtt cleanup, is subject to a double
free on PPGTT initialization failure. (Spotted by Imre). Instead this
patch pulls the ppgtt struct freeing out of the cleanup and leaves it to
the allocators/callers or the one doing the last kref_put as in standard
convention
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

b18b6bde

drm/i915: Move ppgtt_release out of the header · 321f2ada

Ben Widawsky authored Feb 20, 2014

At one time it was expected to be called in multiple places by kref_put.
At the current time however, it is all contained within
i915_gem_context.c.

This patch makes an upcoming required addition a bit nicer since it too
doesn't need to be defined in a header file.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

321f2ada

drm/i915: Add a comment about WIZ hashing vs. thread counts · c5c98a58

Ville Syrjälä authored Feb 05, 2014

Add a comment next to our WIZ hashing setup to remind people about the
link between WIZ hashing disable bit and PS/WM thread counts.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

c5c98a58

drm/i915: Change BDW WIZ hashing mode to 16x4 · 36075a4c

Ville Syrjälä authored Feb 04, 2014

BSpec recommends using 8x4 hashing mode when MSAA is used. But in
practice 16x4 seems to have a slight edge in performance (on IVB and
HSW at least). So just use 16x4.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

36075a4c