- 21 May, 2015 14 commits
-
-
Ville Syrjälä authored
IBX can have problems with the first write to the port register getting masked when enabling the port. We are trying to apply the workaround also when disabling the port where it's not needed, and we also try to apply it for CPT/PPT as well which don't need it. Just kill it. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: Resolve conflict with the remove CHV if block.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
The IBX 12bpc port enable toggle is only relevant when enabling the port, not when disabling it. Also this code doesn't actually toggle anything, and essentially just writes the port register one extra time. Furthermore CPT/PPT don't need such workarounds and yet we include them. Just kill it. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
We need to re-init the display hardware when going out of suspend. This includes: - Hooking the PCH to the reset logic - Restoring CDCDLK - Enabling the DDB power Among those, only the CDCDLK one is a bit tricky. There's some complexity in that: - DPLL0 (which is the source for CDCLK) has two VCOs, each with a set of supported frequencies. As eDP also uses DPLL0 for its link rate, once DPLL0 is on, we restrict the possible eDP link rates the chosen VCO. - CDCLK also limits the bandwidth available to push pixels. So, as a first step, this commit restore what the BIOS set, until I can do more testing. In case that's of interest for the reviewer, I've unit tested the function that derives the decimal frequency field: #include <stdio.h> #include <stdint.h> #include <assert.h> #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x))) static const struct dpll_freq { unsigned int freq; unsigned int decimal; } freqs[] = { { .freq = 308570, .decimal = 0b01001100111}, { .freq = 337500, .decimal = 0b01010100001}, { .freq = 432000, .decimal = 0b01101011110}, { .freq = 450000, .decimal = 0b01110000010}, { .freq = 540000, .decimal = 0b10000110110}, { .freq = 617140, .decimal = 0b10011010000}, { .freq = 675000, .decimal = 0b10101000100}, }; static void intbits(unsigned int v) { int i; for(i = 10; i >= 0; i--) putchar('0' + ((v >> i) & 1)); } static unsigned int freq_decimal(unsigned int freq /* in kHz */) { return (freq - 1000) / 500; } static void test_freq(const struct dpll_freq *entry) { unsigned int decimal = freq_decimal(entry->freq); printf("freq: %d, expected: ", entry->freq); intbits(entry->decimal); printf(", got: "); intbits(decimal); putchar('\n'); assert(decimal == entry->decimal); } int main(int argc, char **argv) { int i; for (i = 0; i < ARRAY_SIZE(freqs); i++) test_freq(&freqs[i]); return 0; } v2: - Rebase on top of -nightly - Use (freq - 1000) / 500 for the decimal frequency (Ville) - Fix setting the enable bit of HSW_NDE_RSTWRN_OPT (Ville) - Rename skl_display_{resume,suspend} to skl_{init,uninit}_cdclk to be consistent with the BXT code (Ville) - Store boot CDCLK in ddi_pll_init (Ville) - Merge dev_priv's skl_boot_cdclk into cdclk_freq - Use LCPLL_PLL_LOCK instead of (1 << 30) (Ville) - Replace various '0' by SKL_DPLL0 to be a bit more explicit that we're programming DPLL0 - Busy poll the PCU before doing the frequency change. It takes about 3/4 cycles, each separated by 10us, to get the ACK from the CPU (Ville) v3: - Restore dev_priv->skl_boot_cdclk, leaving unification with dev_priv->cdclk_freq for a later patch (Daniel, Ville) Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
If the client stalls on a congested request, chosen to be 20ms old to match throttling, allow the client a free RPS boost. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: s/rq/req/] [danvet: s/0/NULL/ reported by 0-day build] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
If we have clients stalled waiting for requests, ignore the GPU if it signals that it should downclock due to low load. This helps prevent the automatic timeout from causing extremely long running batches from taking even longer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Now that we have internal clients, rather than faking a whole drm_i915_file_private just for tracking RPS boosts, create a new struct intel_rps_client and pass it along when waiting. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: s/rq/req/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Since we will often pageflip to an active surface, we will often have to wait for the surface to be written before issuing the flip. Also we are likely to wait on that surface in plenty of time before the vblank. Since we have a mechanism for boosting when a flip misses the expected vblank, curtain the number of times we RPS boost when simply waiting for mmioflip. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: s/rq/req/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Ring switches can occur many times per frame, and are often out of control, causing frequent RPS boosting for no practical benefit. Treat the sw semaphore synchronisation as a separate client and only allow it to boost once per busy/idle cycle. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: s/rq/req/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
This trims a little overhead from the common case of not needing to synchronize between rings. v2: execlists is special and likes to duplicate code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Currently, we only track the last request globally across all engines. This prevents us from issuing concurrent read requests on e.g. the RCS and BCS engines (or more likely the render and media engines). Without semaphores, we incur costly stalls as we synchronise between rings - greatly impacting the current performance of Broadwell versus Haswell in certain workloads (like video decode). With the introduction of reference counted requests, it is much easier to track the last request per ring, as well as the last global write request so that we can optimise inter-engine read read requests (as well as better optimise certain CPU waits). v2: Fix inverted readonly condition for nonblocking waits. v3: Handle non-continguous engine array after waits v4: Rebase, tidy, rewrite ring list debugging v5: Use obj->active as a bitfield, it looks cool v6: Micro-optimise, mostly involving moving code around v7: Fix retire-requests-upto for execlists (and multiple rq->ringbuf) v8: Rebase v9: Refactor i915_gem_object_sync() to allow the compiler to better optimise it. Benchmark: igt/gem_read_read_speed hsw:gt3e (with semaphores): Before: Time to read-read 1024k: 275.794µs After: Time to read-read 1024k: 123.260µs hsw:gt3e (w/o semaphores): Before: Time to read-read 1024k: 230.433µs After: Time to read-read 1024k: 124.593µs bdw-u (w/o semaphores): Before After Time to read-read 1x1: 26.274µs 10.350µs Time to read-read 128x128: 40.097µs 21.366µs Time to read-read 256x256: 77.087µs 42.608µs Time to read-read 512x512: 281.999µs 181.155µs Time to read-read 1024x1024: 1196.141µs 1118.223µs Time to read-read 2048x2048: 5639.072µs 5225.837µs Time to read-read 4096x4096: 22401.662µs 21137.067µs Time to read-read 8192x8192: 89617.735µs 85637.681µs Testcase: igt/gem_concurrent_blit (read-read and friends) Cc: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v8] [danvet: s/\<rq\>/req/g] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
The merged seqno->request conversion from John called request variables req, but some (not all) of Chris' recent patches changed those to just rq. We've had a lenghty (and inconclusive) discussion on irc which is the more meaningful name with maybe at most a slight bias towards req. Given that the "don't change names without good reason to avoid conflicts" rule applies, so lets go back to a req everywhere for consistency. I'll sed any patches for which this will cause conflicts before applying. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: John Harrison <John.C.Harrison@Intel.com> [danvet: s/origina/merged/ as pointed out by Chris - the first mass-conversion patch was from Chris, the merged one from John.] Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
-
Imre Deak authored
v2: - set the override disable flag too on stepping F0 (mika) Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Imre Deak authored
On B0 and C0 steppings the workaround enable bit would be overriden by default, so the overriding must be disabled. The WA was added in commit 83a24979 Author: Nick Hoath <nicholas.hoath@intel.com> Date: Fri Apr 10 13:12:26 2015 +0100 drm/i915/bxt: Add WaForceContextSaveRestoreNonCoherent Spotted-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Our driver compiles clean (nowadays thanks to 0day) but for me, at least, it would be beneficial if the compiler threw an error rather than a warning when it found a piece of suspect code. (I use this to compile-check patch series and want to break on the first compiler error in order to fix the patch.) v2: Kick off a new "Debugging" submenu for i915.ko Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jani Nikula <jani.nikula@intel.com> [danvet: Add "DRM i915" to the menu name as requested by Chris.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 20 May, 2015 26 commits
-
-
Damien Lespiau authored
The macros we use there are the magic ones that can take either dev or dev_priv. We'd like to move as much as possible towards dev_priv though. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Couldn't let it go! Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Currently bxt_resume_prepare() is only used in the runtime-resume path. Add it to the full S3/S4 path as well. v2: Rebase on top of the vlv_resume_prepare() shuffling around Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> (v1) Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Sonika Jindal authored
Since DRM_ROTATE is counter clockwise (which is compliant with Xrandr), and HW rotation is clockwise, swapping 90/270 to work as expected from userspace. v2: Rebased Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Sonika Jindal <sonika.jindal@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ander Conselvan de Oliveira authored
Explain why a few fields of the new pipe_config have their values preserved, while the others are zeroed. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Imre Deak authored
Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Imre Deak authored
Also make the WA comment consistent with the rest, where the stepping info is not shown. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
ARGB8888 is used for cursors on all platforms so we need to allow it everywhere. ABGR8888 is currently only honoured: - on VLV/CHV in sprite planes - on SKL+ for primary and sprite planes so only allow it for those platforms. Note that we only support ARGB8888/ABGR8888 on the primary plane for SKL/BXT because we have in line of sight the pipe bottom color on those platforms and because the primary plane programming on VLV/CHV doesn't anything different for those formats today. v2: Fix the logic to forbid the creation ABGR2101010 fbs (Ville) v3: Still allow the creation of ARGB8888 fbs now that cursor planes use real fb objects (found by PRTS). Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Vandana Kannan authored
Making lane stagger calculation common for HDMI and DP v2: Imre's comments addressed - Remove lane stagger from bxt_clk_div and make it a local variable in ddi_pll_select Signed-off-by: Vandana Kannan <vandana.kannan@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Vandana Kannan authored
BUN 1: prop_coeff, int_coeff, tdctargetcnt programming updated and tied to VCO frequencies. Program i_lockthresh in PORT_PLL_9. VCO calculated based on the formula: Desired Output = Port bit rate in MHz (DisplayPort HBR2 is 5400 MHz) Fast Clock = Desired Output / 2 VCO = Fast Clock * P1 * P2 Prop_coeff, int_coeff, and tdctargetcnt modified according to above calculation. BUN 2: Port PLLs require additional programming at certain frequencies - DCO amplitude in PORT_PLL_10 Review comments from Siva which were addressed in the initial version of the patch. - Change PORT_PLL_LOCK_THRESHOLD to PORT_PLL_LOCK_THRESHOLD_MASK - Calculate for HDMI - Correct values for vco = 5.4 - return in case of invalid vco range v2: Imre's review comments addressed - change dcoampovr_en to dcoampovr_en_h - change PORT_PLL_DCO_AMP_OVR_EN to PORT_PLL_DCO_AMP_OVR_EN_H - Correct lane stagger value for 324MHz - Make coef common for HDMI and DP - remove superfluous comments v3: Imre's comments addressed - Remove Prop_coeff, int_coeff, tdctargetcnt, dcoampovr_en, gain_ctl, dcoampovr_en_h from bxt_clk_div and make them local variables. Signed-off-by: Vandana Kannan <vandana.kannan@intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> [v1] Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Jani Nikula authored
Be in line with other features that we have. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Jani Nikula authored
Turn [drm:intel_dp_print_rates] source rates: 162000,270000,540000, [drm:intel_dp_print_rates] sink rates: 162000,270000, [drm:intel_dp_print_rates] common rates: 162000,270000, into [drm:intel_dp_print_rates] source rates: 162000, 270000, 540000 [drm:intel_dp_print_rates] sink rates: 162000, 270000 [drm:intel_dp_print_rates] common rates: 162000, 270000 Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
We just have have VLV and CHV sprites programming the hardware differently for the ABGR2101010 so keep them working. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
That define makes it hard to figure out what is the actual list of formats at a glance. Expand it then. Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
Mika encountered one pathological scenario under X where acquiring all the mm locks (required to insert a mmu notifier) was very slow, so slow that by the time we tried to lock the struct_mutex with the usual call to i915_mutex_lock_interruptible(), X's signal timer had fired causing us to restart the ioctl (and so looped indefinitely). While I suspect this is the result of another bug (something leaking mm perhaps?) we can forgo the error checking and interuptible nature of the lock here so we only have to pay the expense once and get on with it. This does expose the userptr creation routine to a driver livelock though by not being interruptible. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Init ret to avoid issues reported by PRTS.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ander Conselvan de Oliveira authored
When the modeset code is reached with a CRTC that only needs a flip, the code that assigns PLLs is skipped. But since there is still a state swap for that CRTC, the current PLL assignment needs to be preserved. I missed the ddi_pll_sel field in the following commit, which causes warnings in DDI platforms. commit 4978cc93 Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Date: Tue Apr 21 17:13:21 2015 +0300 drm/i915: Preserve shared DPLL information in new pipe_config Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90410Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Tvrtko Ursulin authored
v2: Split strings to 80 char, add ddi_pll_sel and fixed typo. (Damien Lespiau) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ander Conselvan de Oliveira authored
In the following commit, the place where the contents of dpll_hw_state in crtc_state where zeroed was changed. Prior to that commit, it happened when the new state was allocated, but now that happens just before the call the .crtc_compute_clock() hook. The DP code for SKL, however, sets up the (private) PLL in the encoder compute config function that has already run by the time that memset() is reached, causing the previous value to be lost. This patch fixes the issue by moving the memset() down the call chain, so that it is only called if the values in dpll_hw_state are going to be updated. commit 4978cc93 Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Date: Tue Apr 21 17:13:21 2015 +0300 drm/i915: Preserve shared DPLL information in new pipe_config Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90462Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reported-and-tested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Jani Nikula authored
Add one explicit discard of __iomem address space qualifier in validate_vbt(), and respect it otherwise. This adds clarity in the code, and reduces the sparse warnings from the module to just one. Quoting Daniel, "The vbt really is plain old memory. Except that it's reserved in the e820 table as something special and hence treated as io range by the kernel. But it is memory, hence casting away the __iomem is imo the right approach." Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Tvrtko Ursulin authored
Just so it is grouped logically in line with other data and makes a rather verbose output a bit shorter. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chandra Konduru <chandra.konduru@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Chandra Konduru <chandra.konduru@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Jani Nikula authored
Improve clarity. No functional changes. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Jani Nikula authored
We never pass a non-NULL vbt to validate_vbt, and we can safely expect the callers to not change. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Ville noticed in another patch we we didn't need them at all, so remove them. It's worth saying that it makes no difference to code generated as gcc is clever enough to optimize it out. v2: Remove 'break' after 'return' in switches (Ville) Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-