- 15 May, 2014 3 commits
-
-
Shobhit Kumar authored
Added as generic parameters which will be initialized in the panel driver and are specific to panels. Backlight delays have also kept as placeholders and will be used used once we have MIPI backlight enabling support Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Shobhit Kumar authored
Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Shobhit Kumar authored
In VBT fields operation mode is 0 for Video mode and 1 for command mode. This field will be directly used as is in generic panel driver. So adjust accordingly. Signed-off-by: Shobhit Kumar <shobhit.kumar@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 14 May, 2014 6 commits
-
-
Mika Kuoppala authored
These are generated with intel-gpu-tools/tools/null_state_gen v2: Don't use header file for states (Daniel Vetter) v3: Proper URB state size for gen8/GT3 (Damien Lespiau) Tested-by: Kristen Carlson Accardi <kristen@linux.intel.com> (v1) Tested-by: Oscar Mateo <oscar.mateo@intel.com> (v2) Acked-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Mika Kuoppala authored
HW guys say that it is not a cool idea to let device go into rc6 without proper 3d pipeline state. For each new uninitialized context, generate a valid null render state to be run on context creation. This patch introduces a skeleton with empty states. v2: - No need to vmap (Chris Wilson) - use .c files for state (Daniel Vetter) - no need to flush as i915_add_request does it - remove parameter for batch alloc size - don't wait for the init (Ben Widawsky) v3: - move to cpu/gpu (Chris Wilson) Tested-by: Kristen Carlson Accardi <kristen@linux.intel.com> (v1) Tested-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Otherwise we end up tearing down fences when e.g. the client quits way too early. Might or might not fix a fence pin_count BUG Ville has reported. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Userspace can currently provoke this when e.g. trying to use a pinned scanout as a cursor or overlay target. Later on that might lead to some fun fence pin count mayhem. Spurred by Ville's report that something goes wrong here and originally I've thought that this might slip through the pwrite gtt fastpath. But that one checks of obj tiling, so should be ok. But one thing that _does_ blow up is the vma unbinding with more than one address space. The next patch will fix this. v2: Use a WARN_ON - Chris pointed out that we already catch all cases so userspace can't provoke this like I've originally feared. While reviewing relevant code I've noticed a pile of DRM_ERROR in the overlay&cursor code which are all triggerable by userspace. Tune them down while at it. v3: Split out the DRM_ERROR->DRM_DEBUG_KMS change into a separate patch, as requested by Chris. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Noticed while playing with coccinelle. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
During initial probing of the modes to assign to the fbdev console, we use the CRTC and connector ids. These are much harder for us to understand than if we used their actual names (or pipe in the CRTC case). Similarly, we want to manually print the mode size rather than rely on mode->name being set. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 13 May, 2014 16 commits
-
-
Damien Lespiau authored
Patch done using the following semantic patch (thanks Daniel for the help!) @@ iterator name list_for_each_entry; iterator name for_each_crtc; struct drm_crtc * crtc; struct drm_device * dev; @@ -list_for_each_entry(crtc,&dev->mode_config.crtc_list, head) { +for_each_crtc(dev,crtc) { ... } Followed by a couple of fixups by hand (that spatch doesn't match the cases where list_for_each_entry() is not followed by a set of '{', '}', but I couldn't figure out a way to leave the '{' out of the iterator match). Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Generated using the semantic patch: @@ iterator name list_for_each_entry; iterator name for_each_intel_crtc; struct intel_crtc * crtc; struct drm_device * dev; @@ -list_for_each_entry(crtc,&dev->mode_config.crtc_list,...) { +for_each_intel_crtc(dev,crtc) { ... } Followed by a couple of fixups by hand (that spatch doesn't match the cases where list_for_each_entry() is not followed by a set of '{', '}', but I couldn't figure out a way to leave the '{' out of the iterator match). Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Fed up with having that long list_for_each_entry() invocation? Use for_each_intel_crtc()! Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Squash in patch that exported ilk_wm_max_level.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
That's not necessary and makes the code not as neat as it could be. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Somehow UXA submits a completely bogus DR4 value since essentially forever. It was originally introduced in commit bade7d7d2505a10a8a7d24b084aff9742e2d6d64 Author: Eric Anholt <eric@anholt.net> Date: Fri Jun 6 14:03:25 2008 -0700 Use the DRM for submitting batchbuffers when available. and dutifully copied around ever since. Since we want to keep the general dirt catching around just special case the UXA value. This regression was introduced in commit 9cb34664 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Apr 24 08:09:11 2014 +0200 drm/i915: Catch dirt in unused execbuffer fields Comment from Chris' review: "To be fair, it is a sensible value if one supposes a Region style API to cliprects. Under that API, DR[14] define the extents of the clip region, and ((0,0), (0,0)) [DR1==DR4==0] would mean all clipped, do not draw anything." v2: Pimp commit message a bit and remove the double space. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78494 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jörg Otte <jrg.otte@gmail.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
The fence pin count should always be <= the bo pin count. If that's not the case then we have a funny problem and are leaking references somewhere. Which means we can catch fence pin leaks by checking for the same upper limit as we do for the bo pin count. Inspired by a discussion with Ville about a fence leak igt testcase. v2: Also check for fence->pin_count <= ggtt_vma->pin_count, since that might catch a leak even quicker. Also de-inline them, they're getting too big. v3: Don't separately check for MAX_PIN_COUNT since the > vma->pin_count check will catch that already (Chris). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
The pipe might not start to actually run until the port has been enabled (depends on the platform and port type). So don't try to wait for vblank after we enabled the pipe but haven't yet enabled the port. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> References: https://bugs.freedesktop.org/show_bug.cgi?id=77297Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
We already moved the plane disable/enable to happen as the first/last thing on every other platforms. Follow suit with gmch platforms. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Frob drm_vblank_on conflict, as usual.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Oscar Mateo authored
This is missing in: commit 78325f2d Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Tue Apr 29 14:52:29 2014 -0700 drm/i915: Virtualize the ringbuffer signal func Looks to me like a rebase side-effect... Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
gen8_stolen_size() is missing __init, so add it. Also all the intel_stolen_funcs structures can be marked __initconst. intel_stolen_ids[] can also be made const if we replace the __initdata with __initconst. Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
CHV uses the same bits as SNB/VLV to code the Graphics Mode Select field (GFX stolen memory size) with the addition of finer granularity modes: 4MB increments from 0x11 (8MB) to 0x1d. Values strictly above 0x1d are either reserved or not supported. v2: 4MB increments, not 8MB. 32MB has been omitted from the list of new values (Ville Syrjälä) v3: Also correctly interpret GGMS (GTT Graphics Memory Size) (Ville Syrjälä) v4: Don't assign a value that needs 20bits or more to a u16 (Rafael Barbalho) [vsyrjala: v5: Split from i915 changes and add chv_stolen_funcs] Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Tested-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Damien Lespiau authored
CHV uses the same bits as SNB/VLV to code the Graphics Mode Select field (GFX stolen memory size) with the addition of finer granularity modes: 4MB increments from 0x11 (8MB) to 0x1d. Values strictly above 0x1d are either reserved or not supported. v2: 4MB increments, not 8MB. 32MB has been omitted from the list of new values (Ville Syrjälä) v3: Also correctly interpret GGMS (GTT Graphics Memory Size) (Ville Syrjälä) v4: Don't assign a value that needs 20bits or more to a u16 (Rafael Barbalho) [vsyrjala: v5: Split the early quirks to another patch] Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Tested-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
No CRT output on CHV, so don't call intel_crt_init(). v2: Don't disable CRT on HAS. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1) Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Add chv_crtc_clock_get() to read out the DPLL settings. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> [danvet: Fix compile due to bikeshedded headers in an earlier patch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
- 12 May, 2014 15 commits
-
-
Chon Ming Lee authored
With additional of pipe C, current 1 bit registers for pipe select for HDMI and DP are no longer able to gather for 3 pipes. As a result, new bits location in the same registers are added. For HDMI, VLV uses bit 30, CHV uses bit 24-25. For DP, VLV uses bit 30, CHV uses bit 16-17. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chon Ming Lee authored
Added programming phy layer for CHV based on "Application note for 1273 CHV Display phy". v2: Rebase the code and do some cleanup. v3: Rework based on Ville review. -Fix the macro where the ch info need to swap, and add parens to ? operator. -Fix wrong bit define for DPIO_PCS_SWING_CALC_0 and DPIO_PCS_SWING_CALC_1 and rename for meaningful. -Add some comments for CHV specific DPIO registers. -Change the dp margin registery value to decimal to align with the doc. -Fix the not clearing some value in vlv_dpio_read before write again. -Create new hdmi/dp encoder function for chv instead of share with valleyview. v4: Rebase the code after rename the DPIO registers define and upstream change. Based on Ville review. -For unique transition scale selection, after Ville point out, look like the doc might wrong for the bit 26. Use bit 27 for ch0 and ch1. -Break up some dpio write value into two/three steps for readability. -Remove unrelated change. -Add some shift define for some registers instead just give the hex value. -Fix a bug where write to wrong VLV_TX_DW3. v5: Based on Ville review. - Move tx lane latency optimal setting from chv_dp_pre_pll_enable to chv_pre_enable_dp, and chv_hdmi_pre_pll_enable to chv_hdmi_pre_enable respectively. - Fix typo in one margin_reg_value for DP_TRAIN_VOLTAGE_SWING_400. - Clear DPIO_TX_UNIQ_TRANS_SCALE_EN for DP and HDMI. - Mask the old deemph and swing bits for hdmi. v6: Remove stub for pre_pll_enable for dp and hdmi. Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> [vsyrjala: Don't touch panel power sequencing on DP] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chon Ming Lee authored
Added programming PLL for CHV based on "Application note for 1273 CHV Display phy". v2: -Break the common lane reset into another patch. -Break the clock calculation into another patch. -The changes are based on Ville review. -Rework based on DPIO register define naming convention change. -Break the dpio write into few lines to improve readability. -Correct the udelay during chv_enable_pll. -clean up some magic numbers with some new define. -program the afc recal bit which was missed. v3: Based on Ville review - minor correction of the bit defination - add deassert/propagate data lane reset v4: Corrected the udelay between dclkp enable and pll enable. Minor comment and better way to clear the TX lane reset. v5: Squash in fixup from Rafael Barbalho. [vsyrjala: v6: Polish the defines (Imre)] Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chon Ming Lee authored
Based on the chv clock limit, find the best divisor. The divisor data has been verified with this spreadsheet. P1273_DPLL_Programming Spreadsheet. v2: Rebase the code and change the chv_find_best_dpll based on new standard way to use intel_PLL_is_valid. Besides, clean up some extra variables. v3: Ville suggest better fixed point for m2 calculation. v4: -Add comment for the limit is compute using fast clock. (Ville) -Don't pass the request clock to chv_clock, as the same function will be use clock readout, which doens't have request clock. (Ville) -Add and use DIV_ROUND_CLOSEST_ULL to consistent with other clock calculation. (Ville) -Fix the dp m2 after m2 has stored fixed point. (Ville) Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com> [vsyrjala: Avoid div-by-zero in chv_clock()] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chon Ming Lee authored
During cold boot, the display controller needs to deassert the common lane reset. Only do it once during intel_init_dpio for both PHYx2 and PHYx1. Besides, assert the common lane reset when disable pll. This still to be determined whether need to do it by driver. Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com> [vsyrjala: Don't disable DPIO PLL when using DSI] [vsyrjala: Don't call vlv_disable_pll() by accident on CHV] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> [danvet: Move part of a moved comment back as suggested by Imre since it's valid for both byt and chv.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chon Ming Lee authored
Cherryview has 3 pipes. Some of the pll dpio offset calculation is based on pipe number. Need to use vlv_pipe_to_channel to calculate the correct phy channel to use for the pipe. Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chon Ming Lee authored
The additional DPLL registers added to support Port D. Besides, add some new PHY control and status registers based on B-spec. v2: Based on Ville review - Corrected DPIO_PHY_STATUS offset and name. - Rebase based on upstream change after introduce enum dpio_phy and enum dpio_channel. v3: Rebased on top of Antti's 3-pipe prep patch. Note that the new offsets for the DPLL registers aren't in place yet, so this introduces a slight regression. But since 3 pipe support isn't fully enabled yet anyaway in -internal this shouldn't matter too much. Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chon Ming Lee authored
CHV has 2 display phys. First phy (IOSF offset 0x1A) has two channels, and second phy (IOSF offset 0x12) has single channel. The first phy is used for port B and port C, while second phy is only for port D. v2: Move the pipe to determine which phy to select for vlv_dpio_read/vlv_dpio_write to another patch. (Daniel) v3: Rebase the code based on rework on how to calculate DPIO offset. Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Fill in the sprite bits for DDL1/DDL2 registers, and add DDL3. Still need to write the code to use these... Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
v2: Update to also fill in the new num_pipes field. v3: Rebase on top of the pciid extraction. v4: Switch from info->has*ring to info->ring mask. Also add VEBOX support whiel at it. v5: s/CHV_PCI_IDS/CHV_IDS/, and drop the trailing '\' Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
CHV clock gating isn't identical to VLV, so add a new function for it. This is only a start, and further changes are needed as the details become available. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Make i915_gem_interrupt debugfs file functional on CHV. FIXME: Extract helpers for gt/display blocks to shrink the function a bit and avoid duplication between bdw/chv (and other similar cases for upstream). Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Inspired by a review bikeshed from Jani. Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
CHV has the Gen8 master interrupt register, as well as Gen8 GT/PCU interrupt registers. The display block is based on VLV, with the main difference of adding pipe C. v2: Rewrite the order of operations to make more sense Don't bail out if MASTER_CTL register doesn't show an interrupt, as display interrupts aren't reported there. v3: Rebase on top of Egbert Eich's hpd irq handling rework by using the relevant port hotplug logic like for vlv. v4: Rebase on top of Ben's gt irq #define refactoring. v5: Squash in gen8_gt_irq_handler refactoring from Zhao Yakui <yakui.zhao@intel.com> v6: Adapt to upstream changes, dev_priv->irq_received is gone. v7: Enable 3 the commented-out 3 pipe support. v8: Rebase on top of Paulo's irq setup rework, use the renamed macros from upstream. v9: Grab irq_lock around i915_enable_pipestat() FIXME: There's probably some potential for more shared code between bdw and chv. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v2) Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Drop the unnecessary cast Jani spotted.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Brad Volkin authored
For clients that submit large batch buffers the command parser has a substantial impact on performance. On my HSW ULT system performance drops as much as ~20% on some tests. Most of the time is spent in the command lookup code. Converting that from the current naive search to a hash table lookup reduces the performance drop to ~10%. The choice of value for I915_CMD_HASH_ORDER allows all commands currently used in the parser tables to hash to their own bucket (except for one collision on the render ring). The tradeoff is that it wastes memory. Because the opcodes for the commands in the tables are not particularly well distributed, reducing the order still leaves many buckets empty. The increased collisions don't seem to have a huge impact on the performance gain, but for now anyhow, the parser trades memory for performance. NB: Ville noticed that the error paths through the ring init code will leak memory. I've not addressed that here. We can do a follow up pass to handle all of the leaks. v2: improved comment describing selection of hash key mask (Damien) replace a BUG_ON() with an error return (Tvrtko, Ville) commit message improvements Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-