Commits · 11a6b88b8cf2ff6e93a5b136ac04fd851a2d935d · Kirill Smelkov / linux

11 Nov, 2021 11 commits

Ville Syrjälä authored Nov 04, 2021

"gen7" in display code is not really sensible. We shall call
these things "ivb".
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-9-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>

11a6b88b

drm/i915/fbc: Introduce .nuke() vfunc · 0242cd3a

Ville Syrjälä authored Nov 04, 2021

Eliminate yet another if-ladder by adding .nuke() vfunc.

We also rename all *_recompress() stuff to *_nuke() since
that's the terminology the spec uses. Also "recompress"
is a bit confusing by perhaps implying that this triggers
an immediate recompression. Depending on the hardware that
may definitely not be the case, and in general we don't
specifically know when the hardware decides to compress.
So all we do is "nuke" the current compressed framebuffer
and leave it up to the hardware to recompress later if it
so chooses.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-8-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>

0242cd3a

drm/i915/fbc: Introduce intel_fbc_funcs · 41b85a52

Ville Syrjälä authored Nov 04, 2021

Replace the "if-ladders everywhere" approach with vfuncs.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-7-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
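
A minimal sketch of the if-ladder to vfunc conversion, not the actual i915 code. Every name here (`fake_fbc`, `fake_fbc_funcs`, the platform enum, the bit values) is hypothetical and only illustrates the pattern: the platform is checked once when the table is picked, and call sites dispatch through function pointers.

```c
/* Hypothetical platform identifiers and enable bits, for illustration only. */
enum fake_platform { FAKE_G4X, FAKE_IVB };
#define FAKE_G4X_CTL_EN (1u << 0)
#define FAKE_IVB_CTL_EN (1u << 31)

struct fake_fbc;

/* Per-platform hooks, chosen once at init instead of at every call site. */
struct fake_fbc_funcs {
	void (*activate)(struct fake_fbc *fbc);
	void (*deactivate)(struct fake_fbc *fbc);
};

struct fake_fbc {
	const struct fake_fbc_funcs *funcs;
	unsigned int ctl; /* stands in for a hardware control register */
};

static void fake_g4x_activate(struct fake_fbc *fbc)
{
	fbc->ctl = FAKE_G4X_CTL_EN;
}

static void fake_ivb_activate(struct fake_fbc *fbc)
{
	fbc->ctl = FAKE_IVB_CTL_EN;
}

static void fake_common_deactivate(struct fake_fbc *fbc)
{
	fbc->ctl = 0;
}

static const struct fake_fbc_funcs fake_g4x_funcs = {
	.activate = fake_g4x_activate,
	.deactivate = fake_common_deactivate,
};

static const struct fake_fbc_funcs fake_ivb_funcs = {
	.activate = fake_ivb_activate,
	.deactivate = fake_common_deactivate,
};

/* The only remaining platform check: pick the table once. */
void fake_fbc_init(struct fake_fbc *fbc, enum fake_platform platform)
{
	fbc->funcs = platform == FAKE_IVB ? &fake_ivb_funcs : &fake_g4x_funcs;
	fbc->ctl = 0;
}

/* Call sites no longer need an if-ladder. */
void fake_fbc_activate(struct fake_fbc *fbc)
{
	fbc->funcs->activate(fbc);
}

void fake_fbc_deactivate(struct fake_fbc *fbc)
{
	fbc->funcs->deactivate(fbc);
}
```

Adding a new hook then means adding one member to the table rather than touching every caller.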

41b85a52

drm/i915/fbc: Extract helpers to compute FBC control register values · 6874f958

Ville Syrjälä authored Nov 04, 2021

Declutter the *_fbc_activate() functions by pulling all the
control register value computations into helpers.

I left the enable bit in *_fbc_activate() in the hopes of maybe
using the helpers in the *_fbc_deactivate() paths as well instead
of the current rmw approach. That won't be possible at least
quite yet since we clobber the fbc->params before deactivating
FBC so we could end up changing some of the values live, which
given FBC's lack of/poor double buffering would likely not go
so well.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-6-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
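
A minimal sketch of the split described above, assuming a made-up register layout: a helper derives the control value from the parameters, and the activate path adds the enable bit on top. The bit positions and the `fake_*` names are hypothetical and do not correspond to real FBC registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define FAKE_CTL_EN         (1u << 31)
#define FAKE_CTL_FENCE_EN   (1u << 29)
#define FAKE_CTL_FENCE_MASK 0xfu

struct fake_params {
	bool fence_enabled;
	unsigned int fence_id;
};

/* Helper: derives everything except the enable bit from the params. */
uint32_t fake_fbc_ctl(const struct fake_params *params)
{
	uint32_t ctl = 0;

	if (params->fence_enabled)
		ctl |= FAKE_CTL_FENCE_EN | (params->fence_id & FAKE_CTL_FENCE_MASK);

	return ctl;
}

/* The activate path adds the enable bit to the computed value. */
uint32_t fake_fbc_activate_value(const struct fake_params *params)
{
	return fake_fbc_ctl(params) | FAKE_CTL_EN;
}
```

Keeping the enable bit out of the helper is what would let a deactivate path reuse the same value with the bit cleared, subject to the fbc->params caveat in the message.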

6874f958

drm/i915/fbc: Introduce intel_fbc_is_compressing() · 74e0457a

Ville Syrjälä authored Nov 04, 2021

Move the direct FBC status register reads from the debugfs code
behind an abstract api.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-5-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>

74e0457a

drm/i915/fbc: Just use params->fence_y_offset always · ef9600ff

Ville Syrjälä authored Nov 04, 2021

No need to tiptoe around programming DPFC_FENCE_YOFF with
params->fence_y_offset vs. 0. If the fence is not enabled
it doesn't even matter what we program here.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-4-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>

ef9600ff

drm/i915/fbc: Extract {skl,glk}_fbc_program_cfb_stride() · 2013ab18

Ville Syrjälä authored Nov 04, 2021

Declutter gen7_fbc_activate() by sucking the override
stride programming stuff into helpers.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-3-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>

2013ab18

drm/i915/fbc: Extract snb_fbc_program_fence() · b50364af

Ville Syrjälä authored Nov 04, 2021

We have two identical copies of the snb+ system agent
CPU fence programming code. Extract into a helper.

Also there's no real point in insisting that we
program 0 into DPFC_CPU_FENCE_OFFSET when the fence is
disabled. So just always stick the computed Y offset there
whether or not the fence is actually used.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-2-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>

b50364af

drm/i915/dsi: transmit brightness command in HS state · d1260be7

William Tseng authored Nov 10, 2021

In Video Mode, if DSI transcoder is set to transmit packets
in LP Escape mode, screen flickering would be observed when
brightness commands are continuously and quickly transmitted
to a panel.

The problem may be resolved by changing the mode to transmit
packets from Low Power to HS.

Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Cc: Lee Shawn C <shawn.c.lee@intel.com>
Cc: Cooper Chiou <cooper.chiou@intel.com>
Signed-off-by: William Tseng <william.tseng@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110010217.26759-1-william.tseng@intel.com

d1260be7

drm/i915/dp: For PCON TMDS mode set only the relavant bits in config DPCD · f35294e1

Ankit Nautiyal authored Nov 10, 2021

Currently we reset the whole PCON linkConfig DPCD to set the TMDS mode.
This also resets the Source control bit and HDMI link enable bit and
goes to autonomous mode of operation, which is seen to spoil the PCON's
internal state.

This patch avoids resetting the PCON link config register and sets only
the source control bit, with FRL Enable bit set to 0 (TMDS mode) in the
configuration DPCD. It then enables the HDMI Link Enable bit.

v2: Removed the redundant resetting of the bits as the buffer is already
initialized to 0. (Uma)
Updated comments and commit message.

v3: Rebase
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110072947.171659-3-ankit.k.nautiyal@intel.com
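
A minimal sketch of the two-step configuration described above, assuming a made-up bit layout for the link config byte. The `FAKE_*` bit positions and function names are hypothetical; the real DPCD definitions live in the DRM DP helper headers and are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical bit positions for a link config byte, for illustration only. */
#define FAKE_SOURCE_CTL_MODE  (1u << 3)
#define FAKE_FRL_MODE         (1u << 5)
#define FAKE_HDMI_LINK_ENABLE (1u << 7)

/*
 * First write: source control mode set, FRL mode bit left at 0 (TMDS).
 * The buffer starts at 0, so no bits need to be cleared explicitly.
 */
uint8_t fake_pcon_tmds_config(void)
{
	uint8_t buf = 0;

	buf |= FAKE_SOURCE_CTL_MODE;
	return buf;
}

/* Second write: the same config with the HDMI link enable bit added. */
uint8_t fake_pcon_tmds_enable(uint8_t config)
{
	return (uint8_t)(config | FAKE_HDMI_LINK_ENABLE);
}
```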

f35294e1

drm/i915/dp: Optimize the FRL configuration for HDMI2.1 PCON · 078e2bb2

Ankit Nautiyal authored Nov 10, 2021

Currently the HDMI2.1 PCON's frl link config DPCD registers are
reset and configured even if they are already configured.
Also the HDMI Link Mode does not settle to FRL MODE immediately after
HDMI Link Status is active.

This patch:
- Checks if the PCON is already configured for FRL.
- Includes HDMI Link Mode in the wait for loop along with HDMI Link Status DPCD.

v2: Rebase
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110072947.171659-2-ankit.k.nautiyal@intel.com
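
A minimal sketch of the two changes, with the DPCD reads replaced by a hypothetical status callback. The struct, the callback type, and `fake_slow_mode_status()` are all invented for illustration: the point is that readiness requires both the link status and the link mode, and that an already configured PCON is left alone.

```c
#include <stdbool.h>

/* Hypothetical snapshot of PCON status, standing in for DPCD reads. */
struct fake_pcon_status {
	bool link_active; /* HDMI Link Status is active */
	bool frl_mode;    /* HDMI Link Mode reports FRL */
};

/* Hypothetical status source: returns the state seen on a given poll. */
typedef struct fake_pcon_status (*fake_read_status_fn)(int poll);

/* Ready only when the link is active AND the mode has settled to FRL. */
bool fake_frl_ready(struct fake_pcon_status s)
{
	return s.link_active && s.frl_mode;
}

/* Skip the reset and reconfiguration when FRL is already set up. */
bool fake_frl_needs_config(struct fake_pcon_status current)
{
	return !fake_frl_ready(current);
}

/* Returns the poll index at which FRL became ready, or -1 on timeout. */
int fake_wait_for_frl(fake_read_status_fn read_status, int max_polls)
{
	for (int poll = 0; poll < max_polls; poll++) {
		if (fake_frl_ready(read_status(poll)))
			return poll;
	}
	return -1;
}

/* Example source: link goes active on poll 1, mode settles on poll 3. */
struct fake_pcon_status fake_slow_mode_status(int poll)
{
	struct fake_pcon_status s = {
		.link_active = poll >= 1,
		.frl_mode = poll >= 3,
	};
	return s;
}
```

With the example source, a loop that only checked `link_active` would stop at poll 1, before the mode has settled.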

078e2bb2

10 Nov, 2021 1 commit

Revert "drm/i915/tgl/dsi: Gate the ddi clocks after pll mapping" · 4579509e

Vandita Kulkarni authored Nov 09, 2021

This reverts commit 991d9557 ("drm/i915/tgl/dsi: Gate the ddi clocks
after pll mapping"). The Bspec was updated recently with the pll ungate
sequence similar to that of icl dsi enable sequence. Hence reverting.

Bspec: 49187
Fixes: 991d9557 ("drm/i915/tgl/dsi: Gate the ddi clocks after pll mapping")
Cc: <stable@vger.kernel.org> # v5.4+
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211109120428.15211-1-vandita.kulkarni@intel.com

4579509e

09 Nov, 2021 6 commits

drm/i915: pin: delete duplicate check in intel_pin_and_fence_fb_obj() · 6cff894e

Dan Carpenter authored Nov 09, 2021

The "ret" variable is checked on the previous line so we know it's
zero.  No need to check again.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211109114850.GB16587@kili

6cff894e

drm/i915: Call intel_update_active_dpll() for both bigjoiner pipes · c68dac96

Ville Syrjälä authored Nov 05, 2021

Currently we're only calling intel_update_active_dpll() for the
bigjoiner master pipe but not for the slave. With TC ports this
leads to the two pipes ending up trying to use different PLLs
(TC vs. TBT). What's worse, we're enabling the PLL that didn't get
intel_update_active_dpll() called on it at the spot where we
need the clocks turned on. So we turn on the wrong PLL and the
DDI is now trying to source its clock from the other PLL which is
still disabled. Naturally that doesn't end so well and the DDI
fails to start up.

The state checker also gets a bit unhappy (which is a good thing)
when it notices that one of the pipes was using the wrong PLL.

Let's fix this by remembering to call intel_update_active_dpll()
for both pipes. That should get the correct PLL turned on when
we need it, and the state checker should also be happy.

Cc: Imre Deak <imre.deak@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4434
Fixes: e12d6218 ("drm/i915: Reduce bigjoiner special casing")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211105212156.5697-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
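
A minimal sketch of the fix, assuming a simplified model in which a master pipe holds a pointer to its linked slave. The types and function names are hypothetical and do not mirror the real `intel_update_active_dpll()` signature; they only show the active PLL being updated on both pipes.

```c
#include <stddef.h>

/* Hypothetical PLL choices and pipe model, for illustration only. */
enum fake_pll { FAKE_PLL_TC, FAKE_PLL_TBT };

struct fake_crtc {
	enum fake_pll active_pll;
	struct fake_crtc *bigjoiner_linked; /* slave pipe of a master, or NULL */
};

void fake_update_active_dpll(struct fake_crtc *crtc, enum fake_pll pll)
{
	crtc->active_pll = pll;
}

/* Update the master and, if present, its linked slave pipe as well. */
void fake_update_active_dpll_both(struct fake_crtc *master, enum fake_pll pll)
{
	fake_update_active_dpll(master, pll);
	if (master->bigjoiner_linked)
		fake_update_active_dpll(master->bigjoiner_linked, pll);
}
```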

c68dac96

drm/i915: Use unlocked register accesses for LUT loads · 115e0f68

Ville Syrjälä authored Oct 21, 2021

We have to bash in a lot of registers to load the higher
precision LUT modes. The locking overhead is significant, especially
as we have to get this done as quickly as possible during vblank.
So let's switch to unlocked accesses for these. Fortunately the LUT
registers are mostly spread around such that two pipes do not have
any registers on the same cacheline. So as long as commits on the
same pipe are serialized (which they are) we should get away with
this without angering the hardware.

The only exceptions are the PREC_PIPEGCMAX registers on ilk/snb which
we don't use atm as they are only used in the 12bit gamma mode. If/when
we add support for that we may need to remember to still serialize
those registers, though I'm not sure ilk/snb are actually affected
by the same cacheline issue. I think ivb/hsw at least were, but they
use a different set of registers for the precision LUT.

I have a test case which is updating the LUTs on two pipes from a
single atomic commit. Running that in a loop for a minute I get the
following worst case with the locks in place:
intel_crtc_vblank_work_start: pipe B, frame=10037, scanline=1081
intel_crtc_vblank_work_start: pipe A, frame=12274, scanline=769
intel_crtc_vblank_work_end: pipe A, frame=12274, scanline=58
intel_crtc_vblank_work_end: pipe B, frame=10037, scanline=74

And here's the worst case with the locks removed:
intel_crtc_vblank_work_start: pipe B, frame=5869, scanline=1081
intel_crtc_vblank_work_start: pipe A, frame=7616, scanline=769
intel_crtc_vblank_work_end: pipe B, frame=5869, scanline=1096
intel_crtc_vblank_work_end: pipe A, frame=7616, scanline=777

The test was done on a snb using the 10bit 1024 entry LUT mode.
The vtotals for the two displays are 793 and 1125. So we can
see that with the locks ripped out the LUT updates are pretty
nicely confined within the vblank, whereas with the locks in
place we're routinely blasting past the vblank end which causes
visual artifacts near the top of the screen.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211020223339.669-5-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
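
The trace numbers above can be turned into update durations with a small helper. This is only a sketch: it assumes pipe A is the display with vtotal 793 and pipe B the one with vtotal 1125 (inferred from the scanline values, not stated in the message), and that the scanline counter wraps at most once between the start and end traces. `fake_scanlines_elapsed()` is a hypothetical name.

```c
/* Scanlines elapsed from start to end, allowing one wrap past vtotal. */
int fake_scanlines_elapsed(int start, int end, int vtotal)
{
	int delta = end - start;

	if (delta < 0)
		delta += vtotal;
	return delta;
}
```

Under those assumptions the worst case with locks is 82 scanlines on pipe A (769 to 58) and 118 on pipe B (1081 to 74), versus 8 and 15 scanlines with the locks removed.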

115e0f68

drm/i915: Use vblank workers for gamma updates · 2bbc6fca

Ville Syrjälä authored Oct 21, 2021

The pipe gamma registers are single buffered so they should only
be updated during the vblank to avoid screen tearing. In fact they
really should only be updated between start of vblank and frame
start because that is the only time the pipe is guaranteed to be
empty. Already at frame start the pipe begins to fill up with
data for the next frame.

Unfortunately frame start happens ~1 scanline after the start
of vblank which in practice doesn't always leave us enough time to
finish the gamma update in time (gamma LUTs can be several KiB of
data we have to bash into the registers). However we must try our
best and so we'll add a vblank work for each pipe from where we
can do the gamma update. Additionally we could consider pushing
frame start forward to the max of ~4 scanlines after start of
vblank. But not sure that's exactly a validated configuration.
As it stands the ~100 first pixels tend to make it through with
the old gamma values.

Even though the vblank worker is running on a high priority thread
we still have to contend with C-states. If the CPU happens to be in
a deep C-state when the vblank interrupt arrives even the irq
handler gets delayed massively (I've observed dozens of scanlines
worth of latency). To avoid that problem we'll use the qos mechanism
to keep the CPU awake while the vblank work is scheduled.

With all this hooked up we can finally enjoy near atomic gamma
updates. It even works across several pipes from the same atomic
commit which previously was a total fail because we did the
gamma updates for each pipe serially after waiting for all
pipes to have latched the double buffered registers.

In the future the DSB should take over this responsibility
which will hopefully avoid some of these issues.

Kudos to Lyude for finishing the actual vblank workers.
Works like the proverbial train toilet.

v2: Add missing intel_atomic_state fwd declaration
v3: Clean up properly when not scheduling the worker
v4: Clean up the rest and add tracepoints
v5: s/intel_wait_for_vblank_works/intel_wait_for_vblank_workers/ (Jani,Uma)

CC: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211020223339.669-4-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>

2bbc6fca

drm/i915: Do vrr push before sampling the frame counter · 6f9976bd

Ville Syrjälä authored Oct 21, 2021

Do the vrr push before we sample the frame counter to
know when the commit has been latched. Doing these in the
wrong order could lead us to complete the flip before it
has actually happened.

Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211020223339.669-3-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>

6f9976bd

drm/i915/dsi: disable lpdt if it is not enabled · 38a1b50c

William Tseng authored Nov 09, 2021

Avoid setting LP_DATA_TRANSFER when enable_lpdt is false

Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Cc: Lee Shawn C <shawn.c.lee@intel.com>
Cc: Cooper Chiou <cooper.chiou@intel.com>
Signed-off-by: William Tseng <william.tseng@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211109034125.11291-1-william.tseng@intel.com

38a1b50c

08 Nov, 2021 1 commit

drm/i915: Fix Memory BW formulae for ADL-P · cf9420cb

Radhakrishna Sripada authored Nov 05, 2021

The earlier update to BW formulae broke ADL-P. Include
display 13 to use TGL path for BW parameters.

Fixes: c64a9a7c ("drm/i915: Update memory bandwidth formulae")
Cc: Matt Roper <matthew.d.roper@intel.com>
Reported-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Caz Yokoyama <caz.yokoyama@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211106003714.17894-1-radhakrishna.sripada@intel.com

cf9420cb

05 Nov, 2021 6 commits

drm/i915/display/adlp: Disable underrun recovery · 4fe7907f

José Roberto de Souza authored Nov 03, 2021

It was also defeatured for ADL-P and other platforms.

BSpec: 55424
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104010858.43559-1-jose.souza@intel.com

4fe7907f

drm/i915/audio: rename intel_init_audio_hooks to intel_audio_hooks_init · f47a0e35

Jani Nikula authored Nov 04, 2021

Follow the filename based prefix naming.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104161858.21786-6-jani.nikula@intel.com

f47a0e35

drm/i915/audio: move intel_audio_funcs internal to intel_audio.c · 5d453746

Jani Nikula authored Nov 04, 2021

It's all internal to intel_audio.c.

Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104161858.21786-4-jani.nikula@intel.com

5d453746

drm/i915/audio: define the audio struct separately from drm_i915_private · 37388c01

Jani Nikula authored Nov 04, 2021

Add a standalone definition of struct intel_audio_private, and note that
all of it is private to intel_audio.c.

v2: Rebase

Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104161858.21786-3-jani.nikula@intel.com

37388c01

drm/i915/audio: name the audio sub-struct in drm_i915_private · ca3cfb9d

Jani Nikula authored Nov 04, 2021

Add name to the audio sub-struct in drm_i915_private, and remove the
tautologies and other inconsistencies in the member names.

v2: Call the mutex member mutex, not lock. (Ville)

Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104161858.21786-2-jani.nikula@intel.com

ca3cfb9d

drm/i915/audio: group audio under anonymous struct in drm_i915_private · fe9b286b

Jani Nikula authored Nov 04, 2021

With an anonymous struct, this can be pure hierarchical organization
without code changes. We'll follow up with adding a name to the
sub-struct separately.

Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104161858.21786-1-jani.nikula@intel.com

fe9b286b

04 Nov, 2021 10 commits

drm/i915: Update memory bandwidth formulae · c64a9a7c

Radhakrishna Sripada authored Oct 15, 2021

The formulae have been updated to include more variables. Make
sure the code carries the same.

Bspec: 64631, 54023

v2: Make GEN11 follow the default route and fix calculation of
    maxdebw(RK)
v3: Fix div by zero on default case
    Correct indent for fallthrough(Jani)
v4: Fix div by zero on gen11.
v5: Fix 0 max_numchannels case
v6:
    - Split gen11/gen12 algorithms
    - Fix RKL deburst value
    - Fix difference between ICL and TGL algorithms
    - Protect deinterleave from being 0
    - Warn when numchannels exceeds max_numchannels
    - Fix scaling of clk_max from different units
    - s/deinterleave/channelwidth/ in calculating peakbw
    - Fix off by one for num_planes TGL+
    - Fix SAGV check
v7: Fix div by zero error on gen11
v8: Even though the algorithm for gen11 says that we need to return
    derated bw for a qgv point whose planes are less than no of active
    planes, we return 0 for deratedbw when only one plane is allowed.
    We modify the algorithm to accommodate the case where no of active
    planes are same as the min no of planes supported by a qgv point.
v9: Fix dclk scaling for dg1

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211015210041.16858-1-radhakrishna.sripada@intel.com

c64a9a7c

drm/i915: Split vlv/chv sprite plane update into noarm+arm pair · a14fef80

Ville Syrjälä authored Oct 18, 2021

Chop vlv_sprite_update() into two halves. First half becomes
the _noarm() variant, second part the _arm() variant.

Fortunately I have already previously grouped the register
writes into roughly the correct order, so the split looks
surprisingly clean.

Looks like most of the hardware logic was copied from the
pre-ctg sprite C, so SPSTRIDE/POS/SIZE are armed by SPSURF,
while the rest are self arming. SPCONSTALPHA is the one
entirely new register that didn't exist in the old sprite C,
and looks like that one is self arming. The CHV pipe B CSC
is also self arming, like the rest of the CHV pipe B
additions.

I didn't have time to capture i915_update_info numbers for
these, but since all the other platforms generally showed
improvements, and crucially no regression, I am fairly
confident this should behave similarly.

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018115030.3547-10-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>

a14fef80

drm/i915: Split ivb+ sprite plane update into noarm+arm pair · 50105a3a

Ville Syrjälä authored Oct 18, 2021

Chop ivb_sprite_update() into two halves. First half becomes
the _noarm() variant, second part the _arm() variant.

Fortunately I have already previously grouped the register
writes into roughly the correct order, so the split looks
surprisingly clean.

Didn't bother with i915_update_info numbers for this one.
I expect the results to be pretty much identical to the snb
numbers from the corresponding g4x+ sprite modification.

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018115030.3547-9-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>

50105a3a

drm/i915: Split g4x+ sprite plane update into noarm+arm pair · 120542e2

Ville Syrjälä authored Oct 18, 2021

Chop g4x_sprite_update() into two halves. First half becomes
the _noarm() variant, second part the _arm() variant.

Fortunately I have already previously grouped the register
writes into roughly the correct order, so the split looks
surprisingly clean.

Not much of a change in i915_update_info on these older
platforms that don't have so many planes or registers to
begin with. Here are the numbers from snb (totally unpatched
vs. both primary plane and sprite patches applied) running
kms_atomic_transition --r plane-all-transition --extended:
w/o patch                           w/ patch
Updates: 5404                       Updates: 5405
       |                                   |
   1us |******                         1us |******
       |*********                          |*********
   4us |***********                    4us |***********
       |**********                         |**********
  16us |**                            16us |**
       |                                   |
  66us |                              66us |
       |                                   |
 262us |                             262us |
       |                                   |
   1ms |                               1ms |
       |                                   |
   4ms |                               4ms |
       |                                   |
  17ms |                              17ms |
       |                                   |
Min update: 1400ns                  Min update: 1307ns
Max update: 19809ns                 Max update: 20194ns
Average update: 6957ns              Average update: 6432ns
Overruns > 100us: 0                 Overruns > 100us: 0

But there seems to be a slight improvement with
lockdep enabled:
w/o patch                           w/ patch
Updates: 17612                      Updates: 16364
       |                                   |
   1us |                               1us |
       |******                             |******
   4us |**********                     4us |**********
       |************                       |*************
  16us |*************                 16us |************
       |***                                |*
  66us |                              66us |
       |                                   |
 262us |                             262us |
       |                                   |
   1ms |                               1ms |
       |                                   |
   4ms |                               4ms |
       |                                   |
  17ms |                              17ms |
       |                                   |
Min update: 3141ns                  Min update: 3562ns
Max update: 126450ns                Max update: 73354ns
Average update: 16373ns             Average update: 15153ns
Overruns > 250us: 0                 Overruns > 250us: 0

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018115030.3547-8-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>

120542e2

drm/i915: Split pre-skl primary plane update into noarm+arm pair · 4d0d77de

Ville Syrjälä authored Oct 21, 2021

Chop i9xx_plane_update() into two halves. First half becomes
the _noarm() variant, second part the _arm() variant.

Fortunately I have already previously grouped the register
writes into roughly the correct order, so the split looks
surprisingly clean.

One slightly surprising fact was that the CHV pipe B PRIMPOS/SIZE
registers are self arming unlike their pre-ctg DSPPOS/SIZE
counterparts. In fact all the new CHV pipe B registers are
self arming.

Also we must remind ourselves that i830/i845 are a bit borked
in that all of their plane registers are self-arming.

I didn't do any i915_update_info measurements for this one
alone. I'll get total numbers with the corresponding sprite
plane changes.

v2: Don't break my precious i830/i845

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211020212757.13517-1-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>

4d0d77de

drm/i915: Split skl+ plane update into noarm+arm pair · 890b6ec4

Ville Syrjälä authored Oct 18, 2021

Chop skl_program_plane() into two halves. First half becomes
the _noarm() variant, second part the _arm() variant.

Fortunately I have already previously grouped the register
writes into roughly the correct order, so the split looks
surprisingly clean.

A few notable oddities I did not realize were self arming
are AUX_DIST and COLOR_CTL.

i915_update_info doesn't look too terrible on my cfl running
kms_atomic_transition --r plane-all-transition --extended:
w/o patch                           w/ patch
Updates: 2178                       Updates: 2018
       |                                   |
   1us |                               1us |
       |                                   |
   4us |                               4us |*****
       |*********                          |**********
  16us |**********                    16us |*******
       |***                                |
  66us |                              66us |
       |                                   |
 262us |                             262us |
       |                                   |
   1ms |                               1ms |
       |                                   |
   4ms |                               4ms |
       |                                   |
  17ms |                              17ms |
       |                                   |
Min update: 8332ns                  Min update: 6164ns
Max update: 48758ns                 Max update: 31808ns
Average update: 19959ns             Average update: 13159ns
Overruns > 100us: 0                 Overruns > 100us: 0

And with lockdep enabled:
w/o patch                           w/ patch
Updates: 2177                       Updates: 2172
       |                                   |
   1us |                               1us |
       |                                   |
   4us |                               4us |
       |*******                            |*********
  16us |**********                    16us |**********
       |*******                            |*
  66us |                              66us |
       |                                   |
 262us |                             262us |
       |                                   |
   1ms |                               1ms |
       |                                   |
   4ms |                               4ms |
       |                                   |
  17ms |                              17ms |
       |                                   |
Min update: 12645ns                 Min update: 9980ns
Max update: 50153ns                 Max update: 33533ns
Average update: 25337ns             Average update: 18245ns
Overruns > 250us: 0                 Overruns > 250us: 0

TODO: On icl+ everything seems to be armed by PLANE_SURF, so we
      can optimize this even further on modern platforms. But I
      think there's a bit of refactoring to be done first to
      figure out the best way to go about it (eg. just reusing
      the current skl+ functions, or doing a lower level split).

TODO: Split scaler programming as well, but IIRC the scaler
      has some oddball double buffering behaviour on some
      platforms, so needs proper reverse engineering

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018115030.3547-6-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>

890b6ec4

drm/i915: Split update_plane() into update_noarm() + update_arm() · 8ac80733

Ville Syrjälä authored Oct 18, 2021

The amount of plane registers we have to write has been steadily
increasing, putting more pressure on the vblank evasion mechanism
and forcing us to increase its time budget. Let's try to take some
of the pressure off by splitting plane updates into two parts:
1) write all non-self arming plane registers, ie. the registers
where the write actually does nothing until a separate arming
register is also written which will cause the hardware to latch
the new register values at the next start of vblank
2) write all self arming plane registers, ie. registers which always
just latch at the next start of vblank, and registers which also
arm other registers to do so

Here we just provide the mechanism, but don't actually implement
the split on any platform yet, so everything stays now in the _arm()
hooks. Subsequently we can move a whole bunch of stuff into the
_noarm() part, especially in more modern platforms where the number
of registers we have to write is also the greatest. On older
platforms this is less beneficial probably, but no real reason
to deviate from a common behaviour.

And let's sprinkle some TODOs around the areas that will need
adapting.

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018115030.3547-5-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
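
A minimal sketch of the noarm/arm behaviour described above, using a made-up plane with one non-self-arming register (stride) and one arming register (surf). The struct and all `fake_*` names are hypothetical; the model only shows that a noarm write stays invisible to the hardware until the arming write has happened and the next vblank starts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical plane with one armed register pair, for illustration only. */
struct fake_plane_hw {
	uint32_t stride_pending; /* written value, not yet used by the hw */
	uint32_t surf_pending;
	bool armed;              /* set by the arming register write */
	uint32_t stride_live;    /* values the hardware is scanning out with */
	uint32_t surf_live;
};

/* Part 1: non-self-arming write; has no visible effect until armed. */
void fake_update_noarm(struct fake_plane_hw *hw, uint32_t stride)
{
	hw->stride_pending = stride;
}

/* Part 2: the arming write; also arms what was written in part 1. */
void fake_update_arm(struct fake_plane_hw *hw, uint32_t surf)
{
	hw->surf_pending = surf;
	hw->armed = true;
}

/* Start of vblank: the hardware latches pending values only if armed. */
void fake_vblank_start(struct fake_plane_hw *hw)
{
	if (!hw->armed)
		return;

	hw->stride_live = hw->stride_pending;
	hw->surf_live = hw->surf_pending;
	hw->armed = false;
}
```

In this model a vblank that lands between the two calls changes nothing, which is why only the arm part is sensitive to where the vblank falls.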

8ac80733

drm/i915: Fix up the sprite namespacing · e56b80d9

Ville Syrjälä authored Oct 18, 2021

Give all sprite exclusive functions/etc. a proper namespace.

Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018115030.3547-4-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>

e56b80d9

drm/i915: Fix async flip with decryption and/or DPT · 50faf7a1

Ville Syrjälä authored Oct 18, 2021

We're currently forgetting to set the PLANE_SURF_DECRYPT
flag in the async flip path. So if the hardware were to
latch that bit despite this being an async flip we'd start
scanning out garbage. And if it doesn't latch it then I
guess we'd just end up with a weird register value that
doesn't actually match the hardware state, which isn't
great for anyone staring at register dumps.

Similarly the async flip path also forgets to call
skl_surf_address() which means the DPT address space to
GGTT address space downshift is not being applied to
the offset. Which means we are pointing PLANE_SURF
at some random location in GGTT instead of the correct
DPT page.

So let's fix two birds with one stone and extract the
PLANE_SURF calculation from skl_program_plane() into
a small helper and use it in the async flip path as well.

Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Juston Li <juston.li@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Karthik B S <karthik.b.s@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018115030.3547-3-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
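
A minimal sketch of the shared helper idea, assuming a made-up decrypt bit and shift amount. `FAKE_SURF_DECRYPT`, `FAKE_DPT_SHIFT`, and the struct are hypothetical stand-ins rather than the real register definitions; the point is that both the regular and the async flip path compute the PLANE_SURF value through the same function, so neither can forget the decrypt flag or the DPT downshift.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values, for illustration only. */
#define FAKE_SURF_DECRYPT (1u << 2)
#define FAKE_DPT_SHIFT    9

struct fake_plane_state {
	uint32_t offset; /* surface offset in its own address space */
	bool uses_dpt;   /* offset is in DPT space and needs the downshift */
	bool decrypt;    /* surface must be decrypted on scanout */
};

/* Translate the offset into the address space the register expects. */
uint32_t fake_surf_address(const struct fake_plane_state *state)
{
	return state->uses_dpt ? state->offset >> FAKE_DPT_SHIFT : state->offset;
}

/* Single helper used by both the regular and the async flip path. */
uint32_t fake_plane_surf(const struct fake_plane_state *state)
{
	uint32_t surf = fake_surf_address(state);

	if (state->decrypt)
		surf |= FAKE_SURF_DECRYPT;

	return surf;
}
```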

50faf7a1

drm/i915: Reject planar formats when doing async flips · aaec72ee

Ville Syrjälä authored Oct 18, 2021

Async flips are only capable of changing PLANE_SURF, hence
they can't easily be used with planar formats.

Older platforms could require updating AUX_DIST as well, which
is not possible. We'd have to make sure AUX_DIST doesn't change
before allowing the async flip through. If we could get async
flips with CCS then that might be interesting, but since the hw
doesn't allow async flips with CCS I don't see much point in
allowing this for planar formats either. No one renders their
game content in YUV anyway.

icl+ could in theory do this I suppose since each color plane
has its own PLANE_SURF register, but I don't know if there is
some magic to guarantee that both the Y and UV plane would
async flip synchronously if you will. Ie. beyond just a clean
tear we'd potentially get some kind of weird tear with some
random mix of luma and chroma from the old and new frames.

So let's just say no to async flips when scanning out planar
formats.

Cc: Karthik B S <karthik.b.s@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018115030.3547-2-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
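
A minimal sketch of such a check, assuming a simplified flip description in which a planar format is any format with more than one color plane. The struct and function name are hypothetical and do not reflect the actual atomic check code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flip description, for illustration only. */
struct fake_flip {
	bool async;
	int num_color_planes; /* 1 for packed formats, 2 for Y + UV planar */
};

/* Reject async flips for planar formats; allow everything else. */
int fake_check_async_flip(const struct fake_flip *flip)
{
	if (flip->async && flip->num_color_planes > 1)
		return -EINVAL;

	return 0;
}
```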

aaec72ee