Commits · 161996a8003faa22baaac9f379ef580af551d26f · nexedi / linux

06 Mar, 2019 3 commits

drm/i915/selftests: Fix MI_STORE_DWORD_IMM alignment · 161996a8

Chris Wilson authored Mar 06, 2019

MI_STORE_DWORD_IMM wants to write into a dword-aligned (4B) address, we
mistakenly cleared bit2 and not bits 0 and 1.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190306082447.21563-1-chris@chris-wilson.co.uk

161996a8

drm/i915: Pass around the intel_context · b146e5ef

Chris Wilson authored Mar 06, 2019

Instead of passing the gem_context and engine to find the instance of
the intel_context to use, pass around the intel_context instead. This is
useful for the next few patches, where the intel_context is no longer a
direct lookup.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190306084704.15755-1-chris@chris-wilson.co.uk

b146e5ef

drm/i915: Use i915_global_register() · 103b76ee

Chris Wilson authored Mar 05, 2019

Rather than manually add every new global into each hook, use
i915_global_register() function and keep a list of registered globals to
invoke instead.

However, I haven't found a way for random drivers to add an .init table
to avoid having to manually add ourselves to i915_globals_init() each
time.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305213830.18094-1-chris@chris-wilson.co.ukReviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

103b76ee

05 Mar, 2019 9 commits

drm/i915/icl: Default to Thread Group preemption for compute workloads · d846325a

Michał Winiarski authored Mar 05, 2019

We assumed that the default preemption granularity is fine for ICL.
Unfortunately, it turns out that some drivers don't support mid-thread
preemption for compute workloads.
If a workload that doesn't support mid-thread preemption gets mid-thread
preempted, we're going to observe a GPU hang.
While I'm here, let's also update the "workaround" naming.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Anuj Phogat <anuj.phogat@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Anuj Phogat <anuj.phogat@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305124827.23446-1-michal.winiarski@intel.com

d846325a

drm/i915: Move find_active_request() to the engine · cf4331dd

Chris Wilson authored Mar 05, 2019

To find the active request, we need only search along the individual
engine for the right request. This does not require touching any global
GEM state, so move it into the engine compartment.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-3-chris@chris-wilson.co.uk

cf4331dd

drm/i915/gtt: Mark ALL_ENGINES as dirty on ppGTT modification · fb251a72

Chris Wilson authored Mar 05, 2019

Small simplification to set all bits in the dirty mask rather than
lookup the exact mask of populated engines. The bits for the engines
that do not exist are unused and so can safely set and then ignored.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-2-chris@chris-wilson.co.uk

fb251a72

drm/i915: Store the BIT(engine->id) as the engine's mask · 8a68d464

Chris Wilson authored Mar 05, 2019

In the next patch, we are introducing a broad virtual engine to encompass
multiple physical engines, losing the 1:1 nature of BIT(engine->id). To
reflect the broader set of engines implied by the virtual instance, lets
store the full bitmask.

v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/)
v3: Tvrtko voted for moah churn so teach everyone to not mention ring
and use $class$instance throughout.
v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and
VCS[0-4] in later gen. We opt to keep the code consistent and use
0-index naming throughout.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk

8a68d464

drm/i915: Remove last traces of exec-id (GEM_BUSY) · c8b50242

Chris Wilson authored Mar 05, 2019

As we allow per-context engine allows the legacy concept of
I915_EXEC_RING no longer applies universally. We are still exposing the
unrelated exec-id in GEM_BUSY, so transition this ioctl (once more
slightly changing its ABI, but no one cares) over to only reporting the
uabi-class (not instance as we can not foreseeably fit those into the
small bitmask).

The only user of the extended ring information from GEM_BUSY is ddx/sna,
which tries to use the non-rcs business information to guide which
engine to use for subsequent operations on foreign bo. All that matters
for it is the decision between rcs and !rcs, so it is unaffected by the
change in higher bits.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305162643.20243-1-chris@chris-wilson.co.uk

c8b50242

drm/i915: Stop capturing semaphore registers for gen6/7 GPU hangs · 62acc7e8

Chris Wilson authored Mar 05, 2019

We no longer use the semaphore sync registers on gen6/7, so including
them in the GPU error state is mere noise.

References: 6faf5916 ("drm/i915: Remove HW semaphores for gen7 inter-engine synchronisation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305150914.11340-2-chris@chris-wilson.co.uk

62acc7e8

drm/i915: Just check the vebox IIR regardless · f14c0d9f

Chris Wilson authored Mar 05, 2019

As we don't unmask and enable the vebox interrupts if the engine is not
being used, we will never generate the vebox interrupts as part of the
IIR and so can unconditionally check IIR without fear of chasing into
the vebox.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305150914.11340-1-chris@chris-wilson.co.uk

f14c0d9f

drm/i915/gtt: Store scratch page size alongside not in the common struct · a2ac437b

Chris Wilson authored Mar 05, 2019

As the scratch page is the only one to be allocated with variable size,
rather than keep an unused slot in all i915_page_table structs, store it
alongside the vm->scratch_page.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305135430.4948-1-chris@chris-wilson.co.uk

a2ac437b

drm/i915/gtt: Use optimised memset32/64 for clearing PTE · 4f183645

Chris Wilson authored Mar 04, 2019

Replace the open-coded memset loops with the memset32/64 routines that
reduce to a single instruction or two:

add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-83 (-83)
Function                                     old     new   delta
gen6_ppgtt_clear_range                       371     344     -27
gen8_ppgtt_clear_pd                          575     519     -56
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190304230646.23714-1-chris@chris-wilson.co.uk

4f183645

04 Mar, 2019 9 commits

drm/i915: Fix bit name in PP_STATUS register · f139da13

Lucas De Marchi authored Mar 01, 2019

According to the spec PP_SEQUENCE_STATE_ON_S1_1 is the correct name, so
just rename it.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190302011405.6405-1-lucas.demarchi@intel.com

f139da13

drm/i915: allow platforms without eDP transcoder · bc7e3525

Lucas De Marchi authored Feb 22, 2019

Define a HAS_TRANSCODER_EDP() macro that checks if we have defined an
offset for this transcoder. This allows platforms to be defined without
eDP transcoder.

Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190222230254.20351-2-lucas.demarchi@intel.com

bc7e3525

drm/i915: refactor transcoders reporting on error state · 062de72b

Lucas De Marchi authored Feb 22, 2019

Instead of keeping track of the number of transcoders, loop through all
the interesting ones and check if there is a correspondent offset.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190222230254.20351-1-lucas.demarchi@intel.com

062de72b

drm/i915: Forcing a modeset when resetting HDMI link · b8fe992a

José Roberto de Souza authored Mar 01, 2019

With fastboot enabled in gen9+ it broke the HDMI reset as just
setting mode_changed to true causes a fastset and here we want a full
modeset that will disable and then enable the encoder of this HDMI
link actually, so setting connectors_changed instead that will cause
modeset as desired.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190302003349.19189-3-jose.souza@intel.com

b8fe992a

drm/i915: Don't manually add connectors and planes state · 3e5ebcdd

José Roberto de Souza authored Mar 01, 2019

drm_atomic_commit() call chain already takes care of adding
connectors and planes, so lets no add then manually if not changing
their states.

drm_atomic_commit()
        drm_atomic_check_only()
                config->funcs->atomic_check()/intel_atomic_check()
                        drm_atomic_helper_check()
                                drm_atomic_helper_check_modeset()
                                        for_each_oldnew_crtc_in_state()
                                                drm_atomic_add_affected_connectors()
                                                drm_atomic_add_affected_planes()
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190302003349.19189-2-jose.souza@intel.com

3e5ebcdd

drm/i915: Fix atomic state leak when resetting HDMI link · a551cd66

José Roberto de Souza authored Mar 01, 2019

Atomic state needs to be put even if the commit was successful.

Fixes: dba14b27 ("drm/i915: Reinitialize sink scrambling/TMDS clock ratio on HPD")
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190302003349.19189-1-jose.souza@intel.com

a551cd66

drm/i915: Fix the state checker for ICL Y planes · 3e1d87dd

Ville Syrjälä authored Mar 04, 2019

The plane used to scan out NV12 luma on ICL is logically
off but actually on. Fix the state checker to account for this.

Cc: Imre Deak <imre.deak@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109457Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190304131217.4338-1-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

3e1d87dd

drm/i915: Yet another if/else sort of newer to older platforms. · 993298af

Rodrigo Vivi authored Mar 01, 2019

No functional change. Just a reorg to match the preferred
behavior.

When rebasing internal branch on top of latest sort I noticed
few more cases that needs to get reordered.

Let's do in a bundle this time and hoping there's no other
missing places.

v2: Check for HSW/BDW ULT before generic IS_HASWELL or
    IS_BROADWELL or it doesn't work as pointed by Ville.
    But also ULT came afterwards anyway.
v3: Accepting suggestions from Lucas:
    Sort CNL/CFL, KBL/SKL, and use <= 8 removing chv and bdw.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301172703.12139-1-rodrigo.vivi@intel.com

993298af

drm/i915: Acquire breadcrumb ref before cancelling · e781a7a3

Chris Wilson authored Mar 04, 2019

We may race the interrupt signaling with retirement, in which case the
order in which we acquire the reference inside the interrupt is vital to
provide the correct barrier against the request being freed in
retirement, i.e. we need to acquire our reference before marking the
breadcrumb as cancelled (as soon as the breadcrumb is cancelled
retirement may drop its reference to the request without serialisation
with the interrupt handler).

<3>[  683.372226] BUG i915_request (Tainted: G     U           ): Object already free
<3>[  683.372269] -----------------------------------------------------------------------------

<4>[  683.372323] Disabling lock debugging due to kernel taint
<3>[  683.372393] INFO: Allocated in i915_request_alloc+0x169/0x810 [i915] age=0 cpu=2 pid=1420
<3>[  683.372412] 	kmem_cache_alloc+0x21c/0x280
<3>[  683.372478] 	i915_request_alloc+0x169/0x810 [i915]
<3>[  683.372540] 	i915_gem_do_execbuffer+0x84e/0x1ae0 [i915]
<3>[  683.372603] 	i915_gem_execbuffer2_ioctl+0x11b/0x420 [i915]
<3>[  683.372617] 	drm_ioctl_kernel+0x83/0xf0
<3>[  683.372626] 	drm_ioctl+0x2f3/0x3b0
<3>[  683.372636] 	do_vfs_ioctl+0xa0/0x6e0
<3>[  683.372645] 	ksys_ioctl+0x35/0x60
<3>[  683.372654] 	__x64_sys_ioctl+0x11/0x20
<3>[  683.372664] 	do_syscall_64+0x55/0x190
<3>[  683.372675] 	entry_SYSCALL_64_after_hwframe+0x49/0xbe
<3>[  683.372740] INFO: Freed in i915_request_retire_upto+0xfb/0x2e0 [i915] age=0 cpu=0 pid=1419
<3>[  683.372807] 	i915_request_retire_upto+0xfb/0x2e0 [i915]
<3>[  683.372870] 	i915_request_add+0x3bd/0x9d0 [i915]
<3>[  683.372931] 	i915_gem_do_execbuffer+0x141c/0x1ae0 [i915]
<3>[  683.372991] 	i915_gem_execbuffer2_ioctl+0x11b/0x420 [i915]
<3>[  683.373001] 	drm_ioctl_kernel+0x83/0xf0
<3>[  683.373008] 	drm_ioctl+0x2f3/0x3b0
<3>[  683.373015] 	do_vfs_ioctl+0xa0/0x6e0
<3>[  683.373023] 	ksys_ioctl+0x35/0x60
<3>[  683.373030] 	__x64_sys_ioctl+0x11/0x20
<3>[  683.373037] 	do_syscall_64+0x55/0x190
<3>[  683.373045] 	entry_SYSCALL_64_after_hwframe+0x49/0xbe
<3>[  683.373054] INFO: Slab 0x0000000079bcdd71 objects=30 used=2 fp=0x000000006d77b8af flags=0x8000000000010201
<3>[  683.373069] INFO: Object 0x000000006d77b8af @offset=24000 fp=0x000000007b061eab

<3>[  683.373083] Redzone 00000000ee47ef28: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
<3>[  683.373097] Redzone 000000000cb91471: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
<3>[  683.373111] Redzone 00000000cf2b86ee: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
<3>[  683.373125] Redzone 00000000f1f5a2cd: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
<3>[  683.373139] Object 000000006d77b8af: 00 00 00 00 5a 5a 5a 5a 00 3c 49 c0 ff ff ff ff  ....ZZZZ.<I.....
<3>[  683.373153] Object 000000006f9b6204: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
<3>[  683.373167] Object 0000000091410ffb: e0 dd 6b fa 87 9f ff ff e0 dd 6b fa 87 9f ff ff  ..k.......k.....
<3>[  683.373181] Object 000000004cdf799d: 20 de 6b fa 87 9f ff ff 3d 00 00 00 00 00 00 00   .k.....=.......
<3>[  683.373195] Object 00000000545afebc: aa b3 00 00 00 00 00 00 0f 00 00 00 00 00 00 00  ................
<3>[  683.373209] Object 00000000e4a394a8: 25 bd bd 1b 9f 00 00 00 00 00 00 00 5a 5a 5a 5a  %...........ZZZZ
<3>[  683.373223] Object 0000000029a7878a: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
<3>[  683.373237] Object 00000000d37797b3: ff ff ff ff ff ff ff ff e8 6e 57 c0 ff ff ff ff  .........nW.....
<3>[  683.373251] Object 00000000d50414f6: 00 b3 c8 8e ff ff ff ff 80 b0 c8 8e ff ff ff ff  ................
<3>[  683.373265] Object 00000000c28e8847: 41 01 4b c0 ff ff ff ff 00 00 88 8e 88 9f ff ff  A.K.............
<3>[  683.373279] Object 00000000c74212ab: 38 c1 6d 8a 88 9f ff ff 58 21 74 8a 88 9f ff ff  8.m.....X!t.....
<3>[  683.373293] Object 000000000d8012cf: c0 c1 6d 8a 88 9f ff ff 58 79 dd d9 87 9f ff ff  ..m.....Xy......
<3>[  683.373306] Object 00000000c9900b91: 98 d0 4e 8a 88 9f ff ff 58 3c e8 9b 88 9f ff ff  ..N.....X<......
<3>[  683.373320] Object 0000000044bb8c3d: 58 3c e8 9b 88 9f ff ff 64 f5 04 00 00 00 00 00  X<......d.......
<3>[  683.373334] Object 00000000180c4cca: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
<3>[  683.373348] Object 00000000c9044498: ff ff ff ff ff ff ff ff e0 6e 57 c0 ff ff ff ff  .........nW.....
<3>[  683.373362] Object 0000000072d0dfb3: 00 00 00 00 00 00 00 00 c0 b1 c8 8e ff ff ff ff  ................
<3>[  683.373376] Object 0000000081f198b9: 55 01 4b c0 ff ff ff ff d8 de 6b fa 87 9f ff ff  U.K.......k.....
<3>[  683.373390] Object 000000006a375a13: d8 de 6b fa 87 9f ff ff cc 05 39 c0 ff ff ff ff  ..k.......9.....
<3>[  683.373404] Object 00000000b8392dd1: ff ff ff ff 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ....ZZZZZZZZZZZZ
<3>[  683.373418] Object 00000000e5c1bbcb: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
<3>[  683.373432] Object 00000000199feccd: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
<3>[  683.373446] Object 0000000020f5e08b: 20 df 6b fa 87 9f ff ff 20 df 6b fa 87 9f ff ff   .k..... .k.....
<3>[  683.373460] Object 0000000090591b0f: 30 df 6b fa 87 9f ff ff 30 df 6b fa 87 9f ff ff  0.k.....0.k.....
<3>[  683.373473] Object 00000000232f7cd0: 40 df 6b fa 87 9f ff ff 40 df 6b fa 87 9f ff ff  @.k.....@.k.....
<3>[  683.373487] Object 0000000060458027: 50 df 6b fa 87 9f ff ff 50 df 6b fa 87 9f ff ff  P.k.....P.k.....
<3>[  683.373501] Object 00000000e3c82ce2: 06 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
<3>[  683.373515] Object 00000000ec804eb8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
<3>[  683.373529] Object 00000000ce7ccc08: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
<3>[  683.373543] Object 000000002dbc575c: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
<3>[  683.373557] Object 00000000b86d3417: 5a 5a 5a 5a 5a 5a 5a 5a 00 de 6b fa 87 9f ff ff  ZZZZZZZZ..k.....
<3>[  683.373571] Object 00000000d1e82276: b8 61 dd d9 87 9f ff ff a0 06 00 00 d0 06 00 00  .a..............
<3>[  683.373585] Object 00000000cc53f969: e8 06 00 00 20 07 00 00 28 07 00 00 00 00 00 00  .... ...(.......
<3>[  683.373599] Object 00000000ea2426d2: 40 0c 8c 7b 88 9f ff ff 00 00 00 00 00 00 00 00  @..{............
<3>[  683.373613] Object 00000000b860c1c3: 68 0d 8c 7b 88 9f ff ff 68 25 8c 7b 88 9f ff ff  h..{....h%.{....
<3>[  683.373627] Object 0000000016455ea0: 96 d5 05 00 01 00 00 00 00 5a 5a 5a 5a 5a 5a 5a  .........ZZZZZZZ
<3>[  683.373640] Object 00000000e66ede82: 00 e0 6b fa 87 9f ff ff 00 e0 6b fa 87 9f ff ff  ..k.......k.....
<3>[  683.373654] Object 0000000080964939: 10 e0 6b fa 87 9f ff ff 10 e0 6b fa 87 9f ff ff  ..k.......k.....
<3>[  683.373668] Object 00000000e7ffc5dd: 00 00 00 00 00 00 00 00 00 01 00 00 00 00 ad de  ................
<3>[  683.373682] Object 000000000ce9d6ca: 00 02 00 00 00 00 ad de 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
<3>[  683.373696] Object 00000000386659d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
<3>[  683.373710] Redzone 0000000075d2069d: bb bb bb bb bb bb bb bb                          ........
<3>[  683.373723] Padding 0000000054e14c6b: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
<3>[  683.373737] Padding 00000000425e5b34: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
<3>[  683.373751] Padding 00000000ad3d4db9: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
<4>[  683.373767] CPU: 1 PID: 151 Comm: kworker/1:2 Tainted: G    BU            5.0.0-rc8-g39139489403b-drmtip_236+ #1
<4>[  683.373769] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake Y LPDDR4x T4 RVP TLC, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019
<4>[  683.373773] Workqueue: events delayed_fput
<4>[  683.373775] Call Trace:
<4>[  683.373777]  <IRQ>
<4>[  683.373781]  dump_stack+0x67/0x9b
<4>[  683.373783]  free_debug_processing+0x344/0x370
<4>[  683.373832]  ? intel_engine_breadcrumbs_irq+0x2e4/0x380 [i915]
<4>[  683.373836]  __slab_free+0x337/0x4f0
<4>[  683.373840]  ? _raw_spin_unlock_irqrestore+0x39/0x60
<4>[  683.373844]  ? debug_check_no_obj_freed+0x132/0x210
<4>[  683.373889]  ? intel_engine_breadcrumbs_irq+0x2e4/0x380 [i915]
<4>[  683.373892]  ? kmem_cache_free+0x275/0x2e0
<4>[  683.373894]  kmem_cache_free+0x275/0x2e0
<4>[  683.373939]  intel_engine_breadcrumbs_irq+0x2e4/0x380 [i915]
<4>[  683.373984]  gen8_cs_irq_handler+0x4e/0xa0 [i915]
<4>[  683.374026]  gen11_irq_handler+0x24b/0x330 [i915]
<4>[  683.374032]  __handle_irq_event_percpu+0x41/0x2d0
<4>[  683.374034]  ? handle_irq_event+0x27/0x50
<4>[  683.374038]  handle_irq_event_percpu+0x2b/0x70
<4>[  683.374040]  handle_irq_event+0x2f/0x50
<4>[  683.374044]  handle_edge_irq+0xe7/0x190
<4>[  683.374048]  handle_irq+0x67/0x160
<4>[  683.374051]  do_IRQ+0x5e/0x130
<4>[  683.374054]  common_interrupt+0xf/0xf

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109827
Fixes: 52c0fdb2 ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190304114113.371-1-chris@chris-wilson.co.uk

e781a7a3

02 Mar, 2019 2 commits

drm/i915: extract AUX mask assignment to separate function · 9d17210f

Lucas De Marchi authored Feb 25, 2019

No change in behavior, this only allows to more easily follow the flow
of gen8_de_irq_handler without the mask assignments for each platform.
This also re-organizes the branches a little bit, so the one-off case
for CNL_WITH_PORT_F is separate from the generic gen >= 11.

v2: rename de_port_iir_aux_mask -> gen8_de_port_aux_mask (Ville)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jose Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226004900.26047-1-lucas.demarchi@intel.com

9d17210f

drm/i915/icl: move MG pll hw_state readout · 510a75a5

Lucas De Marchi authored Feb 22, 2019

Let the MG plls have their own hooks since it shares very little with
other PLL types. It's also better so the platform info contains the info
if the PLL is for MG PHY rather than relying on the PLL ids.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190222232324.16405-1-lucas.demarchi@intel.com

510a75a5

01 Mar, 2019 9 commits

drm/i915: Re-arrange execbuf so context is known before engine · 4aa90970

Tvrtko Ursulin authored Mar 01, 2019

Needed for a following patch.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301140404.26690-23-chris@chris-wilson.co.ukSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>

4aa90970

drm/i915: Fix I915_EXEC_RING_MASK · d90c06d5

Chris Wilson authored Mar 01, 2019

This was supposed to be a mask of all known rings, but it is being used
by execbuffer to filter out invalid rings, and so is instead mapping high
unused values onto valid rings. Instead of a mask of all known rings,
we need it to be the mask of all possible rings.

Fixes: 549f7365 ("drm/i915: Enable SandyBridge blitter ring")
Fixes: de1add36 ("drm/i915: Decouple execbuf uAPI from internal implementation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v4.6+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301140404.26690-21-chris@chris-wilson.co.uk

d90c06d5

drm/i915: Prioritise non-busywait semaphore workloads · f9e9e9de

Chris Wilson authored Mar 01, 2019

We don't want to busywait on the GPU if we have other work to do. If we
give non-busywaiting workloads higher (initial) priority than workloads
that require a busywait, we will prioritise work that is ready to run
immediately. We then also have to be careful that we don't give earlier
semaphores an accidental boost because later work doesn't wait on other
rings, hence we keep a history of semaphore usage of the dependency chain.

v2: Stop rolling the bits into a chain and just use a flag in case this
request or any of our dependencies use a semaphore. The rolling around
was contagious as Tvrtko was heard to fall off his chair.

Testcase: igt/gem_exec_schedule/semaphore
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-4-chris@chris-wilson.co.uk

f9e9e9de

drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ · e8861964

Chris Wilson authored Mar 01, 2019

Having introduced per-context seqno, we now have a means to identity
progress across the system without feel of rollback as befell the
global_seqno. That is we can program a MI_SEMAPHORE_WAIT operation in
advance of submission safe in the knowledge that our target seqno and
address is stable.

However, since we are telling the GPU to busy-spin on the target address
until it matches the signaling seqno, we only want to do so when we are
sure that busy-spin will be completed quickly. To achieve this we only
submit the request to HW once the signaler is itself executing (modulo
preemption causing us to wait longer), and we only do so for default and
above priority requests (so that idle priority tasks never themselves
hog the GPU waiting for others).

As might be reasonably expected, HW semaphores excel in inter-engine
synchronisation microbenchmarks (where the 3x reduced latency / increased
throughput more than offset the power cost of spinning on a second ring)
and have significant improvement (can be up to ~10%, most see no change)
for single clients that utilize multiple engines (typically media players
and transcoders), without regressing multiple clients that can saturate
the system or changing the power envelope dramatically.

v3: Drop the older NEQ branch, now we pin the signaler's HWSP anyway.
v4: Tell the world and include it as part of scheduler caps.

Testcase: igt/gem_exec_whisper
Testcase: igt/benchmarks/gem_wsim
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-3-chris@chris-wilson.co.uk

e8861964

drm/i915: Keep timeline HWSP allocated until idle across the system · ebece753

Chris Wilson authored Mar 01, 2019

In preparation for enabling HW semaphores, we need to keep in flight
timeline HWSP alive until its use across entire system has completed,
as any other timeline active on the GPU may still refer back to the
already retired timeline. We both have to delay recycling available
cachelines and unpinning old HWSP until the next idle point.

An easy option would be to simply keep all used HWSP until the system as
a whole was idle, i.e. we could release them all at once on parking.
However, on a busy system, we may never see a global idle point,
essentially meaning the resource will be leaked until we are forced to
do a GC pass. We already employ a fine-grained idle detection mechanism
for vma, which we can reuse here so that each cacheline can be freed
immediately after the last request using it is retired.

v3: Keep track of the activity of each cacheline.
v4: cacheline_free() on canceling the seqno tracking
v5: Finally with a testcase to exercise wraparound
v6: Pack cacheline into empty bits of page-aligned vaddr
v7: Use i915_utils to hide the pointer casting around bit manipulation
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-2-chris@chris-wilson.co.uk

ebece753

drm/i915/execlists: Suppress redundant preemption · 1e3f697e

Chris Wilson authored Mar 01, 2019

On unwinding the active request we give it a small (limited to internal
priority levels) boost to prevent it from being gazumped a second time.
However, this means that it can be promoted to above the request that
triggered the preemption request, causing a preempt-to-idle cycle for no
change. We can avoid this if we take the boost into account when
checking if the preemption request is valid.

v2: After preemption the active request will be after the preemptee if
they end up with equal priority.

v3: Tvrtko pointed out that this, the existing logic, makes
I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!

v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
v5: Except not all priorities were made equal, and the WAIT not preempting
is only if we start off as !NEWCLIENT.

v6: More commentary after coming to an understanding about what I had
forgotten to say.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-1-chris@chris-wilson.co.uk

1e3f697e

drm/i915/selftests: Check that whitelisted registers are accessible · 34ae8455

Chris Wilson authored Mar 01, 2019

There is no point in whitelisting a register that the user then cannot
write to, so check the register exists before merging such patches.

v2: Mark SLICE_COMMON_ECO_CHICKEN1 [731c] as write-only
v3: Use different variables for different meanings!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dale B Stimson <dale.b.stimson@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301140404.26690-6-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20190301160108.19039-1-chris@chris-wilson.co.uk

34ae8455

drm/i915: Finalize Wa_1408961008:icl · c384afe3

Ville Syrjälä authored Feb 28, 2019

The icl wm1+ underrun w/a has been added to the spec. It changed
slightly from the previous incarnation by requiring that we mirror
the lines watermark and the ignore lines bit from WM0 into WM1.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228173639.18422-1-ville.syrjala@linux.intel.comReviewed-by: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Clint Taylor <Clinton.A.Taylor@intel.com>

c384afe3

drm/i915: Introduce i915_timeline.mutex · 3ef71149

Chris Wilson authored Mar 01, 2019

A simple mutex used for guarding the flow of requests in and out of the
timeline. In the short-term, it will be used only to guard the addition
of requests into the timeline, taken on alloc and released on commit so
that only one caller can construct a request into the timeline
(important as the seqno and ring pointers must be serialised). This will
be used by observers to ensure that the seqno/hwsp is stable. Later,
when we have reduced retiring to only operate on a single timeline at a
time, we can then use the mutex as the sole guard required for retiring.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190301110547.14758-2-chris@chris-wilson.co.uk

3ef71149

28 Feb, 2019 8 commits

drm/i915/execlists: Suppress mere WAIT preemption · b5773a36

Chris Wilson authored Feb 28, 2019

WAIT is occasionally suppressed by virtue of preempted requests being
promoted to NEWCLIENT if they have not all ready received that boost.
Make this consistent for all WAIT boosts that they are not allowed to
preempt executing contexts and are merely granted the right to be at the
front of the queue for the next execution slot. This is in keeping with
the desire that the WAIT boost be a minor tweak that does not give
excessive promotion to its user and open ourselves to trivial abuse.

The problem with the inconsistent WAIT preemption becomes more apparent
as the preemption is propagated across the engines, where one engine may
preempt and the other not, and we be relying on the exact execution
order being consistent across engines (e.g. using HW semaphores to
coordinate parallel execution).

v2: Also protect GuC submission from false preemption loops.
v3: Build bug safeguards and better debug messages for st.
v4: Do the priority bumping in unsubmit (i.e. on preemption/reset
unwind), applying it earlier during submit causes out-of-order execution
combined with execute fences.
v5: Call sw_fence_fini for our dummy request (Matthew)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228220639.3173-1-chris@chris-wilson.co.uk

b5773a36

drm/i915: Use __ffs() in for_each_priolist for more compact code · bd5d6781

Chris Wilson authored Feb 26, 2019

Gcc has a slight preference if we use __ffs() to subtract one from the
index once rather than each use:

__execlists_submission_tasklet              2867    2847     -20
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226102404.29153-11-chris@chris-wilson.co.uk

bd5d6781

drm/i915: Remove second level open-coded rcu work · d9948a10

Chris Wilson authored Feb 28, 2019

We currently use a worker queued from an rcu callback to determine when
a how grace period has elapsed while we remained idle. We use this idle
delay to infer that we will be idle for a while and this is a suitable
point at which we can trim our global memory caches.

Since we wrote that, this mechanism now exists as rcu_work, and having
converted the idle shrinkers over to using that, we can remove our own
variant.

v2: Say goodbye to gt.epoch as well.
v3: Remove the misplaced and redundant comment before parking globals
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-3-chris@chris-wilson.co.uk

d9948a10

drm/i915: Make object/vma allocation caches global · 13f1bfd3

Chris Wilson authored Feb 28, 2019

As our allocations are not device specific, we can move our slab caches
to a global scope.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk

13f1bfd3

drm/i915: Make request allocation caches global · 32eb6bcf

Chris Wilson authored Feb 28, 2019

As kmem_caches share the same properties (size, allocation/free behaviour)
for all potential devices, we can use global caches. While this
potential has worse fragmentation behaviour (one can argue that
different devices would have different activity lifetimes, but you can
also argue that activity is temporal across the system) it is the
default behaviour of the system at large to amalgamate matching caches.

The benefit for us is much reduced pointer dancing along the frequent
allocation paths.

v2: Defer shrinking until after a global grace period for futureproofing
multiple consumers of the slab caches, similar to the current strategy
for avoiding shrinking too early.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-1-chris@chris-wilson.co.uk

32eb6bcf

drm/i915: Report engines are idle if already parked · bd2be141

Chris Wilson authored Feb 27, 2019

If we have parked, then we must have passed an idleness test and still
be idle. We chose not to use this shortcut in the past so that we could
use the idleness test at any time and inspect HW. However, some HW like
Sandybridge, doesn't like being woken up frivolously, so avoid doing so.

References: 0b702dca ("drm/i915: Avoid waking the engines just to check if they are idle")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190227214159.7946-1-chris@chris-wilson.co.uk

bd2be141

Revert "drm/i915: Avoid waking the engines just to check if they are idle" · 44f8b802

Chris Wilson authored Feb 27, 2019

This reverts commit 0b702dca.

CI reports that this is not as reliable as it first appears, with
failures starting to sporadically occur in selftests.

Fixes: 0b702dca ("drm/i915: Avoid waking the engines just to check if they are idle")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190227204654.14907-1-chris@chris-wilson.co.uk

44f8b802

drm/i915: Compute the global scheduler caps · 2d5eaad0

Chris Wilson authored Feb 26, 2019

Do a pass over all the engines upon starting to determine the global
scheduler capability flags (those that are agreed upon by all).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190226102404.29153-7-chris@chris-wilson.co.uk

2d5eaad0