Commits · ca1a453361cd1cc73752998d1acd8616582c2a64 · Kirill Smelkov / linux

06 Jun, 2024 2 commits

drm/i915/gt: Delete the live_hearbeat_fast selftest · ca1a4533

Niemiec, Krzysztof authored Jun 03, 2024

The test is trying to push the heartbeat frequency to the limit, which
might sometimes fail. Such a failure does not provide valuable
information, because it does not indicate that there is something
necessarily wrong with either the driver or the hardware.

Remove the test to prevent random, unnecessary failures from appearing
in CI.
Suggested-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Niemiec, Krzysztof <krzysztof.niemiec@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fe2vu5h7v7ooxbhwpbfsypxg5mjrnt56gc3cgrqpnhgrgce334@qfrv2skxrp47

ca1a4533

drm/i915: Increase FLR timeout from 3s to 9s · c5d86c19

Andi Shyti authored May 24, 2024

Following the guidelines it takes 3 seconds to perform an FLR
reset. Let's give it a bit more slack because this time can
change depending on the platform and on the firmware
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240523235853.171796-1-andi.shyti@linux.intel.com

c5d86c19

22 May, 2024 1 commit

drm/i915/gt: Fix CCS id's calculation for CCS mode setting · a09d2327

Andi Shyti authored May 17, 2024

The whole point of the previous fixes has been to change the CCS
hardware configuration to generate only one stream available to
the compute users. We did this by changing the info.engine_mask
that is set during device probe, reset during the detection of
the fused engines, and finally reset again when choosing the CCS
mode.

We can't use the engine_mask variable anymore, as with the
current configuration, it imposes only one CCS no matter what the
hardware configuration is.

Before changing the engine_mask for the third time, save it and
use it for calculating the CCS mode.

After the previous changes, the user reported a performance drop
to around 1/4. We have tested that the compute operations, with
the current patch, have improved by the same factor.

Fixes: 6db31251 ("drm/i915/gt: Enable only one CCS for compute workload")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Gnattu OC <gnattuoc@me.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Jian Ye <jian.ye@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Tested-by: Gnattu OC <gnattuoc@me.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240517090616.242529-1-andi.shyti@linux.intel.com

a09d2327

16 May, 2024 6 commits

drm/i915/guc: avoid FIELD_PREP warning · 364e0398

Arnd Bergmann authored Apr 30, 2024

With gcc-7 and earlier, there are lots of warnings like

In file included from <command-line>:0:0:
In function '__guc_context_policy_add_priority.isra.66',
    inlined from '__guc_context_set_prio.isra.67' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3292:3,
    inlined from 'guc_context_set_prio' at drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3320:2:
include/linux/compiler_types.h:399:38: error: call to '__compiletime_assert_631' declared with attribute error: FIELD_PREP: mask is not constant
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
                                      ^
...
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:2422:3: note: in expansion of macro 'FIELD_PREP'
   FIELD_PREP(GUC_KLV_0_KEY, GUC_CONTEXT_POLICIES_KLV_ID_##id) | \
   ^~~~~~~~~~

Make sure that GUC_KLV_0_KEY is an unsigned value to avoid the warning.

Fixes: 77b6f79d ("drm/i915/guc: Update to GuC version 69.0.3")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430164809.482131-1-julia.filipchuk@intel.com

364e0398

drm/i915: Support replaying GPU hangs with captured context image · 0f1bb41b

Tvrtko Ursulin authored May 14, 2024

When debugging GPU hangs Mesa developers are finding it useful to replay
the captured error state against the simulator. But due various simulator
limitations which prevent replicating all hangs, one step further is being
able to replay against a real GPU.

This is almost doable today with the missing part being able to upload the
captured context image into the driver state prior to executing the
uploaded hanging batch and all the buffers.

To enable this last part we add a new context parameter called
I915_CONTEXT_PARAM_CONTEXT_IMAGE. It follows the existing SSEU
configuration pattern of being able to select which context to apply
against, paired with the actual image and its size.

Since this is adding a new concept of debug only uapi, we hide it behind
a new kconfig option and also require activation with a module parameter.
Together with a warning banner printed at driver load, all those combined
should be sufficient to guard against inadvertently enabling the feature.

In terms of implementation we allow the legacy context set param to be
used since that removes the need to record the per context data in the
proto context, while still allowing flexibility of specifying context
images for any context.

Mesa MR using the uapi can be seen at:
  https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27594

v2:
 * Fix whitespace alignment as per checkpatch.
 * Added warning on userspace misuse.
 * Rebase for extracting ce->default_state shadowing.

v3:
 * Rebase for I915_CONTEXT_PARAM_LOW_LATENCY.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Carlos Santa <carlos.santa@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Carlos Santa <carlos.santa@intel.com>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514145939.87427-2-tursulin@igalia.com

0f1bb41b

drm/i915: Shadow default engine context image in the context · e1eb97c2

Tvrtko Ursulin authored May 14, 2024

To enable adding override of the default engine context image let us start
shadowing the per engine state in the context.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Carlos Santa <carlos.santa@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514145939.87427-1-tursulin@igalia.com

e1eb97c2

Merge drm/drm-next into drm-intel-gt-next · 60a2f25d

Tvrtko Ursulin authored May 16, 2024

Some display refactoring patches are needed in order to allow conflict-
less merging.
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>

60a2f25d

drm/tests: Add a unit test for range bias allocation · 431c590c

Arunpravin Paneer Selvam authored May 14, 2024

Allocate cleared blocks in the bias range when the DRM
buddy's clear avail is zero. This will validate the bias
range allocation in scenarios like system boot when no
cleared blocks are available and exercise the fallback
path too. The resulting blocks should always be dirty.

v1:(Matthew)
  - move the size to the variable declaration section.
  - move the mm.clear_avail init to allocator init.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514145636.16253-2-Arunpravin.PaneerSelvam@amd.com

431c590c

drm/buddy: Fix the range bias clear memory allocation issue · bb21700b

Arunpravin Paneer Selvam authored May 14, 2024

Problem statement: During the system boot time, an application request
for the bulk volume of cleared range bias memory when the clear_avail
is zero, we dont fallback into normal allocation method as we had an
unnecessary clear_avail check which prevents the fallback method leads
to fb allocation failure following system goes into unresponsive state.

Solution: Remove the unnecessary clear_avail check in the range bias
allocation function.

v2: add a kunit for this corner case (Daniel Vetter)
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Fixes: 96950929 ("drm/buddy: Implement tracking clear page feature")
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240514145636.16253-1-Arunpravin.PaneerSelvam@amd.com

bb21700b

14 May, 2024 1 commit

drm/i915/gt: Disarm breadcrumbs if engines are already idle · fbad43ec

Chris Wilson authored Apr 23, 2024

The breadcrumbs use a GT wakeref for guarding the interrupt, but are
disarmed during release of the engine wakeref. This leaves a hole where
we may attach a breadcrumb just as the engine is parking (after it has
parked its breadcrumbs), execute the irq worker with some signalers still
attached, but never be woken again.

That issue manifests itself in CI with IGT runner timeouts while tests
are waiting indefinitely for release of all GT wakerefs.

<6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
<7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
<7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
<7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
<7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
<7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
<7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
<7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
<4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
...
<6> [299.356526] sysrq: Show State
...
<6> [299.373964] task:i915_selftest   state:D stack:11784 pid:5578  tgid:5578  ppid:873    flags:0x00004002
<6> [299.373967] Call Trace:
<6> [299.373968]  <TASK>
<6> [299.373970]  __schedule+0x3bb/0xda0
<6> [299.373974]  schedule+0x41/0x110
<6> [299.373976]  intel_wakeref_wait_for_idle+0x82/0x100 [i915]
<6> [299.374083]  ? __pfx_var_wake_function+0x10/0x10
<6> [299.374087]  live_engine_busy_stats+0x9b/0x500 [i915]
<6> [299.374173]  __i915_subtests+0xbe/0x240 [i915]
<6> [299.374277]  ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
<6> [299.374369]  ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
<6> [299.374456]  intel_engine_live_selftests+0x1c/0x30 [i915]
<6> [299.374547]  __run_selftests+0xbb/0x190 [i915]
<6> [299.374635]  i915_live_selftests+0x4b/0x90 [i915]
<6> [299.374717]  i915_pci_probe+0x10d/0x210 [i915]

At the end of the interrupt worker, if there are no more engines awake,
disarm the breadcrumb and go to sleep.

Fixes: 9d5612ca ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423165505.465734-2-janusz.krzysztofik@linux.intel.com

fbad43ec

13 May, 2024 1 commit

drm/i915/gem/i915_gem_ttm_move: Fix typo · a3598d7d

Deming Wang authored May 13, 2024

The mapings should be replaced by mappings.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Deming Wang <wangdeming@inspur.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240513061451.1627-1-wangdeming@inspur.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a3598d7d

10 May, 2024 3 commits

Merge tag 'drm-xe-next-fixes-2024-05-09-1' of... · 275654c0

Dave Airlie authored May 10, 2024

Merge tag 'drm-xe-next-fixes-2024-05-09-1' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Driver Changes:
- Use ordered WQ for G2H handler. (Matthew Brost)
- Use flexible-array rather than zero-sized (Lucas De Marchi)
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zjz7SzCvfA3vQRxu@fedora

275654c0

Merge tag 'drm-misc-next-fixes-2024-05-08' of... · 110ed472

Dave Airlie authored May 10, 2024

Merge tag 'drm-misc-next-fixes-2024-05-08' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

drm-misc-next-fixes for v6.10-rc1:
- panthor fixes.
- Reverting Kconfig changes, and moving drm options to submenu.
- Hide physical fb address in fb helper.
- zynqmp bridge fix.
- Revert broken ti-sn65dsi83 fix.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fe630414-d13e-4052-86f3-ce3155eb3e44@linux.intel.com

110ed472

Merge tag 'drm-msm-next-2024-05-07' of https://gitlab.freedesktop.org/drm/msm into drm-next · c815e4e7

Dave Airlie authored May 10, 2024

Updates for v6.10

Core:
- Switched to generating register header files during build process
  instead of shipping pre-generated headers
- Merged DPU and MDP4 format databases.

DP:
- Stop using compat string to distinguish DP and eDP cases
- Added support for X Elite platform (X1E80100)
- Reworked DP aux/audio support
- Added SM6350 DP to the bindings (no driver changes, using SM8350
  as a fallback compat)

GPU:
- a7xx perfcntr reg fixes
- MAINTAINERS updates
- a750 devcoredump support
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGtpw6dNR9JBikFTQ=TCpt-9FeFW+SGjXWv+Jv3emm0Pbg@mail.gmail.com

c815e4e7

09 May, 2024 2 commits

drm/xe/ads: Use flexible-array · d69c3d4b

Lucas De Marchi authored May 06, 2024

Zero-length arrays are deprecated and flexible arrays should be
used instead: https://www.kernel.org/doc/html/v6.9-rc7/process/deprecated.html#zero-length-and-one-element-arraysReported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Closes: https://lore.kernel.org/r/202405051824.AmjAI5Pg-lkp@intel.com/
Fixes: dd08ebf6 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506141917.205714-1-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit ee728423)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

d69c3d4b

drm/xe: Use ordered WQ for G2H handler · 2d9c72f6

Matthew Brost authored May 05, 2024

System work queues are shared, use a dedicated work queue for G2H
processing to avoid G2H processing getting block behind system tasks.

Fixes: dd08ebf6 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240506034758.3697397-1-matthew.brost@intel.com
(cherry picked from commit 50aec966)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

2d9c72f6

08 May, 2024 3 commits

drm/i915: Remove counter productive REGION_* wrappers · 3797783b

Ville Syrjälä authored May 02, 2024

This extra macro level between the region IDs and their bitmasks
just makes it harder to see what is used where. Get rid of the
wrappers.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502121423.1002-3-ville.syrjala@linux.intel.comReviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

3797783b

drm/i915: Pass the region ID rather than a bitmask to HAS_REGION() · d082c05a

Ville Syrjälä authored May 02, 2024

The name 'HAS_REGION()' suggests we are checking for a single
region, so seem more sensible to pass in the region ID rather
than a bitmask.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502121423.1002-2-ville.syrjala@linux.intel.comReviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

d082c05a

drm/i915: Fix HAS_REGION() usage in intel_gt_probe_lmem() · 8b69ac66

Ville Syrjälä authored May 02, 2024

HAS_REGION() takes a bitmask, not the region ID. This causes the
GEM_BUG_ON() to assert that the SMEM region is available rather
than the intended LMEM region. No real harm since SMEM is always
available, but also not checking what was intended.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502121423.1002-1-ville.syrjala@linux.intel.comReviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8b69ac66

07 May, 2024 3 commits

Revert "drm/i915: Remove extra multi-gt pm-references" · 749670a5

Janusz Krzysztofik authored May 06, 2024

This reverts commit 1f33dc0c.

There was a patch supposed to fix an issue of illegal attempts to free a
still active i915 VMA object when parking a GT believed to be idle,
reported by CI on 2-GT Meteor Lake.  As a solution, an extra wakeref for
a Primary GT was acquired from i915_gem_do_execbuffer() -- see commit
f56fe3e9 ("drm/i915: Fix a VMA UAF for multi-gt platform").

However, that fix occurred insufficient -- the issue was still reported by
CI.  That wakeref was released on exit from i915_gem_do_execbuffer(), then
potentially before completion of the request and deactivation of its
associated VMAs.  Moreover, CI reports indicated that single-GT platforms
also suffered sporadically from the same race.

Since that issue was fixed by another commit f3c71b2d ("drm/i915/vma:
Fix UAF on destroy against retire race"), the changes introduced by that
insufficient fix were dropped as no longer useful.  However, that series
resulted in another VMA UAF scenario now being triggered in CI.

<4> [260.290809] ------------[ cut here ]------------
<4> [260.290988] list_del corruption. prev->next should be ffff888118c5d990, but was ffff888118c5a510. (prev=ffff888118c5a510)
<4> [260.291004] WARNING: CPU: 2 PID: 1143 at lib/list_debug.c:62 __list_del_entry_valid_or_report+0xb7/0xe0
..
<4> [260.291055] CPU: 2 PID: 1143 Comm: kms_plane Not tainted 6.9.0-rc2-CI_DRM_14524-ga25d180c6853+ #1
<4> [260.291058] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P LP5x T3 RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024
<4> [260.291060] RIP: 0010:__list_del_entry_valid_or_report+0xb7/0xe0
...
<4> [260.291087] Call Trace:
<4> [260.291089]  <TASK>
<4> [260.291124]  i915_vma_reopen+0x43/0x80 [i915]
<4> [260.291298]  eb_lookup_vmas+0x9cb/0xcc0 [i915]
<4> [260.291579]  i915_gem_do_execbuffer+0xc9a/0x26d0 [i915]
<4> [260.291883]  i915_gem_execbuffer2_ioctl+0x123/0x2a0 [i915]
...
<4> [260.292301]  </TASK>
...
<4> [260.292506] ---[ end trace 0000000000000000 ]---
<4> [260.292782] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6ca3: 0000 [#1] PREEMPT SMP NOPTI
<4> [260.303575] CPU: 2 PID: 1143 Comm: kms_plane Tainted: G        W          6.9.0-rc2-CI_DRM_14524-ga25d180c6853+ #1
<4> [260.313851] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P LP5x T3 RVP, BIOS MTLPFWI1.R00.3471.D91.2401310918 01/31/2024
<4> [260.326359] RIP: 0010:eb_validate_vmas+0x114/0xd80 [i915]
...
<4> [260.428756] Call Trace:
<4> [260.431192]  <TASK>
<4> [639.283393]  i915_gem_do_execbuffer+0xd05/0x26d0 [i915]
<4> [639.305245]  i915_gem_execbuffer2_ioctl+0x123/0x2a0 [i915]
...
<4> [639.411134]  </TASK>
...
<4> [639.449979] ---[ end trace 0000000000000000 ]---

We defer actually closing, unbinding and destroying a VMA until next idle
point, or until the object is freed in the meantime.  By postponing the
unbind, we allow for the VMA to be reopened by the client, avoiding the
work required to rebind the VMA.

Starting from commit b0647a5e ("drm/i915: Avoid live-lock with
i915_vma_parked()"), we assume that as long as a GT is held idle, no VMA
would be reopened while we destroy them.  That assumption is no longer
true in multi-GT configurations, where a VMA we reopen may be handled by a
GT different from the one that we already keep active via its engine while
we set up an execbuf request.

Restoring the extra GT0 PM wakeref removed from i915_gem_do_execbuffer()
processing path seems to fix this issue.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10608Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com>
Fixes: 1f33dc0c ("drm/i915: Remove extra multi-gt pm-references")
Link: https://patchwork.freedesktop.org/patch/msgid/20240506180253.96858-2-janusz.krzysztofik@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

749670a5

drm/msm/gen_header: allow skipping the validation · b587f413

Dmitry Baryshkov authored May 03, 2024

We don't need to run the validation of the XML files if we are just
compiling the kernel. Skip the validation unless the user enables
corresponding Kconfig option. This removes a warning from gen_header.py
about lxml being not installed.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/all/20240409120108.2303d0bd@canb.auug.org.au/Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/592558/Signed-off-by: Rob Clark <robdclark@chromium.org>

b587f413

drm/msm/a6xx: Cleanup indexed regs const'ness · 69b79e80

Rob Clark authored May 04, 2024

These tables were made non-const in commit 3cba4a2c ("drm/msm/a6xx:
Update ROQ size in coredump") in order to avoid powering up the GPU when
reading back a devcoredump.  Instead let's just stash the count that is
potentially read from hw in struct a6xx_gpu_state_obj, and make the
tables const again.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/592699/

69b79e80

05 May, 2024 2 commits

drm/msm: Add devcoredump support for a750 · f3f8207d

Connor Abbott authored May 03, 2024

Add an a750 case to the various places where we choose a list of
registers.
Patchwork: https://patchwork.freedesktop.org/patch/592519/Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/592519Signed-off-by: Rob Clark <robdclark@chromium.org>

f3f8207d

drm/msm: Adjust a7xx GBIF debugbus dumping · b636a6d2

Connor Abbott authored May 03, 2024

Use the kgsl-style list of indices, because this is about to change for
a750 and we want to reuse the downstream header directly.
Patchwork: https://patchwork.freedesktop.org/patch/592520/Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/592520Signed-off-by: Rob Clark <robdclark@chromium.org>

b636a6d2

04 May, 2024 8 commits

drm/msm: Update a6xx registers XML · 0eb61e20

Connor Abbott authored May 03, 2024

Update to Mesa commit e82d70d472cc ("freedreno/a7xx: Add
A7XX_HLSQ_DP_STR location from kgsl").
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/592518/Signed-off-by: Rob Clark <robdclark@chromium.org>

0eb61e20

drm/msm: Fix imported a750 snapshot header for upstream · 106414f8

Connor Abbott authored May 03, 2024

Add A7XX prefixes necessary because we use the same code for dumping
a6xx and a7xx, fix register name prefixes for upstream, and use the
upstream header.
Patchwork: https://patchwork.freedesktop.org/patch/592517/Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/592517Signed-off-by: Rob Clark <robdclark@chromium.org>

106414f8

drm/msm: Import a750 snapshot registers from kgsl · 6408a1b5

Connor Abbott authored May 03, 2024

Import from kgsl commit 809ee24fe560.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/592516/Signed-off-by: Rob Clark <robdclark@chromium.org>

6408a1b5

MAINTAINERS: Add Konrad Dybcio as a reviewer for the Adreno driver · fe43e194

Konrad Dybcio authored Apr 23, 2024

Add myself as a reviewer for Adreno driver changes.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/590705/Signed-off-by: Rob Clark <robdclark@chromium.org>

fe43e194

MAINTAINERS: Add a separate entry for Qualcomm Adreno GPU drivers · 3fe0aa1c

Konrad Dybcio authored Apr 23, 2024

The msm driver is.. gigantic and covers display hardware (incl. things
concerning (e)DP, DSI, HDMI), as well as the entire lineup of Adreno
GPUs (with hw bringup, memory mappings, userspace interaction etc.).

Because of that, people listed as M:/R: receive patches concerning
drivers for any part of the display block OR the GPU. Separate the
latter, as it's both a functionally separate block and is of
interest to different folks.
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/590704/Signed-off-by: Rob Clark <robdclark@chromium.org>

3fe0aa1c

drm/msm/a6xx: Avoid a nullptr dereference when speedbin setting fails · 46d4efcc

Konrad Dybcio authored Apr 12, 2024

Calling a6xx_destroy() before adreno_gpu_init() leads to a null pointer
dereference on:

msm_gpu_cleanup() : platform_set_drvdata(gpu->pdev, NULL);

as gpu->pdev is only assigned in:

a6xx_gpu_init()
|_ adreno_gpu_init
    |_ msm_gpu_init()

Instead of relying on handwavy null checks down the cleanup chain,
explicitly de-allocate the LLC data and free a6xx_gpu instead.

Fixes: 76efc245 ("drm/msm/gpu: Fix crash during system suspend after unbind")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/588919/Signed-off-by: Rob Clark <robdclark@chromium.org>

46d4efcc

drm/msm/adreno: fix CP cycles stat retrieval on a7xx · 32866026

Zan Dobersek authored Apr 09, 2024

a7xx_submit() should use the a7xx variant of the RBBM_PERFCTR_CP register
for retrieving the CP cycles value before and after the submitted command
stream execution.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: af66706a ("drm/msm/a6xx: Add skeleton A7xx support")
Patchwork: https://patchwork.freedesktop.org/patch/588445/Signed-off-by: Rob Clark <robdclark@chromium.org>

32866026

drm/msm/a7xx: allow writing to CP_BV counter selection registers · 3f9bb601

Zan Dobersek authored Feb 29, 2024

In addition to the CP_PERFCTR_CP_SEL register range, allow writes to the
CP_BV_PERFCTR_CP_SEL registers in the 0x8e0-0x8e6 range for profiling
purposes of tools like fdperf and perfetto.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Patchwork: https://patchwork.freedesktop.org/patch/580548/
[fixup a730_protect size]
Signed-off-by: Rob Clark <robdclark@chromium.org>

3f9bb601

03 May, 2024 1 commit

Merge tag 'drm-xe-next-fixes-2024-05-02' of... · f03eee5f

Dave Airlie authored May 03, 2024

Merge tag 'drm-xe-next-fixes-2024-05-02' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Driver Changes:
- Fix for a backmerge going slightly wrong.
- An UAF fix
- Avoid a WA error on LNL.
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZjOijQA43zhu3SZ4@fedora

f03eee5f

02 May, 2024 7 commits

drm: zynqmp_dpsub: Always register bridge · be3f3042

Sean Anderson authored Mar 08, 2024

We must always register the DRM bridge, since zynqmp_dp_hpd_work_func
calls drm_bridge_hpd_notify, which in turn expects hpd_mutex to be
initialized. We do this before zynqmp_dpsub_drm_init since that calls
drm_bridge_attach. This fixes the following lockdep warning:

[   19.217084] ------------[ cut here ]------------
[   19.227530] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[   19.227768] WARNING: CPU: 0 PID: 140 at kernel/locking/mutex.c:582 __mutex_lock+0x4bc/0x550
[   19.241696] Modules linked in:
[   19.244937] CPU: 0 PID: 140 Comm: kworker/0:4 Not tainted 6.6.20+ #96
[   19.252046] Hardware name: xlnx,zynqmp (DT)
[   19.256421] Workqueue: events zynqmp_dp_hpd_work_func
[   19.261795] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   19.269104] pc : __mutex_lock+0x4bc/0x550
[   19.273364] lr : __mutex_lock+0x4bc/0x550
[   19.277592] sp : ffffffc085c5bbe0
[   19.281066] x29: ffffffc085c5bbe0 x28: 0000000000000000 x27: ffffff88009417f8
[   19.288624] x26: ffffff8800941788 x25: ffffff8800020008 x24: ffffffc082aa3000
[   19.296227] x23: ffffffc080d90e3c x22: 0000000000000002 x21: 0000000000000000
[   19.303744] x20: 0000000000000000 x19: ffffff88002f5210 x18: 0000000000000000
[   19.311295] x17: 6c707369642e3030 x16: 3030613464662072 x15: 0720072007200720
[   19.318922] x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 0000000000000001
[   19.326442] x11: 0001ffc085c5b940 x10: 0001ff88003f388b x9 : 0001ff88003f3888
[   19.334003] x8 : 0001ff88003f3888 x7 : 0000000000000000 x6 : 0000000000000000
[   19.341537] x5 : 0000000000000000 x4 : 0000000000001668 x3 : 0000000000000000
[   19.349054] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff88003f3880
[   19.356581] Call trace:
[   19.359160]  __mutex_lock+0x4bc/0x550
[   19.363032]  mutex_lock_nested+0x24/0x30
[   19.367187]  drm_bridge_hpd_notify+0x2c/0x6c
[   19.371698]  zynqmp_dp_hpd_work_func+0x44/0x54
[   19.376364]  process_one_work+0x3ac/0x988
[   19.380660]  worker_thread+0x398/0x694
[   19.384736]  kthread+0x1bc/0x1c0
[   19.388241]  ret_from_fork+0x10/0x20
[   19.392031] irq event stamp: 183
[   19.395450] hardirqs last  enabled at (183): [<ffffffc0800b9278>] finish_task_switch.isra.0+0xa8/0x2d4
[   19.405140] hardirqs last disabled at (182): [<ffffffc081ad3754>] __schedule+0x714/0xd04
[   19.413612] softirqs last  enabled at (114): [<ffffffc080133de8>] srcu_invoke_callbacks+0x158/0x23c
[   19.423128] softirqs last disabled at (110): [<ffffffc080133de8>] srcu_invoke_callbacks+0x158/0x23c
[   19.432614] ---[ end trace 0000000000000000 ]---

Fixes: eb2d64bf ("drm: xlnx: zynqmp_dpsub: Report HPD through the bridge")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240308204741.3631919-1-sean.anderson@linux.dev
(cherry picked from commit 61ba791c)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

be3f3042

Revert "drm/bridge: ti-sn65dsi83: Fix enable error path" · ad81feb5

Luca Ceresoli authored Apr 26, 2024

This reverts commit 8a91b29f.

The regulator_disable() added by the original commit solves one kind of
regulator imbalance but adds another one as it allows the regulator to be
disabled one more time than it is enabled in the following scenario:

 1. Start video pipeline -> sn65dsi83_atomic_pre_enable -> regulator_enable
 2. PLL lock fails -> regulator_disable
 3. Stop video pipeline -> sn65dsi83_atomic_disable -> regulator_disable

The reason is clear from the code flow, which looks like this (after
removing unrelated code):

  static void sn65dsi83_atomic_pre_enable()
  {
      regulator_enable(ctx->vcc);

      if (PLL failed locking) {
          regulator_disable(ctx->vcc);  <---- added by patch being reverted
          return;
      }
  }

  static void sn65dsi83_atomic_disable()
  {
      regulator_disable(ctx->vcc);
  }

The use case for introducing the additional regulator_disable() was
removing the module for debugging (see link below for the discussion). If
the module is removed after a .atomic_pre_enable, i.e. with an active
pipeline from the DRM point of view, .atomic_disable is not called and thus
the regulator would not be disabled.

According to the discussion however there is no actual use case for
removing the module with an active pipeline, except for
debugging/development.

On the other hand, the occurrence of a PLL lock failure is possible due to
any physical reason (e.g. a temporary hardware failure for electrical
reasons) so handling it gracefully should be supported. As there is no way
for .atomic[_pre]_enable to report an error to the core, the only clean way
to support it is calling regulator_disabled() only in .atomic_disable,
unconditionally, as it was before.

Link: https://lore.kernel.org/all/15244220.uLZWGnKmhe@steina-w/
Fixes: 8a91b29f ("drm/bridge: ti-sn65dsi83: Fix enable error path")
Reviewed-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Signed-off-by: Robert Foss <rfoss@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426122259.46808-1-luca.ceresoli@bootlin.com
(cherry picked from commit 2940ee03)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

ad81feb5

drm/fb_dma: Add checks in drm_fb_dma_get_scanout_buffer() · 514ca22a

Jocelyn Falempe authored Apr 26, 2024

plane->state and plane->state->fb can be NULL, so add a check before
dereferencing them.
Found by testing with the imx driver.

Fixes: 879b3b65 ("drm/fb_dma: Add generic get_scanout_buffer() for drm_panic")
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426121121.241366-1-jfalempe@redhat.com
(cherry picked from commit 986c12d8)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

514ca22a

drm/fbdev-generic: Do not set physical framebuffer address · 87cb4a61

Thomas Zimmermann authored Apr 19, 2024

Framebuffer memory is allocated via vzalloc() from non-contiguous
physical pages. The physical framebuffer start address is therefore
meaningless. Do not set it.

The value is not used within the kernel and only exported to userspace
on dedicated ARM configs. No functional change is expected.

v2:
- refer to vzalloc() in commit message (Javier)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: a5b44c4a ("drm/fbdev-generic: Always use shadow buffering")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Zack Rusin <zackr@vmware.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: <stable@vger.kernel.org> # v6.4+
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
Reviewed-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Tested-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419083331.7761-2-tzimmermann@suse.de
(cherry picked from commit 73ef0aec)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

87cb4a61

drm/panthor: Fix the FW reset logic · 2fa42fd9

Boris Brezillon authored Apr 30, 2024

In the post_reset function, if the fast reset didn't succeed, we
are not clearing the fast_reset flag, which prevents firmware
sections from being reloaded. While at it, use panthor_fw_stop()
instead of manually writing DISABLE to the MCU_CONTROL register.

Fixes: 2718d918 ("drm/panthor: Add the FW logical block")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240430113727.493155-1-boris.brezillon@collabora.com

2fa42fd9

drm/panthor: Make sure we handle 'unknown group state' case properly · 8bdbd8b5

Boris Brezillon authored May 02, 2024

When we check for state values returned by the FW, we only cover part of
the 0:7 range. Make sure we catch FW inconsistencies by adding a default
to the switch statement, and flagging the group state as unknown in that
case.

When an unknown state is detected, we trigger a reset, and consider the
group as unusable after that point, to prevent the potential corruption
from creeping in other places if we continue executing stuff on this
context.

v2:
- Add Steve's R-b
- Fix commit message
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/dri-devel/3b7fd2f2-679e-440c-81cd-42fc2573b515@moroto.mountain/T/#uSuggested-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240502155248.1430582-1-boris.brezillon@collabora.com

8bdbd8b5

drm: move DRM-related CONFIG options into DRM submenu · 08f44136

Masahiro Yamada authored Apr 26, 2024

When you create a submenu using the 'menu' syntax, there is no
ambiguity about its end because the code between 'menu' and 'endmenu'
becomes the submenu.

In contrast, 'menuconfig' does not have the corresponding end marker.
Instead, the end of the submenu is inferred from symbol dependencies.

This is detailed in Documentation/kbuild/kconfig-language.rst, starting
line 348. It outlines two methods to place the code under the submenu:

 (1) Open an if-block immediately after 'menuconfig', enclosing the
     submenu content within it

 (2) Add 'depends on' to every symbol intended for the submenu

Many subsystems opt for (1) because it reliably maintains the submenu
structure.

The DRM subsystem adopts (2). The submenu ends when the sequence of
'depends on DRM' breaks. It can be confirmed by running a GUI frontend
such as 'make menuconfig' and visiting the DRM menu:

    < > Direct Rendering Manager (XFree86 4.1.0 and higher DRI support)  ----

If you toggle this, you will notice most of the DRM-related options
appear below it, not in the submenu.

I highly recommend the approach (1). Obviously, (2) is not reliable,
as the submenu breaks whenever someone forgets to add 'depends on DRM'.

This commit encloses the entire DRM configuration with 'if DRM' and
'endif', except for DRM_PANEL_ORIENTATION_QUIRKS.

Note:
 Now, 'depends on DRM' properties inside the if-block are all redundant.
 I leave it as follow-up cleanups.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240426135602.2500125-1-masahiroy@kernel.orgSigned-off-by: Maxime Ripard <mripard@kernel.org>

08f44136