Commits · 1c7531f50eaa425eca8ff726287b8df3a4a51e55 · Kirill Smelkov / linux

12 Jan, 2024 6 commits

drm/xe: display support should not depend on EXPERT · 1c7531f5

Jani Nikula authored Jan 11, 2024

Remove the DRM_XE_DISPLAY config dependency on EXPERT. I can only
presume the idea was only experts should be able to disable it, but the
effect is the opposite.
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240111104716.3548744-1-jani.nikula@intel.com

drm/xe: Fix bounds checking in __xe_bo_placement_for_flags() · 52e3fa3e

Brian Welty authored Jan 10, 2024

Requesting all memory regions on PVC will fill bo->placements up to
XE_BO_MAX_PLACEMENTS. The subsequent call to try_add_stolen() will trip
over the bounds checking even though XE_PL_STOLEN is not expected to
be used in this case.

This is hit with igt@xe_exec_fault_mode@once-basic-prefetch:
xe 0000:8c:00.0: [drm] Assertion `*c < (sizeof(bo->placements) / sizeof((bo->placements)[0]) + ((int)(sizeof(struct { int:(-!!(__builtin_types_compatible_p(typeof((bo->placements)), typeof(&(bo->placements)[0])))); }))))` failed!
WARNING: CPU: 30 PID: 6161 at drivers/gpu/drm/xe/xe_bo.c:203 __xe_bo_placement_for_flags+0x218/0x240 [xe]

This is fixed here by moving the bounds checks closer to where we actually
write into the bo->placements array.
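
The idea of checking the bound only where a slot is written can be sketched as follows. This is a minimal illustration, not the driver's code: `MAX_PLACEMENTS`, `struct placements`, `add_placement()` and `try_add_optional()` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PLACEMENTS 3	/* hypothetical stand-in for XE_BO_MAX_PLACEMENTS */

struct placements {
	int slots[MAX_PLACEMENTS];
};

/* Check the bound only at the point where a slot is actually written. */
static bool add_placement(struct placements *p, size_t *count, int region)
{
	if (*count >= MAX_PLACEMENTS)
		return false;	/* would overflow: refuse instead of writing */
	p->slots[(*count)++] = region;
	return true;
}

/* Optional region: when it is not requested, nothing is written or checked. */
static bool try_add_optional(struct placements *p, size_t *count,
			     bool wanted, int region)
{
	if (!wanted)
		return true;
	return add_placement(p, count, region);
}
```

With this shape, a full array is not an error unless another write is actually attempted.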

Fixes: 8c54ee8a ("drm/xe: Ensure that we don't access the placements array out-of-bounds")
Link: https://patchwork.freedesktop.org/patch/msgid/20240111002111.10190-1-brian.welty@intel.com
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

drm/xe/migrate: Cap PTEs written by MI_STORE_DATA_IMM to 510 · ca630876

Matt Roper authored Jan 11, 2024

Although MI_STORE_DATA_IMM's "length" field is 10 bits wide, 0x3FE is
the largest legal value it accepts. Since that instruction field is
always encoded in (val-2) format, this translates to 0x400 dwords for
the true maximum length of the instruction. Subtracting the instruction
header (1 dword) and address (2 dwords) leaves 0x3FD dwords, which hold
at most 0x1FE (510) whole qwords of PTE values.
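
The arithmetic above can be checked with a few lines of C. The enum and function names are made up for this illustration; only the numbers come from the commit message.

```c
#include <assert.h>

enum {
	LENGTH_FIELD_MAX = 0x3FE,	/* largest legal encoded length value */
	LENGTH_BIAS      = 2,		/* the field is encoded as (val - 2) */
	HEADER_DW        = 1,		/* instruction header */
	ADDRESS_DW       = 2,		/* destination address */
};

/* True maximum instruction length in dwords. */
static unsigned int max_instr_dwords(void)
{
	return LENGTH_FIELD_MAX + LENGTH_BIAS;
}

/* Dwords left for payload once header and address are subtracted. */
static unsigned int max_payload_dwords(void)
{
	return max_instr_dwords() - HEADER_DW - ADDRESS_DW;
}

/* Each PTE is one qword (2 dwords); a trailing odd dword is unusable. */
static unsigned int max_ptes(void)
{
	return max_payload_dwords() / 2;
}
```

The odd payload size (0x3FD dwords) is why the cap is 510 rather than 511 PTEs.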

Bspec: 60246, 45753
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20240111220238.1467572-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: Fix potential deadlock in __fini_dbm · 1113e52f

Michal Wajdeczko authored Jan 11, 2024

If the Doorbell Manager is in an unclean state during the fini phase, we
try to print its state for debug purposes, but we missed the fact that
we are already holding a lock, so xe_guc_db_mgr_print() will deadlock
since it also attempts to grab the same lock.
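
One common way to avoid this class of self-deadlock is to split the printer into a variant that takes the lock and a "locked" variant for callers that already hold it. The sketch below models the lock with a plain flag; every name in it is hypothetical, and it is not claimed to be how the driver was actually fixed.

```c
#include <assert.h>
#include <stdbool.h>

struct mgr {
	bool locked;	/* models a non-recursive lock */
	int in_use;	/* entries still allocated */
};

/* Returns false where a real non-recursive lock would deadlock. */
static bool mgr_lock(struct mgr *m)
{
	if (m->locked)
		return false;
	m->locked = true;
	return true;
}

static void mgr_unlock(struct mgr *m)
{
	m->locked = false;
}

/* Caller must already hold the lock. */
static int mgr_print_locked(const struct mgr *m)
{
	return m->in_use;
}

/* Takes the lock itself, so it must not be called with the lock held. */
static int mgr_print(struct mgr *m)
{
	int v;

	if (!mgr_lock(m))
		return -1;
	v = mgr_print_locked(m);
	mgr_unlock(m);
	return v;
}

static int mgr_fini(struct mgr *m)
{
	int leaked = 0;

	if (!mgr_lock(m))
		return -1;
	if (m->in_use)
		leaked = mgr_print_locked(m);	/* not mgr_print(): lock is held */
	mgr_unlock(m);
	return leaked;
}
```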

Fixes: 587c7334 ("drm/xe: Introduce GuC Doorbells Manager")
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20240111185603.673-1-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe: Allow to exclude part of GGTT from allocations · 33ff1f21

Michal Wajdeczko authored Jan 11, 2024

Soon we will be required to exclude some GGTT addresses from
allocations, since on some platforms running in SR-IOV VF mode
we will be able to use only a selected range of the GGTT space.

Add helper functions to manage such GGTT range exclusions, and
follow the naming from the similar concept used by GVT-g.
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://lore.kernel.org/r/20240111182559.629-1-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe/guc: Use HXG definitions on HXG messages · d4978a67

Michal Wajdeczko authored Jan 11, 2024

While parsing and processing CTB G2H messages we should extract the
underlying HXG message and use HXG definitions on that message.
Applying HXG definitions to the outer CTB layer message requires a
shifted dword index, which might be confusing:

	FIELD_GET(GUC_HXG_MSG_0_xxx, msg[1])

instead of:

	FIELD_GET(GUC_HXG_MSG_0_xxx, hxg[0])
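
A small user-space sketch of the same idea: strip the CTB layer once, then index the HXG message from zero. The one-dword header follows from msg[1] and hxg[0] naming the same dword above; the mask, the shift and the function names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CTB_HDR_DW		1u	/* one CTB header dword precedes the HXG payload */
#define HXG_MSG_0_TYPE_SHIFT	28	/* hypothetical field layout */
#define HXG_MSG_0_TYPE_MASK	(0x7u << HXG_MSG_0_TYPE_SHIFT)

/* Operates on an HXG message, so dword 0 really is HXG dword 0. */
static uint32_t hxg_type(const uint32_t *hxg)
{
	return (hxg[0] & HXG_MSG_0_TYPE_MASK) >> HXG_MSG_0_TYPE_SHIFT;
}

/* Operates on a CTB message: skip the header once, then use HXG helpers. */
static uint32_t ctb_msg_type(const uint32_t *msg)
{
	const uint32_t *hxg = msg + CTB_HDR_DW;

	return hxg_type(hxg);
}
```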

Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20240111210632.717-1-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

11 Jan, 2024 7 commits

drm/xe/guc: Return CTB response length · d898c2e5

Michal Wajdeczko authored Jan 11, 2024

Not all CTB responses from the GuC are fixed size, so we need to
pass the response length to the caller if there was a response_buffer.
The easiest solution is to return it as a positive value from all
xe_guc_ct_send_recv() functions. The CTB response length is always
between 1 and 254 (i.e. GUC_HXG_MSG_MIN_LEN and GUC_CTB_MAX_DWORDS
- GUC_HXG_MSG_MIN_LEN).
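
The "negative errno on failure, positive length on success" convention can be sketched like this. The function name, its parameters and the chosen error codes are illustrative assumptions, not the driver's interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copies a response into buf.
 * Returns the number of dwords copied (always > 0) or a negative errno.
 */
static int send_recv(const uint32_t *resp, size_t resp_len,
		     uint32_t *buf, size_t buf_len)
{
	if (resp_len == 0)
		return -EPROTO;		/* a response is at least one dword */
	if (resp_len > buf_len)
		return -ENOBUFS;	/* caller's buffer is too small */
	memcpy(buf, resp, resp_len * sizeof(*buf));
	return (int)resp_len;
}
```

Because a valid length is never zero, a caller can tell the three outcomes apart from the sign of the return value alone.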

Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20240111152724.497-1-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe/guc: Treat non-response message after BUSY as unexpected · 3c01e012

Michal Wajdeczko authored Jan 11, 2024

Once the GuC has replied with a GUC_HXG_TYPE_NO_RESPONSE_BUSY message,
we may expect that only a RESPONSE_SUCCESS or FAILURE message will
be sent; anything else is a violation of the HXG protocol.
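
The rule can be written as a small predicate. The enum below is a simplified stand-in for the HXG message types, and the set of replies accepted before any BUSY has been seen is an assumption made for this sketch; only the post-BUSY rule comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>

enum hxg_type {
	HXG_REQUEST,
	HXG_EVENT,
	HXG_BUSY,
	HXG_RETRY,
	HXG_FAILURE,
	HXG_SUCCESS,
};

/* After BUSY, only SUCCESS or FAILURE may follow. */
static bool reply_is_expected(bool busy_seen, enum hxg_type type)
{
	if (busy_seen)
		return type == HXG_SUCCESS || type == HXG_FAILURE;

	/* Assumed: before BUSY, any reply type is acceptable. */
	return type == HXG_BUSY || type == HXG_RETRY ||
	       type == HXG_FAILURE || type == HXG_SUCCESS;
}
```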

Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20240111154838.541-1-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe: Split GuC communication initialization · 88cbf850

Michal Wajdeczko authored Jan 11, 2024

Soon we will be trying to communicate with the GuC firmware very
early during VF driver probe, before we finish normal init steps.
Split GuC communication initialization code so the GuC MMIO based
communication xe_guc_mmio_send() functions will work where needed.

Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20240111162051.585-1-michal.wajdeczko@intel.com
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe/migrate: Fix CCS copy for small VRAM copy chunks · ef51d754

Thomas Hellström authored Jan 10, 2024

Since the migrate code is using the identity map for addressing VRAM,
copy chunks may become as small as 64K if the VRAM resource is fragmented.

However, a chunk size smaller than 1MiB may cause the *next* chunk's
offset into the CCS metadata backup memory to not be page-aligned. The
XY_CTRL_SURF_COPY_BLT command can't handle that, and even if it could,
the current code doesn't handle the offset calculation correctly.

To fix this, make sure we align the size of VRAM copy chunks to 1MiB. If
the remaining data to copy is smaller than that, that's not a problem,
so use the remaining size. If the VRAM copy chunk becomes fragmented due
to the size alignment restriction, don't use the identity map, but instead
emit PTEs into the page-table like we do for system memory.
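
A rough sketch of the sizing rule described above, under stated assumptions: `next_vram_chunk()`, `struct chunk` and the `max_pass` limit are hypothetical, and rounding down to a 1MiB multiple is one plausible reading of "align". It is not the driver's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_1M (1024ull * 1024ull)

struct chunk {
	uint64_t size;		/* bytes to copy in this pass */
	bool use_identity_map;	/* false: emit PTEs instead */
};

/*
 * remaining:  bytes still to copy
 * contiguous: bytes available in the current contiguous VRAM extent
 * max_pass:   assumed upper limit on bytes per pass
 */
static struct chunk next_vram_chunk(uint64_t remaining, uint64_t contiguous,
				    uint64_t max_pass)
{
	uint64_t size = remaining < max_pass ? remaining : max_pass;
	uint64_t extent = contiguous > SZ_1M ? contiguous : SZ_1M;
	struct chunk c;

	if (size > extent)
		size = extent;
	if (size > SZ_1M)
		size -= size % SZ_1M;	/* keep chunk sizes 1MiB-aligned */

	c.size = size;
	/* A chunk that spans more than one extent is fragmented. */
	c.use_identity_map = contiguous >= size;
	return c;
}
```

For example, with 10MiB left and only a 64KiB contiguous extent, the sketch yields a 1MiB chunk that is marked as needing PTEs rather than the identity map.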

v2:
- Rebase
v3:
- Future proof somewhat by taking into account the real data size to
  flat CCS metadata size ratio. (Matt Roper)
- Invert a couple of if-statements for better readability.
- Fix support for 4K-granularity VRAM sizes. (Tested on DG1).
v4:
- Fix up code comments
- Fix debug printout format typo.
v5:
- Add a Fixes: tag.

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Fixes: e89b384c ("drm/xe/migrate: Update emit_pte to cope with a size level than 4k")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240110163415.524165-1-thomas.hellstrom@linux.intel.com

drm/xe: unlock on error path in xe_vm_add_compute_exec_queue() · cf46019e

Dan Carpenter authored Jan 05, 2024

Drop the "&vm->lock" before returning.
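
The usual shape of such a fix is a single unlock label that every exit path after the lock goes through. In this illustrative sketch the lock is modelled as a counter, and `struct vm` and `add_queue()` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>

struct vm {
	int lock_depth;	/* stands in for vm->lock being held */
	int has_queue;
};

static int add_queue(struct vm *vm, int fail)
{
	int err = 0;

	vm->lock_depth++;		/* take the lock */

	if (fail) {
		err = -ENOMEM;
		goto out_unlock;	/* do not return with the lock held */
	}
	vm->has_queue = 1;

out_unlock:
	vm->lock_depth--;		/* single release point for all paths */
	return err;
}
```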

Fixes: 24f947d5 ("drm/xe: Use DRM GPUVM helpers for external- and evicted objects")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

drm/xe/selftests: Fix an error pointer dereference bug · 88ec2352

Dan Carpenter authored Jan 05, 2024

Check if "bo" is an error pointer before calling xe_bo_lock() on it.
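
For readers unfamiliar with error pointers, here is a user-space imitation of the pattern. The lowercase helpers mimic the kernel's ERR_PTR(), IS_ERR() and PTR_ERR() but are reimplemented here for illustration; `bo_create()` and `lock_and_use()` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static inline void *err_ptr(long err)
{
	return (void *)err;
}

/* Error pointers occupy the top MAX_ERRNO addresses. */
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *p)
{
	return (long)p;
}

struct bo {
	int locked;
};

static struct bo the_bo;

static struct bo *bo_create(int fail)
{
	return fail ? err_ptr(-ENOMEM) : &the_bo;
}

static int lock_and_use(int fail)
{
	struct bo *bo = bo_create(fail);

	if (is_err(bo))
		return (int)ptr_err(bo);	/* bail out before touching bo */

	bo->locked = 1;		/* stands in for locking the object */
	bo->locked = 0;
	return 0;
}
```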

Fixes: d6abc18d ("drm/xe/xe2: Modify xe_bo_test for system memory")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

drm/xe/device: clean up on error in probe() · c10da95a

Dan Carpenter authored Jan 05, 2024

This error path should clean up before returning.

Smatch detected this bug:
  drivers/gpu/drm/xe/xe_device.c:487 xe_device_probe() warn: missing unwind goto?

Fixes: 4cb12b71 ("drm/xe/xe2: Determine bios enablement for flat ccs on igfx")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

10 Jan, 2024 9 commits

drm/xe: Invert access counter queue head / tail · 7c0f97cb

Matthew Brost authored Jan 09, 2024

Convention for queues in Linux is the producer moves the head and
consumer moves the tail. Fix the access counter queue to conform to
this convention.
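
The convention can be illustrated with a minimal ring buffer in which the producer only advances head and the consumer only advances tail. The queue size and all names are assumptions for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_NUM_DW 8u	/* assumed size, in dwords */

struct queue {
	uint32_t data[QUEUE_NUM_DW];
	uint32_t head;	/* next slot to write, moved by the producer */
	uint32_t tail;	/* next slot to read, moved by the consumer */
};

/* Producer side: writes at head, then advances head. */
static bool queue_push(struct queue *q, uint32_t v)
{
	uint32_t next = (q->head + 1) % QUEUE_NUM_DW;

	if (next == q->tail)
		return false;	/* full: one slot is kept free */
	q->data[q->head] = v;
	q->head = next;
	return true;
}

/* Consumer side: reads at tail, then advances tail. */
static bool queue_pop(struct queue *q, uint32_t *v)
{
	if (q->tail == q->head)
		return false;	/* empty */
	*v = q->data[q->tail];
	q->tail = (q->tail + 1) % QUEUE_NUM_DW;
	return true;
}
```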

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Add build on bug to assert access counter queue works · d0ca70c0

Matthew Brost authored Jan 09, 2024

If ACC_QUEUE_NUM_DW % ACC_MSG_LEN_DW != 0 then the access counter queue
logic does not work when wrapping occurs. Add a build bug on to assert
ACC_QUEUE_NUM_DW % ACC_MSG_LEN_DW == 0 to enforce this restriction and
document the code.
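
In user-space C the same compile-time guard can be expressed with `_Static_assert`, the C11 counterpart of the kernel's BUILD_BUG_ON(). The two constants use assumed values, and `advance()` is a hypothetical helper showing why the divisibility matters.

```c
#include <assert.h>
#include <stdint.h>

#define ACC_QUEUE_NUM_DW 128u	/* assumed value for illustration */
#define ACC_MSG_LEN_DW   4u	/* assumed value for illustration */

/* Fails the build if a message could straddle the wrap-around point. */
_Static_assert(ACC_QUEUE_NUM_DW % ACC_MSG_LEN_DW == 0,
	       "queue size must be a whole number of messages");

/* Advance a queue position by one message, wrapping at the end. */
static uint32_t advance(uint32_t pos)
{
	return (pos + ACC_MSG_LEN_DW) % ACC_QUEUE_NUM_DW;
}
```

Because the queue holds a whole number of messages, a position that starts at 0 always stays a multiple of the message length, so a message is never split across the wrap.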

v2:
- s/NUM_ACC_QUEUE/ACC_QUEUE_NUM_DW (Brian)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Invert page fault queue head / tail · 1fd77cea

Matthew Brost authored Jan 09, 2024

Convention for queues in Linux is the producer moves the head and
consumer moves the tail. Fix the page fault queue to conform to this
convention.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Add build on bug to assert page fault queue works · 86f41f43

Matthew Brost authored Jan 09, 2024

If PF_QUEUE_NUM_DW % PF_MSG_LEN_DW != 0 then the page fault queue logic
does not work when wrapping occurs. Add a build bug on to assert
PF_QUEUE_NUM_DW % PF_MSG_LEN_DW == 0 to enforce this restriction and
document the code.

v2:
- s/NUM_PF_QUEUE/PF_QUEUE_NUM_DW (Brian)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Remove set_job_timeout_ms() from exec_queue_ops · 801e8c7e

Brian Welty authored Jan 10, 2024

This function is no longer used as the job_timeout is now
updated prior to calling queue_ops.init().
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

drm/xe: Finish refactoring of exec_queue_create · 25ce7c50

Brian Welty authored Jan 10, 2024

Setting of exec_queue user extensions is moved from the end of the ioctl
function to an earlier point, in __xe_exec_queue_alloc().
This fixes a bug where the USM attributes for access counters were
applied too late and were effectively ignored.

However, in order to apply user extensions this early, we can no longer
call q->ops functions.  Instead, make it more efficient. The user extension
functions can simply update the q->sched_props values and they will be
applied by the backend during q->ops->init().
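
The "record early, apply at init" pattern can be sketched as below. The structures and function names are simplified, hypothetical stand-ins for the exec queue, its sched_props and the backend init hook.

```c
#include <assert.h>

struct sched_props {
	unsigned int job_timeout_ms;
};

struct exec_queue {
	struct sched_props sched_props;
	unsigned int backend_timeout_ms;	/* what the backend actually uses */
	int initialized;
};

/* User extension handler: only records the value, no backend call. */
static void apply_user_timeout(struct exec_queue *q, unsigned int ms)
{
	q->sched_props.job_timeout_ms = ms;
}

/* Backend init: picks up whatever was recorded before it ran. */
static void backend_init(struct exec_queue *q)
{
	q->backend_timeout_ms = q->sched_props.job_timeout_ms;
	q->initialized = 1;
}
```

The extension handler therefore never needs a working backend, which is what allows it to run before initialization.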

v2: minor changes for readability (Matt)
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

drm/xe: Add exec_queue.sched_props.job_timeout_ms · 6ae24344

Brian Welty authored Jan 10, 2024

The purpose here is to allow optimizing exec_queue_set_job_timeout()
in a follow-on patch.  Currently it does q->ops->set_job_timeout(...).
But we'd like to apply exec_queue_user_extensions much earlier and
q->ops cannot be called before __xe_exec_queue_init().

It will be much more efficient to instead only have to set
q->sched_props.job_timeout_ms when applying user extensions. That value
will then be used during q->ops->init().
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

drm/xe: Refactor __xe_exec_queue_create() · 6e144a7d

Brian Welty authored Jan 10, 2024

Split __xe_exec_queue_create() into two functions, alloc and init.

We have an issue in that exec_queue_user_extensions are applied too late.
In the case of USM properties, these need to be set prior to xe_lrc_init().
Refactor the logic here, so we can resolve this in a follow-on patch. We
only need the xe_vm_lock held during the exec_queue_init function.
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

drm/xe: Fix build bug for GCC 11 · a109d199

Paul E. McKenney authored Jan 10, 2024

Building drivers/gpu/drm/xe/xe_gt_pagefault.c with GCC 11 results
in the following build errors:

./include/linux/fortify-string.h:57:33: error: writing 16 bytes into a region of size 0 [-Werror=stringop-overflow=]
   57 | #define __underlying_memcpy     __builtin_memcpy
      |                                 ^
./include/linux/fortify-string.h:644:9: note: in expansion of macro ‘__underlying_memcpy’
  644 |         __underlying_##op(p, q, __fortify_size);                        \
      |         ^~~~~~~~~~~~~
./include/linux/fortify-string.h:689:26: note: in expansion of macro ‘__fortify_memcpy_chk’
  689 | #define memcpy(p, q, s)  __fortify_memcpy_chk(p, q, s,                  \
      |                          ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/xe/xe_gt_pagefault.c:340:17: note: in expansion of macro ‘memcpy’
  340 |                 memcpy(pf_queue->data + pf_queue->tail, msg, len * sizeof(u32));
      |                 ^~~~~~
In file included from drivers/gpu/drm/xe/xe_device_types.h:17,
                 from drivers/gpu/drm/xe/xe_vm_types.h:16,
                 from drivers/gpu/drm/xe/xe_bo.h:13,
                 from drivers/gpu/drm/xe/xe_gt_pagefault.c:16:
drivers/gpu/drm/xe/xe_gt_types.h:102:25: note: at offset [1144, 265324] into destination object ‘tile’ of size 8
  102 |         struct xe_tile *tile;
      |                         ^~~~

Fix these by removing -Wstringop-overflow from drm/xe builds.

Closes: https://lore.kernel.org/all/45ad1d0f-a10f-483e-848a-76a30252edbe@paulmck-laptop/
Fixes: 7a8bc117 ("drm/xe: Enable W=1 warnings by default")
Suggested-by: Stephen Rothwell <sfr@rothwell.id.au>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[ This particular warning is broken on GCC11. In future changes it will
  be moved to the normal C flags in the top level Makefile (out of
  Makefile.extrawarn), but accounting for the compiler support. Just
  remove it out of xe's forced extra warnings for now ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

09 Jan, 2024 14 commits

drm/xe: Check skip_guc_pc before setting SLPC flag · 69cac0a8

Vinay Belgaumkar authored Jan 08, 2024

Don't set SLPC GuC feature ctl flag if skip_guc_pc is true.

v2: Skip the freq related sysfs creation as well (Badal)
v3: Remove unnecessary parenthesis (Lucas)

Fixes: 975e4a37 ("drm/xe: Manually setup C6 when skip_guc_pc is set")
Fixes: bef52b5c ("drm/xe: Create a xe_gt_freq component for raw management and sysfs")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Link: https://lore.kernel.org/r/20240108225842.966066-1-vinay.belgaumkar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Add vram frequency sysfs attributes · 4ae3aeab

Sujaritha Sundaresan authored Jan 09, 2024

Add vram frequency sysfs attributes under the below hierarchy:

/device/tile#/memory/freq0
			|-max_freq
			|-min_freq

v2: Drop "vram" from attribute names (Rodrigo)

v3: Add documentation for new sysfs (Riana)
    Drop prefix from XEHP_PCODE_FREQUENCY_CONFIG (Riana)

v4: Create sysfs under tile#/freq0 after removal of
    physical_memsize attribute

v5: Revert back to creating sysfs under tile#/memory/freq0
    Remove definition of GT_FREQUENCY_MULTIPLIER (Rodrigo)

v6: Rename attributes to max/min_freq (Anshuman)
    Fix review comments (Rodrigo)

v7: Make documentation more verbose
    Move sysfs to separate file (Anshuman)

v8: Fix platform specific conditions and add kernel doc (Anshuman)
    Fix typos and remove redundant headers (Riana)

v9: Fix typo (Riana)
    Change function name to include "sysfs" (Lucas)
Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://lore.kernel.org/r/20240109110418.2065101-1-sujaritha.sundaresan@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Fix modifying exec_queue priority in xe_migrate_init · a8004af3

Brian Welty authored Jan 05, 2024

After exec_queue has been created, we cannot simply modify q->priority.
This needs to be done by the backend via q->ops. However in this case,
it would be more efficient to simply pass a flag when creating the
exec_queue and set the desired priority upfront during queue creation.

To that end: new flag EXEC_QUEUE_FLAG_HIGH_PRIORITY is introduced.
The priority field is moved to be with other scheduling properties and
is now exec_queue.sched_props.priority. This is no longer set to initial
value by the backend, but is now set within __xe_exec_queue_create().

Fixes: b4eecedc ("drm/xe: Fix potential deadlock handling page faults")
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

drm/xe: Fix guc_exec_queue_set_priority · b16483f9

Brian Welty authored Jan 05, 2024

We need to set q->priority prior to calling guc_exec_queue_add_msg() as
that will call init_policies() and set the scheduling properties to those
stored in the exec_queue.

Fixes: dd08ebf6 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

drm/xe/xe2_lpg: Add Wa_16018610683 · 9fbedddf

Shekhar Chauhan authored Jan 09, 2024

Force max 128KB SLM during WMTP PASS1 Restore.

BSpec: 70202
Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20240109055550.679289-1-shekhar.chauhan@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe: Annotate xe_ttm_stolen_mgr::mapping with __iomem · dcddb6f0

Thomas Hellström authored Jan 09, 2024

The pointer points to IO memory, but the __iomem annotation was
incorrectly placed. Annotate it correctly, update its usage accordingly
and fix the corresponding sparse error.

Fixes: d8b52a02 ("drm/xe: Implement stolen memory.")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-5-thomas.hellstrom@linux.intel.com

drm/xe: Annotate multiple mmio pointers with __iomem · 9d612ee5

Thomas Hellström authored Jan 09, 2024

There are a couple of pointers pointing to MMIO space. Annotate them
with __iomem and fix the corresponding sparse warnings.

Fixes: dd08ebf6 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Fixes: 3b0d4a55 ("drm/xe: Move register MMIO into xe_tile")
Fixes: 399a1332 ("drm/xe: add 28-bit address support in struct xe_reg")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Koby Elbaz <kelbaz@habana.ai>
Cc: Ofir Bitton <obitton@habana.ai>
Cc: Moti Haimovski <mhaimovski@habana.ai>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-4-thomas.hellstrom@linux.intel.com

drm/xe: Annotate xe_mem_region::mapping with __iomem · 20855b62

Thomas Hellström authored Jan 09, 2024

The pointer points to IO memory, but the __iomem annotation was
incorrectly placed. Annotate it correctly, update its usage accordingly
and fix the corresponding sparse error.

Fixes: 0887a2e7 ("drm/xe: Make xe_mem_region struct")
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-3-thomas.hellstrom@linux.intel.com

drm/xe: Use __iomem for the regs pointer · 9d03bf30

Thomas Hellström authored Jan 09, 2024

The regs pointer points to IO memory. Annotate it properly and
fix the corresponding sparse warning.

Fixes: a4e2f3a2 ("drm/xe: refactor xe_mmio_probe_tiles to support MMIO extension")
Cc: Koby Elbaz <kelbaz@habana.ai>
Cc: Ofir Bitton <obitton@habana.ai>
Cc: Moti Haimovski <mhaimovski@habana.ai>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240109112405.108136-2-thomas.hellstrom@linux.intel.com

drm/xe/vm: Fix an error path · 9d0c1c56

Thomas Hellström authored Dec 22, 2023

If using the VM_BIND_OP_UNMAP_ALL without any bound vmas for the
vm, we will end up dereferencing an uninitialized variable and leak a
bo lock. Fix this.

v2:
- Updated commit message (Lucas De Marchi)
Reported-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Closes: https://lore.kernel.org/intel-xe/jrwua7ckbiozfcaodx4gg2h4taiuxs53j5zlpf3qzvyhyiyl2d@pbs3plurokrj/
Suggested-by: Dafna Hirschfeld <dhirschfeld@habana.ai>
Fixes: b06d47be ("drm/xe: Port Xe to GPUVA")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231222175904.16732-1-thomas.hellstrom@linux.intel.com

drm/xe/guc: Only take actions in CT irq handler if CTs are enabled · 5030e161

Matthew Brost authored Jan 02, 2024

Protect entire IRQ handler by CT being enabled rather than just G2H
handler.

v2: Return on not enabled in CT irq handler (Michal)
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/xe: Fix exec IOCTL long running exec queue ring full condition · 97d0047c

Matthew Brost authored Jan 04, 2024

The intent is to return -EWOULDBLOCK to the user if a long running exec
queue is full during the exec IOCTL. -EWOULDBLOCK aliases to -EAGAIN
which results in the exec IOCTL doing a retry loop. Fix this by ensuring
the retry loop is broken when returning -EWOULDBLOCK.
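
On Linux, EWOULDBLOCK is defined as EAGAIN, so a loop that retries on -EAGAIN cannot tell the two apart by value. One way to break out, sketched here with a hypothetical `skip_retry` flag and a simulated submission (not the driver's code), is to record the reason separately.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * contended_tries: number of attempts that fail with a retryable -EAGAIN
 * ring_full:       simulate a full long-running exec queue
 * attempts:        out parameter, number of loop iterations performed
 */
static int exec_ioctl(int contended_tries, bool ring_full, int *attempts)
{
	bool skip_retry;
	int err;

	*attempts = 0;
	do {
		skip_retry = false;
		(*attempts)++;

		if (ring_full) {
			err = -EWOULDBLOCK;	/* may equal -EAGAIN */
			skip_retry = true;	/* but must not be retried */
		} else if (contended_tries-- > 0) {
			err = -EAGAIN;		/* genuinely retryable */
		} else {
			err = 0;
		}
	} while (err == -EAGAIN && !skip_retry);

	return err;
}
```

With the flag, a full ring returns to the caller after one attempt, while ordinary contention is still retried.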

Fixes: 8ae8a2e8 ("drm/xe: Long running job update")
Reported-by: Sai Gowtham Ch <sai.gowtham.ch@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>

drm/xe/exec: reserve fence slot for CPU bind · f4e8ab46

Matthew Auld authored Dec 13, 2023

It looks possible to switch from CPU binding to GPU binding mid exec, and
if that happens for the same dma-resv we might use two fence slots, one
for the dummy fence, and another for the actual GPU bind.

References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/698
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>

drm/xe/exec: move fence reservation · 29f424eb

Matthew Auld authored Dec 13, 2023

We currently assume that we can upfront know exactly how many fence
slots we will need at the start of the exec, however the TTM bo_validate
can itself consume numerous fence slots, and due to how the
dma_resv_reserve_fences() works it only ensures that at least that many
fence slots are available. With this it is quite possible that TTM
steals some of the fence slots and then when it comes time to do the vma
binding and final exec stage we are lacking enough fence slots, leading
to some nasty BUG_ON(). A simple fix is to reserve our own fences later,
after the validate stage.

References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/698
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>

08 Jan, 2024 4 commits

drm/xe/dgfx: Release mmap mappings on rpm suspend · fa78e188

Badal Nilawar authored Jan 04, 2024

Release all mmap mappings for all vram objects which are associated
with userfault such that, while the PCIe function is in D3hot, any
access to memory mappings will raise a userfault.

Upon userfault, in order to access memory mappings, if the graphics
function is in D3 then a runtime resume of the dGPU will be triggered
to transition to D0.

v2:
  - Avoid iomem check before bo migration check as bo can migrate
    to system memory (Matthew Auld)
v3:
  - Delete bo userfault link during bo destroy
  - Upon bo move (vram-smem), do bo userfault link deletion in
    xe_bo_move_notify instead of xe_bo_move (Thomas Hellström)
  - Grab lock in rpm hook while deleting bo userfault link (Matthew Auld)
v4:
  - Add kernel doc and wrap vram_userfault related
    stuff in the structure (Matthew Auld)
  - Get rpm wakeref before taking dma reserve lock (Matthew Auld)
  - In suspend path apply lock for entire list op
    including list iteration (Matthew Auld)
v5:
  - Use mutex lock instead of spin lock
v6:
  - Fix review comments (Matthew Auld)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #For the xe_bo_move_notify() changes
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20240104130702.950078-1-badal.nilawar@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/xe2: synchronise CS_CHICKEN1 with WMTP support · ddb5bade

Nirmoy Das authored Jan 04, 2024

Recommendation is to read FUSE4 register to check if WMTP has been
enabled/disabled by HW. If enabled we don't need to do anything special,
however if disabled recommendation is to also disable the WMTP mode in
the FF_SLICE_CS_CHICKEN2 register, falling back to thread-group and
mid-batch preemption only. However on Linux, the per-context CS_CHICKEN1
is how userspace controls pre-emption, so instead use the default lrc to
disable WMTP using CS_CHICKEN1, if disabled by HW. Userspace is still
free to set CS_CHICKEN1 to whatever they want later.

v2: remove redundant version check and also add descriptive name (Matt)
v3: remove usage of REG_FIELD_GET (Matt)

Cc: Matt Roper <matthew.d.roper@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20240104182615.21327-1-nirmoy.das@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>

drm/xe/kunit: Drop xe_wa tests for pre-production DG2 · be8755a0

Lucas De Marchi authored Dec 21, 2023

As workarounds for pre-production DG2 were dropped in
commit 707d1b992cfe ("drm/xe/dg2: Drop pre-production workarounds"),
there's no point running the kunit tests for them. Drop
those steppings from kunit.

Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231221163213.3849523-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>

drm/xe: Fix spelling mistake "gueue" -> "queue" · 264ed178

Colin Ian King authored Jan 02, 2024

There is a spelling mistake in a drm_info message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20240102092014.3347566-1-colin.i.king@gmail.com
