Commits · a201c6ee37d63e7c0a2973fb7790e94211b7fa83 · Kirill Smelkov / linux

21 Dec, 2023 33 commits

drm/xe/bo: Evict VRAM to TT rather than to system · a201c6ee

Thomas Hellström authored Jun 26, 2023

The main difference is that we don't bounce and sync on eviction, allowing
for pipelined eviction. Moving forward we also need to be careful with
dma mappings which can be released in SYSTEM but may remain in TT.

v2:
- Remove a stale comment (Matthew Brost)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230626181741.32820-5-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a201c6ee

drm/xe/bo: Gracefully handle errors from ttm_bo_move_accel_cleanup(). · 70ff6a99

Thomas Hellström authored Jun 26, 2023

The function ttm_bo_move_accel_cleanup() attempts to help pipeline a
move, and in doing so, needs memory allocations which may fail.

Rather than failing in a state where the new resource may be freed while
accessed by the copy engine, sync uninterruptible and do a failsafe
cleanup.

v2:
- Don't try to attach the signaled fence on ttm_bo_move_accel_cleanup()
  error.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230626181741.32820-4-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

70ff6a99

drm/xe/bo: Avoid creating a system resource when allocating a fresh VRAM bo · 3439cc46

Thomas Hellström authored Jun 26, 2023

When creating a new bo, on the first move the bo->resource is typically
NULL. Our move callback rejected that instructing TTM to create a system
resource. In addition a struct ttm_tt with a page-vector was created,
although not populated with pages. Similarly when the clearing of VRAM
was complete, the system resource was put on a ghost object and freed
using the TTM delayed destroy mechanism.

This is a lot of pointless work. So avoid creating the system resource and
instead change the code to cope with a NULL bo->resource.

v2:
- Add some code comments (Matthew Brost)
v3:
- Fix a dereference of old_mem which might be NULL.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230626181741.32820-3-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

3439cc46

drm/xe/bo: Fix swapin when moving to VRAM · bc2e0215

Thomas Hellström authored Jun 26, 2023

When a source system resource had been swapped out, we incorrectly
assumed that we were lacking source data for a move and therefore
cleared the destination instead of swapping in and copying the
swapped-out data. Fix this.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230626181741.32820-2-thomas.hellstrom@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

bc2e0215

drm/xe/mtl: Add support to get C6 residency/status of MTL · 7b076d14

Badal Nilawar authored Jun 23, 2023

Add the registers to get C6 residency of MTL SAMedia and
C6 status of MTL GTs.

v2:
   - move register definitions to regs header (Anshuman)
   - correct reg definition for mtl rc status
   - make idle_status function common (Badal)

v3:
   - remove extra line in commit message
   - use only media type check in initialization
   - use graphics ver check (Anshuman)

v4:
   - remove extra lines (Anshuman)

Bspec: 66300
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7b076d14

drm/xe: add a new sysfs directory for gtidle properties · 1c2097bb

Riana Tauro authored Jun 23, 2023

1) Add a new sysfs directory under devices/gt#/ called gtidle
   to contain idle properties of GT such as name, idle_status,
   idle_residency_ms

2) Remove forcewake calls for residency counter

v2:
    - abstract using function pointers (Anshuman)
    - remove forcewake calls for residency counter
    - use device_attr (Badal)
    - move rc functions to guc_pc
    - change name to gt_idle (Rodrigo)

v3:
    - return error for drmm_add_action_or_reset
    - replace file and functions with gt_idle prefix
      to gt_idle_sysfs (Himal)
    - use enum for gt idle state
    - move multiplier to gt idle and initialize (Anshuman)
    - correct doc annotation (Rodrigo)
    - remove return variable
    - use kobj_gt instead of new gtidle kobj
    - move residency_ms to gtidle file
    - retain xe_guc_pc prefix for functions in guc_rc file (Michal)

v4:
    - fix doc errors in xe_guc_pc file
    - change u64 to u32 for reading residency counter
    - keep gtidle states generic GT_IDLE_C[0/6] (Anshuman)

v5:
    - update commit message to include removal of
      forcewake calls (Anshuman)
    - return void from sysfs initialization function and add warnings
      (Andi)

v6:
    - remove extra lines (Anshuman)
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1c2097bb

drm/xe/bo: consider bo->flags in xe_bo_migrate() · 513e8262

Matthew Auld authored Jun 19, 2023

For VRAM allocations the bo->flags can control some characteristics of
the underlying memory, like whether it needs to be contiguous, and in
the future whether it needs to be in the CPU visible portion. Rather use
add_vram() in xe_bo_migrate() which should take care of such things for
us.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

513e8262

drm/doc: include xe_drm.h · 83ee6699

Matthew Auld authored Jun 26, 2023

Make sure the uapi gets picked up by the normal docs build.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

83ee6699

drm/xe/uapi: silence kernel-doc errors · 63f9c3cd

Matthew Auld authored Jun 26, 2023

./include/uapi/drm/xe_drm.h:263: warning: Function parameter or member
'gts' not described in 'drm_xe_query_gts'

./include/uapi/drm/xe_drm.h:854: WARNING: Inline emphasis start-string
without end-string.

With the idea to also include the uapi file in the pre-merge CI hooks
when building the kernel-doc, so first make sure it's clean:

https://gitlab.freedesktop.org/drm/xe/ci/-/merge_requests/16

v2: (Francois)
  - It makes more sense to just fix the kernel-doc for 'gts'
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

63f9c3cd

drm/xe/uapi: add some kernel-doc for region query · a9c4a069

Matthew Auld authored Mar 31, 2023

Since we need to extend this, we should also take the time to add some
basic kernel-doc here for the existing bits. Note that this is all still
subject to change when upstreaming.

Also convert XE_MEM_REGION_CLASS_* into an enum, so we can more easily
create links to it from other parts of the uapi.
Suggested-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a9c4a069

drm/xe/uapi: restrict system wide accounting · 1105ac15

Matthew Auld authored Mar 31, 2023

Since this is considered an info leak (system wide accounting), rather
hide behind perfmon_capable().

v2:
  - Without perfmon_capable() it likely makes more sense to report as zero,
    instead of reporting as used == total size. This should give similar
    behaviour as i915 which rather tracks free instead of used.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1105ac15

drm/xe: Document topology mask query · 1bc56a93

Francois Dugast authored Jun 22, 2023

Provide information on the types of topology masks that can be
queried and add some examples.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1bc56a93

drm/xe: Move defines before relevant fields · 4f082f2c

Francois Dugast authored Jun 22, 2023

Align on same rule in the whole file: defines then doc then relevant
field, with an empty line to separate fields.

v2:
  - Rebase on drm-xe-next
  - Fix ordering of defines and fields in uAPI (Lucas De Marchi)
v3: Remove useless empty lines (Lucas De Marchi)
v4: Move changelog to commit
v5: Rebase
Reported-by: Oded Gabbay <ogabbay@kernel.org>
Link: https://lists.freedesktop.org/archives/intel-xe/2023-May/004704.html
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

4f082f2c

drm/xe: Document structures for device query · ffd6620f

Francois Dugast authored Jun 09, 2023

This adds documentation to the various structures used to query
memory, GTs, topology, engines, and so on. It includes a functional
code snippet to query engines.

v2:
  - Rebase on drm-xe-next
  - Also document structures related to drm_xe_device_query, changed
    pseudo code to snippet (Lucas De Marchi)
v3:
  - Move changelog to commit
  - Fix warnings shown only when using dim checkpatch
Reported-by: Oded Gabbay <ogabbay@kernel.org>
Link: https://lists.freedesktop.org/archives/intel-xe/2023-May/004704.html
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ffd6620f

drm/xe: Fix unreffed ptr leak on engine lookup · 5db4afe1

Mika Kuoppala authored Jun 02, 2023

The engine xarray holds a ref to the engine, guarded by the lock.
While looking up an engine, we need to take the ref inside
the lock to prevent an unreffed pointer from escaping and
causing a potential use-after-free.

v2: remove branch prediction hint (Thomas)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230602172732.1001057-1-mika.kuoppala@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5db4afe1
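The lookup rule described in this commit can be sketched in userspace C. This is a minimal illustration, not the driver code: a plain array and a pthread mutex stand in for the xarray and its lock, and `struct engine`, `struct engine_table`, and `engine_lookup()` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_ENGINES 8

struct engine {
	int refcount;	/* protected by the table lock in this sketch */
	int id;
};

struct engine_table {
	pthread_mutex_t lock;
	struct engine *slots[MAX_ENGINES];
};

/* Look up an engine and take a reference before dropping the lock. */
struct engine *engine_lookup(struct engine_table *t, unsigned int id)
{
	struct engine *e = NULL;

	assert(t != NULL);

	pthread_mutex_lock(&t->lock);
	if (id < MAX_ENGINES)
		e = t->slots[id];
	if (e)
		e->refcount++;	/* taken while the table still pins the engine */
	pthread_mutex_unlock(&t->lock);

	return e;	/* caller owns one reference, or gets NULL */
}
```

Taking the reference after `pthread_mutex_unlock()` would leave a window in which another thread could remove and free the engine while the caller still holds the bare pointer.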

drm/xe: Skip applying copy engine fuses · 898f86c2

Lucas De Marchi authored Jun 13, 2023

Like commit 69a3738b ("drm/i915: Skip applying copy engine fuses"),
do not apply copy engine fuses for platforms where MEML3_EN is not
relevant for determining the presence of the copy engines.
Acked-by: Gustavo Sousa <gustavo.sousa@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230613180356.2906441-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

898f86c2

drm/xe/bo: handle PL_TT -> PL_TT · 8489f30e

Matthew Auld authored Jun 15, 2023

When moving between PL_VRAM <-> PL_SYSTEM we have to use PL_TT in
the middle as a temporary resource for the actual copy. In some GL
workloads it can be seen that once the resource has been moved to the
PL_TT we might have to bail out of the ttm_bo_validate(), before
finishing the final hop. If this happens the resource is left as
TTM_PL_FLAG_TEMPORARY, and when the ttm_bo_validate() is restarted the
current placement is always seen as incompatible, requiring us to
complete the move.  However if the BO allows PL_TT as a possible
placement we can end up attempting a PL_TT -> PL_TT move (like when
running out of VRAM) which leads to explosions in xe_bo_move(), like
triggering the XE_BUG_ON(!tile).

Going from TTM_PL_FLAG_TEMPORARY with PL_TT -> PL_VRAM should already
work as-is, so it looks like we only need to worry about PL_TT -> PL_TT
and it looks like we can just treat it as a dummy move, since no real
move is needed.
Reported-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8489f30e

drm/xe: VM LRU bulk move · 7ba4c5f0

Matthew Brost authored Jun 07, 2023

Use the TTM LRU bulk move for BOs tied to a VM. Update the bulk moves
LRU position on every exec.

v2: Bulk move for compute VMs, use WARN rather than BUG
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7ba4c5f0

drm/xe: Only try to lock external BOs in VM bind · 73c09901

Matthew Brost authored Mar 27, 2023

We only need to try to lock a BO if it's external as non-external BOs
share the dma-resv with the already locked VM. Trying to lock
non-external BOs caused an issue (list corruption) in an upcoming patch
which adds bulk LRU move. Since this code isn't needed, remove it.

v2: New commit message, s/mattthew/matthew/
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

73c09901

drm/xe: Ensure LR engines are not persistent · 911cd9b3

Matthew Brost authored Apr 12, 2023

With our ref counting scheme, long running (LR) engines only close
properly if not persistent, so ensure that LR engines are non-persistent.

v2: spell out LR
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

911cd9b3

drm/xe: Long running job update · 8ae8a2e8

Matthew Brost authored May 21, 2023

For long running (LR) jobs with the DRM scheduler we must return NULL in
run_job which results in signaling the job's finished fence immediately.
This prevents LR jobs from creating infinite dma-fences.

Signaling job's finished fence immediately breaks flow controlling ring
with the DRM scheduler. To work around this, the ring is flow controlled
and written in the exec IOCTL. Signaling job's finished fence
immediately also breaks the TDR which is used in reset / cleanup entity
paths so write a new path for LR entities.

v2: Better commit, white space, remove rmb(), better comment next to
emit_job()
v3 (Thomas): Change LR reference counting, fix working in commit
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8ae8a2e8

drm/xe: NULL binding implementation · 37430402

Matthew Brost authored Jun 15, 2023

Add uAPI and implementation for NULL bindings. A NULL binding is defined
as writes dropped and read zero. A single bit in the uAPI has been added
which results in a single bit in the PTEs being set.

NULL bindings are intended to be used to implement VK sparse bindings,
in particular the residencyNonResidentStrict property.

v2: Fix BUG_ON shown in VK testing, fix check patch warning, fix
xe_pt_scan_64K, update __gen8_pte_encode to understand NULL bindings,
remove else if vma_addr
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Suggested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

37430402

drm/Xe: Use EOPNOTSUPP instead of ENOTSUPP · ee6ad137

Janga Rahul Kumar authored Jun 13, 2023

ENOTSUPP is not a standard Unix error; use
EOPNOTSUPP instead.

v2: Update commit description (Aravind)
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Janga Rahul Kumar <janga.rahul.kumar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ee6ad137
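A short userspace sketch of why the choice matters: EOPNOTSUPP is defined in the standard `<errno.h>`, so `strerror()` can describe it, whereas ENOTSUPP (524) is a kernel-internal value that userspace headers do not define. The helper names `probe_feature()` and `probe_error_string()` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical feature probe: returns 0 or a negative errno, kernel style. */
int probe_feature(int supported)
{
	return supported ? 0 : -EOPNOTSUPP;
}

/* Userspace can describe the error because EOPNOTSUPP is in <errno.h>. */
const char *probe_error_string(int ret)
{
	assert(ret <= 0);
	return ret ? strerror(-ret) : "ok";
}
```

A caller that receives `-EOPNOTSUPP` from `probe_feature(0)` can pass it straight to `probe_error_string()` and get the C library's text for the error.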

drm/xe: limit GGTT size to GUC_GGTT_TOP · ab10e976

Daniele Ceraolo Spurio authored Jun 14, 2023

The GuC can't access addresses above GUC_GGTT_TOP, so any GuC-accessible
objects can't be mapped above that offset. Instead of checking each
object to see if GuC may access it or not before mapping it, we just
limit the GGTT size to GUC_GGTT_TOP. This wastes a bit of address space
(about 18 MB, in addition to what is already removed at the bottom
of the GGTT), but it is a good tradeoff to keep the code simple.

The in-code comment has also been updated to explain the limitation.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20230615002521.2587250-1-daniele.ceraolospurio@intel.com/
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ab10e976
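The "about 18 MB" figure can be checked with a little arithmetic. The sketch below assumes a 4 GiB GGTT and a GUC_GGTT_TOP of 0xFEE00000; both values are assumptions made for illustration, not taken from this commit, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1M		(1ull << 20)
#define GGTT_FULL_SIZE	(1ull << 32)	/* assumed 4 GiB GGTT */
#define GUC_GGTT_TOP	0xFEE00000ull	/* assumed value of the limit */

/* Clamp the GGTT size so nothing can be mapped above GUC_GGTT_TOP. */
uint64_t ggtt_usable_size(uint64_t hw_size)
{
	assert(hw_size > 0);
	return hw_size > GUC_GGTT_TOP ? GUC_GGTT_TOP : hw_size;
}

/* Address space given up at the top of the GGTT by the clamp. */
uint64_t ggtt_wasted_bytes(uint64_t hw_size)
{
	return hw_size - ggtt_usable_size(hw_size);
}
```

Under these assumptions, `ggtt_wasted_bytes(GGTT_FULL_SIZE)` is 0x1200000 bytes, which is 18 MiB.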

drm/xe/mtl: Add some initial MTL workarounds · ff063430

Matt Roper authored Jun 08, 2023

This adds a handful of workarounds that apply to production steppings of
MTL:
 - Wa_14018575942
 - Wa_22016670082
 - Wa_14017856879
 - Wa_18019271663

Wa_22016670082 is currently only applied to the primary GT, but may
need to be extended to the media GT in the future if a
pending update to the workaround database gets finalized.

OOB workarounds will need to be implemented separately in future patches
for Wa_14016712196, Wa_16018063123, and Wa_18013179988.
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://lore.kernel.org/r/20230608181217.2385932-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ff063430

drm/xe: Fix check for platform without geometry pipeline · d0e2dd76

Michał Winiarski authored May 23, 2023

It's not possible for the condition checking if we're running on
platform without geometry pipeline to ever be true, since
gt->fuse_topo.g_dss_mask is an array.

It also breaks the build:
../drivers/gpu/drm/xe/xe_rtp.c:183:50: error: address of array 'gt->fuse_topo.g_dss_mask' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230523135020.345596-2-michal@hardline.pl
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

d0e2dd76
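The pitfall behind this warning can be shown in a few lines of standalone C. This is a sketch rather than the actual patch: `struct fuse_topo` is reduced to one hypothetical array member, and `dss_mask_empty()` is a hand-written stand-in for a bitmap emptiness check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DSS_MASK_WORDS 4

struct fuse_topo {
	unsigned long g_dss_mask[DSS_MASK_WORDS];
};

/* Returns true when no bit is set in the mask. */
bool dss_mask_empty(const unsigned long *mask, size_t nwords)
{
	assert(mask != NULL);
	for (size_t i = 0; i < nwords; i++)
		if (mask[i])
			return false;
	return true;
}

bool has_geometry_pipeline(const struct fuse_topo *topo)
{
	/*
	 * Writing `if (topo->g_dss_mask)` would test the array's address,
	 * which is never NULL, so the contents have to be inspected instead.
	 */
	return !dss_mask_empty(topo->g_dss_mask, DSS_MASK_WORDS);
}
```

With an all-zero mask `has_geometry_pipeline()` returns false, and it returns true as soon as any bit in any word is set.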

drm/xe: Fix uninitialized variables · 35cbfe56

Michał Winiarski authored May 23, 2023

Using uninitialized variables leads to undefined behavior.

Moreover, it causes the compiler to complain with:
../drivers/gpu/drm/xe/xe_vm.c:3265:40: error: variable 'vma' is uninitialized when used here [-Werror,-Wuninitialized]
../drivers/gpu/drm/xe/xe_rtp.c:118:36: error: variable 'i' is uninitialized when used here [-Werror,-Wuninitialized]
../drivers/gpu/drm/xe/xe_mocs.c:449:3: error: variable 'flags' is uninitialized when used here [-Werror,-Wuninitialized]
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230523135020.345596-1-michal@hardline.pl
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

35cbfe56

drm/xe: Fix GT looping for standalone media · 1e80d0c3

Riana Tauro authored Jun 13, 2023

gt_count is only being incremented when initializing the primary GT;
since the media GT sets the ID directly, gt_count is not incremented
again, resulting in an incorrect count on MTL.  Use autoincrement while
assigning the media GTs ID to ensure gt_count is correct on MTL and
other future platforms with standalone media.
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Link: https://lore.kernel.org/r/20230613094232.3703549-1-riana.tauro@intel.com
[mattrope: Tweaked commit message to focus on gt_count importance]
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1e80d0c3
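The counting fix described here amounts to assigning the ID and incrementing the count in one step. A minimal sketch, with hypothetical `struct device`, `struct gt`, and `device_add_gt()` standing in for the driver's own types:

```c
#include <assert.h>

#define MAX_GT 4

struct gt {
	int id;
	int is_media;
};

struct device {
	struct gt gts[MAX_GT];
	int gt_count;
};

/* Assign the next free ID and bump gt_count in the same expression. */
struct gt *device_add_gt(struct device *dev, int is_media)
{
	struct gt *gt;

	assert(dev->gt_count < MAX_GT);
	gt = &dev->gts[dev->gt_count];

	gt->id = dev->gt_count++;	/* primary and media GTs both advance the count */
	gt->is_media = is_media;
	return gt;
}
```

Adding a primary GT and then a media GT this way leaves `gt_count` at 2, so a loop bounded by `gt_count` visits both.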

drm/xe: Donot apply forcewake while reading actual frequency · 2846d103

Badal Nilawar authored Jun 09, 2023

RPSTAT1 is an sgunit register and thus doesn't need forcewake.
MTL_MIRROR_TARGET_WP1 is within an "always on" power domain and thus
doesn't require any forcewake to ensure the register is powered
up and usable. When the GT is in RC6, the actual frequency reported will be 0.

v2:
 - Add bspec index (Anshuman)
 - %s/GEN12_RPSTAT1/GT_PERF_STATUS as per bspec
v3: Update Fixes tag

Bspec: 51837, 67651
Fixes: dd08ebf6 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230609024954.987039-1-badal.nilawar@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

2846d103

drm/xe/guc: Normalize error messages with %#x · 6dc3a12f

Lucas De Marchi authored Jun 11, 2023

One of the messages was printed without 0x prefix, so it was not clear
if it was decimal or hex: make sure to add the prefix by using %#x.
While at it, normalize the other messages in the same function to follow
the same pattern.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20230611222447.2837573-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

6dc3a12f
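The effect of the `#` flag can be seen with the standard C `snprintf()`, which follows the same rule for `%#x`. The wrapper `format_error()` is a hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Format an error code so that the base is unambiguous. */
int format_error(char *buf, size_t len, unsigned int code)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "error %#x", code);
}
```

For a code of 16 this writes `error 0x10`, where plain `%x` would write `error 10` and could be read as decimal. One quirk: with `%#x` a value of zero is printed as `0`, without the prefix.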

drm/xe/guc: Fix typo s/enabled/enable/ · 90738d86

Lucas De Marchi authored Jun 11, 2023

Fix the log message when it fails to enable CT.
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230611222447.2837573-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

90738d86

drm/xe: Rename pte/pde encoding functions · a0ea91db

Lucas De Marchi authored Jun 11, 2023

Remove the leftover TODO by renaming the functions to use the xe prefix.
Since the static __gen8_pte_encode() already has a double underscore,
just remove the prefix.
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230611222447.2837573-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a0ea91db

drm/xe: Move XE_PTE_FLAG_READ_ONLY to xe_vm_types.h · 6713ee6c

Matthew Brost authored Jun 07, 2023

XE_PTE_FLAG_READ_ONLY is specific to struct xe_vma, move it from xe_bo.h
to xe_vm_types.h to reflect that.
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

6713ee6c

19 Dec, 2023 7 commits

drm/xe: s/XE_PTE_READ_ONLY/XE_PTE_FLAG_READ_ONLY · 3534b18c

Matthew Brost authored Jun 07, 2023

This define is for internal PTE flags rather than fields in the hardware
PTEs, rename as such. This will help in an upcoming patch to avoid
further confusion.
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

3534b18c

drm/xe: Use Xe ordered workqueue for rebind worker · 5e3220de

Matthew Brost authored Jun 09, 2023

A mix of the system unbound wq and Xe ordered wq was used for the
rebind; only use the Xe ordered wq. This will ensure only 1 rebind is
occurring at a time, providing a somewhat clunky workaround for
shortcomings in TTM wrt memory contention. Once the TTM memory contention
is resolved we should be able to use a dedicated non-ordered workqueue.

Also add helper to queue rebind worker to avoid using wrong workqueue
going forward.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5e3220de

drm/xe: Handle unmapped userptr in analyze VM · 790bdc7c

Matthew Brost authored Jun 09, 2023

A corner case exists where a userptr may have no mapping when analyze VM is
called; handle this case.
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

790bdc7c

drm/xe: Emit a render cache flush after each rcs/ccs batch · 9f8f93be

Thomas Hellström authored Jun 02, 2023

We need to flush render caches before fence signalling, where we might
release the memory for reuse. We can't rely on userspace doing this,
so flush render caches after the batch, but before user fence- and
dma_fence signalling.

Copy the cache flush from i915, but omit PIPE_CONTROL_FLUSH_L3, since it
should be implied by the other flushes. Also omit
PIPE_CONTROL_TLB_INVALIDATE since there should be no apparent need to
invalidate TLB after batch completion.

v2:
- Update Makefile for OOB WA.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com> #1
Reported-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

9f8f93be

drm/xe: Invalidate TLB also on bind if in scratch page mode · 85dbfe47

Thomas Hellström authored Jun 05, 2023

For scratch table mode we need to cover the case where a scratch PTE might
have been pre-fetched and cached and used instead of that of the newly
bound vma.
For compute vms, invalidate TLB globally using GuC before signalling
bind complete. For !long-running vms, invalidate TLB at batch start.

Also document how TLB invalidation works.

v2:
- Fix a pointer to the comment about TLB invalidation (Jose Souza).
- Add a bool to the vm whether we want to invalidate TLB at batch start.
- Invalidate TLB also on BCS- and video engines at batch start where
  needed.
- Use BIT() macro instead of explicit shift.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com> #v1
Reported-by: José Roberto de Souza <jose.souza@intel.com> #v1
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/291
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

85dbfe47

drm/xe/reg_sr: Apply limit to register whitelisting · 5eeb8b44

Gustavo Sousa authored Jun 09, 2023

If RING_MAX_NONPRIV_SLOTS denotes the maximum number of whitelisting
slots, then it makes sense to refuse going above it.

v2:
  - Use xe_gt_err() instead of drm_err() for more detailed info in the
    error message. (Matt)

Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230609143815.302540-3-gustavo.sousa@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5eeb8b44

drm/xe/reg_sr: Use a single parameter for xe_reg_sr_apply_whitelist() · 1011812c

Gustavo Sousa authored Jun 09, 2023

All other parameters can be extracted from a single struct xe_hw_engine
reference. This removes redundancy and simplifies the code.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230609143815.302540-2-gustavo.sousa@intel.com
Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1011812c