Commits · f1cb5f647e8959a1034941d85b311d7485a7095f · Kirill Smelkov / linux

21 Dec, 2023 40 commits

Vinay Belgaumkar authored Nov 17, 2023

This flag can be used to disable GuC based power management. This
could be used for debug or comparison to host based C6.

v2: Fix missing definition
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

f1cb5f64

drm/xe: Rename xe_gt_idle_sysfs to xe_gt_idle · c550f64f

Vinay Belgaumkar authored Nov 17, 2023

Prep this file to contain C6 toggling as well instead
of just sysfs related stuff.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

c550f64f

drm/xe/doc: Include documentation about xe_assert() · 8cdcef1c

Michal Wajdeczko authored Nov 15, 2023

Our xe_assert() macros are well documented.
Include that in master documentation.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20231115112921.1905-1-michal.wajdeczko@intel.comSigned-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8cdcef1c

drm/xe/guc: Include only required GuC ABI headers · b67cb798

Michal Wajdeczko authored Nov 28, 2023

On i915 we were adding new GuC ABI headers directly to guc_fwif.h
file since we were replacing old definitions from that file.

On xe driver we could do more and better by including ABI headers
only in files that need those definitions.

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/741
Cc: Jani Nikula <jani.nikula@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20231128203203.1147-3-michal.wajdeczko@intel.comSigned-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

b67cb798

drm/xe/guc: Remove obsolete GuC CTB documentation · 0a39ad21

Michal Wajdeczko authored Nov 28, 2023

Refer to already described CTB Descriptor and CTB HXG Message.
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20231128203203.1147-2-michal.wajdeczko@intel.comSigned-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0a39ad21

drm/xe/guc: Drop ancient GuC CTB definitions · e784f352

Michal Wajdeczko authored Nov 28, 2023

Those definitions were applicable for old GuC firmwares only.
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/741
Link: https://lore.kernel.org/r/20231128203203.1147-1-michal.wajdeczko@intel.comSigned-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

e784f352

drm/xe: explicitly set GGTT access for GuC DMA · 473b6276

Fei Yang authored Nov 22, 2023

Confirmed with hardware that setting GGTT memory access for GuC
firmware loading is correct for all platforms and required for
new platforms going forward.
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20231122204501.1353325-2-fei.yang@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

473b6276

drm/xe/uapi: support pat_index selection with vm_bind · e1fbc4f1

Matthew Auld authored Sep 25, 2023

Allow userspace to directly control the pat_index for a given vm
binding. This should allow directly controlling the coherency, caching
behaviour, compression and potentially other stuff in the future for the
ppGTT binding.

The exact meaning behind the pat_index is very platform specific (see
BSpec or PRMs) but effectively maps to some predefined memory
attributes. From the KMD pov we only care about the coherency that is
provided by the pat_index, which falls into either NONE, 1WAY or 2WAY.
The vm_bind coherency mode for the given pat_index needs to be at least
1way coherent when using cpu_caching with DRM_XE_GEM_CPU_CACHING_WB. For
platforms that lack the explicit coherency mode attribute, we treat
UC/WT/WC as NONE and WB as AT_LEAST_1WAY.

For userptr mappings we lack a corresponding gem object, so the expected
coherency mode is instead implicit and must fall into either 1WAY or
2WAY. Trying to use NONE will be rejected by the kernel. For imported
dma-buf (from a different device) the coherency mode is also implicit
and must also be either 1WAY or 2WAY.

v2:
  - Undefined coh_mode(pat_index) can now be treated as programmer
    error. (Matt Roper)
  - We now allow gem_create.coh_mode <= coh_mode(pat_index), rather than
    having to match exactly. This ensures imported dma-buf can always
    just use 1way (or even 2way), now that we also bundle 1way/2way into
    at_least_1way. We still require 1way/2way for external dma-buf, but
    the policy can now be the same for self-import, if desired.
  - Use u16 for pat_index in uapi. u32 is massive overkill. (José)
  - Move as much of the pat_index validation as we can into
    vm_bind_ioctl_check_args. (José)
v3 (Matt Roper):
  - Split the pte_encode() refactoring into separate patch.
v4:
  - Rebase
v5:
  - Check for and reject !coh_mode which would indicate hw reserved
    pat_index on xe2.
v6:
  - Rebase on removal of coh_mode from uapi. We just need to reject
    cpu_caching=wb + pat_index with coh_none.

Testcase: igt@xe_pat
Bspec: 45101, 44235 #xe
Bspec: 70552, 71582, 59400 #xe2
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Zhengguo Xu <zhengguo.xu@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Tested-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Zhengguo Xu <zhengguo.xu@intel.com>
Acked-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

e1fbc4f1

drm/xe/pat: annotate pat_index with coherency mode · f6a22e68

Matthew Auld authored Aug 17, 2023

Future uapi needs to give userspace the ability to select the pat_index
for a given vm_bind. However we need to be able to extract the coherency
mode from the provided pat_index to ensure it's compatible with the
cpu_caching mode set at object creation. There are various security
reasons for why this matters.  However the pat_index itself is very
platform specific, so seems reasonable to annotate each platform
definition of the pat table.  On some older platforms there is no
explicit coherency mode, so we just pick whatever makes sense.

v2:
  - Simplify with COH_AT_LEAST_1_WAY
  - Add some kernel-doc
v3 (Matt Roper):
  - Some small tweaks
v4:
  - Rebase
v5:
  - Rebase on Xe2 PAT additions
v6:
  - Rebase on removal of coh_mode from uapi

Bspec: 45101, 44235 #xe
Bspec: 70552, 71582, 59400 #xe2
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Zhengguo Xu <zhengguo.xu@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Pallavi Mishra <pallavi.mishra@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

f6a22e68

drm/xe/uapi: Add support for CPU caching mode · 622f709c

Pallavi Mishra authored Aug 11, 2023

Allow userspace to specify the CPU caching mode at object creation.
Modify gem create handler and introduce xe_bo_create_user to replace
xe_bo_create. In a later patch we will support setting the pat_index as
part of vm_bind, where expectation is that the coherency mode extracted
from the pat_index must be least 1way coherent if using cpu_caching=wb.

v2
  - s/smem_caching/smem_cpu_caching/ and
    s/XE_GEM_CACHING/XE_GEM_CPU_CACHING/. (Matt Roper)
  - Drop COH_2WAY and just use COH_NONE + COH_AT_LEAST_1WAY; KMD mostly
    just cares that zeroing/swap-in can't be bypassed with the given
    smem_caching mode. (Matt Roper)
  - Fix broken range check for coh_mode and smem_cpu_caching and also
    don't use constant value, but the already defined macros. (José)
  - Prefer switch statement for smem_cpu_caching -> ttm_caching. (José)
  - Add note in kernel-doc for dgpu and coherency modes for system
    memory. (José)
v3 (José):
  - Make sure to reject coh_mode == 0 for VRAM-only.
  - Also make sure to actually pass along the (start, end) for
    __xe_bo_create_locked.
v4
  - Drop UC caching mode. Can be added back if we need it. (Matt Roper)
  - s/smem_cpu_caching/cpu_caching. Idea is that VRAM is always WC, but
    that is currently implicit and KMD controlled. Make it explicit in
    the uapi with the limitation that it currently must be WC. For VRAM
    + SYS objects userspace must now select WC. (José)
  - Make sure to initialize bo_flags. (José)
v5
  - Make to align with the other uapi and prefix uapi constants with
    DRM_ (José)
v6:
  - Make it clear that zero cpu_caching is only allowed for kernel
    objects. (José)
v7: (Oak)
  - With all the changes from the original design, it looks we can
    further simplify here and drop the explicit coh_mode. We can just
    infer the coh_mode from the cpu_caching. i.e reject cpu_caching=wb +
    coh_none. It's one less thing for userspace to maintain so seems
    worth it.
v8:
  - Make sure to also update the kselftests.

Testcase: igt@xe_mmap@cpu-caching
Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Co-developed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Zhengguo Xu <zhengguo.xu@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Zhengguo Xu <zhengguo.xu@intel.com>
Acked-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

622f709c

drm/xe/kunit: Return number of iterated devices · 5bb83841

Michal Wajdeczko authored Nov 15, 2023

In xe_call_for_each_device() we are already counting number of
iterated devices. Lets make that available to the caller too.
We will use that functionality in upcoming patches.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20231115115816.1993-1-michal.wajdeczko@intel.comSigned-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5bb83841

drm/xe: fix mem_access for early lrc generation · fcf98d68

Matthew Auld authored Nov 27, 2023

We spawn some hw queues during device probe to generate the default LRC
for every engine type, however the queue destruction step is typically
async. Queue destruction needs to do stuff like GuC context deregister
which requires GuC CT, which in turn requires an active mem_access ref.
The caller during probe is meant to hold the mem_access token, however
due to the async destruction it might have already been dropped if we
are unlucky.

Similar to how we already handle migrate VMs for which there is no
mem_access ref, fix this by keeping the callers token alive, releasing
it only when destroying the queue. We can treat a NULL vm as indication
that we need to grab our own extra ref.

Fixes the following splat sometimes seen during load:

[ 1682.899930] WARNING: CPU: 1 PID: 8642 at drivers/gpu/drm/xe/xe_device.c:537 xe_device_assert_mem_access+0x27/0x30 [xe]
[ 1682.900209] CPU: 1 PID: 8642 Comm: kworker/u24:97 Tainted: G     U  W   E    N 6.6.0-rc3+ #6
[ 1682.900214] Workqueue: submit_wq xe_sched_process_msg_work [xe]
[ 1682.900303] RIP: 0010:xe_device_assert_mem_access+0x27/0x30 [xe]
[ 1682.900388] Code: 90 90 90 66 0f 1f 00 0f 1f 44 00 00 53 48 89 fb e8 1e 6c 03 00 48 85 c0 74 06 5b c3 cc cc cc cc 8b 83 28 23 00 00 85 c0 75 f0 <0f> 0b 5b c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90
[ 1682.900390] RSP: 0018:ffffc900021cfb68 EFLAGS: 00010246
[ 1682.900394] RAX: 0000000000000000 RBX: ffff8886a96d8000 RCX: 0000000000000000
[ 1682.900396] RDX: 0000000000000001 RSI: ffff8886a6311a00 RDI: ffff8886a96d8000
[ 1682.900398] RBP: ffffc900021cfcc0 R08: 0000000000000001 R09: 0000000000000000
[ 1682.900400] R10: ffffc900021cfcd0 R11: 0000000000000002 R12: 0000000000000004
[ 1682.900402] R13: 0000000000000000 R14: ffff8886a6311990 R15: ffffc900021cfd74
[ 1682.900405] FS:  0000000000000000(0000) GS:ffff888829880000(0000) knlGS:0000000000000000
[ 1682.900407] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1682.900409] CR2: 000055f70bad3fb0 CR3: 000000025243a004 CR4: 00000000003706e0
[ 1682.900412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1682.900413] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1682.900415] Call Trace:
[ 1682.900418]  <TASK>
[ 1682.900420]  ? xe_device_assert_mem_access+0x27/0x30 [xe]
[ 1682.900504]  ? __warn+0x85/0x170
[ 1682.900510]  ? xe_device_assert_mem_access+0x27/0x30 [xe]
[ 1682.900596]  ? report_bug+0x171/0x1a0
[ 1682.900604]  ? handle_bug+0x3c/0x80
[ 1682.900608]  ? exc_invalid_op+0x17/0x70
[ 1682.900612]  ? asm_exc_invalid_op+0x1a/0x20
[ 1682.900621]  ? xe_device_assert_mem_access+0x27/0x30 [xe]
[ 1682.900706]  ? xe_device_assert_mem_access+0x12/0x30 [xe]
[ 1682.900790]  guc_ct_send_locked+0xb9/0x1550 [xe]
[ 1682.900882]  ? lock_acquire+0xca/0x2b0
[ 1682.900885]  ? guc_ct_send+0x3c/0x1a0 [xe]
[ 1682.900977]  ? lock_is_held_type+0x9b/0x110
[ 1682.900984]  ? __mutex_lock+0xc0/0xb90
[ 1682.900989]  ? __pfx___drm_printfn_info+0x10/0x10
[ 1682.900999]  guc_ct_send+0x53/0x1a0 [xe]
[ 1682.901090]  ? __lock_acquire+0xf22/0x21b0
[ 1682.901097]  ? process_one_work+0x1a0/0x500
[ 1682.901109]  xe_guc_ct_send+0x19/0x50 [xe]
[ 1682.901202]  set_min_preemption_timeout+0x75/0xa0 [xe]
[ 1682.901294]  disable_scheduling_deregister+0x55/0x250 [xe]
[ 1682.901383]  ? xe_sched_process_msg_work+0x76/0xd0 [xe]
[ 1682.901467]  ? lock_release+0xc9/0x260
[ 1682.901474]  xe_sched_process_msg_work+0x82/0xd0 [xe]
[ 1682.901559]  process_one_work+0x20a/0x500

v2: Add the splat
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

fcf98d68

drm/xe/gsc: Define GSC FW for MTL · 5152234e

Daniele Ceraolo Spurio authored Nov 17, 2023

We track GSC FW based on its compatibility version, which is what
determines the interface it supports.
Also add a modparam override like the ones for GuC and HuC.

v2: fix module param description (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5152234e

drm/xe/gsc: Define GSCCS for MTL · 9897eb85

Daniele Ceraolo Spurio authored Nov 17, 2023

Add the GSCCS to the media_xelpmp engine list. Note that since the
GSCCS is only used with the GSC FW, we can consider it disabled if we
don't have the FW available.

v2: mark GSCCS as allowed on the media IP in kunit tests
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

9897eb85

drm/xe/gsc: Query GSC compatibility version · 0881cbe0

Daniele Ceraolo Spurio authored Nov 17, 2023

The version is obtained via a dedicated MKHI GSC HECI command.
The compatibility version is what we want to match against for the GSC,
so we need to call the FW version checker after obtaining the version.

Since this is the first time we send a GSC HECI command via the GSCCS,
this patch also introduces common infrastructure to send such commands
to the GSC. Communication with the GSC FW is done via input/output
buffers, whose addresses are provided via a GSCCS command. The buffers
contain a generic header and a client-specific packet (e.g. PXP, HDCP);
the clients don't care about the header format and/or the GSCCS command
in the batch, they only care about their client-specific header. This
patch therefore introduces helpers that allow the callers to
automatically fill in the input header, submit the GSCCS job and decode
the output header, to make it so that the caller only needs to worry about
their client-specific input and output messages.

v3: squash of 2 separate patches ahead of merge, so that the common
functions and their first user are added at the same time
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.Com> #v1
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0881cbe0

drm/xe/gsc: Trigger a driver flr to cleanup the GSC on unload · f63182b4

Daniele Ceraolo Spurio authored Nov 17, 2023

GSC is only killed by an FLR, so we need to trigger one on unload to
make sure we stop it. This is because we assign a chunk of memory to
the GSC as part of the FW load, so we need to make sure it stops
using it when we release it to the system on driver unload. Note that
this is not a problem of the unload per-se, because the GSC will not
touch that memory unless there are requests for it coming from the
driver; therefore, no accesses will happen while Xe is not loaded,
but if we re-load the driver then the GSC might wake up and try to
access that old memory location again.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

f63182b4

drm/xe/gsc: Implement WA 14015076503 · aae84bf1

Daniele Ceraolo Spurio authored Nov 17, 2023

When the GSC FW is loaded, we need to inform it when a GSCCS reset is
coming and then wait 200ms for it to get ready to process the reset.

v2: move WA code to GSC file, use variable in Makefile (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

aae84bf1

drm/xe/gsc: GSC FW load · dd0e89e5

Daniele Ceraolo Spurio authored Nov 17, 2023

The GSC FW must be copied in a 4MB stolen memory allocation, whose GGTT
address is then passed as a parameter to a dedicated load instruction
submitted via the GSC engine.

Since the GSC load is relatively slow (up to 250ms), we perform it
asynchronously via a worker. This requires us to make sure that the
worker has stopped before suspending/unloading.

Note that we can't yet use xe_migrate_copy for the copy because it
doesn't work with stolen memory right now, so we do a memcpy from the
CPU side instead.

v2: add comment about timeout value, fix GSC status checking
    before load (John)

Bspec: 65306, 65346
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

dd0e89e5

drm/xe/gsc: Parse GSC FW header · 985d5a49

Daniele Ceraolo Spurio authored Nov 17, 2023

The GSC blob starts with a layout header, from which we can move to the
boot directory, which in turns allows us to find the CPD. The CPD uses
the same format as the one in the HuC binary, so we can re-use the same
parsing code to get to the manifest, which contains the release and
security versions of the FW.

v2: Fix comments in struct definition (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

985d5a49

drm/xe/gsc: Introduce GSC FW · 0d1caff4

Daniele Ceraolo Spurio authored Nov 17, 2023

Add the basic definitions and init function. Same as HuC, GSC is only
supported on the media GT on MTL and newer platforms.
Note that the GSC requires submission resources which can't be allocated
during init (because we don't have the hwconfig yet), so it can't be
marked as loadable at the end of the init function. The allocation of
those resources will come in the patch that makes use of them to load
the FW.

v2: better comment, move num FWs define inside the enum (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0d1caff4

drm/xe/uc: Rework uC version tracking · 2e7227b4

Daniele Ceraolo Spurio authored Nov 17, 2023

The GSC firmware, support for which is coming soon for Xe, has both a
release version (updated on every release) and a compatibility version
(update only on interface changes). The GuC has something similar, with
a global release version and a submission version (which is also known
as the VF compatibility version). The main difference is that for the
GuC we still want to check the driver requirement against the release
version, while for the GSC we'll need to check against the compatibility
version.
Instead of special casing the GSC, this patch reworks the FW logic so
that we store both versions at the uc_fw level for all binaries and we
allow checking against either of the versions. Initially, we'll use it
to support GSC, but the logic could be re-used to allow VFs to check
against the GuC compatibility version.
Note that the GSC version has 4 numbers (major, minor, hotfix, build),
so support for that has been added as part of the rework and will be
used in follow-up patches.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

2e7227b4

drm/xe: Encapsulate all the module parameters · adce1b39

Bommithi Sakeena authored Nov 17, 2023

Encapsulate all the module parameters in one single global struct
variable. This also removes the extra xe_module.h from includes.

v2: naming consistency as suggested by Jani and Lucas
v3: fix checkpatch errors/warnings
v4: adding blank line after struct declaration

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Bommithi Sakeena <bommithi.sakeena@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

adce1b39

drm/xe/xe2: Add workaround 14020013138 · a409901f

Tejas Upadhyay authored Nov 20, 2023

This workaround applies to Xe2_LPG A0

V3:
  - Apply rule RENDER class
V2(Matt):
  - Apply WA in lrc context
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a409901f

drm/xe/dg2: Drop Wa_22014600077 · f91bacce

Matt Roper authored Nov 27, 2023

The workaround database has been updated to drop this workaround for all
DG2 variants.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20231127190332.4099519-2-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

f91bacce

drm/xe: Sync MTL PCI IDs with i915 · 812ec747

Lucas De Marchi authored Nov 21, 2023

For Xe1 platforms, it's better to follow the way i915 adds the PCI IDs
to the header, so it's easier to catch up when there is an update. This
brings the same logic applied in commit 2e3c369f ("drm/i915/mtl:
Eliminate subplatforms") to the equivalent xe header.

The end result of this header for Xe1 platforms is now in sync with i915
as of commit 5032c607 ("drm/i915: ATS-M device ID update"). This can
be seen by

	$ git show 5032c607:include/drm/i915_pciids.h > a.h
	$ git diff --color-words  --no-index a.h include/drm/xe_pciids.h
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231121195209.802235-2-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

812ec747

drm/xe: Internally change the compute_mode and no_dma_fence mode naming · fdb6a053

Thomas Hellström authored Nov 27, 2023

The name "compute_mode" can be confusing since compute uses either this
mode or fault_mode to achieve the long-running semantics, and compute_mode
can, moving forward, enable fault_mode under the hood to work around
hardware limitations.

Also the name no_dma_fence_mode really refers to what we elsewhere call
long-running mode and the mode contrary to what its name suggests allows
dma-fences as in-fences.

So in an attempt to be more consistent, rename
no_dma_fence_mode -> lr_mode
compute_mode      -> preempt_fence_mode

And adjust flags so that

preempt_fence_mode sets XE_VM_FLAG_LR_MODE
fault_mode sets XE_VM_FLAG_LR_MODE | XE_VM_FLAG_FAULT_MODE

v2:
- Fix a typo in the commit message (Oak Zeng)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231127123349.23698-1-thomas.hellstrom@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

fdb6a053

drm/xe/vm: Fix ASID XA usage · d2f51c50

Thomas Hellström authored Nov 24, 2023

xa_alloc_cyclic() returns 1 on successful allocation, if wrapping occurs,
but the code incorrectly treats that as an error. Fix that.
Also, xa_alloc_cyclic() requires xa_init_flags(..., XA_FLAGS_ALLOC), so
fix that, and assuming we don't want a zero ASID, instead of using
XA_FLAGS_ALLOC1, adjust the xa limits at alloc_cyclic time.

v2:
- On CONFIG_DRM_XE_DEBUG, Initialize the cyclic ASID allocation in such a
  way that the next allocated ASID will be the maximum one, and the one
  following will cause an ASID wrap, (all to have CI test high ASIDs
  and ASID wraps).
v3:
- Stricter return value checking from xa_alloc_cyclic() (Matthew Auld)
Suggested-by: Ohad Sharabi <osharabi@habana.ai>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ohad Sharabi <osharabi@habana.ai> #v1
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231124153345.97385-5-thomas.hellstrom@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

d2f51c50

drm/xe/bo: Remove leftover trace_printk() · e7c9e049

Thomas Hellström authored Nov 22, 2023

trace_printk() is not intended for production code. Remove it.
Suggested-by: Ohad Sharabi <osharabi@habana.ai>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Ohad Sharabi <osharabi@habana.ai>
Link: https://patchwork.freedesktop.org/patch/msgid/20231122110359.4087-4-thomas.hellstrom@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

e7c9e049

drm/xe/bo: Rename xe_bo_get_sg() to xe_bo_sg() · a21fe5ee

Thomas Hellström authored Nov 22, 2023

Using "get" typically refers to obtaining a refcount, which we don't do
here so rename to xe_bo_sg().
Suggested-by: Ohad Sharabi <osharabi@habana.ai>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Ohad Sharabi<osharabi@habana.ai>
Link: https://patchwork.freedesktop.org/patch/msgid/20231122110359.4087-3-thomas.hellstrom@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a21fe5ee

drm/xe: Ensure that we don't access the placements array out-of-bounds · 8c54ee8a

Thomas Hellström authored Nov 23, 2023

Ensure, using xe_assert that the various try_add_<placement> functions
don't access the bo placements array out-of-bounds.

v2:
- Remove the places argument to make sure the xe_assert operates on
  the array we're actually populating. (Matthew Auld)
Suggested-by: Ohad Sharabi <osharabi@habana.ai>
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ohad Sharabi <osharabi@habana.ai> #v1
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231123153158.12779-2-thomas.hellstrom@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8c54ee8a

drm/xe: Print virtualization mode during probe · 2475ac27

Michal Wajdeczko authored Nov 15, 2023

We already print some basic information about the device, add
virtualization information, until we expose that elsewhere.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231115073804.1861-3-michal.wajdeczko@intel.comSigned-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

2475ac27

drm/xe: Prepare for running in different SR-IOV modes · 13e5c32c

Michal Wajdeczko authored Nov 15, 2023

We will be adding support for the SR-IOV and driver might be then
running, in addition to existing non-virtualized bare-metal mode,
also in Physical Function (PF) or Virtual Function (VF) mode.

Since these additional modes require some changes to the driver,
define enum flag to represent different SR-IOV modes and add a
function where we will detect the actual mode in the runtime.

We start with a forced bare-metal mode as it is sufficient to
enable basic functionality and ensures no impact to existing code.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231115073804.1861-2-michal.wajdeczko@intel.comSigned-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

13e5c32c

drm/xe: Add device flag to indicate SR-IOV support · d6d14854

Michal Wajdeczko authored Nov 15, 2023

The Single Root I/O Virtualization (SR-IOV) extension to
the PCI Express (PCIe) specification suite is supported
starting from 12th generation of Intel Graphics processors.

Add a device flag that we will use to enable SR-IOV specific
code paths and to indicate our readiness to support SR-IOV.

We will enable this flag for the specific platforms once all
required changes and additions will be ready and merged.

Bspec: 52391
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20231115073804.1861-1-michal.wajdeczko@intel.comSigned-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

d6d14854

drm/xe/xe2: Add workaround 14019449301 · 8bfbe174

Tejas Upadhyay authored Nov 23, 2023

This workaround applies to Xe2_LPM

V3(MattR):
  - Reorder reg and wa placement
  - Add base parameter to reg macro for better definition
V2(MattR):
  - Change name of register
  - Loop for all engines
  - Driver permanent WA, applies to all steps
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8bfbe174

drm/xe/xe2: Add workaround 16021867713 · f25d8291

Tejas Upadhyay authored Nov 23, 2023

This workaround applies to Xe2_LPM as well
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

f25d8291

drm/xe/xe2: Add workaround 14017421178 · 11ea758c

Tejas Upadhyay authored Nov 20, 2023

This workaround applies to Xe2_LPM
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

11ea758c

drm/xe/mmio: Make xe_mmio_wait32() aware of interrupts · b3f0654f

Gustavo Sousa authored Nov 16, 2023

With the current implementation, a preemption or other kind of interrupt
might happen between xe_mmio_read32() and ktime_get_raw(). Such an
interruption (specially in the case of preemption) might be long enough
to cause a timeout without giving a chance of a new check on the
register value on a next iteration, which would have happened otherwise.

This issue causes some sporadic timeouts in some code paths. As an
example, we were experiencing some rare timeouts when waiting for PLL
unlock for C10/C20 PHYs (see intel_cx0pll_disable()). After debugging,
we found out that the PLL unlock was happening within the expected time
period (20us), which suggested a bug in xe_mmio_wait32().

To fix the issue, ensure that we do a last check out of the loop if
necessary.

This change was tested with the aforementioned PLL unlocking code path.
Experiments showed that, before this change, we observed reported
timeouts in 54 of 5000 runs; and, after this change, no timeouts were
reported in 5000 runs.

v2:
  - Prefer an implementation without a barrier (v1 switched the order of
    xe_mmio_read32() and ktime_get_raw() calls and added a barrier() in
    between). (Lucas, Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20231116214000.70573-3-gustavo.sousa@intel.comSigned-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

b3f0654f

drm/xe/mmio: Move xe_mmio_wait32() to xe_mmio.c · 5c09bd6c

Gustavo Sousa authored Nov 16, 2023

This function is big enough, let's move it to a shared compilation unit.

While at it, document it.

Here is the output of running bloat-o-metter on the new and old module
(execution provided by Lucas):

    $ ./scripts/bloat-o-meter build64/drivers/gpu/drm/xe/xe.ko{.old,}
    add/remove: 2/0 grow/shrink: 0/58 up/down: 554/-15645 (-15091)
    (...) # Lines in between omitted
    Total: Before=2181322, After=2166231, chg -0.69%

The overall reduction in the size is not that significant. Nevertheless,
keeping the function as inline arguably does not bring too much benefit
as well.

As noted by Lucas, we would probably benefit from an inline
function that did the fast-path check: do an optimistic first check
before entering the wait-logic, which itself would go to a compilation
unit. We might come back to implement this in the future if we have data
to justify it.

v2:
  - Add note in documentation for @timeout_us regarding the exponential
    backoff strategy. (Lucas)
  - Share output of bloat-o-meter in the commit message. (Lucas)
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20231116214000.70573-2-gustavo.sousa@intel.comSigned-off-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5c09bd6c

drm/xe: Add mocs kunit · a6a4ea6d

Ruthuvikas Ravikumar authored Nov 17, 2023

This kunit verifies the hardware values of mocs and
l3cc registers with the KMD programmed values.

v14: Fix CHECK.

v13: Remove ret after forcewake.

v11: Add KUNIT_ASSERT_EQ_MSG for Forcewake.

v9/v10: Add Forcewake Fail.

v8: Remove xe_bo.h and xe_pm.h
    Remove mocs and l3cc from live_mocs.
    Pull debug and err msg for mocs/l3cc out of if else block.
    Add HAS_LNCF_MOCS.

v7: correct checkpath

v6: Change ssize_t type.
    Change forcewake domain to XE_FW_GT.
    Update change of MOCS registers are multicast on Xe_HP and beyond
    patch.

v5: Release forcewake.
    Remove single statement braces.
    Fix debug statements.

v4: Drop stratch and vaddr.
    Fix debug statements.
    Fix indentation.

v3: Fix checkpath.

v2: Fix checkpath.

Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Mathew D Roper <matthew.d.roper@intel.com>
Reviewed-by: Mathew D Roper <matthew.d.roper@intel.com>
Signed-off-by: Ruthuvikas Ravikumar <ruthuvikas.ravikumar@intel.com>
Link: https://lore.kernel.org/r/20231116215152.2248859-1-ruthuvikas.ravikumar@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a6a4ea6d

drm/xe: Fix modpost warning on kunit modules · 1d425066

Lucas De Marchi authored Nov 20, 2023

When built with W=1, the following warnings show up on modpost:

MODPOST drivers/gpu/drm/xe/Module.symvers
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_bo_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_dma_buf_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_migrate_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_pci_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_rtp_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_wa_test.o

Add the module description for each of these to fix the warning.
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://lore.kernel.org/r/20231120221904.695630-1-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1d425066