- 30 Apr, 2024 1 commit
-
-
Nirmoy Das authored
This fixes commit c4f18703 ("drm/xe: Add xe_gt_tlb_invalidation_range and convert PT layer to use this") which added the end variable as part of the function param. v2: Add fixes tag (Matt) Fixes: c4f18703 ("drm/xe: Add xe_gt_tlb_invalidation_range and convert PT layer to use this") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240429203039.26918-1-nirmoy.das@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
-
- 29 Apr, 2024 2 commits
-
-
Lucas De Marchi authored
In order to detect duplicate implementations for the same workaround, early in the implementation of RTP it was decided to error out even if the values set are exactly the same. With the introduction of 18034896535 in commit 74671d23 ("drm/xe/xe2: Add workaround 18034896535"), LNL stepping with graphics stepping A1 now gives the following error on module load: xe 0000:00:02.0: [drm] *ERROR* GT0: [GT OTHER] \ discarding save-restore reg e48c (clear: 00000200, set: 00000200,\ masked: yes, mcr: yes): ret=-22 RTP may be improved in the future, but for now simply join the entries like done with e.g. "1607297627, 1607030317, 1607186500". Fixes: 74671d23 ("drm/xe/xe2: Add workaround 18034896535") Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240427135339.3485559-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
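The duplicate check described above can be sketched in userspace C. This is a minimal illustration, not the driver's code: `struct sr_entry`, `struct sr_table`, `SR_MAX` and `sr_add()` are hypothetical names, and the sketch assumes that any second entry for the same register is rejected. The real RTP check may be finer-grained.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified save-restore entry; not the driver's struct. */
struct sr_entry {
	uint32_t reg;
	uint32_t clr_bits;
	uint32_t set_bits;
};

#define SR_MAX 16	/* arbitrary capacity for this sketch */

struct sr_table {
	struct sr_entry entries[SR_MAX];
	size_t count;
};

/*
 * Add an entry, rejecting any register that is already present,
 * even when the new values are identical to the stored ones.
 */
static int sr_add(struct sr_table *t, uint32_t reg, uint32_t clr, uint32_t set)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->entries[i].reg == reg)
			return -EINVAL;	/* -22 on Linux, matching ret=-22 in the log */

	if (t->count == SR_MAX)
		return -ENOSPC;

	t->entries[t->count++] = (struct sr_entry){ reg, clr, set };
	return 0;
}
```

Under this model, two workarounds that program the same register have to be expressed as one joined entry, which is the approach the commit takes.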
-
Shekhar Chauhan authored
Add Wa_14021490052 for Xe2HPG 20.01. Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240424034247.1352755-1-shekhar.chauhan@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
-
- 26 Apr, 2024 25 commits
-
-
Matthew Brost authored
IGTs (e.g. xe_vm) can provide the exact same coverage as the PT update selftest. The PT update selftest depends on internal functions which can change, so maintaining this test is costly and provides no extra coverage. Delete this test. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-14-matthew.brost@intel.com
-
Matthew Brost authored
xe_gt_tlb_invalidation_range accepts a start and end address rather than a VMA. This will enable multiple VMAs to be invalidated in a single invalidation. Update the PT layer to use this new function. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-13-matthew.brost@intel.com
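To illustrate why a start/end interface allows several VMAs to share one invalidation, here is a minimal userspace sketch. `struct vma_range` and `covering_range()` are hypothetical helpers, and half-open `[start, end)` ranges are an assumption of the sketch rather than something the commit states.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical VMA description: a half-open address range [start, end). */
struct vma_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Compute the smallest [start, end) span covering every VMA in the array,
 * so one range-based invalidation can replace one invalidation per VMA.
 * Returns 0 on success, -1 if the array is empty.
 */
static int covering_range(const struct vma_range *vmas, size_t n,
			  uint64_t *start, uint64_t *end)
{
	if (n == 0)
		return -1;

	*start = vmas[0].start;
	*end = vmas[0].end;
	for (size_t i = 1; i < n; i++) {
		if (vmas[i].start < *start)
			*start = vmas[i].start;
		if (vmas[i].end > *end)
			*end = vmas[i].end;
	}
	return 0;
}
```

A covering span also invalidates any gaps between the VMAs. That costs some extra invalidation work but does not affect correctness.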
-
Matthew Brost authored
Rather than adding a ufence to a VMA in the bind function, add the ufence to all VMAs in the IOCTL that require binds in vm_bind_ioctl_ops_fini. This helps with the transition to 1 job per VM bind IOCTL. v2: - Rebase v3: - Fix typo in commit (Oak) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-12-matthew.brost@intel.com
-
Matthew Brost authored
Rather than checking for an unsignaled ufence at unbind time, check for this during the op_lock_and_prep function. This helps with the transition to 1 job per VM bind IOCTL. v2: - Rebase v3: - Fix typo in commit message (Oak) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-11-matthew.brost@intel.com
-
Matthew Brost authored
Simplify VM bind code by signaling out-fences / destroying VMAs in a single location. This will help with the transition to a single job for many bind ops. v2: - s/vm_bind_ioctl_ops_install_fences/vm_bind_ioctl_ops_fini (Oak) - Set last fence in vm_bind_ioctl_ops_fini (Oak) Cc: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-10-matthew.brost@intel.com
-
Matthew Brost authored
This will help with moving to single jobs for many bind operations. v2: - Rebase Cc: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-9-matthew.brost@intel.com
-
Matthew Brost authored
In an effort to make multiple VMA bind operations atomic (1 job), all device page table updates will be implemented via a xe_vma_ops (atomic unit) interface. Add a xe_vma_rebind function which is implemented using the xe_vma_ops interface. Use xe_vma_rebind in GPU page faults for rebinds rather than directly calling a deprecated function in the PT layer. v3: - Update commit message (Oak) v4: - Fix tile_mask argument (CI) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-8-matthew.brost@intel.com
-
Matthew Brost authored
Clean up everything in VM bind IOCTL in 1 path for both errors and non-errors. Also move VM bind IOCTL cleanup from ops (also used by non-IOCTL binds) to the VM bind IOCTL. v2: - Break ops_execute on error (Oak) Cc: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-7-matthew.brost@intel.com
-
Matthew Brost authored
All page table updates are moving to a xe_vma_ops interface to implement 1 job per VM bind IOCTL. Convert xe_vm_rebind to use a xe_vma_ops based interface. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-6-matthew.brost@intel.com
-
Matthew Brost authored
Having a structure which encapsulates a list of VMA operations will help enable 1 job for the entire list. v2: - Rebase Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-5-matthew.brost@intel.com
-
Matthew Brost authored
All non-binding operations in VM bind IOCTL should be in the lock and prepare step rather than the execution step. Move prefetch to conform to this pattern. v2: - Rebase - New function names (Oak) - Update stale comment (Oak) Cc: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-4-matthew.brost@intel.com
-
Matthew Brost authored
Add ops_execute function which returns a fence. This will be helpful to initiate all binds (VM bind IOCTL, rebinds in exec IOCTL, rebinds in preempt rebind worker, and rebinds in pagefaults) via a gpuva ops list. Returning a fence is needed in various paths. v2: - Rebase Cc: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-3-matthew.brost@intel.com
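The idea of running a whole list of operations and handing back a single fence can be sketched as follows. This is a userspace model only: `struct fence`, `struct bind_op`, `struct bind_ops` and the `should_fail` test hook are hypothetical, and the fence is returned through an out-parameter because the kernel's error-pointer convention is not available here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the driver's real types. */
struct fence {
	unsigned int seqno;	/* identifies the last submitted operation */
};

struct bind_op {
	int should_fail;	/* test hook: simulate a failing operation */
};

struct bind_ops {
	struct bind_op *ops;
	size_t count;
};

/*
 * Run every operation in order and stop at the first failure.
 * On success, *out describes the last operation, so a caller can wait
 * on one fence for the whole list. Returns 0 or a negative errno.
 * On failure, *out is left untouched.
 */
static int ops_execute(const struct bind_ops *list, struct fence *out)
{
	unsigned int seqno = 0;

	for (size_t i = 0; i < list->count; i++) {
		if (list->ops[i].should_fail)
			return -EIO;
		seqno++;
	}

	out->seqno = seqno;
	return 0;
}
```

Every caller path named in the commit message (bind IOCTL, exec IOCTL rebinds, the preempt rebind worker, page faults) could then wait on the same kind of result.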
-
Matthew Brost authored
Lock all BOs used in gpuva ops and validate all BOs in a single step during the VM bind IOCTL. This helps with the transition to making all gpuva ops in a VM bind IOCTL a single atomic job which is required for proper error handling. v2: - Better commit message (Oak) - s/op_lock/op_lock_and_prep, few other renames too (Oak) - Use DRM_EXEC_IGNORE_DUPLICATES flag in drm_exec_init (local testing) - Do not reserve slots in locking step (direction based on series from Thomas) v3: - Validate BO if is immediate set (Oak) Cc: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425045513.1913039-2-matthew.brost@intel.com
-
Lucas De Marchi authored
Contrary to i915, in xe ADL-N is kept as a different platform, not a subplatform of ADL-P. Since the display side doesn't need to differentiate between P and N, i.e. IS_ALDERLAKE_P_N() is never called, just fixup the compat header to check for both P and N. Moving ADL-N to be a subplatform would be more complex as the firmware loading in xe only handles platforms, not subplatforms, as going forward the direction is to check on IP version rather than platforms/subplatforms. Fix warning when initializing display: xe 0000:00:02.0: [drm:intel_pch_type [xe]] Found Alder Lake PCH ------------[ cut here ]------------ xe 0000:00:02.0: drm_WARN_ON(!((dev_priv)->info.platform == XE_ALDERLAKE_S) && !((dev_priv)->info.platform == XE_ALDERLAKE_P)) And wrong paths being taken on the display side. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425181610.2704633-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
-
Colin Ian King authored
There is a spelling mistake in a drm_dbg message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240426094904.816033-1-colin.i.king@gmail.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
-
Michal Wajdeczko authored
The xe_gt_sriov_pf_init_early() and xe_gt_sriov_pf_init_hw() are ideal places to call per-GT PF service init and update functions. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425143927.2265-2-michal.wajdeczko@intel.com
-
Michal Wajdeczko authored
On older platforms (12.00) the PF driver must explicitly unblock VF's modifications to the GGTT. On newer platforms this capability is enabled by default. Bspec: 49908, 53204 Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425143927.2265-1-michal.wajdeczko@intel.com
-
Himal Prasad Ghimiray authored
The function xe_guc_submit_stop consistently returns 0 without an error state, prompting the caller to verify it, which is redundant. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240424041911.2184868-1-himal.prasad.ghimiray@intel.com
-
Himal Prasad Ghimiray authored
There is no change in functionality. Using the helper function defined within the driver for locking/unlocking the reservation object. Cc: Matthew Brost <matthew.brost@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240424043910.2190376-3-himal.prasad.ghimiray@intel.com
-
Himal Prasad Ghimiray authored
There is no change in functionality. Using the helper function defined within the driver. -v2 Use xe_vm_unlock() (Ashutosh/Matt) -v3 Use xe_vm_unlock() for error label too (Matt) Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240424043910.2190376-2-himal.prasad.ghimiray@intel.com
-
Matthew Brost authored
Exec queue has replaced engine nomenclature. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425232544.1935578-6-matthew.brost@intel.com
-
Matthew Brost authored
Normalize the alignment for readability. v3: - Fix typo in commit (Himal) - Fix EXEC_QUEUE_STATE_WEDGED too (Himal) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425232544.1935578-5-matthew.brost@intel.com
-
Matthew Brost authored
Exec queue has replaced engine nomenclature. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425232544.1935578-4-matthew.brost@intel.com
-
Matthew Brost authored
Exec queue has replaced engine nomenclature. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425232544.1935578-3-matthew.brost@intel.com
-
Matthew Brost authored
Exec queue has replaced engine nomenclature. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425232544.1935578-2-matthew.brost@intel.com
-
- 25 Apr, 2024 6 commits
-
-
Matthew Brost authored
GuC submission_state.suspend is unused, delete it. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240425054747.1918811-1-matthew.brost@intel.com
-
Matthew Auld authored
We flush the rebind worker during the vm close phase, however in places like preempt_fence_work_func() we seem to queue the rebind worker without first checking if the vm has already been closed. The concern here is the vm being closed with the worker flushed, but then being rearmed later, which looks like potential uaf, since there is no actual refcounting to track the queued worker. We can't take the vm->lock here in preempt_rebind_work_func() to first check if the vm is closed since that will deadlock, so instead flush the worker again when the vm refcount reaches zero. v2: - Grabbing vm->lock in the preempt worker creates a deadlock, so checking the closed state is tricky. Instead flush the worker when the refcount reaches zero. It should be impossible to queue the preempt worker without already holding vm ref. Fixes: dd08ebf6 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1676 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1591 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1364 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1304 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1249 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240423074721.119633-4-matthew.auld@intel.com
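The lifetime rule described above (flush at close, then flush again when the last reference is dropped) can be modelled in a few lines of single-threaded C. `struct vm` and every function here are hypothetical stand-ins for illustration. Real code would use the kernel's workqueue and refcount primitives and has to deal with concurrency.

```c
#include <stdbool.h>

/* Hypothetical model of a VM with a deferred rebind worker. */
struct vm {
	int refcount;
	bool closed;
	bool work_pending;	/* a queued, not yet run, worker */
	bool freed;
};

/* Stand-in for flushing: nothing stays queued once this returns. */
static void flush_worker(struct vm *vm)
{
	vm->work_pending = false;
}

static void vm_close(struct vm *vm)
{
	vm->closed = true;
	flush_worker(vm);
}

/* Queueing requires holding a reference, so refcount > 0 whenever this runs. */
static void queue_rebind(struct vm *vm)
{
	vm->work_pending = true;
}

static void vm_put(struct vm *vm)
{
	if (--vm->refcount > 0)
		return;

	/* Last reference: flush again in case the worker was re-armed after close. */
	flush_worker(vm);
	vm->freed = true;
}
```

Because the second flush runs only once no references remain, nothing can queue the worker again after it, which closes the window the commit message describes.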
-
Matthew Auld authored
This reverts commit 5b259c0d. Cleanup here is good, however we need to be able to flush a worker during vm destruction which might involve sleeping, so bring back the worker. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240423074721.119633-3-matthew.auld@intel.com
-
Matthew Auld authored
It is really easy to introduce subtle deadlocks in preempt_fence_work_func() since we operate on single global ordered-wq for signalling our preempt fences behind the scenes, so even though we signal a particular fence, everything in the callback should be in the fence critical section, since blocking in the callback will prevent other published fences from signalling. If we enlarge the fence critical section to cover the entire callback, then lockdep should be able to understand this better, and complain if we grab a sensitive lock like vm->lock, which is also held when waiting on preempt fences. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240418144630.299531-2-matthew.auld@intel.com
-
Michal Wajdeczko authored
Apart from the obvious spelling typo, use the correct values for infinity quantum/timeout settings (it's 0x0 instead of 0xFFFFFFFF). Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240424140506.2133-1-michal.wajdeczko@intel.com
-
Michal Wajdeczko authored
For debug purposes we might want to verify which register values the PF is sharing with VFs and to view which VF/PF ABI versions were negotiated by the VFs. Plug the 'print' functions already provided by the PF service code into our debugfs. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240424171030.2177-1-michal.wajdeczko@intel.com
-
- 24 Apr, 2024 6 commits
-
-
Michal Wajdeczko authored
Although it's unlikely that drmm_mutex_init() will fail during driver initialization, we shouldn't ignore this case. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240409153132.1111-1-michal.wajdeczko@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
-
Tejas Upadhyay authored
Workaround 14021567978 applies to RenderCS xe2. V3: - Cover xe2_hpg as it's landed upstream now V2 (MattR): - Move tuning to wa and apply to xe2 Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240410064640.1010098-1-tejas.upadhyay@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
-
Michal Wajdeczko authored
We can use recently added str_plural() helper. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240419153407.402-1-michal.wajdeczko@intel.com
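For readers unfamiliar with the helper, the following userspace sketch mimics what a plural-suffix helper does. `plural_suffix()` is a local re-implementation written for illustration, not the kernel function itself, and `format_vf_count()` with its "VF" noun is a made-up call site.

```c
#include <stddef.h>
#include <stdio.h>

/* Local stand-in for illustration; the kernel provides its own helper. */
static const char *plural_suffix(size_t num)
{
	return num == 1 ? "" : "s";
}

/* Hypothetical call site: format "<n> VF" or "<n> VFs" into buf. */
static int format_vf_count(char *buf, size_t len, size_t n)
{
	return snprintf(buf, len, "%zu VF%s", n, plural_suffix(n));
}
```

Using one shared helper removes open-coded `n == 1 ? "" : "s"` expressions from individual message strings.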
-
Rodrigo Vivi authored
So, the wedged mode can be selected per device at runtime, before the tests or before reproducing the issue. v2: - s/busted/wedged - some locking consistency v3: - remove mutex - toggle guc reset policy on any mode change Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240423221817.1285081-4-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Rodrigo Vivi authored
In many validation situations when debugging GPU hangs, it is useful to preserve the GT situation from the moment that the timeout occurred. This patch introduces a module parameter that can be used in situations like this. If xe.wedged module parameter is set to 2, Xe will be declared wedged on every single execution timeout (a.k.a. GPU hang) right after devcoredump snapshot capture and without attempting any kind of GT reset and blocking entirely any kind of execution. v2: Really block gt_reset from guc side. (Lucas) s/wedged/busted (Lucas) v3: - s/busted/wedged - Really use global_flags (Dafna) - More robust timeout handling when wedging it. v4: A really robust clean exit done by Matt Brost. No more kernel warns on unbind. v5: Simplify error message (Lucas) Cc: Matthew Brost <matthew.brost@intel.com> Cc: Dafna Hirschfeld <dhirschfeld@habana.ai> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Himanshu Somaiya <himanshu.somaiya@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240423221817.1285081-3-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
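The timeout policy described above can be sketched as a small decision function. Everything here is a hypothetical model: `struct device_state`, `handle_timeout()` and `can_submit()` are invented names, and treating every mode other than 2 as "attempt a GT reset" is an assumption of the sketch, since the commit message only spells out the behaviour for mode 2.

```c
#include <stdbool.h>

enum timeout_action {
	ACTION_RESET_GT,	/* assumed default: try to recover */
	ACTION_DECLARE_WEDGED,	/* mode 2: preserve state, block execution */
};

/* Hypothetical per-device state for this sketch. */
struct device_state {
	int wedged_mode;
	bool wedged;
	bool snapshot_taken;
};

static enum timeout_action handle_timeout(struct device_state *d)
{
	/* Stand-in for the devcoredump snapshot, captured before anything else. */
	d->snapshot_taken = true;

	if (d->wedged_mode == 2) {
		/* No reset attempt: keep the GT as it was at the timeout. */
		d->wedged = true;
		return ACTION_DECLARE_WEDGED;
	}
	return ACTION_RESET_GT;
}

/* Once wedged, all further execution is refused. */
static bool can_submit(const struct device_state *d)
{
	return !d->wedged;
}
```

The ordering matters: the snapshot is taken first, so the preserved GT state and the captured dump describe the same moment.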
-
Rodrigo Vivi authored
Let's block the device upon any GuC load failure. But let's continue with the probe so guc logs can be read from the debugfs. v2: - s/wedged/busted - do not block probe or we lose guc_logs in debugfs (Matt) v3: - s/busted/wedged v4: Do not change __xe_guc_upload return. (Himal) Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240423221817.1285081-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-