- 03 Feb, 2023 1 commit
-
-
Nirmoy Das authored
DSM granularity is 1MB so make sure we stick to that. The address set by firmware in GEN12_DSMBASE in driver initialization doesn't mean "anything above that and until end of lmem is part of DSM". In fact, there may be a few KB that is not part of DSM on the end of lmem. How large is that space is platform-dependent, but since it's always less than the DSM granularity, it can be simplified by simply aligning the size down. v2: replace "1 * SZ_1M" with SZ_1M (Andrzej). v3: reword commit message to explain why the round down is needed (Lucas) Cc: Matthew Auld <matthew.auld@intel.com> Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230202180243.23637-1-nirmoy.das@intel.com
-
- 02 Feb, 2023 2 commits
-
-
Michal Wajdeczko authored
Just recently we switched over to new GuC oriented log macros but in the meantime yet another message was added that we missed to update. While around improve that new message by adding engine name and use existing helpers to check for context state. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230131214413.1879-1-michal.wajdeczko@intel.com
-
Nirmoy Das authored
Use sysfs_emit() and sysfs_emit_at() in show() callback as recommended by Documentation/filesystems/sysfs.rst Cc: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230130131358.16800-1-nirmoy.das@intel.com
-
- 01 Feb, 2023 1 commit
-
-
Chris Wilson authored
Check that we invalidate the TLB cache, the updated physical addresses are immediately visible to the HW, and there is no retention of the old physical address for concurrent HW access. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [ahajda: adjust to upstream driver, v2+] Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [tursulin: Small indentation fix.] Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230130165058.1647414-1-andrzej.hajda@intel.com
-
- 31 Jan, 2023 9 commits
-
-
John Harrison authored
Wa_22011802037 requires waiting for an engine-specific register to clear. A missing entry for GSC engine in the register table is flagged as a drm_err. The drm_err was originally intended to catch missing register entries for newer engines, however, it was later found that the WA is only required for 'legacy' engines. So just drop the drm_err. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230124231111.1786429-1-umesh.nerlige.ramappa@intel.com
-
Michal Wajdeczko authored
Use new macros to have common prefix that also include GT#. v2: pass gt to print_fw_ver v3: prefer guc_dbg in suspend/resume logs Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-9-michal.wajdeczko@intel.com
-
Michal Wajdeczko authored
Use new macros to have common prefix that also include GT#. v2: improve few existing messages Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-8-michal.wajdeczko@intel.com
-
Michal Wajdeczko authored
Use new macros to have common prefix that also include GT#. v2: drop redundant GuC strings, minor improvements v3: more message improvements Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-7-michal.wajdeczko@intel.com
-
Michal Wajdeczko authored
Use new macros to have common prefix that also include GT#. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-6-michal.wajdeczko@intel.com
-
Michal Wajdeczko authored
Use new macros to have common prefix that also include GT#. v2: drop unused helpers Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-5-michal.wajdeczko@intel.com
-
Michal Wajdeczko authored
Use new macros to have common prefix that also include GT#. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-4-michal.wajdeczko@intel.com
-
Michal Wajdeczko authored
Use new macros to have common prefix that also include GT#. v2: drop now redundant "GuC" word from the message Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-3-michal.wajdeczko@intel.com
-
Michal Wajdeczko authored
While we do have GT oriented print macros, add few more GuC specific to have common look and feel across all messages related to the GuC and to avoid chasing the gt pointer. We will use these macros shortly in upcoming patches. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-2-michal.wajdeczko@intel.com
-
- 30 Jan, 2023 1 commit
-
-
Rob Clark authored
A userspace with multiple threads racing I915_GEM_SET_TILING to set the tiling to I915_TILING_NONE could trigger a double free of the bit_17 bitmask. (Or conversely leak memory on the transition to tiled.) Move allocation/free'ing of the bitmask within the section protected by the obj lock. Signed-off-by: Rob Clark <robdclark@chromium.org> Fixes: 2850748e ("drm/i915: Pull i915_vma_pin under the vm->mutex") Cc: <stable@vger.kernel.org> # v5.5+ [tursulin: Correct fixes tag and added cc stable.] Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127200550.3531984-1-robdclark@gmail.com
-
- 27 Jan, 2023 9 commits
-
-
John Harrison authored
The GuC specific register state entry in the error capture object was just called 'capture'. Although the companion 'node' entry was called 'guc_capture_node'. Rename the base entry to be 'guc_capture' instead so that it is a) more consistent and b) more obvious what it is. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-9-John.C.Harrison@Intel.com
-
John Harrison authored
For understanding bug reports, it can be useful to have an explicit dmesg print when a reset notification is received from GuC. As opposed to simply inferring that this happened from other messages. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-8-John.C.Harrison@Intel.com
-
John Harrison authored
Engine resets are supposed to never fail. But in the case when one does (due to unknown reasons that normally come down to a missing w/a), it is useful to get as much information out of the system as possible. Given that the GuC intentionally dies on such a situation, it is not possible to get a guilty context notification back. So do a manual search instead. Given that GuC is dead, this is safe because GuC won't be changing the engine state asynchronously. v2: Change comment to be less alarming (Tvrtko) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-7-John.C.Harrison@Intel.com
-
John Harrison authored
A hang situation has been observed where the only requests on the context were either completed or not yet started according to the breaadcrumbs. However, the register state claimed a batch was (maybe) in progress. So, allow capture of the pending request on the grounds that this might be better than nothing. v2: Reword 'not started' warning message (Tvrtko) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-6-John.C.Harrison@Intel.com
-
John Harrison authored
There was a report of error captures occurring without any hung context being indicated despite the capture being initiated by a 'hung context notification' from GuC. The problem was not reproducible. However, it is possible to happen if the context in question has no active requests. For example, if the hang was in the context switch itself then the breadcrumb write would have occurred and the KMD would see an idle context. In the interests of attempting to provide as much information as possible about a hang, it seems wise to include the engine info regardless of whether a request was found or not. As opposed to just prentending there was no hang at all. So update the error capture code to always record engine information if a context is given. Which means updating record_context() to take a context instead of a request (which it only ever used to find the context anyway). And split the request agnostic parts of intel_engine_coredump_add_request() out into a seaprate function. v2: Remove a duplicate 'if' statement (Umesh) and fix a put of a null pointer. v3: Tidy up request locking code flow (Tvrtko) v4: Pull in improved info message from next patch and fix up potential leak of GuC register state (Daniele) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> (v2) Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-5-John.C.Harrison@Intel.com
-
John Harrison authored
The debugfs dump of requests was confused about what state requires the execlist lock versus the GuC lock. There was also a bunch of duplicated messy code between it and the error capture code. So refactor the hung request search into a re-usable function. And reduce the span of the execlist state lock to only the execlist specific code paths. In order to do that, also move the report of hold count (which is an execlist only concept) from the top level dump function to the lower level execlist specific function. Also, move the execlist specific code into the execlist source file. v2: Rename some functions and move to more appropriate files (Daniele). v3: Rename new execlist dump function (Daniele) Fixes: dc0dad36 ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com
-
John Harrison authored
When GuC support was added to error capture, the reference counting around the request object was broken. Fix it up. The context based search manages the spinlocking around the search internally. So it needs to grab the reference count internally as well. The execlist only request based search relies on external locking, so it needs an external reference count but within the spinlock not outside it. The only other caller of the context based search is the code for dumping engine state to debugfs. That code wasn't previously getting an explicit reference at all as it does everything while holding the execlist specific spinlock. So, that needs updaing as well as that spinlock doesn't help when using GuC submission. Rather than trying to conditionally get/put depending on submission model, just change it to always do the get/put. v2: Explicitly document adding an extra blank line in some dense code (Andy Shevchenko). Fix multiple potential null pointer derefs in case of no request found (some spotted by Tvrtko, but there was more!). Also fix a leaked request in case of !started and another in __guc_reset_context now that intel_context_find_active_request is actually reference counting the returned request. v3: Add a _get suffix to intel_context_find_active_request now that it grabs a reference (Daniele). v4: Split the intel_guc_find_hung_context change to a separate patch and rename intel_context_find_active_request_get to intel_context_get_active_request (Tvrtko). v5: s/locking/reference counting/ in commit message (Tvrtko) Fixes: dc0dad36 ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Fixes: 573ba126 ("drm/i915/guc: Capture error state on context reset") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com
-
John Harrison authored
intel_guc_find_hung_context() was not acquiring the correct spinlock before searching the request list. So fix that up. While at it, add some extra whitespace padding for readability. Fixes: dc0dad36 ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-2-John.C.Harrison@Intel.com
-
Rob Clark authored
Adding the vm to the vm_xa table makes it visible to userspace, which could try to race with us to close the vm. So we need to take our extra reference before putting it in the table. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Fixes: 9ec8795e ("drm/i915: Drop __rcu from gem_context->vm") Cc: <stable@vger.kernel.org> # v5.16+ Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230119173321.2825472-1-robdclark@gmail.com
-
- 26 Jan, 2023 4 commits
-
-
Matt Roper authored
GAMSTLB_CTRL and GAMCNTRL_CTRL became multicast/replicated registers on Xe_HP. They should be defined accordingly and use MCR-aware operations. These registers have only been used for some dg2/xehpsdv workarounds, so this fix is mostly just for consistency/future-proofing; even lacking the MCR annotation, workarounds will always be properly applied in a multicast manner on these platforms. Cc: Gustavo Sousa <gustavo.sousa@intel.com> Fixes: 58bc2453 ("drm/i915: Define multicast registers as a new type") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-3-matthew.d.roper@intel.com
-
Matt Roper authored
Workaround Wa_18018781329 has applied to several recent Xe_HP-based platforms. However there are some extra gotchas to implementing this properly for MTL that we need to take into account: * Due to the separation of media and render/compute into separate GTs, this workaround needs to be implemented on each GT, not just the primary GT. Since each class of register only exists on one of the two GTs, we should program the appropriate registers on each GT. * As with past Xe_HP platforms, the registers on the primary GT (Xe_LPG IP) are multicast/replicated registers and should be handled with the MCR-aware functions. However the registers on the media GT (Xe_LPM+ IP) are regular singleton registers and should _not_ use MCR handling. We need to create separate register definitions for the Xe_HP multicast form and the Xe_LPM+ singleton form and use each in the appropriate place. * Starting with MTL, workarounds documented by the hardware teams are technically associated with IP versions/steppings rather than top-level platforms. That means we should take care to check the media IP version rather than the graphics IP version when deciding whether the workaround is needed on the Xe_LPM+ media GT (in this case the workaround applies to both IPs and the stepping bounds are identical, but we should still write the code appropriately to set a proper precedent for future workaround implementations). * It's worth noting that the GSC register and the CCS register are defined with the same MMIO offset (0xCF30). Since the CCS is only relevant to the primary GT and the GSC is only relevant to the media GT there isn't actually a clash here (the media GT automatically adds the additional 0x380000 GSI offset). However there's currently a glitch in the bspec where the CCS register doesn't show up at all and the GSC register is listed as existing on both GTs. That's a known documentation problem for several registers with shared GSC/CCS offsets; rest assured that the CCS register really does still exist. Cc: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Fixes: 41bb543f ("drm/i915/mtl: Add initial gt workarounds") Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-2-matthew.d.roper@intel.com
-
Matt Roper authored
Register reset characteristics (i.e., whether the register maintains or loses its value on engine reset) is an important factor that determines which wa_list we want to add workarounds to. We recently found out that the bspec documentation for the Xe_HP's "GAM" registers in the 0xC800 - 0xCFFF range was misleading; these registers do not actually lose their value on engine resets as the documentation implied. This means there's no need to re-apply workarounds touching these registers after a reset, and the corresponding workarounds should be moved from the 'engine' lists back to the 'gt' list. v2: - Don't add Wa_18018781329 to xehpsdv; the original condition didn't include that platform. (Gustavo) - Move the MTL code to the GT function as-is for now; we'll take care of the additional fixes needed in a follow-up patch. Cc: Gustavo Sousa <gustavo.sousa@intel.com> Fixes: edf176f4 ("drm/i915/dg2: Move misplaced 'ctx' & 'gt' wa's to engine wa list") Fixes: b2006061 ("drm/i915/xehpsdv: Move render/compute engine reset domains related workarounds") Fixes: 41bb543f ("drm/i915/mtl: Add initial gt workarounds") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-1-matthew.d.roper@intel.com
-
Tvrtko Ursulin authored
We want to idle all tiles when exiting selftests. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125100003.18243-1-nirmoy.das@intel.com
-
- 24 Jan, 2023 4 commits
-
-
Tvrtko Ursulin authored
Lets eradicate a silent conflict between drm-intel-next and drm-intel-gt-next which added a duplicate function (try_firmware_load). Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
-
Daniel Vetter authored
Merge tag 'drm-intel-gt-next-2023-01-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-next Driver Changes: Fixes/improvements/new stuff: - Fix workarounds on Gen2-3 (Tvrtko Ursulin) - Fix HuC delayed load memory leaks (Daniele Ceraolo Spurio) - Fix a BUG caused by impendance mismatch in dma_fence_wait_timeout and GuC (Janusz Krzysztofik) - Add DG2 workarounds Wa_18018764978 and Wa_18019271663 (Matt Atwood) - Apply recommended L3 hashing mask tuning parameters (Gen12+) (Matt Roper) - Improve suspend / resume times with VT-d scanout workaround active (Andi Shyti, Chris Wilson) - Silence misleading "mailbox access failed" warning in snb_pcode_read (Ashutosh Dixit) - Fix null pointer dereference on HSW perf/OA (Umesh Nerlige Ramappa) - Avoid trampling the ring during buffer migration (and selftests) (Chris Wilson, Matthew Auld) - Fix DG2 visual corruption on small BAR systems by not forgetting to copy CCS aux state (Matthew Auld) - More fixing of DG2 visual corruption by not forgetting to copy CCS aux state of backup objects (Matthew Auld) - Fix TLB invalidation for Gen12.50 video and compute engines (Andrzej Hajda) - Limit Wa_22012654132 to just specific steppings (Matt Roper) - Fix userspace crashes due eviction not working under lock contention after the object locking conversion (Matthew Auld) - Avoid double free is user deploys a corrupt GuC firmware (John Harrison) - Fix 32-bit builds by using "%zu" to format size_t (Nirmoy Das) - Fix a possible BUG in TTM async unbind due not reserving enough fence slots (Nirmoy Das) - Fix potential use after free by not exposing the GEM context id to userspace too early (Rob Clark) - Show clamped PL1 limit to the user (hwmon) (Ashutosh Dixit) - Workaround unreliable reset on Jasperlake (Chris Wilson) - Cover rest of SVG unit MCR registers (Gustavo Sousa) - Avoid PXP log spam on platforms which do not support the feature (Alan Previn) - Re-disable RC6p on Sandy Bridge to avoid GPU hangs and visual glitches (Sasa Dragic) Future platform enablement: - Manage uncore->lock while waiting on MCR register (Matt Roper) - Enable Idle Messaging for GSC CS (Vinay Belgaumkar) - Only initialize GSC in tile 0 (José Roberto de Souza) - Media GT and Render GT share common GGTT (Aravind Iddamsetty) - Add dedicated MCR lock (Matt Roper) - Implement recommended caching policy (PVC) (Wayne Boyer) - Add hardware-level lock for steering (Matt Roper) - Check full IP version when applying hw steering semaphore (Matt Roper) - Enable GuC GGTT invalidation from the start (Daniele Ceraolo Spurio) - MTL GSC firmware support (Daniele Ceraolo Spurio, Jonathan Cavitt) - MTL OA support (Umesh Nerlige Ramappa) - MTL initial gt workarounds (Matt Roper) Driver refactors: - Hold forcewake and MCR lock over PPAT setup (Matt Roper) - Acquire fw before loop in intel_uncore_read64_2x32 (Umesh Nerlige Ramappa) - GuC filename cleanups and use submission API version number (John Harrison) - Promote pxp subsystem to top-level of i915 (Alan Previn) - Finish proofing the code agains object size overflows (Chris Wilson, Gwan-gyeong Mun) - Start adding module oriented dmesg output (John Harrison) Miscellaneous: - Correct kerneldoc for intel_gt_mcr_wait_for_reg() (Matt Roper) - Bump up sample period for busy stats selftest (Umesh Nerlige Ramappa) - Make GuC default_lists const data (Jani Nikula) - Fix table order verification to check all FW types (John Harrison) - Remove some limited use register access wrappers (Jani Nikula) - Remove struct_member macro (Andrzej Hajda) - Remove hardcoded value with a macro (Nirmoy Das) - Use helper func to find out map type (Nirmoy Das) - Fix a static analysis warning (John Harrison) - Consolidate VMA active tracking helpers (Andrzej Hajda) - Do not cover all future platforms in TLB invalidation (Tvrtko Ursulin) - Replace zero-length arrays with flexible-array members (Gustavo A. R. Silva) - Unwind hugepages to drop wakeref on error (Chris Wilson) - Remove a couple of superfluous i915_drm.h includes (Jani Nikula) Merges: - Merge drm/drm-next into drm-intel-gt-next (Rodrigo Vivi) danvet: Fix up merge conflict in intel_uc_fw.c, we ended up with 2 copies of try_firmware_load() somehow. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y8fW2Ny1B1hZ5ZmF@tursulin-desk
-
Tvrtko Ursulin authored
Default engine map is exactly about uabi engines so no excuse not to use the appropriate iterator to populate it. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> [tursulin: Fixed up r-b tag spelling.] Link: https://patchwork.freedesktop.org/patch/msgid/20230123185629.1593320-1-jonathan.cavitt@intel.com
-
Gustavo Sousa authored
That register became a multicast register as of Xe_HP and it is currently used only for DG2. Use a proper prefix since there could be usage of the same register for previous platforms in the future, which would require a different definition (i.e. using _MMIO). Note that, in its current state, the code does not cause functional problems, since the actual application of the workaround would implicitly use multicast mode. This fix is more toward consistency and being future-proof uses of this register outside of workarounds. v2: - Add paragraph noting that this change is for consistency and making the code future-proof. (Matt) Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Matthew Atwood <matthew.s.atwood@intel.com> Fixes: 468a4e63 ("drm/i915/dg2: Introduce Wa_18018764978") Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230120181423.90507-1-gustavo.sousa@intel.com
-
- 20 Jan, 2023 1 commit
-
-
Arnd Bergmann authored
The definition of intel_selftest_modify_policy() does not match the declaration, as gcc-13 points out: drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c:29:5: error: conflicting types for 'intel_selftest_modify_policy' due to enum/integer mismatch; have 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, u32)' {aka 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, unsigned int)'} [-Werror=enum-int-mismatch] 29 | int intel_selftest_modify_policy(struct intel_engine_cs *engine, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c:11: drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.h:28:5: note: previous declaration of 'intel_selftest_modify_policy' with type 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, enum selftest_scheduler_modify)' 28 | int intel_selftest_modify_policy(struct intel_engine_cs *engine, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ Change the type in the definition to match. Fixes: 617e87c0 ("drm/i915/selftest: Fix hangcheck self test for GuC submission") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117163743.1003219-1-arnd@kernel.org
-
- 19 Jan, 2023 3 commits
-
-
Gustavo Sousa authored
That register doesn't belong to a specific engine, so the proper placement for workarounds programming it should be general_render_compute_wa_init(). Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230118155249.41551-3-gustavo.sousa@intel.com
-
Gustavo Sousa authored
Extend the existing documentation in gt/intel_workarounds.c to make it clear which functions register workarounds should be implemented in according to their types. Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230118155249.41551-2-gustavo.sousa@intel.com
-
Lucas De Marchi authored
Commit 0d0e7d1e ("drm/i915/mtl: Define engine context layouts") added the engine context for Meteor Lake. In a second revision of the patch it was believed the xcs offsets were wrong due to a tagging issue in the spec. The first version was actually correct, as shown by the intel_lrc_live_selftests/live_lrc_layout test: i915: Running gt_lrc i915: Running intel_lrc_live_selftests/live_lrc_layout bcs0: LRI command mismatch at dword 1, expected 1108101d found 11081019 [drm:drm_helper_probe_single_connector_modes [drm_kms_helper]] [CONNECTOR:236:DP-1] disconnected bcs0: HW register image: [0000] 00000000 1108101d 00022244 ffff0008 00022034 00000088 00022030 00000088 ... bcs0: SW register image: [0000] 00000000 11081019 00022244 00090009 00022034 00000000 00022030 00000000 The difference in the 2 additional dwords (0x1d vs 0x19) are the offsets 0x120 / 0x124 that are indeed part of the context image. Bspec: 45585 Fixes: 0d0e7d1e ("drm/i915/mtl: Define engine context layouts") Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230111235531.3353815-2-radhakrishna.sripada@intel.com
-
- 18 Jan, 2023 3 commits
-
-
Matt Roper authored
The implementation of Wa_22011450934 introduced three new register definitions in i915_reg.h that didn't get moved to the GT/engine register headers when all the other registers moved; let's move them to the appropriate headers and tidy up their definitions now for consistency: - STATE_ACK_DEBUG is moved to the engine register header and converted to a parameterized definition; the workaround only needs the RCS instance to be programmed, but there are instances on other engines that could be used by other workarounds in the future. - The two CULLBIT registers move to the GT register header. Since they belong to MMIO ranges that became MCR starting with Xe_HP, their definitions should be defined as MCR_REG() and use an Xe_HP prefix to keep the register semantics clear. Note that the MCR definition is just for consistency and to prevent accidental misuse if other workarounds related to these registers show up in the future. There's no functional change to today's driver since the workaround that references these registers only accesses them via MI_LRR engine instructions. Engine-initiated register accesses do not utilize the same steering controls as CPU-initiated accesses; they use a different steering control register (0x20CC) which is initialized to a non-terminated DSS target by pre-OS firmware and never changed thereafter (i915 does not touch it and userspace does not have permission to change that register). Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117202627.4134579-1-matthew.d.roper@intel.com
-
Jani Nikula authored
Remove a couple of unnecessary includes. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117123856.2271720-1-jani.nikula@intel.com
-
Chris Wilson authored
Make sure that upon error after we have acquired the wakeref we do release it again. v2: add another missing "goto out_wf"(Andi). Fixes: 027c38b4 ("drm/i915/selftests: Grab the runtime pm in shrink_thp") Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117123234.26487-1-nirmoy.das@intel.com
-
- 17 Jan, 2023 2 commits
-
-
John Harrison authored
When trying to analyse bug reports from CI, customers, etc. it can be difficult to work out exactly what is happening on which GT in a multi-GT system. So add GT oriented debug/error message wrappers. If used instead of the drm_ equivalents, you get the same output but with a GT# prefix on it. v2: Go back to using lower case names (combined review feedback). Convert intel_gt.c as a first step. v3: Add gt_err_ratelimited() as well, undo one conversation that might not have a GT pointer in some scenarios (review feedback from Michal W). Split definitions into separate header (review feedback from Jani). Convert all intel_gt*.c files. v4: Re-order some macro definitions (Andi S), update (c) date (Tvrtko) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230111200429.2139084-2-John.C.Harrison@Intel.com
-
Sasa Dragic authored
RC6p on Sandy Bridge got re-enabled over time, causing visual glitches and GPU hangs. Disabled originally in commit 1c8ecf80 ("drm/i915: do not enable RC6p on Sandy Bridge"). Signed-off-by: Sasa Dragic <sasa.dragic@gmail.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221219172927.9603-2-sasa.dragic@gmail.com Fixes: fb6db0f5 ("drm/i915: Remove unsafe i915.enable_rc6") Fixes: 13c5a577 ("drm/i915/gt: Select the deepest available parking mode for rc6") Cc: stable@vger.kernel.org Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-