- 15 Jan, 2024 40 commits
-
-
Alexander Richards authored
Theoretically, it would be possible for a buggy or malicious VBIOS to overwrite past the bounds of the passed parameters (or its own workspace); add bounds checking to prevent this from happening. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3093Signed-off-by: Alexander Richards <electrodeyt@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alvin Lee authored
- Introduce a new Replay mode for DMUB version 0.0.199.0 Reviewed-by: Martin Leung <martin.leung@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Dillon Varone authored
[Why & How] res_pool is not initialized in all situations such as virtual environments, and therefore link encoder resources should not be initialized if res_pool is NULL. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Martin Leung <martin.leung@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Nicholas Kazlauskas authored
[Why & How] Add regkey to block video playback in IPS2 by default Allow idle optimizations in the same spot we allow Replay for video playback usecases. Avoid sending it when there's an external display connected by modifying the allow idle checks to check for active non-eDP screens. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Srinivasan Shanmugam authored
In link_set_dsc_pps_packet(), 'struct display_stream_compressor *dsc' was dereferenced in a DC_LOGGER_INIT(dsc->ctx->logger); before the 'dsc' NULL pointer check. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_dpms.c:905 link_set_dsc_pps_packet() warn: variable dereferenced before check 'dsc' (see line 903) Cc: stable@vger.kernel.org Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Cc: Wenjing Liu <wenjing.liu@amd.com> Cc: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Srinivasan Shanmugam authored
Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_debug.c:1024 kfd_dbg_trap_device_snapshot() warn: variable dereferenced before check 'entry_size' (see line 1021) Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Driver and firmware share the same ras block enum. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
use new ACA macro to instead of MCA Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Candice Li authored
Separate deferred error from UE and CE and log it individually. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Candice Li authored
Needs to do bad page retirement for deferred errors. v2: Drop unused dev_info. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
add xgmi v6.4.0 ACA driver support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
v1: add mmhub v1.8 ACA driver support v2: use macro to define smn address value. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
v1: add sdma v4.4.2 ACA driver support v2: use macro to define smn address value. v3: squash in fix for unbalanced irqs Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
v1: add gfx v9.4.3 ACA driver support v2: use macro to define smn address value. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Ma Jun authored
Some customer platforms do not enable mmconfig for various reasons, such as bios bug, and therefore cannot access the GPU extend configuration space through mmio. When the system enters the d3cold state and resumes, the amdgpu driver fails to resume because the extend configuration space registers of GPU can't be restored. At this point, Usually we only see some failure dmesg log printed by amdgpu driver, it is difficult to find the root cause. Therefor print a warnning message if the system can't access the extended configuration space register when using large bar. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
add umc v12.0 ACA driver support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
add aca sysfs node support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
v1: add ACA error query interface v2: Add a new helper function to determine whether to use ACA or MCA. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
add aca smu backend support for smu v13.0.6. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Deucher authored
It's used for more than just SR-IOV now, so move it to amdgpu_gmc.c and rename it to better match the functionality and update the comments in the code paths to better document when each path is used and why. No functional change. Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Shaoyun.Liu@amd.com Cc: Christian.Koenig@amd.com
-
Alex Deucher authored
Some chips provide both average and input power. Previously we just exposed average power, add a new query for input power. Example userspace: https://github.com/Umio-Yasuno/libdrm-amdgpu-sys-rs/tree/input_powerReviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Check and report firmware boot status if it doesn't reach steady status. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
add ACA bank dump debugfs support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
add ACA kernel hardware error log support. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Check and report boot status if discovery failed. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Add ras helper function to query boot time gpu errors. v2: use aqua_vanjaram smn addressing pattern Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
v1: implement new RAS ACA driver code framework. v2: - rename aca_bank_set to aca_banks. - rename aca_source_xxx to aca_handle_xxx. v3: Optimize some function implementation details. (from Hawking's suggestion) Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
To allow using this helper for indirect access when nbio funcs is not available. For instance, in ip discovery phase. v2: define macro for pcie_index/data/index_hi fallback. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Will replace it with new implementation to cover boot fails in ip discovery phase. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
So kernel message has the device pcie bdf information, which helps issue debugging especially in multiple GPU system. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
This is not strictly a change in the IOCTL API. This version bump is meant to indicate to user mode the presence of a number of changes and fixes that enable the management of VA mappings in compute VMs using the GEM_VA ioctl for DMABufs exported from KFD. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the process_info->kfd_bo_list. There is no explicit KFD API call to validate them or add eviction fences to them. This patch automatically validates and fences dymanic DMABuf imports when they are added to a compute VM. Revalidation after evictions is handled in the VM code. v2: * Renamed amdgpu_vm_validate_evicted_bos to amdgpu_vm_validate * Eliminated evicted_user state, use evicted state for VM BOs and user BOs * Fixed and simplified amdgpu_vm_fence_imports, depends on reserved BOs * Moved dma_resv_reserve_fences for amdgpu_vm_fence_imports into amdgpu_vm_validate, outside the vm->status_lock * Added dummy version of amdgpu_amdkfd_bo_validate_and_fence for builds without KFD v4: Eliminate amdgpu_vm_fence_imports. It's not needed because the reservation with its fences is shared with the export, as long as all imports are from KFD, with the exports already reserved, validated and fenced by the KFD restore worker. v5: Reintroduced separate evicted_user state to simplify the state machine and CS error handling when amdgpu_vm_validate is called without a ticket. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Nathan Chancellor authored
Clang warns (or errors with CONFIG_WERROR=y) when performing arithmetic with different enumerated types, which is usually a bug: drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_dpia_bw.c:548:24: error: arithmetic between different enumeration types ('const enum dc_link_rate' and 'const enum dc_lane_count') [-Werror,-Wenum-enum-conversion] 548 | link_cap->link_rate * link_cap->lane_count * LINK_RATE_REF_FREQ_IN_KHZ * 8; | ~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~ 1 error generated. In this case, there is not a problem because the enumerated types are basically treated as '#define' values. Add an explicit cast to an integral type to silence the warning. Closes: https://github.com/ClangBuiltLinux/linux/issues/1976 Fixes: 5f3bce13 ("drm/amd/display: Request usb4 bw for mst streams") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Lijo Lazar authored
When current clock is equal to max dpm level clock, the level is not indicated correctly with *. Fix by comparing current clock against dpm level value. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x
-
Lijo Lazar authored
For all mode-2 reset fail cases, add error log. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x
-
Srinivasan Shanmugam authored
Range interval [start, last] is ordered by rb_tree, rb_prev, rb_next return value still needs NULL check, thus modified from "node" to "rb_node". Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2691 svm_range_get_range_boundaries() warn: can 'node' even be NULL? Suggested-by: Philip Yang <Philip.Yang@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Deucher authored
No longer needed. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.7.x
-
Le Ma authored
To bring debug options into effect in early initialization phase Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Le Ma authored
Use debug_mask=0x8 param to help isolating data path issues on new systems in early phase. v2: rename the flag for explicitness (lijo) Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Le Ma authored
This reverts commit c572abff. Will use debug module param instead of independent module param. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-