Commits · 4683cfecadeb383b03019ad11aeea1efac38c1ba · Kirill Smelkov / linux

21 Apr, 2021 40 commits

drm/amdkfd: deregister svm range · 4683cfec

Philip Yang authored Apr 08, 2020

When application explicitly call unmap or unmap from mmput when
application exit, driver will receive MMU_NOTIFY_UNMAP event to remove
svm range from process svms object tree and list first, unmap from GPUs
(in the following patch).

Split the svm ranges to handle partial unmapping of svm ranges. To
avoid deadlocks, updating MMU notifiers, range lists and interval trees
is done in a deferred worker. New child ranges are attached to their
parent range's child_list until the worker can update the
svm_range_list. svm_range_set_attr flushes deferred work and takes the
mmap_write_lock to guarantee that it has an up-to-date svm_range_list.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4683cfec

drm/amdkfd: validate svm range system memory · b1c46c7d

Philip Yang authored Feb 15, 2020

Use HMM to get system memory pages address, which will be used to
map to GPUs or migrate to vram.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b1c46c7d

drm/amdkfd: support larger svm range allocation · d8a3c1c8

Philip Yang authored Mar 30, 2021

For larger range allocation, if hmm_range_fault return -EBUSY, set retry
timeout based on 1 second for every 512MB, this is safe timeout value.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d8a3c1c8

drm/amdgpu: add common HMM get pages function · 04d8d73d

Philip Yang authored Feb 24, 2020

Move the HMM get pages function from amdgpu_ttm and to amdgpu_mn. This
common function will be used by new svm APIs.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

04d8d73d

drm/amdkfd: add svm ioctl GET_ATTR op · c5e2e478

Philip Yang authored Feb 16, 2020

Get the intersection of attributes over all memory in the given
range
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c5e2e478

drm/amdkfd: register svm range · 42de677f

Philip Yang authored Feb 06, 2020

svm range structure stores the range start address, size, attributes,
flags, prefetch location and gpu bitmap which indicates which GPU this
range maps to. Same virtual address is shared by CPU and GPUs.

Process has svm range list which uses both interval tree and list to
store all svm ranges registered by the process. Interval tree is used by
GPU vm fault handler and CPU page fault handler to get svm range
structure from the specific address. List is used to scan all ranges in
eviction restore work.

No overlap range interval [start, last] exist in svms object interval
tree. If process registers new range which has overlap with old range,
the old range split into 2 ranges depending on the overlap happens at
head or tail part of old range.

Apply attributes preferred location, prefetch location, mapping flags,
migration granularity to svm range, store mapping gpu index into bitmap.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

42de677f

drm/amdkfd: add svm ioctl API · 40ce74d1

Philip Yang authored Feb 05, 2020

Add svm (shared virtual memory) ioctl data structure and API definition.

The svm ioctl API is designed to be extensible in the future. All
operations are provided by a single IOCTL to preserve ioctl number
space. The arguments structure ends with a variable size array of
attributes that can be used to set or get one or multiple attributes.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

40ce74d1

drm/amdkfd: helper to convert gpu id and idx · 2aeb742b

Alex Sierra authored Apr 07, 2020

svm range uses gpu bitmap to store which GPU svm range maps to.
Application pass driver gpu id to specify GPU, the helper is needed to
convert gpu id to gpu bitmap idx.

Access through kfd_process_device pointers array from kfd_process.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2aeb742b

drm/amdgpu: Remove verify_access shortcut for KFD BOs · cccbeb62

Felix Kuehling authored Apr 07, 2021

This shortcut is no longer needed with access managed properly by KFD.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cccbeb62

drm/amdkfd: Allow access for mmapping KFD BOs · d4ec4bdc

Felix Kuehling authored Apr 07, 2021

DRM render node file handles are used for CPU mapping of BOs using mmap
by the Thunk. It uses the DRM render node of the GPU where the BO was
allocated.

DRM allows mmap access automatically when it creates a GEM handle for a
BO. KFD BOs don't have GEM handles, so KFD needs to manage access
manually. Use drm_vma_node_allow to allow user mode to mmap BOs allocated
with kfd_ioctl_alloc_memory_of_gpu through the DRM render node that was
used in the kfd_ioctl_acquire_vm call for the same GPU.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d4ec4bdc

drm/amdkfd: Use drm_priv to pass VM from KFD to amdgpu · b40a6ab2

Felix Kuehling authored Apr 07, 2021

amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu needs the drm_priv to allow mmap
to access the BO through the corresponding file descriptor. The VM can
also be extracted from drm_priv, so drm_priv can replace the vm parameter
in the kfd2kgd interface.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b40a6ab2

drm/amdgpu/gmc9: remove dummy read workaround for newer chips · 7845d80d

Alex Deucher authored Apr 16, 2021

Aldebaran has a hw fix so no longer requires the workaround.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7845d80d

drm/amdgpu: Add mem sync flag for IB allocated by SA · 5c88e3b8

Jinzhou Su authored Apr 20, 2021

The buffer of SA bo will be used by many cases. So it's better
to invalidate the cache of indirect buffer allocated by SA before
commit the IB.
Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5c88e3b8

drm/amdgpu: Fix SDMA RAS error reporting on Aldebaran · ceb47e0d

Mukul Joshi authored Mar 24, 2021

Fix the following issues with SDMA RAS error reporting:
1. Read the EDC_COUNTER2 register also to fetch error counts
   for all sub-blocks in SDMA.
2. SDMA RAS on Aldebaran suports single-bit uncorrectable errors
   only. So, report error count in UE count instead of CE count.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-By: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ceb47e0d

drm/amdgpu: Reset RAS error count and status regs · 1f0d8e37

Mukul Joshi authored Mar 24, 2021

Reset the RAS error count and error status registers after
reading to prevent over reporting error counts on Aldebaran.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-By: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1f0d8e37

Revert "drm/amdgpu: workaround the TMR MC address issue (v2)" · 5f41741a

Oak Zeng authored Mar 11, 2021

This reverts commit 2f055097.
2f055097 was a driver workaround
when PSP firmware was not ready. Now the PSP fw is ready so we
revert this driver workaround.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5f41741a

drm/amd/display: 3.2.132 · 839ede89

Aric Cyr authored Apr 11, 2021

Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

839ede89

drm/amd/display: [FW Promotion] Release 0.0.62 · db6622e9

Anthony Koo authored Apr 10, 2021

Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

db6622e9

drm/amd/display: add helper for enabling mst stream features · 6016cd9d

Bing Guo authored Apr 05, 2021

[Why]
Some MST devices uses different method to enable mst
specific stream features.

[How]
Add dm_helpers_mst_enable_stream features. This can be
modified later when we are ready to implement those features.
Signed-off-by: Bing Guo <bing.guo@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6016cd9d

drm/amd/display: Report Proper Quantization Range in AVI Infoframe · fdf7d4f5

Dillon Varone authored Apr 09, 2021

[Why?]
When a monitor does not set both QS and QY bits, DC does not
set Q0, Q1, QY0 and QY1 bits in AVI infoframe. Setting RGB bits
should be separate from setting YCC bits.

[How?]
Separate logic for setting RGB and YCC quantization range bits
in the AVI infoframe.
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

fdf7d4f5

drm/amd/display: Fix call to pass bpp in 16ths of a bit · dad6bd77

Dillon Varone authored Apr 09, 2021

[Why & How?]
Call to dc_dsc_compute_bandwidth_range should have min and max bpp
in 16ths of a bit.  Multiply min and max bpp from policy.
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Eryk Brol <Eryk.Brol@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dad6bd77

drm/amd/display: Fixed typo in function name. · 5dac2b73

David Galiffi authored Apr 07, 2021

[How & Why]
Changed "prsent" to "present".
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5dac2b73

drm/amd/display: Always poll for rxstatus in authenticate · e0912e15

Nicholas Kazlauskas authored Apr 08, 2021

[Why]
Requirement from the spec - we shouldn't be potentially exiting out
early based on encryption status.

[How]
Drop the calls from HDCP1 and HDCP2 execution that exit out early
based on link encryption status.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e0912e15

drm/amd/display: Add link rate optimization logs for ILR · 0eda55ca

Michael Strauss authored Apr 06, 2021

[Why&How]
Add logs to verify ILR optimization behaviour on boot
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0eda55ca

drm/amd/display: Unconditionally clear training pattern set after lt · 97d1765e

Wesley Chalmers authored Apr 05, 2021

[WHY]
While Link Training is being performed,
and the LTTPRs are in Non-LTTPR or LTTPR Transparent mode,
any DPCD registers besides those used for Link Training are not to be
accessed.

The spec defines the link training registers as DP_TRAINING_PATTERN_SET
(102h) to DP_TRAINING_LANE3_SET (106h), and DP_LANE0_1_STATUS (202h)
to DP_ADJUST_REQUEST_LANE2_3 (207h).

[HOW]
Move the current write to DPCD Address DP_LINK_TRAINING_PATTERN_SET out
of its conditional block.
Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

97d1765e

drm/amd/display: Fix FreeSync when RGB MPO in use · 41ef8fbb

Aric Cyr authored Mar 17, 2021

[WHY]
We should skip programming manual trigger on non-primary planes when MPO is
enabled.

[HOW]
Implement an explicit mechanism for skipping manual trigger programming
for planes that shouldn't cause the frame to end.
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

41ef8fbb

drm/amd/display: treat memory as a single-channel for asymmetric memory v2 · 9c82354e

Hugo Hu authored Jan 20, 2021

Previous change had been reverted since it caused hang.
Remake change to avoid defect.

[Why]
1. Driver use umachannelnumber to calculate watermarks for stutter.
In asymmetric memory config, the actual bandwidth is less than
dual-channel. The bandwidth should be the same as single-channel.
2. We found single rank dimm need additional delay time for stutter.

[How]
Get information from each DIMM. Treat memory config as a single-channel
for asymmetric memory in bandwidth calculating.
Add additional delay time for single rank dimm.

Fixes: b8720ed0 ("drm/amd/display: System black screen hangs on driver load")
Signed-off-by: Hugo Hu <hugo.hu@amd.com>
Reviewed-by: Sung Lee <Sung.Lee@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9c82354e

drm/amd/display: removed unused function dc_link_reallocate_mst_payload. · 8a20c973

Robin Singh authored Apr 05, 2021

[Why]
Found that dc_link_reallocate_mst_payload is not used anymore
in any of the use case scenario.

[How]
removed dc_link_reallocate_mst_payload function definition
and declaration.
Signed-off-by: Robin Singh <robin.singh@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

8a20c973

drm/amd/display: disable seamless boot for external DP · 19a274f6

Anthony Wang authored Apr 05, 2021

[Why]
Primary feature use case is with eDP panels.

[How]
Fail seamless boot validation if display is not an eDP panel.
Signed-off-by: Anthony Wang <anthony1.wang@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

19a274f6

drm/amd/display: add handling for hdcp2 rx id list validation · 4ccf9446

Dingchen (David) Zhang authored Jan 25, 2021

[why]
the current implementation of hdcp2 rx id list validation does not
have handler/checker for invalid message status, e.g. HMAC, the V
parameter calculated from PSP not matching the V prime from Rx.

[how]
return a generic FAILURE for any message status not SUCCESS or
REVOKED.
Signed-off-by: Dingchen (David) Zhang <dingchen.zhang@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4ccf9446

drm/amd/display: update hdcp display using correct CP type. · 26739690

Dingchen (David) Zhang authored Jan 08, 2021

[why]
currently we enforce to update hdcp display using TYPE0, but there
is case that connector CP type prop be TYPE1 instead of type0.

[how]
using the drm prop of CP type of the connector as input argument.
Signed-off-by: Dingchen (David) Zhang <dingchen.zhang@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

26739690

drm/amd/display: Add DSC check to seamless boot validation · 7cd69b95

Anthony Wang authored Apr 05, 2021

[Why & How]
We want to immediately fail seamless boot validation if DSC is active,
as VBIOS currently does not support DSC timings. Add a check for
the relevant flag in dc_validate_seamless_boot_timing.
Signed-off-by: Anthony Wang <anthony1.wang@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7cd69b95

drm/amd/display: fixed divide by zero kernel crash during dsc enablement · 19cc1f38

Robin Singh authored Dec 14, 2020

[why]
During dsc enable, a divide by zero condition triggered the
kernel crash.

[how]
An IGT test, which enable the DSC, was crashing at the time of
restore the default dsc status, becaue of h_totals value
becoming 0. So add a check before divide condition. If h_total
is zero, gracefully ignore and set the default value.

kernel panic log:

	[  128.758827] divide error: 0000 [#1] PREEMPT SMP NOPTI
	[  128.762714] CPU: 5 PID: 4562 Comm: amd_dp_dsc Tainted: G        W         5.4.19-android-x86_64 #1
	[  128.769728] Hardware name: ADVANCED MICRO DEVICES, INC. Mauna/Mauna, BIOS WMN0B13N Nov 11 2020
	[  128.777695] RIP: 0010:hubp2_vready_at_or_After_vsync+0x37/0x7a [amdgpu]
	[  128.785707] Code: 80 02 00 00 48 89 f3 48 8b 7f 08 b ......
	[  128.805696] RSP: 0018:ffffad8f82d43628 EFLAGS: 00010246
	......
	[  128.857707] CR2: 00007106d8465000 CR3: 0000000426530000 CR4: 0000000000140ee0
	[  128.865695] Call Trace:
	[  128.869712] hubp3_setup+0x1f/0x7f [amdgpu]
	[  128.873705] dcn20_update_dchubp_dpp+0xc8/0x54a [amdgpu]
	[  128.877706] dcn20_program_front_end_for_ctx+0x31d/0x463 [amdgpu]
	[  128.885706] dc_commit_state+0x3d2/0x658 [amdgpu]
	[  128.889707] amdgpu_dm_atomic_commit_tail+0x4b3/0x1e7c [amdgpu]
	[  128.897699] ? dm_read_reg_func+0x41/0xb5 [amdgpu]
	[  128.901707] ? dm_read_reg_func+0x41/0xb5 [amdgpu]
	[  128.905706] ? __is_insn_slot_addr+0x43/0x48
	[  128.909706] ? fill_plane_buffer_attributes+0x29e/0x3dc [amdgpu]
	[  128.917705] ? dm_plane_helper_prepare_fb+0x255/0x284 [amdgpu]
	[  128.921700] ? usleep_range+0x7c/0x7c
	[  128.925705] ? preempt_count_sub+0xf/0x18
	[  128.929706] ? _raw_spin_unlock_irq+0x13/0x24
	[  128.933732] ? __wait_for_common+0x11e/0x18f
	[  128.937705] ? _raw_spin_unlock_irq+0x13/0x24
	[  128.941706] ? __wait_for_common+0x11e/0x18f
	[  128.945705] commit_tail+0x8b/0xd2 [drm_kms_helper]
	[  128.949707] drm_atomic_helper_commit+0xd8/0xf5 [drm_kms_helper]
	[  128.957706] amdgpu_dm_atomic_commit+0x337/0x360 [amdgpu]
	[  128.961705] ? drm_atomic_check_only+0x543/0x68d [drm]
	[  128.969705] ? drm_atomic_set_property+0x760/0x7af [drm]
	[  128.973704] ? drm_mode_atomic_ioctl+0x6f3/0x85a [drm]
	[  128.977705] drm_mode_atomic_ioctl+0x6f3/0x85a [drm]
	[  128.985705] ? drm_atomic_set_property+0x7af/0x7af [drm]
	[  128.989706] drm_ioctl_kernel+0x82/0xda [drm]
	[  128.993706] drm_ioctl+0x225/0x319 [drm]
	[  128.997707] ? drm_atomic_set_property+0x7af/0x7af [drm]
	[  129.001706] ? preempt_count_sub+0xf/0x18
	[  129.005713] amdgpu_drm_ioctl+0x4b/0x76 [amdgpu]
	[  129.009705] vfs_ioctl+0x1d/0x2a
	[  129.013705] do_vfs_ioctl+0x419/0x43d
	[  129.017707] ksys_ioctl+0x52/0x71
	[  129.021707] __x64_sys_ioctl+0x16/0x19
	[  129.025706] do_syscall_64+0x78/0x85
	[  129.029705] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Robin Singh <robin.singh@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Reviewed-by: Robin Singh <Robin.Singh@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

19cc1f38

drm/amdgpu: fix GCR_GENERAL_CNTL offset for dimgrey_cavefish · 7c49ee9e

Jiansong Chen authored Apr 19, 2021

dimgrey_cavefish has similar gc_10_3 ip with sienna_cichlid,
so follow its registers offset setting.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7c49ee9e

drm/amdgpu: resolve erroneous gfx_v9_4_2 prints · f9727922

John Clements authored Apr 19, 2021

resolve bug on aldebaran where gfx error counts will
print on driver load when there are no errors present
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f9727922

drm/amdgpu: fix a error injection failed issue · 6df23f4c

Dennis Li authored Apr 16, 2021

because "sscanf(str, "retire_page")" always return 0, if application use
the raw data for error injection, it always wrongly falls into "op ==
3". Change to use strstr instead.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6df23f4c

drm/amdgpu: only harvest gcea/mmea error status in aldebaran · 1f8d3ad2

Hawking Zhang authored Apr 16, 2021

In aldebaran, driver only needs to harvest SDP
RdRspStatus, WrRspStatus and first parity error
on RdRsp data. Check error type before harvest
error information.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1f8d3ad2

drm/amdgpu: only harvest gcea/mmea error status in arcturus · 53ee6609

Hawking Zhang authored Apr 16, 2021

SDP RdRspStatus/WrRspStatus or first parity error on
RdRsp data can cause system fatal error in arcturus.
GPU will be freezed in such case.

Driver needs to harvest these error information before
reset the GPU. Check error type to avoid harvest normal
gcea/mmea information.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

53ee6609

drm/amdgpu: enable tmz on renoir asics · 9406d39b

Huang Rui authored Apr 14, 2021

The tmz functions are verified on renoir chips as well. So enable it by
default.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9406d39b

drm/amdgpu: correct default gfx wdt timeout setting · 28a5d7a5

Hawking Zhang authored Apr 16, 2021

When gfx wdt was configured to fatal_disable, the
timeout period should be configured to 0x0 (timeout
disabled)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

28a5d7a5