Commits · ded750e6faaf965f7b07f8468488381dcb790c9c · nexedi / linux

04 Aug, 2020 31 commits

drm/amd/display: [FW Promotion] Release 0.0.27 · ded750e6

Anthony Koo authored Jul 24, 2020

| [Header Changes]
|       - Reworked the FW versioning to include hotfix
|         and test bits
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ded750e6

drm/amd/display: Separate pipe disconnect from rest of progrmaming · 4453fbec

Alvin Lee authored Jul 22, 2020

[Why]
When changing pixel formats for HDR (e.g. ARGB -> FP16)
there are configurations that change from 2 pipes to 1 pipe.
In these cases, it seems that disconnecting MPCC and doing
a surface update at the same time(after unlocking) causes
some registers to be updated slightly faster than others
after unlocking (e.g. if the pixel format is updated to FP16
before the new surface address is programmed, we get
corruption on the screen because the pixel formats aren't
matching). We separate disconnecting MPCC from the rest
of  the  pipe programming sequence to prevent this.

[How]
Move MPCC disconnect into separate operation than the
rest of the pipe programming.
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4453fbec

drm/amd/display: Add debugfs for forcing stream timing sync · 3d4e52d0

Victor Lu authored Jul 21, 2020

[why]
There's currently no method to enable multi-stream synchronization from
userspace and we don't check the VSDB bits to know whether or not
specific displays should have the feature enable.

[how]
Add a debugfs entry that controls a new DM debug option,
"force_timing_sync". This debug option will set on any newly created
stream following the change to the debug option.
Expose a new interface from DC that performs the timing sync and a helper
to the "force_timing_sync" debugfs that iterates over the current streams
and modifies the current synchornization state and grouping.

Example usage to force a resync (from an X based desktop):

echo 1 > /sys/kernel/debug/dri/0/amdgpu_dm_force_timing_sync
xset dpms force off && xset dpms force on
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3d4e52d0

drm/amd/display: Display goes blank after inst · da83b385

Igor Kravchenko authored Jul 24, 2020

[why]
Display goes blank after driver installation.
Aux tuning parameters must be used for 2.x only.
Wrong dc_golden_table offset was used.

[How]
Implement a new enc3_hw_init function without VBIOS constants usage to
be called for 3.x
Calculate dc_golden_table offset using sum of
base dce_info offset and golden table offset
Signed-off-by: Igor Kravchenko <Igor.Kravchenko@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

da83b385

drm/amd/display: Change null plane state swizzle mode to 4kb_s · 0914d115

George Shen authored Jul 17, 2020

[Why]
During SetPathMode and UpdatePlanes, the plane state can be null. We default
to linear swizzle mode when plane state is null. This resulted in bandwidth
validation failing when trying to set 8K60 mode (which previously passed validation
during rebuild timing list).

[How]
Change the default swizzle mode from linear to 4kb_s and update pitch accordingly.
Signed-off-by: George Shen <george.shen@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0914d115

drm/amd/display: Use helper function to check for HDMI signal · 519d91d8

JinZe.Xu authored Jul 21, 2020

[How]
Use dc_is_hdmi_signal to determine signal type.
Signed-off-by: JinZe.Xu <JinZe.Xu@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

519d91d8

drm/amd/display: AMD OUI (DPCD 0x00300) skipped on some sink · d0246567

Aric Cyr authored Jul 23, 2020

[Why]
Sink OUI supported cap is not set so driver skips programming it.

[How]
Revert the change the skips OUI programming if the cap is not set
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d0246567

drm/amd/display: Comments on how to use DSC debugfs some entries · 87353ae8

Eryk Brol authored Jun 19, 2020

[why]
Some of the DSC debugfs read enteries are missing comments
explaining how to use and how to comprehend the results.
Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

87353ae8

drm/amd/display: Fix logger context · 06ff02fc

Harry Wentland authored Jun 30, 2020

[Why&How]
use correct logger context
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

06ff02fc

drm/amd/display: DSC Bit target rate debugfs write entry · 5268bf13

Eryk Brol authored Jun 19, 2020

[Why]
We need to be able to specify bits per pixel for DSC on any
connector.

[How]
Overwrite computed DSC target rate in dsc_cfg, with requested value.
Overwrites for both SST and MST connectors, but in different places, but the process is identical. Overwrites only if DSC is decided to be enabled on that connector.
Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5268bf13

drm/amd/display: populate new dml variable · a245528c

Dmytro Laktyushkin authored Jun 26, 2020

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Eric Bernstein <Eric.Bernstein@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a245528c

drm/amd/display: Read VBIOS Golden Settings Tbl · 6224220d

Igor Kravchenko authored Jul 19, 2020

[Why]
For ver.4.4 and higher VBIOS contains default setting table.

{How]
Read Golden Settings Table from VBIOS, apply Aux tuning parameters.
Signed-off-by: Igor Kravchenko <Igor.Kravchenko@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6224220d

drm/amd/display: Use parameter for call to set output mux · 1174eb89

Eric Bernstein authored Jul 20, 2020

Signed-off-by: Eric Bernstein <eric.bernstein@amd.com>
Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1174eb89

drm/amd/display: Update virtual stream encoder · d8a8258e

Eric Bernstein authored Jul 20, 2020

Signed-off-by: Eric Bernstein <eric.bernstein@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d8a8258e

drm/amd/display: DSC Slice height debugfs write entry · 734e4c97

Eryk Brol authored Jun 17, 2020

[Why]
We need to be able to specify slice height for any connector's DSC

[How]
Overwrite computed parameters in dsc_cfg, with the value needed/
Overwrites for both SST and MST connectors, but in different places, but the process is identical. Overwrites only if DSC is decided to be enabled on that connector.
Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

734e4c97

drm/amdgpu: added RAS EEPROM device support check · 4bfb7428

John Clements authored Aug 03, 2020

updated RAS EEPROM init/threshold sequences to check for device support
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4bfb7428

drm/amdgpu: enable RAS support for sienna cichlid · 0ad7a64d

John Clements authored Aug 03, 2020

enabled GECC error injection and query support
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0ad7a64d

drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v5) · a300de40

Monk Liu authored Jul 27, 2020

what:
the MQD's save and restore of KCQ (kernel compute queue)
cost lots of clocks during world switch which impacts a lot
to multi-VF performance

how:
introduce a paramter to control the number of KCQ to avoid
performance drop if there is no kernel compute queue needed

notes:
this paramter only affects gfx 8/9/10

v2:
refine namings

v3:
choose queues for each ring to that try best to cross pipes evenly.

v4:
fix indentation
some cleanupsin the gfx_compute_queue_acquire()

v5:
further fix on indentations
more cleanupsin gfx_compute_queue_acquire()

TODO:
in the future we will let hypervisor driver to set this paramter
automatically thus no need for user to configure it through
modprobe in virtual machine
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a300de40

drm/amdgpu: update eeprom once specifying one bigger threshold(v3) · 9b856def

Guchun Chen authored Jul 27, 2020

During driver's probe, when it hits bad gpu tag in eeprom i2c
init calling(the tag was set when reported bad page reaches
bad page threshold in last driver's working loop), there are
some strategys to deal with the cases:

1. when the module parameter amdgpu_bad_page_threshold = 0,
that means page retirement feature is disabled, so just resetting
the eeprom is fine.
2. When amdgpu_bad_page_threshold is not 0, and moreover, user
sets one bigger valid data in order to make current boot up
succeeds, correct eeprom header tag and do not break booting.
3. For other cases, driver's probe will be broken.

v2: Just update eeprom header tag instead of resetting the whole
    table header when user sets one bigger threshold data.

v3: Use dev_info/dev_err to print PCI device information, which
    helps in mGPU case.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9b856def

drm/amdgpu: disable page reservation when amdgpu_bad_page_threshold = 0 · a219ecbb

Guchun Chen authored Jul 27, 2020

When amdgpu_bad_page_threshold = 0, bad page reservation stuffs
are skipped in either UMC ECC irq or page retirement calling of
sync flood isr.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a219ecbb

drm/amdgpu: decouple sysfs creating of bad page node · f848159b

Guchun Chen authored Jul 31, 2020

Bad page information should not be exposed by sysfs when
bad page retirement is disabled, so decouple it from ras
sysfs group creating, and add one guard before creating.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f848159b

drm/amdgpu: add one definition for RAS's sysfs/debugfs name(v2) · eb0c3cd4

Guchun Chen authored Jul 31, 2020

Add one definition for the RAS module's FS name. It's used
in both debugfs and sysfs cases.

v2: Use static variable instead of macro definition.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

eb0c3cd4

drm/amdgpu: restore ras flags when user resets eeprom(v2) · bf0b91b7

Guchun Chen authored Jul 22, 2020

RAS flags needs to be cleaned as well when user requires
one clean eeprom.

v2: RAS flags shall be restored after eeprom reset succeeds.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bf0b91b7

drm/amdgpu: break GPU recovery once it's in bad state(v4) · e8fbaf03

Guchun Chen authored Jul 23, 2020

When GPU executes recovery and retriving bad GPU tag
from external eerpom device, the recovery will be broken
and error message is printed as well for user's awareness.

v2: Refine warning message in threshold reaching case, and
    fix spelling typo.

v3: Fix explicit calling of bad gpu.

v4: Rename function names.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e8fbaf03

drm/amdgpu: schedule ras recovery when reaching bad page threshold(v2) · 9c06f91f

Guchun Chen authored Jul 23, 2020

Once the bad page saved to eeprom reaches the configured
threshold, ras recovery will be issued to notify user.

v2: Fix spelling typo.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9c06f91f

drm/amdgpu: skip bad page reservation once issuing from eeprom write · 35cd2cda

Guchun Chen authored Jul 23, 2020

Once the ras recovery is issued from eeprom write itself,
bad page reservation should be ignored, otherwise, recursive
calling of writting to eeprom would happen.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

35cd2cda

drm/amdgpu: break driver init process when it's bad GPU(v5) · b82e65a9

Guchun Chen authored Jul 23, 2020

When retrieving bad gpu tag from eeprom, GPU init should
fail as the GPU needs to be retired for further check.

v2: Fix spelling typo, correct the condition to detect
    bad gpu tag and refine error message.

v3: Refine function argument name.

v4: Fix missing check of returning value of i2c
    initialization error case.

v5: Use dev_err to print PCI information in dmesg instead
    of DRM_ERROR.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b82e65a9

drm/amdgpu: add bad gpu tag definition · 1d6a9d12

Guchun Chen authored Jul 23, 2020

This tag will be hired for bad gpu detection in eeprom's access.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1d6a9d12

drm/amdgpu: validate bad page threshold in ras(v3) · c84d4670

Guchun Chen authored Jul 22, 2020

Bad page threshold value should be valid in the range between
-1 and max records length of eeprom. It could determine when
saved bad pages exceed threshold value, and proceed corresponding
actions.

v2: When using the default typical value, it should be min
value between typical value and eeprom max records length.

v3: drop the case of setting bad_page_cnt_threshold to be
    0xFFFFFFFF, as it confuses user.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c84d4670

drm/amdgpu: add bad page count threshold in module parameter(v3) · acc0204c

Guchun Chen authored Jul 21, 2020

bad_page_threshold could be configured to enable/disable the
associated bad page retirement feature in RAS.

When it's -1, ras will use typical bad page failure value to
handle bad page retirement.

When it's 0, disable bad page retirement, and no bad page
will be recorded and saved.

For other valid value, driver will use this manual value
as the threshold value of totoal bad pages.

v2: correct documentation of this parameter.

v3: remove confused statement in documentation.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

acc0204c

drm/amdkfd: Replace bitmask with event idx in SMI event msg · 522ec6e0

Mukul Joshi authored Jul 30, 2020

Event bitmask is a 64-bit mask with only 1 bit set. Sending this
event bitmask in KFD SMI event message is both wasteful of memory
and potentially limiting to only 64 events. Instead send event
index in SMI event message.
Please note this change does not break the ABI for the two event
types defined so far. The new index is identical to the mask used
before.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

522ec6e0

30 Jul, 2020 9 commits

Revert "drm/amdgpu: Fix NULL dereference in dpm sysfs handlers" · 2456c290

Alex Deucher authored Jul 30, 2020

This regressed some working configurations so revert it.  Will
fix this properly for 5.9 and backport then.

This reverts commit 38e0c89a.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

2456c290

drm/amdgpu: enable GFXOFF for navy_flounder · 74b35959

Jiansong Chen authored Jul 30, 2020

Enable GFXOFF for navy_flounder.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

74b35959

drm amdgpu: Skip tmr load for SRIOV · f61772cd

Liu ChengZhe authored Jul 24, 2020

1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation;
2. Check pointer before release firmware.

v2: use CHIP_SIENNA_CICHLID instead
v3: remove local "bool ret"; fix grammer issue
v4: use my name instead of "root"
v5: fix grammer issue and indent issue
Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f61772cd

drm/amdgpu: fix PSP autoload twice in FLR · 392cf6a7

Liu ChengZhe authored Jul 24, 2020

Assigning false to block->status.hw overwrites PSP's previous
hardware status, which causes the PSP to Resume operation after
hardware init.

Remove this assignment and let the PSP execute Resume operation
when it is told to.

v2: Remove the braces.
v3: Modify the description.
Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

392cf6a7

drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tail · 178b0013

Daniel Vetter authored Jul 27, 2020

Trying to grab dma_resv_lock while in commit_tail before we've done
all the code that leads to the eventual signalling of the vblank event
(which can be a dma_fence) is deadlock-y. Don't do that.

Here the solution is easy because just grabbing locks to read
something races anyway. We don't need to bother, READ_ONCE is
equivalent. And avoids the locking issue.

v2: Also take into account tmz_surface boolean, plus just delete the
old code.

Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

178b0013

drm/amd/powerplay: Remove unneeded cast from memory allocation · 317469f6

Li Heng authored Jul 29, 2020

Remove casting the values returned by memory allocation function.

Coccinelle emits WARNING:

./drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c:893:37-46: WARNING: casting value returned by memory allocation function to (PPTable_t *) is useless.
Signed-off-by: Li Heng <liheng40@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

317469f6

drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl() · 8e326285

Peilin Ye authored Jul 28, 2020

Compiler leaves a 4-byte hole near the end of `dev_info`, causing
amdgpu_info_ioctl() to copy uninitialized kernel stack memory to userspace
when `size` is greater than 356.

In 2015 we tried to fix this issue by doing `= {};` on `dev_info`, which
unfortunately does not initialize that 4-byte hole. Fix it by using
memset() instead.

Cc: stable@vger.kernel.org
Fixes: c193fa91 ("drm/amdgpu: information leak in amdgpu_info_ioctl()")
Fixes: d38ceaf9 ("drm/amdgpu: add core driver (v4)")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

8e326285

drm/amd/display: Clear dm_state for fast updates · 76195175

Mazin Rezk authored Jul 27, 2020

This patch fixes a race condition that causes a use-after-free during
amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits
are requested and the second one finishes before the first. Essentially,
this bug occurs when the following sequence of events happens:

1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is
deferred to the workqueue.

2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is
deferred to the workqueue.

3. Commit #2 starts before commit #1, dm_state #1 is used in the
commit_tail and commit #2 completes, freeing dm_state #1.

4. Commit #1 starts after commit #2 completes, uses the freed dm_state
1 and dereferences a freelist pointer while setting the context.

Since this bug has only been spotted with fast commits, this patch fixes
the bug by clearing the dm_state instead of using the old dc_state for
fast updates. In addition, since dm_state is only used for its dc_state
and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found,
removing the dm_state should not have any consequences in fast updates.

This use-after-free bug has existed for a while now, but only caused a
noticeable issue starting from 5.7-rc1 due to 3202fa62 ("slub: relocate
freelist pointer to middle of object") moving the freelist pointer from
dm_state->base (which was unused) to dm_state->context (which is
dereferenced).

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383
Fixes: bd200d19 ("drm/amd/display: Don't replace the dc_state for fast updates")
Reported-by: Duncan <1i5t5.duncan@cox.net>
Signed-off-by: Mazin Rezk <mnrzk@protonmail.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

76195175

drm/amdgpu: update GC golden setting for navy_flounder · defa4896

Jiansong Chen authored Jul 29, 2020

Update GC golden setting for navy_flounder.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

defa4896