Commits · 5d2c102deff63ff8980dfa848ee41858d255c291 · Kirill Smelkov / linux

23 Jul, 2024 33 commits

drm/amd/display: Do 1-to-1 mapping between OPP and DSC in DML2 · 5d2c102d

Sung Joon Kim authored Jul 02, 2024

[why]
To determine which block instance to power-gate,
we look at the available pipe resource for both plane
and stream. On MPO, DSC3 was falsely powered on even
though only 1 stream path was enabled because
the resource mapping was not done correctly.

[how]
Acquire the correct DSC instance to power on / off based
on the instance of OPP which determines the backend
pipe index.
Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Refactoring MMHUBBUB · 906fd46a

Revalla Hari Krishna authored Jul 02, 2024

[Why]
To refactor MMHUBBUB files

[How]
Moved mmhubbub files from dcn20 to the /mmhubbub/ folder and
updated the Makefile to fix compilation.
Reviewed-by: Martin Leung <martin.leung@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Revalla Hari Krishna <harikrishna.revalla@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Deallocate DML memory if allocation fails · 892abca6

Chris Park authored Jun 28, 2024

[Why]
When DML memory allocation fails during DC state creation, the memory
already allocated is not freed afterwards, resulting in an uninitialized
structure that is not NULL.

[How]
Deallocate memory if DML memory allocation fails.
Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Chris Park <chris.park@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
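
The cleanup pattern described in this commit can be sketched in plain C as follows. This is only an illustration under stated assumptions: `dc_state_sketch`, `dml_ctx`, `state_create()` and `state_destroy()` are hypothetical names, and `calloc`/`free` stand in for whatever allocator the driver actually uses.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the DC state and its nested DML memory. */
struct dml_ctx {
    int scratch[64];
};

struct dc_state_sketch {
    struct dml_ctx *dml;
};

/*
 * Create a state object. If the nested allocation fails, free the
 * partially built state and return NULL, so the caller never sees a
 * non-NULL but uninitialized structure.
 * fail_dml is a test hook (an assumption of this sketch) that
 * simulates failure of the nested allocation.
 */
struct dc_state_sketch *state_create(int fail_dml)
{
    struct dc_state_sketch *state = calloc(1, sizeof(*state));

    if (!state)
        return NULL;

    state->dml = fail_dml ? NULL : calloc(1, sizeof(*state->dml));
    if (!state->dml) {
        free(state);    /* undo the outer allocation */
        return NULL;
    }
    return state;
}

/* Release a state created by state_create(); NULL is accepted. */
void state_destroy(struct dc_state_sketch *state)
{
    if (!state)
        return;
    free(state->dml);
    free(state);
}
```

With this shape a caller only has to test the returned pointer against NULL; there is no half-initialized object to track.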

drm/amd/display: Check stream before comparing them · 35ff747c

Alex Hung authored Jun 27, 2024

[WHAT & HOW]
amdgpu_dm can pass a null stream to dc_is_stream_unchanged. It is
necessary to check for null before dereferencing it.

This fixes 1 FORWARD_NULL issue reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
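
A minimal sketch of the null check described above, in plain C. The struct and the function `stream_unchanged()` are hypothetical and much simpler than the real `dc_is_stream_unchanged`; treating a null stream as "changed" is an assumption of this sketch, not something the commit message states.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stream description used only for illustration. */
struct stream_sketch {
    int timing_id;
    int link_id;
};

/*
 * Compare two streams. Both pointers are validated before any field is
 * read, which is the kind of guard a FORWARD_NULL report asks for.
 */
bool stream_unchanged(const struct stream_sketch *old_stream,
                      const struct stream_sketch *stream)
{
    if (!old_stream || !stream)
        return false;   /* assumption: a missing stream counts as changed */

    if (old_stream == stream)
        return true;

    return old_stream->timing_id == stream->timing_id &&
           old_stream->link_id == stream->link_id;
}
```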

drm/amd/display: Check null pointers before using them · 1ff12bcd

Alex Hung authored Jun 27, 2024

[WHAT & HOW]
These pointers are null checked previously in the same function,
indicating they might be null as reported by Coverity. As a result,
they need to be checked when used again.

This fixes 3 FORWARD_NULL issues reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Pass non-null to dcn20_validate_apply_pipe_split_flags · 55595987

Alex Hung authored Jun 27, 2024

[WHAT & HOW]
"dcn20_validate_apply_pipe_split_flags" dereferences merge, and thus it
cannot be a null pointer. Let's pass a valid pointer to avoid null
dereference.

This fixes 2 FORWARD_NULL issues reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Check phantom_stream before it is used · 3718a619

Alex Hung authored Jun 20, 2024

dcn32_enable_phantom_stream can return null, so the returned value
must be checked before it is used.

This fixes 1 NULL_RETURNS issue reported by Coverity.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Check null-initialized variables · 367cd9ce

Alex Hung authored Jun 27, 2024

[WHAT & HOW]
drr_timing and subvp_pipe are initialized to null and they are not
always assigned new values. It is necessary to check for null before
dereferencing.

This fixes 2 FORWARD_NULL issues reported by Coverity.
Reviewed-by: Nevenko Stupar <nevenko.stupar@amd.com>
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Initialize denominators' default to 1 · b995c0a6

Alex Hung authored Jun 18, 2024

[WHAT & HOW]
Variables that are used as denominators, and that may not be assigned
another value, should not be 0. Change their default to 1 so they are never 0.

This fixes 10 DIVIDE_BY_ZERO issues reported by Coverity.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
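
The idea of defaulting a denominator to 1 can be illustrated with a small C sketch. The struct `timing_sketch` and the function `refresh_hz()` are hypothetical and are not taken from the driver; they only show why a default of 1 avoids a division by zero on the path where no other value is assigned.

```c
/* Hypothetical timing description; field names are illustrative only. */
struct timing_sketch {
    unsigned int pix_clk_100hz; /* pixel clock in units of 100 Hz */
    unsigned int h_total;
    unsigned int v_total;
};

/*
 * Compute a refresh rate in Hz. The denominator starts at 1 and is only
 * overwritten when both totals are non-zero, so the division below can
 * never divide by zero.
 */
unsigned long long refresh_hz(const struct timing_sketch *t)
{
    unsigned long long denominator = 1; /* default is 1, never 0 */

    if (t->h_total && t->v_total)
        denominator = (unsigned long long)t->h_total * t->v_total;

    return ((unsigned long long)t->pix_clk_100hz * 100ULL) / denominator;
}
```

When the totals are missing the result is not meaningful as a refresh rate, but the function returns instead of faulting, which is the property a DIVIDE_BY_ZERO report is concerned with.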

drm/amd/display: Refactoring OPP · f60881ca

Revalla Hari Krishna authored Jun 26, 2024

[Why]
To refactor OPP files

[How]
Moved opp related files to specific opp folder and
updated Makefiles.
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Revalla Hari Krishna <harikrishna.revalla@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix DP-DVI dongle hotplug · d94df7ca

Gabe Teeger authored Jun 28, 2024

[why]
Hotplugging with a DP-DVI dongle on pre-RDNA embedded platforms
works only about half the time. The regression was traced to the
setting of link->type here.
[what]
Revert the fix, keeping only the logging that it added.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Disable subvp based on HW cursor requirement · c18fa08e

Alvin Lee authored Jun 27, 2024

[Description]
- There are situations where HW cursor is required
- In these scenarios we should disable subvp based on the HW cursor
  requirement
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: When resync fifo ensure to use correct pipe ctx · b3c9c9af

Alvin Lee authored Jun 27, 2024

We resync the FIFO after each pipe update in apply_ctx_to_hw.
However, this means that some pipes (in hardware) are based on the
new context and some are based on the current_state (since the pipes
are updated one at a time). In this case we must ensure to use the
pipe_ctx that's currently still configured in hardware when turning
off / on OTG's and reconfiguring ODM during the resync.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add option to allow transition when odm is forced · f5c78386

Sridevi Arvindekar authored Jun 27, 2024

Added option to allow transition for forced odm.
Add the variation to the nightly run.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Sridevi Arvindekar <sarvinde@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: avoid disable otg when dig was disabled · 21878404

Jingwen Zhu authored Jun 25, 2024

[Why]
This is a workaround for a DCN3.1 hang that happens if the OTG dispclk
is ramped while the OTG is on and the stream encoder is off.
However, this workaround should not trigger when a DIG is active.

[How]
Avoid disabling the OTG when the DIG FE/BE FIFO was not switched.
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Jingwen Zhu <jingwen.zhu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Implement bias and scale pre scl · c83ecc0b

Relja Vojvodic authored Jun 27, 2024

why:
New scaler needs the input to be full range color space. This will also fix
issues that come up due to not having a predefined limited color space matrix
for certain color spaces

how:
Use bias and scale HW to expand the range of limited color spaces to full
before the scaler
Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Relja Vojvodic <relja.vojvodic@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
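
As a rough illustration of what "bias and scale" means when expanding a limited-range signal to full range, the sketch below applies the standard 8-bit mapping from code range 16..235 to 0..255. It is plain integer arithmetic in C; the function name is hypothetical, and the actual hardware register formats, bit depths and chroma handling are not described in the commit message and are not modeled here.

```c
/*
 * Expand an 8-bit limited-range code (16..235) to full range (0..255).
 * bias  = -16       (remove the limited-range black level)
 * scale = 255 / 219 (219 = 235 - 16, the limited-range span)
 * Inputs outside 16..235 are clamped first (an assumption of this sketch).
 */
int expand_limited_to_full_8bit(int code)
{
    const int black = 16;
    const int white = 235;
    const int span = white - black; /* 219 */

    if (code < black)
        code = black;
    if (code > white)
        code = white;

    /* Apply the bias, then the scale, rounding to nearest. */
    return ((code - black) * 255 + span / 2) / span;
}
```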

drm/amd/display: apply vmin optimization even if it doesn't reach vmin level · 5fc77c26

Wenjing Liu authored May 31, 2024

[why]
Based on power measurement result, in most cases when display clock is higher
than Vmin display clock, lowering display clock using dynamic ODM will improve
overall power consumption by 0 to 4 watts even if we can't reach Vmin.

[how]
Allow vmin optimization to be applied even if dispclk can't reach Vmin.
Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: fix dscclk programming sequence on DCN401 · 3c915431

Wenjing Liu authored Jun 25, 2024

[why]
The mux to switch between refclk and dto_dsc_clk is non double buffered.
However dto dsc clk's phase and modulo divider registers are currently
configured as double buffered update. This causes a problem when we switch to
use dto dsc clk and program phase and modulo in the same sequence. In this
sequence dsc clk is switched to dto but the clock divider programming doesn't
take effect until next frame. When we try to program DSCC registers, SMN bus
will hang because dto dsc clk divider phase is set to 0.

[how]
Configure phase and modulo to take effect immediately. Always switch to dto dsc
clk before DSC clock is ungated. Switch back to refclk after DSC clock is gated.
Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Reviewed-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Revert "Check HDCP returned status" · bc2fe69f

Alex Hung authored Jun 25, 2024

This reverts commit 5d93060d due to a
power consumption regression.
Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Replace assert with error message in dp_retrieve_lttpr_cap() · e8d77cfd

Roman Li authored Jun 25, 2024

[Why]
When assert in dp_retrieve_lttpr_cap() is hit, dmesg has traces like:

 RIP: 0010:dp_retrieve_lttpr_cap+0xcc/0x1a0 [amdgpu]
 Call Trace:
 <TASK>
  dp_retrieve_lttpr_cap+0xcc/0x1a0 [amdgpu]
  report_bug+0x1e8/0x240
  handle_bug+0x46/0x80
  link_detect+0x35/0x580 [amdgpu]

It happens when LTTPRs fail to increment dpcd repeater count.
We have a recovery action in place for such cases.
The assert is misleading; an indicative error in dmesg is more useful.

[How]
Remove ASSERT and use DC_LOG_ERROR instead.
Reviewed-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Added logging for automated DPM testing · 98579743

Ryan Seto authored Jun 26, 2024

[Why]
Added clock logs to automate DPM testing

[How]
Added logs and helper functions to output clocks
Co-authored-by: Ryan Seto <ryanseto@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Ryan Seto <ryanseto@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Don't consider cursor for no plane case in DML1 · 0961367c

Alvin Lee authored Jun 25, 2024

[Description]
For no plane scenarios we should not consider cursor as there cannot
be any cursor if there are no planes. This fixes an issue where
dc_commit_streams fails due to prefetch bandwidth requirements
(the display config + dummy planes + cursor causes the prefetch
bandwidth to exceed what is possible).
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: quality improvements for EASF and ISHARP · 5f30ee49

Samson Tam authored Jun 20, 2024

[Why]
Update coefficients and LUT tables for scaler and sharpener
 to improve quality and support different use cases (SDR/HDR)

[How]
Move scaler coefficients to new file dc_spl_scl_easf_filters.c
Remove older coefficients file dc_sp_scl_filters_old.c
Update default taps for EASF support
Update LLS policy for DON'T CARE case
Update cositing offset from 0.5 to 0.25
Add support to adjust sharpness based on level, use case,
 and scaling ratio ( using discrete levels )
Apply sharpness to all RGB surfaces and both NV12 and P010
 video ( in fullscreen only ).  Upscale and 1:1 ratios only
Enable scaler when sharpening 1:1 ratios
Add support for coefficients that are in S1.10 format
 (convert to S1.12 format)
Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Samson Tam <samson.tam@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
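
The S1.10 to S1.12 conversion mentioned in the last [How] item amounts to adding two fractional bits. A minimal sketch in C, assuming the coefficient is held as a two's-complement signed integer (the driver's actual storage format and helper names are not given in the commit message; `s1_10_to_s1_12()` is a hypothetical name):

```c
#include <stdint.h>

/*
 * Convert a fixed-point coefficient from S1.10 (10 fractional bits) to
 * S1.12 (12 fractional bits). Adding two fractional bits is a
 * multiplication by 4; multiplying rather than shifting keeps the
 * behaviour well defined for negative values.
 * A valid S1.10 input lies in -2048..2047, so the result fits in int16_t.
 */
int16_t s1_10_to_s1_12(int16_t coeff_s1_10)
{
    return (int16_t)(coeff_s1_10 * 4);
}
```

For example, 1.0 is 1024 in S1.10 and 4096 in S1.12, and -0.5 is -512 in S1.10 and -2048 in S1.12.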

drm/amd/display: Disable HBR audio for DP2 for certain ASICs · 4ccc8fdc

Alvin Lee authored Sep 12, 2023

[Description]
Due to a HW bug, HBR audio is not supported for
DP2 encoders for certain ASICs.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Disable replay if VRR capability is false · b6841761

Tom Chung authored Jun 26, 2024

[Why]
VRR must be supported for the panel replay feature.
If the VRR capability is false, the panel replay capability also
needs to be disabled.

[How]
After updating the VRR capability, also check whether the panel
replay capability needs to be disabled.
Reviewed-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Jerry Zuo <jerry.zuo@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add print support for sdma_v_5_0 ip_dump · e84f798a

Sunil Khatri authored Jul 16, 2024

Add support for ip dump for sdma_v_5_0 in devcoredump.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add sdma_v5_0 ip dump for devcoredump · 0f1a9370

Sunil Khatri authored Jul 16, 2024

Add ip dump for sdma_v5_0 for devcoredump for all
instances of sdma.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add print support for sdma_v_6_0 ip_dump · ccb54d7d

Sunil Khatri authored Jul 16, 2024

Add print support for ip dump for sdma_v_6_0 in
devcoredump.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add sdma_v6_0 ip dump for devcoredump · 1eba165a

Sunil Khatri authored Jul 16, 2024

Add ip dump for sdma_v6_0 for devcoredump for all
instances of sdma.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix the print message in devcoredump · 00bb3223

Sunil Khatri authored Jul 12, 2024

Fix the memory type logged for gtt memory size
which is wrongly logged as visible vram size.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix the extra space between two functions · 43796955

Sunil Khatri authored Jul 16, 2024

fix extra line space between two functions.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add print support for sdma_v_5_2 ip_dump · 08bed7e4

Sunil Khatri authored Jul 12, 2024

Add support for ip dump for sdma_v_5_2 in devcoredump.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add sdma_v5_2 ip dump for devcoredump · f763c3b5

Sunil Khatri authored Jul 12, 2024

Add ip dump for sdma_v5_2 for devcoredump for all
instances of sdma.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

22 Jul, 2024 4 commits

Merge tag 'amd-drm-fixes-6.11-2024-07-18' of... · 627a24f5

Dave Airlie authored Jul 22, 2024

Merge tag 'amd-drm-fixes-6.11-2024-07-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-fixes-6.11-2024-07-18:

amdgpu:
- Bump driver version for GFX12 DCC
- DC documentation warning fixes
- VCN unified queue power fix
- SMU fix
- RAS fix
- Display corruption fix
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240718215258.79356-1-alexander.deucher@amd.com

Merge tag 'drm-misc-next-fixes-2024-07-19' of... · 412dbc66

Dave Airlie authored Jul 22, 2024

Merge tag 'drm-misc-next-fixes-2024-07-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

Two fixes for v3d to fix an array indexing on newer V3D revisions.
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240719-emerald-newt-of-skill-89b54a@houat

Merge tag 'drm-xe-next-fixes-2024-07-18' of... · 78e6e468

Dave Airlie authored Jul 22, 2024

Merge tag 'drm-xe-next-fixes-2024-07-18' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

- Xe_exec ioctl minor fix on sync entry cleanup upon error (Ashutosh)
- SRIOV: limit VF LMEM provisioning (Michal)
- Wedge mode fixes (Brost)
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zpk6CI0FDoTJwkSb@intel.com

Merge tag 'drm-intel-next-fixes-2024-07-18' of... · 7d4ecf37

Dave Airlie authored Jul 22, 2024

Merge tag 'drm-intel-next-fixes-2024-07-18' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next

- Reset intel_dp->link_trained before retraining the link [dp] (Imre Deak)
- Don't switch the LTTPR mode on an active link [dp] (Imre Deak)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZpjgtowjpUZoHvrl@linux

18 Jul, 2024 3 commits

drm/xe: Don't suspend device upon wedge · 90936a0a

Matthew Brost authored Jul 15, 2024

When wedging a device we shouldn't be suspending device as state for
debug will be lost.

Also this appears to not work as the below stack trace pops upon trying
to resume a wedged device:

[  304.245044] INFO: task cat:12115 blocked for more than 151 seconds.
[  304.251333]       Tainted: G        W          6.10.0-rc7-xe+ #3518
[  304.257617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  304.265459] task:cat             state:D stack:13384 pid:12115 tgid:12115 ppid:3986   flags:0x00000006
[  304.265465] Call Trace:
[  304.265467]  <TASK>
[  304.265469]  __schedule+0x3c4/0xdf0
[  304.265478]  schedule+0x3c/0x140
[  304.265481]  rpm_resume+0x1cc/0x740
[  304.265484]  ? __pfx_autoremove_wake_function+0x10/0x10
[  304.265489]  __pm_runtime_resume+0x49/0x80
[  304.265494]  guc_info+0x6b/0xb0 [xe]
[  304.265538]  ? __pfx___drm_printfn_seq_file+0x10/0x10
[  304.265541]  ? __pfx___drm_puts_seq_file+0x10/0x10
[  304.265545]  seq_read_iter+0x111/0x4c0
[  304.265551]  seq_read+0xfc/0x140
[  304.265556]  full_proxy_read+0x58/0x80
[  304.265560]  vfs_read+0xa7/0x360
[  304.265563]  ? find_held_lock+0x2b/0x80
[  304.265568]  ksys_read+0x64/0xe0
[  304.265571]  do_syscall_64+0x68/0x140
[  304.265575]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  304.265578] RIP: 0033:0x7f4254d14992
[  304.265580] RSP: 002b:00007ffc558666f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  304.265583] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f4254d14992
[  304.265584] RDX: 0000000000020000 RSI: 00007f4254ebb000 RDI: 0000000000000003
[  304.265586] RBP: 00007f4254ebb000 R08: 00007f4254eba010 R09: 00007f4254eba010
[  304.265587] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000
[  304.265588] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[  304.265593]  </TASK>
[  304.265594]
               Showing all locks held in the system:
[  304.265598] 1 lock held by khungtaskd/57:
[  304.265599]  #0: ffffffff8273b860 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x36/0x1c0
[  304.265607] 3 locks held by kworker/6:1/90:
[  304.265610] 1 lock held by in:imklog/547:
[  304.265611]  #0: ffff88810498cd88 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x76/0xc0
[  304.265620] 1 lock held by dmesg/1310:

v2: Drop local 'err' variable (Jonathan)

Fixes: 8ed9aaae ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-2-matthew.brost@intel.com
(cherry picked from commit 452bca0e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe: Wedge the entire device · c9474b72

Matthew Brost authored Jul 15, 2024

Wedge the entire device, not just GT which may have triggered the wedge.
To implement this, cleanup the layering so xe_device_declare_wedged()
calls into the lower layers (GT) to ensure entire device is wedged.

While we are here, also signal any pending GT TLB invalidations upon
wedging device.

Lastly, short circuit reset wait if device is wedged.

v2:
 - Short circuit reset wait if device is wedged (Local testing)

Fixes: 8ed9aaae ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-1-matthew.brost@intel.com
(cherry picked from commit 7dbe8af1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/xe/pf: Limit fair VF LMEM provisioning · bf07ca96

Michal Wajdeczko authored Jul 11, 2024

Due to the current design of the BO and VRAM manager, any object
with XE_BO_FLAG_PINNED flag, which the PF driver uses during VF
LMEM provisioning, is created with the TTM_PL_FLAG_CONTIGUOUS
flag, which may cause VRAM fragmentation that prevents subsequent
allocations of larger objects, like fair VF LMEM provisioning.

To avoid such failures, round the fair VF LMEM provisioning size down
to a power of two, to compensate for what xe_ttm_vram_mgr is
doing to achieve contiguous allocations.

Fixes: ac6598ae ("drm/xe/pf: Add support to configure SR-IOV VFs")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711192320.1198-2-michal.wajdeczko@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 4c3fe5ea)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
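
Rounding a size down to a power of two, as this commit describes, can be sketched in portable C as below. The kernel provides `rounddown_pow_of_two()` in `<linux/log2.h>` for this purpose; whether the patch uses that helper is not stated in the message, and `round_down_pow2()` here is a hypothetical userspace stand-in.

```c
#include <stdint.h>

/*
 * Return the largest power of two that is <= n, or 0 when n is 0.
 * The loop compares against n / 2 so that p never overflows, even for
 * n close to UINT64_MAX.
 */
uint64_t round_down_pow2(uint64_t n)
{
    uint64_t p = 1;

    if (n == 0)
        return 0;

    while (p <= n / 2)
        p *= 2;

    return p;
}
```

For instance, a fair share of 6 GiB would be rounded down to 4 GiB, trading some capacity for an allocation size that a contiguous allocator can satisfy more reliably.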
