    drm/amdgpu: add support for SMU debug option · 6ff7fddb
    Lang Yu authored
    SMU firmware expects the driver to preserve the error context
    and to stop interacting with the SMU once an SMU error has
    occurred. This aids in debugging SMU firmware issues.
    
    Add an SMU debug option to support this request. It can be
    enabled or disabled via the amdgpu_smu_debug debugfs file.
    A 32-bit mask indicates the enabled debug modes.
    Currently, only one mode (HALT_ON_ERROR) is supported.
    When enabled, it brings the hardware to a kind of halt state
    so that nothing can touch it any more in the event of SMU
    errors.
    
    The driver interacts with the SMU by sending messages, and
    there are three ways to send messages to the SMU in the current
    implementation. Handle them respectively as follows:
    
    1, smu_cmn_send_smc_msg_with_param() for normal timeout cases
    
      Halt on any error.
    
    2, smu_cmn_send_msg_without_waiting()/smu_cmn_wait_for_response()
    for longer timeout cases
    
      Halt on errors other than ETIME; otherwise this way won't work.
      Let the user handle the ETIME error in such a case.
    
    3, smu_cmn_send_msg_without_waiting() for no waiting cases
    
      Halt on errors other than ETIME; otherwise the second way won't work.
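
    The three rules above can be summarized as one decision function.
    This is only a standalone sketch, not the driver code: the macro
    DEBUG_HALT_ON_ERROR, the enum send_path and should_halt() are
    hypothetical names used for illustration, and the only facts
    assumed are the ones stated above (bit 0x1 selects HALT_ON_ERROR,
    and ETIME is exempt on the two non-waiting paths).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the 0x1 bit that enables HALT_ON_ERROR. */
#define DEBUG_HALT_ON_ERROR (1u << 0)

/* Hypothetical labels for the three ways of sending a message. */
enum send_path {
    PATH_SEND_WITH_PARAM, /* 1: smu_cmn_send_smc_msg_with_param() */
    PATH_SEND_THEN_WAIT,  /* 2: send without waiting, then wait for response */
    PATH_SEND_NO_WAIT,    /* 3: send without waiting */
};

/*
 * Return true if the hardware should be halted for the given error.
 * err is 0 on success or a negative errno value on failure.
 */
static bool should_halt(uint32_t debug_mask, enum send_path path, int err)
{
    if (!(debug_mask & DEBUG_HALT_ON_ERROR))
        return false;          /* debug mode disabled */
    if (err == 0)
        return false;          /* no error, nothing to do */
    if (path == PATH_SEND_WITH_PARAM)
        return true;           /* way 1: halt on any error */
    return err != -ETIME;      /* ways 2 and 3: ETIME is left to the user */
}
```

    With the mask set to 0x0 the function never asks for a halt, which
    matches disabling the mode through the debugfs file.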
    
    == Command Guide ==
    
    1, enable HALT_ON_ERROR mode
    
     # echo 0x1 > /sys/kernel/debug/dri/0/amdgpu_smu_debug
    
    2, disable HALT_ON_ERROR mode
    
     # echo 0x0 > /sys/kernel/debug/dri/0/amdgpu_smu_debug
    
    v5:
     - Use bit mask to allow more debug features.(Evan)
     - Use WARN() instead of BUG().(Evan)
    
    v4:
     - Set to halt state instead of a simple hang.(Christian)
    
    v3:
     - Use debugfs_create_bool().(Christian)
     - Put variable into smu_context struct.
     - Don't resend command when timeout.
    
    v2:
     - Resend command when timeout.(Lijo)
     - Use debugfs file instead of module parameter.

    Signed-off-by: Lang Yu <lang.yu@amd.com>
    Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>