Commit f524dd54 authored by Tao Zhou, committed by Alex Deucher

drm/amdgpu: skip umc ras irq handling in poison mode (v2)

In ras poison mode, a umc uncorrectable error is ignored until
the corrupted data is consumed by another ras module (such as gfx, sdma).

v2: update the debug message and replace dev_warn with dev_info.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
parent e4348849
@@ -1544,22 +1544,28 @@ static void amdgpu_ras_interrupt_handler(struct ras_manager *obj)
 		data->rptr = (data->aligned_element_size +
 				data->rptr) % data->ring_size;
 
-		/* Let IP handle its data, maybe we need get the output
-		 * from the callback to udpate the error type/count, etc
-		 */
 		if (data->cb) {
-			ret = data->cb(obj->adev, &err_data, &entry);
-			/* ue will trigger an interrupt, and in that case
-			 * we need do a reset to recovery the whole system.
-			 * But leave IP do that recovery, here we just dispatch
-			 * the error.
-			 */
-			if (ret == AMDGPU_RAS_SUCCESS) {
-				/* these counts could be left as 0 if
-				 * some blocks do not count error number
+			if (amdgpu_ras_is_poison_mode_supported(obj->adev) &&
+			    obj->head.block == AMDGPU_RAS_BLOCK__UMC)
+				dev_info(obj->adev->dev,
+						"Poison is created, no user action is needed.\n");
+			else {
+				/* Let IP handle its data, maybe we need get the output
+				 * from the callback to udpate the error type/count, etc
+				 */
+				ret = data->cb(obj->adev, &err_data, &entry);
+				/* ue will trigger an interrupt, and in that case
+				 * we need do a reset to recovery the whole system.
+				 * But leave IP do that recovery, here we just dispatch
+				 * the error.
 				 */
-				obj->err_data.ue_count += err_data.ue_count;
-				obj->err_data.ce_count += err_data.ce_count;
+				if (ret == AMDGPU_RAS_SUCCESS) {
+					/* these counts could be left as 0 if
+					 * some blocks do not count error number
+					 */
+					obj->err_data.ue_count += err_data.ue_count;
+					obj->err_data.ce_count += err_data.ce_count;
+				}
 			}
 		}
 	}
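
The control flow this change introduces can be tried outside the kernel with a small standalone sketch. Everything below is a hypothetical stand-in written for illustration: the `mock_*` types, the `MOCK_*` constants, and the `poison_mode` flag are assumptions, not amdgpu definitions. The flag plays the role of `amdgpu_ras_is_poison_mode_supported()`, and `printf` stands in for `dev_info`. The sketch only aims to show that a UMC event in poison mode is logged and skipped, while every other case still runs the callback and accumulates its counts.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these are the real amdgpu definitions. */
enum mock_block { MOCK_BLOCK_UMC, MOCK_BLOCK_GFX, MOCK_BLOCK_SDMA };

#define MOCK_RAS_SUCCESS 0

struct mock_err_data {
	unsigned long ue_count;	/* uncorrectable errors */
	unsigned long ce_count;	/* correctable errors */
};

struct mock_ras_obj;
typedef int (*mock_cb)(struct mock_ras_obj *obj, struct mock_err_data *err);

struct mock_ras_obj {
	enum mock_block block;	/* which block raised the interrupt */
	bool poison_mode;	/* assumed flag: poison mode is supported */
	mock_cb cb;		/* per-block handler, may be NULL */
	struct mock_err_data err_data;	/* running totals */
};

/* Returns true if the callback ran, false if the event was skipped. */
static bool mock_dispatch(struct mock_ras_obj *obj)
{
	struct mock_err_data err = {0, 0};

	if (!obj->cb)
		return false;

	/* In poison mode a UMC error is only reported, not handled here. */
	if (obj->poison_mode && obj->block == MOCK_BLOCK_UMC) {
		printf("Poison is created, no user action is needed.\n");
		return false;
	}

	/* Otherwise let the block handle its data and add up its counts. */
	if (obj->cb(obj, &err) == MOCK_RAS_SUCCESS) {
		obj->err_data.ue_count += err.ue_count;
		obj->err_data.ce_count += err.ce_count;
	}
	return true;
}

/* Example handler that reports a single uncorrectable error. */
static int mock_count_one_ue(struct mock_ras_obj *obj,
			     struct mock_err_data *err)
{
	(void)obj;
	err->ue_count = 1;
	return MOCK_RAS_SUCCESS;
}
```

With `mock_count_one_ue` as the handler, dispatching on a UMC object with `poison_mode` set leaves its `ue_count` unchanged, whereas a GFX object, or a UMC object with `poison_mode` cleared, ends up with the count incremented.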