drm/amdgpu: Fix silent amdgpu_bo_move failures
Under memory pressure, buffer moves between RAM to VRAM can fail when there is no GTT space available. In those cases amdgpu_bo_move falls back to ttm_bo_move_memcpy, which seems to succeed, although it doesn't really support non-contiguous or invisible VRAM. This manifests as VM faults with corrupted page table entries in KFD eviction stress tests. Print some helpful messages when lack of GTT space is causing buffer moves to fail. Check that source and destination memory regions are supported by ttm_bo_move_memcpy before taking that fallback. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Showing
Please register or sign in to comment