    drm/amdkfd: Run restore_workers on freezable WQs · 9a1c1339
    Felix Kuehling authored
    Make restore workers freezable so we don't have to explicitly flush them
    in suspend and GPU reset code paths, and we don't accidentally try to
    restore BOs while the GPU is suspended. Not having to flush restore_work
    also helps avoid lock/fence dependencies in the GPU reset case where we're
    not allowed to wait for fences.
    
    A side effect of this is that we can now have multiple concurrent threads
    trying to signal the same eviction fence. Rework eviction fence signaling
    and replacement to account for that.
    
    The GPU reset path can no longer rely on restore_process_worker to resume
    queues because evict/restore workers can run independently of it. Instead
    call a new restore_process_helper directly.
    
    This is an RFC and request for testing.
    
    v2:
    - Reworked eviction fence signaling
    - Introduced restore_process_helper
    
    v3:
    - Handle unsignaled eviction fences in restore_process_bos
    
    Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
    Acked-by: Christian König <christian.koenig@amd.com>
    Tested-by: Emily Deng <Emily.Deng@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
kfd_svm.c 114 KB