Commit eff8cbf0 authored by Philip Yang, committed by Alex Deucher

drm/amdkfd: AIP mGPUs best prefetch location for xnack on

With xnack on, if the range is ACCESS or ACCESS_IN_PLACE (AIP) by a single
GPU, or is ACCESS_IN_PLACE by multiple GPUs (mGPUs) that are all connected
over XGMI in the same hive, the best prefetch location is the prefetch_loc
GPU. Otherwise, the best prefetch location is always the CPU, because a GPU
does not have a coherent mapping of other GPUs' VRAM, even over a large-BAR
PCIe connection.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
parent f5bd5239
@@ -2675,22 +2675,26 @@ svm_range_add(struct kfd_process *p, uint64_t start, uint64_t size,
 	return 0;
 }
-/* svm_range_best_prefetch_location - decide the best prefetch location
+/**
+ * svm_range_best_prefetch_location - decide the best prefetch location
  * @prange: svm range structure
  *
  * For xnack off:
- * If range map to single GPU, the best acutal location is prefetch loc, which
+ * If range map to single GPU, the best prefetch location is prefetch_loc, which
  * can be CPU or GPU.
  *
- * If range map to multiple GPUs, only if mGPU connection on xgmi same hive,
- * the best actual location could be prefetch_loc GPU. If mGPU connection on
- * PCIe, the best actual location is always CPU, because GPU cannot access vram
- * of other GPUs, assuming PCIe small bar (large bar support is not upstream).
+ * If range is ACCESS or ACCESS_IN_PLACE by mGPUs, only if mGPU connection on
+ * XGMI same hive, the best prefetch location is prefetch_loc GPU, othervise
+ * the best prefetch location is always CPU, because GPU can not have coherent
+ * mapping VRAM of other GPUs even with large-BAR PCIe connection.
  *
  * For xnack on:
- * The best actual location is prefetch location. If mGPU connection on xgmi
- * same hive, range map to multiple GPUs. Otherwise, the range only map to
- * actual location GPU. Other GPU access vm fault will trigger migration.
+ * If range is not ACCESS_IN_PLACE by mGPUs, the best prefetch location is
+ * prefetch_loc, other GPU access will generate vm fault and trigger migration.
+ *
+ * If range is ACCESS_IN_PLACE by mGPUs, only if mGPU connection on XGMI same
+ * hive, the best prefetch location is prefetch_loc GPU, otherwise the best
+ * prefetch location is always CPU.
  *
  * Context: Process context
  *
@@ -2710,11 +2714,6 @@ svm_range_best_prefetch_location(struct svm_range *prange)
 	p = container_of(prange->svms, struct kfd_process, svms);
-	/* xnack on */
-	if (p->xnack_enabled)
-		goto out;
-	/* xnack off */
 	if (!best_loc || best_loc == KFD_IOCTL_SVM_LOCATION_UNDEFINED)
 		goto out;
@@ -2724,8 +2723,12 @@ svm_range_best_prefetch_location(struct svm_range *prange)
 		best_loc = 0;
 		goto out;
 	}
-	bitmap_or(bitmap, prange->bitmap_access, prange->bitmap_aip,
-		  MAX_GPU_INSTANCE);
+	if (p->xnack_enabled)
+		bitmap_copy(bitmap, prange->bitmap_aip, MAX_GPU_INSTANCE);
+	else
+		bitmap_or(bitmap, prange->bitmap_access, prange->bitmap_aip,
+			  MAX_GPU_INSTANCE);
 	for_each_set_bit(gpuidx, bitmap, MAX_GPU_INSTANCE) {
 		pdd = kfd_process_device_from_gpuidx(p, gpuidx);
......
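
The rule this commit describes can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: `struct range_model`, `hive_of`, `same_hive()`, `best_prefetch_location()`, and the encoding of GPU index i as location i + 1 are all hypothetical stand-ins for the kernel's bitmaps, device lookup, and XGMI hive check. The handling of an undefined prefetch location and of a failed device lookup is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_GPUS 4u
#define LOC_CPU  0u  /* assumption: 0 means system memory, like best_loc = 0 in the diff */

/* Hypothetical, simplified model of one SVM range. */
struct range_model {
	uint32_t prefetch_loc; /* LOC_CPU, or GPU index + 1 (illustrative encoding) */
	uint32_t access_mask;  /* bit i set: GPU i has ACCESS */
	uint32_t aip_mask;     /* bit i set: GPU i has ACCESS_IN_PLACE */
};

/* Hypothetical topology: hive id per GPU index, 0 = not part of any XGMI hive. */
static const uint64_t hive_of[NUM_GPUS] = { 1, 1, 0, 2 };

static bool same_hive(unsigned int a, unsigned int b)
{
	return hive_of[a] != 0 && hive_of[a] == hive_of[b];
}

static uint32_t best_prefetch_location(const struct range_model *r,
				       bool xnack_enabled)
{
	uint32_t mask;
	unsigned int target, i;

	assert(r->prefetch_loc <= NUM_GPUS);
	if (r->prefetch_loc == LOC_CPU)
		return LOC_CPU;
	target = r->prefetch_loc - 1;

	/*
	 * xnack on: only ACCESS_IN_PLACE GPUs constrain the choice, because
	 * other GPUs can fault and trigger migration.
	 * xnack off: every GPU with ACCESS or ACCESS_IN_PLACE counts.
	 */
	mask = xnack_enabled ? r->aip_mask
			     : (r->access_mask | r->aip_mask);

	for (i = 0; i < NUM_GPUS; i++) {
		if (!(mask & (1u << i)) || i == target)
			continue;
		/* A peer outside the target's hive forces a fallback to CPU. */
		if (!same_hive(i, target))
			return LOC_CPU;
	}
	return r->prefetch_loc;
}
```

In this model, a range prefetched to GPU 0 that GPU 2 may only ACCESS keeps GPU 0 as its best location with xnack on, but falls back to the CPU with xnack off. A range that is ACCESS_IN_PLACE on GPUs 0 and 1 (same hive) stays on GPU 0, while one that is ACCESS_IN_PLACE on GPUs 0 and 3 (different hives) falls back to the CPU even with xnack on.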