- 09 Feb, 2022 1 commit
-
-
Mukul Joshi authored
Cleanup the kfd code by removing the unused old debugger implementation. The address watch was only ever implemented in the upstream driver for GFXv7 (Kaveri). The user mode tools runtime using this API was never open-sourced. Work on the old debugger prototype that used this API has been discontinued years ago. Only a small piece of resetting wavefronts is kept and is moved to kfd_device_queue_manager.c. Signed-off-by:
Mukul Joshi <mukul.joshi@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 07 Feb, 2022 4 commits
-
-
Rajneesh Bhardwaj authored
Currently the SVM ranges use actual_gpu_id but with Checkpoint Restore support its possible that the SVM ranges can be resumed on another node where the actual_gpu_id may not be same as the original (user_gpu_id) gpu id. So modify svm code to use user_gpu_id. Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
David Yat Sin authored
When doing a restore on a different node, the gpu_id's on the restore node may be different. But the user space application will still refer use the original gpu_id's in the ioctl calls. Adding code to create a gpu id mapping so that kfd can determine actual gpu_id during the user ioctl's. Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
David Yat Sin <david.yatsin@amd.com> Signed-off-by:
Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
David Yat Sin authored
Introducing UNPAUSE op. After CRIU amdgpu plugin performs a PROCESS_INFO op the queues will be stay in an evicted state. Once the plugin is done draining BO contents, it is safe to perform an UNPAUSE op for the queues to resume. Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
David Yat Sin <david.yatsin@amd.com> Signed-off-by:
Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Rajneesh Bhardwaj authored
This adds support to create userptr BOs on restore and introduces a new ioctl op to restart memory notifiers for the restored userptr BOs. When doing CRIU restore MMU notifications can happen anytime after we call amdgpu_mn_register. Prevent MMU notifications until we reach stage-4 of the restore process i.e. criu_resume ioctl op is received, and the process is ready to be resumed. This ioctl is different from other KFD CRIU ioctls since its called by CRIU master restore process for all the target processes being resumed by CRIU. Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
David Yat Sin <david.yatsin@amd.com> Signed-off-by:
Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 27 Jan, 2022 1 commit
-
-
Philip Yang authored
kfd_process_notifier_release flush svm_range_restore_work which calls svm_range_list_lock_and_flush_work to flush deferred_list work, but if deferred_list work mmput release the last user, it will call exit_mmap -> notifier_release, it is deadlock with below backtrace. Move flush svm_range_restore_work to kfd_process_wq_release to avoid deadlock. Then svm_range_restore_work take task->mm ref to avoid mm is gone while validating and mapping ranges to GPU. Workqueue: events svm_range_deferred_list_work [amdgpu] Call Trace: wait_for_completion+0x94/0x100 __flush_work+0x12a/0x1e0 __cancel_work_timer+0x10e/0x190 cancel_delayed_work_sync+0x13/0x20 kfd_process_notifier_release+0x98/0x2a0 [amdgpu] __mmu_notifier_release+0x74/0x1f0 exit_mmap+0x170/0x200 mmput+0x5d/0x130 svm_range_deferred_list_work+0x104/0x230 [amdgpu] process_one_work+0x220/0x3c0 Signed-off-by:
Philip Yang <Philip.Yang@amd.com> Reported-by:
Ruili Ji <ruili.ji@amd.com> Tested-by:
Ruili Ji <ruili.ji@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 11 Jan, 2022 1 commit
-
-
Greg Kroah-Hartman authored
There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the amdkfd sysfs code to use default_groups field which has been the preferred way since aa30f47c ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 13 Dec, 2021 1 commit
-
-
Isabella Basso authored
This fixes various warnings relating to erroneous docstring syntax, of which some are listed below: warning: Function parameter or member 'adev' not described in 'amdgpu_atomfirmware_ras_rom_addr' ... warning: expecting prototype for amdgpu_atpx_validate_functions(). Prototype was for amdgpu_atpx_validate() instead ... warning: Excess function parameter 'mem' description in 'amdgpu_preempt_mgr_new' ... warning: Cannot understand * @kfd_get_cu_occupancy - Collect number of waves in-flight on this device ... warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Signed-off-by:
Isabella Basso <isabbasso@riseup.net> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 01 Dec, 2021 2 commits
-
-
Christophe JAILLET authored
The 'doorbell_bitmap' bitmap has just been allocated. So we can use the non-atomic '__set_bit()' function to save a few cycles as no concurrent access can happen. Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Christophe JAILLET authored
'doorbell_bitmap' and 'queue_slot_bitmap' are bitmaps. So use 'bitmap_zalloc()' to simplify code, improve the semantic and avoid some open-coded arithmetic in allocator arguments. Also change the corresponding 'kfree()' into 'bitmap_free()' to keep consistency. Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 17 Nov, 2021 6 commits
-
-
Graham Sider authored
Switch to IP version checking instead of asic_type on various KFD version checks. Signed-off-by:
Graham Sider <Graham.Sider@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Graham Sider authored
Defined as GC HWIP >= IP_VERSION(9, 0, 1). Also defines KFD_GC_VERSION to return GC HWIP version. Signed-off-by:
Graham Sider <Graham.Sider@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Graham Sider authored
Remove get_amdgpu_device and other remaining kgd_dev references aside from declaration/kfd struct entry and initialization. Signed-off-by:
Graham Sider <Graham.Sider@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Graham Sider authored
Modified definitions: - amdgpu_amdkfd_gpuvm_acquire_process_vm - amdgpu_amdkfd_gpuvm_release_process_vm - amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu - amdgpu_amdkfd_gpuvm_free_memory_of_gpu - amdgpu_amdkfd_gpuvm_map_memory_to_gpu - amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu - amdgpu_amdkfd_gpuvm_sync_memory - amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel - amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel - amdgpu_amdkfd_gpuvm_get_vm_fault_info - amdgpu_amdkfd_gpuvm_import_dmabuf - amdgpu_amdkfd_get_tile_config Removed: - get_amdgpu_device Signed-off-by:
Graham Sider <Graham.Sider@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Graham Sider authored
Modified definitions: - amdgpu_amdkfd_submit_ib - amdgpu_amdkfd_set_compute_idle - amdgpu_amdkfd_have_atomics_support - amdgpu_amdkfd_flush_gpu_tlb_pasid - amdgpu_amdkfd_flush_gpu_tlb_pasid - amdgpu_amdkfd_gpu_reset - amdgpu_amdkfd_alloc_gtt_mem - amdgpu_amdkfd_free_gtt_mem - amdgpu_amdkfd_alloc_gws - amdgpu_amdkfd_free_gws - amdgpu_amdkfd_ras_poison_consumption_handler Signed-off-by:
Graham Sider <Graham.Sider@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Graham Sider authored
Modified definitions: - program_sh_mem_settings - set_pasid_vmid_mapping - init_interrupts - address_watch_disable - address_watch_execute - wave_control_execute - address_watch_get_offset - get_atc_vmid_pasid_mapping_info - set_scratch_backing_va - set_vm_context_page_table_base - read_vmid_from_vmfault_reg - get_cu_occupancy - program_trap_handler_settings Signed-off-by:
Graham Sider <Graham.Sider@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 05 Nov, 2021 1 commit
-
-
shaoyunl authored
When kfd need to be reset, sent command to HWS might cause hang and get unnecessary timeout. This change try not to touch HW in pre_reset and keep queues to be in the evicted state when the reset is done, so they are not put back on the runlist. These queues will be destroied on process termination. Signed-off-by:
shaoyunl <shaoyun.liu@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 28 Oct, 2021 1 commit
-
-
Lang Yu authored
Currently, all kfd BOs use same destruction routine. But pinned BOs are not unpinned properly. Separate them from general routine. v2 (Felix): Add safeguard to prevent user space from freeing signal BO. Kunmap signal BO in the event of setting event page error. Just kunmap signal BO to avoid duplicating the code. Signed-off-by:
Lang Yu <lang.yu@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 02 Aug, 2021 1 commit
-
-
Eric Huang authored
This reverts commit 7ed9876c . Revert reason: The issue has been resolved. Signed-off-by:
Eric Huang <jinhuieric.huang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 28 Jul, 2021 1 commit
-
-
Eric Huang authored
This reverts commit 7ed9876c . Revert reason: The issue has been resolved. Signed-off-by:
Eric Huang <jinhuieric.huang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 13 Jul, 2021 2 commits
-
-
Eric Huang authored
This reverts commit 31f33243 . Reason for revert: it causes regressions on several Asics. Signed-off-by:
Eric Huang <jinhuieric.huang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Eric Huang authored
This reverts commit 31f33243 . Reason for revert: it causes regressions on several Asics. Signed-off-by:
Eric Huang <jinhuieric.huang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 30 Jun, 2021 3 commits
-
-
Philip Yang authored
This is part of SVM profiling API, export sysfs counters for per-process, per-GPU vm retry fault, pages migrated in and out of GPU vram. counters will not be updated in parallel in GPU retry fault handler and migration to vram/ram path, use READ_ONCE to avoid compiler optimization. Signed-off-by:
Philip Yang <Philip.Yang@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Philip Yang authored
3 cases of kobj leak, which causes memory leak: kobj_type must have release() method to free memory from release callback. Don't need NULL default_attrs to init kobj. sysfs files created under kobj_status should be removed with kobj_status as parent kobject. Remove queue sysfs files when releasing queue from process MMU notifier release callback. Signed-off-by:
Philip Yang <Philip.Yang@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Philip Yang authored
No functionality change. Modify kfd_sysfs_create_file to use kobject as parameter, so it becomes common helper function to remove duplicate code and will simplify new kfd sysfs file create in future. Move pr_warn to helper function if sysfs file create failed. Set helper function as void return because caller doesn't use the helper function return value. Signed-off-by:
Philip Yang <Philip.Yang@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 15 Jun, 2021 1 commit
-
-
Felix Kuehling authored
When some GPUs don't support SVM, don't disabe it for the entire process. That would be inconsistent with the information the process got from the topology, which indicates SVM support per GPU. Instead disable SVM support only for the unsupported GPUs. This is done by checking any per-device attributes against the bitmap of supported GPUs. Also use the supported GPU bitmap to initialize access bitmaps for new SVM address ranges. Don't handle recoverable page faults from unsupported GPUs. (I don't think there will be unsupported GPUs that can generate recoverable page faults. But better safe than sorry.) Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Philip Yang <philip.yang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 07 Jun, 2021 1 commit
-
-
Wan Jiabing authored
kfd_svm.h is included duplicately in commit 42de677f ("drm/amdkfd: register svm range"). After checking possible related header files, remove the former one to make the code format more reasonable. Signed-off-by:
Wan Jiabing <wanjiabing@vivo.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 04 Jun, 2021 2 commits
-
-
Eric Huang authored
It is to optimize memory mapping latency, and also aviod a page fault in a corner case of changing valid PDE into PTE. Signed-off-by:
Eric Huang <jinhuieric.huang@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Eric Huang authored
It is to provide more tlb flush types option for different case scenario. Signed-off-by:
Eric Huang <jinhuieric.huang@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 21 May, 2021 1 commit
-
-
Guenter Roeck authored
The first parameter passed to container_of() is the pointer to the work structure passed to the worker and never NULL. The NULL check on the result of container_of() is therefore unnecessary and misleading. Remove it. This change was made automatically with the following Coccinelle script. @@ type t; identifier v; statement s; @@ <+... ( t v = container_of(...); | v = container_of(...); ) ... when != v - if (\( !v \| v == NULL \) ) s ...+> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 20 May, 2021 1 commit
-
-
Philip Yang authored
Need do a heavy-weight TLB flush to make sure we have no more dirty data in the cache for the unmapped pages. Define enum TLB_FLUSH_TYPE, add flush_type parameter to amdgpu_amdkfd_flush_gpu_tlb_pasid. Signed-off-by:
Philip Yang <Philip.Yang@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 29 Apr, 2021 1 commit
-
-
Fabio M. De Francesco authored
Fixed a kernel-doc error in the documentation of a function. Signed-off-by:
Fabio M. De Francesco <fmdefrancesco@gmail.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 21 Apr, 2021 8 commits
-
-
Felix Kuehling authored
With xnack on, GPU vm fault handler decide the best restore location, then migrate range to the best restore location and update GPU mapping to recover the GPU vm fault. Signed-off-by:
Philip Yang <Philip.Yang@amd.com> Signed-off-by:
Alex Sierra <alex.sierra@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Alex Sierra authored
XNACK mode controls the SQ RETRY_DISABLE setting that determines, whether recoverable page faults can be supported on GFXv9 hardware. Only on Aldebaran we can support different processes running with different XNACK modes. On older chips all processes must use the same RETRY_DISABLE setting. However, processes not relying on recoverable page faults can work with RETRY enabled. This means XNACK off is always available as a fallback so we can use the same mode on all GPUs in a process. Signed-off-by:
Alex Sierra <alex.sierra@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
HMM interval notifier callback notify CPU page table will be updated, stop process queues if the updated address belongs to svm range registered in process svms objects tree. Scheduled restore work to update GPU page table using new pages address in the updated svm range. The restore worker flushes any deferred work to make sure it restores an up-to-date svm_range_list. Signed-off-by:
Philip Yang <Philip.Yang@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Philip Yang authored
svm range structure stores the range start address, size, attributes, flags, prefetch location and gpu bitmap which indicates which GPU this range maps to. Same virtual address is shared by CPU and GPUs. Process has svm range list which uses both interval tree and list to store all svm ranges registered by the process. Interval tree is used by GPU vm fault handler and CPU page fault handler to get svm range structure from the specific address. List is used to scan all ranges in eviction restore work. No overlap range interval [start, last] exist in svms object interval tree. If process registers new range which has overlap with old range, the old range split into 2 ranges depending on the overlap happens at head or tail part of old range. Apply attributes preferred location, prefetch location, mapping flags, migration granularity to svm range, store mapping gpu index into bitmap. Signed-off-by:
Philip Yang <Philip.Yang@amd.com> Signed-off-by:
Alex Sierra <alex.sierra@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Philip Yang authored
Add svm (shared virtual memory) ioctl data structure and API definition. The svm ioctl API is designed to be extensible in the future. All operations are provided by a single IOCTL to preserve ioctl number space. The arguments structure ends with a variable size array of attributes that can be used to set or get one or multiple attributes. Signed-off-by:
Philip Yang <Philip.Yang@amd.com> Signed-off-by:
Alex Sierra <alex.sierra@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Alex Sierra authored
svm range uses gpu bitmap to store which GPU svm range maps to. Application pass driver gpu id to specify GPU, the helper is needed to convert gpu id to gpu bitmap idx. Access through kfd_process_device pointers array from kfd_process. Signed-off-by:
Alex Sierra <alex.sierra@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
DRM render node file handles are used for CPU mapping of BOs using mmap by the Thunk. It uses the DRM render node of the GPU where the BO was allocated. DRM allows mmap access automatically when it creates a GEM handle for a BO. KFD BOs don't have GEM handles, so KFD needs to manage access manually. Use drm_vma_node_allow to allow user mode to mmap BOs allocated with kfd_ioctl_alloc_memory_of_gpu through the DRM render node that was used in the kfd_ioctl_acquire_vm call for the same GPU. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Philip Yang <philip.yang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu needs the drm_priv to allow mmap to access the BO through the corresponding file descriptor. The VM can also be extracted from drm_priv, so drm_priv can replace the vm parameter in the kfd2kgd interface. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Philip Yang <philip.yang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-