• Fenghua Yu's avatar
    iommu/vt-d: Fix a deadlock in intel_svm_drain_prq() · 6ef05051
    Fenghua Yu authored
    pasid_mutex and dev->iommu->param->lock are held while unbinding mm is
    flushing IO page fault workqueue and waiting for all page fault works to
    finish. But an in-flight page fault work also need to hold the two locks
    while unbinding mm are holding them and waiting for the work to finish.
    This may cause an ABBA deadlock issue as shown below:
    
    	idxd 0000:00:0a.0: unbind PASID 2
    	======================================================
    	WARNING: possible circular locking dependency detected
    	5.14.0-rc7+ #549 Not tainted [  186.615245] ----------
    	dsa_test/898 is trying to acquire lock:
    	ffff888100d854e8 (&param->lock){+.+.}-{3:3}, at:
    	iopf_queue_flush_dev+0x29/0x60
    	but task is already holding lock:
    	ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at:
    	intel_svm_unbind+0x34/0x1e0
    	which lock already depends on the new lock.
    
    	the existing dependency chain (in reverse order) is:
    
    	-> #2 (pasid_mutex){+.+.}-{3:3}:
    	       __mutex_lock+0x75/0x730
    	       mutex_lock_nested+0x1b/0x20
    	       intel_svm_page_response+0x8e/0x260
    	       iommu_page_response+0x122/0x200
    	       iopf_handle_group+0x1c2/0x240
    	       process_one_work+0x2a5/0x5a0
    	       worker_thread+0x55/0x400
    	       kthread+0x13b/0x160
    	       ret_from_fork+0x22/0x30
    
    	-> #1 (&param->fault_param->lock){+.+.}-{3:3}:
    	       __mutex_lock+0x75/0x730
    	       mutex_lock_nested+0x1b/0x20
    	       iommu_report_device_fault+0xc2/0x170
    	       prq_event_thread+0x28a/0x580
    	       irq_thread_fn+0x28/0x60
    	       irq_thread+0xcf/0x180
    	       kthread+0x13b/0x160
    	       ret_from_fork+0x22/0x30
    
    	-> #0 (&param->lock){+.+.}-{3:3}:
    	       __lock_acquire+0x1134/0x1d60
    	       lock_acquire+0xc6/0x2e0
    	       __mutex_lock+0x75/0x730
    	       mutex_lock_nested+0x1b/0x20
    	       iopf_queue_flush_dev+0x29/0x60
    	       intel_svm_drain_prq+0x127/0x210
    	       intel_svm_unbind+0xc5/0x1e0
    	       iommu_sva_unbind_device+0x62/0x80
    	       idxd_cdev_release+0x15a/0x200 [idxd]
    	       __fput+0x9c/0x250
    	       ____fput+0xe/0x10
    	       task_work_run+0x64/0xa0
    	       exit_to_user_mode_prepare+0x227/0x230
    	       syscall_exit_to_user_mode+0x2c/0x60
    	       do_syscall_64+0x48/0x90
    	       entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    	other info that might help us debug this:
    
    	Chain exists of:
    	  &param->lock --> &param->fault_param->lock --> pasid_mutex
    
    	 Possible unsafe locking scenario:
    
    	       CPU0                    CPU1
    	       ----                    ----
    	  lock(pasid_mutex);
    				       lock(&param->fault_param->lock);
    				       lock(pasid_mutex);
    	  lock(&param->lock);
    
    	 *** DEADLOCK ***
    
    	2 locks held by dsa_test/898:
    	 #0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at:
    	 iommu_sva_unbind_device+0x53/0x80
    	 #1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at:
    	 intel_svm_unbind+0x34/0x1e0
    
    	stack backtrace:
    	CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549
    	Hardware name: Intel Corporation Kabylake Client platform/KBL S
    	DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016
    	Call Trace:
    	 dump_stack_lvl+0x5b/0x74
    	 dump_stack+0x10/0x12
    	 print_circular_bug.cold+0x13d/0x142
    	 check_noncircular+0xf1/0x110
    	 __lock_acquire+0x1134/0x1d60
    	 lock_acquire+0xc6/0x2e0
    	 ? iopf_queue_flush_dev+0x29/0x60
    	 ? pci_mmcfg_read+0xde/0x240
    	 __mutex_lock+0x75/0x730
    	 ? iopf_queue_flush_dev+0x29/0x60
    	 ? pci_mmcfg_read+0xfd/0x240
    	 ? iopf_queue_flush_dev+0x29/0x60
    	 mutex_lock_nested+0x1b/0x20
    	 iopf_queue_flush_dev+0x29/0x60
    	 intel_svm_drain_prq+0x127/0x210
    	 ? intel_pasid_tear_down_entry+0x22e/0x240
    	 intel_svm_unbind+0xc5/0x1e0
    	 iommu_sva_unbind_device+0x62/0x80
    	 idxd_cdev_release+0x15a/0x200
    
    pasid_mutex protects pasid and svm data mapping data. It's unnecessary
    to hold pasid_mutex while flushing the workqueue. To fix the deadlock
    issue, unlock pasid_pasid during flushing the workqueue to allow the works
    to be handled.
    
    Fixes: d5b9e4bf ("iommu/vt-d: Report prq to io-pgfault framework")
    Reported-and-tested-by: default avatarDave Jiang <dave.jiang@intel.com>
    Signed-off-by: default avatarFenghua Yu <fenghua.yu@intel.com>
    Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.comSigned-off-by: default avatarLu Baolu <baolu.lu@linux.intel.com>
    Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com
    [joro: Removed timing information from kernel log messages]
    Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
    6ef05051
svm.c 30.3 KB