• Peter Wang's avatar
    scsi: ufs: core: Fix ufshcd_clear_cmd racing issue · 9307a998
    Peter Wang authored
    When ufshcd_clear_cmd is racing with the completion ISR, the completed tag
    of the request's mq_hctx pointer will be set to NULL by the ISR.  And
    ufshcd_clear_cmd's call to ufshcd_mcq_req_to_hwq will get NULL pointer KE.
    Return success when the request is completed by ISR because sq does not
    need cleanup.
    
    The racing flow is:
    
    Thread A
    ufshcd_err_handler					step 1
    	ufshcd_try_to_abort_task
    		ufshcd_cmd_inflight(true)		step 3
    		ufshcd_clear_cmd
    			...
    			ufshcd_mcq_req_to_hwq
    			blk_mq_unique_tag
    				rq->mq_hctx->queue_num	step 5
    
    Thread B
    ufs_mtk_mcq_intr(cq complete ISR)			step 2
    	scsi_done
    		...
    		__blk_mq_free_request
    			rq->mq_hctx = NULL;		step 4
    
    Below is KE back trace:
    
      ufshcd_try_to_abort_task: cmd pending in the device. tag = 6
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000194
       pc : [0xffffffd589679bf8] blk_mq_unique_tag+0x8/0x14
       lr : [0xffffffd5862f95b4] ufshcd_mcq_sq_cleanup+0x6c/0x1cc [ufs_mediatek_mod_ise]
       Workqueue: ufs_eh_wq_0 ufshcd_err_handler [ufs_mediatek_mod_ise]
       Call trace:
        dump_backtrace+0xf8/0x148
        show_stack+0x18/0x24
        dump_stack_lvl+0x60/0x7c
        dump_stack+0x18/0x3c
        mrdump_common_die+0x24c/0x398 [mrdump]
        ipanic_die+0x20/0x34 [mrdump]
        notify_die+0x80/0xd8
        die+0x94/0x2b8
        __do_kernel_fault+0x264/0x298
        do_page_fault+0xa4/0x4b8
        do_translation_fault+0x38/0x54
        do_mem_abort+0x58/0x118
        el1_abort+0x3c/0x5c
        el1h_64_sync_handler+0x54/0x90
        el1h_64_sync+0x68/0x6c
        blk_mq_unique_tag+0x8/0x14
        ufshcd_clear_cmd+0x34/0x118 [ufs_mediatek_mod_ise]
        ufshcd_try_to_abort_task+0x2c8/0x5b4 [ufs_mediatek_mod_ise]
        ufshcd_err_handler+0xa7c/0xfa8 [ufs_mediatek_mod_ise]
        process_one_work+0x208/0x4fc
        worker_thread+0x228/0x438
        kthread+0x104/0x1d4
        ret_from_fork+0x10/0x20
    
    Fixes: 8d729034 ("scsi: ufs: mcq: Add supporting functions for MCQ abort")
    Suggested-by: default avatarBart Van Assche <bvanassche@acm.org>
    Signed-off-by: default avatarPeter Wang <peter.wang@mediatek.com>
    Link: https://lore.kernel.org/r/20240628070030.30929-2-peter.wang@mediatek.comReviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
    Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
    9307a998
ufs-mcq.c 18.6 KB