    dmaengine: hisilicon: Add multi-thread support for a DMA channel · 2cbb9588
    Jie Hai authored
    When a DMA channel is acquired and used from multiple threads, the
    kernel oopses and the system hangs.
    
    % echo 100 > /sys/module/dmatest/parameters/threads_per_chan
    % echo 100 > /sys/module/dmatest/parameters/iterations
    % echo 1 > /sys/module/dmatest/parameters/run
    [383493.327077] Unable to handle kernel paging request at virtual
    		address dead000000000108
    [383493.335103] Mem abort info:
    [383493.335103]   ESR = 0x96000044
    [383493.335105]   EC = 0x25: DABT (current EL), IL = 32 bits
    [383493.335107]   SET = 0, FnV = 0
    [383493.335108]   EA = 0, S1PTW = 0
    [383493.335109]   FSC = 0x04: level 0 translation fault
    [383493.335110] Data abort info:
    [383493.335111]   ISV = 0, ISS = 0x00000044
    [383493.364739]   CM = 0, WnR = 1
    [383493.367793] [dead000000000108] address between user and kernel
    		address ranges
    [383493.375021] Internal error: Oops: 96000044 [#1] PREEMPT SMP
    [383493.437574] CPU: 63 PID: 27895 Comm: dma0chan0-copy2 Kdump:
    		loaded Tainted: GO 5.17.0-rc4+ #2
    [383493.457851] pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT
    		-SSBS BTYPE=--)
    [383493.465331] pc : vchan_tx_submit+0x64/0xa0
    [383493.469957] lr : vchan_tx_submit+0x34/0xa0
    
    This occurs because the transfer timed out, which in turn is caused
    by a data race. Each thread rewrites the channel's descriptor as soon
    as device_issue_pending is called. As a result, the interrupt handler
    believes it is handling the right descriptor, while the channel's
    descriptor has already been replaced by another thread. The
    descriptor that actually raised the interrupt is never handled, and
    neither is its tx->callback. That is why the timeout is reported.
    
    With this fix, the channel's descriptor changes its value only
    after it has been used. A new descriptor is taken from the
    vc->desc_issued queue, which is already filled with descriptors
    that are ready to be sent. Threads have no direct access to the DMA
    channel descriptor. If the channel's descriptor is busy, submission
    to the HW is retried when a descriptor completes. In that case,
    vc->desc_issued may be empty when hisi_dma_start_transfer is called,
    so the error report for an empty queue is removed. A thread can now
    only queue a descriptor for further processing.
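
    The queueing scheme described above can be sketched as a small
    user-space model. This is only an illustration under stated
    assumptions, not the driver code: every sketch_* name is
    hypothetical, the array stands in for vc->desc_issued, and the
    locking the real driver needs around these paths is left out.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_QUEUE_CAP 8

/* Hypothetical stand-in for a DMA descriptor. */
struct sketch_desc {
	int id;
	int done;	/* set once the simulated interrupt completes it */
};

/* Hypothetical stand-in for the channel state. */
struct sketch_chan {
	struct sketch_desc *issued[SKETCH_QUEUE_CAP];	/* models vc->desc_issued */
	size_t head;
	size_t count;
	struct sketch_desc *desc;	/* descriptor currently owned by the HW */
};

/* Threads only queue descriptors; they never touch c->desc directly. */
static int sketch_issue(struct sketch_chan *c, struct sketch_desc *d)
{
	if (c->count == SKETCH_QUEUE_CAP)
		return -1;
	c->issued[(c->head + c->count) % SKETCH_QUEUE_CAP] = d;
	c->count++;
	return 0;
}

/* Hand the next queued descriptor to the HW, if the HW is idle. */
static void sketch_start_transfer(struct sketch_chan *c)
{
	if (c->desc)		/* busy: retried from the completion path */
		return;
	if (c->count == 0)	/* an empty queue is not an error */
		return;
	c->desc = c->issued[c->head];
	assert(c->desc != NULL);
	c->head = (c->head + 1) % SKETCH_QUEUE_CAP;
	c->count--;
}

/* Models device_issue_pending: never overwrites a busy descriptor. */
static void sketch_issue_pending(struct sketch_chan *c)
{
	sketch_start_transfer(c);
}

/* Models the interrupt handler: complete, release, then resubmit. */
static void sketch_irq(struct sketch_chan *c)
{
	struct sketch_desc *d = c->desc;

	if (!d)
		return;
	d->done = 1;
	c->desc = NULL;
	sketch_start_transfer(c);
}
```

    In this model a second sketch_issue_pending call made while a
    descriptor is in flight leaves c->desc untouched, so the completion
    path always finishes the descriptor that actually ran before
    picking up the next one.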
    
    Fixes: e9f08b65 ("dmaengine: hisilicon: Add Kunpeng DMA engine support")
    Signed-off-by: Jie Hai <haijie1@huawei.com>
    Acked-by: Zhou Wang <wangzhou1@hisilicon.com>
    Link: https://lore.kernel.org/r/20220830062251.52993-4-haijie1@huawei.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>