Commit 2cbb9588 authored by Jie Hai's avatar Jie Hai Committed by Vinod Koul

dmaengine: hisilicon: Add multi-thread support for a DMA channel

When we get a DMA channel and try to use it in multiple threads it
will cause oops and hanging the system.

% echo 100 > /sys/module/dmatest/parameters/threads_per_chan
% echo 100 > /sys/module/dmatest/parameters/iterations
% echo 1 > /sys/module/dmatest/parameters/run
[383493.327077] Unable to handle kernel paging request at virtual
		address dead000000000108
[383493.335103] Mem abort info:
[383493.335103]   ESR = 0x96000044
[383493.335105]   EC = 0x25: DABT (current EL), IL = 32 bits
[383493.335107]   SET = 0, FnV = 0
[383493.335108]   EA = 0, S1PTW = 0
[383493.335109]   FSC = 0x04: level 0 translation fault
[383493.335110] Data abort info:
[383493.335111]   ISV = 0, ISS = 0x00000044
[383493.364739]   CM = 0, WnR = 1
[383493.367793] [dead000000000108] address between user and kernel
		address ranges
[383493.375021] Internal error: Oops: 96000044 [#1] PREEMPT SMP
[383493.437574] CPU: 63 PID: 27895 Comm: dma0chan0-copy2 Kdump:
		loaded Tainted: GO 5.17.0-rc4+ #2
[383493.457851] pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT
		-SSBS BTYPE=--)
[383493.465331] pc : vchan_tx_submit+0x64/0xa0
[383493.469957] lr : vchan_tx_submit+0x34/0xa0

This occurs because the transmission timed out, and that's due
to data race. Each thread rewrite channels's descriptor as soon as
device_issue_pending is called. It leads to the situation that
the driver thinks that it uses the right descriptor in interrupt
handler while channels's descriptor has been changed by other
thread. The descriptor which in fact reported interrupt will not
be handled any more, as well as its tx->callback.
That's why timeout reports.

With current fixes channels' descriptor changes it's value only
when it has been used. A new descriptor is acquired from
vc->desc_issued queue that is already filled with descriptors
that are ready to be sent. Threads have no direct access to DMA
channel descriptor. In case of channel's descriptor is busy, try
to submit to HW again when a descriptor is completed. In this case,
vc->desc_issued may be empty when hisi_dma_start_transfer is called,
so delete error reporting on this. Now it is just possible to queue
a descriptor for further processing.

Fixes: e9f08b65 ("dmaengine: hisilicon: Add Kunpeng DMA engine support")
Signed-off-by: default avatarJie Hai <haijie1@huawei.com>
Acked-by: default avatarZhou Wang <wangzhou1@hisilicon.com>
Link: https://lore.kernel.org/r/20220830062251.52993-4-haijie1@huawei.comSigned-off-by: default avatarVinod Koul <vkoul@kernel.org>
parent 94477a79
......@@ -271,7 +271,6 @@ static void hisi_dma_start_transfer(struct hisi_dma_chan *chan)
vd = vchan_next_desc(&chan->vc);
if (!vd) {
dev_err(&hdma_dev->pdev->dev, "no issued task!\n");
chan->desc = NULL;
return;
}
......@@ -303,7 +302,7 @@ static void hisi_dma_issue_pending(struct dma_chan *c)
spin_lock_irqsave(&chan->vc.lock, flags);
if (vchan_issue_pending(&chan->vc))
if (vchan_issue_pending(&chan->vc) && !chan->desc)
hisi_dma_start_transfer(chan);
spin_unlock_irqrestore(&chan->vc.lock, flags);
......@@ -441,11 +440,10 @@ static irqreturn_t hisi_dma_irq(int irq, void *data)
chan->qp_num, chan->cq_head);
if (FIELD_GET(STATUS_MASK, cqe->w0) == STATUS_SUCC) {
vchan_cookie_complete(&desc->vd);
hisi_dma_start_transfer(chan);
} else {
dev_err(&hdma_dev->pdev->dev, "task error!\n");
}
chan->desc = NULL;
}
spin_unlock(&chan->vc.lock);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment