Commit df12a317 authored by Vinod Koul

Merge commit 'dmaengine-3.13-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine

Pull dmaengine changes from Dan

1/ Bartlomiej and Dan finalized a rework of the dma address unmap
   implementation; a short usage sketch of the new unmap API follows this
   list.

2/ In the course of testing 1/ a collection of enhancements to dmatest
   fell out, notably basic performance statistics and fixed / enhanced
   test control through the new module parameters 'run', 'wait',
   'noverify', and 'verbose'.  Thanks to Andriy and Linus for their
   review.

3/ Testing the raid related corner cases of 1/ triggered bugs in the
   recently added 16-source operation support in the ioatdma driver.

4/ Some minor fixes / cleanups to mv_xor and ioatdma.
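
A minimal sketch of the consumer-side pattern introduced by 1/, modeled on
the async_memcpy and dma_async_memcpy_pg_to_pg hunks below (the helper name
sketch_pg_copy is illustrative only, not part of the series): the submitter
attaches the mapping bookkeeping to the descriptor with dma_set_unmap() and
drops its own reference with dmaengine_unmap_put(); the core then unmaps at
completion via dma_descriptor_unmap(), which is what allows the per-driver
unmap code below to be deleted.

#include <linux/dmaengine.h>
#include <linux/dma-mapping.h>

/* Illustrative only: copy one page to another using the new unmap API. */
static struct dma_async_tx_descriptor *
sketch_pg_copy(struct dma_chan *chan, struct page *dst, struct page *src,
	       size_t len)
{
	struct dma_device *dev = chan->device;
	struct dmaengine_unmap_data *unmap;
	struct dma_async_tx_descriptor *tx;

	/* room for two addresses: one source, one destination */
	unmap = dmaengine_get_unmap_data(dev->dev, 2, GFP_NOIO);
	if (!unmap)
		return NULL;

	unmap->to_cnt = 1;
	unmap->addr[0] = dma_map_page(dev->dev, src, 0, len, DMA_TO_DEVICE);
	unmap->from_cnt = 1;
	unmap->addr[1] = dma_map_page(dev->dev, dst, 0, len, DMA_FROM_DEVICE);
	unmap->len = len;

	tx = dev->device_prep_dma_memcpy(chan, unmap->addr[1], unmap->addr[0],
					 len, DMA_CTRL_ACK);
	if (tx)
		dma_set_unmap(tx, unmap);	/* descriptor takes a reference */

	dmaengine_unmap_put(unmap);		/* drop the submitter's reference */
	return tx;
}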

Conflicts:
	drivers/dma/dmatest.c
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
parents 2f986ec6 82a1402e
...@@ -15,39 +15,48 @@ be built as module or inside kernel. Let's consider those cases.

 Part 2 - When dmatest is built as a module...

-After mounting debugfs and loading the module, the /sys/kernel/debug/dmatest
-folder with nodes will be created. There are two important files located. First
-is the 'run' node that controls run and stop phases of the test, and the second
-one, 'results', is used to get the test case results.
-Note that in this case test will not run on load automatically.

 Example of usage:
+% modprobe dmatest channel=dma0chan0 timeout=2000 iterations=1 run=1
+...or:
+% modprobe dmatest
 % echo dma0chan0 > /sys/module/dmatest/parameters/channel
 % echo 2000 > /sys/module/dmatest/parameters/timeout
 % echo 1 > /sys/module/dmatest/parameters/iterations
-% echo 1 > /sys/kernel/debug/dmatest/run
+% echo 1 > /sys/module/dmatest/parameters/run
+...or on the kernel command line:
+dmatest.channel=dma0chan0 dmatest.timeout=2000 dmatest.iterations=1 dmatest.run=1

 Hint: available channel list could be extracted by running the following
 command:
 % ls -1 /sys/class/dma/

-After a while you will start to get messages about current status or error like
-in the original code.
+Once started a message like "dmatest: Started 1 threads using dma0chan0" is
+emitted. After that only test failure messages are reported until the test
+stops.

 Note that running a new test will not stop any in progress test.

-The following command should return actual state of the test.
-% cat /sys/kernel/debug/dmatest/run
+The following command returns the state of the test.
+% cat /sys/module/dmatest/parameters/run

-To wait for test done the user may perform a busy loop that checks the state.
-% while [ $(cat /sys/kernel/debug/dmatest/run) = "Y" ]
-> do
->     echo -n "."
->     sleep 1
-> done
-> echo
+To wait for test completion userspace can poll 'run' until it is false, or use
+the wait parameter. Specifying 'wait=1' when loading the module causes module
+initialization to pause until a test run has completed, while reading
+/sys/module/dmatest/parameters/wait waits for any running test to complete
+before returning. For example, the following scripts wait for 42 tests
+to complete before exiting. Note that if 'iterations' is set to 'infinite' then
+waiting is disabled.
+
+Example:
+% modprobe dmatest run=1 iterations=42 wait=1
+% modprobe -r dmatest
+...or:
+% modprobe dmatest run=1 iterations=42
+% cat /sys/module/dmatest/parameters/wait
+% modprobe -r dmatest

 Part 3 - When built-in in the kernel...

...@@ -62,21 +71,22 @@ case. You always could check them at run-time by running

 Part 4 - Gathering the test results

-The module provides a storage for the test results in the memory. The gathered
-data could be used after test is done.
-The special file 'results' in the debugfs represents gathered data of the in
-progress test. The messages collected are printed to the kernel log as well.
+Test results are printed to the kernel log buffer with the format:
+"dmatest: result <channel>: <test id>: '<error msg>' with src_off=<val> dst_off=<val> len=<val> (<err code>)"

 Example of output:
-% cat /sys/kernel/debug/dmatest/results
-dma0chan0-copy0: #1: No errors with src_off=0x7bf dst_off=0x8ad len=0x3fea (0)
+% dmesg | tail -n 1
+dmatest: result dma0chan0-copy0: #1: No errors with src_off=0x7bf dst_off=0x8ad len=0x3fea (0)

 The message format is unified across the different types of errors. A number in
 the parens represents additional information, e.g. error code, error counter,
-or status.
+or status. A test thread also emits a summary line at completion listing the
+number of tests executed, number that failed, and a result code.

-Comparison between buffers is stored to the dedicated structure.
+Example:
+% dmesg | tail -n 1
+dmatest: dma0chan0-copy0: summary 1 test, 0 failures 1000 iops 100000 KB/s (0)

-Note that the verify result is now accessible only via file 'results' in the
-debugfs.
+The details of a data miscompare error are also emitted, but do not follow the
+above format.
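
As a rough reading of the example summary line above (an assumption about what
the counters mean, not wording from the commit): if 'iops' counts completed
test operations per second and 'KB/s' counts payload throughput, the figures
imply an average transfer size of 100000 KB/s / 1000 iops = 100 KB per
operation.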
...@@ -393,36 +393,6 @@ static inline int iop_chan_zero_sum_slot_count(size_t len, int src_cnt, ...@@ -393,36 +393,6 @@ static inline int iop_chan_zero_sum_slot_count(size_t len, int src_cnt,
return slot_cnt; return slot_cnt;
} }
static inline int iop_desc_is_pq(struct iop_adma_desc_slot *desc)
{
return 0;
}
static inline u32 iop_desc_get_dest_addr(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *chan)
{
union iop3xx_desc hw_desc = { .ptr = desc->hw_desc, };
switch (chan->device->id) {
case DMA0_ID:
case DMA1_ID:
return hw_desc.dma->dest_addr;
case AAU_ID:
return hw_desc.aau->dest_addr;
default:
BUG();
}
return 0;
}
static inline u32 iop_desc_get_qdest_addr(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *chan)
{
BUG();
return 0;
}
static inline u32 iop_desc_get_byte_count(struct iop_adma_desc_slot *desc, static inline u32 iop_desc_get_byte_count(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *chan) struct iop_adma_chan *chan)
{ {
......
...@@ -82,8 +82,6 @@ struct iop_adma_chan { ...@@ -82,8 +82,6 @@ struct iop_adma_chan {
* @slot_cnt: total slots used in an transaction (group of operations) * @slot_cnt: total slots used in an transaction (group of operations)
* @slots_per_op: number of slots per operation * @slots_per_op: number of slots per operation
* @idx: pool index * @idx: pool index
* @unmap_src_cnt: number of xor sources
* @unmap_len: transaction bytecount
* @tx_list: list of descriptors that are associated with one operation * @tx_list: list of descriptors that are associated with one operation
* @async_tx: support for the async_tx api * @async_tx: support for the async_tx api
* @group_list: list of slots that make up a multi-descriptor transaction * @group_list: list of slots that make up a multi-descriptor transaction
...@@ -99,8 +97,6 @@ struct iop_adma_desc_slot { ...@@ -99,8 +97,6 @@ struct iop_adma_desc_slot {
u16 slot_cnt; u16 slot_cnt;
u16 slots_per_op; u16 slots_per_op;
u16 idx; u16 idx;
u16 unmap_src_cnt;
size_t unmap_len;
struct list_head tx_list; struct list_head tx_list;
struct dma_async_tx_descriptor async_tx; struct dma_async_tx_descriptor async_tx;
union { union {
......
...@@ -218,20 +218,6 @@ iop_chan_xor_slot_count(size_t len, int src_cnt, int *slots_per_op) ...@@ -218,20 +218,6 @@ iop_chan_xor_slot_count(size_t len, int src_cnt, int *slots_per_op)
#define iop_chan_pq_slot_count iop_chan_xor_slot_count #define iop_chan_pq_slot_count iop_chan_xor_slot_count
#define iop_chan_pq_zero_sum_slot_count iop_chan_xor_slot_count #define iop_chan_pq_zero_sum_slot_count iop_chan_xor_slot_count
static inline u32 iop_desc_get_dest_addr(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *chan)
{
struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc;
return hw_desc->dest_addr;
}
static inline u32 iop_desc_get_qdest_addr(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *chan)
{
struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc;
return hw_desc->q_dest_addr;
}
static inline u32 iop_desc_get_byte_count(struct iop_adma_desc_slot *desc, static inline u32 iop_desc_get_byte_count(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *chan) struct iop_adma_chan *chan)
{ {
...@@ -350,18 +336,6 @@ iop_desc_init_pq(struct iop_adma_desc_slot *desc, int src_cnt, ...@@ -350,18 +336,6 @@ iop_desc_init_pq(struct iop_adma_desc_slot *desc, int src_cnt,
hw_desc->desc_ctrl = u_desc_ctrl.value; hw_desc->desc_ctrl = u_desc_ctrl.value;
} }
static inline int iop_desc_is_pq(struct iop_adma_desc_slot *desc)
{
struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc;
union {
u32 value;
struct iop13xx_adma_desc_ctrl field;
} u_desc_ctrl;
u_desc_ctrl.value = hw_desc->desc_ctrl;
return u_desc_ctrl.field.pq_xfer_en;
}
static inline void static inline void
iop_desc_init_pq_zero_sum(struct iop_adma_desc_slot *desc, int src_cnt, iop_desc_init_pq_zero_sum(struct iop_adma_desc_slot *desc, int src_cnt,
unsigned long flags) unsigned long flags)
......
...@@ -50,33 +50,36 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset, ...@@ -50,33 +50,36 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
&dest, 1, &src, 1, len); &dest, 1, &src, 1, len);
struct dma_device *device = chan ? chan->device : NULL; struct dma_device *device = chan ? chan->device : NULL;
struct dma_async_tx_descriptor *tx = NULL; struct dma_async_tx_descriptor *tx = NULL;
struct dmaengine_unmap_data *unmap = NULL;
if (device && is_dma_copy_aligned(device, src_offset, dest_offset, len)) { if (device)
dma_addr_t dma_dest, dma_src; unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOIO);
if (unmap && is_dma_copy_aligned(device, src_offset, dest_offset, len)) {
unsigned long dma_prep_flags = 0; unsigned long dma_prep_flags = 0;
if (submit->cb_fn) if (submit->cb_fn)
dma_prep_flags |= DMA_PREP_INTERRUPT; dma_prep_flags |= DMA_PREP_INTERRUPT;
if (submit->flags & ASYNC_TX_FENCE) if (submit->flags & ASYNC_TX_FENCE)
dma_prep_flags |= DMA_PREP_FENCE; dma_prep_flags |= DMA_PREP_FENCE;
dma_dest = dma_map_page(device->dev, dest, dest_offset, len,
DMA_FROM_DEVICE);
dma_src = dma_map_page(device->dev, src, src_offset, len, unmap->to_cnt = 1;
unmap->addr[0] = dma_map_page(device->dev, src, src_offset, len,
DMA_TO_DEVICE); DMA_TO_DEVICE);
unmap->from_cnt = 1;
tx = device->device_prep_dma_memcpy(chan, dma_dest, dma_src, unmap->addr[1] = dma_map_page(device->dev, dest, dest_offset, len,
len, dma_prep_flags);
if (!tx) {
dma_unmap_page(device->dev, dma_dest, len,
DMA_FROM_DEVICE); DMA_FROM_DEVICE);
dma_unmap_page(device->dev, dma_src, len, unmap->len = len;
DMA_TO_DEVICE);
} tx = device->device_prep_dma_memcpy(chan, unmap->addr[1],
unmap->addr[0], len,
dma_prep_flags);
} }
if (tx) { if (tx) {
pr_debug("%s: (async) len: %zu\n", __func__, len); pr_debug("%s: (async) len: %zu\n", __func__, len);
dma_set_unmap(tx, unmap);
async_tx_submit(chan, tx, submit); async_tx_submit(chan, tx, submit);
} else { } else {
void *dest_buf, *src_buf; void *dest_buf, *src_buf;
...@@ -96,6 +99,8 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset, ...@@ -96,6 +99,8 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
async_tx_sync_epilog(submit); async_tx_sync_epilog(submit);
} }
dmaengine_unmap_put(unmap);
return tx; return tx;
} }
EXPORT_SYMBOL_GPL(async_memcpy); EXPORT_SYMBOL_GPL(async_memcpy);
......
...@@ -46,49 +46,24 @@ static struct page *pq_scribble_page; ...@@ -46,49 +46,24 @@ static struct page *pq_scribble_page;
* do_async_gen_syndrome - asynchronously calculate P and/or Q * do_async_gen_syndrome - asynchronously calculate P and/or Q
*/ */
static __async_inline struct dma_async_tx_descriptor * static __async_inline struct dma_async_tx_descriptor *
do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks, do_async_gen_syndrome(struct dma_chan *chan,
const unsigned char *scfs, unsigned int offset, int disks, const unsigned char *scfs, int disks,
size_t len, dma_addr_t *dma_src, struct dmaengine_unmap_data *unmap,
enum dma_ctrl_flags dma_flags,
struct async_submit_ctl *submit) struct async_submit_ctl *submit)
{ {
struct dma_async_tx_descriptor *tx = NULL; struct dma_async_tx_descriptor *tx = NULL;
struct dma_device *dma = chan->device; struct dma_device *dma = chan->device;
enum dma_ctrl_flags dma_flags = 0;
enum async_tx_flags flags_orig = submit->flags; enum async_tx_flags flags_orig = submit->flags;
dma_async_tx_callback cb_fn_orig = submit->cb_fn; dma_async_tx_callback cb_fn_orig = submit->cb_fn;
dma_async_tx_callback cb_param_orig = submit->cb_param; dma_async_tx_callback cb_param_orig = submit->cb_param;
int src_cnt = disks - 2; int src_cnt = disks - 2;
unsigned char coefs[src_cnt];
unsigned short pq_src_cnt; unsigned short pq_src_cnt;
dma_addr_t dma_dest[2]; dma_addr_t dma_dest[2];
int src_off = 0; int src_off = 0;
int idx;
int i;
/* DMAs use destinations as sources, so use BIDIRECTIONAL mapping */
if (P(blocks, disks))
dma_dest[0] = dma_map_page(dma->dev, P(blocks, disks), offset,
len, DMA_BIDIRECTIONAL);
else
dma_flags |= DMA_PREP_PQ_DISABLE_P;
if (Q(blocks, disks))
dma_dest[1] = dma_map_page(dma->dev, Q(blocks, disks), offset,
len, DMA_BIDIRECTIONAL);
else
dma_flags |= DMA_PREP_PQ_DISABLE_Q;
/* convert source addresses being careful to collapse 'empty' if (submit->flags & ASYNC_TX_FENCE)
* sources and update the coefficients accordingly dma_flags |= DMA_PREP_FENCE;
*/
for (i = 0, idx = 0; i < src_cnt; i++) {
if (blocks[i] == NULL)
continue;
dma_src[idx] = dma_map_page(dma->dev, blocks[i], offset, len,
DMA_TO_DEVICE);
coefs[idx] = scfs[i];
idx++;
}
src_cnt = idx;
while (src_cnt > 0) { while (src_cnt > 0) {
submit->flags = flags_orig; submit->flags = flags_orig;
...@@ -100,28 +75,25 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks, ...@@ -100,28 +75,25 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
if (src_cnt > pq_src_cnt) { if (src_cnt > pq_src_cnt) {
submit->flags &= ~ASYNC_TX_ACK; submit->flags &= ~ASYNC_TX_ACK;
submit->flags |= ASYNC_TX_FENCE; submit->flags |= ASYNC_TX_FENCE;
dma_flags |= DMA_COMPL_SKIP_DEST_UNMAP;
submit->cb_fn = NULL; submit->cb_fn = NULL;
submit->cb_param = NULL; submit->cb_param = NULL;
} else { } else {
dma_flags &= ~DMA_COMPL_SKIP_DEST_UNMAP;
submit->cb_fn = cb_fn_orig; submit->cb_fn = cb_fn_orig;
submit->cb_param = cb_param_orig; submit->cb_param = cb_param_orig;
if (cb_fn_orig) if (cb_fn_orig)
dma_flags |= DMA_PREP_INTERRUPT; dma_flags |= DMA_PREP_INTERRUPT;
} }
if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE;
/* Since we have clobbered the src_list we are committed /* Drivers force forward progress in case they can not provide
* to doing this asynchronously. Drivers force forward * a descriptor
* progress in case they can not provide a descriptor
*/ */
for (;;) { for (;;) {
dma_dest[0] = unmap->addr[disks - 2];
dma_dest[1] = unmap->addr[disks - 1];
tx = dma->device_prep_dma_pq(chan, dma_dest, tx = dma->device_prep_dma_pq(chan, dma_dest,
&dma_src[src_off], &unmap->addr[src_off],
pq_src_cnt, pq_src_cnt,
&coefs[src_off], len, &scfs[src_off], unmap->len,
dma_flags); dma_flags);
if (likely(tx)) if (likely(tx))
break; break;
...@@ -129,6 +101,7 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks, ...@@ -129,6 +101,7 @@ do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
dma_async_issue_pending(chan); dma_async_issue_pending(chan);
} }
dma_set_unmap(tx, unmap);
async_tx_submit(chan, tx, submit); async_tx_submit(chan, tx, submit);
submit->depend_tx = tx; submit->depend_tx = tx;
...@@ -188,10 +161,6 @@ do_sync_gen_syndrome(struct page **blocks, unsigned int offset, int disks, ...@@ -188,10 +161,6 @@ do_sync_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
* set to NULL those buffers will be replaced with the raid6_zero_page * set to NULL those buffers will be replaced with the raid6_zero_page
* in the synchronous path and omitted in the hardware-asynchronous * in the synchronous path and omitted in the hardware-asynchronous
* path. * path.
*
* 'blocks' note: if submit->scribble is NULL then the contents of
* 'blocks' may be overwritten to perform address conversions
* (dma_map_page() or page_address()).
*/ */
struct dma_async_tx_descriptor * struct dma_async_tx_descriptor *
async_gen_syndrome(struct page **blocks, unsigned int offset, int disks, async_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
...@@ -202,26 +171,69 @@ async_gen_syndrome(struct page **blocks, unsigned int offset, int disks, ...@@ -202,26 +171,69 @@ async_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
&P(blocks, disks), 2, &P(blocks, disks), 2,
blocks, src_cnt, len); blocks, src_cnt, len);
struct dma_device *device = chan ? chan->device : NULL; struct dma_device *device = chan ? chan->device : NULL;
dma_addr_t *dma_src = NULL; struct dmaengine_unmap_data *unmap = NULL;
BUG_ON(disks > 255 || !(P(blocks, disks) || Q(blocks, disks))); BUG_ON(disks > 255 || !(P(blocks, disks) || Q(blocks, disks)));
if (submit->scribble) if (device)
dma_src = submit->scribble; unmap = dmaengine_get_unmap_data(device->dev, disks, GFP_NOIO);
else if (sizeof(dma_addr_t) <= sizeof(struct page *))
dma_src = (dma_addr_t *) blocks;
if (dma_src && device && if (unmap &&
(src_cnt <= dma_maxpq(device, 0) || (src_cnt <= dma_maxpq(device, 0) ||
dma_maxpq(device, DMA_PREP_CONTINUE) > 0) && dma_maxpq(device, DMA_PREP_CONTINUE) > 0) &&
is_dma_pq_aligned(device, offset, 0, len)) { is_dma_pq_aligned(device, offset, 0, len)) {
struct dma_async_tx_descriptor *tx;
enum dma_ctrl_flags dma_flags = 0;
unsigned char coefs[src_cnt];
int i, j;
/* run the p+q asynchronously */ /* run the p+q asynchronously */
pr_debug("%s: (async) disks: %d len: %zu\n", pr_debug("%s: (async) disks: %d len: %zu\n",
__func__, disks, len); __func__, disks, len);
return do_async_gen_syndrome(chan, blocks, raid6_gfexp, offset,
disks, len, dma_src, submit); /* convert source addresses being careful to collapse 'empty'
* sources and update the coefficients accordingly
*/
unmap->len = len;
for (i = 0, j = 0; i < src_cnt; i++) {
if (blocks[i] == NULL)
continue;
unmap->addr[j] = dma_map_page(device->dev, blocks[i], offset,
len, DMA_TO_DEVICE);
coefs[j] = raid6_gfexp[i];
unmap->to_cnt++;
j++;
}
/*
* DMAs use destinations as sources,
* so use BIDIRECTIONAL mapping
*/
unmap->bidi_cnt++;
if (P(blocks, disks))
unmap->addr[j++] = dma_map_page(device->dev, P(blocks, disks),
offset, len, DMA_BIDIRECTIONAL);
else {
unmap->addr[j++] = 0;
dma_flags |= DMA_PREP_PQ_DISABLE_P;
}
unmap->bidi_cnt++;
if (Q(blocks, disks))
unmap->addr[j++] = dma_map_page(device->dev, Q(blocks, disks),
offset, len, DMA_BIDIRECTIONAL);
else {
unmap->addr[j++] = 0;
dma_flags |= DMA_PREP_PQ_DISABLE_Q;
}
tx = do_async_gen_syndrome(chan, coefs, j, unmap, dma_flags, submit);
dmaengine_unmap_put(unmap);
return tx;
} }
dmaengine_unmap_put(unmap);
/* run the pq synchronously */ /* run the pq synchronously */
pr_debug("%s: (sync) disks: %d len: %zu\n", __func__, disks, len); pr_debug("%s: (sync) disks: %d len: %zu\n", __func__, disks, len);
...@@ -277,50 +289,60 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks, ...@@ -277,50 +289,60 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks,
struct dma_async_tx_descriptor *tx; struct dma_async_tx_descriptor *tx;
unsigned char coefs[disks-2]; unsigned char coefs[disks-2];
enum dma_ctrl_flags dma_flags = submit->cb_fn ? DMA_PREP_INTERRUPT : 0; enum dma_ctrl_flags dma_flags = submit->cb_fn ? DMA_PREP_INTERRUPT : 0;
dma_addr_t *dma_src = NULL; struct dmaengine_unmap_data *unmap = NULL;
int src_cnt = 0;
BUG_ON(disks < 4); BUG_ON(disks < 4);
if (submit->scribble) if (device)
dma_src = submit->scribble; unmap = dmaengine_get_unmap_data(device->dev, disks, GFP_NOIO);
else if (sizeof(dma_addr_t) <= sizeof(struct page *))
dma_src = (dma_addr_t *) blocks;
if (dma_src && device && disks <= dma_maxpq(device, 0) && if (unmap && disks <= dma_maxpq(device, 0) &&
is_dma_pq_aligned(device, offset, 0, len)) { is_dma_pq_aligned(device, offset, 0, len)) {
struct device *dev = device->dev; struct device *dev = device->dev;
dma_addr_t *pq = &dma_src[disks-2]; dma_addr_t pq[2];
int i; int i, j = 0, src_cnt = 0;
pr_debug("%s: (async) disks: %d len: %zu\n", pr_debug("%s: (async) disks: %d len: %zu\n",
__func__, disks, len); __func__, disks, len);
if (!P(blocks, disks))
unmap->len = len;
for (i = 0; i < disks-2; i++)
if (likely(blocks[i])) {
unmap->addr[j] = dma_map_page(dev, blocks[i],
offset, len,
DMA_TO_DEVICE);
coefs[j] = raid6_gfexp[i];
unmap->to_cnt++;
src_cnt++;
j++;
}
if (!P(blocks, disks)) {
pq[0] = 0;
dma_flags |= DMA_PREP_PQ_DISABLE_P; dma_flags |= DMA_PREP_PQ_DISABLE_P;
else } else {
pq[0] = dma_map_page(dev, P(blocks, disks), pq[0] = dma_map_page(dev, P(blocks, disks),
offset, len, offset, len,
DMA_TO_DEVICE); DMA_TO_DEVICE);
if (!Q(blocks, disks)) unmap->addr[j++] = pq[0];
unmap->to_cnt++;
}
if (!Q(blocks, disks)) {
pq[1] = 0;
dma_flags |= DMA_PREP_PQ_DISABLE_Q; dma_flags |= DMA_PREP_PQ_DISABLE_Q;
else } else {
pq[1] = dma_map_page(dev, Q(blocks, disks), pq[1] = dma_map_page(dev, Q(blocks, disks),
offset, len, offset, len,
DMA_TO_DEVICE); DMA_TO_DEVICE);
unmap->addr[j++] = pq[1];
unmap->to_cnt++;
}
if (submit->flags & ASYNC_TX_FENCE) if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE; dma_flags |= DMA_PREP_FENCE;
for (i = 0; i < disks-2; i++)
if (likely(blocks[i])) {
dma_src[src_cnt] = dma_map_page(dev, blocks[i],
offset, len,
DMA_TO_DEVICE);
coefs[src_cnt] = raid6_gfexp[i];
src_cnt++;
}
for (;;) { for (;;) {
tx = device->device_prep_dma_pq_val(chan, pq, dma_src, tx = device->device_prep_dma_pq_val(chan, pq,
unmap->addr,
src_cnt, src_cnt,
coefs, coefs,
len, pqres, len, pqres,
...@@ -330,6 +352,8 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks, ...@@ -330,6 +352,8 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int disks,
async_tx_quiesce(&submit->depend_tx); async_tx_quiesce(&submit->depend_tx);
dma_async_issue_pending(chan); dma_async_issue_pending(chan);
} }
dma_set_unmap(tx, unmap);
async_tx_submit(chan, tx, submit); async_tx_submit(chan, tx, submit);
return tx; return tx;
......
...@@ -26,6 +26,7 @@ ...@@ -26,6 +26,7 @@
#include <linux/dma-mapping.h> #include <linux/dma-mapping.h>
#include <linux/raid/pq.h> #include <linux/raid/pq.h>
#include <linux/async_tx.h> #include <linux/async_tx.h>
#include <linux/dmaengine.h>
static struct dma_async_tx_descriptor * static struct dma_async_tx_descriptor *
async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef, async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
...@@ -34,35 +35,45 @@ async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef, ...@@ -34,35 +35,45 @@ async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ, struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ,
&dest, 1, srcs, 2, len); &dest, 1, srcs, 2, len);
struct dma_device *dma = chan ? chan->device : NULL; struct dma_device *dma = chan ? chan->device : NULL;
struct dmaengine_unmap_data *unmap = NULL;
const u8 *amul, *bmul; const u8 *amul, *bmul;
u8 ax, bx; u8 ax, bx;
u8 *a, *b, *c; u8 *a, *b, *c;
if (dma) { if (dma)
dma_addr_t dma_dest[2]; unmap = dmaengine_get_unmap_data(dma->dev, 3, GFP_NOIO);
dma_addr_t dma_src[2];
if (unmap) {
struct device *dev = dma->dev; struct device *dev = dma->dev;
dma_addr_t pq[2];
struct dma_async_tx_descriptor *tx; struct dma_async_tx_descriptor *tx;
enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P; enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
if (submit->flags & ASYNC_TX_FENCE) if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE; dma_flags |= DMA_PREP_FENCE;
dma_dest[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL); unmap->addr[0] = dma_map_page(dev, srcs[0], 0, len, DMA_TO_DEVICE);
dma_src[0] = dma_map_page(dev, srcs[0], 0, len, DMA_TO_DEVICE); unmap->addr[1] = dma_map_page(dev, srcs[1], 0, len, DMA_TO_DEVICE);
dma_src[1] = dma_map_page(dev, srcs[1], 0, len, DMA_TO_DEVICE); unmap->to_cnt = 2;
tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 2, coef,
unmap->addr[2] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
unmap->bidi_cnt = 1;
/* engine only looks at Q, but expects it to follow P */
pq[1] = unmap->addr[2];
unmap->len = len;
tx = dma->device_prep_dma_pq(chan, pq, unmap->addr, 2, coef,
len, dma_flags); len, dma_flags);
if (tx) { if (tx) {
dma_set_unmap(tx, unmap);
async_tx_submit(chan, tx, submit); async_tx_submit(chan, tx, submit);
dmaengine_unmap_put(unmap);
return tx; return tx;
} }
/* could not get a descriptor, unmap and fall through to /* could not get a descriptor, unmap and fall through to
* the synchronous path * the synchronous path
*/ */
dma_unmap_page(dev, dma_dest[1], len, DMA_BIDIRECTIONAL); dmaengine_unmap_put(unmap);
dma_unmap_page(dev, dma_src[0], len, DMA_TO_DEVICE);
dma_unmap_page(dev, dma_src[1], len, DMA_TO_DEVICE);
} }
/* run the operation synchronously */ /* run the operation synchronously */
...@@ -89,23 +100,38 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len, ...@@ -89,23 +100,38 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ, struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ,
&dest, 1, &src, 1, len); &dest, 1, &src, 1, len);
struct dma_device *dma = chan ? chan->device : NULL; struct dma_device *dma = chan ? chan->device : NULL;
struct dmaengine_unmap_data *unmap = NULL;
const u8 *qmul; /* Q multiplier table */ const u8 *qmul; /* Q multiplier table */
u8 *d, *s; u8 *d, *s;
if (dma) { if (dma)
unmap = dmaengine_get_unmap_data(dma->dev, 3, GFP_NOIO);
if (unmap) {
dma_addr_t dma_dest[2]; dma_addr_t dma_dest[2];
dma_addr_t dma_src[1];
struct device *dev = dma->dev; struct device *dev = dma->dev;
struct dma_async_tx_descriptor *tx; struct dma_async_tx_descriptor *tx;
enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P; enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
if (submit->flags & ASYNC_TX_FENCE) if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE; dma_flags |= DMA_PREP_FENCE;
dma_dest[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL); unmap->addr[0] = dma_map_page(dev, src, 0, len, DMA_TO_DEVICE);
dma_src[0] = dma_map_page(dev, src, 0, len, DMA_TO_DEVICE); unmap->to_cnt++;
tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 1, &coef, unmap->addr[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
len, dma_flags); dma_dest[1] = unmap->addr[1];
unmap->bidi_cnt++;
unmap->len = len;
/* this looks funny, but the engine looks for Q at
* dma_dest[1] and ignores dma_dest[0] as a dest
* due to DMA_PREP_PQ_DISABLE_P
*/
tx = dma->device_prep_dma_pq(chan, dma_dest, unmap->addr,
1, &coef, len, dma_flags);
if (tx) { if (tx) {
dma_set_unmap(tx, unmap);
dmaengine_unmap_put(unmap);
async_tx_submit(chan, tx, submit); async_tx_submit(chan, tx, submit);
return tx; return tx;
} }
...@@ -113,8 +139,7 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len, ...@@ -113,8 +139,7 @@ async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
/* could not get a descriptor, unmap and fall through to /* could not get a descriptor, unmap and fall through to
* the synchronous path * the synchronous path
*/ */
dma_unmap_page(dev, dma_dest[1], len, DMA_BIDIRECTIONAL); dmaengine_unmap_put(unmap);
dma_unmap_page(dev, dma_src[0], len, DMA_TO_DEVICE);
} }
/* no channel available, or failed to allocate a descriptor, so /* no channel available, or failed to allocate a descriptor, so
......
...@@ -33,48 +33,31 @@ ...@@ -33,48 +33,31 @@
/* do_async_xor - dma map the pages and perform the xor with an engine */ /* do_async_xor - dma map the pages and perform the xor with an engine */
static __async_inline struct dma_async_tx_descriptor * static __async_inline struct dma_async_tx_descriptor *
do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list, do_async_xor(struct dma_chan *chan, struct dmaengine_unmap_data *unmap,
unsigned int offset, int src_cnt, size_t len, dma_addr_t *dma_src,
struct async_submit_ctl *submit) struct async_submit_ctl *submit)
{ {
struct dma_device *dma = chan->device; struct dma_device *dma = chan->device;
struct dma_async_tx_descriptor *tx = NULL; struct dma_async_tx_descriptor *tx = NULL;
int src_off = 0;
int i;
dma_async_tx_callback cb_fn_orig = submit->cb_fn; dma_async_tx_callback cb_fn_orig = submit->cb_fn;
void *cb_param_orig = submit->cb_param; void *cb_param_orig = submit->cb_param;
enum async_tx_flags flags_orig = submit->flags; enum async_tx_flags flags_orig = submit->flags;
enum dma_ctrl_flags dma_flags; enum dma_ctrl_flags dma_flags = 0;
int xor_src_cnt = 0; int src_cnt = unmap->to_cnt;
dma_addr_t dma_dest; int xor_src_cnt;
dma_addr_t dma_dest = unmap->addr[unmap->to_cnt];
/* map the dest bidrectional in case it is re-used as a source */ dma_addr_t *src_list = unmap->addr;
dma_dest = dma_map_page(dma->dev, dest, offset, len, DMA_BIDIRECTIONAL);
for (i = 0; i < src_cnt; i++) {
/* only map the dest once */
if (!src_list[i])
continue;
if (unlikely(src_list[i] == dest)) {
dma_src[xor_src_cnt++] = dma_dest;
continue;
}
dma_src[xor_src_cnt++] = dma_map_page(dma->dev, src_list[i], offset,
len, DMA_TO_DEVICE);
}
src_cnt = xor_src_cnt;
while (src_cnt) { while (src_cnt) {
dma_addr_t tmp;
submit->flags = flags_orig; submit->flags = flags_orig;
dma_flags = 0;
xor_src_cnt = min(src_cnt, (int)dma->max_xor); xor_src_cnt = min(src_cnt, (int)dma->max_xor);
/* if we are submitting additional xors, leave the chain open, /* if we are submitting additional xors, leave the chain open
* clear the callback parameters, and leave the destination * and clear the callback parameters
* buffer mapped
*/ */
if (src_cnt > xor_src_cnt) { if (src_cnt > xor_src_cnt) {
submit->flags &= ~ASYNC_TX_ACK; submit->flags &= ~ASYNC_TX_ACK;
submit->flags |= ASYNC_TX_FENCE; submit->flags |= ASYNC_TX_FENCE;
dma_flags = DMA_COMPL_SKIP_DEST_UNMAP;
submit->cb_fn = NULL; submit->cb_fn = NULL;
submit->cb_param = NULL; submit->cb_param = NULL;
} else { } else {
...@@ -85,12 +68,18 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list, ...@@ -85,12 +68,18 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
dma_flags |= DMA_PREP_INTERRUPT; dma_flags |= DMA_PREP_INTERRUPT;
if (submit->flags & ASYNC_TX_FENCE) if (submit->flags & ASYNC_TX_FENCE)
dma_flags |= DMA_PREP_FENCE; dma_flags |= DMA_PREP_FENCE;
/* Since we have clobbered the src_list we are committed
* to doing this asynchronously. Drivers force forward progress /* Drivers force forward progress in case they can not provide a
* in case they can not provide a descriptor * descriptor
*/ */
tx = dma->device_prep_dma_xor(chan, dma_dest, &dma_src[src_off], tmp = src_list[0];
xor_src_cnt, len, dma_flags); if (src_list > unmap->addr)
src_list[0] = dma_dest;
tx = dma->device_prep_dma_xor(chan, dma_dest, src_list,
xor_src_cnt, unmap->len,
dma_flags);
src_list[0] = tmp;
if (unlikely(!tx)) if (unlikely(!tx))
async_tx_quiesce(&submit->depend_tx); async_tx_quiesce(&submit->depend_tx);
...@@ -99,22 +88,21 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list, ...@@ -99,22 +88,21 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
while (unlikely(!tx)) { while (unlikely(!tx)) {
dma_async_issue_pending(chan); dma_async_issue_pending(chan);
tx = dma->device_prep_dma_xor(chan, dma_dest, tx = dma->device_prep_dma_xor(chan, dma_dest,
&dma_src[src_off], src_list,
xor_src_cnt, len, xor_src_cnt, unmap->len,
dma_flags); dma_flags);
} }
dma_set_unmap(tx, unmap);
async_tx_submit(chan, tx, submit); async_tx_submit(chan, tx, submit);
submit->depend_tx = tx; submit->depend_tx = tx;
if (src_cnt > xor_src_cnt) { if (src_cnt > xor_src_cnt) {
/* drop completed sources */ /* drop completed sources */
src_cnt -= xor_src_cnt; src_cnt -= xor_src_cnt;
src_off += xor_src_cnt;
/* use the intermediate result a source */ /* use the intermediate result a source */
dma_src[--src_off] = dma_dest;
src_cnt++; src_cnt++;
src_list += xor_src_cnt - 1;
} else } else
break; break;
} }
...@@ -189,22 +177,40 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset, ...@@ -189,22 +177,40 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR, struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR,
&dest, 1, src_list, &dest, 1, src_list,
src_cnt, len); src_cnt, len);
dma_addr_t *dma_src = NULL; struct dma_device *device = chan ? chan->device : NULL;
struct dmaengine_unmap_data *unmap = NULL;
BUG_ON(src_cnt <= 1); BUG_ON(src_cnt <= 1);
if (submit->scribble) if (device)
dma_src = submit->scribble; unmap = dmaengine_get_unmap_data(device->dev, src_cnt+1, GFP_NOIO);
else if (sizeof(dma_addr_t) <= sizeof(struct page *))
dma_src = (dma_addr_t *) src_list; if (unmap && is_dma_xor_aligned(device, offset, 0, len)) {
struct dma_async_tx_descriptor *tx;
int i, j;
if (dma_src && chan && is_dma_xor_aligned(chan->device, offset, 0, len)) {
/* run the xor asynchronously */ /* run the xor asynchronously */
pr_debug("%s (async): len: %zu\n", __func__, len); pr_debug("%s (async): len: %zu\n", __func__, len);
return do_async_xor(chan, dest, src_list, offset, src_cnt, len, unmap->len = len;
dma_src, submit); for (i = 0, j = 0; i < src_cnt; i++) {
if (!src_list[i])
continue;
unmap->to_cnt++;
unmap->addr[j++] = dma_map_page(device->dev, src_list[i],
offset, len, DMA_TO_DEVICE);
}
/* map it bidirectional as it may be re-used as a source */
unmap->addr[j] = dma_map_page(device->dev, dest, offset, len,
DMA_BIDIRECTIONAL);
unmap->bidi_cnt = 1;
tx = do_async_xor(chan, unmap, submit);
dmaengine_unmap_put(unmap);
return tx;
} else { } else {
dmaengine_unmap_put(unmap);
/* run the xor synchronously */ /* run the xor synchronously */
pr_debug("%s (sync): len: %zu\n", __func__, len); pr_debug("%s (sync): len: %zu\n", __func__, len);
WARN_ONCE(chan, "%s: no space for dma address conversion\n", WARN_ONCE(chan, "%s: no space for dma address conversion\n",
...@@ -268,16 +274,14 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset, ...@@ -268,16 +274,14 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
struct dma_chan *chan = xor_val_chan(submit, dest, src_list, src_cnt, len); struct dma_chan *chan = xor_val_chan(submit, dest, src_list, src_cnt, len);
struct dma_device *device = chan ? chan->device : NULL; struct dma_device *device = chan ? chan->device : NULL;
struct dma_async_tx_descriptor *tx = NULL; struct dma_async_tx_descriptor *tx = NULL;
dma_addr_t *dma_src = NULL; struct dmaengine_unmap_data *unmap = NULL;
BUG_ON(src_cnt <= 1); BUG_ON(src_cnt <= 1);
if (submit->scribble) if (device)
dma_src = submit->scribble; unmap = dmaengine_get_unmap_data(device->dev, src_cnt, GFP_NOIO);
else if (sizeof(dma_addr_t) <= sizeof(struct page *))
dma_src = (dma_addr_t *) src_list;
if (dma_src && device && src_cnt <= device->max_xor && if (unmap && src_cnt <= device->max_xor &&
is_dma_xor_aligned(device, offset, 0, len)) { is_dma_xor_aligned(device, offset, 0, len)) {
unsigned long dma_prep_flags = 0; unsigned long dma_prep_flags = 0;
int i; int i;
...@@ -288,11 +292,15 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset, ...@@ -288,11 +292,15 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
dma_prep_flags |= DMA_PREP_INTERRUPT; dma_prep_flags |= DMA_PREP_INTERRUPT;
if (submit->flags & ASYNC_TX_FENCE) if (submit->flags & ASYNC_TX_FENCE)
dma_prep_flags |= DMA_PREP_FENCE; dma_prep_flags |= DMA_PREP_FENCE;
for (i = 0; i < src_cnt; i++)
dma_src[i] = dma_map_page(device->dev, src_list[i], for (i = 0; i < src_cnt; i++) {
unmap->addr[i] = dma_map_page(device->dev, src_list[i],
offset, len, DMA_TO_DEVICE); offset, len, DMA_TO_DEVICE);
unmap->to_cnt++;
}
unmap->len = len;
tx = device->device_prep_dma_xor_val(chan, dma_src, src_cnt, tx = device->device_prep_dma_xor_val(chan, unmap->addr, src_cnt,
len, result, len, result,
dma_prep_flags); dma_prep_flags);
if (unlikely(!tx)) { if (unlikely(!tx)) {
...@@ -301,11 +309,11 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset, ...@@ -301,11 +309,11 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
while (!tx) { while (!tx) {
dma_async_issue_pending(chan); dma_async_issue_pending(chan);
tx = device->device_prep_dma_xor_val(chan, tx = device->device_prep_dma_xor_val(chan,
dma_src, src_cnt, len, result, unmap->addr, src_cnt, len, result,
dma_prep_flags); dma_prep_flags);
} }
} }
dma_set_unmap(tx, unmap);
async_tx_submit(chan, tx, submit); async_tx_submit(chan, tx, submit);
} else { } else {
enum async_tx_flags flags_orig = submit->flags; enum async_tx_flags flags_orig = submit->flags;
...@@ -327,6 +335,7 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset, ...@@ -327,6 +335,7 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
async_tx_sync_epilog(submit); async_tx_sync_epilog(submit);
submit->flags = flags_orig; submit->flags = flags_orig;
} }
dmaengine_unmap_put(unmap);
return tx; return tx;
} }
......
...@@ -28,7 +28,7 @@ ...@@ -28,7 +28,7 @@
#undef pr #undef pr
#define pr(fmt, args...) pr_info("raid6test: " fmt, ##args) #define pr(fmt, args...) pr_info("raid6test: " fmt, ##args)
#define NDISKS 16 /* Including P and Q */ #define NDISKS 64 /* Including P and Q */
static struct page *dataptrs[NDISKS]; static struct page *dataptrs[NDISKS];
static addr_conv_t addr_conv[NDISKS]; static addr_conv_t addr_conv[NDISKS];
...@@ -219,6 +219,14 @@ static int raid6_test(void) ...@@ -219,6 +219,14 @@ static int raid6_test(void)
err += test(11, &tests); err += test(11, &tests);
err += test(12, &tests); err += test(12, &tests);
} }
/* the 24 disk case is special for ioatdma as it is the boundary point
* at which it needs to switch from 8-source ops to 16-source
* ops for continuation (assumes DMA_HAS_PQ_CONTINUE is not set)
*/
if (NDISKS > 24)
err += test(24, &tests);
err += test(NDISKS, &tests); err += test(NDISKS, &tests);
pr("\n"); pr("\n");
......
...@@ -396,8 +396,7 @@ dma_xfer(struct arasan_cf_dev *acdev, dma_addr_t src, dma_addr_t dest, u32 len) ...@@ -396,8 +396,7 @@ dma_xfer(struct arasan_cf_dev *acdev, dma_addr_t src, dma_addr_t dest, u32 len)
struct dma_async_tx_descriptor *tx; struct dma_async_tx_descriptor *tx;
struct dma_chan *chan = acdev->dma_chan; struct dma_chan *chan = acdev->dma_chan;
dma_cookie_t cookie; dma_cookie_t cookie;
unsigned long flags = DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP | unsigned long flags = DMA_PREP_INTERRUPT;
DMA_COMPL_SKIP_DEST_UNMAP;
int ret = 0; int ret = 0;
tx = chan->device->device_prep_dma_memcpy(chan, dest, src, len, flags); tx = chan->device->device_prep_dma_memcpy(chan, dest, src, len, flags);
......
...@@ -1164,42 +1164,12 @@ static void pl08x_free_txd(struct pl08x_driver_data *pl08x, ...@@ -1164,42 +1164,12 @@ static void pl08x_free_txd(struct pl08x_driver_data *pl08x,
kfree(txd); kfree(txd);
} }
static void pl08x_unmap_buffers(struct pl08x_txd *txd)
{
struct device *dev = txd->vd.tx.chan->device->dev;
struct pl08x_sg *dsg;
if (!(txd->vd.tx.flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
if (txd->vd.tx.flags & DMA_COMPL_SRC_UNMAP_SINGLE)
list_for_each_entry(dsg, &txd->dsg_list, node)
dma_unmap_single(dev, dsg->src_addr, dsg->len,
DMA_TO_DEVICE);
else {
list_for_each_entry(dsg, &txd->dsg_list, node)
dma_unmap_page(dev, dsg->src_addr, dsg->len,
DMA_TO_DEVICE);
}
}
if (!(txd->vd.tx.flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
if (txd->vd.tx.flags & DMA_COMPL_DEST_UNMAP_SINGLE)
list_for_each_entry(dsg, &txd->dsg_list, node)
dma_unmap_single(dev, dsg->dst_addr, dsg->len,
DMA_FROM_DEVICE);
else
list_for_each_entry(dsg, &txd->dsg_list, node)
dma_unmap_page(dev, dsg->dst_addr, dsg->len,
DMA_FROM_DEVICE);
}
}
static void pl08x_desc_free(struct virt_dma_desc *vd) static void pl08x_desc_free(struct virt_dma_desc *vd)
{ {
struct pl08x_txd *txd = to_pl08x_txd(&vd->tx); struct pl08x_txd *txd = to_pl08x_txd(&vd->tx);
struct pl08x_dma_chan *plchan = to_pl08x_chan(vd->tx.chan); struct pl08x_dma_chan *plchan = to_pl08x_chan(vd->tx.chan);
if (!plchan->slave) dma_descriptor_unmap(txd);
pl08x_unmap_buffers(txd);
if (!txd->done) if (!txd->done)
pl08x_release_mux(plchan); pl08x_release_mux(plchan);
......
...@@ -344,31 +344,7 @@ atc_chain_complete(struct at_dma_chan *atchan, struct at_desc *desc) ...@@ -344,31 +344,7 @@ atc_chain_complete(struct at_dma_chan *atchan, struct at_desc *desc)
/* move myself to free_list */ /* move myself to free_list */
list_move(&desc->desc_node, &atchan->free_list); list_move(&desc->desc_node, &atchan->free_list);
/* unmap dma addresses (not on slave channels) */ dma_descriptor_unmap(txd);
if (!atchan->chan_common.private) {
struct device *parent = chan2parent(&atchan->chan_common);
if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
dma_unmap_single(parent,
desc->lli.daddr,
desc->len, DMA_FROM_DEVICE);
else
dma_unmap_page(parent,
desc->lli.daddr,
desc->len, DMA_FROM_DEVICE);
}
if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
dma_unmap_single(parent,
desc->lli.saddr,
desc->len, DMA_TO_DEVICE);
else
dma_unmap_page(parent,
desc->lli.saddr,
desc->len, DMA_TO_DEVICE);
}
}
/* for cyclic transfers, /* for cyclic transfers,
* no need to replay callback function while stopping */ * no need to replay callback function while stopping */
if (!atc_chan_is_cyclic(atchan)) { if (!atc_chan_is_cyclic(atchan)) {
......
...@@ -65,6 +65,7 @@ ...@@ -65,6 +65,7 @@
#include <linux/acpi.h> #include <linux/acpi.h>
#include <linux/acpi_dma.h> #include <linux/acpi_dma.h>
#include <linux/of_dma.h> #include <linux/of_dma.h>
#include <linux/mempool.h>
static DEFINE_MUTEX(dma_list_mutex); static DEFINE_MUTEX(dma_list_mutex);
static DEFINE_IDR(dma_idr); static DEFINE_IDR(dma_idr);
...@@ -901,98 +902,132 @@ void dma_async_device_unregister(struct dma_device *device) ...@@ -901,98 +902,132 @@ void dma_async_device_unregister(struct dma_device *device)
} }
EXPORT_SYMBOL(dma_async_device_unregister); EXPORT_SYMBOL(dma_async_device_unregister);
/** struct dmaengine_unmap_pool {
* dma_async_memcpy_buf_to_buf - offloaded copy between virtual addresses struct kmem_cache *cache;
* @chan: DMA channel to offload copy to const char *name;
* @dest: destination address (virtual) mempool_t *pool;
* @src: source address (virtual) size_t size;
* @len: length };
*
* Both @dest and @src must be mappable to a bus address according to the
* DMA mapping API rules for streaming mappings.
* Both @dest and @src must stay memory resident (kernel memory or locked
* user space pages).
*/
dma_cookie_t
dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,
void *src, size_t len)
{
struct dma_device *dev = chan->device;
struct dma_async_tx_descriptor *tx;
dma_addr_t dma_dest, dma_src;
dma_cookie_t cookie;
unsigned long flags;
dma_src = dma_map_single(dev->dev, src, len, DMA_TO_DEVICE); #define __UNMAP_POOL(x) { .size = x, .name = "dmaengine-unmap-" __stringify(x) }
dma_dest = dma_map_single(dev->dev, dest, len, DMA_FROM_DEVICE); static struct dmaengine_unmap_pool unmap_pool[] = {
flags = DMA_CTRL_ACK | __UNMAP_POOL(2),
DMA_COMPL_SRC_UNMAP_SINGLE | #if IS_ENABLED(CONFIG_ASYNC_TX_DMA)
DMA_COMPL_DEST_UNMAP_SINGLE; __UNMAP_POOL(16),
tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags); __UNMAP_POOL(128),
__UNMAP_POOL(256),
#endif
};
if (!tx) { static struct dmaengine_unmap_pool *__get_unmap_pool(int nr)
dma_unmap_single(dev->dev, dma_src, len, DMA_TO_DEVICE); {
dma_unmap_single(dev->dev, dma_dest, len, DMA_FROM_DEVICE); int order = get_count_order(nr);
return -ENOMEM;
switch (order) {
case 0 ... 1:
return &unmap_pool[0];
case 2 ... 4:
return &unmap_pool[1];
case 5 ... 7:
return &unmap_pool[2];
case 8:
return &unmap_pool[3];
default:
BUG();
return NULL;
} }
}
tx->callback = NULL; static void dmaengine_unmap(struct kref *kref)
cookie = tx->tx_submit(tx); {
struct dmaengine_unmap_data *unmap = container_of(kref, typeof(*unmap), kref);
struct device *dev = unmap->dev;
int cnt, i;
cnt = unmap->to_cnt;
for (i = 0; i < cnt; i++)
dma_unmap_page(dev, unmap->addr[i], unmap->len,
DMA_TO_DEVICE);
cnt += unmap->from_cnt;
for (; i < cnt; i++)
dma_unmap_page(dev, unmap->addr[i], unmap->len,
DMA_FROM_DEVICE);
cnt += unmap->bidi_cnt;
for (; i < cnt; i++) {
if (unmap->addr[i] == 0)
continue;
dma_unmap_page(dev, unmap->addr[i], unmap->len,
DMA_BIDIRECTIONAL);
}
mempool_free(unmap, __get_unmap_pool(cnt)->pool);
}
preempt_disable(); void dmaengine_unmap_put(struct dmaengine_unmap_data *unmap)
__this_cpu_add(chan->local->bytes_transferred, len); {
__this_cpu_inc(chan->local->memcpy_count); if (unmap)
preempt_enable(); kref_put(&unmap->kref, dmaengine_unmap);
}
EXPORT_SYMBOL_GPL(dmaengine_unmap_put);
return cookie; static void dmaengine_destroy_unmap_pool(void)
{
int i;
for (i = 0; i < ARRAY_SIZE(unmap_pool); i++) {
struct dmaengine_unmap_pool *p = &unmap_pool[i];
if (p->pool)
mempool_destroy(p->pool);
p->pool = NULL;
if (p->cache)
kmem_cache_destroy(p->cache);
p->cache = NULL;
}
} }
EXPORT_SYMBOL(dma_async_memcpy_buf_to_buf);
/** static int __init dmaengine_init_unmap_pool(void)
* dma_async_memcpy_buf_to_pg - offloaded copy from address to page
* @chan: DMA channel to offload copy to
* @page: destination page
* @offset: offset in page to copy to
* @kdata: source address (virtual)
* @len: length
*
* Both @page/@offset and @kdata must be mappable to a bus address according
* to the DMA mapping API rules for streaming mappings.
* Both @page/@offset and @kdata must stay memory resident (kernel memory or
* locked user space pages)
*/
dma_cookie_t
dma_async_memcpy_buf_to_pg(struct dma_chan *chan, struct page *page,
unsigned int offset, void *kdata, size_t len)
{ {
struct dma_device *dev = chan->device; int i;
struct dma_async_tx_descriptor *tx;
dma_addr_t dma_dest, dma_src;
dma_cookie_t cookie;
unsigned long flags;
dma_src = dma_map_single(dev->dev, kdata, len, DMA_TO_DEVICE); for (i = 0; i < ARRAY_SIZE(unmap_pool); i++) {
dma_dest = dma_map_page(dev->dev, page, offset, len, DMA_FROM_DEVICE); struct dmaengine_unmap_pool *p = &unmap_pool[i];
flags = DMA_CTRL_ACK | DMA_COMPL_SRC_UNMAP_SINGLE; size_t size;
tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags);
if (!tx) { size = sizeof(struct dmaengine_unmap_data) +
dma_unmap_single(dev->dev, dma_src, len, DMA_TO_DEVICE); sizeof(dma_addr_t) * p->size;
dma_unmap_page(dev->dev, dma_dest, len, DMA_FROM_DEVICE);
return -ENOMEM; p->cache = kmem_cache_create(p->name, size, 0,
SLAB_HWCACHE_ALIGN, NULL);
if (!p->cache)
break;
p->pool = mempool_create_slab_pool(1, p->cache);
if (!p->pool)
break;
} }
tx->callback = NULL; if (i == ARRAY_SIZE(unmap_pool))
cookie = tx->tx_submit(tx); return 0;
preempt_disable(); dmaengine_destroy_unmap_pool();
__this_cpu_add(chan->local->bytes_transferred, len); return -ENOMEM;
__this_cpu_inc(chan->local->memcpy_count); }
preempt_enable();
return cookie; struct dmaengine_unmap_data *
dmaengine_get_unmap_data(struct device *dev, int nr, gfp_t flags)
{
struct dmaengine_unmap_data *unmap;
unmap = mempool_alloc(__get_unmap_pool(nr)->pool, flags);
if (!unmap)
return NULL;
memset(unmap, 0, sizeof(*unmap));
kref_init(&unmap->kref);
unmap->dev = dev;
return unmap;
} }
EXPORT_SYMBOL(dma_async_memcpy_buf_to_pg); EXPORT_SYMBOL(dmaengine_get_unmap_data);
/** /**
* dma_async_memcpy_pg_to_pg - offloaded copy from page to page * dma_async_memcpy_pg_to_pg - offloaded copy from page to page
...@@ -1015,24 +1050,33 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg, ...@@ -1015,24 +1050,33 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
{ {
struct dma_device *dev = chan->device; struct dma_device *dev = chan->device;
struct dma_async_tx_descriptor *tx; struct dma_async_tx_descriptor *tx;
dma_addr_t dma_dest, dma_src; struct dmaengine_unmap_data *unmap;
dma_cookie_t cookie; dma_cookie_t cookie;
unsigned long flags; unsigned long flags;
dma_src = dma_map_page(dev->dev, src_pg, src_off, len, DMA_TO_DEVICE); unmap = dmaengine_get_unmap_data(dev->dev, 2, GFP_NOIO);
dma_dest = dma_map_page(dev->dev, dest_pg, dest_off, len, if (!unmap)
return -ENOMEM;
unmap->to_cnt = 1;
unmap->from_cnt = 1;
unmap->addr[0] = dma_map_page(dev->dev, src_pg, src_off, len,
DMA_TO_DEVICE);
unmap->addr[1] = dma_map_page(dev->dev, dest_pg, dest_off, len,
DMA_FROM_DEVICE); DMA_FROM_DEVICE);
unmap->len = len;
flags = DMA_CTRL_ACK; flags = DMA_CTRL_ACK;
tx = dev->device_prep_dma_memcpy(chan, dma_dest, dma_src, len, flags); tx = dev->device_prep_dma_memcpy(chan, unmap->addr[1], unmap->addr[0],
len, flags);
if (!tx) { if (!tx) {
dma_unmap_page(dev->dev, dma_src, len, DMA_TO_DEVICE); dmaengine_unmap_put(unmap);
dma_unmap_page(dev->dev, dma_dest, len, DMA_FROM_DEVICE);
return -ENOMEM; return -ENOMEM;
} }
tx->callback = NULL; dma_set_unmap(tx, unmap);
cookie = tx->tx_submit(tx); cookie = tx->tx_submit(tx);
dmaengine_unmap_put(unmap);
preempt_disable(); preempt_disable();
__this_cpu_add(chan->local->bytes_transferred, len); __this_cpu_add(chan->local->bytes_transferred, len);
...@@ -1043,6 +1087,52 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg, ...@@ -1043,6 +1087,52 @@ dma_async_memcpy_pg_to_pg(struct dma_chan *chan, struct page *dest_pg,
} }
EXPORT_SYMBOL(dma_async_memcpy_pg_to_pg); EXPORT_SYMBOL(dma_async_memcpy_pg_to_pg);
/**
* dma_async_memcpy_buf_to_buf - offloaded copy between virtual addresses
* @chan: DMA channel to offload copy to
* @dest: destination address (virtual)
* @src: source address (virtual)
* @len: length
*
* Both @dest and @src must be mappable to a bus address according to the
* DMA mapping API rules for streaming mappings.
* Both @dest and @src must stay memory resident (kernel memory or locked
* user space pages).
*/
dma_cookie_t
dma_async_memcpy_buf_to_buf(struct dma_chan *chan, void *dest,
void *src, size_t len)
{
return dma_async_memcpy_pg_to_pg(chan, virt_to_page(dest),
(unsigned long) dest & ~PAGE_MASK,
virt_to_page(src),
(unsigned long) src & ~PAGE_MASK, len);
}
EXPORT_SYMBOL(dma_async_memcpy_buf_to_buf);
/**
* dma_async_memcpy_buf_to_pg - offloaded copy from address to page
* @chan: DMA channel to offload copy to
* @page: destination page
* @offset: offset in page to copy to
* @kdata: source address (virtual)
* @len: length
*
* Both @page/@offset and @kdata must be mappable to a bus address according
* to the DMA mapping API rules for streaming mappings.
* Both @page/@offset and @kdata must stay memory resident (kernel memory or
* locked user space pages)
*/
dma_cookie_t
dma_async_memcpy_buf_to_pg(struct dma_chan *chan, struct page *page,
unsigned int offset, void *kdata, size_t len)
{
return dma_async_memcpy_pg_to_pg(chan, page, offset,
virt_to_page(kdata),
(unsigned long) kdata & ~PAGE_MASK, len);
}
EXPORT_SYMBOL(dma_async_memcpy_buf_to_pg);
void dma_async_tx_descriptor_init(struct dma_async_tx_descriptor *tx, void dma_async_tx_descriptor_init(struct dma_async_tx_descriptor *tx,
struct dma_chan *chan) struct dma_chan *chan)
{ {
...@@ -1116,6 +1206,10 @@ EXPORT_SYMBOL_GPL(dma_run_dependencies); ...@@ -1116,6 +1206,10 @@ EXPORT_SYMBOL_GPL(dma_run_dependencies);
static int __init dma_bus_init(void) static int __init dma_bus_init(void)
{ {
int err = dmaengine_init_unmap_pool();
if (err)
return err;
return class_register(&dma_devclass); return class_register(&dma_devclass);
} }
arch_initcall(dma_bus_init); arch_initcall(dma_bus_init);
......
...@@ -85,10 +85,6 @@ static struct device *chan2dev(struct dma_chan *chan) ...@@ -85,10 +85,6 @@ static struct device *chan2dev(struct dma_chan *chan)
{ {
return &chan->dev->device; return &chan->dev->device;
} }
static struct device *chan2parent(struct dma_chan *chan)
{
return chan->dev->device.parent;
}
static struct dw_desc *dwc_first_active(struct dw_dma_chan *dwc) static struct dw_desc *dwc_first_active(struct dw_dma_chan *dwc)
{ {
...@@ -311,26 +307,7 @@ dwc_descriptor_complete(struct dw_dma_chan *dwc, struct dw_desc *desc, ...@@ -311,26 +307,7 @@ dwc_descriptor_complete(struct dw_dma_chan *dwc, struct dw_desc *desc,
list_splice_init(&desc->tx_list, &dwc->free_list); list_splice_init(&desc->tx_list, &dwc->free_list);
list_move(&desc->desc_node, &dwc->free_list); list_move(&desc->desc_node, &dwc->free_list);
if (!is_slave_direction(dwc->direction)) { dma_descriptor_unmap(txd);
struct device *parent = chan2parent(&dwc->chan);
if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
dma_unmap_single(parent, desc->lli.dar,
desc->total_len, DMA_FROM_DEVICE);
else
dma_unmap_page(parent, desc->lli.dar,
desc->total_len, DMA_FROM_DEVICE);
}
if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
dma_unmap_single(parent, desc->lli.sar,
desc->total_len, DMA_TO_DEVICE);
else
dma_unmap_page(parent, desc->lli.sar,
desc->total_len, DMA_TO_DEVICE);
}
}
spin_unlock_irqrestore(&dwc->lock, flags); spin_unlock_irqrestore(&dwc->lock, flags);
if (callback) if (callback)
......
...@@ -733,28 +733,6 @@ static void ep93xx_dma_advance_work(struct ep93xx_dma_chan *edmac) ...@@ -733,28 +733,6 @@ static void ep93xx_dma_advance_work(struct ep93xx_dma_chan *edmac)
spin_unlock_irqrestore(&edmac->lock, flags); spin_unlock_irqrestore(&edmac->lock, flags);
} }
static void ep93xx_dma_unmap_buffers(struct ep93xx_dma_desc *desc)
{
struct device *dev = desc->txd.chan->device->dev;
if (!(desc->txd.flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
if (desc->txd.flags & DMA_COMPL_SRC_UNMAP_SINGLE)
dma_unmap_single(dev, desc->src_addr, desc->size,
DMA_TO_DEVICE);
else
dma_unmap_page(dev, desc->src_addr, desc->size,
DMA_TO_DEVICE);
}
if (!(desc->txd.flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
if (desc->txd.flags & DMA_COMPL_DEST_UNMAP_SINGLE)
dma_unmap_single(dev, desc->dst_addr, desc->size,
DMA_FROM_DEVICE);
else
dma_unmap_page(dev, desc->dst_addr, desc->size,
DMA_FROM_DEVICE);
}
}
static void ep93xx_dma_tasklet(unsigned long data) static void ep93xx_dma_tasklet(unsigned long data)
{ {
struct ep93xx_dma_chan *edmac = (struct ep93xx_dma_chan *)data; struct ep93xx_dma_chan *edmac = (struct ep93xx_dma_chan *)data;
...@@ -787,13 +765,7 @@ static void ep93xx_dma_tasklet(unsigned long data) ...@@ -787,13 +765,7 @@ static void ep93xx_dma_tasklet(unsigned long data)
/* Now we can release all the chained descriptors */ /* Now we can release all the chained descriptors */
list_for_each_entry_safe(desc, d, &list, node) { list_for_each_entry_safe(desc, d, &list, node) {
/* dma_descriptor_unmap(&desc->txd);
* For the memcpy channels the API requires us to unmap the
* buffers unless requested otherwise.
*/
if (!edmac->chan.private)
ep93xx_dma_unmap_buffers(desc);
ep93xx_dma_desc_put(edmac, desc); ep93xx_dma_desc_put(edmac, desc);
} }
......
...@@ -868,22 +868,7 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan, ...@@ -868,22 +868,7 @@ static void fsldma_cleanup_descriptor(struct fsldma_chan *chan,
/* Run any dependencies */ /* Run any dependencies */
dma_run_dependencies(txd); dma_run_dependencies(txd);
/* Unmap the dst buffer, if requested */ dma_descriptor_unmap(txd);
if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
dma_unmap_single(dev, dst, len, DMA_FROM_DEVICE);
else
dma_unmap_page(dev, dst, len, DMA_FROM_DEVICE);
}
/* Unmap the src buffer, if requested */
if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
dma_unmap_single(dev, src, len, DMA_TO_DEVICE);
else
dma_unmap_page(dev, src, len, DMA_TO_DEVICE);
}
#ifdef FSL_DMA_LD_DEBUG #ifdef FSL_DMA_LD_DEBUG
chan_dbg(chan, "LD %p free\n", desc); chan_dbg(chan, "LD %p free\n", desc);
#endif #endif
......
...@@ -531,21 +531,6 @@ static void ioat1_cleanup_event(unsigned long data) ...@@ -531,21 +531,6 @@ static void ioat1_cleanup_event(unsigned long data)
writew(IOAT_CHANCTRL_RUN, ioat->base.reg_base + IOAT_CHANCTRL_OFFSET); writew(IOAT_CHANCTRL_RUN, ioat->base.reg_base + IOAT_CHANCTRL_OFFSET);
} }
void ioat_dma_unmap(struct ioat_chan_common *chan, enum dma_ctrl_flags flags,
size_t len, struct ioat_dma_descriptor *hw)
{
struct pci_dev *pdev = chan->device->pdev;
size_t offset = len - hw->size;
if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP))
ioat_unmap(pdev, hw->dst_addr - offset, len,
PCI_DMA_FROMDEVICE, flags, 1);
if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP))
ioat_unmap(pdev, hw->src_addr - offset, len,
PCI_DMA_TODEVICE, flags, 0);
}
dma_addr_t ioat_get_current_completion(struct ioat_chan_common *chan) dma_addr_t ioat_get_current_completion(struct ioat_chan_common *chan)
{ {
dma_addr_t phys_complete; dma_addr_t phys_complete;
...@@ -602,7 +587,7 @@ static void __cleanup(struct ioat_dma_chan *ioat, dma_addr_t phys_complete) ...@@ -602,7 +587,7 @@ static void __cleanup(struct ioat_dma_chan *ioat, dma_addr_t phys_complete)
dump_desc_dbg(ioat, desc); dump_desc_dbg(ioat, desc);
if (tx->cookie) { if (tx->cookie) {
dma_cookie_complete(tx); dma_cookie_complete(tx);
ioat_dma_unmap(chan, tx->flags, desc->len, desc->hw); dma_descriptor_unmap(tx);
ioat->active -= desc->hw->tx_cnt; ioat->active -= desc->hw->tx_cnt;
if (tx->callback) { if (tx->callback) {
tx->callback(tx->callback_param); tx->callback(tx->callback_param);
...@@ -833,8 +818,7 @@ int ioat_dma_self_test(struct ioatdma_device *device) ...@@ -833,8 +818,7 @@ int ioat_dma_self_test(struct ioatdma_device *device)
dma_src = dma_map_single(dev, src, IOAT_TEST_SIZE, DMA_TO_DEVICE); dma_src = dma_map_single(dev, src, IOAT_TEST_SIZE, DMA_TO_DEVICE);
dma_dest = dma_map_single(dev, dest, IOAT_TEST_SIZE, DMA_FROM_DEVICE); dma_dest = dma_map_single(dev, dest, IOAT_TEST_SIZE, DMA_FROM_DEVICE);
flags = DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP | flags = DMA_PREP_INTERRUPT;
DMA_PREP_INTERRUPT;
tx = device->common.device_prep_dma_memcpy(dma_chan, dma_dest, dma_src, tx = device->common.device_prep_dma_memcpy(dma_chan, dma_dest, dma_src,
IOAT_TEST_SIZE, flags); IOAT_TEST_SIZE, flags);
if (!tx) { if (!tx) {
...@@ -885,8 +869,7 @@ static char ioat_interrupt_style[32] = "msix"; ...@@ -885,8 +869,7 @@ static char ioat_interrupt_style[32] = "msix";
module_param_string(ioat_interrupt_style, ioat_interrupt_style, module_param_string(ioat_interrupt_style, ioat_interrupt_style,
sizeof(ioat_interrupt_style), 0644); sizeof(ioat_interrupt_style), 0644);
MODULE_PARM_DESC(ioat_interrupt_style, MODULE_PARM_DESC(ioat_interrupt_style,
"set ioat interrupt style: msix (default), " "set ioat interrupt style: msix (default), msi, intx");
"msix-single-vector, msi, intx)");
/** /**
* ioat_dma_setup_interrupts - setup interrupt handler * ioat_dma_setup_interrupts - setup interrupt handler
...@@ -904,8 +887,6 @@ int ioat_dma_setup_interrupts(struct ioatdma_device *device) ...@@ -904,8 +887,6 @@ int ioat_dma_setup_interrupts(struct ioatdma_device *device)
if (!strcmp(ioat_interrupt_style, "msix")) if (!strcmp(ioat_interrupt_style, "msix"))
goto msix; goto msix;
if (!strcmp(ioat_interrupt_style, "msix-single-vector"))
goto msix_single_vector;
if (!strcmp(ioat_interrupt_style, "msi")) if (!strcmp(ioat_interrupt_style, "msi"))
goto msi; goto msi;
if (!strcmp(ioat_interrupt_style, "intx")) if (!strcmp(ioat_interrupt_style, "intx"))
...@@ -920,10 +901,8 @@ int ioat_dma_setup_interrupts(struct ioatdma_device *device) ...@@ -920,10 +901,8 @@ int ioat_dma_setup_interrupts(struct ioatdma_device *device)
device->msix_entries[i].entry = i; device->msix_entries[i].entry = i;
err = pci_enable_msix(pdev, device->msix_entries, msixcnt); err = pci_enable_msix(pdev, device->msix_entries, msixcnt);
if (err < 0) if (err)
goto msi; goto msi;
if (err > 0)
goto msix_single_vector;
for (i = 0; i < msixcnt; i++) { for (i = 0; i < msixcnt; i++) {
msix = &device->msix_entries[i]; msix = &device->msix_entries[i];
...@@ -937,29 +916,13 @@ int ioat_dma_setup_interrupts(struct ioatdma_device *device) ...@@ -937,29 +916,13 @@ int ioat_dma_setup_interrupts(struct ioatdma_device *device)
chan = ioat_chan_by_index(device, j); chan = ioat_chan_by_index(device, j);
devm_free_irq(dev, msix->vector, chan); devm_free_irq(dev, msix->vector, chan);
} }
goto msix_single_vector; goto msi;
} }
} }
intrctrl |= IOAT_INTRCTRL_MSIX_VECTOR_CONTROL; intrctrl |= IOAT_INTRCTRL_MSIX_VECTOR_CONTROL;
device->irq_mode = IOAT_MSIX; device->irq_mode = IOAT_MSIX;
goto done; goto done;
msix_single_vector:
msix = &device->msix_entries[0];
msix->entry = 0;
err = pci_enable_msix(pdev, device->msix_entries, 1);
if (err)
goto msi;
err = devm_request_irq(dev, msix->vector, ioat_dma_do_interrupt, 0,
"ioat-msix", device);
if (err) {
pci_disable_msix(pdev);
goto msi;
}
device->irq_mode = IOAT_MSIX_SINGLE;
goto done;
msi: msi:
err = pci_enable_msi(pdev); err = pci_enable_msi(pdev);
if (err) if (err)
...@@ -971,7 +934,7 @@ int ioat_dma_setup_interrupts(struct ioatdma_device *device) ...@@ -971,7 +934,7 @@ int ioat_dma_setup_interrupts(struct ioatdma_device *device)
pci_disable_msi(pdev); pci_disable_msi(pdev);
goto intx; goto intx;
} }
device->irq_mode = IOAT_MSIX; device->irq_mode = IOAT_MSI;
goto done; goto done;
intx: intx:
......
...@@ -52,7 +52,6 @@ ...@@ -52,7 +52,6 @@
enum ioat_irq_mode { enum ioat_irq_mode {
IOAT_NOIRQ = 0, IOAT_NOIRQ = 0,
IOAT_MSIX, IOAT_MSIX,
IOAT_MSIX_SINGLE,
IOAT_MSI, IOAT_MSI,
IOAT_INTX IOAT_INTX
}; };
...@@ -83,7 +82,6 @@ struct ioatdma_device { ...@@ -83,7 +82,6 @@ struct ioatdma_device {
struct pci_pool *completion_pool; struct pci_pool *completion_pool;
#define MAX_SED_POOLS 5 #define MAX_SED_POOLS 5
struct dma_pool *sed_hw_pool[MAX_SED_POOLS]; struct dma_pool *sed_hw_pool[MAX_SED_POOLS];
struct kmem_cache *sed_pool;
struct dma_device common; struct dma_device common;
u8 version; u8 version;
struct msix_entry msix_entries[4]; struct msix_entry msix_entries[4];
...@@ -342,16 +340,6 @@ static inline bool is_ioat_bug(unsigned long err) ...@@ -342,16 +340,6 @@ static inline bool is_ioat_bug(unsigned long err)
return !!err; return !!err;
} }
static inline void ioat_unmap(struct pci_dev *pdev, dma_addr_t addr, size_t len,
int direction, enum dma_ctrl_flags flags, bool dst)
{
if ((dst && (flags & DMA_COMPL_DEST_UNMAP_SINGLE)) ||
(!dst && (flags & DMA_COMPL_SRC_UNMAP_SINGLE)))
pci_unmap_single(pdev, addr, len, direction);
else
pci_unmap_page(pdev, addr, len, direction);
}
int ioat_probe(struct ioatdma_device *device); int ioat_probe(struct ioatdma_device *device);
int ioat_register(struct ioatdma_device *device); int ioat_register(struct ioatdma_device *device);
int ioat1_dma_probe(struct ioatdma_device *dev, int dca); int ioat1_dma_probe(struct ioatdma_device *dev, int dca);
...@@ -363,8 +351,6 @@ void ioat_init_channel(struct ioatdma_device *device, ...@@ -363,8 +351,6 @@ void ioat_init_channel(struct ioatdma_device *device,
struct ioat_chan_common *chan, int idx); struct ioat_chan_common *chan, int idx);
enum dma_status ioat_dma_tx_status(struct dma_chan *c, dma_cookie_t cookie, enum dma_status ioat_dma_tx_status(struct dma_chan *c, dma_cookie_t cookie,
struct dma_tx_state *txstate); struct dma_tx_state *txstate);
void ioat_dma_unmap(struct ioat_chan_common *chan, enum dma_ctrl_flags flags,
size_t len, struct ioat_dma_descriptor *hw);
bool ioat_cleanup_preamble(struct ioat_chan_common *chan, bool ioat_cleanup_preamble(struct ioat_chan_common *chan,
dma_addr_t *phys_complete); dma_addr_t *phys_complete);
void ioat_kobject_add(struct ioatdma_device *device, struct kobj_type *type); void ioat_kobject_add(struct ioatdma_device *device, struct kobj_type *type);
......
...@@ -148,7 +148,7 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete) ...@@ -148,7 +148,7 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
tx = &desc->txd; tx = &desc->txd;
dump_desc_dbg(ioat, desc); dump_desc_dbg(ioat, desc);
if (tx->cookie) { if (tx->cookie) {
ioat_dma_unmap(chan, tx->flags, desc->len, desc->hw); dma_descriptor_unmap(tx);
dma_cookie_complete(tx); dma_cookie_complete(tx);
if (tx->callback) { if (tx->callback) {
tx->callback(tx->callback_param); tx->callback(tx->callback_param);
......
...@@ -157,7 +157,6 @@ static inline void ioat2_set_chainaddr(struct ioat2_dma_chan *ioat, u64 addr) ...@@ -157,7 +157,6 @@ static inline void ioat2_set_chainaddr(struct ioat2_dma_chan *ioat, u64 addr)
int ioat2_dma_probe(struct ioatdma_device *dev, int dca); int ioat2_dma_probe(struct ioatdma_device *dev, int dca);
int ioat3_dma_probe(struct ioatdma_device *dev, int dca); int ioat3_dma_probe(struct ioatdma_device *dev, int dca);
void ioat3_dma_remove(struct ioatdma_device *dev);
struct dca_provider *ioat2_dca_init(struct pci_dev *pdev, void __iomem *iobase); struct dca_provider *ioat2_dca_init(struct pci_dev *pdev, void __iomem *iobase);
struct dca_provider *ioat3_dca_init(struct pci_dev *pdev, void __iomem *iobase); struct dca_provider *ioat3_dca_init(struct pci_dev *pdev, void __iomem *iobase);
int ioat2_check_space_lock(struct ioat2_dma_chan *ioat, int num_descs); int ioat2_check_space_lock(struct ioat2_dma_chan *ioat, int num_descs);
......
This diff is collapsed.
...@@ -123,6 +123,7 @@ module_param(ioat_dca_enabled, int, 0644); ...@@ -123,6 +123,7 @@ module_param(ioat_dca_enabled, int, 0644);
MODULE_PARM_DESC(ioat_dca_enabled, "control support of dca service (default: 1)"); MODULE_PARM_DESC(ioat_dca_enabled, "control support of dca service (default: 1)");
struct kmem_cache *ioat2_cache; struct kmem_cache *ioat2_cache;
struct kmem_cache *ioat3_sed_cache;
#define DRV_NAME "ioatdma" #define DRV_NAME "ioatdma"
...@@ -207,9 +208,6 @@ static void ioat_remove(struct pci_dev *pdev) ...@@ -207,9 +208,6 @@ static void ioat_remove(struct pci_dev *pdev)
if (!device) if (!device)
return; return;
if (device->version >= IOAT_VER_3_0)
ioat3_dma_remove(device);
dev_err(&pdev->dev, "Removing dma and dca services\n"); dev_err(&pdev->dev, "Removing dma and dca services\n");
if (device->dca) { if (device->dca) {
unregister_dca_provider(device->dca, &pdev->dev); unregister_dca_provider(device->dca, &pdev->dev);
...@@ -221,7 +219,7 @@ static void ioat_remove(struct pci_dev *pdev) ...@@ -221,7 +219,7 @@ static void ioat_remove(struct pci_dev *pdev)
static int __init ioat_init_module(void) static int __init ioat_init_module(void)
{ {
int err; int err = -ENOMEM;
pr_info("%s: Intel(R) QuickData Technology Driver %s\n", pr_info("%s: Intel(R) QuickData Technology Driver %s\n",
DRV_NAME, IOAT_DMA_VERSION); DRV_NAME, IOAT_DMA_VERSION);
...@@ -231,8 +229,20 @@ static int __init ioat_init_module(void) ...@@ -231,8 +229,20 @@ static int __init ioat_init_module(void)
if (!ioat2_cache) if (!ioat2_cache)
return -ENOMEM; return -ENOMEM;
ioat3_sed_cache = KMEM_CACHE(ioat_sed_ent, 0);
if (!ioat3_sed_cache)
goto err_ioat2_cache;
err = pci_register_driver(&ioat_pci_driver); err = pci_register_driver(&ioat_pci_driver);
if (err) if (err)
goto err_ioat3_cache;
return 0;
err_ioat3_cache:
kmem_cache_destroy(ioat3_sed_cache);
err_ioat2_cache:
kmem_cache_destroy(ioat2_cache); kmem_cache_destroy(ioat2_cache);
return err; return err;
......
...@@ -61,80 +61,6 @@ static void iop_adma_free_slots(struct iop_adma_desc_slot *slot) ...@@ -61,80 +61,6 @@ static void iop_adma_free_slots(struct iop_adma_desc_slot *slot)
} }
} }
static void
iop_desc_unmap(struct iop_adma_chan *iop_chan, struct iop_adma_desc_slot *desc)
{
struct dma_async_tx_descriptor *tx = &desc->async_tx;
struct iop_adma_desc_slot *unmap = desc->group_head;
struct device *dev = &iop_chan->device->pdev->dev;
u32 len = unmap->unmap_len;
enum dma_ctrl_flags flags = tx->flags;
u32 src_cnt;
dma_addr_t addr;
dma_addr_t dest;
src_cnt = unmap->unmap_src_cnt;
dest = iop_desc_get_dest_addr(unmap, iop_chan);
if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
enum dma_data_direction dir;
if (src_cnt > 1) /* is xor? */
dir = DMA_BIDIRECTIONAL;
else
dir = DMA_FROM_DEVICE;
dma_unmap_page(dev, dest, len, dir);
}
if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
while (src_cnt--) {
addr = iop_desc_get_src_addr(unmap, iop_chan, src_cnt);
if (addr == dest)
continue;
dma_unmap_page(dev, addr, len, DMA_TO_DEVICE);
}
}
desc->group_head = NULL;
}
static void
iop_desc_unmap_pq(struct iop_adma_chan *iop_chan, struct iop_adma_desc_slot *desc)
{
struct dma_async_tx_descriptor *tx = &desc->async_tx;
struct iop_adma_desc_slot *unmap = desc->group_head;
struct device *dev = &iop_chan->device->pdev->dev;
u32 len = unmap->unmap_len;
enum dma_ctrl_flags flags = tx->flags;
u32 src_cnt = unmap->unmap_src_cnt;
dma_addr_t pdest = iop_desc_get_dest_addr(unmap, iop_chan);
dma_addr_t qdest = iop_desc_get_qdest_addr(unmap, iop_chan);
int i;
if (tx->flags & DMA_PREP_CONTINUE)
src_cnt -= 3;
if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP) && !desc->pq_check_result) {
dma_unmap_page(dev, pdest, len, DMA_BIDIRECTIONAL);
dma_unmap_page(dev, qdest, len, DMA_BIDIRECTIONAL);
}
if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
dma_addr_t addr;
for (i = 0; i < src_cnt; i++) {
addr = iop_desc_get_src_addr(unmap, iop_chan, i);
dma_unmap_page(dev, addr, len, DMA_TO_DEVICE);
}
if (desc->pq_check_result) {
dma_unmap_page(dev, pdest, len, DMA_TO_DEVICE);
dma_unmap_page(dev, qdest, len, DMA_TO_DEVICE);
}
}
desc->group_head = NULL;
}
static dma_cookie_t static dma_cookie_t
iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc, iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc,
struct iop_adma_chan *iop_chan, dma_cookie_t cookie) struct iop_adma_chan *iop_chan, dma_cookie_t cookie)
...@@ -152,15 +78,9 @@ iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc, ...@@ -152,15 +78,9 @@ iop_adma_run_tx_complete_actions(struct iop_adma_desc_slot *desc,
if (tx->callback) if (tx->callback)
tx->callback(tx->callback_param); tx->callback(tx->callback_param);
/* unmap dma addresses dma_descriptor_unmap(tx);
* (unmap_single vs unmap_page?) if (desc->group_head)
*/ desc->group_head = NULL;
if (desc->group_head && desc->unmap_len) {
if (iop_desc_is_pq(desc))
iop_desc_unmap_pq(iop_chan, desc);
else
iop_desc_unmap(iop_chan, desc);
}
} }
/* run dependent operations */ /* run dependent operations */
...@@ -591,7 +511,6 @@ iop_adma_prep_dma_interrupt(struct dma_chan *chan, unsigned long flags) ...@@ -591,7 +511,6 @@ iop_adma_prep_dma_interrupt(struct dma_chan *chan, unsigned long flags)
if (sw_desc) { if (sw_desc) {
grp_start = sw_desc->group_head; grp_start = sw_desc->group_head;
iop_desc_init_interrupt(grp_start, iop_chan); iop_desc_init_interrupt(grp_start, iop_chan);
grp_start->unmap_len = 0;
sw_desc->async_tx.flags = flags; sw_desc->async_tx.flags = flags;
} }
spin_unlock_bh(&iop_chan->lock); spin_unlock_bh(&iop_chan->lock);
...@@ -623,8 +542,6 @@ iop_adma_prep_dma_memcpy(struct dma_chan *chan, dma_addr_t dma_dest, ...@@ -623,8 +542,6 @@ iop_adma_prep_dma_memcpy(struct dma_chan *chan, dma_addr_t dma_dest,
iop_desc_set_byte_count(grp_start, iop_chan, len); iop_desc_set_byte_count(grp_start, iop_chan, len);
iop_desc_set_dest_addr(grp_start, iop_chan, dma_dest); iop_desc_set_dest_addr(grp_start, iop_chan, dma_dest);
iop_desc_set_memcpy_src_addr(grp_start, dma_src); iop_desc_set_memcpy_src_addr(grp_start, dma_src);
sw_desc->unmap_src_cnt = 1;
sw_desc->unmap_len = len;
sw_desc->async_tx.flags = flags; sw_desc->async_tx.flags = flags;
} }
spin_unlock_bh(&iop_chan->lock); spin_unlock_bh(&iop_chan->lock);
...@@ -657,8 +574,6 @@ iop_adma_prep_dma_xor(struct dma_chan *chan, dma_addr_t dma_dest, ...@@ -657,8 +574,6 @@ iop_adma_prep_dma_xor(struct dma_chan *chan, dma_addr_t dma_dest,
iop_desc_init_xor(grp_start, src_cnt, flags); iop_desc_init_xor(grp_start, src_cnt, flags);
iop_desc_set_byte_count(grp_start, iop_chan, len); iop_desc_set_byte_count(grp_start, iop_chan, len);
iop_desc_set_dest_addr(grp_start, iop_chan, dma_dest); iop_desc_set_dest_addr(grp_start, iop_chan, dma_dest);
sw_desc->unmap_src_cnt = src_cnt;
sw_desc->unmap_len = len;
sw_desc->async_tx.flags = flags; sw_desc->async_tx.flags = flags;
while (src_cnt--) while (src_cnt--)
iop_desc_set_xor_src_addr(grp_start, src_cnt, iop_desc_set_xor_src_addr(grp_start, src_cnt,
...@@ -694,8 +609,6 @@ iop_adma_prep_dma_xor_val(struct dma_chan *chan, dma_addr_t *dma_src, ...@@ -694,8 +609,6 @@ iop_adma_prep_dma_xor_val(struct dma_chan *chan, dma_addr_t *dma_src,
grp_start->xor_check_result = result; grp_start->xor_check_result = result;
pr_debug("\t%s: grp_start->xor_check_result: %p\n", pr_debug("\t%s: grp_start->xor_check_result: %p\n",
__func__, grp_start->xor_check_result); __func__, grp_start->xor_check_result);
sw_desc->unmap_src_cnt = src_cnt;
sw_desc->unmap_len = len;
sw_desc->async_tx.flags = flags; sw_desc->async_tx.flags = flags;
while (src_cnt--) while (src_cnt--)
iop_desc_set_zero_sum_src_addr(grp_start, src_cnt, iop_desc_set_zero_sum_src_addr(grp_start, src_cnt,
...@@ -748,8 +661,6 @@ iop_adma_prep_dma_pq(struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src, ...@@ -748,8 +661,6 @@ iop_adma_prep_dma_pq(struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src,
dst[0] = dst[1] & 0x7; dst[0] = dst[1] & 0x7;
iop_desc_set_pq_addr(g, dst); iop_desc_set_pq_addr(g, dst);
sw_desc->unmap_src_cnt = src_cnt;
sw_desc->unmap_len = len;
sw_desc->async_tx.flags = flags; sw_desc->async_tx.flags = flags;
for (i = 0; i < src_cnt; i++) for (i = 0; i < src_cnt; i++)
iop_desc_set_pq_src_addr(g, i, src[i], scf[i]); iop_desc_set_pq_src_addr(g, i, src[i], scf[i]);
...@@ -804,8 +715,6 @@ iop_adma_prep_dma_pq_val(struct dma_chan *chan, dma_addr_t *pq, dma_addr_t *src, ...@@ -804,8 +715,6 @@ iop_adma_prep_dma_pq_val(struct dma_chan *chan, dma_addr_t *pq, dma_addr_t *src,
g->pq_check_result = pqres; g->pq_check_result = pqres;
pr_debug("\t%s: g->pq_check_result: %p\n", pr_debug("\t%s: g->pq_check_result: %p\n",
__func__, g->pq_check_result); __func__, g->pq_check_result);
sw_desc->unmap_src_cnt = src_cnt+2;
sw_desc->unmap_len = len;
sw_desc->async_tx.flags = flags; sw_desc->async_tx.flags = flags;
while (src_cnt--) while (src_cnt--)
iop_desc_set_pq_zero_sum_src_addr(g, src_cnt, iop_desc_set_pq_zero_sum_src_addr(g, src_cnt,
......
...@@ -60,14 +60,6 @@ static u32 mv_desc_get_dest_addr(struct mv_xor_desc_slot *desc) ...@@ -60,14 +60,6 @@ static u32 mv_desc_get_dest_addr(struct mv_xor_desc_slot *desc)
return hw_desc->phy_dest_addr; return hw_desc->phy_dest_addr;
} }
static u32 mv_desc_get_src_addr(struct mv_xor_desc_slot *desc,
int src_idx)
{
struct mv_xor_desc *hw_desc = desc->hw_desc;
return hw_desc->phy_src_addr[mv_phy_src_idx(src_idx)];
}
static void mv_desc_set_byte_count(struct mv_xor_desc_slot *desc, static void mv_desc_set_byte_count(struct mv_xor_desc_slot *desc,
u32 byte_count) u32 byte_count)
{ {
...@@ -278,43 +270,10 @@ mv_xor_run_tx_complete_actions(struct mv_xor_desc_slot *desc, ...@@ -278,43 +270,10 @@ mv_xor_run_tx_complete_actions(struct mv_xor_desc_slot *desc,
desc->async_tx.callback( desc->async_tx.callback(
desc->async_tx.callback_param); desc->async_tx.callback_param);
/* unmap dma addresses dma_descriptor_unmap(&desc->async_tx);
* (unmap_single vs unmap_page?) if (desc->group_head)
*/
if (desc->group_head && desc->unmap_len) {
struct mv_xor_desc_slot *unmap = desc->group_head;
struct device *dev = mv_chan_to_devp(mv_chan);
u32 len = unmap->unmap_len;
enum dma_ctrl_flags flags = desc->async_tx.flags;
u32 src_cnt;
dma_addr_t addr;
dma_addr_t dest;
src_cnt = unmap->unmap_src_cnt;
dest = mv_desc_get_dest_addr(unmap);
if (!(flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
enum dma_data_direction dir;
if (src_cnt > 1) /* is xor ? */
dir = DMA_BIDIRECTIONAL;
else
dir = DMA_FROM_DEVICE;
dma_unmap_page(dev, dest, len, dir);
}
if (!(flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
while (src_cnt--) {
addr = mv_desc_get_src_addr(unmap,
src_cnt);
if (addr == dest)
continue;
dma_unmap_page(dev, addr, len,
DMA_TO_DEVICE);
}
}
desc->group_head = NULL; desc->group_head = NULL;
} }
}
/* run dependent operations */ /* run dependent operations */
dma_run_dependencies(&desc->async_tx); dma_run_dependencies(&desc->async_tx);
...@@ -1076,10 +1035,7 @@ mv_xor_channel_add(struct mv_xor_device *xordev, ...@@ -1076,10 +1035,7 @@ mv_xor_channel_add(struct mv_xor_device *xordev,
} }
mv_chan->mmr_base = xordev->xor_base; mv_chan->mmr_base = xordev->xor_base;
if (!mv_chan->mmr_base) { mv_chan->mmr_high_base = xordev->xor_high_base;
ret = -ENOMEM;
goto err_free_dma;
}
tasklet_init(&mv_chan->irq_tasklet, mv_xor_tasklet, (unsigned long) tasklet_init(&mv_chan->irq_tasklet, mv_xor_tasklet, (unsigned long)
mv_chan); mv_chan);
...@@ -1138,7 +1094,7 @@ static void ...@@ -1138,7 +1094,7 @@ static void
mv_xor_conf_mbus_windows(struct mv_xor_device *xordev, mv_xor_conf_mbus_windows(struct mv_xor_device *xordev,
const struct mbus_dram_target_info *dram) const struct mbus_dram_target_info *dram)
{ {
void __iomem *base = xordev->xor_base; void __iomem *base = xordev->xor_high_base;
u32 win_enable = 0; u32 win_enable = 0;
int i; int i;
......
...@@ -34,13 +34,13 @@ ...@@ -34,13 +34,13 @@
#define XOR_OPERATION_MODE_MEMCPY 2 #define XOR_OPERATION_MODE_MEMCPY 2
#define XOR_DESCRIPTOR_SWAP BIT(14) #define XOR_DESCRIPTOR_SWAP BIT(14)
#define XOR_CURR_DESC(chan) (chan->mmr_base + 0x210 + (chan->idx * 4)) #define XOR_CURR_DESC(chan) (chan->mmr_high_base + 0x10 + (chan->idx * 4))
#define XOR_NEXT_DESC(chan) (chan->mmr_base + 0x200 + (chan->idx * 4)) #define XOR_NEXT_DESC(chan) (chan->mmr_high_base + 0x00 + (chan->idx * 4))
#define XOR_BYTE_COUNT(chan) (chan->mmr_base + 0x220 + (chan->idx * 4)) #define XOR_BYTE_COUNT(chan) (chan->mmr_high_base + 0x20 + (chan->idx * 4))
#define XOR_DEST_POINTER(chan) (chan->mmr_base + 0x2B0 + (chan->idx * 4)) #define XOR_DEST_POINTER(chan) (chan->mmr_high_base + 0xB0 + (chan->idx * 4))
#define XOR_BLOCK_SIZE(chan) (chan->mmr_base + 0x2C0 + (chan->idx * 4)) #define XOR_BLOCK_SIZE(chan) (chan->mmr_high_base + 0xC0 + (chan->idx * 4))
#define XOR_INIT_VALUE_LOW(chan) (chan->mmr_base + 0x2E0) #define XOR_INIT_VALUE_LOW(chan) (chan->mmr_high_base + 0xE0)
#define XOR_INIT_VALUE_HIGH(chan) (chan->mmr_base + 0x2E4) #define XOR_INIT_VALUE_HIGH(chan) (chan->mmr_high_base + 0xE4)
#define XOR_CONFIG(chan) (chan->mmr_base + 0x10 + (chan->idx * 4)) #define XOR_CONFIG(chan) (chan->mmr_base + 0x10 + (chan->idx * 4))
#define XOR_ACTIVATION(chan) (chan->mmr_base + 0x20 + (chan->idx * 4)) #define XOR_ACTIVATION(chan) (chan->mmr_base + 0x20 + (chan->idx * 4))
...@@ -50,11 +50,11 @@ ...@@ -50,11 +50,11 @@
#define XOR_ERROR_ADDR(chan) (chan->mmr_base + 0x60) #define XOR_ERROR_ADDR(chan) (chan->mmr_base + 0x60)
#define XOR_INTR_MASK_VALUE 0x3F5 #define XOR_INTR_MASK_VALUE 0x3F5
#define WINDOW_BASE(w) (0x250 + ((w) << 2)) #define WINDOW_BASE(w) (0x50 + ((w) << 2))
#define WINDOW_SIZE(w) (0x270 + ((w) << 2)) #define WINDOW_SIZE(w) (0x70 + ((w) << 2))
#define WINDOW_REMAP_HIGH(w) (0x290 + ((w) << 2)) #define WINDOW_REMAP_HIGH(w) (0x90 + ((w) << 2))
#define WINDOW_BAR_ENABLE(chan) (0x240 + ((chan) << 2)) #define WINDOW_BAR_ENABLE(chan) (0x40 + ((chan) << 2))
#define WINDOW_OVERRIDE_CTRL(chan) (0x2A0 + ((chan) << 2)) #define WINDOW_OVERRIDE_CTRL(chan) (0xA0 + ((chan) << 2))
struct mv_xor_device { struct mv_xor_device {
void __iomem *xor_base; void __iomem *xor_base;
...@@ -82,6 +82,7 @@ struct mv_xor_chan { ...@@ -82,6 +82,7 @@ struct mv_xor_chan {
int pending; int pending;
spinlock_t lock; /* protects the descriptor slot pool */ spinlock_t lock; /* protects the descriptor slot pool */
void __iomem *mmr_base; void __iomem *mmr_base;
void __iomem *mmr_high_base;
unsigned int idx; unsigned int idx;
int irq; int irq;
enum dma_transaction_type current_type; enum dma_transaction_type current_type;
......
...@@ -2268,6 +2268,8 @@ static void pl330_tasklet(unsigned long data) ...@@ -2268,6 +2268,8 @@ static void pl330_tasklet(unsigned long data)
list_move_tail(&desc->node, &pch->dmac->desc_pool); list_move_tail(&desc->node, &pch->dmac->desc_pool);
} }
dma_descriptor_unmap(&desc->txd);
if (callback) { if (callback) {
spin_unlock_irqrestore(&pch->lock, flags); spin_unlock_irqrestore(&pch->lock, flags);
callback(callback_param); callback(callback_param);
......
...@@ -801,218 +801,6 @@ static void ppc440spe_desc_set_link(struct ppc440spe_adma_chan *chan, ...@@ -801,218 +801,6 @@ static void ppc440spe_desc_set_link(struct ppc440spe_adma_chan *chan,
local_irq_restore(flags); local_irq_restore(flags);
} }
/**
* ppc440spe_desc_get_src_addr - extract the source address from the descriptor
*/
static u32 ppc440spe_desc_get_src_addr(struct ppc440spe_adma_desc_slot *desc,
struct ppc440spe_adma_chan *chan, int src_idx)
{
struct dma_cdb *dma_hw_desc;
struct xor_cb *xor_hw_desc;
switch (chan->device->id) {
case PPC440SPE_DMA0_ID:
case PPC440SPE_DMA1_ID:
dma_hw_desc = desc->hw_desc;
/* May have 0, 1, 2, or 3 sources */
switch (dma_hw_desc->opc) {
case DMA_CDB_OPC_NO_OP:
case DMA_CDB_OPC_DFILL128:
return 0;
case DMA_CDB_OPC_DCHECK128:
if (unlikely(src_idx)) {
printk(KERN_ERR "%s: try to get %d source for"
" DCHECK128\n", __func__, src_idx);
BUG();
}
return le32_to_cpu(dma_hw_desc->sg1l);
case DMA_CDB_OPC_MULTICAST:
case DMA_CDB_OPC_MV_SG1_SG2:
if (unlikely(src_idx > 2)) {
printk(KERN_ERR "%s: try to get %d source from"
" DMA descr\n", __func__, src_idx);
BUG();
}
if (src_idx) {
if (le32_to_cpu(dma_hw_desc->sg1u) &
DMA_CUED_XOR_WIN_MSK) {
u8 region;
if (src_idx == 1)
return le32_to_cpu(
dma_hw_desc->sg1l) +
desc->unmap_len;
region = (le32_to_cpu(
dma_hw_desc->sg1u)) >>
DMA_CUED_REGION_OFF;
region &= DMA_CUED_REGION_MSK;
switch (region) {
case DMA_RXOR123:
return le32_to_cpu(
dma_hw_desc->sg1l) +
(desc->unmap_len << 1);
case DMA_RXOR124:
return le32_to_cpu(
dma_hw_desc->sg1l) +
(desc->unmap_len * 3);
case DMA_RXOR125:
return le32_to_cpu(
dma_hw_desc->sg1l) +
(desc->unmap_len << 2);
default:
printk(KERN_ERR
"%s: try to"
" get src3 for region %02x"
"PPC440SPE_DESC_RXOR12?\n",
__func__, region);
BUG();
}
} else {
printk(KERN_ERR
"%s: try to get %d"
" source for non-cued descr\n",
__func__, src_idx);
BUG();
}
}
return le32_to_cpu(dma_hw_desc->sg1l);
default:
printk(KERN_ERR "%s: unknown OPC 0x%02x\n",
__func__, dma_hw_desc->opc);
BUG();
}
return le32_to_cpu(dma_hw_desc->sg1l);
case PPC440SPE_XOR_ID:
/* May have up to 16 sources */
xor_hw_desc = desc->hw_desc;
return xor_hw_desc->ops[src_idx].l;
}
return 0;
}
/**
* ppc440spe_desc_get_dest_addr - extract the destination address from the
* descriptor
*/
static u32 ppc440spe_desc_get_dest_addr(struct ppc440spe_adma_desc_slot *desc,
struct ppc440spe_adma_chan *chan, int idx)
{
struct dma_cdb *dma_hw_desc;
struct xor_cb *xor_hw_desc;
switch (chan->device->id) {
case PPC440SPE_DMA0_ID:
case PPC440SPE_DMA1_ID:
dma_hw_desc = desc->hw_desc;
if (likely(!idx))
return le32_to_cpu(dma_hw_desc->sg2l);
return le32_to_cpu(dma_hw_desc->sg3l);
case PPC440SPE_XOR_ID:
xor_hw_desc = desc->hw_desc;
return xor_hw_desc->cbtal;
}
return 0;
}
/**
* ppc440spe_desc_get_src_num - extract the number of source addresses from
* the descriptor
*/
static u32 ppc440spe_desc_get_src_num(struct ppc440spe_adma_desc_slot *desc,
struct ppc440spe_adma_chan *chan)
{
struct dma_cdb *dma_hw_desc;
struct xor_cb *xor_hw_desc;
switch (chan->device->id) {
case PPC440SPE_DMA0_ID:
case PPC440SPE_DMA1_ID:
dma_hw_desc = desc->hw_desc;
switch (dma_hw_desc->opc) {
case DMA_CDB_OPC_NO_OP:
case DMA_CDB_OPC_DFILL128:
return 0;
case DMA_CDB_OPC_DCHECK128:
return 1;
case DMA_CDB_OPC_MV_SG1_SG2:
case DMA_CDB_OPC_MULTICAST:
/*
* Only for RXOR operations we have more than
* one source
*/
if (le32_to_cpu(dma_hw_desc->sg1u) &
DMA_CUED_XOR_WIN_MSK) {
/* RXOR op, there are 2 or 3 sources */
if (((le32_to_cpu(dma_hw_desc->sg1u) >>
DMA_CUED_REGION_OFF) &
DMA_CUED_REGION_MSK) == DMA_RXOR12) {
/* RXOR 1-2 */
return 2;
} else {
/* RXOR 1-2-3/1-2-4/1-2-5 */
return 3;
}
}
return 1;
default:
printk(KERN_ERR "%s: unknown OPC 0x%02x\n",
__func__, dma_hw_desc->opc);
BUG();
}
case PPC440SPE_XOR_ID:
/* up to 16 sources */
xor_hw_desc = desc->hw_desc;
return xor_hw_desc->cbc & XOR_CDCR_OAC_MSK;
default:
BUG();
}
return 0;
}
/**
* ppc440spe_desc_get_dst_num - get the number of destination addresses in
* this descriptor
*/
static u32 ppc440spe_desc_get_dst_num(struct ppc440spe_adma_desc_slot *desc,
struct ppc440spe_adma_chan *chan)
{
struct dma_cdb *dma_hw_desc;
switch (chan->device->id) {
case PPC440SPE_DMA0_ID:
case PPC440SPE_DMA1_ID:
/* May be 1 or 2 destinations */
dma_hw_desc = desc->hw_desc;
switch (dma_hw_desc->opc) {
case DMA_CDB_OPC_NO_OP:
case DMA_CDB_OPC_DCHECK128:
return 0;
case DMA_CDB_OPC_MV_SG1_SG2:
case DMA_CDB_OPC_DFILL128:
return 1;
case DMA_CDB_OPC_MULTICAST:
if (desc->dst_cnt == 2)
return 2;
else
return 1;
default:
printk(KERN_ERR "%s: unknown OPC 0x%02x\n",
__func__, dma_hw_desc->opc);
BUG();
}
case PPC440SPE_XOR_ID:
/* Always only 1 destination */
return 1;
default:
BUG();
}
return 0;
}
/** /**
* ppc440spe_desc_get_link - get the address of the descriptor that * ppc440spe_desc_get_link - get the address of the descriptor that
* follows this one * follows this one
...@@ -1705,43 +1493,6 @@ static void ppc440spe_adma_free_slots(struct ppc440spe_adma_desc_slot *slot, ...@@ -1705,43 +1493,6 @@ static void ppc440spe_adma_free_slots(struct ppc440spe_adma_desc_slot *slot,
} }
} }
static void ppc440spe_adma_unmap(struct ppc440spe_adma_chan *chan,
struct ppc440spe_adma_desc_slot *desc)
{
u32 src_cnt, dst_cnt;
dma_addr_t addr;
/*
* get the number of sources & destination
* included in this descriptor and unmap
* them all
*/
src_cnt = ppc440spe_desc_get_src_num(desc, chan);
dst_cnt = ppc440spe_desc_get_dst_num(desc, chan);
/* unmap destinations */
if (!(desc->async_tx.flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
while (dst_cnt--) {
addr = ppc440spe_desc_get_dest_addr(
desc, chan, dst_cnt);
dma_unmap_page(chan->device->dev,
addr, desc->unmap_len,
DMA_FROM_DEVICE);
}
}
/* unmap sources */
if (!(desc->async_tx.flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
while (src_cnt--) {
addr = ppc440spe_desc_get_src_addr(
desc, chan, src_cnt);
dma_unmap_page(chan->device->dev,
addr, desc->unmap_len,
DMA_TO_DEVICE);
}
}
}
/** /**
* ppc440spe_adma_run_tx_complete_actions - call functions to be called * ppc440spe_adma_run_tx_complete_actions - call functions to be called
* upon completion * upon completion
...@@ -1765,26 +1516,7 @@ static dma_cookie_t ppc440spe_adma_run_tx_complete_actions( ...@@ -1765,26 +1516,7 @@ static dma_cookie_t ppc440spe_adma_run_tx_complete_actions(
desc->async_tx.callback( desc->async_tx.callback(
desc->async_tx.callback_param); desc->async_tx.callback_param);
/* unmap dma addresses dma_descriptor_unmap(&desc->async_tx);
* (unmap_single vs unmap_page?)
*
* actually, ppc's dma_unmap_page() functions are empty, so
* the following code is just for the sake of completeness
*/
if (chan && chan->needs_unmap && desc->group_head &&
desc->unmap_len) {
struct ppc440spe_adma_desc_slot *unmap =
desc->group_head;
/* assume 1 slot per op always */
u32 slot_count = unmap->slot_cnt;
/* Run through the group list and unmap addresses */
for (i = 0; i < slot_count; i++) {
BUG_ON(!unmap);
ppc440spe_adma_unmap(chan, unmap);
unmap = unmap->hw_next;
}
}
} }
/* run dependent operations */ /* run dependent operations */
......
...@@ -154,38 +154,6 @@ static bool __td_dma_done_ack(struct timb_dma_chan *td_chan) ...@@ -154,38 +154,6 @@ static bool __td_dma_done_ack(struct timb_dma_chan *td_chan)
return done; return done;
} }
static void __td_unmap_desc(struct timb_dma_chan *td_chan, const u8 *dma_desc,
bool single)
{
dma_addr_t addr;
int len;
addr = (dma_desc[7] << 24) | (dma_desc[6] << 16) | (dma_desc[5] << 8) |
dma_desc[4];
len = (dma_desc[3] << 8) | dma_desc[2];
if (single)
dma_unmap_single(chan2dev(&td_chan->chan), addr, len,
DMA_TO_DEVICE);
else
dma_unmap_page(chan2dev(&td_chan->chan), addr, len,
DMA_TO_DEVICE);
}
static void __td_unmap_descs(struct timb_dma_desc *td_desc, bool single)
{
struct timb_dma_chan *td_chan = container_of(td_desc->txd.chan,
struct timb_dma_chan, chan);
u8 *descs;
for (descs = td_desc->desc_list; ; descs += TIMB_DMA_DESC_SIZE) {
__td_unmap_desc(td_chan, descs, single);
if (descs[0] & 0x02)
break;
}
}
static int td_fill_desc(struct timb_dma_chan *td_chan, u8 *dma_desc, static int td_fill_desc(struct timb_dma_chan *td_chan, u8 *dma_desc,
struct scatterlist *sg, bool last) struct scatterlist *sg, bool last)
{ {
...@@ -293,10 +261,7 @@ static void __td_finish(struct timb_dma_chan *td_chan) ...@@ -293,10 +261,7 @@ static void __td_finish(struct timb_dma_chan *td_chan)
list_move(&td_desc->desc_node, &td_chan->free_list); list_move(&td_desc->desc_node, &td_chan->free_list);
if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) dma_descriptor_unmap(txd);
__td_unmap_descs(td_desc,
txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE);
/* /*
* The API requires that no submissions are done from a * The API requires that no submissions are done from a
* callback, so we don't need to drop the lock here * callback, so we don't need to drop the lock here
......
...@@ -419,30 +419,7 @@ txx9dmac_descriptor_complete(struct txx9dmac_chan *dc, ...@@ -419,30 +419,7 @@ txx9dmac_descriptor_complete(struct txx9dmac_chan *dc,
list_splice_init(&desc->tx_list, &dc->free_list); list_splice_init(&desc->tx_list, &dc->free_list);
list_move(&desc->desc_node, &dc->free_list); list_move(&desc->desc_node, &dc->free_list);
if (!ds) { dma_descriptor_unmap(txd);
dma_addr_t dmaaddr;
if (!(txd->flags & DMA_COMPL_SKIP_DEST_UNMAP)) {
dmaaddr = is_dmac64(dc) ?
desc->hwdesc.DAR : desc->hwdesc32.DAR;
if (txd->flags & DMA_COMPL_DEST_UNMAP_SINGLE)
dma_unmap_single(chan2parent(&dc->chan),
dmaaddr, desc->len, DMA_FROM_DEVICE);
else
dma_unmap_page(chan2parent(&dc->chan),
dmaaddr, desc->len, DMA_FROM_DEVICE);
}
if (!(txd->flags & DMA_COMPL_SKIP_SRC_UNMAP)) {
dmaaddr = is_dmac64(dc) ?
desc->hwdesc.SAR : desc->hwdesc32.SAR;
if (txd->flags & DMA_COMPL_SRC_UNMAP_SINGLE)
dma_unmap_single(chan2parent(&dc->chan),
dmaaddr, desc->len, DMA_TO_DEVICE);
else
dma_unmap_page(chan2parent(&dc->chan),
dmaaddr, desc->len, DMA_TO_DEVICE);
}
}
/* /*
* The API requires that no submissions are done from a * The API requires that no submissions are done from a
* callback, so we don't need to drop the lock here * callback, so we don't need to drop the lock here
......
...@@ -341,8 +341,7 @@ static void deinterlace_issue_dma(struct deinterlace_ctx *ctx, int op, ...@@ -341,8 +341,7 @@ static void deinterlace_issue_dma(struct deinterlace_ctx *ctx, int op,
ctx->xt->dir = DMA_MEM_TO_MEM; ctx->xt->dir = DMA_MEM_TO_MEM;
ctx->xt->src_sgl = false; ctx->xt->src_sgl = false;
ctx->xt->dst_sgl = true; ctx->xt->dst_sgl = true;
flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT | flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;
DMA_COMPL_SKIP_DEST_UNMAP | DMA_COMPL_SKIP_SRC_UNMAP;
tx = dmadev->device_prep_interleaved_dma(chan, ctx->xt, flags); tx = dmadev->device_prep_interleaved_dma(chan, ctx->xt, flags);
if (tx == NULL) { if (tx == NULL) {
......
...@@ -565,7 +565,7 @@ static void buffer_queue(struct videobuf_queue *vq, struct videobuf_buffer *vb) ...@@ -565,7 +565,7 @@ static void buffer_queue(struct videobuf_queue *vq, struct videobuf_buffer *vb)
desc = dmaengine_prep_slave_sg(fh->chan, desc = dmaengine_prep_slave_sg(fh->chan,
buf->sg, sg_elems, DMA_DEV_TO_MEM, buf->sg, sg_elems, DMA_DEV_TO_MEM,
DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP); DMA_PREP_INTERRUPT);
if (!desc) { if (!desc) {
spin_lock_irq(&fh->queue_lock); spin_lock_irq(&fh->queue_lock);
list_del_init(&vb->queue); list_del_init(&vb->queue);
......
...@@ -631,8 +631,7 @@ static int data_submit_dma(struct fpga_device *priv, struct data_buf *buf) ...@@ -631,8 +631,7 @@ static int data_submit_dma(struct fpga_device *priv, struct data_buf *buf)
struct dma_async_tx_descriptor *tx; struct dma_async_tx_descriptor *tx;
dma_cookie_t cookie; dma_cookie_t cookie;
dma_addr_t dst, src; dma_addr_t dst, src;
unsigned long dma_flags = DMA_COMPL_SKIP_DEST_UNMAP | unsigned long dma_flags = 0;
DMA_COMPL_SKIP_SRC_UNMAP;
dst_sg = buf->vb.sglist; dst_sg = buf->vb.sglist;
dst_nents = buf->vb.sglen; dst_nents = buf->vb.sglen;
......
...@@ -375,8 +375,7 @@ static int atmel_nand_dma_op(struct mtd_info *mtd, void *buf, int len, ...@@ -375,8 +375,7 @@ static int atmel_nand_dma_op(struct mtd_info *mtd, void *buf, int len,
dma_dev = host->dma_chan->device; dma_dev = host->dma_chan->device;
flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP | flags = DMA_CTRL_ACK | DMA_PREP_INTERRUPT;
DMA_COMPL_SKIP_DEST_UNMAP;
phys_addr = dma_map_single(dma_dev->dev, p, len, dir); phys_addr = dma_map_single(dma_dev->dev, p, len, dir);
if (dma_mapping_error(dma_dev->dev, phys_addr)) { if (dma_mapping_error(dma_dev->dev, phys_addr)) {
......
...@@ -573,8 +573,6 @@ static int dma_xfer(struct fsmc_nand_data *host, void *buffer, int len, ...@@ -573,8 +573,6 @@ static int dma_xfer(struct fsmc_nand_data *host, void *buffer, int len,
dma_dev = chan->device; dma_dev = chan->device;
dma_addr = dma_map_single(dma_dev->dev, buffer, len, direction); dma_addr = dma_map_single(dma_dev->dev, buffer, len, direction);
flags |= DMA_COMPL_SKIP_SRC_UNMAP | DMA_COMPL_SKIP_DEST_UNMAP;
if (direction == DMA_TO_DEVICE) { if (direction == DMA_TO_DEVICE) {
dma_src = dma_addr; dma_src = dma_addr;
dma_dst = host->data_pa; dma_dst = host->data_pa;
......
...@@ -459,8 +459,7 @@ static int ks8842_tx_frame_dma(struct sk_buff *skb, struct net_device *netdev) ...@@ -459,8 +459,7 @@ static int ks8842_tx_frame_dma(struct sk_buff *skb, struct net_device *netdev)
sg_dma_len(&ctl->sg) += 4 - sg_dma_len(&ctl->sg) % 4; sg_dma_len(&ctl->sg) += 4 - sg_dma_len(&ctl->sg) % 4;
ctl->adesc = dmaengine_prep_slave_sg(ctl->chan, ctl->adesc = dmaengine_prep_slave_sg(ctl->chan,
&ctl->sg, 1, DMA_MEM_TO_DEV, &ctl->sg, 1, DMA_MEM_TO_DEV, DMA_PREP_INTERRUPT);
DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP);
if (!ctl->adesc) if (!ctl->adesc)
return NETDEV_TX_BUSY; return NETDEV_TX_BUSY;
...@@ -571,8 +570,7 @@ static int __ks8842_start_new_rx_dma(struct net_device *netdev) ...@@ -571,8 +570,7 @@ static int __ks8842_start_new_rx_dma(struct net_device *netdev)
sg_dma_len(sg) = DMA_BUFFER_SIZE; sg_dma_len(sg) = DMA_BUFFER_SIZE;
ctl->adesc = dmaengine_prep_slave_sg(ctl->chan, ctl->adesc = dmaengine_prep_slave_sg(ctl->chan,
sg, 1, DMA_DEV_TO_MEM, sg, 1, DMA_DEV_TO_MEM, DMA_PREP_INTERRUPT);
DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_SRC_UNMAP);
if (!ctl->adesc) if (!ctl->adesc)
goto out; goto out;
......
...@@ -1034,10 +1034,9 @@ static void ntb_async_rx(struct ntb_queue_entry *entry, void *offset, ...@@ -1034,10 +1034,9 @@ static void ntb_async_rx(struct ntb_queue_entry *entry, void *offset,
struct dma_chan *chan = qp->dma_chan; struct dma_chan *chan = qp->dma_chan;
struct dma_device *device; struct dma_device *device;
size_t pay_off, buff_off; size_t pay_off, buff_off;
dma_addr_t src, dest; struct dmaengine_unmap_data *unmap;
dma_cookie_t cookie; dma_cookie_t cookie;
void *buf = entry->buf; void *buf = entry->buf;
unsigned long flags;
entry->len = len; entry->len = len;
...@@ -1045,35 +1044,49 @@ static void ntb_async_rx(struct ntb_queue_entry *entry, void *offset, ...@@ -1045,35 +1044,49 @@ static void ntb_async_rx(struct ntb_queue_entry *entry, void *offset,
goto err; goto err;
if (len < copy_bytes) if (len < copy_bytes)
goto err1; goto err_wait;
device = chan->device; device = chan->device;
pay_off = (size_t) offset & ~PAGE_MASK; pay_off = (size_t) offset & ~PAGE_MASK;
buff_off = (size_t) buf & ~PAGE_MASK; buff_off = (size_t) buf & ~PAGE_MASK;
if (!is_dma_copy_aligned(device, pay_off, buff_off, len)) if (!is_dma_copy_aligned(device, pay_off, buff_off, len))
goto err1; goto err_wait;
dest = dma_map_single(device->dev, buf, len, DMA_FROM_DEVICE); unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOWAIT);
if (dma_mapping_error(device->dev, dest)) if (!unmap)
goto err1; goto err_wait;
src = dma_map_single(device->dev, offset, len, DMA_TO_DEVICE); unmap->len = len;
if (dma_mapping_error(device->dev, src)) unmap->addr[0] = dma_map_page(device->dev, virt_to_page(offset),
goto err2; pay_off, len, DMA_TO_DEVICE);
if (dma_mapping_error(device->dev, unmap->addr[0]))
goto err_get_unmap;
unmap->to_cnt = 1;
flags = DMA_COMPL_DEST_UNMAP_SINGLE | DMA_COMPL_SRC_UNMAP_SINGLE | unmap->addr[1] = dma_map_page(device->dev, virt_to_page(buf),
DMA_PREP_INTERRUPT; buff_off, len, DMA_FROM_DEVICE);
txd = device->device_prep_dma_memcpy(chan, dest, src, len, flags); if (dma_mapping_error(device->dev, unmap->addr[1]))
goto err_get_unmap;
unmap->from_cnt = 1;
txd = device->device_prep_dma_memcpy(chan, unmap->addr[1],
unmap->addr[0], len,
DMA_PREP_INTERRUPT);
if (!txd) if (!txd)
goto err3; goto err_get_unmap;
txd->callback = ntb_rx_copy_callback; txd->callback = ntb_rx_copy_callback;
txd->callback_param = entry; txd->callback_param = entry;
dma_set_unmap(txd, unmap);
cookie = dmaengine_submit(txd); cookie = dmaengine_submit(txd);
if (dma_submit_error(cookie)) if (dma_submit_error(cookie))
goto err3; goto err_set_unmap;
dmaengine_unmap_put(unmap);
qp->last_cookie = cookie; qp->last_cookie = cookie;
...@@ -1081,11 +1094,11 @@ static void ntb_async_rx(struct ntb_queue_entry *entry, void *offset, ...@@ -1081,11 +1094,11 @@ static void ntb_async_rx(struct ntb_queue_entry *entry, void *offset,
return; return;
err3: err_set_unmap:
dma_unmap_single(device->dev, src, len, DMA_TO_DEVICE); dmaengine_unmap_put(unmap);
err2: err_get_unmap:
dma_unmap_single(device->dev, dest, len, DMA_FROM_DEVICE); dmaengine_unmap_put(unmap);
err1: err_wait:
/* If the callbacks come out of order, the writing of the index to the /* If the callbacks come out of order, the writing of the index to the
* last completed will be out of order. This may result in the * last completed will be out of order. This may result in the
* receive stalling forever. * receive stalling forever.
...@@ -1245,12 +1258,12 @@ static void ntb_async_tx(struct ntb_transport_qp *qp, ...@@ -1245,12 +1258,12 @@ static void ntb_async_tx(struct ntb_transport_qp *qp,
struct dma_chan *chan = qp->dma_chan; struct dma_chan *chan = qp->dma_chan;
struct dma_device *device; struct dma_device *device;
size_t dest_off, buff_off; size_t dest_off, buff_off;
dma_addr_t src, dest; struct dmaengine_unmap_data *unmap;
dma_addr_t dest;
dma_cookie_t cookie; dma_cookie_t cookie;
void __iomem *offset; void __iomem *offset;
size_t len = entry->len; size_t len = entry->len;
void *buf = entry->buf; void *buf = entry->buf;
unsigned long flags;
offset = qp->tx_mw + qp->tx_max_frame * qp->tx_index; offset = qp->tx_mw + qp->tx_max_frame * qp->tx_index;
hdr = offset + qp->tx_max_frame - sizeof(struct ntb_payload_header); hdr = offset + qp->tx_max_frame - sizeof(struct ntb_payload_header);
...@@ -1273,28 +1286,41 @@ static void ntb_async_tx(struct ntb_transport_qp *qp, ...@@ -1273,28 +1286,41 @@ static void ntb_async_tx(struct ntb_transport_qp *qp,
if (!is_dma_copy_aligned(device, buff_off, dest_off, len)) if (!is_dma_copy_aligned(device, buff_off, dest_off, len))
goto err; goto err;
src = dma_map_single(device->dev, buf, len, DMA_TO_DEVICE); unmap = dmaengine_get_unmap_data(device->dev, 1, GFP_NOWAIT);
if (dma_mapping_error(device->dev, src)) if (!unmap)
goto err; goto err;
flags = DMA_COMPL_SRC_UNMAP_SINGLE | DMA_PREP_INTERRUPT; unmap->len = len;
txd = device->device_prep_dma_memcpy(chan, dest, src, len, flags); unmap->addr[0] = dma_map_page(device->dev, virt_to_page(buf),
buff_off, len, DMA_TO_DEVICE);
if (dma_mapping_error(device->dev, unmap->addr[0]))
goto err_get_unmap;
unmap->to_cnt = 1;
txd = device->device_prep_dma_memcpy(chan, dest, unmap->addr[0], len,
DMA_PREP_INTERRUPT);
if (!txd) if (!txd)
goto err1; goto err_get_unmap;
txd->callback = ntb_tx_copy_callback; txd->callback = ntb_tx_copy_callback;
txd->callback_param = entry; txd->callback_param = entry;
dma_set_unmap(txd, unmap);
cookie = dmaengine_submit(txd); cookie = dmaengine_submit(txd);
if (dma_submit_error(cookie)) if (dma_submit_error(cookie))
goto err1; goto err_set_unmap;
dmaengine_unmap_put(unmap);
dma_async_issue_pending(chan); dma_async_issue_pending(chan);
qp->tx_async++; qp->tx_async++;
return; return;
err1: err_set_unmap:
dma_unmap_single(device->dev, src, len, DMA_TO_DEVICE); dmaengine_unmap_put(unmap);
err_get_unmap:
dmaengine_unmap_put(unmap);
err: err:
ntb_memcpy_tx(entry, offset); ntb_memcpy_tx(entry, offset);
qp->tx_memcpy++; qp->tx_memcpy++;
......
...@@ -150,7 +150,7 @@ static int mid_spi_dma_transfer(struct dw_spi *dws, int cs_change) ...@@ -150,7 +150,7 @@ static int mid_spi_dma_transfer(struct dw_spi *dws, int cs_change)
&dws->tx_sgl, &dws->tx_sgl,
1, 1,
DMA_MEM_TO_DEV, DMA_MEM_TO_DEV,
DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_DEST_UNMAP); DMA_PREP_INTERRUPT);
txdesc->callback = dw_spi_dma_done; txdesc->callback = dw_spi_dma_done;
txdesc->callback_param = dws; txdesc->callback_param = dws;
...@@ -173,7 +173,7 @@ static int mid_spi_dma_transfer(struct dw_spi *dws, int cs_change) ...@@ -173,7 +173,7 @@ static int mid_spi_dma_transfer(struct dw_spi *dws, int cs_change)
&dws->rx_sgl, &dws->rx_sgl,
1, 1,
DMA_DEV_TO_MEM, DMA_DEV_TO_MEM,
DMA_PREP_INTERRUPT | DMA_COMPL_SKIP_DEST_UNMAP); DMA_PREP_INTERRUPT);
rxdesc->callback = dw_spi_dma_done; rxdesc->callback = dw_spi_dma_done;
rxdesc->callback_param = dws; rxdesc->callback_param = dws;
......
...@@ -171,12 +171,6 @@ struct dma_interleaved_template { ...@@ -171,12 +171,6 @@ struct dma_interleaved_template {
* @DMA_CTRL_ACK - if clear, the descriptor cannot be reused until the client * @DMA_CTRL_ACK - if clear, the descriptor cannot be reused until the client
* acknowledges receipt, i.e. has has a chance to establish any dependency * acknowledges receipt, i.e. has has a chance to establish any dependency
* chains * chains
* @DMA_COMPL_SKIP_SRC_UNMAP - set to disable dma-unmapping the source buffer(s)
* @DMA_COMPL_SKIP_DEST_UNMAP - set to disable dma-unmapping the destination(s)
* @DMA_COMPL_SRC_UNMAP_SINGLE - set to do the source dma-unmapping as single
* (if not set, do the source dma-unmapping as page)
* @DMA_COMPL_DEST_UNMAP_SINGLE - set to do the destination dma-unmapping as single
* (if not set, do the destination dma-unmapping as page)
* @DMA_PREP_PQ_DISABLE_P - prevent generation of P while generating Q * @DMA_PREP_PQ_DISABLE_P - prevent generation of P while generating Q
* @DMA_PREP_PQ_DISABLE_Q - prevent generation of Q while generating P * @DMA_PREP_PQ_DISABLE_Q - prevent generation of Q while generating P
* @DMA_PREP_CONTINUE - indicate to a driver that it is reusing buffers as * @DMA_PREP_CONTINUE - indicate to a driver that it is reusing buffers as
...@@ -188,14 +182,10 @@ struct dma_interleaved_template { ...@@ -188,14 +182,10 @@ struct dma_interleaved_template {
enum dma_ctrl_flags { enum dma_ctrl_flags {
DMA_PREP_INTERRUPT = (1 << 0), DMA_PREP_INTERRUPT = (1 << 0),
DMA_CTRL_ACK = (1 << 1), DMA_CTRL_ACK = (1 << 1),
DMA_COMPL_SKIP_SRC_UNMAP = (1 << 2), DMA_PREP_PQ_DISABLE_P = (1 << 2),
DMA_COMPL_SKIP_DEST_UNMAP = (1 << 3), DMA_PREP_PQ_DISABLE_Q = (1 << 3),
DMA_COMPL_SRC_UNMAP_SINGLE = (1 << 4), DMA_PREP_CONTINUE = (1 << 4),
DMA_COMPL_DEST_UNMAP_SINGLE = (1 << 5), DMA_PREP_FENCE = (1 << 5),
DMA_PREP_PQ_DISABLE_P = (1 << 6),
DMA_PREP_PQ_DISABLE_Q = (1 << 7),
DMA_PREP_CONTINUE = (1 << 8),
DMA_PREP_FENCE = (1 << 9),
}; };
/** /**
...@@ -413,6 +403,17 @@ void dma_chan_cleanup(struct kref *kref); ...@@ -413,6 +403,17 @@ void dma_chan_cleanup(struct kref *kref);
typedef bool (*dma_filter_fn)(struct dma_chan *chan, void *filter_param); typedef bool (*dma_filter_fn)(struct dma_chan *chan, void *filter_param);
typedef void (*dma_async_tx_callback)(void *dma_async_param); typedef void (*dma_async_tx_callback)(void *dma_async_param);
struct dmaengine_unmap_data {
u8 to_cnt;
u8 from_cnt;
u8 bidi_cnt;
struct device *dev;
struct kref kref;
size_t len;
dma_addr_t addr[0];
};
/** /**
* struct dma_async_tx_descriptor - async transaction descriptor * struct dma_async_tx_descriptor - async transaction descriptor
* ---dma generic offload fields--- * ---dma generic offload fields---
...@@ -438,6 +439,7 @@ struct dma_async_tx_descriptor { ...@@ -438,6 +439,7 @@ struct dma_async_tx_descriptor {
dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx); dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
dma_async_tx_callback callback; dma_async_tx_callback callback;
void *callback_param; void *callback_param;
struct dmaengine_unmap_data *unmap;
#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH #ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
struct dma_async_tx_descriptor *next; struct dma_async_tx_descriptor *next;
struct dma_async_tx_descriptor *parent; struct dma_async_tx_descriptor *parent;
...@@ -445,6 +447,40 @@ struct dma_async_tx_descriptor { ...@@ -445,6 +447,40 @@ struct dma_async_tx_descriptor {
#endif #endif
}; };
#ifdef CONFIG_DMA_ENGINE
static inline void dma_set_unmap(struct dma_async_tx_descriptor *tx,
struct dmaengine_unmap_data *unmap)
{
kref_get(&unmap->kref);
tx->unmap = unmap;
}
struct dmaengine_unmap_data *
dmaengine_get_unmap_data(struct device *dev, int nr, gfp_t flags);
void dmaengine_unmap_put(struct dmaengine_unmap_data *unmap);
#else
static inline void dma_set_unmap(struct dma_async_tx_descriptor *tx,
struct dmaengine_unmap_data *unmap)
{
}
static inline struct dmaengine_unmap_data *
dmaengine_get_unmap_data(struct device *dev, int nr, gfp_t flags)
{
return NULL;
}
static inline void dmaengine_unmap_put(struct dmaengine_unmap_data *unmap)
{
}
#endif
static inline void dma_descriptor_unmap(struct dma_async_tx_descriptor *tx)
{
if (tx->unmap) {
dmaengine_unmap_put(tx->unmap);
tx->unmap = NULL;
}
}
#ifndef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH #ifndef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
static inline void txd_lock(struct dma_async_tx_descriptor *txd) static inline void txd_lock(struct dma_async_tx_descriptor *txd)
{ {
......
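For reference, below is a minimal sketch of how a client is expected to drive the dmaengine_unmap_data API declared in the hunk above. The function name and its parameters are hypothetical; the dmaengine calls (dmaengine_get_unmap_data(), dma_set_unmap(), dmaengine_unmap_put(), device_prep_dma_memcpy()) are the ones visible in this series, and the reference handling mirrors the ntb_transport conversion earlier in the diff. It is illustrative only, not an excerpt from the series.

#include <linux/dmaengine.h>
#include <linux/dma-mapping.h>

/* Hypothetical memcpy offload using the new unmap bookkeeping. */
static int example_memcpy_offload(struct dma_chan *chan,
				  struct page *dst_page, unsigned long dst_off,
				  struct page *src_page, unsigned long src_off,
				  size_t len)
{
	struct dma_device *dev = chan->device;
	struct dmaengine_unmap_data *unmap;
	struct dma_async_tx_descriptor *tx;
	dma_cookie_t cookie;

	/* room for one source plus one destination address */
	unmap = dmaengine_get_unmap_data(dev->dev, 2, GFP_NOWAIT);
	if (!unmap)
		return -ENOMEM;

	unmap->len = len;
	unmap->addr[0] = dma_map_page(dev->dev, src_page, src_off, len,
				      DMA_TO_DEVICE);
	if (dma_mapping_error(dev->dev, unmap->addr[0]))
		goto err_put;
	unmap->to_cnt = 1;

	unmap->addr[1] = dma_map_page(dev->dev, dst_page, dst_off, len,
				      DMA_FROM_DEVICE);
	if (dma_mapping_error(dev->dev, unmap->addr[1]))
		goto err_put;
	unmap->from_cnt = 1;

	tx = dev->device_prep_dma_memcpy(chan, unmap->addr[1], unmap->addr[0],
					 len, DMA_PREP_INTERRUPT);
	if (!tx)
		goto err_put;

	/* the descriptor takes its own reference (kref_get in dma_set_unmap) */
	dma_set_unmap(tx, unmap);

	cookie = dmaengine_submit(tx);
	if (dma_submit_error(cookie)) {
		/* drop the descriptor's reference, then the submitter's below */
		dmaengine_unmap_put(unmap);
		goto err_put;
	}

	/* completion path now owns the last reference */
	dmaengine_unmap_put(unmap);
	dma_async_issue_pending(chan);
	return 0;

err_put:
	dmaengine_unmap_put(unmap);
	return -EIO;
}

On the completion side the driver simply calls dma_descriptor_unmap(tx), as the conversions above now do; that drops the descriptor's reference, so the submitter's dmaengine_unmap_put() after dmaengine_submit() leaves the unmap data alive exactly until the transfer completes.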