Commit 37b5e889 authored by LEROY Christophe, committed by Herbert Xu

crypto: talitos - chain in buffered data for ahash on SEC1

SEC1 doesn't support S/G in descriptors, so for hash operations
the CPU has to build a buffer containing the buffered block and
the incoming data. This generates a lot of memory copies, which
account for more than 50% of the CPU time of an md5sum operation,
as shown below by 'perf record'.

|--86.24%-- kcapi_md_digest
|          |
|          |--86.18%-- _kcapi_common_vmsplice_chunk_fd
|          |          |
|          |          |--83.68%-- splice
|          |          |          |
|          |          |          |--83.59%-- ret_from_syscall
|          |          |          |          |
|          |          |          |          |--83.52%-- sys_splice
|          |          |          |          |          |
|          |          |          |          |          |--83.49%-- splice_from_pipe
|          |          |          |          |          |          |
|          |          |          |          |          |          |--83.04%-- __splice_from_pipe
|          |          |          |          |          |          |          |
|          |          |          |          |          |          |          |--80.67%-- pipe_to_sendpage
|          |          |          |          |          |          |          |          |
|          |          |          |          |          |          |          |          |--78.25%-- hash_sendpage
|          |          |          |          |          |          |          |          |          |
|          |          |          |          |          |          |          |          |          |--60.08%-- ahash_process_req
|          |          |          |          |          |          |          |          |          |          |
|          |          |          |          |          |          |          |          |          |          |--56.36%-- sg_copy_buffer
|          |          |          |          |          |          |          |          |          |          |          |
|          |          |          |          |          |          |          |          |          |          |          |--55.29%-- memcpy
|          |          |          |          |          |          |          |          |          |          |          |

However, unlike SEC2+, SEC1 offers the possibility to chain
descriptors. It is therefore possible to build a first descriptor
pointing to the buffered data and a second descriptor pointing to
the incoming data, hence avoiding the memory copy to a single
buffer.
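
The chaining idea can be illustrated with a small user-space sketch. Everything in it is hypothetical: `struct sketch_desc`, `sketch_chain()` and `sketch_process()` are illustration-only names and do not reflect the real talitos descriptor layout or driver code, and a toy FNV-1a fold stands in for the hash engine. It only shows that walking two linked descriptors yields the same result as processing one concatenated buffer, without any copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor: NOT the real talitos layout. */
struct sketch_desc {
	const uint8_t *data;       /* buffer this descriptor points to */
	size_t len;                /* number of bytes in that buffer */
	struct sketch_desc *next;  /* next descriptor in the chain, or NULL */
};

/* Link two descriptors: buffered block first, incoming data second. */
static void sketch_chain(struct sketch_desc *first, struct sketch_desc *second,
			 const uint8_t *buffered, size_t nbuf,
			 const uint8_t *incoming, size_t nin)
{
	assert(first && second);

	first->data = buffered;
	first->len = nbuf;
	first->next = second;

	second->data = incoming;
	second->len = nin;
	second->next = NULL;
}

/*
 * Stand-in for the engine: walk the chain and fold every byte into a
 * running FNV-1a value (a toy function, not MD5). No data is copied.
 */
static uint32_t sketch_process(const struct sketch_desc *d)
{
	uint32_t h = 2166136261u;

	for (; d; d = d->next) {
		for (size_t i = 0; i < d->len; i++) {
			h ^= d->data[i];
			h *= 16777619u;
		}
	}
	return h;
}
```

A caller would fill two descriptors with `sketch_chain()` and hand the first one to `sketch_process()`; the result matches a single descriptor covering the same bytes laid out contiguously.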

With this patch, an md5sum of a 90 Mbyte file takes approximately
3 seconds. Without the patch, it takes 6 seconds.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
parent 49f9783b
@@ -236,6 +236,7 @@ static inline bool has_ftr_sec1(struct talitos_private *priv)
#define TALITOS_CCCR_LO_IWSE 0x80 /* chan. ICCR writeback enab. */
#define TALITOS_CCCR_LO_EAE 0x20 /* extended address enable */
#define TALITOS_CCCR_LO_CDWE 0x10 /* chan. done writeback enab. */
#define TALITOS_CCCR_LO_NE 0x8 /* fetch next descriptor enab. */
#define TALITOS_CCCR_LO_NT 0x4 /* notification type */
#define TALITOS_CCCR_LO_CDIE 0x2 /* channel done IRQ enable */
#define TALITOS1_CCCR_LO_RESET 0x1 /* channel reset on SEC1 */
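
As a rough illustration of how such bit flags are typically combined, the sketch below composes a channel configuration word from three of the values listed in the hunk. The helper `sketch_cccr_lo()` and the particular combination of bits are assumptions made for illustration; the driver code that actually programs the register is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the hunk above. */
#define TALITOS_CCCR_LO_CDWE	0x10	/* chan. done writeback enab. */
#define TALITOS_CCCR_LO_NE	0x8	/* fetch next descriptor enab. */
#define TALITOS_CCCR_LO_CDIE	0x2	/* channel done IRQ enable */
#define TALITOS1_CCCR_LO_RESET	0x1	/* channel reset on SEC1 */

/*
 * Hypothetical helper: build the low configuration word for a channel.
 * Assumption: "fetch next descriptor" is only wanted on SEC1, where
 * descriptors can be chained.
 */
static uint32_t sketch_cccr_lo(bool is_sec1)
{
	uint32_t v = TALITOS_CCCR_LO_CDWE | TALITOS_CCCR_LO_CDIE;

	if (is_sec1)
		v |= TALITOS_CCCR_LO_NE;

	/* A normal configuration word must never carry the reset bit. */
	assert((v & TALITOS1_CCCR_LO_RESET) == 0);
	return v;
}
```

With these values, the SEC1 word differs from the other one only by the `TALITOS_CCCR_LO_NE` bit.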