- 11 Feb, 2017 1 commit
-
-
Ard Biesheuvel authored
The PMULL based CRC32 implementation already contains code based on the separate, optional CRC32 instructions to fallback to when operating on small quantities of data. We can expose these routines directly on systems that lack the 64x64 PMULL instructions but do implement the CRC32 ones, which makes the driver that is based solely on those CRC32 instructions redundant. So remove it. Note that this aligns arm64 with ARM, whose accelerated CRC32 driver also combines the CRC32 extension based and the PMULL based versions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Matthias Brugger <mbrugger@suse.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 03 Feb, 2017 39 commits
-
-
Ard Biesheuvel authored
The ARM bit sliced AES core code uses the IV buffer to pass the final keystream block back to the glue code if the input is not a multiple of the block size, so that the asm code does not have to deal with anything except 16 byte blocks. This is done under the assumption that the outgoing IV is meaningless anyway in this case, given that chaining is no longer possible under these circumstances. However, as it turns out, the CCM driver does expect the IV to retain a value that is equal to the original IV except for the counter value, and even interprets byte zero as a length indicator, which may result in memory corruption if the IV is overwritten with something else. So use a separate buffer to return the final keystream block. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The arm64 bit sliced AES core code uses the IV buffer to pass the final keystream block back to the glue code if the input is not a multiple of the block size, so that the asm code does not have to deal with anything except 16 byte blocks. This is done under the assumption that the outgoing IV is meaningless anyway in this case, given that chaining is no longer possible under these circumstances. However, as it turns out, the CCM driver does expect the IV to retain a value that is equal to the original IV except for the counter value, and even interprets byte zero as a length indicator, which may result in memory corruption if the IV is overwritten with something else. So use a separate buffer to return the final keystream block. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The new bitsliced NEON implementation of AES uses a fallback in two places: CBC encryption (which is strictly sequential, whereas this driver can only operate efficiently on 8 blocks at a time), and the XTS tweak generation, which involves encrypting a single AES block with a different key schedule. The plain (i.e., non-bitsliced) NEON code is more suitable as a fallback, given that it is faster than scalar on low end cores (which is what the NEON implementations target, since high end cores have dedicated instructions for AES), and shows similar behavior in terms of D-cache footprint and sensitivity to cache timing attacks. So switch the fallback handling to the plain NEON driver. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The non-bitsliced AES implementation using the NEON is highly sensitive to micro-architectural details, and, as it turns out, the Cortex-A53 on the Raspberry Pi 3 is a core that can benefit from this code, given that its scalar AES performance is abysmal (32.9 cycles per byte). The new bitsliced AES code manages 19.8 cycles per byte on this core, but can only operate on 8 blocks at a time, which is not supported by all chaining modes. With a bit of tweaking, we can get the plain NEON code to run at 22.0 cycles per byte, making it useful for sequential modes like CBC encryption. (Like bitsliced NEON, the plain NEON implementation does not use any lookup tables, which makes it easy on the D-cache, and invulnerable to cache timing attacks) So tweak the plain NEON AES code to use tbl instructions rather than shl/sri pairs, and to avoid the need to reload permutation vectors or other constants from memory in every round. Also, improve the decryption performance by switching to 16x8 pmul instructions for the performing the multiplications in GF(2^8). To allow the ECB and CBC encrypt routines to be reused by the bitsliced NEON code in a subsequent patch, export them from the module. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Shuffle some instructions around in the __hround macro to shave off 0.1 cycles per byte on Cortex-A57. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Using simple adrp/add pairs to refer to the AES lookup tables exposed by the generic AES driver (which could be loaded far away from this driver when KASLR is in effect) was unreliable at module load time before commit 41c066f2 ("arm64: assembler: make adr_l work in modules under KASLR"), which is why the AES code used literals instead. So now we can get rid of the literals, and switch to the adr_l macro. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Remove the unnecessary alignmask: it is much more efficient to deal with the misalignment in the core algorithm than relying on the crypto API to copy the data to a suitably aligned buffer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Remove the unnecessary alignmask: it is much more efficient to deal with the misalignment in the core algorithm than relying on the crypto API to copy the data to a suitably aligned buffer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Remove the unnecessary alignmask: it is much more efficient to deal with the misalignment in the core algorithm than relying on the crypto API to copy the data to a suitably aligned buffer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Remove the unnecessary alignmask: it is much more efficient to deal with the misalignment in the core algorithm than relying on the crypto API to copy the data to a suitably aligned buffer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Remove the unnecessary alignmask: it is much more efficient to deal with the misalignment in the core algorithm than relying on the crypto API to copy the data to a suitably aligned buffer. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
Initialise variable after null check. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
Typecast the pointer with correct structure. Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
Update priorities to 3000 Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
Change cipher algos flags to CRYPTO_ALG_TYPE_ABLKCIPHER. Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
1 Block of encrption can be done with aes-generic. no need of cbc(aes). This patch replaces cbc(aes-generic) with aes-generic. Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
The first argument to list_for_each_entry cannot be NULL. Generated by: scripts/coccinelle/iterators/itnull.cocci Signed-off-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
Change assign flowc id to each outgoing request.Firmware use flowc id to schedule each request onto HW. FW reply may miss without this change. Reviewed-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
When VERBOSE_DEBUG is defined and SHA_FLAGS_DUMP_REG flag is set in dd->flags, this patch prints the register names and values when performing IO accesses. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This patchs allows to combine the AES and SHA hardware accelerators on some Atmel SoCs. Doing so, AES blocks are only written to/read from the AES hardware. Those blocks are also transferred from the AES to the SHA accelerator internally, without additionnal accesses to the system busses. Hence, the AES and SHA accelerators work in parallel to process all the data blocks, instead of serializing the process by (de)crypting those blocks first then authenticating them after like the generic crypto/authenc.c driver does. Of course, both the AES and SHA hardware accelerators need to be available before we can start to process the data blocks. Hence we use their crypto request queue to synchronize both drivers. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This patch fixes the value returned by atmel_aes_handle_queue(), which could have been wrong previously when the crypto request was started synchronously but became asynchronous during the ctx->start() call. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This patch adds support to the hmac(shaX) algorithms. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This patch adds a simple function to perform data transfer with the DMA controller. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This patch adds a simple function to perform data transfer with PIO, hence handled by the CPU. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This patch defines an alias macro to SHA_MR_MODE_PDC, which is not suited for DMA usage. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This patch simply defines a helper function to test the 'Data Ready' flag of the Status Register. It also gives a chance for the crypto request to be processed synchronously if this 'Data Ready' flag is already set when polling the Status Register. Indeed, running synchronously avoid the latency of the 'Data Ready' interrupt. When the 'Data Ready' flag has not been set yet, we enable the associated interrupt and resume processing the crypto request asynchronously from the 'done' task just as before. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This patch modifies the SHA_FLAGS_SHA* flags: those algo flags are now organized as values of a single bitfield instead of individual bits. This allows to reduce the number of bits needed to encode all possible values. Also the new values match the SHA_MR_ALGO_SHA* values hence the algorithm bitfield of the SHA_MR register could simply be set with: mr = (mr & ~SHA_FLAGS_ALGO_MASK) | (ctx->flags & SHA_FLAGS_ALGO_MASK) Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This patch is a transitional patch. It updates atmel_sha_done_task() to make it more generic. Indeed, it adds a new .resume() member in the atmel_sha_dev structure. This hook is called from atmel_sha_done_task() to resume processing an asynchronous request. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This patch is a transitional patch. It splits the atmel_sha_handle_queue() function. Now atmel_sha_handle_queue() only manages the request queue and calls a new .start() hook from the atmel_sha_ctx structure. This hook allows to implement different kind of requests still handled by a single queue. Also when the req parameter of atmel_sha_handle_queue() refers to the very same request as the one returned by crypto_dequeue_request(), the queue management now gives a chance to this crypto request to be handled synchronously, hence reducing latencies. The .start() hook returns 0 if the crypto request was handled synchronously and -EINPROGRESS if the crypto request still need to be handled asynchronously. Besides, the new .is_async member of the atmel_sha_dev structure helps tagging this asynchronous state. Indeed, the req->base.complete() callback should not be called if the crypto request is handled synchronously. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Cyrille Pitchen authored
This is a transitional patch: it creates the atmel_sha_find_dev() function, which will be used in further patches to share the source code responsible for finding a Atmel SHA device. Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Rabin Vincent authored
The documentation states that crypto_ahash_reqsize() provides the size of the state structure used by crypto_ahash_export(). But it's actually crypto_ahash_statesize() which provides this size. Signed-off-by: Rabin Vincent <rabinv@axis.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Herbert Xu authored
Merge the crypto tree to pick up arm64 output IV patch.
-
Harsh Jain authored
Check keylen before copying salt to avoid wrap around of Integer. Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
Kernel panics when userspace program try to access AEAD interface. Remove node from Linked List before freeing its memory. Cc: <stable@vger.kernel.org> Signed-off-by: Harsh Jain <harsh@chelsio.com> Reviewed-by: Stephan Müller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Herbert Xu authored
When aesni is built as a module together with pcbc, the pcbc module must be present for aesni to load. However, the pcbc module may not be present for reasons such as its absence on initramfs. This patch allows the aesni to function even if the pcbc module is enabled but not present. Reported-by: Arkadiusz Miśkiewicz <arekm@maven.pl> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Gary R Hook authored
Eliminate a double-add by creating a new list to manage command descriptors when created; move the descriptor to the pending list when the command is submitted. Cc: <stable@vger.kernel.org> Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Gary R Hook authored
An I/O page fault occurs when the IOMMU is enabled on a system that supports the v5 CCP. DMA operations use a Request ID value that does not match what is expected by the IOMMU, resulting in the I/O page fault. Setting the Request ID value to 0 corrects this issue. Cc: <stable@vger.kernel.org> Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
Ensure dev is allocated for crypto uld context before using the device for crypto operations. Cc: <stable@vger.kernel.org> Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
Save DMA mapped sg list addresses to request context buffer. Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-