Commit 9eb31227 authored by Linus Torvalds's avatar Linus Torvalds

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:

   - add AEAD support to crypto engine

   - allow batch registration in simd

  Algorithms:

   - add CFB mode

   - add speck block cipher

   - add sm4 block cipher

   - new test case for crct10dif

   - improve scheduling latency on ARM

   - scatter/gather support to gcm in aesni

   - convert x86 crypto algorithms to skcipher

  Drivers:

   - hmac(sha224/sha256) support in inside-secure

   - aes gcm/ccm support in stm32

   - stm32mp1 support in stm32

   - ccree driver from staging tree

   - gcm support over QI in caam

   - add ks-sa hwrng driver"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (212 commits)
  crypto: ccree - remove unused enums
  crypto: ahash - Fix early termination in hash walk
  crypto: brcm - explicitly cast cipher to hash type
  crypto: talitos - don't leak pointers to authenc keys
  crypto: qat - don't leak pointers to authenc keys
  crypto: picoxcell - don't leak pointers to authenc keys
  crypto: ixp4xx - don't leak pointers to authenc keys
  crypto: chelsio - don't leak pointers to authenc keys
  crypto: caam/qi - don't leak pointers to authenc keys
  crypto: caam - don't leak pointers to authenc keys
  crypto: lrw - Free rctx->ext with kzfree
  crypto: talitos - fix IPsec cipher in length
  crypto: Deduplicate le32_to_cpu_array() and cpu_to_le32_array()
  crypto: doc - clarify hash callbacks state machine
  crypto: api - Keep failed instances alive
  crypto: api - Make crypto_alg_lookup static
  crypto: api - Remove unused crypto_type lookup function
  crypto: chelsio - Remove declaration of static function from header
  crypto: inside-secure - hmac(sha224) support
  crypto: inside-secure - hmac(sha256) support
  ..
parents 527cd207 f444ec10
=============
CRYPTO ENGINE
=============
Overview
--------
The crypto engine (CE) API is a crypto queue manager.
Requirement
-----------
You have to put the struct crypto_engine_ctx at the start of your tfm_ctx:
struct your_tfm_ctx {
struct crypto_engine_ctx enginectx;
...
};
Why: since the CE manages only crypto_async_request, it cannot know the underlying
request type and so has access only to the TFM.
Using container_of to access __ctx is therefore impossible.
Furthermore, the crypto engine cannot know "struct your_tfm_ctx",
so it must assume that crypto_engine_ctx is at the start of it.
Order of operations
-------------------
You have to obtain a struct crypto_engine via crypto_engine_alloc_init()
and start it via crypto_engine_start().
Before transferring any request, you have to fill the enginectx with:
- prepare_request: (function pointer) called if you need to do some processing before handling the request
- unprepare_request: (function pointer) undoes what was done in prepare_request
- do_one_request: (function pointer) performs the operation for the current request
Note that these three callbacks receive the crypto_async_request associated with the received request,
so you need to get the original request via container_of(areq, struct yourrequesttype_request, base);
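As an illustration only, here is a minimal sketch of a do_one_request callback and of
filling the enginectx for a hypothetical skcipher driver; the my_drv_* names and the
my_drv_run_hw() helper are assumptions of this example, not part of the API.

/* Sketch only: my_drv_* identifiers are illustrative, not from a real driver. */
static int my_drv_do_one_request(struct crypto_engine *engine, void *areq)
{
	struct skcipher_request *req =
		container_of(areq, struct skcipher_request, base);
	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
	struct my_drv_ctx *ctx = crypto_skcipher_ctx(tfm);

	/* Program the hardware; completion is signalled asynchronously. */
	return my_drv_run_hw(ctx, req);
}

static int my_drv_init_tfm(struct crypto_skcipher *tfm)
{
	struct my_drv_ctx *ctx = crypto_skcipher_ctx(tfm);

	/* enginectx must be the first member of my_drv_ctx, as noted above. */
	ctx->enginectx.op.prepare_request = NULL;
	ctx->enginectx.op.unprepare_request = NULL;
	ctx->enginectx.op.do_one_request = my_drv_do_one_request;
	return 0;
}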
When your driver receives a crypto request, you have to transfer it to
the crypto engine via one of:
- crypto_transfer_ablkcipher_request_to_engine()
- crypto_transfer_aead_request_to_engine()
- crypto_transfer_akcipher_request_to_engine()
- crypto_transfer_hash_request_to_engine()
- crypto_transfer_skcipher_request_to_engine()
At the end of request processing, a call to one of the following functions is needed:
- crypto_finalize_ablkcipher_request
- crypto_finalize_aead_request
- crypto_finalize_akcipher_request
- crypto_finalize_hash_request
- crypto_finalize_skcipher_request
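Putting the two steps together, a hedged sketch for the same hypothetical skcipher
driver could look like the following; the ctx->engine pointer (saved from
crypto_engine_alloc_init() at probe time) and the my_drv_* names are assumptions of
this sketch, not part of the API.

/* Algorithm entry point: hand the request over to the engine queue. */
static int my_drv_skcipher_encrypt(struct skcipher_request *req)
{
	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
	struct my_drv_ctx *ctx = crypto_skcipher_ctx(tfm);

	return crypto_transfer_skcipher_request_to_engine(ctx->engine, req);
}

/* Called from the IRQ handler or completion path once the hardware has
 * finished, so the engine can schedule the next queued request. */
static void my_drv_finish_req(struct my_drv_ctx *ctx,
			      struct skcipher_request *req, int err)
{
	crypto_finalize_skcipher_request(ctx->engine, req, err);
}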
...@@ -236,6 +236,14 @@ when used from another part of the kernel.
            |
            '---------------> HASH2
Note that it is perfectly legal to "abandon" a request object:
- call .init() and then (as many times) .update()
- _not_ call any of .final(), .finup() or .export() at any point in the future
In other words, implementations should mind resource allocation and clean-up.
No resources related to request objects should remain allocated after a call
to .init() or .update(), since there might be no chance to free them.
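For example, a minimal sketch in which all per-request state lives inline in the
request context (names are illustrative, not taken from a real driver):

/* If nothing is dynamically allocated per request, an "abandoned" request
 * (.init()/.update() with no .final(), .finup() or .export()) leaks nothing. */
struct my_ahash_req_ctx {
	u32 state[8];		/* running digest state */
	u8 buffer[64];		/* partially filled block */
	unsigned int buflen;
};

static int my_ahash_init(struct ahash_request *req)
{
	struct my_ahash_req_ctx *rctx = ahash_request_ctx(req);

	memset(rctx, 0, sizeof(*rctx));
	/* No allocations here, so no clean-up is ever required. */
	return 0;
}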
Specifics Of Asynchronous HASH Transformation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
Arm TrustZone CryptoCell cryptographic engine Arm TrustZone CryptoCell cryptographic engine
Required properties: Required properties:
- compatible: Should be "arm,cryptocell-712-ree". - compatible: Should be one of: "arm,cryptocell-712-ree",
"arm,cryptocell-710-ree" or "arm,cryptocell-630p-ree".
- reg: Base physical address of the engine and length of memory mapped region. - reg: Base physical address of the engine and length of memory mapped region.
- interrupts: Interrupt number for the device. - interrupts: Interrupt number for the device.
......
...@@ -8,7 +8,11 @@ Required properties: ...@@ -8,7 +8,11 @@ Required properties:
- interrupt-names: Should be "ring0", "ring1", "ring2", "ring3", "eip", "mem". - interrupt-names: Should be "ring0", "ring1", "ring2", "ring3", "eip", "mem".
Optional properties: Optional properties:
- clocks: Reference to the crypto engine clock. - clocks: Reference to the crypto engine clocks, the second clock is
needed for the Armada 7K/8K SoCs.
- clock-names: mandatory if there is a second clock, in this case the
name must be "core" for the first clock and "reg" for
the second one.
Example: Example:
......
Freescale RNGC (Random Number Generator Version C) Freescale RNGA/RNGB/RNGC (Random Number Generator Versions A, B and C)
The driver also supports version B, which is mostly compatible
to version C.
Required properties: Required properties:
- compatible : should be one of - compatible : should be one of
"fsl,imx21-rnga"
"fsl,imx31-rnga" (backward compatible with "fsl,imx21-rnga")
"fsl,imx25-rngb" "fsl,imx25-rngb"
"fsl,imx35-rngc" "fsl,imx35-rngc"
- reg : offset and length of the register set of this block - reg : offset and length of the register set of this block
- interrupts : the interrupt number for the RNGC block - interrupts : the interrupt number for the RNG block
- clocks : the RNGC clk source - clocks : the RNG clk source
Example: Example:
......
Keystone SoC Hardware Random Number Generator(HWRNG) Module
On Keystone SoCs HWRNG module is a submodule of the Security Accelerator.
- compatible: should be "ti,keystone-rng"
- ti,syscon-sa-cfg: phandle to syscon node of the SA configuration registers.
This registers are shared between hwrng and crypto drivers.
- clocks: phandle to the reference clocks for the subsystem
- clock-names: functional clock name. Should be set to "fck"
- reg: HWRNG module register space
Example:
/* K2HK */
rng@24000 {
compatible = "ti,keystone-rng";
ti,syscon-sa-cfg = <&sa_config>;
clocks = <&clksa>;
clock-names = "fck";
reg = <0x24000 0x1000>;
};
...@@ -13,7 +13,12 @@ Required properties: ...@@ -13,7 +13,12 @@ Required properties:
- interrupts : the interrupt number for the RNG module. - interrupts : the interrupt number for the RNG module.
Used for "ti,omap4-rng" and "inside-secure,safexcel-eip76" Used for "ti,omap4-rng" and "inside-secure,safexcel-eip76"
- clocks: the trng clock source. Only mandatory for the - clocks: the trng clock source. Only mandatory for the
"inside-secure,safexcel-eip76" compatible. "inside-secure,safexcel-eip76" compatible, the second clock is
needed for the Armada 7K/8K SoCs
- clock-names: mandatory if there is a second clock, in this case the
name must be "core" for the first clock and "reg" for the second
one
Example: Example:
/* AM335x */ /* AM335x */
......
...@@ -11,6 +11,10 @@ Required properties: ...@@ -11,6 +11,10 @@ Required properties:
- interrupts : The designated IRQ line for the RNG - interrupts : The designated IRQ line for the RNG
- clocks : The clock needed to enable the RNG - clocks : The clock needed to enable the RNG
Optional properties:
- resets : The reset to properly start RNG
- clock-error-detect : Enable the clock detection management
Example: Example:
rng: rng@50060800 { rng: rng@50060800 {
......
...@@ -3252,12 +3252,11 @@ F: drivers/net/ieee802154/cc2520.c ...@@ -3252,12 +3252,11 @@ F: drivers/net/ieee802154/cc2520.c
F: include/linux/spi/cc2520.h F: include/linux/spi/cc2520.h
F: Documentation/devicetree/bindings/net/ieee802154/cc2520.txt F: Documentation/devicetree/bindings/net/ieee802154/cc2520.txt
CCREE ARM TRUSTZONE CRYPTOCELL 700 REE DRIVER CCREE ARM TRUSTZONE CRYPTOCELL REE DRIVER
M: Gilad Ben-Yossef <gilad@benyossef.com> M: Gilad Ben-Yossef <gilad@benyossef.com>
L: linux-crypto@vger.kernel.org L: linux-crypto@vger.kernel.org
L: driverdev-devel@linuxdriverproject.org
S: Supported S: Supported
F: drivers/staging/ccree/ F: drivers/crypto/ccree/
W: https://developer.arm.com/products/system-ip/trustzone-cryptocell/cryptocell-700-family W: https://developer.arm.com/products/system-ip/trustzone-cryptocell/cryptocell-700-family
CEC FRAMEWORK CEC FRAMEWORK
...@@ -6962,7 +6961,7 @@ F: drivers/input/input-mt.c ...@@ -6962,7 +6961,7 @@ F: drivers/input/input-mt.c
K: \b(ABS|SYN)_MT_ K: \b(ABS|SYN)_MT_
INSIDE SECURE CRYPTO DRIVER INSIDE SECURE CRYPTO DRIVER
M: Antoine Tenart <antoine.tenart@free-electrons.com> M: Antoine Tenart <antoine.tenart@bootlin.com>
F: drivers/crypto/inside-secure/ F: drivers/crypto/inside-secure/
S: Maintained S: Maintained
L: linux-crypto@vger.kernel.org L: linux-crypto@vger.kernel.org
...@@ -7200,6 +7199,14 @@ L: linux-rdma@vger.kernel.org ...@@ -7200,6 +7199,14 @@ L: linux-rdma@vger.kernel.org
S: Supported S: Supported
F: drivers/infiniband/hw/i40iw/ F: drivers/infiniband/hw/i40iw/
INTEL SHA MULTIBUFFER DRIVER
M: Megha Dey <megha.dey@linux.intel.com>
R: Tim Chen <tim.c.chen@linux.intel.com>
L: linux-crypto@vger.kernel.org
S: Supported
F: arch/x86/crypto/sha*-mb
F: crypto/mcryptd.c
INTEL TELEMETRY DRIVER INTEL TELEMETRY DRIVER
M: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com> M: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com>
L: platform-driver-x86@vger.kernel.org L: platform-driver-x86@vger.kernel.org
......
...@@ -121,4 +121,10 @@ config CRYPTO_CHACHA20_NEON ...@@ -121,4 +121,10 @@ config CRYPTO_CHACHA20_NEON
select CRYPTO_BLKCIPHER select CRYPTO_BLKCIPHER
select CRYPTO_CHACHA20 select CRYPTO_CHACHA20
config CRYPTO_SPECK_NEON
tristate "NEON accelerated Speck cipher algorithms"
depends on KERNEL_MODE_NEON
select CRYPTO_BLKCIPHER
select CRYPTO_SPECK
endif endif
...@@ -10,6 +10,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o ...@@ -10,6 +10,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
obj-$(CONFIG_CRYPTO_SPECK_NEON) += speck-neon.o
ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
...@@ -53,7 +54,9 @@ ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o ...@@ -53,7 +54,9 @@ ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
speck-neon-y := speck-neon-core.o speck-neon-glue.o
ifdef REGENERATE_ARM_CRYPTO
quiet_cmd_perl = PERL $@ quiet_cmd_perl = PERL $@
cmd_perl = $(PERL) $(<) > $(@) cmd_perl = $(PERL) $(<) > $(@)
...@@ -62,5 +65,6 @@ $(src)/sha256-core.S_shipped: $(src)/sha256-armv4.pl ...@@ -62,5 +65,6 @@ $(src)/sha256-core.S_shipped: $(src)/sha256-armv4.pl
$(src)/sha512-core.S_shipped: $(src)/sha512-armv4.pl $(src)/sha512-core.S_shipped: $(src)/sha512-armv4.pl
$(call cmd,perl) $(call cmd,perl)
endif
.PRECIOUS: $(obj)/sha256-core.S $(obj)/sha512-core.S .PRECIOUS: $(obj)/sha256-core.S $(obj)/sha512-core.S
...@@ -174,6 +174,16 @@ ...@@ -174,6 +174,16 @@
.ltorg .ltorg
.endm .endm
ENTRY(__aes_arm_encrypt)
do_crypt fround, crypto_ft_tab, crypto_ft_tab + 1, 2
ENDPROC(__aes_arm_encrypt)
.align 5
ENTRY(__aes_arm_decrypt)
do_crypt iround, crypto_it_tab, __aes_arm_inverse_sbox, 0
ENDPROC(__aes_arm_decrypt)
.section ".rodata", "a"
.align L1_CACHE_SHIFT .align L1_CACHE_SHIFT
.type __aes_arm_inverse_sbox, %object .type __aes_arm_inverse_sbox, %object
__aes_arm_inverse_sbox: __aes_arm_inverse_sbox:
...@@ -210,12 +220,3 @@ __aes_arm_inverse_sbox: ...@@ -210,12 +220,3 @@ __aes_arm_inverse_sbox:
.byte 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26 .byte 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26
.byte 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d .byte 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
.size __aes_arm_inverse_sbox, . - __aes_arm_inverse_sbox .size __aes_arm_inverse_sbox, . - __aes_arm_inverse_sbox
ENTRY(__aes_arm_encrypt)
do_crypt fround, crypto_ft_tab, crypto_ft_tab + 1, 2
ENDPROC(__aes_arm_encrypt)
.align 5
ENTRY(__aes_arm_decrypt)
do_crypt iround, crypto_it_tab, __aes_arm_inverse_sbox, 0
ENDPROC(__aes_arm_decrypt)
// SPDX-License-Identifier: GPL-2.0
/*
* NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
*
* Copyright (c) 2018 Google, Inc
*
* Note: the NIST recommendation for XTS only specifies a 128-bit block size,
* but a 64-bit version (needed for Speck64) is fairly straightforward; the math
* is just done in GF(2^64) instead of GF(2^128), with the reducing polynomial
* x^64 + x^4 + x^3 + x + 1 from the original XEX paper (Rogaway, 2004:
* "Efficient Instantiations of Tweakable Blockciphers and Refinements to Modes
* OCB and PMAC"), represented as 0x1B.
*/
#include <asm/hwcap.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/algapi.h>
#include <crypto/gf128mul.h>
#include <crypto/internal/skcipher.h>
#include <crypto/speck.h>
#include <crypto/xts.h>
#include <linux/kernel.h>
#include <linux/module.h>
/* The assembly functions only handle multiples of 128 bytes */
#define SPECK_NEON_CHUNK_SIZE 128
/* Speck128 */
struct speck128_xts_tfm_ctx {
struct speck128_tfm_ctx main_key;
struct speck128_tfm_ctx tweak_key;
};
asmlinkage void speck128_xts_encrypt_neon(const u64 *round_keys, int nrounds,
void *dst, const void *src,
unsigned int nbytes, void *tweak);
asmlinkage void speck128_xts_decrypt_neon(const u64 *round_keys, int nrounds,
void *dst, const void *src,
unsigned int nbytes, void *tweak);
typedef void (*speck128_crypt_one_t)(const struct speck128_tfm_ctx *,
u8 *, const u8 *);
typedef void (*speck128_xts_crypt_many_t)(const u64 *, int, void *,
const void *, unsigned int, void *);
static __always_inline int
__speck128_xts_crypt(struct skcipher_request *req,
speck128_crypt_one_t crypt_one,
speck128_xts_crypt_many_t crypt_many)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
const struct speck128_xts_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
le128 tweak;
int err;
err = skcipher_walk_virt(&walk, req, true);
crypto_speck128_encrypt(&ctx->tweak_key, (u8 *)&tweak, walk.iv);
while (walk.nbytes > 0) {
unsigned int nbytes = walk.nbytes;
u8 *dst = walk.dst.virt.addr;
const u8 *src = walk.src.virt.addr;
if (nbytes >= SPECK_NEON_CHUNK_SIZE && may_use_simd()) {
unsigned int count;
count = round_down(nbytes, SPECK_NEON_CHUNK_SIZE);
kernel_neon_begin();
(*crypt_many)(ctx->main_key.round_keys,
ctx->main_key.nrounds,
dst, src, count, &tweak);
kernel_neon_end();
dst += count;
src += count;
nbytes -= count;
}
/* Handle any remainder with generic code */
while (nbytes >= sizeof(tweak)) {
le128_xor((le128 *)dst, (const le128 *)src, &tweak);
(*crypt_one)(&ctx->main_key, dst, dst);
le128_xor((le128 *)dst, (const le128 *)dst, &tweak);
gf128mul_x_ble(&tweak, &tweak);
dst += sizeof(tweak);
src += sizeof(tweak);
nbytes -= sizeof(tweak);
}
err = skcipher_walk_done(&walk, nbytes);
}
return err;
}
static int speck128_xts_encrypt(struct skcipher_request *req)
{
return __speck128_xts_crypt(req, crypto_speck128_encrypt,
speck128_xts_encrypt_neon);
}
static int speck128_xts_decrypt(struct skcipher_request *req)
{
return __speck128_xts_crypt(req, crypto_speck128_decrypt,
speck128_xts_decrypt_neon);
}
static int speck128_xts_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
struct speck128_xts_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;
err = xts_verify_key(tfm, key, keylen);
if (err)
return err;
keylen /= 2;
err = crypto_speck128_setkey(&ctx->main_key, key, keylen);
if (err)
return err;
return crypto_speck128_setkey(&ctx->tweak_key, key + keylen, keylen);
}
/* Speck64 */
struct speck64_xts_tfm_ctx {
struct speck64_tfm_ctx main_key;
struct speck64_tfm_ctx tweak_key;
};
asmlinkage void speck64_xts_encrypt_neon(const u32 *round_keys, int nrounds,
void *dst, const void *src,
unsigned int nbytes, void *tweak);
asmlinkage void speck64_xts_decrypt_neon(const u32 *round_keys, int nrounds,
void *dst, const void *src,
unsigned int nbytes, void *tweak);
typedef void (*speck64_crypt_one_t)(const struct speck64_tfm_ctx *,
u8 *, const u8 *);
typedef void (*speck64_xts_crypt_many_t)(const u32 *, int, void *,
const void *, unsigned int, void *);
static __always_inline int
__speck64_xts_crypt(struct skcipher_request *req, speck64_crypt_one_t crypt_one,
speck64_xts_crypt_many_t crypt_many)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
const struct speck64_xts_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
__le64 tweak;
int err;
err = skcipher_walk_virt(&walk, req, true);
crypto_speck64_encrypt(&ctx->tweak_key, (u8 *)&tweak, walk.iv);
while (walk.nbytes > 0) {
unsigned int nbytes = walk.nbytes;
u8 *dst = walk.dst.virt.addr;
const u8 *src = walk.src.virt.addr;
if (nbytes >= SPECK_NEON_CHUNK_SIZE && may_use_simd()) {
unsigned int count;
count = round_down(nbytes, SPECK_NEON_CHUNK_SIZE);
kernel_neon_begin();
(*crypt_many)(ctx->main_key.round_keys,
ctx->main_key.nrounds,
dst, src, count, &tweak);
kernel_neon_end();
dst += count;
src += count;
nbytes -= count;
}
/* Handle any remainder with generic code */
while (nbytes >= sizeof(tweak)) {
*(__le64 *)dst = *(__le64 *)src ^ tweak;
(*crypt_one)(&ctx->main_key, dst, dst);
*(__le64 *)dst ^= tweak;
tweak = cpu_to_le64((le64_to_cpu(tweak) << 1) ^
((tweak & cpu_to_le64(1ULL << 63)) ?
0x1B : 0));
dst += sizeof(tweak);
src += sizeof(tweak);
nbytes -= sizeof(tweak);
}
err = skcipher_walk_done(&walk, nbytes);
}
return err;
}
static int speck64_xts_encrypt(struct skcipher_request *req)
{
return __speck64_xts_crypt(req, crypto_speck64_encrypt,
speck64_xts_encrypt_neon);
}
static int speck64_xts_decrypt(struct skcipher_request *req)
{
return __speck64_xts_crypt(req, crypto_speck64_decrypt,
speck64_xts_decrypt_neon);
}
static int speck64_xts_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen)
{
struct speck64_xts_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);
int err;
err = xts_verify_key(tfm, key, keylen);
if (err)
return err;
keylen /= 2;
err = crypto_speck64_setkey(&ctx->main_key, key, keylen);
if (err)
return err;
return crypto_speck64_setkey(&ctx->tweak_key, key + keylen, keylen);
}
static struct skcipher_alg speck_algs[] = {
{
.base.cra_name = "xts(speck128)",
.base.cra_driver_name = "xts-speck128-neon",
.base.cra_priority = 300,
.base.cra_blocksize = SPECK128_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct speck128_xts_tfm_ctx),
.base.cra_alignmask = 7,
.base.cra_module = THIS_MODULE,
.min_keysize = 2 * SPECK128_128_KEY_SIZE,
.max_keysize = 2 * SPECK128_256_KEY_SIZE,
.ivsize = SPECK128_BLOCK_SIZE,
.walksize = SPECK_NEON_CHUNK_SIZE,
.setkey = speck128_xts_setkey,
.encrypt = speck128_xts_encrypt,
.decrypt = speck128_xts_decrypt,
}, {
.base.cra_name = "xts(speck64)",
.base.cra_driver_name = "xts-speck64-neon",
.base.cra_priority = 300,
.base.cra_blocksize = SPECK64_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct speck64_xts_tfm_ctx),
.base.cra_alignmask = 7,
.base.cra_module = THIS_MODULE,
.min_keysize = 2 * SPECK64_96_KEY_SIZE,
.max_keysize = 2 * SPECK64_128_KEY_SIZE,
.ivsize = SPECK64_BLOCK_SIZE,
.walksize = SPECK_NEON_CHUNK_SIZE,
.setkey = speck64_xts_setkey,
.encrypt = speck64_xts_encrypt,
.decrypt = speck64_xts_decrypt,
}
};
static int __init speck_neon_module_init(void)
{
if (!(elf_hwcap & HWCAP_NEON))
return -ENODEV;
return crypto_register_skciphers(speck_algs, ARRAY_SIZE(speck_algs));
}
static void __exit speck_neon_module_exit(void)
{
crypto_unregister_skciphers(speck_algs, ARRAY_SIZE(speck_algs));
}
module_init(speck_neon_module_init);
module_exit(speck_neon_module_exit);
MODULE_DESCRIPTION("Speck block cipher (NEON-accelerated)");
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
MODULE_ALIAS_CRYPTO("xts(speck128)");
MODULE_ALIAS_CRYPTO("xts-speck128-neon");
MODULE_ALIAS_CRYPTO("xts(speck64)");
MODULE_ALIAS_CRYPTO("xts-speck64-neon");
...@@ -113,4 +113,10 @@ config CRYPTO_AES_ARM64_BS ...@@ -113,4 +113,10 @@ config CRYPTO_AES_ARM64_BS
select CRYPTO_AES_ARM64 select CRYPTO_AES_ARM64
select CRYPTO_SIMD select CRYPTO_SIMD
config CRYPTO_SPECK_NEON
tristate "NEON accelerated Speck cipher algorithms"
depends on KERNEL_MODE_NEON
select CRYPTO_BLKCIPHER
select CRYPTO_SPECK
endif endif
...@@ -53,20 +53,21 @@ sha512-arm64-y := sha512-glue.o sha512-core.o ...@@ -53,20 +53,21 @@ sha512-arm64-y := sha512-glue.o sha512-core.o
obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
obj-$(CONFIG_CRYPTO_SPECK_NEON) += speck-neon.o
speck-neon-y := speck-neon-core.o speck-neon-glue.o
obj-$(CONFIG_CRYPTO_AES_ARM64) += aes-arm64.o obj-$(CONFIG_CRYPTO_AES_ARM64) += aes-arm64.o
aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o aes-arm64-y := aes-cipher-core.o aes-cipher-glue.o
obj-$(CONFIG_CRYPTO_AES_ARM64_BS) += aes-neon-bs.o obj-$(CONFIG_CRYPTO_AES_ARM64_BS) += aes-neon-bs.o
aes-neon-bs-y := aes-neonbs-core.o aes-neonbs-glue.o aes-neon-bs-y := aes-neonbs-core.o aes-neonbs-glue.o
AFLAGS_aes-ce.o := -DINTERLEAVE=4
AFLAGS_aes-neon.o := -DINTERLEAVE=4
CFLAGS_aes-glue-ce.o := -DUSE_V8_CRYPTO_EXTENSIONS CFLAGS_aes-glue-ce.o := -DUSE_V8_CRYPTO_EXTENSIONS
$(obj)/aes-glue-%.o: $(src)/aes-glue.c FORCE $(obj)/aes-glue-%.o: $(src)/aes-glue.c FORCE
$(call if_changed_rule,cc_o_c) $(call if_changed_rule,cc_o_c)
ifdef REGENERATE_ARM64_CRYPTO
quiet_cmd_perlasm = PERLASM $@ quiet_cmd_perlasm = PERLASM $@
cmd_perlasm = $(PERL) $(<) void $(@) cmd_perlasm = $(PERL) $(<) void $(@)
...@@ -75,5 +76,6 @@ $(src)/sha256-core.S_shipped: $(src)/sha512-armv8.pl ...@@ -75,5 +76,6 @@ $(src)/sha256-core.S_shipped: $(src)/sha512-armv8.pl
$(src)/sha512-core.S_shipped: $(src)/sha512-armv8.pl $(src)/sha512-core.S_shipped: $(src)/sha512-armv8.pl
$(call cmd,perlasm) $(call cmd,perlasm)
endif
.PRECIOUS: $(obj)/sha256-core.S $(obj)/sha512-core.S .PRECIOUS: $(obj)/sha256-core.S $(obj)/sha512-core.S
...@@ -107,11 +107,13 @@ static int ccm_init_mac(struct aead_request *req, u8 maciv[], u32 msglen) ...@@ -107,11 +107,13 @@ static int ccm_init_mac(struct aead_request *req, u8 maciv[], u32 msglen)
} }
static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[], static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
u32 abytes, u32 *macp, bool use_neon) u32 abytes, u32 *macp)
{ {
if (likely(use_neon)) { if (may_use_simd()) {
kernel_neon_begin();
ce_aes_ccm_auth_data(mac, in, abytes, macp, key->key_enc, ce_aes_ccm_auth_data(mac, in, abytes, macp, key->key_enc,
num_rounds(key)); num_rounds(key));
kernel_neon_end();
} else { } else {
if (*macp > 0 && *macp < AES_BLOCK_SIZE) { if (*macp > 0 && *macp < AES_BLOCK_SIZE) {
int added = min(abytes, AES_BLOCK_SIZE - *macp); int added = min(abytes, AES_BLOCK_SIZE - *macp);
...@@ -143,8 +145,7 @@ static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[], ...@@ -143,8 +145,7 @@ static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
} }
} }
static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[], static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
bool use_neon)
{ {
struct crypto_aead *aead = crypto_aead_reqtfm(req); struct crypto_aead *aead = crypto_aead_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_aead_ctx(aead); struct crypto_aes_ctx *ctx = crypto_aead_ctx(aead);
...@@ -163,7 +164,7 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[], ...@@ -163,7 +164,7 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[],
ltag.len = 6; ltag.len = 6;
} }
ccm_update_mac(ctx, mac, (u8 *)&ltag, ltag.len, &macp, use_neon); ccm_update_mac(ctx, mac, (u8 *)&ltag, ltag.len, &macp);
scatterwalk_start(&walk, req->src); scatterwalk_start(&walk, req->src);
do { do {
...@@ -175,7 +176,7 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[], ...@@ -175,7 +176,7 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[],
n = scatterwalk_clamp(&walk, len); n = scatterwalk_clamp(&walk, len);
} }
p = scatterwalk_map(&walk); p = scatterwalk_map(&walk);
ccm_update_mac(ctx, mac, p, n, &macp, use_neon); ccm_update_mac(ctx, mac, p, n, &macp);
len -= n; len -= n;
scatterwalk_unmap(p); scatterwalk_unmap(p);
...@@ -242,43 +243,42 @@ static int ccm_encrypt(struct aead_request *req) ...@@ -242,43 +243,42 @@ static int ccm_encrypt(struct aead_request *req)
u8 __aligned(8) mac[AES_BLOCK_SIZE]; u8 __aligned(8) mac[AES_BLOCK_SIZE];
u8 buf[AES_BLOCK_SIZE]; u8 buf[AES_BLOCK_SIZE];
u32 len = req->cryptlen; u32 len = req->cryptlen;
bool use_neon = may_use_simd();
int err; int err;
err = ccm_init_mac(req, mac, len); err = ccm_init_mac(req, mac, len);
if (err) if (err)
return err; return err;
if (likely(use_neon))
kernel_neon_begin();
if (req->assoclen) if (req->assoclen)
ccm_calculate_auth_mac(req, mac, use_neon); ccm_calculate_auth_mac(req, mac);
/* preserve the original iv for the final round */ /* preserve the original iv for the final round */
memcpy(buf, req->iv, AES_BLOCK_SIZE); memcpy(buf, req->iv, AES_BLOCK_SIZE);
err = skcipher_walk_aead_encrypt(&walk, req, true); err = skcipher_walk_aead_encrypt(&walk, req, true);
if (likely(use_neon)) { if (may_use_simd()) {
while (walk.nbytes) { while (walk.nbytes) {
u32 tail = walk.nbytes % AES_BLOCK_SIZE; u32 tail = walk.nbytes % AES_BLOCK_SIZE;
if (walk.nbytes == walk.total) if (walk.nbytes == walk.total)
tail = 0; tail = 0;
kernel_neon_begin();
ce_aes_ccm_encrypt(walk.dst.virt.addr, ce_aes_ccm_encrypt(walk.dst.virt.addr,
walk.src.virt.addr, walk.src.virt.addr,
walk.nbytes - tail, ctx->key_enc, walk.nbytes - tail, ctx->key_enc,
num_rounds(ctx), mac, walk.iv); num_rounds(ctx), mac, walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, tail); err = skcipher_walk_done(&walk, tail);
} }
if (!err) if (!err) {
kernel_neon_begin();
ce_aes_ccm_final(mac, buf, ctx->key_enc, ce_aes_ccm_final(mac, buf, ctx->key_enc,
num_rounds(ctx)); num_rounds(ctx));
kernel_neon_end();
kernel_neon_end(); }
} else { } else {
err = ccm_crypt_fallback(&walk, mac, buf, ctx, true); err = ccm_crypt_fallback(&walk, mac, buf, ctx, true);
} }
...@@ -301,43 +301,42 @@ static int ccm_decrypt(struct aead_request *req) ...@@ -301,43 +301,42 @@ static int ccm_decrypt(struct aead_request *req)
u8 __aligned(8) mac[AES_BLOCK_SIZE]; u8 __aligned(8) mac[AES_BLOCK_SIZE];
u8 buf[AES_BLOCK_SIZE]; u8 buf[AES_BLOCK_SIZE];
u32 len = req->cryptlen - authsize; u32 len = req->cryptlen - authsize;
bool use_neon = may_use_simd();
int err; int err;
err = ccm_init_mac(req, mac, len); err = ccm_init_mac(req, mac, len);
if (err) if (err)
return err; return err;
if (likely(use_neon))
kernel_neon_begin();
if (req->assoclen) if (req->assoclen)
ccm_calculate_auth_mac(req, mac, use_neon); ccm_calculate_auth_mac(req, mac);
/* preserve the original iv for the final round */ /* preserve the original iv for the final round */
memcpy(buf, req->iv, AES_BLOCK_SIZE); memcpy(buf, req->iv, AES_BLOCK_SIZE);
err = skcipher_walk_aead_decrypt(&walk, req, true); err = skcipher_walk_aead_decrypt(&walk, req, true);
if (likely(use_neon)) { if (may_use_simd()) {
while (walk.nbytes) { while (walk.nbytes) {
u32 tail = walk.nbytes % AES_BLOCK_SIZE; u32 tail = walk.nbytes % AES_BLOCK_SIZE;
if (walk.nbytes == walk.total) if (walk.nbytes == walk.total)
tail = 0; tail = 0;
kernel_neon_begin();
ce_aes_ccm_decrypt(walk.dst.virt.addr, ce_aes_ccm_decrypt(walk.dst.virt.addr,
walk.src.virt.addr, walk.src.virt.addr,
walk.nbytes - tail, ctx->key_enc, walk.nbytes - tail, ctx->key_enc,
num_rounds(ctx), mac, walk.iv); num_rounds(ctx), mac, walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, tail); err = skcipher_walk_done(&walk, tail);
} }
if (!err) if (!err) {
kernel_neon_begin();
ce_aes_ccm_final(mac, buf, ctx->key_enc, ce_aes_ccm_final(mac, buf, ctx->key_enc,
num_rounds(ctx)); num_rounds(ctx));
kernel_neon_end();
kernel_neon_end(); }
} else { } else {
err = ccm_crypt_fallback(&walk, mac, buf, ctx, false); err = ccm_crypt_fallback(&walk, mac, buf, ctx, false);
} }
......
...@@ -64,17 +64,17 @@ MODULE_LICENSE("GPL v2"); ...@@ -64,17 +64,17 @@ MODULE_LICENSE("GPL v2");
/* defined in aes-modes.S */ /* defined in aes-modes.S */
asmlinkage void aes_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[], asmlinkage void aes_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, int first); int rounds, int blocks);
asmlinkage void aes_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[], asmlinkage void aes_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, int first); int rounds, int blocks);
asmlinkage void aes_cbc_encrypt(u8 out[], u8 const in[], u8 const rk[], asmlinkage void aes_cbc_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[], int first); int rounds, int blocks, u8 iv[]);
asmlinkage void aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[], asmlinkage void aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[], int first); int rounds, int blocks, u8 iv[]);
asmlinkage void aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], asmlinkage void aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 ctr[], int first); int rounds, int blocks, u8 ctr[]);
asmlinkage void aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[], asmlinkage void aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[],
int rounds, int blocks, u8 const rk2[], u8 iv[], int rounds, int blocks, u8 const rk2[], u8 iv[],
...@@ -133,19 +133,19 @@ static int ecb_encrypt(struct skcipher_request *req) ...@@ -133,19 +133,19 @@ static int ecb_encrypt(struct skcipher_request *req)
{ {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, first, rounds = 6 + ctx->key_length / 4; int err, rounds = 6 + ctx->key_length / 4;
struct skcipher_walk walk; struct skcipher_walk walk;
unsigned int blocks; unsigned int blocks;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { kernel_neon_begin();
aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr, aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key_enc, rounds, blocks, first); (u8 *)ctx->key_enc, rounds, blocks);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
} }
kernel_neon_end();
return err; return err;
} }
...@@ -153,19 +153,19 @@ static int ecb_decrypt(struct skcipher_request *req) ...@@ -153,19 +153,19 @@ static int ecb_decrypt(struct skcipher_request *req)
{ {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, first, rounds = 6 + ctx->key_length / 4; int err, rounds = 6 + ctx->key_length / 4;
struct skcipher_walk walk; struct skcipher_walk walk;
unsigned int blocks; unsigned int blocks;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { kernel_neon_begin();
aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr, aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key_dec, rounds, blocks, first); (u8 *)ctx->key_dec, rounds, blocks);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
} }
kernel_neon_end();
return err; return err;
} }
...@@ -173,20 +173,19 @@ static int cbc_encrypt(struct skcipher_request *req) ...@@ -173,20 +173,19 @@ static int cbc_encrypt(struct skcipher_request *req)
{ {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, first, rounds = 6 + ctx->key_length / 4; int err, rounds = 6 + ctx->key_length / 4;
struct skcipher_walk walk; struct skcipher_walk walk;
unsigned int blocks; unsigned int blocks;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { kernel_neon_begin();
aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr, aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key_enc, rounds, blocks, walk.iv, (u8 *)ctx->key_enc, rounds, blocks, walk.iv);
first); kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
} }
kernel_neon_end();
return err; return err;
} }
...@@ -194,20 +193,19 @@ static int cbc_decrypt(struct skcipher_request *req) ...@@ -194,20 +193,19 @@ static int cbc_decrypt(struct skcipher_request *req)
{ {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, first, rounds = 6 + ctx->key_length / 4; int err, rounds = 6 + ctx->key_length / 4;
struct skcipher_walk walk; struct skcipher_walk walk;
unsigned int blocks; unsigned int blocks;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { kernel_neon_begin();
aes_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr, aes_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key_dec, rounds, blocks, walk.iv, (u8 *)ctx->key_dec, rounds, blocks, walk.iv);
first); kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
} }
kernel_neon_end();
return err; return err;
} }
...@@ -215,20 +213,18 @@ static int ctr_encrypt(struct skcipher_request *req) ...@@ -215,20 +213,18 @@ static int ctr_encrypt(struct skcipher_request *req)
{ {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
int err, first, rounds = 6 + ctx->key_length / 4; int err, rounds = 6 + ctx->key_length / 4;
struct skcipher_walk walk; struct skcipher_walk walk;
int blocks; int blocks;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
first = 1;
kernel_neon_begin();
while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) {
kernel_neon_begin();
aes_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr, aes_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key_enc, rounds, blocks, walk.iv, (u8 *)ctx->key_enc, rounds, blocks, walk.iv);
first);
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
first = 0; kernel_neon_end();
} }
if (walk.nbytes) { if (walk.nbytes) {
u8 __aligned(8) tail[AES_BLOCK_SIZE]; u8 __aligned(8) tail[AES_BLOCK_SIZE];
...@@ -241,12 +237,13 @@ static int ctr_encrypt(struct skcipher_request *req) ...@@ -241,12 +237,13 @@ static int ctr_encrypt(struct skcipher_request *req)
*/ */
blocks = -1; blocks = -1;
kernel_neon_begin();
aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, rounds, aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, rounds,
blocks, walk.iv, first); blocks, walk.iv);
kernel_neon_end();
crypto_xor_cpy(tdst, tsrc, tail, nbytes); crypto_xor_cpy(tdst, tsrc, tail, nbytes);
err = skcipher_walk_done(&walk, 0); err = skcipher_walk_done(&walk, 0);
} }
kernel_neon_end();
return err; return err;
} }
...@@ -270,16 +267,16 @@ static int xts_encrypt(struct skcipher_request *req) ...@@ -270,16 +267,16 @@ static int xts_encrypt(struct skcipher_request *req)
struct skcipher_walk walk; struct skcipher_walk walk;
unsigned int blocks; unsigned int blocks;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) {
kernel_neon_begin();
aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr, aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key1.key_enc, rounds, blocks, (u8 *)ctx->key1.key_enc, rounds, blocks,
(u8 *)ctx->key2.key_enc, walk.iv, first); (u8 *)ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
} }
kernel_neon_end();
return err; return err;
} }
...@@ -292,16 +289,16 @@ static int xts_decrypt(struct skcipher_request *req) ...@@ -292,16 +289,16 @@ static int xts_decrypt(struct skcipher_request *req)
struct skcipher_walk walk; struct skcipher_walk walk;
unsigned int blocks; unsigned int blocks;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) {
kernel_neon_begin();
aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr, aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
(u8 *)ctx->key1.key_dec, rounds, blocks, (u8 *)ctx->key1.key_dec, rounds, blocks,
(u8 *)ctx->key2.key_enc, walk.iv, first); (u8 *)ctx->key2.key_enc, walk.iv, first);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
} }
kernel_neon_end();
return err; return err;
} }
...@@ -425,7 +422,7 @@ static int cmac_setkey(struct crypto_shash *tfm, const u8 *in_key, ...@@ -425,7 +422,7 @@ static int cmac_setkey(struct crypto_shash *tfm, const u8 *in_key,
/* encrypt the zero vector */ /* encrypt the zero vector */
kernel_neon_begin(); kernel_neon_begin();
aes_ecb_encrypt(ctx->consts, (u8[AES_BLOCK_SIZE]){}, rk, rounds, 1, 1); aes_ecb_encrypt(ctx->consts, (u8[AES_BLOCK_SIZE]){}, rk, rounds, 1);
kernel_neon_end(); kernel_neon_end();
cmac_gf128_mul_by_x(consts, consts); cmac_gf128_mul_by_x(consts, consts);
...@@ -454,8 +451,8 @@ static int xcbc_setkey(struct crypto_shash *tfm, const u8 *in_key, ...@@ -454,8 +451,8 @@ static int xcbc_setkey(struct crypto_shash *tfm, const u8 *in_key,
return err; return err;
kernel_neon_begin(); kernel_neon_begin();
aes_ecb_encrypt(key, ks[0], rk, rounds, 1, 1); aes_ecb_encrypt(key, ks[0], rk, rounds, 1);
aes_ecb_encrypt(ctx->consts, ks[1], rk, rounds, 2, 0); aes_ecb_encrypt(ctx->consts, ks[1], rk, rounds, 2);
kernel_neon_end(); kernel_neon_end();
return cbcmac_setkey(tfm, key, sizeof(key)); return cbcmac_setkey(tfm, key, sizeof(key));
......
...@@ -46,10 +46,9 @@ asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], ...@@ -46,10 +46,9 @@ asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[],
/* borrowed from aes-neon-blk.ko */ /* borrowed from aes-neon-blk.ko */
asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[], asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, int first); int rounds, int blocks);
asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[], asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, u8 iv[], int rounds, int blocks, u8 iv[]);
int first);
struct aesbs_ctx { struct aesbs_ctx {
u8 rk[13 * (8 * AES_BLOCK_SIZE) + 32]; u8 rk[13 * (8 * AES_BLOCK_SIZE) + 32];
...@@ -100,9 +99,8 @@ static int __ecb_crypt(struct skcipher_request *req, ...@@ -100,9 +99,8 @@ static int __ecb_crypt(struct skcipher_request *req,
struct skcipher_walk walk; struct skcipher_walk walk;
int err; int err;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
while (walk.nbytes >= AES_BLOCK_SIZE) { while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
...@@ -110,12 +108,13 @@ static int __ecb_crypt(struct skcipher_request *req, ...@@ -110,12 +108,13 @@ static int __ecb_crypt(struct skcipher_request *req,
blocks = round_down(blocks, blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE); walk.stride / AES_BLOCK_SIZE);
kernel_neon_begin();
fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->rk, fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->rk,
ctx->rounds, blocks); ctx->rounds, blocks);
kernel_neon_end();
err = skcipher_walk_done(&walk, err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE); walk.nbytes - blocks * AES_BLOCK_SIZE);
} }
kernel_neon_end();
return err; return err;
} }
...@@ -157,22 +156,21 @@ static int cbc_encrypt(struct skcipher_request *req) ...@@ -157,22 +156,21 @@ static int cbc_encrypt(struct skcipher_request *req)
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk; struct skcipher_walk walk;
int err, first = 1; int err;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
while (walk.nbytes >= AES_BLOCK_SIZE) { while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
/* fall back to the non-bitsliced NEON implementation */ /* fall back to the non-bitsliced NEON implementation */
kernel_neon_begin();
neon_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr, neon_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->enc, ctx->key.rounds, blocks, walk.iv, ctx->enc, ctx->key.rounds, blocks,
first); walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
first = 0;
} }
kernel_neon_end();
return err; return err;
} }
...@@ -183,9 +181,8 @@ static int cbc_decrypt(struct skcipher_request *req) ...@@ -183,9 +181,8 @@ static int cbc_decrypt(struct skcipher_request *req)
struct skcipher_walk walk; struct skcipher_walk walk;
int err; int err;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
while (walk.nbytes >= AES_BLOCK_SIZE) { while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
...@@ -193,13 +190,14 @@ static int cbc_decrypt(struct skcipher_request *req) ...@@ -193,13 +190,14 @@ static int cbc_decrypt(struct skcipher_request *req)
blocks = round_down(blocks, blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE); walk.stride / AES_BLOCK_SIZE);
kernel_neon_begin();
aesbs_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr, aesbs_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->key.rk, ctx->key.rounds, blocks, ctx->key.rk, ctx->key.rounds, blocks,
walk.iv); walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE); walk.nbytes - blocks * AES_BLOCK_SIZE);
} }
kernel_neon_end();
return err; return err;
} }
...@@ -231,9 +229,8 @@ static int ctr_encrypt(struct skcipher_request *req) ...@@ -231,9 +229,8 @@ static int ctr_encrypt(struct skcipher_request *req)
u8 buf[AES_BLOCK_SIZE]; u8 buf[AES_BLOCK_SIZE];
int err; int err;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin();
while (walk.nbytes > 0) { while (walk.nbytes > 0) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
u8 *final = (walk.total % AES_BLOCK_SIZE) ? buf : NULL; u8 *final = (walk.total % AES_BLOCK_SIZE) ? buf : NULL;
...@@ -244,8 +241,10 @@ static int ctr_encrypt(struct skcipher_request *req) ...@@ -244,8 +241,10 @@ static int ctr_encrypt(struct skcipher_request *req)
final = NULL; final = NULL;
} }
kernel_neon_begin();
aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr, aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->rk, ctx->rounds, blocks, walk.iv, final); ctx->rk, ctx->rounds, blocks, walk.iv, final);
kernel_neon_end();
if (final) { if (final) {
u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE; u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
...@@ -260,8 +259,6 @@ static int ctr_encrypt(struct skcipher_request *req) ...@@ -260,8 +259,6 @@ static int ctr_encrypt(struct skcipher_request *req)
err = skcipher_walk_done(&walk, err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE); walk.nbytes - blocks * AES_BLOCK_SIZE);
} }
kernel_neon_end();
return err; return err;
} }
...@@ -306,12 +303,11 @@ static int __xts_crypt(struct skcipher_request *req, ...@@ -306,12 +303,11 @@ static int __xts_crypt(struct skcipher_request *req,
struct skcipher_walk walk; struct skcipher_walk walk;
int err; int err;
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
kernel_neon_begin(); kernel_neon_begin();
neon_aes_ecb_encrypt(walk.iv, walk.iv, ctx->twkey, ctx->key.rounds, 1);
neon_aes_ecb_encrypt(walk.iv, walk.iv, ctx->twkey, kernel_neon_end();
ctx->key.rounds, 1, 1);
while (walk.nbytes >= AES_BLOCK_SIZE) { while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
...@@ -320,13 +316,13 @@ static int __xts_crypt(struct skcipher_request *req, ...@@ -320,13 +316,13 @@ static int __xts_crypt(struct skcipher_request *req,
blocks = round_down(blocks, blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE); walk.stride / AES_BLOCK_SIZE);
kernel_neon_begin();
fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->key.rk, fn(walk.dst.virt.addr, walk.src.virt.addr, ctx->key.rk,
ctx->key.rounds, blocks, walk.iv); ctx->key.rounds, blocks, walk.iv);
kernel_neon_end();
err = skcipher_walk_done(&walk, err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE); walk.nbytes - blocks * AES_BLOCK_SIZE);
} }
kernel_neon_end();
return err; return err;
} }
......
...@@ -37,12 +37,19 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src, ...@@ -37,12 +37,19 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
u8 buf[CHACHA20_BLOCK_SIZE]; u8 buf[CHACHA20_BLOCK_SIZE];
while (bytes >= CHACHA20_BLOCK_SIZE * 4) { while (bytes >= CHACHA20_BLOCK_SIZE * 4) {
kernel_neon_begin();
chacha20_4block_xor_neon(state, dst, src); chacha20_4block_xor_neon(state, dst, src);
kernel_neon_end();
bytes -= CHACHA20_BLOCK_SIZE * 4; bytes -= CHACHA20_BLOCK_SIZE * 4;
src += CHACHA20_BLOCK_SIZE * 4; src += CHACHA20_BLOCK_SIZE * 4;
dst += CHACHA20_BLOCK_SIZE * 4; dst += CHACHA20_BLOCK_SIZE * 4;
state[12] += 4; state[12] += 4;
} }
if (!bytes)
return;
kernel_neon_begin();
while (bytes >= CHACHA20_BLOCK_SIZE) { while (bytes >= CHACHA20_BLOCK_SIZE) {
chacha20_block_xor_neon(state, dst, src); chacha20_block_xor_neon(state, dst, src);
bytes -= CHACHA20_BLOCK_SIZE; bytes -= CHACHA20_BLOCK_SIZE;
...@@ -55,6 +62,7 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src, ...@@ -55,6 +62,7 @@ static void chacha20_doneon(u32 *state, u8 *dst, const u8 *src,
chacha20_block_xor_neon(state, buf, buf); chacha20_block_xor_neon(state, buf, buf);
memcpy(dst, buf, bytes); memcpy(dst, buf, bytes);
} }
kernel_neon_end();
} }
static int chacha20_neon(struct skcipher_request *req) static int chacha20_neon(struct skcipher_request *req)
...@@ -68,11 +76,10 @@ static int chacha20_neon(struct skcipher_request *req) ...@@ -68,11 +76,10 @@ static int chacha20_neon(struct skcipher_request *req)
if (!may_use_simd() || req->cryptlen <= CHACHA20_BLOCK_SIZE) if (!may_use_simd() || req->cryptlen <= CHACHA20_BLOCK_SIZE)
return crypto_chacha20_crypt(req); return crypto_chacha20_crypt(req);
err = skcipher_walk_virt(&walk, req, true); err = skcipher_walk_virt(&walk, req, false);
crypto_chacha20_init(state, ctx, walk.iv); crypto_chacha20_init(state, ctx, walk.iv);
kernel_neon_begin();
while (walk.nbytes > 0) { while (walk.nbytes > 0) {
unsigned int nbytes = walk.nbytes; unsigned int nbytes = walk.nbytes;
...@@ -83,7 +90,6 @@ static int chacha20_neon(struct skcipher_request *req) ...@@ -83,7 +90,6 @@ static int chacha20_neon(struct skcipher_request *req)
nbytes); nbytes);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes); err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
} }
kernel_neon_end();
return err; return err;
} }
......
...@@ -89,21 +89,32 @@ static struct shash_alg algs[] = { { ...@@ -89,21 +89,32 @@ static struct shash_alg algs[] = { {
static int sha256_update_neon(struct shash_desc *desc, const u8 *data, static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
unsigned int len) unsigned int len)
{ {
/* struct sha256_state *sctx = shash_desc_ctx(desc);
* Stacking and unstacking a substantial slice of the NEON register
* file may significantly affect performance for small updates when
* executing in interrupt context, so fall back to the scalar code
* in that case.
*/
if (!may_use_simd()) if (!may_use_simd())
return sha256_base_do_update(desc, data, len, return sha256_base_do_update(desc, data, len,
(sha256_block_fn *)sha256_block_data_order); (sha256_block_fn *)sha256_block_data_order);
kernel_neon_begin(); while (len > 0) {
sha256_base_do_update(desc, data, len, unsigned int chunk = len;
(sha256_block_fn *)sha256_block_neon);
kernel_neon_end(); /*
* Don't hog the CPU for the entire time it takes to process all
* input when running on a preemptible kernel, but process the
* data block by block instead.
*/
if (IS_ENABLED(CONFIG_PREEMPT) &&
chunk + sctx->count % SHA256_BLOCK_SIZE > SHA256_BLOCK_SIZE)
chunk = SHA256_BLOCK_SIZE -
sctx->count % SHA256_BLOCK_SIZE;
kernel_neon_begin();
sha256_base_do_update(desc, data, chunk,
(sha256_block_fn *)sha256_block_neon);
kernel_neon_end();
data += chunk;
len -= chunk;
}
return 0; return 0;
} }
...@@ -117,10 +128,9 @@ static int sha256_finup_neon(struct shash_desc *desc, const u8 *data, ...@@ -117,10 +128,9 @@ static int sha256_finup_neon(struct shash_desc *desc, const u8 *data,
sha256_base_do_finalize(desc, sha256_base_do_finalize(desc,
(sha256_block_fn *)sha256_block_data_order); (sha256_block_fn *)sha256_block_data_order);
} else { } else {
kernel_neon_begin();
if (len) if (len)
sha256_base_do_update(desc, data, len, sha256_update_neon(desc, data, len);
(sha256_block_fn *)sha256_block_neon); kernel_neon_begin();
sha256_base_do_finalize(desc, sha256_base_do_finalize(desc,
(sha256_block_fn *)sha256_block_neon); (sha256_block_fn *)sha256_block_neon);
kernel_neon_end(); kernel_neon_end();
......
// SPDX-License-Identifier: GPL-2.0
/*
* NEON-accelerated implementation of Speck128-XTS and Speck64-XTS
* (64-bit version; based on the 32-bit version)
*
* Copyright (c) 2018 Google, Inc
*/
#include <asm/hwcap.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include <crypto/algapi.h>
#include <crypto/gf128mul.h>
#include <crypto/internal/skcipher.h>
#include <crypto/speck.h>
#include <crypto/xts.h>
#include <linux/kernel.h>
#include <linux/module.h>

/* The assembly functions only handle multiples of 128 bytes */
#define SPECK_NEON_CHUNK_SIZE	128

/* Speck128 */

struct speck128_xts_tfm_ctx {
        struct speck128_tfm_ctx main_key;
        struct speck128_tfm_ctx tweak_key;
};

asmlinkage void speck128_xts_encrypt_neon(const u64 *round_keys, int nrounds,
                                          void *dst, const void *src,
                                          unsigned int nbytes, void *tweak);

asmlinkage void speck128_xts_decrypt_neon(const u64 *round_keys, int nrounds,
                                          void *dst, const void *src,
                                          unsigned int nbytes, void *tweak);

typedef void (*speck128_crypt_one_t)(const struct speck128_tfm_ctx *,
                                     u8 *, const u8 *);
typedef void (*speck128_xts_crypt_many_t)(const u64 *, int, void *,
                                          const void *, unsigned int, void *);

static __always_inline int
__speck128_xts_crypt(struct skcipher_request *req,
                     speck128_crypt_one_t crypt_one,
                     speck128_xts_crypt_many_t crypt_many)
{
        struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
        const struct speck128_xts_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);
        struct skcipher_walk walk;
        le128 tweak;
        int err;

        err = skcipher_walk_virt(&walk, req, true);

        crypto_speck128_encrypt(&ctx->tweak_key, (u8 *)&tweak, walk.iv);

        while (walk.nbytes > 0) {
                unsigned int nbytes = walk.nbytes;
                u8 *dst = walk.dst.virt.addr;
                const u8 *src = walk.src.virt.addr;

                if (nbytes >= SPECK_NEON_CHUNK_SIZE && may_use_simd()) {
                        unsigned int count;

                        count = round_down(nbytes, SPECK_NEON_CHUNK_SIZE);
                        kernel_neon_begin();
                        (*crypt_many)(ctx->main_key.round_keys,
                                      ctx->main_key.nrounds,
                                      dst, src, count, &tweak);
                        kernel_neon_end();
                        dst += count;
                        src += count;
                        nbytes -= count;
                }

                /* Handle any remainder with generic code */
                while (nbytes >= sizeof(tweak)) {
                        le128_xor((le128 *)dst, (const le128 *)src, &tweak);
                        (*crypt_one)(&ctx->main_key, dst, dst);
                        le128_xor((le128 *)dst, (const le128 *)dst, &tweak);
                        gf128mul_x_ble(&tweak, &tweak);

                        dst += sizeof(tweak);
                        src += sizeof(tweak);
                        nbytes -= sizeof(tweak);
                }
                err = skcipher_walk_done(&walk, nbytes);
        }

        return err;
}

static int speck128_xts_encrypt(struct skcipher_request *req)
{
        return __speck128_xts_crypt(req, crypto_speck128_encrypt,
                                    speck128_xts_encrypt_neon);
}

static int speck128_xts_decrypt(struct skcipher_request *req)
{
        return __speck128_xts_crypt(req, crypto_speck128_decrypt,
                                    speck128_xts_decrypt_neon);
}

static int speck128_xts_setkey(struct crypto_skcipher *tfm, const u8 *key,
                               unsigned int keylen)
{
        struct speck128_xts_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);
        int err;

        err = xts_verify_key(tfm, key, keylen);
        if (err)
                return err;

        keylen /= 2;

        err = crypto_speck128_setkey(&ctx->main_key, key, keylen);
        if (err)
                return err;

        return crypto_speck128_setkey(&ctx->tweak_key, key + keylen, keylen);
}

/* Speck64 */

struct speck64_xts_tfm_ctx {
        struct speck64_tfm_ctx main_key;
        struct speck64_tfm_ctx tweak_key;
};

asmlinkage void speck64_xts_encrypt_neon(const u32 *round_keys, int nrounds,
                                         void *dst, const void *src,
                                         unsigned int nbytes, void *tweak);

asmlinkage void speck64_xts_decrypt_neon(const u32 *round_keys, int nrounds,
                                         void *dst, const void *src,
                                         unsigned int nbytes, void *tweak);

typedef void (*speck64_crypt_one_t)(const struct speck64_tfm_ctx *,
                                    u8 *, const u8 *);
typedef void (*speck64_xts_crypt_many_t)(const u32 *, int, void *,
                                         const void *, unsigned int, void *);

static __always_inline int
__speck64_xts_crypt(struct skcipher_request *req, speck64_crypt_one_t crypt_one,
                    speck64_xts_crypt_many_t crypt_many)
{
        struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
        const struct speck64_xts_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);
        struct skcipher_walk walk;
        __le64 tweak;
        int err;

        err = skcipher_walk_virt(&walk, req, true);

        crypto_speck64_encrypt(&ctx->tweak_key, (u8 *)&tweak, walk.iv);

        while (walk.nbytes > 0) {
                unsigned int nbytes = walk.nbytes;
                u8 *dst = walk.dst.virt.addr;
                const u8 *src = walk.src.virt.addr;

                if (nbytes >= SPECK_NEON_CHUNK_SIZE && may_use_simd()) {
                        unsigned int count;

                        count = round_down(nbytes, SPECK_NEON_CHUNK_SIZE);
                        kernel_neon_begin();
                        (*crypt_many)(ctx->main_key.round_keys,
                                      ctx->main_key.nrounds,
                                      dst, src, count, &tweak);
                        kernel_neon_end();
                        dst += count;
                        src += count;
                        nbytes -= count;
                }

                /* Handle any remainder with generic code */
                while (nbytes >= sizeof(tweak)) {
                        *(__le64 *)dst = *(__le64 *)src ^ tweak;
                        (*crypt_one)(&ctx->main_key, dst, dst);
                        *(__le64 *)dst ^= tweak;
                        tweak = cpu_to_le64((le64_to_cpu(tweak) << 1) ^
                                            ((tweak & cpu_to_le64(1ULL << 63)) ?
                                             0x1B : 0));
                        dst += sizeof(tweak);
                        src += sizeof(tweak);
                        nbytes -= sizeof(tweak);
                }
                err = skcipher_walk_done(&walk, nbytes);
        }

        return err;
}

static int speck64_xts_encrypt(struct skcipher_request *req)
{
        return __speck64_xts_crypt(req, crypto_speck64_encrypt,
                                   speck64_xts_encrypt_neon);
}

static int speck64_xts_decrypt(struct skcipher_request *req)
{
        return __speck64_xts_crypt(req, crypto_speck64_decrypt,
                                   speck64_xts_decrypt_neon);
}

static int speck64_xts_setkey(struct crypto_skcipher *tfm, const u8 *key,
                              unsigned int keylen)
{
        struct speck64_xts_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);
        int err;

        err = xts_verify_key(tfm, key, keylen);
        if (err)
                return err;

        keylen /= 2;

        err = crypto_speck64_setkey(&ctx->main_key, key, keylen);
        if (err)
                return err;

        return crypto_speck64_setkey(&ctx->tweak_key, key + keylen, keylen);
}

static struct skcipher_alg speck_algs[] = {
        {
                .base.cra_name = "xts(speck128)",
                .base.cra_driver_name = "xts-speck128-neon",
                .base.cra_priority = 300,
                .base.cra_blocksize = SPECK128_BLOCK_SIZE,
                .base.cra_ctxsize = sizeof(struct speck128_xts_tfm_ctx),
                .base.cra_alignmask = 7,
                .base.cra_module = THIS_MODULE,
                .min_keysize = 2 * SPECK128_128_KEY_SIZE,
                .max_keysize = 2 * SPECK128_256_KEY_SIZE,
                .ivsize = SPECK128_BLOCK_SIZE,
                .walksize = SPECK_NEON_CHUNK_SIZE,
                .setkey = speck128_xts_setkey,
                .encrypt = speck128_xts_encrypt,
                .decrypt = speck128_xts_decrypt,
        }, {
                .base.cra_name = "xts(speck64)",
                .base.cra_driver_name = "xts-speck64-neon",
                .base.cra_priority = 300,
                .base.cra_blocksize = SPECK64_BLOCK_SIZE,
                .base.cra_ctxsize = sizeof(struct speck64_xts_tfm_ctx),
                .base.cra_alignmask = 7,
                .base.cra_module = THIS_MODULE,
                .min_keysize = 2 * SPECK64_96_KEY_SIZE,
                .max_keysize = 2 * SPECK64_128_KEY_SIZE,
                .ivsize = SPECK64_BLOCK_SIZE,
                .walksize = SPECK_NEON_CHUNK_SIZE,
                .setkey = speck64_xts_setkey,
                .encrypt = speck64_xts_encrypt,
                .decrypt = speck64_xts_decrypt,
        }
};

static int __init speck_neon_module_init(void)
{
        if (!(elf_hwcap & HWCAP_ASIMD))
                return -ENODEV;
        return crypto_register_skciphers(speck_algs, ARRAY_SIZE(speck_algs));
}

static void __exit speck_neon_module_exit(void)
{
        crypto_unregister_skciphers(speck_algs, ARRAY_SIZE(speck_algs));
}

module_init(speck_neon_module_init);
module_exit(speck_neon_module_exit);

MODULE_DESCRIPTION("Speck block cipher (NEON-accelerated)");
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>");
MODULE_ALIAS_CRYPTO("xts(speck128)");
MODULE_ALIAS_CRYPTO("xts-speck128-neon");
MODULE_ALIAS_CRYPTO("xts(speck64)");
MODULE_ALIAS_CRYPTO("xts-speck64-neon");
...
@@ -106,13 +106,6 @@ static asmlinkage struct job_sha1* (*sha1_job_mgr_flush)
 static asmlinkage struct job_sha1* (*sha1_job_mgr_get_comp_job)
                        (struct sha1_mb_mgr *state);
 
-static inline void sha1_init_digest(uint32_t *digest)
-{
-       static const uint32_t initial_digest[SHA1_DIGEST_LENGTH] = {SHA1_H0,
-                       SHA1_H1, SHA1_H2, SHA1_H3, SHA1_H4 };
-       memcpy(digest, initial_digest, sizeof(initial_digest));
-}
-
 static inline uint32_t sha1_pad(uint8_t padblock[SHA1_BLOCK_SIZE * 2],
                         uint64_t total_len)
 {
@@ -244,11 +237,8 @@ static struct sha1_hash_ctx *sha1_ctx_mgr_submit(struct sha1_ctx_mgr *mgr,
                                          uint32_t len,
                                          int flags)
 {
-       if (flags & (~HASH_ENTIRE)) {
-               /*
-                * User should not pass anything other than FIRST, UPDATE, or
-                * LAST
-                */
+       if (flags & ~(HASH_UPDATE | HASH_LAST)) {
+               /* User should not pass anything other than UPDATE or LAST */
                ctx->error = HASH_CTX_ERROR_INVALID_FLAGS;
                return ctx;
        }
@@ -259,24 +249,12 @@ static struct sha1_hash_ctx *sha1_ctx_mgr_submit(struct sha1_ctx_mgr *mgr,
                return ctx;
        }
 
-       if ((ctx->status & HASH_CTX_STS_COMPLETE) && !(flags & HASH_FIRST)) {
+       if (ctx->status & HASH_CTX_STS_COMPLETE) {
                /* Cannot update a finished job. */
                ctx->error = HASH_CTX_ERROR_ALREADY_COMPLETED;
                return ctx;
        }
 
-       if (flags & HASH_FIRST) {
-               /* Init digest */
-               sha1_init_digest(ctx->job.result_digest);
-
-               /* Reset byte counter */
-               ctx->total_length = 0;
-
-               /* Clear extra blocks */
-               ctx->partial_block_buffer_length = 0;
-       }
-
        /*
         * If we made it here, there were no errors during this call to
         * submit
...