Commit 64e7003c authored by Linus Torvalds

Merge tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Optimise away self-test overhead when they are disabled
   - Support symmetric encryption via keyring keys in af_alg
   - Flip hwrng default_quality, the default is now maximum entropy

  Algorithms:
   - Add library version of aesgcm
   - CFI fixes for assembly code
   - Add arm/arm64 accelerated versions of sm3/sm4

  Drivers:
   - Remove assumption on arm64 that kmalloc is DMA-aligned
   - Fix selftest failures in rockchip
   - Add support for RK3328/RK3399 in rockchip
   - Add deflate support in qat
   - Merge ux500 into stm32
   - Add support for TEE for PCI ID 0x14CA in ccp
   - Add mt7986 support in mtk
   - Add MaxLinear platform support in inside-secure
   - Add NPCM8XX support in npcm"

* tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (184 commits)
  crypto: ux500/cryp - delete driver
  crypto: stm32/cryp - enable for use with Ux500
  crypto: stm32 - enable drivers to be used on Ux500
  dt-bindings: crypto: Let STM32 define Ux500 CRYP
  hwrng: geode - Fix PCI device refcount leak
  hwrng: amd - Fix PCI device refcount leak
  crypto: qce - Set DMA alignment explicitly
  crypto: octeontx2 - Set DMA alignment explicitly
  crypto: octeontx - Set DMA alignment explicitly
  crypto: keembay - Set DMA alignment explicitly
  crypto: safexcel - Set DMA alignment explicitly
  crypto: hisilicon/hpre - Set DMA alignment explicitly
  crypto: chelsio - Set DMA alignment explicitly
  crypto: ccree - Set DMA alignment explicitly
  crypto: ccp - Set DMA alignment explicitly
  crypto: cavium - Set DMA alignment explicitly
  crypto: img-hash - Fix variable dereferenced before check 'hdev->req'
  crypto: arm64/ghash-ce - use frame_push/pop macros consistently
  crypto: arm64/crct10dif - use frame_push/pop macros consistently
  crypto: arm64/aes-modes - use frame_push/pop macros consistently
  ...
parents 48ea09cd 453de3eb
...@@ -172,7 +172,7 @@ Here are schematics of how these functions are called when operated from ...@@ -172,7 +172,7 @@ Here are schematics of how these functions are called when operated from
other part of the kernel. Note that the .setkey() call might happen other part of the kernel. Note that the .setkey() call might happen
before or after any of these schematics happen, but must not happen before or after any of these schematics happen, but must not happen
during any of these are in-flight. Please note that calling .init() during any of these are in-flight. Please note that calling .init()
followed immediately by .finish() is also a perfectly valid followed immediately by .final() is also a perfectly valid
transformation. transformation.
:: ::
......
...@@ -131,9 +131,9 @@ from the kernel crypto API. If the buffer is too small for the message ...@@ -131,9 +131,9 @@ from the kernel crypto API. If the buffer is too small for the message
digest, the flag MSG_TRUNC is set by the kernel. digest, the flag MSG_TRUNC is set by the kernel.
In order to set a message digest key, the calling application must use In order to set a message digest key, the calling application must use
the setsockopt() option of ALG_SET_KEY. If the key is not set the HMAC the setsockopt() option of ALG_SET_KEY or ALG_SET_KEY_BY_KEY_SERIAL. If the
operation is performed without the initial HMAC state change caused by key is not set the HMAC operation is performed without the initial HMAC state
the key. change caused by the key.
Symmetric Cipher API Symmetric Cipher API
-------------------- --------------------
...@@ -382,6 +382,15 @@ mentioned optname: ...@@ -382,6 +382,15 @@ mentioned optname:
- the RNG cipher type to provide the seed - the RNG cipher type to provide the seed
- ALG_SET_KEY_BY_KEY_SERIAL -- Setting the key via keyring key_serial_t.
This operation behaves the same as ALG_SET_KEY. The decrypted
data is copied from a keyring key, and uses that data as the
key for symmetric encryption.
The passed in key_serial_t must have the KEY_(POS|USR|GRP|OTH)_SEARCH
permission set, otherwise -EPERM is returned. Supports key types: user,
logon, encrypted, and trusted.
- ALG_SET_AEAD_AUTHSIZE -- Setting the authentication tag size for - ALG_SET_AEAD_AUTHSIZE -- Setting the authentication tag size for
AEAD ciphers. For a encryption operation, the authentication tag of AEAD ciphers. For a encryption operation, the authentication tag of
the given size will be generated. For a decryption operation, the the given size will be generated. For a decryption operation, the
......
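The ALG_SET_KEY_BY_KEY_SERIAL option documented in the hunk above could be driven from user space roughly as sketched below. This is a minimal sketch, not the kernel's reference usage: the helper name `alg_set_key_by_serial` and the local `alg_key_serial_t` typedef are hypothetical, and the fallback values for `SOL_ALG` and `ALG_SET_KEY_BY_KEY_SERIAL` are assumptions for toolchains whose headers predate this change (real code would normally take them from `<linux/if_alg.h>`).

```c
#include <stdint.h>
#include <sys/socket.h>

/* Fallbacks for older userspace headers; the values are assumptions. */
#ifndef SOL_ALG
#define SOL_ALG 279
#endif
#ifndef ALG_SET_KEY_BY_KEY_SERIAL
#define ALG_SET_KEY_BY_KEY_SERIAL 7
#endif

/* Local stand-in for key_serial_t (assumed to be a 32-bit signed integer). */
typedef int32_t alg_key_serial_t;

/*
 * Hypothetical helper: ask the kernel to key the transform bound to
 * tfm_fd from the keyring key identified by serial, instead of passing
 * raw key bytes with ALG_SET_KEY.
 *
 * Returns 0 on success, or -1 with errno set on failure (for example
 * EPERM when the key lacks the SEARCH permission mentioned above).
 */
int alg_set_key_by_serial(int tfm_fd, alg_key_serial_t serial)
{
	return setsockopt(tfm_fd, SOL_ALG, ALG_SET_KEY_BY_KEY_SERIAL,
			  &serial, sizeof(serial));
}
```

In a typical AF_ALG flow the helper would be called on the socket returned by socket(AF_ALG, SOCK_SEQPACKET, 0) after bind() and before accept(), at the point where ALG_SET_KEY would otherwise be used.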
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
%YAML 1.2
---
$id: http://devicetree.org/schemas/crypto/rockchip,rk3288-crypto.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Rockchip Electronics Security Accelerator
maintainers:
- Heiko Stuebner <heiko@sntech.de>
properties:
compatible:
enum:
- rockchip,rk3288-crypto
- rockchip,rk3328-crypto
- rockchip,rk3399-crypto
reg:
maxItems: 1
interrupts:
maxItems: 1
clocks:
minItems: 3
maxItems: 4
clock-names:
minItems: 3
maxItems: 4
resets:
minItems: 1
maxItems: 3
reset-names:
minItems: 1
maxItems: 3
allOf:
- if:
properties:
compatible:
contains:
const: rockchip,rk3288-crypto
then:
properties:
clocks:
minItems: 4
clock-names:
items:
- const: aclk
- const: hclk
- const: sclk
- const: apb_pclk
resets:
maxItems: 1
reset-names:
items:
- const: crypto-rst
- if:
properties:
compatible:
contains:
const: rockchip,rk3328-crypto
then:
properties:
clocks:
maxItems: 3
clock-names:
items:
- const: hclk_master
- const: hclk_slave
- const: sclk
resets:
maxItems: 1
reset-names:
items:
- const: crypto-rst
- if:
properties:
compatible:
contains:
const: rockchip,rk3399-crypto
then:
properties:
clocks:
maxItems: 3
clock-names:
items:
- const: hclk_master
- const: hclk_slave
- const: sclk
resets:
minItems: 3
reset-names:
items:
- const: master
- const: slave
- const: crypto-rst
required:
- compatible
- reg
- interrupts
- clocks
- clock-names
- resets
- reset-names
additionalProperties: false
examples:
- |
#include <dt-bindings/interrupt-controller/arm-gic.h>
#include <dt-bindings/clock/rk3288-cru.h>
crypto@ff8a0000 {
compatible = "rockchip,rk3288-crypto";
reg = <0xff8a0000 0x4000>;
interrupts = <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&cru ACLK_CRYPTO>, <&cru HCLK_CRYPTO>,
<&cru SCLK_CRYPTO>, <&cru ACLK_DMAC1>;
clock-names = "aclk", "hclk", "sclk", "apb_pclk";
resets = <&cru SRST_CRYPTO>;
reset-names = "crypto-rst";
};
Rockchip Electronics And Security Accelerator
Required properties:
- compatible: Should be "rockchip,rk3288-crypto"
- reg: Base physical address of the engine and length of memory mapped
region
- interrupts: Interrupt number
- clocks: Reference to the clocks about crypto
- clock-names: "aclk" used to clock data
"hclk" used to clock data
"sclk" used to clock crypto accelerator
"apb_pclk" used to clock dma
- resets: Must contain an entry for each entry in reset-names.
See ../reset/reset.txt for details.
- reset-names: Must include the name "crypto-rst".
Examples:
crypto: cypto-controller@ff8a0000 {
compatible = "rockchip,rk3288-crypto";
reg = <0xff8a0000 0x4000>;
interrupts = <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&cru ACLK_CRYPTO>, <&cru HCLK_CRYPTO>,
<&cru SCLK_CRYPTO>, <&cru ACLK_DMAC1>;
clock-names = "aclk", "hclk", "sclk", "apb_pclk";
resets = <&cru SRST_CRYPTO>;
reset-names = "crypto-rst";
};
...@@ -6,12 +6,18 @@ $schema: http://devicetree.org/meta-schemas/core.yaml# ...@@ -6,12 +6,18 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: STMicroelectronics STM32 CRYP bindings title: STMicroelectronics STM32 CRYP bindings
description: The STM32 CRYP block is built on the CRYP block found in
the STn8820 SoC introduced in 2007, and subsequently used in the U8500
SoC in 2010.
maintainers: maintainers:
- Lionel Debieve <lionel.debieve@foss.st.com> - Lionel Debieve <lionel.debieve@foss.st.com>
properties: properties:
compatible: compatible:
enum: enum:
- st,stn8820-cryp
- stericsson,ux500-cryp
- st,stm32f756-cryp - st,stm32f756-cryp
- st,stm32mp1-cryp - st,stm32mp1-cryp
...@@ -27,6 +33,19 @@ properties: ...@@ -27,6 +33,19 @@ properties:
resets: resets:
maxItems: 1 maxItems: 1
dmas:
items:
- description: mem2cryp DMA channel
- description: cryp2mem DMA channel
dma-names:
items:
- const: mem2cryp
- const: cryp2mem
power-domains:
maxItems: 1
required: required:
- compatible - compatible
- reg - reg
......
...@@ -16,7 +16,9 @@ maintainers: ...@@ -16,7 +16,9 @@ maintainers:
properties: properties:
compatible: compatible:
const: nuvoton,npcm750-rng enum:
- nuvoton,npcm750-rng
- nuvoton,npcm845-rng
reg: reg:
maxItems: 1 maxItems: 1
......
...@@ -17941,6 +17941,13 @@ F: Documentation/ABI/*/sysfs-driver-hid-roccat* ...@@ -17941,6 +17941,13 @@ F: Documentation/ABI/*/sysfs-driver-hid-roccat*
F: drivers/hid/hid-roccat* F: drivers/hid/hid-roccat*
F: include/linux/hid-roccat* F: include/linux/hid-roccat*
ROCKCHIP CRYPTO DRIVERS
M: Corentin Labbe <clabbe@baylibre.com>
L: linux-crypto@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/crypto/rockchip,rk3288-crypto.yaml
F: drivers/crypto/rockchip/
ROCKCHIP I2S TDM DRIVER ROCKCHIP I2S TDM DRIVER
M: Nicolas Frattaroli <frattaroli.nicolas@gmail.com> M: Nicolas Frattaroli <frattaroli.nicolas@gmail.com>
L: linux-rockchip@lists.infradead.org L: linux-rockchip@lists.infradead.org
......
...@@ -18,7 +18,7 @@ config CRYPTO_GHASH_ARM_CE ...@@ -18,7 +18,7 @@ config CRYPTO_GHASH_ARM_CE
depends on KERNEL_MODE_NEON depends on KERNEL_MODE_NEON
select CRYPTO_HASH select CRYPTO_HASH
select CRYPTO_CRYPTD select CRYPTO_CRYPTD
select CRYPTO_GF128MUL select CRYPTO_LIB_GF128MUL
help help
GCM GHASH function (NIST SP800-38D) GCM GHASH function (NIST SP800-38D)
......
...@@ -7,7 +7,7 @@ ...@@ -7,7 +7,7 @@
*/ */
#include <crypto/aes.h> #include <crypto/aes.h>
#include <linux/crypto.h> #include <crypto/algapi.h>
#include <linux/module.h> #include <linux/module.h>
asmlinkage void __aes_arm_encrypt(u32 *rk, int rounds, const u8 *in, u8 *out); asmlinkage void __aes_arm_encrypt(u32 *rk, int rounds, const u8 *in, u8 *out);
......
...@@ -69,7 +69,7 @@ ...@@ -69,7 +69,7 @@
/* /*
* void nh_neon(const u32 *key, const u8 *message, size_t message_len, * void nh_neon(const u32 *key, const u8 *message, size_t message_len,
* u8 hash[NH_HASH_BYTES]) * __le64 hash[NH_NUM_PASSES])
* *
* It's guaranteed that message_len % 16 == 0. * It's guaranteed that message_len % 16 == 0.
*/ */
......
...@@ -14,14 +14,7 @@ ...@@ -14,14 +14,7 @@
#include <linux/module.h> #include <linux/module.h>
asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len, asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len,
u8 hash[NH_HASH_BYTES]); __le64 hash[NH_NUM_PASSES]);
/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */
static void _nh_neon(const u32 *key, const u8 *message, size_t message_len,
__le64 hash[NH_NUM_PASSES])
{
nh_neon(key, message, message_len, (u8 *)hash);
}
static int nhpoly1305_neon_update(struct shash_desc *desc, static int nhpoly1305_neon_update(struct shash_desc *desc,
const u8 *src, unsigned int srclen) const u8 *src, unsigned int srclen)
...@@ -33,7 +26,7 @@ static int nhpoly1305_neon_update(struct shash_desc *desc, ...@@ -33,7 +26,7 @@ static int nhpoly1305_neon_update(struct shash_desc *desc,
unsigned int n = min_t(unsigned int, srclen, SZ_4K); unsigned int n = min_t(unsigned int, srclen, SZ_4K);
kernel_neon_begin(); kernel_neon_begin();
crypto_nhpoly1305_update_helper(desc, src, n, _nh_neon); crypto_nhpoly1305_update_helper(desc, src, n, nh_neon);
kernel_neon_end(); kernel_neon_end();
src += n; src += n;
srclen -= n; srclen -= n;
......
...@@ -6,8 +6,8 @@ config CRYPTO_GHASH_ARM64_CE ...@@ -6,8 +6,8 @@ config CRYPTO_GHASH_ARM64_CE
tristate "Hash functions: GHASH (ARMv8 Crypto Extensions)" tristate "Hash functions: GHASH (ARMv8 Crypto Extensions)"
depends on KERNEL_MODE_NEON depends on KERNEL_MODE_NEON
select CRYPTO_HASH select CRYPTO_HASH
select CRYPTO_GF128MUL
select CRYPTO_LIB_AES select CRYPTO_LIB_AES
select CRYPTO_LIB_GF128MUL
select CRYPTO_AEAD select CRYPTO_AEAD
help help
GCM GHASH function (NIST SP800-38D) GCM GHASH function (NIST SP800-38D)
...@@ -96,6 +96,17 @@ config CRYPTO_SHA3_ARM64 ...@@ -96,6 +96,17 @@ config CRYPTO_SHA3_ARM64
Architecture: arm64 using: Architecture: arm64 using:
- ARMv8.2 Crypto Extensions - ARMv8.2 Crypto Extensions
config CRYPTO_SM3_NEON
tristate "Hash functions: SM3 (NEON)"
depends on KERNEL_MODE_NEON
select CRYPTO_HASH
select CRYPTO_SM3
help
SM3 (ShangMi 3) secure hash function (OSCCA GM/T 0004-2012)
Architecture: arm64 using:
- NEON (Advanced SIMD) extensions
config CRYPTO_SM3_ARM64_CE config CRYPTO_SM3_ARM64_CE
tristate "Hash functions: SM3 (ARMv8.2 Crypto Extensions)" tristate "Hash functions: SM3 (ARMv8.2 Crypto Extensions)"
depends on KERNEL_MODE_NEON depends on KERNEL_MODE_NEON
...@@ -220,7 +231,7 @@ config CRYPTO_SM4_ARM64_CE ...@@ -220,7 +231,7 @@ config CRYPTO_SM4_ARM64_CE
- NEON (Advanced SIMD) extensions - NEON (Advanced SIMD) extensions
config CRYPTO_SM4_ARM64_CE_BLK config CRYPTO_SM4_ARM64_CE_BLK
tristate "Ciphers: SM4, modes: ECB/CBC/CFB/CTR (ARMv8 Crypto Extensions)" tristate "Ciphers: SM4, modes: ECB/CBC/CFB/CTR/XTS (ARMv8 Crypto Extensions)"
depends on KERNEL_MODE_NEON depends on KERNEL_MODE_NEON
select CRYPTO_SKCIPHER select CRYPTO_SKCIPHER
select CRYPTO_SM4 select CRYPTO_SM4
...@@ -231,6 +242,8 @@ config CRYPTO_SM4_ARM64_CE_BLK ...@@ -231,6 +242,8 @@ config CRYPTO_SM4_ARM64_CE_BLK
- CBC (Cipher Block Chaining) mode (NIST SP800-38A) - CBC (Cipher Block Chaining) mode (NIST SP800-38A)
- CFB (Cipher Feedback) mode (NIST SP800-38A) - CFB (Cipher Feedback) mode (NIST SP800-38A)
- CTR (Counter) mode (NIST SP800-38A) - CTR (Counter) mode (NIST SP800-38A)
- XTS (XOR Encrypt XOR with ciphertext stealing) mode (NIST SP800-38E
and IEEE 1619)
Architecture: arm64 using: Architecture: arm64 using:
- ARMv8 Crypto Extensions - ARMv8 Crypto Extensions
...@@ -268,6 +281,38 @@ config CRYPTO_AES_ARM64_CE_CCM ...@@ -268,6 +281,38 @@ config CRYPTO_AES_ARM64_CE_CCM
- ARMv8 Crypto Extensions - ARMv8 Crypto Extensions
- NEON (Advanced SIMD) extensions - NEON (Advanced SIMD) extensions
config CRYPTO_SM4_ARM64_CE_CCM
tristate "AEAD cipher: SM4 in CCM mode (ARMv8 Crypto Extensions)"
depends on KERNEL_MODE_NEON
select CRYPTO_ALGAPI
select CRYPTO_AEAD
select CRYPTO_SM4
select CRYPTO_SM4_ARM64_CE_BLK
help
AEAD cipher: SM4 cipher algorithms (OSCCA GB/T 32907-2016) with
CCM (Counter with Cipher Block Chaining-Message Authentication Code)
authenticated encryption mode (NIST SP800-38C)
Architecture: arm64 using:
- ARMv8 Crypto Extensions
- NEON (Advanced SIMD) extensions
config CRYPTO_SM4_ARM64_CE_GCM
tristate "AEAD cipher: SM4 in GCM mode (ARMv8 Crypto Extensions)"
depends on KERNEL_MODE_NEON
select CRYPTO_ALGAPI
select CRYPTO_AEAD
select CRYPTO_SM4
select CRYPTO_SM4_ARM64_CE_BLK
help
AEAD cipher: SM4 cipher algorithms (OSCCA GB/T 32907-2016) with
GCM (Galois/Counter Mode) authenticated encryption mode (NIST SP800-38D)
Architecture: arm64 using:
- ARMv8 Crypto Extensions
- PMULL (Polynomial Multiply Long) instructions
- NEON (Advanced SIMD) extensions
config CRYPTO_CRCT10DIF_ARM64_CE config CRYPTO_CRCT10DIF_ARM64_CE
tristate "CRCT10DIF (PMULL)" tristate "CRCT10DIF (PMULL)"
depends on KERNEL_MODE_NEON && CRC_T10DIF depends on KERNEL_MODE_NEON && CRC_T10DIF
......
...@@ -17,6 +17,9 @@ sha512-ce-y := sha512-ce-glue.o sha512-ce-core.o ...@@ -17,6 +17,9 @@ sha512-ce-y := sha512-ce-glue.o sha512-ce-core.o
obj-$(CONFIG_CRYPTO_SHA3_ARM64) += sha3-ce.o obj-$(CONFIG_CRYPTO_SHA3_ARM64) += sha3-ce.o
sha3-ce-y := sha3-ce-glue.o sha3-ce-core.o sha3-ce-y := sha3-ce-glue.o sha3-ce-core.o
obj-$(CONFIG_CRYPTO_SM3_NEON) += sm3-neon.o
sm3-neon-y := sm3-neon-glue.o sm3-neon-core.o
obj-$(CONFIG_CRYPTO_SM3_ARM64_CE) += sm3-ce.o obj-$(CONFIG_CRYPTO_SM3_ARM64_CE) += sm3-ce.o
sm3-ce-y := sm3-ce-glue.o sm3-ce-core.o sm3-ce-y := sm3-ce-glue.o sm3-ce-core.o
...@@ -26,6 +29,12 @@ sm4-ce-cipher-y := sm4-ce-cipher-glue.o sm4-ce-cipher-core.o ...@@ -26,6 +29,12 @@ sm4-ce-cipher-y := sm4-ce-cipher-glue.o sm4-ce-cipher-core.o
obj-$(CONFIG_CRYPTO_SM4_ARM64_CE_BLK) += sm4-ce.o obj-$(CONFIG_CRYPTO_SM4_ARM64_CE_BLK) += sm4-ce.o
sm4-ce-y := sm4-ce-glue.o sm4-ce-core.o sm4-ce-y := sm4-ce-glue.o sm4-ce-core.o
obj-$(CONFIG_CRYPTO_SM4_ARM64_CE_CCM) += sm4-ce-ccm.o
sm4-ce-ccm-y := sm4-ce-ccm-glue.o sm4-ce-ccm-core.o
obj-$(CONFIG_CRYPTO_SM4_ARM64_CE_GCM) += sm4-ce-gcm.o
sm4-ce-gcm-y := sm4-ce-gcm-glue.o sm4-ce-gcm-core.o
obj-$(CONFIG_CRYPTO_SM4_ARM64_NEON_BLK) += sm4-neon.o obj-$(CONFIG_CRYPTO_SM4_ARM64_NEON_BLK) += sm4-neon.o
sm4-neon-y := sm4-neon-glue.o sm4-neon-core.o sm4-neon-y := sm4-neon-glue.o sm4-neon-core.o
......
...@@ -9,9 +9,9 @@ ...@@ -9,9 +9,9 @@
#include <asm/simd.h> #include <asm/simd.h>
#include <asm/unaligned.h> #include <asm/unaligned.h>
#include <crypto/aes.h> #include <crypto/aes.h>
#include <crypto/algapi.h>
#include <crypto/internal/simd.h> #include <crypto/internal/simd.h>
#include <linux/cpufeature.h> #include <linux/cpufeature.h>
#include <linux/crypto.h>
#include <linux/module.h> #include <linux/module.h>
#include "aes-ce-setkey.h" #include "aes-ce-setkey.h"
......
...@@ -6,7 +6,7 @@ ...@@ -6,7 +6,7 @@
*/ */
#include <crypto/aes.h> #include <crypto/aes.h>
#include <linux/crypto.h> #include <crypto/algapi.h>
#include <linux/module.h> #include <linux/module.h>
asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds); asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
......
...@@ -52,8 +52,7 @@ SYM_FUNC_END(aes_decrypt_block5x) ...@@ -52,8 +52,7 @@ SYM_FUNC_END(aes_decrypt_block5x)
*/ */
AES_FUNC_START(aes_ecb_encrypt) AES_FUNC_START(aes_ecb_encrypt)
stp x29, x30, [sp, #-16]! frame_push 0
mov x29, sp
enc_prepare w3, x2, x5 enc_prepare w3, x2, x5
...@@ -77,14 +76,13 @@ ST5( st1 {v4.16b}, [x0], #16 ) ...@@ -77,14 +76,13 @@ ST5( st1 {v4.16b}, [x0], #16 )
subs w4, w4, #1 subs w4, w4, #1
bne .Lecbencloop bne .Lecbencloop
.Lecbencout: .Lecbencout:
ldp x29, x30, [sp], #16 frame_pop
ret ret
AES_FUNC_END(aes_ecb_encrypt) AES_FUNC_END(aes_ecb_encrypt)
AES_FUNC_START(aes_ecb_decrypt) AES_FUNC_START(aes_ecb_decrypt)
stp x29, x30, [sp, #-16]! frame_push 0
mov x29, sp
dec_prepare w3, x2, x5 dec_prepare w3, x2, x5
...@@ -108,7 +106,7 @@ ST5( st1 {v4.16b}, [x0], #16 ) ...@@ -108,7 +106,7 @@ ST5( st1 {v4.16b}, [x0], #16 )
subs w4, w4, #1 subs w4, w4, #1
bne .Lecbdecloop bne .Lecbdecloop
.Lecbdecout: .Lecbdecout:
ldp x29, x30, [sp], #16 frame_pop
ret ret
AES_FUNC_END(aes_ecb_decrypt) AES_FUNC_END(aes_ecb_decrypt)
...@@ -171,9 +169,6 @@ AES_FUNC_END(aes_cbc_encrypt) ...@@ -171,9 +169,6 @@ AES_FUNC_END(aes_cbc_encrypt)
AES_FUNC_END(aes_essiv_cbc_encrypt) AES_FUNC_END(aes_essiv_cbc_encrypt)
AES_FUNC_START(aes_essiv_cbc_decrypt) AES_FUNC_START(aes_essiv_cbc_decrypt)
stp x29, x30, [sp, #-16]!
mov x29, sp
ld1 {cbciv.16b}, [x5] /* get iv */ ld1 {cbciv.16b}, [x5] /* get iv */
mov w8, #14 /* AES-256: 14 rounds */ mov w8, #14 /* AES-256: 14 rounds */
...@@ -182,11 +177,9 @@ AES_FUNC_START(aes_essiv_cbc_decrypt) ...@@ -182,11 +177,9 @@ AES_FUNC_START(aes_essiv_cbc_decrypt)
b .Lessivcbcdecstart b .Lessivcbcdecstart
AES_FUNC_START(aes_cbc_decrypt) AES_FUNC_START(aes_cbc_decrypt)
stp x29, x30, [sp, #-16]!
mov x29, sp
ld1 {cbciv.16b}, [x5] /* get iv */ ld1 {cbciv.16b}, [x5] /* get iv */
.Lessivcbcdecstart: .Lessivcbcdecstart:
frame_push 0
dec_prepare w3, x2, x6 dec_prepare w3, x2, x6
.LcbcdecloopNx: .LcbcdecloopNx:
...@@ -236,7 +229,7 @@ ST5( st1 {v4.16b}, [x0], #16 ) ...@@ -236,7 +229,7 @@ ST5( st1 {v4.16b}, [x0], #16 )
bne .Lcbcdecloop bne .Lcbcdecloop
.Lcbcdecout: .Lcbcdecout:
st1 {cbciv.16b}, [x5] /* return iv */ st1 {cbciv.16b}, [x5] /* return iv */
ldp x29, x30, [sp], #16 frame_pop
ret ret
AES_FUNC_END(aes_cbc_decrypt) AES_FUNC_END(aes_cbc_decrypt)
AES_FUNC_END(aes_essiv_cbc_decrypt) AES_FUNC_END(aes_essiv_cbc_decrypt)
...@@ -337,8 +330,7 @@ AES_FUNC_END(aes_cbc_cts_decrypt) ...@@ -337,8 +330,7 @@ AES_FUNC_END(aes_cbc_cts_decrypt)
BLOCKS .req x13 BLOCKS .req x13
BLOCKS_W .req w13 BLOCKS_W .req w13
stp x29, x30, [sp, #-16]! frame_push 0
mov x29, sp
enc_prepare ROUNDS_W, KEY, IV_PART enc_prepare ROUNDS_W, KEY, IV_PART
ld1 {vctr.16b}, [IV] ld1 {vctr.16b}, [IV]
...@@ -481,7 +473,7 @@ ST5( st1 {v4.16b}, [OUT], #16 ) ...@@ -481,7 +473,7 @@ ST5( st1 {v4.16b}, [OUT], #16 )
.if !\xctr .if !\xctr
st1 {vctr.16b}, [IV] /* return next CTR value */ st1 {vctr.16b}, [IV] /* return next CTR value */
.endif .endif
ldp x29, x30, [sp], #16 frame_pop
ret ret
.Lctrtail\xctr: .Lctrtail\xctr:
...@@ -645,8 +637,7 @@ AES_FUNC_END(aes_xctr_encrypt) ...@@ -645,8 +637,7 @@ AES_FUNC_END(aes_xctr_encrypt)
.endm .endm
AES_FUNC_START(aes_xts_encrypt) AES_FUNC_START(aes_xts_encrypt)
stp x29, x30, [sp, #-16]! frame_push 0
mov x29, sp
ld1 {v4.16b}, [x6] ld1 {v4.16b}, [x6]
xts_load_mask v8 xts_load_mask v8
...@@ -704,7 +695,7 @@ AES_FUNC_START(aes_xts_encrypt) ...@@ -704,7 +695,7 @@ AES_FUNC_START(aes_xts_encrypt)
st1 {v0.16b}, [x0] st1 {v0.16b}, [x0]
.Lxtsencret: .Lxtsencret:
st1 {v4.16b}, [x6] st1 {v4.16b}, [x6]
ldp x29, x30, [sp], #16 frame_pop
ret ret
.LxtsencctsNx: .LxtsencctsNx:
...@@ -732,8 +723,7 @@ AES_FUNC_START(aes_xts_encrypt) ...@@ -732,8 +723,7 @@ AES_FUNC_START(aes_xts_encrypt)
AES_FUNC_END(aes_xts_encrypt) AES_FUNC_END(aes_xts_encrypt)
AES_FUNC_START(aes_xts_decrypt) AES_FUNC_START(aes_xts_decrypt)
stp x29, x30, [sp, #-16]! frame_push 0
mov x29, sp
/* subtract 16 bytes if we are doing CTS */ /* subtract 16 bytes if we are doing CTS */
sub w8, w4, #0x10 sub w8, w4, #0x10
...@@ -794,7 +784,7 @@ AES_FUNC_START(aes_xts_decrypt) ...@@ -794,7 +784,7 @@ AES_FUNC_START(aes_xts_decrypt)
b .Lxtsdecloop b .Lxtsdecloop
.Lxtsdecout: .Lxtsdecout:
st1 {v4.16b}, [x6] st1 {v4.16b}, [x6]
ldp x29, x30, [sp], #16 frame_pop
ret ret
.Lxtsdeccts: .Lxtsdeccts:
......
...@@ -760,7 +760,7 @@ SYM_FUNC_START_LOCAL(__xts_crypt8) ...@@ -760,7 +760,7 @@ SYM_FUNC_START_LOCAL(__xts_crypt8)
eor v6.16b, v6.16b, v31.16b eor v6.16b, v6.16b, v31.16b
eor v7.16b, v7.16b, v16.16b eor v7.16b, v7.16b, v16.16b
stp q16, q17, [sp, #16] stp q16, q17, [x6]
mov bskey, x2 mov bskey, x2
mov rounds, x3 mov rounds, x3
...@@ -768,8 +768,8 @@ SYM_FUNC_START_LOCAL(__xts_crypt8) ...@@ -768,8 +768,8 @@ SYM_FUNC_START_LOCAL(__xts_crypt8)
SYM_FUNC_END(__xts_crypt8) SYM_FUNC_END(__xts_crypt8)
.macro __xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7 .macro __xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7
stp x29, x30, [sp, #-48]! frame_push 0, 32
mov x29, sp add x6, sp, #.Lframe_local_offset
ld1 {v25.16b}, [x5] ld1 {v25.16b}, [x5]
...@@ -781,7 +781,7 @@ SYM_FUNC_END(__xts_crypt8) ...@@ -781,7 +781,7 @@ SYM_FUNC_END(__xts_crypt8)
eor v18.16b, \o2\().16b, v27.16b eor v18.16b, \o2\().16b, v27.16b
eor v19.16b, \o3\().16b, v28.16b eor v19.16b, \o3\().16b, v28.16b
ldp q24, q25, [sp, #16] ldp q24, q25, [x6]
eor v20.16b, \o4\().16b, v29.16b eor v20.16b, \o4\().16b, v29.16b
eor v21.16b, \o5\().16b, v30.16b eor v21.16b, \o5\().16b, v30.16b
...@@ -795,7 +795,7 @@ SYM_FUNC_END(__xts_crypt8) ...@@ -795,7 +795,7 @@ SYM_FUNC_END(__xts_crypt8)
b.gt 0b b.gt 0b
st1 {v25.16b}, [x5] st1 {v25.16b}, [x5]
ldp x29, x30, [sp], #48 frame_pop
ret ret
.endm .endm
...@@ -820,9 +820,7 @@ SYM_FUNC_END(aesbs_xts_decrypt) ...@@ -820,9 +820,7 @@ SYM_FUNC_END(aesbs_xts_decrypt)
* int rounds, int blocks, u8 iv[]) * int rounds, int blocks, u8 iv[])
*/ */
SYM_FUNC_START(aesbs_ctr_encrypt) SYM_FUNC_START(aesbs_ctr_encrypt)
stp x29, x30, [sp, #-16]! frame_push 0
mov x29, sp
ldp x7, x8, [x5] ldp x7, x8, [x5]
ld1 {v0.16b}, [x5] ld1 {v0.16b}, [x5]
CPU_LE( rev x7, x7 ) CPU_LE( rev x7, x7 )
...@@ -862,6 +860,6 @@ CPU_LE( rev x8, x8 ) ...@@ -862,6 +860,6 @@ CPU_LE( rev x8, x8 )
b.gt 0b b.gt 0b
st1 {v0.16b}, [x5] st1 {v0.16b}, [x5]
ldp x29, x30, [sp], #16 frame_pop
ret ret
SYM_FUNC_END(aesbs_ctr_encrypt) SYM_FUNC_END(aesbs_ctr_encrypt)
...@@ -429,7 +429,7 @@ CPU_LE( ext v0.16b, v0.16b, v0.16b, #8 ) ...@@ -429,7 +429,7 @@ CPU_LE( ext v0.16b, v0.16b, v0.16b, #8 )
umov w0, v0.h[0] umov w0, v0.h[0]
.ifc \p, p8 .ifc \p, p8
ldp x29, x30, [sp], #16 frame_pop
.endif .endif
ret ret
...@@ -466,8 +466,7 @@ CPU_LE( ext v7.16b, v7.16b, v7.16b, #8 ) ...@@ -466,8 +466,7 @@ CPU_LE( ext v7.16b, v7.16b, v7.16b, #8 )
// Assumes len >= 16. // Assumes len >= 16.
// //
SYM_FUNC_START(crc_t10dif_pmull_p8) SYM_FUNC_START(crc_t10dif_pmull_p8)
stp x29, x30, [sp, #-16]! frame_push 1
mov x29, sp
crc_t10dif_pmull p8 crc_t10dif_pmull p8
SYM_FUNC_END(crc_t10dif_pmull_p8) SYM_FUNC_END(crc_t10dif_pmull_p8)
......
...@@ -436,9 +436,7 @@ SYM_FUNC_END(pmull_ghash_update_p8) ...@@ -436,9 +436,7 @@ SYM_FUNC_END(pmull_ghash_update_p8)
.align 6 .align 6
.macro pmull_gcm_do_crypt, enc .macro pmull_gcm_do_crypt, enc
stp x29, x30, [sp, #-32]! frame_push 1
mov x29, sp
str x19, [sp, #24]
load_round_keys x7, x6, x8 load_round_keys x7, x6, x8
...@@ -529,7 +527,7 @@ CPU_LE( rev w8, w8 ) ...@@ -529,7 +527,7 @@ CPU_LE( rev w8, w8 )
.endif .endif
bne 0b bne 0b
3: ldp x19, x10, [sp, #24] 3: ldr x10, [sp, #.Lframe_local_offset]
cbz x10, 5f // output tag? cbz x10, 5f // output tag?
ld1 {INP3.16b}, [x10] // load lengths[] ld1 {INP3.16b}, [x10] // load lengths[]
...@@ -562,7 +560,7 @@ CPU_LE( rev w8, w8 ) ...@@ -562,7 +560,7 @@ CPU_LE( rev w8, w8 )
smov w0, v0.b[0] // return b0 smov w0, v0.b[0] // return b0
.endif .endif
4: ldp x29, x30, [sp], #32 4: frame_pop
ret ret
5: 5:
......
...@@ -508,7 +508,7 @@ static void __exit ghash_ce_mod_exit(void) ...@@ -508,7 +508,7 @@ static void __exit ghash_ce_mod_exit(void)
crypto_unregister_shash(&ghash_alg); crypto_unregister_shash(&ghash_alg);
} }
static const struct cpu_feature ghash_cpu_feature[] = { static const struct cpu_feature __maybe_unused ghash_cpu_feature[] = {
{ cpu_feature(PMULL) }, { } { cpu_feature(PMULL) }, { }
}; };
MODULE_DEVICE_TABLE(cpu, ghash_cpu_feature); MODULE_DEVICE_TABLE(cpu, ghash_cpu_feature);
......
...@@ -8,6 +8,7 @@ ...@@ -8,6 +8,7 @@
*/ */
#include <linux/linkage.h> #include <linux/linkage.h>
#include <linux/cfi_types.h>
KEY .req x0 KEY .req x0
MESSAGE .req x1 MESSAGE .req x1
...@@ -58,11 +59,11 @@ ...@@ -58,11 +59,11 @@
/* /*
* void nh_neon(const u32 *key, const u8 *message, size_t message_len, * void nh_neon(const u32 *key, const u8 *message, size_t message_len,
* u8 hash[NH_HASH_BYTES]) * __le64 hash[NH_NUM_PASSES])
* *
* It's guaranteed that message_len % 16 == 0. * It's guaranteed that message_len % 16 == 0.
*/ */
SYM_FUNC_START(nh_neon) SYM_TYPED_FUNC_START(nh_neon)
ld1 {K0.4s,K1.4s}, [KEY], #32 ld1 {K0.4s,K1.4s}, [KEY], #32
movi PASS0_SUMS.2d, #0 movi PASS0_SUMS.2d, #0
......
...@@ -14,14 +14,7 @@ ...@@ -14,14 +14,7 @@
#include <linux/module.h> #include <linux/module.h>
asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len, asmlinkage void nh_neon(const u32 *key, const u8 *message, size_t message_len,
u8 hash[NH_HASH_BYTES]); __le64 hash[NH_NUM_PASSES]);
/* wrapper to avoid indirect call to assembly, which doesn't work with CFI */
static void _nh_neon(const u32 *key, const u8 *message, size_t message_len,
__le64 hash[NH_NUM_PASSES])
{
nh_neon(key, message, message_len, (u8 *)hash);
}
static int nhpoly1305_neon_update(struct shash_desc *desc, static int nhpoly1305_neon_update(struct shash_desc *desc,
const u8 *src, unsigned int srclen) const u8 *src, unsigned int srclen)
...@@ -33,7 +26,7 @@ static int nhpoly1305_neon_update(struct shash_desc *desc, ...@@ -33,7 +26,7 @@ static int nhpoly1305_neon_update(struct shash_desc *desc,
unsigned int n = min_t(unsigned int, srclen, SZ_4K); unsigned int n = min_t(unsigned int, srclen, SZ_4K);
kernel_neon_begin(); kernel_neon_begin();
crypto_nhpoly1305_update_helper(desc, src, n, _nh_neon); crypto_nhpoly1305_update_helper(desc, src, n, nh_neon);
kernel_neon_end(); kernel_neon_end();
src += n; src += n;
srclen -= n; srclen -= n;
......
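The nhpoly1305 hunks above drop the `_nh_neon` C wrapper and instead give `nh_neon` the same prototype as the callback that the update helper invokes, with the assembly entry point marked SYM_TYPED_FUNC_START. The idea can be modeled in plain C as follows. This is only a user-space sketch: `nh_fn_t`, `nh_model` and `update_helper_model` are hypothetical names, the value of `NH_NUM_PASSES` is assumed, and the arithmetic in `nh_model` is a placeholder rather than the NH hash.

```c
#include <stddef.h>
#include <stdint.h>

#define NH_NUM_PASSES 4	/* assumed value, for illustration only */

/* Hypothetical model of the callback type the update helper expects. */
typedef void (*nh_fn_t)(const uint32_t *key, const uint8_t *message,
			size_t message_len, uint64_t hash[NH_NUM_PASSES]);

/*
 * Stand-in for the assembly routine. Its prototype matches nh_fn_t
 * exactly, so it can be passed as the callback without a wrapper or a
 * cast. The body is placeholder arithmetic, not the real NH function.
 */
void nh_model(const uint32_t *key, const uint8_t *message,
	      size_t message_len, uint64_t hash[NH_NUM_PASSES])
{
	for (size_t i = 0; i < NH_NUM_PASSES; i++)
		hash[i] = key[i] + message_len + message[0];
}

/* Stand-in for the update helper: it calls the routine indirectly. */
void update_helper_model(const uint32_t *key, const uint8_t *message,
			 size_t message_len, uint64_t hash[NH_NUM_PASSES],
			 nh_fn_t nh)
{
	nh(key, message, message_len, hash);
}
```

With the old `u8 hash[NH_HASH_BYTES]` prototype the function type would differ from the callback type, which is why the removed wrapper existed; once the two types agree, the routine can be passed directly.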
...@@ -84,7 +84,7 @@ static struct shash_alg sm3_alg = { ...@@ -84,7 +84,7 @@ static struct shash_alg sm3_alg = {
.base.cra_driver_name = "sm3-ce", .base.cra_driver_name = "sm3-ce",
.base.cra_blocksize = SM3_BLOCK_SIZE, .base.cra_blocksize = SM3_BLOCK_SIZE,
.base.cra_module = THIS_MODULE, .base.cra_module = THIS_MODULE,
.base.cra_priority = 200, .base.cra_priority = 400,
}; };
static int __init sm3_ce_mod_init(void) static int __init sm3_ce_mod_init(void)
......
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* sm3-neon-glue.c - SM3 secure hash using NEON instructions
*
* Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#include <asm/neon.h>
#include <asm/simd.h>
#include <asm/unaligned.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h>
#include <crypto/sm3.h>
#include <crypto/sm3_base.h>
#include <linux/cpufeature.h>
#include <linux/crypto.h>
#include <linux/module.h>
asmlinkage void sm3_neon_transform(struct sm3_state *sst, u8 const *src,
int blocks);
static int sm3_neon_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
if (!crypto_simd_usable()) {
sm3_update(shash_desc_ctx(desc), data, len);
return 0;
}
kernel_neon_begin();
sm3_base_do_update(desc, data, len, sm3_neon_transform);
kernel_neon_end();
return 0;
}
static int sm3_neon_final(struct shash_desc *desc, u8 *out)
{
if (!crypto_simd_usable()) {
sm3_final(shash_desc_ctx(desc), out);
return 0;
}
kernel_neon_begin();
sm3_base_do_finalize(desc, sm3_neon_transform);
kernel_neon_end();
return sm3_base_finish(desc, out);
}
static int sm3_neon_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!crypto_simd_usable()) {
struct sm3_state *sctx = shash_desc_ctx(desc);
if (len)
sm3_update(sctx, data, len);
sm3_final(sctx, out);
return 0;
}
kernel_neon_begin();
if (len)
sm3_base_do_update(desc, data, len, sm3_neon_transform);
sm3_base_do_finalize(desc, sm3_neon_transform);
kernel_neon_end();
return sm3_base_finish(desc, out);
}
static struct shash_alg sm3_alg = {
.digestsize = SM3_DIGEST_SIZE,
.init = sm3_base_init,
.update = sm3_neon_update,
.final = sm3_neon_final,
.finup = sm3_neon_finup,
.descsize = sizeof(struct sm3_state),
.base.cra_name = "sm3",
.base.cra_driver_name = "sm3-neon",
.base.cra_blocksize = SM3_BLOCK_SIZE,
.base.cra_module = THIS_MODULE,
.base.cra_priority = 200,
};
static int __init sm3_neon_init(void)
{
return crypto_register_shash(&sm3_alg);
}
static void __exit sm3_neon_fini(void)
{
crypto_unregister_shash(&sm3_alg);
}
module_init(sm3_neon_init);
module_exit(sm3_neon_fini);
MODULE_DESCRIPTION("SM3 secure hash using NEON instructions");
MODULE_AUTHOR("Jussi Kivilinna <jussi.kivilinna@iki.fi>");
MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>");
MODULE_LICENSE("GPL v2");
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* SM4 helper macros for Crypto Extensions
* Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#define SM4_PREPARE(ptr) \
ld1 {v24.16b-v27.16b}, [ptr], #64; \
ld1 {v28.16b-v31.16b}, [ptr];
#define SM4_CRYPT_BLK_BE(b0) \
sm4e b0.4s, v24.4s; \
sm4e b0.4s, v25.4s; \
sm4e b0.4s, v26.4s; \
sm4e b0.4s, v27.4s; \
sm4e b0.4s, v28.4s; \
sm4e b0.4s, v29.4s; \
sm4e b0.4s, v30.4s; \
sm4e b0.4s, v31.4s; \
rev64 b0.4s, b0.4s; \
ext b0.16b, b0.16b, b0.16b, #8; \
rev32 b0.16b, b0.16b;
#define SM4_CRYPT_BLK(b0) \
rev32 b0.16b, b0.16b; \
SM4_CRYPT_BLK_BE(b0);
#define SM4_CRYPT_BLK2_BE(b0, b1) \
sm4e b0.4s, v24.4s; \
sm4e b1.4s, v24.4s; \
sm4e b0.4s, v25.4s; \
sm4e b1.4s, v25.4s; \
sm4e b0.4s, v26.4s; \
sm4e b1.4s, v26.4s; \
sm4e b0.4s, v27.4s; \
sm4e b1.4s, v27.4s; \
sm4e b0.4s, v28.4s; \
sm4e b1.4s, v28.4s; \
sm4e b0.4s, v29.4s; \
sm4e b1.4s, v29.4s; \
sm4e b0.4s, v30.4s; \
sm4e b1.4s, v30.4s; \
sm4e b0.4s, v31.4s; \
sm4e b1.4s, v31.4s; \
rev64 b0.4s, b0.4s; \
rev64 b1.4s, b1.4s; \
ext b0.16b, b0.16b, b0.16b, #8; \
ext b1.16b, b1.16b, b1.16b, #8; \
rev32 b0.16b, b0.16b; \
rev32 b1.16b, b1.16b; \
#define SM4_CRYPT_BLK2(b0, b1) \
rev32 b0.16b, b0.16b; \
rev32 b1.16b, b1.16b; \
SM4_CRYPT_BLK2_BE(b0, b1);
#define SM4_CRYPT_BLK4_BE(b0, b1, b2, b3) \
sm4e b0.4s, v24.4s; \
sm4e b1.4s, v24.4s; \
sm4e b2.4s, v24.4s; \
sm4e b3.4s, v24.4s; \
sm4e b0.4s, v25.4s; \
sm4e b1.4s, v25.4s; \
sm4e b2.4s, v25.4s; \
sm4e b3.4s, v25.4s; \
sm4e b0.4s, v26.4s; \
sm4e b1.4s, v26.4s; \
sm4e b2.4s, v26.4s; \
sm4e b3.4s, v26.4s; \
sm4e b0.4s, v27.4s; \
sm4e b1.4s, v27.4s; \
sm4e b2.4s, v27.4s; \
sm4e b3.4s, v27.4s; \
sm4e b0.4s, v28.4s; \
sm4e b1.4s, v28.4s; \
sm4e b2.4s, v28.4s; \
sm4e b3.4s, v28.4s; \
sm4e b0.4s, v29.4s; \
sm4e b1.4s, v29.4s; \
sm4e b2.4s, v29.4s; \
sm4e b3.4s, v29.4s; \
sm4e b0.4s, v30.4s; \
sm4e b1.4s, v30.4s; \
sm4e b2.4s, v30.4s; \
sm4e b3.4s, v30.4s; \
sm4e b0.4s, v31.4s; \
sm4e b1.4s, v31.4s; \
sm4e b2.4s, v31.4s; \
sm4e b3.4s, v31.4s; \
rev64 b0.4s, b0.4s; \
rev64 b1.4s, b1.4s; \
rev64 b2.4s, b2.4s; \
rev64 b3.4s, b3.4s; \
ext b0.16b, b0.16b, b0.16b, #8; \
ext b1.16b, b1.16b, b1.16b, #8; \
ext b2.16b, b2.16b, b2.16b, #8; \
ext b3.16b, b3.16b, b3.16b, #8; \
rev32 b0.16b, b0.16b; \
rev32 b1.16b, b1.16b; \
rev32 b2.16b, b2.16b; \
rev32 b3.16b, b3.16b;
#define SM4_CRYPT_BLK4(b0, b1, b2, b3) \
rev32 b0.16b, b0.16b; \
rev32 b1.16b, b1.16b; \
rev32 b2.16b, b2.16b; \
rev32 b3.16b, b3.16b; \
SM4_CRYPT_BLK4_BE(b0, b1, b2, b3);
#define SM4_CRYPT_BLK8_BE(b0, b1, b2, b3, b4, b5, b6, b7) \
sm4e b0.4s, v24.4s; \
sm4e b1.4s, v24.4s; \
sm4e b2.4s, v24.4s; \
sm4e b3.4s, v24.4s; \
sm4e b4.4s, v24.4s; \
sm4e b5.4s, v24.4s; \
sm4e b6.4s, v24.4s; \
sm4e b7.4s, v24.4s; \
sm4e b0.4s, v25.4s; \
sm4e b1.4s, v25.4s; \
sm4e b2.4s, v25.4s; \
sm4e b3.4s, v25.4s; \
sm4e b4.4s, v25.4s; \
sm4e b5.4s, v25.4s; \
sm4e b6.4s, v25.4s; \
sm4e b7.4s, v25.4s; \
sm4e b0.4s, v26.4s; \
sm4e b1.4s, v26.4s; \
sm4e b2.4s, v26.4s; \
sm4e b3.4s, v26.4s; \
sm4e b4.4s, v26.4s; \
sm4e b5.4s, v26.4s; \
sm4e b6.4s, v26.4s; \
sm4e b7.4s, v26.4s; \
sm4e b0.4s, v27.4s; \
sm4e b1.4s, v27.4s; \
sm4e b2.4s, v27.4s; \
sm4e b3.4s, v27.4s; \
sm4e b4.4s, v27.4s; \
sm4e b5.4s, v27.4s; \
sm4e b6.4s, v27.4s; \
sm4e b7.4s, v27.4s; \
sm4e b0.4s, v28.4s; \
sm4e b1.4s, v28.4s; \
sm4e b2.4s, v28.4s; \
sm4e b3.4s, v28.4s; \
sm4e b4.4s, v28.4s; \
sm4e b5.4s, v28.4s; \
sm4e b6.4s, v28.4s; \
sm4e b7.4s, v28.4s; \
sm4e b0.4s, v29.4s; \
sm4e b1.4s, v29.4s; \
sm4e b2.4s, v29.4s; \
sm4e b3.4s, v29.4s; \
sm4e b4.4s, v29.4s; \
sm4e b5.4s, v29.4s; \
sm4e b6.4s, v29.4s; \
sm4e b7.4s, v29.4s; \
sm4e b0.4s, v30.4s; \
sm4e b1.4s, v30.4s; \
sm4e b2.4s, v30.4s; \
sm4e b3.4s, v30.4s; \
sm4e b4.4s, v30.4s; \
sm4e b5.4s, v30.4s; \
sm4e b6.4s, v30.4s; \
sm4e b7.4s, v30.4s; \
sm4e b0.4s, v31.4s; \
sm4e b1.4s, v31.4s; \
sm4e b2.4s, v31.4s; \
sm4e b3.4s, v31.4s; \
sm4e b4.4s, v31.4s; \
sm4e b5.4s, v31.4s; \
sm4e b6.4s, v31.4s; \
sm4e b7.4s, v31.4s; \
rev64 b0.4s, b0.4s; \
rev64 b1.4s, b1.4s; \
rev64 b2.4s, b2.4s; \
rev64 b3.4s, b3.4s; \
rev64 b4.4s, b4.4s; \
rev64 b5.4s, b5.4s; \
rev64 b6.4s, b6.4s; \
rev64 b7.4s, b7.4s; \
ext b0.16b, b0.16b, b0.16b, #8; \
ext b1.16b, b1.16b, b1.16b, #8; \
ext b2.16b, b2.16b, b2.16b, #8; \
ext b3.16b, b3.16b, b3.16b, #8; \
ext b4.16b, b4.16b, b4.16b, #8; \
ext b5.16b, b5.16b, b5.16b, #8; \
ext b6.16b, b6.16b, b6.16b, #8; \
ext b7.16b, b7.16b, b7.16b, #8; \
rev32 b0.16b, b0.16b; \
rev32 b1.16b, b1.16b; \
rev32 b2.16b, b2.16b; \
rev32 b3.16b, b3.16b; \
rev32 b4.16b, b4.16b; \
rev32 b5.16b, b5.16b; \
rev32 b6.16b, b6.16b; \
rev32 b7.16b, b7.16b;
#define SM4_CRYPT_BLK8(b0, b1, b2, b3, b4, b5, b6, b7) \
rev32 b0.16b, b0.16b; \
rev32 b1.16b, b1.16b; \
rev32 b2.16b, b2.16b; \
rev32 b3.16b, b3.16b; \
rev32 b4.16b, b4.16b; \
rev32 b5.16b, b5.16b; \
rev32 b6.16b, b6.16b; \
rev32 b7.16b, b7.16b; \
SM4_CRYPT_BLK8_BE(b0, b1, b2, b3, b4, b5, b6, b7);
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* SM4-CCM AEAD Algorithm using ARMv8 Crypto Extensions
* as specified in rfc8998
* https://datatracker.ietf.org/doc/html/rfc8998
*
* Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#include <linux/linkage.h>
#include <asm/assembler.h>
#include "sm4-ce-asm.h"
.arch armv8-a+crypto
.irp b, 0, 1, 8, 9, 10, 11, 12, 13, 14, 15, 16, 24, 25, 26, 27, 28, 29, 30, 31
.set .Lv\b\().4s, \b
.endr
.macro sm4e, vd, vn
.inst 0xcec08400 | (.L\vn << 5) | .L\vd
.endm
/* Register macros */
#define RMAC v16
/* Helper macros. */
#define inc_le128(vctr) \
mov vctr.d[1], x8; \
mov vctr.d[0], x7; \
adds x8, x8, #1; \
rev64 vctr.16b, vctr.16b; \
adc x7, x7, xzr;
.align 3
SYM_FUNC_START(sm4_ce_cbcmac_update)
/* input:
* x0: round key array, CTX
* x1: mac
* x2: src
* w3: nblocks
*/
SM4_PREPARE(x0)
ld1 {RMAC.16b}, [x1]
.Lcbcmac_loop_4x:
cmp w3, #4
blt .Lcbcmac_loop_1x
sub w3, w3, #4
ld1 {v0.16b-v3.16b}, [x2], #64
SM4_CRYPT_BLK(RMAC)
eor RMAC.16b, RMAC.16b, v0.16b
SM4_CRYPT_BLK(RMAC)
eor RMAC.16b, RMAC.16b, v1.16b
SM4_CRYPT_BLK(RMAC)
eor RMAC.16b, RMAC.16b, v2.16b
SM4_CRYPT_BLK(RMAC)
eor RMAC.16b, RMAC.16b, v3.16b
cbz w3, .Lcbcmac_end
b .Lcbcmac_loop_4x
.Lcbcmac_loop_1x:
sub w3, w3, #1
ld1 {v0.16b}, [x2], #16
SM4_CRYPT_BLK(RMAC)
eor RMAC.16b, RMAC.16b, v0.16b
cbnz w3, .Lcbcmac_loop_1x
.Lcbcmac_end:
st1 {RMAC.16b}, [x1]
ret
SYM_FUNC_END(sm4_ce_cbcmac_update)
.align 3
SYM_FUNC_START(sm4_ce_ccm_final)
/* input:
* x0: round key array, CTX
* x1: ctr0 (big endian, 128 bit)
* x2: mac
*/
SM4_PREPARE(x0)
ld1 {RMAC.16b}, [x2]
ld1 {v0.16b}, [x1]
SM4_CRYPT_BLK2(RMAC, v0)
/* en-/decrypt the mac with ctr0 */
eor RMAC.16b, RMAC.16b, v0.16b
st1 {RMAC.16b}, [x2]
ret
SYM_FUNC_END(sm4_ce_ccm_final)
.align 3
SYM_FUNC_START(sm4_ce_ccm_enc)
/* input:
* x0: round key array, CTX
* x1: dst
* x2: src
* x3: ctr (big endian, 128 bit)
* w4: nbytes
* x5: mac
*/
SM4_PREPARE(x0)
ldp x7, x8, [x3]
rev x7, x7
rev x8, x8
ld1 {RMAC.16b}, [x5]
.Lccm_enc_loop_4x:
cmp w4, #(4 * 16)
blt .Lccm_enc_loop_1x
sub w4, w4, #(4 * 16)
/* construct CTRs */
inc_le128(v8) /* +0 */
inc_le128(v9) /* +1 */
inc_le128(v10) /* +2 */
inc_le128(v11) /* +3 */
ld1 {v0.16b-v3.16b}, [x2], #64
SM4_CRYPT_BLK2(v8, RMAC)
eor v8.16b, v8.16b, v0.16b
eor RMAC.16b, RMAC.16b, v0.16b
SM4_CRYPT_BLK2(v9, RMAC)
eor v9.16b, v9.16b, v1.16b
eor RMAC.16b, RMAC.16b, v1.16b
SM4_CRYPT_BLK2(v10, RMAC)
eor v10.16b, v10.16b, v2.16b
eor RMAC.16b, RMAC.16b, v2.16b
SM4_CRYPT_BLK2(v11, RMAC)
eor v11.16b, v11.16b, v3.16b
eor RMAC.16b, RMAC.16b, v3.16b
st1 {v8.16b-v11.16b}, [x1], #64
cbz w4, .Lccm_enc_end
b .Lccm_enc_loop_4x
.Lccm_enc_loop_1x:
cmp w4, #16
blt .Lccm_enc_tail
sub w4, w4, #16
/* construct CTRs */
inc_le128(v8)
ld1 {v0.16b}, [x2], #16
SM4_CRYPT_BLK2(v8, RMAC)
eor v8.16b, v8.16b, v0.16b
eor RMAC.16b, RMAC.16b, v0.16b
st1 {v8.16b}, [x1], #16
cbz w4, .Lccm_enc_end
b .Lccm_enc_loop_1x
.Lccm_enc_tail:
/* construct CTRs */
inc_le128(v8)
SM4_CRYPT_BLK2(RMAC, v8)
/* store new MAC */
st1 {RMAC.16b}, [x5]
.Lccm_enc_tail_loop:
ldrb w0, [x2], #1 /* get 1 byte from input */
umov w9, v8.b[0] /* get top crypted CTR byte */
umov w6, RMAC.b[0] /* get top MAC byte */
eor w9, w9, w0 /* w9 = CTR ^ input */
eor w6, w6, w0 /* w6 = MAC ^ input */
strb w9, [x1], #1 /* store out byte */
strb w6, [x5], #1 /* store MAC byte */
subs w4, w4, #1
beq .Lccm_enc_ret
/* shift out one byte */
ext RMAC.16b, RMAC.16b, RMAC.16b, #1
ext v8.16b, v8.16b, v8.16b, #1
b .Lccm_enc_tail_loop
.Lccm_enc_end:
/* store new MAC */
st1 {RMAC.16b}, [x5]
/* store new CTR */
rev x7, x7
rev x8, x8
stp x7, x8, [x3]
.Lccm_enc_ret:
ret
SYM_FUNC_END(sm4_ce_ccm_enc)
.align 3
SYM_FUNC_START(sm4_ce_ccm_dec)
/* input:
* x0: round key array, CTX
* x1: dst
* x2: src
* x3: ctr (big endian, 128 bit)
* w4: nbytes
* x5: mac
*/
SM4_PREPARE(x0)
ldp x7, x8, [x3]
rev x7, x7
rev x8, x8
ld1 {RMAC.16b}, [x5]
.Lccm_dec_loop_4x:
cmp w4, #(4 * 16)
blt .Lccm_dec_loop_1x
sub w4, w4, #(4 * 16)
/* construct CTRs */
inc_le128(v8) /* +0 */
inc_le128(v9) /* +1 */
inc_le128(v10) /* +2 */
inc_le128(v11) /* +3 */
ld1 {v0.16b-v3.16b}, [x2], #64
SM4_CRYPT_BLK2(v8, RMAC)
eor v8.16b, v8.16b, v0.16b
eor RMAC.16b, RMAC.16b, v8.16b
SM4_CRYPT_BLK2(v9, RMAC)
eor v9.16b, v9.16b, v1.16b
eor RMAC.16b, RMAC.16b, v9.16b
SM4_CRYPT_BLK2(v10, RMAC)
eor v10.16b, v10.16b, v2.16b
eor RMAC.16b, RMAC.16b, v10.16b
SM4_CRYPT_BLK2(v11, RMAC)
eor v11.16b, v11.16b, v3.16b
eor RMAC.16b, RMAC.16b, v11.16b
st1 {v8.16b-v11.16b}, [x1], #64
cbz w4, .Lccm_dec_end
b .Lccm_dec_loop_4x
.Lccm_dec_loop_1x:
cmp w4, #16
blt .Lccm_dec_tail
sub w4, w4, #16
/* construct CTRs */
inc_le128(v8)
ld1 {v0.16b}, [x2], #16
SM4_CRYPT_BLK2(v8, RMAC)
eor v8.16b, v8.16b, v0.16b
eor RMAC.16b, RMAC.16b, v8.16b
st1 {v8.16b}, [x1], #16
cbz w4, .Lccm_dec_end
b .Lccm_dec_loop_1x
.Lccm_dec_tail:
/* construct CTRs */
inc_le128(v8)
SM4_CRYPT_BLK2(RMAC, v8)
/* store new MAC */
st1 {RMAC.16b}, [x5]
.Lccm_dec_tail_loop:
ldrb w0, [x2], #1 /* get 1 byte from input */
umov w9, v8.b[0] /* get top crypted CTR byte */
umov w6, RMAC.b[0] /* get top MAC byte */
eor w9, w9, w0 /* w9 = CTR ^ input */
eor w6, w6, w9 /* w6 = MAC ^ output */
strb w9, [x1], #1 /* store out byte */
strb w6, [x5], #1 /* store MAC byte */
subs w4, w4, #1
beq .Lccm_dec_ret
/* shift out one byte */
ext RMAC.16b, RMAC.16b, RMAC.16b, #1
ext v8.16b, v8.16b, v8.16b, #1
b .Lccm_dec_tail_loop
.Lccm_dec_end:
/* store new MAC */
st1 {RMAC.16b}, [x5]
/* store new CTR */
rev x7, x7
rev x8, x8
stp x7, x8, [x3]
.Lccm_dec_ret:
ret
SYM_FUNC_END(sm4_ce_ccm_dec)
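The inc_le128 helper used by sm4_ce_ccm_enc and sm4_ce_ccm_dec keeps the 128-bit counter in two general-purpose registers (x7 for the high half, x8 for the low half), emits the current value as a big-endian block, and then increments it with carry. A minimal userspace C sketch of that behaviour follows; the struct and function names are illustrative assumptions, not kernel symbols.

```c
#include <stdint.h>

/* Hypothetical stand-in for the x7 (hi) / x8 (lo) register pair. */
struct ctr128 {
	uint64_t hi;
	uint64_t lo;
};

/*
 * Write the current counter as 16 big-endian bytes, then increment it.
 * This mirrors the order in inc_le128: the vector is filled from the
 * current register values first, and adds/adc update them afterwards.
 */
static void ctr128_emit_and_inc(struct ctr128 *c, uint8_t out[16])
{
	for (int i = 0; i < 8; i++) {
		out[i]     = (uint8_t)(c->hi >> (56 - 8 * i));
		out[8 + i] = (uint8_t)(c->lo >> (56 - 8 * i));
	}

	c->lo += 1;		/* adds x8, x8, #1 */
	if (c->lo == 0)		/* carry out of the low half */
		c->hi += 1;	/* adc x7, x7, xzr */
}
```

Calling the function four times in a row yields the +0, +1, +2 and +3 counter blocks that the 4x loops build before encrypting them.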
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* SM4-CCM AEAD Algorithm using ARMv8 Crypto Extensions
* as specified in rfc8998
* https://datatracker.ietf.org/doc/html/rfc8998
*
* Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#include <linux/module.h>
#include <linux/crypto.h>
#include <linux/kernel.h>
#include <linux/cpufeature.h>
#include <asm/neon.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
#include <crypto/sm4.h>
#include "sm4-ce.h"
asmlinkage void sm4_ce_cbcmac_update(const u32 *rkey_enc, u8 *mac,
const u8 *src, unsigned int nblocks);
asmlinkage void sm4_ce_ccm_enc(const u32 *rkey_enc, u8 *dst, const u8 *src,
u8 *iv, unsigned int nbytes, u8 *mac);
asmlinkage void sm4_ce_ccm_dec(const u32 *rkey_enc, u8 *dst, const u8 *src,
u8 *iv, unsigned int nbytes, u8 *mac);
asmlinkage void sm4_ce_ccm_final(const u32 *rkey_enc, u8 *iv, u8 *mac);
static int ccm_setkey(struct crypto_aead *tfm, const u8 *key,
unsigned int key_len)
{
struct sm4_ctx *ctx = crypto_aead_ctx(tfm);
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
kernel_neon_begin();
sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
kernel_neon_end();
return 0;
}
static int ccm_setauthsize(struct crypto_aead *tfm, unsigned int authsize)
{
if ((authsize & 1) || authsize < 4)
return -EINVAL;
return 0;
}
static int ccm_format_input(u8 info[], struct aead_request *req,
unsigned int msglen)
{
struct crypto_aead *aead = crypto_aead_reqtfm(req);
unsigned int l = req->iv[0] + 1;
unsigned int m;
__be32 len;
/* verify that CCM dimension 'L': 2 <= L <= 8 */
if (l < 2 || l > 8)
return -EINVAL;
if (l < 4 && msglen >> (8 * l))
return -EOVERFLOW;
memset(&req->iv[SM4_BLOCK_SIZE - l], 0, l);
memcpy(info, req->iv, SM4_BLOCK_SIZE);
m = crypto_aead_authsize(aead);
/* format flags field per RFC 3610/NIST 800-38C */
*info |= ((m - 2) / 2) << 3;
if (req->assoclen)
*info |= (1 << 6);
/*
* format message length field,
* Linux uses a u32 type to represent msglen
*/
if (l >= 4)
l = 4;
len = cpu_to_be32(msglen);
memcpy(&info[SM4_BLOCK_SIZE - l], (u8 *)&len + 4 - l, l);
return 0;
}
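The block formatting in ccm_format_input() can be easier to follow outside the kernel. The sketch below reproduces the same steps (validate L, clear the counter field, set the flags byte, store the message length big-endian) in plain C. The function name, the -1 error return and the have_aad parameter are assumptions made for illustration, and unlike the kernel code it does not modify the caller's IV in place.

```c
#include <stdint.h>
#include <string.h>

#define CCM_BLOCK_SIZE 16

/*
 * Build the first CCM block from a 16-byte IV whose first byte holds L - 1.
 * authsize is the tag length in bytes; msglen is the payload length.
 * Returns 0 on success, -1 if L is out of range or msglen does not fit.
 */
static int ccm_format_b0(uint8_t b0[CCM_BLOCK_SIZE],
			 const uint8_t iv[CCM_BLOCK_SIZE],
			 unsigned int authsize, int have_aad, uint32_t msglen)
{
	unsigned int l = iv[0] + 1u;

	if (l < 2 || l > 8)
		return -1;
	if (l < 4 && (msglen >> (8 * l)))
		return -1;

	/* Copy flags and nonce, then clear the L-byte length/counter field. */
	memcpy(b0, iv, CCM_BLOCK_SIZE);
	memset(&b0[CCM_BLOCK_SIZE - l], 0, l);

	/* Flags: bits 3..5 encode (M - 2) / 2, bit 6 marks associated data. */
	b0[0] |= (uint8_t)(((authsize - 2) / 2) << 3);
	if (have_aad)
		b0[0] |= 1u << 6;

	/* msglen is 32 bits wide, so at most the last 4 bytes are written. */
	if (l > 4)
		l = 4;
	for (unsigned int i = 0; i < l; i++)
		b0[CCM_BLOCK_SIZE - 1 - i] = (uint8_t)(msglen >> (8 * i));

	return 0;
}
```

For example, with L = 4, a 16-byte tag and associated data present, the flags byte becomes 0x7b (3 | 7 << 3 | 1 << 6).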
static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
{
struct crypto_aead *aead = crypto_aead_reqtfm(req);
struct sm4_ctx *ctx = crypto_aead_ctx(aead);
struct __packed { __be16 l; __be32 h; } aadlen;
u32 assoclen = req->assoclen;
struct scatter_walk walk;
unsigned int len;
if (assoclen < 0xff00) {
aadlen.l = cpu_to_be16(assoclen);
len = 2;
} else {
aadlen.l = cpu_to_be16(0xfffe);
put_unaligned_be32(assoclen, &aadlen.h);
len = 6;
}
sm4_ce_crypt_block(ctx->rkey_enc, mac, mac);
crypto_xor(mac, (const u8 *)&aadlen, len);
scatterwalk_start(&walk, req->src);
do {
u32 n = scatterwalk_clamp(&walk, assoclen);
u8 *p, *ptr;
if (!n) {
scatterwalk_start(&walk, sg_next(walk.sg));
n = scatterwalk_clamp(&walk, assoclen);
}
p = ptr = scatterwalk_map(&walk);
assoclen -= n;
scatterwalk_advance(&walk, n);
while (n > 0) {
unsigned int l, nblocks;
if (len == SM4_BLOCK_SIZE) {
if (n < SM4_BLOCK_SIZE) {
sm4_ce_crypt_block(ctx->rkey_enc,
mac, mac);
len = 0;
} else {
nblocks = n / SM4_BLOCK_SIZE;
sm4_ce_cbcmac_update(ctx->rkey_enc,
mac, ptr, nblocks);
ptr += nblocks * SM4_BLOCK_SIZE;
n %= SM4_BLOCK_SIZE;
continue;
}
}
l = min(n, SM4_BLOCK_SIZE - len);
if (l) {
crypto_xor(mac + len, ptr, l);
len += l;
ptr += l;
n -= l;
}
}
scatterwalk_unmap(p);
scatterwalk_done(&walk, 0, assoclen);
} while (assoclen);
}
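ccm_calculate_auth_mac() starts by XORing an encoded associated-data length into the MAC: two bytes when the length is below 0xff00, otherwise the marker 0xfffe followed by a 32-bit length. A small standalone sketch of that encoding is shown here; the helper name is an assumption for illustration and does not exist in the kernel.

```c
#include <stdint.h>

/*
 * Encode the associated-data length the way the packed
 * { __be16 l; __be32 h; } struct above is filled in.
 * Returns the number of bytes written to out (2 or 6).
 */
static unsigned int ccm_encode_aad_len(uint8_t out[6], uint32_t assoclen)
{
	if (assoclen < 0xff00) {
		/* Short form: 16-bit big-endian length. */
		out[0] = (uint8_t)(assoclen >> 8);
		out[1] = (uint8_t)assoclen;
		return 2;
	}

	/* Long form: 0xfffe marker, then 32-bit big-endian length. */
	out[0] = 0xff;
	out[1] = 0xfe;
	out[2] = (uint8_t)(assoclen >> 24);
	out[3] = (uint8_t)(assoclen >> 16);
	out[4] = (uint8_t)(assoclen >> 8);
	out[5] = (uint8_t)assoclen;
	return 6;
}
```

The returned count plays the role of the initial len value in the function above: it is the offset within the 16-byte MAC block at which the first associated-data bytes are XORed in.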
static int ccm_crypt(struct aead_request *req, struct skcipher_walk *walk,
u32 *rkey_enc, u8 mac[],
void (*sm4_ce_ccm_crypt)(const u32 *rkey_enc, u8 *dst,
const u8 *src, u8 *iv,
unsigned int nbytes, u8 *mac))
{
u8 __aligned(8) ctr0[SM4_BLOCK_SIZE];
int err;
/* preserve the initial ctr0 for the TAG */
memcpy(ctr0, walk->iv, SM4_BLOCK_SIZE);
crypto_inc(walk->iv, SM4_BLOCK_SIZE);
kernel_neon_begin();
if (req->assoclen)
ccm_calculate_auth_mac(req, mac);
do {
unsigned int tail = walk->nbytes % SM4_BLOCK_SIZE;
const u8 *src = walk->src.virt.addr;
u8 *dst = walk->dst.virt.addr;
if (walk->nbytes == walk->total)
tail = 0;
if (walk->nbytes - tail)
sm4_ce_ccm_crypt(rkey_enc, dst, src, walk->iv,
walk->nbytes - tail, mac);
if (walk->nbytes == walk->total)
sm4_ce_ccm_final(rkey_enc, ctr0, mac);
kernel_neon_end();
if (walk->nbytes) {
err = skcipher_walk_done(walk, tail);
if (err)
return err;
if (walk->nbytes)
kernel_neon_begin();
}
} while (walk->nbytes > 0);
return 0;
}
static int ccm_encrypt(struct aead_request *req)
{
struct crypto_aead *aead = crypto_aead_reqtfm(req);
struct sm4_ctx *ctx = crypto_aead_ctx(aead);
u8 __aligned(8) mac[SM4_BLOCK_SIZE];
struct skcipher_walk walk;
int err;
err = ccm_format_input(mac, req, req->cryptlen);
if (err)
return err;
err = skcipher_walk_aead_encrypt(&walk, req, false);
if (err)
return err;
err = ccm_crypt(req, &walk, ctx->rkey_enc, mac, sm4_ce_ccm_enc);
if (err)
return err;
/* copy authtag to end of dst */
scatterwalk_map_and_copy(mac, req->dst, req->assoclen + req->cryptlen,
crypto_aead_authsize(aead), 1);
return 0;
}
static int ccm_decrypt(struct aead_request *req)
{
struct crypto_aead *aead = crypto_aead_reqtfm(req);
unsigned int authsize = crypto_aead_authsize(aead);
struct sm4_ctx *ctx = crypto_aead_ctx(aead);
u8 __aligned(8) mac[SM4_BLOCK_SIZE];
u8 authtag[SM4_BLOCK_SIZE];
struct skcipher_walk walk;
int err;
err = ccm_format_input(mac, req, req->cryptlen - authsize);
if (err)
return err;
err = skcipher_walk_aead_decrypt(&walk, req, false);
if (err)
return err;
err = ccm_crypt(req, &walk, ctx->rkey_enc, mac, sm4_ce_ccm_dec);
if (err)
return err;
/* compare calculated auth tag with the stored one */
scatterwalk_map_and_copy(authtag, req->src,
req->assoclen + req->cryptlen - authsize,
authsize, 0);
if (crypto_memneq(authtag, mac, authsize))
return -EBADMSG;
return 0;
}
static struct aead_alg sm4_ccm_alg = {
.base = {
.cra_name = "ccm(sm4)",
.cra_driver_name = "ccm-sm4-ce",
.cra_priority = 400,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct sm4_ctx),
.cra_module = THIS_MODULE,
},
.ivsize = SM4_BLOCK_SIZE,
.chunksize = SM4_BLOCK_SIZE,
.maxauthsize = SM4_BLOCK_SIZE,
.setkey = ccm_setkey,
.setauthsize = ccm_setauthsize,
.encrypt = ccm_encrypt,
.decrypt = ccm_decrypt,
};
static int __init sm4_ce_ccm_init(void)
{
return crypto_register_aead(&sm4_ccm_alg);
}
static void __exit sm4_ce_ccm_exit(void)
{
crypto_unregister_aead(&sm4_ccm_alg);
}
module_cpu_feature_match(SM4, sm4_ce_ccm_init);
module_exit(sm4_ce_ccm_exit);
MODULE_DESCRIPTION("Synchronous SM4 in CCM mode using ARMv8 Crypto Extensions");
MODULE_ALIAS_CRYPTO("ccm(sm4)");
MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>");
MODULE_LICENSE("GPL v2");
@@ -2,11 +2,11 @@
 #include <asm/neon.h>
 #include <asm/simd.h>
+#include <crypto/algapi.h>
 #include <crypto/sm4.h>
 #include <crypto/internal/simd.h>
 #include <linux/module.h>
 #include <linux/cpufeature.h>
-#include <linux/crypto.h>
 #include <linux/types.h>
 MODULE_ALIAS_CRYPTO("sm4");
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* SM4-GCM AEAD Algorithm using ARMv8 Crypto Extensions
* as specified in rfc8998
* https://datatracker.ietf.org/doc/html/rfc8998
*
* Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#include <linux/module.h>
#include <linux/crypto.h>
#include <linux/kernel.h>
#include <linux/cpufeature.h>
#include <asm/neon.h>
#include <crypto/b128ops.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/aead.h>
#include <crypto/internal/skcipher.h>
#include <crypto/sm4.h>
#include "sm4-ce.h"
asmlinkage void sm4_ce_pmull_ghash_setup(const u32 *rkey_enc, u8 *ghash_table);
asmlinkage void pmull_ghash_update(const u8 *ghash_table, u8 *ghash,
const u8 *src, unsigned int nblocks);
asmlinkage void sm4_ce_pmull_gcm_enc(const u32 *rkey_enc, u8 *dst,
const u8 *src, u8 *iv,
unsigned int nbytes, u8 *ghash,
const u8 *ghash_table, const u8 *lengths);
asmlinkage void sm4_ce_pmull_gcm_dec(const u32 *rkey_enc, u8 *dst,
const u8 *src, u8 *iv,
unsigned int nbytes, u8 *ghash,
const u8 *ghash_table, const u8 *lengths);
#define GHASH_BLOCK_SIZE 16
#define GCM_IV_SIZE 12
struct sm4_gcm_ctx {
struct sm4_ctx key;
u8 ghash_table[16 * 4];
};
static int gcm_setkey(struct crypto_aead *tfm, const u8 *key,
unsigned int key_len)
{
struct sm4_gcm_ctx *ctx = crypto_aead_ctx(tfm);
if (key_len != SM4_KEY_SIZE)
return -EINVAL;
kernel_neon_begin();
sm4_ce_expand_key(key, ctx->key.rkey_enc, ctx->key.rkey_dec,
crypto_sm4_fk, crypto_sm4_ck);
sm4_ce_pmull_ghash_setup(ctx->key.rkey_enc, ctx->ghash_table);
kernel_neon_end();
return 0;
}
static int gcm_setauthsize(struct crypto_aead *tfm, unsigned int authsize)
{
switch (authsize) {
case 4:
case 8:
case 12 ... 16:
return 0;
default:
return -EINVAL;
}
}
static void gcm_calculate_auth_mac(struct aead_request *req, u8 ghash[])
{
struct crypto_aead *aead = crypto_aead_reqtfm(req);
struct sm4_gcm_ctx *ctx = crypto_aead_ctx(aead);
u8 __aligned(8) buffer[GHASH_BLOCK_SIZE];
u32 assoclen = req->assoclen;
struct scatter_walk walk;
unsigned int buflen = 0;
scatterwalk_start(&walk, req->src);
do {
u32 n = scatterwalk_clamp(&walk, assoclen);
u8 *p, *ptr;
if (!n) {
scatterwalk_start(&walk, sg_next(walk.sg));
n = scatterwalk_clamp(&walk, assoclen);
}
p = ptr = scatterwalk_map(&walk);
assoclen -= n;
scatterwalk_advance(&walk, n);
if (n + buflen < GHASH_BLOCK_SIZE) {
memcpy(&buffer[buflen], ptr, n);
buflen += n;
} else {
unsigned int nblocks;
if (buflen) {
unsigned int l = GHASH_BLOCK_SIZE - buflen;
memcpy(&buffer[buflen], ptr, l);
ptr += l;
n -= l;
pmull_ghash_update(ctx->ghash_table, ghash,
buffer, 1);
}
nblocks = n / GHASH_BLOCK_SIZE;
if (nblocks) {
pmull_ghash_update(ctx->ghash_table, ghash,
ptr, nblocks);
ptr += nblocks * GHASH_BLOCK_SIZE;
}
buflen = n % GHASH_BLOCK_SIZE;
if (buflen)
memcpy(&buffer[0], ptr, buflen);
}
scatterwalk_unmap(p);
scatterwalk_done(&walk, 0, assoclen);
} while (assoclen);
/* padding with '0' */
if (buflen) {
memset(&buffer[buflen], 0, GHASH_BLOCK_SIZE - buflen);
pmull_ghash_update(ctx->ghash_table, ghash, buffer, 1);
}
}
static int gcm_crypt(struct aead_request *req, struct skcipher_walk *walk,
struct sm4_gcm_ctx *ctx, u8 ghash[],
void (*sm4_ce_pmull_gcm_crypt)(const u32 *rkey_enc,
u8 *dst, const u8 *src, u8 *iv,
unsigned int nbytes, u8 *ghash,
const u8 *ghash_table, const u8 *lengths))
{
u8 __aligned(8) iv[SM4_BLOCK_SIZE];
be128 __aligned(8) lengths;
int err;
memset(ghash, 0, SM4_BLOCK_SIZE);
lengths.a = cpu_to_be64(req->assoclen * 8);
lengths.b = cpu_to_be64(walk->total * 8);
memcpy(iv, walk->iv, GCM_IV_SIZE);
put_unaligned_be32(2, iv + GCM_IV_SIZE);
kernel_neon_begin();
if (req->assoclen)
gcm_calculate_auth_mac(req, ghash);
do {
unsigned int tail = walk->nbytes % SM4_BLOCK_SIZE;
const u8 *src = walk->src.virt.addr;
u8 *dst = walk->dst.virt.addr;
if (walk->nbytes == walk->total) {
tail = 0;
sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, dst, src, iv,
walk->nbytes, ghash,
ctx->ghash_table,
(const u8 *)&lengths);
} else if (walk->nbytes - tail) {
sm4_ce_pmull_gcm_crypt(ctx->key.rkey_enc, dst, src, iv,
walk->nbytes - tail, ghash,
ctx->ghash_table, NULL);
}
kernel_neon_end();
err = skcipher_walk_done(walk, tail);
if (err)
return err;
if (walk->nbytes)
kernel_neon_begin();
} while (walk->nbytes > 0);
return 0;
}
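Before entering the walk loop, gcm_crypt() prepares two 16-byte blocks: the counter block (the 12-byte IV followed by a 32-bit big-endian counter set to 2) and the lengths block (associated-data and text lengths in bits, both 64-bit big-endian). The sketch below shows only that setup in userspace C. The helper names are assumptions for illustration, and the lengths are taken as 64-bit values here for simplicity.

```c
#include <stdint.h>
#include <string.h>

#define GCM_IV_LEN	12
#define GCM_BLOCK_LEN	16

/* Store a 32-bit value big-endian (userspace stand-in for put_unaligned_be32). */
static void store_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Store a 64-bit value big-endian. */
static void store_be64(uint8_t *p, uint64_t v)
{
	store_be32(p, (uint32_t)(v >> 32));
	store_be32(p + 4, (uint32_t)v);
}

/*
 * Fill the counter block passed as iv and the lengths block passed as
 * lengths to the assembly routine. Lengths are given in bytes and
 * stored in bits.
 */
static void gcm_prepare_blocks(uint8_t ctr[GCM_BLOCK_LEN],
			       uint8_t lengths[GCM_BLOCK_LEN],
			       const uint8_t iv[GCM_IV_LEN],
			       uint64_t assoclen, uint64_t textlen)
{
	memcpy(ctr, iv, GCM_IV_LEN);
	store_be32(ctr + GCM_IV_LEN, 2);

	store_be64(lengths, assoclen * 8);
	store_be64(lengths + 8, textlen * 8);
}
```

In the glue code the lengths block is handed to the assembly routine only on the final chunk (when walk->nbytes equals walk->total); earlier chunks pass NULL instead.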
static int gcm_encrypt(struct aead_request *req)
{
struct crypto_aead *aead = crypto_aead_reqtfm(req);
struct sm4_gcm_ctx *ctx = crypto_aead_ctx(aead);
u8 __aligned(8) ghash[SM4_BLOCK_SIZE];
struct skcipher_walk walk;
int err;
err = skcipher_walk_aead_encrypt(&walk, req, false);
if (err)
return err;
err = gcm_crypt(req, &walk, ctx, ghash, sm4_ce_pmull_gcm_enc);
if (err)
return err;
/* copy authtag to end of dst */
scatterwalk_map_and_copy(ghash, req->dst, req->assoclen + req->cryptlen,
crypto_aead_authsize(aead), 1);
return 0;
}
static int gcm_decrypt(struct aead_request *req)
{
struct crypto_aead *aead = crypto_aead_reqtfm(req);
unsigned int authsize = crypto_aead_authsize(aead);
struct sm4_gcm_ctx *ctx = crypto_aead_ctx(aead);
u8 __aligned(8) ghash[SM4_BLOCK_SIZE];
u8 authtag[SM4_BLOCK_SIZE];
struct skcipher_walk walk;
int err;
err = skcipher_walk_aead_decrypt(&walk, req, false);
if (err)
return err;
err = gcm_crypt(req, &walk, ctx, ghash, sm4_ce_pmull_gcm_dec);
if (err)
return err;
/* compare calculated auth tag with the stored one */
scatterwalk_map_and_copy(authtag, req->src,
req->assoclen + req->cryptlen - authsize,
authsize, 0);
if (crypto_memneq(authtag, ghash, authsize))
return -EBADMSG;
return 0;
}
static struct aead_alg sm4_gcm_alg = {
.base = {
.cra_name = "gcm(sm4)",
.cra_driver_name = "gcm-sm4-ce",
.cra_priority = 400,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct sm4_gcm_ctx),
.cra_module = THIS_MODULE,
},
.ivsize = GCM_IV_SIZE,
.chunksize = SM4_BLOCK_SIZE,
.maxauthsize = SM4_BLOCK_SIZE,
.setkey = gcm_setkey,
.setauthsize = gcm_setauthsize,
.encrypt = gcm_encrypt,
.decrypt = gcm_decrypt,
};
static int __init sm4_ce_gcm_init(void)
{
if (!cpu_have_named_feature(PMULL))
return -ENODEV;
return crypto_register_aead(&sm4_gcm_alg);
}
static void __exit sm4_ce_gcm_exit(void)
{
crypto_unregister_aead(&sm4_gcm_alg);
}
static const struct cpu_feature __maybe_unused sm4_ce_gcm_cpu_feature[] = {
{ cpu_feature(PMULL) },
{}
};
MODULE_DEVICE_TABLE(cpu, sm4_ce_gcm_cpu_feature);
module_cpu_feature_match(SM4, sm4_ce_gcm_init);
module_exit(sm4_ce_gcm_exit);
MODULE_DESCRIPTION("Synchronous SM4 in GCM mode using ARMv8 Crypto Extensions");
MODULE_ALIAS_CRYPTO("gcm(sm4)");
MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>");
MODULE_LICENSE("GPL v2");
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* SM4 common functions for Crypto Extensions
* Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
void sm4_ce_expand_key(const u8 *key, u32 *rkey_enc, u32 *rkey_dec,
const u32 *fk, const u32 *ck);
void sm4_ce_crypt_block(const u32 *rkey, u8 *dst, const u8 *src);
void sm4_ce_cbc_enc(const u32 *rkey_enc, u8 *dst, const u8 *src,
u8 *iv, unsigned int nblocks);
void sm4_ce_cfb_enc(const u32 *rkey_enc, u8 *dst, const u8 *src,
u8 *iv, unsigned int nblocks);
@@ -18,19 +18,14 @@
 #include <crypto/internal/skcipher.h>
 #include <crypto/sm4.h>
-#define BYTES2BLKS(nbytes)	((nbytes) >> 4)
-#define BYTES2BLK8(nbytes)	(((nbytes) >> 4) & ~(8 - 1))
-asmlinkage void sm4_neon_crypt_blk1_8(const u32 *rkey, u8 *dst, const u8 *src,
-				      unsigned int nblks);
-asmlinkage void sm4_neon_crypt_blk8(const u32 *rkey, u8 *dst, const u8 *src,
-				    unsigned int nblks);
-asmlinkage void sm4_neon_cbc_dec_blk8(const u32 *rkey, u8 *dst, const u8 *src,
-				      u8 *iv, unsigned int nblks);
-asmlinkage void sm4_neon_cfb_dec_blk8(const u32 *rkey, u8 *dst, const u8 *src,
-				      u8 *iv, unsigned int nblks);
-asmlinkage void sm4_neon_ctr_enc_blk8(const u32 *rkey, u8 *dst, const u8 *src,
-				      u8 *iv, unsigned int nblks);
+asmlinkage void sm4_neon_crypt(const u32 *rkey, u8 *dst, const u8 *src,
+			       unsigned int nblocks);
+asmlinkage void sm4_neon_cbc_dec(const u32 *rkey_dec, u8 *dst, const u8 *src,
+				 u8 *iv, unsigned int nblocks);
+asmlinkage void sm4_neon_cfb_dec(const u32 *rkey_enc, u8 *dst, const u8 *src,
+				 u8 *iv, unsigned int nblocks);
+asmlinkage void sm4_neon_ctr_crypt(const u32 *rkey_enc, u8 *dst, const u8 *src,
+				   u8 *iv, unsigned int nblocks);
 static int sm4_setkey(struct crypto_skcipher *tfm, const u8 *key,
 		      unsigned int key_len)
@@ -51,27 +46,18 @@ static int sm4_ecb_do_crypt(struct skcipher_request *req, const u32 *rkey)
 	while ((nbytes = walk.nbytes) > 0) {
 		const u8 *src = walk.src.virt.addr;
 		u8 *dst = walk.dst.virt.addr;
-		unsigned int nblks;
+		unsigned int nblocks;
-		kernel_neon_begin();
+		nblocks = nbytes / SM4_BLOCK_SIZE;
+		if (nblocks) {
+			kernel_neon_begin();
-		nblks = BYTES2BLK8(nbytes);
-		if (nblks) {
-			sm4_neon_crypt_blk8(rkey, dst, src, nblks);
-			dst += nblks * SM4_BLOCK_SIZE;
-			src += nblks * SM4_BLOCK_SIZE;
-			nbytes -= nblks * SM4_BLOCK_SIZE;
-		}
+			sm4_neon_crypt(rkey, dst, src, nblocks);
-		nblks = BYTES2BLKS(nbytes);
-		if (nblks) {
-			sm4_neon_crypt_blk1_8(rkey, dst, src, nblks);
-			nbytes -= nblks * SM4_BLOCK_SIZE;
+			kernel_neon_end();
 		}
-		kernel_neon_end();
-		err = skcipher_walk_done(&walk, nbytes);
+		err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE);
 	}
 	return err;
@@ -138,48 +124,19 @@ static int sm4_cbc_decrypt(struct skcipher_request *req)
 	while ((nbytes = walk.nbytes) > 0) {
 		const u8 *src = walk.src.virt.addr;
 		u8 *dst = walk.dst.virt.addr;
-		unsigned int nblks;
+		unsigned int nblocks;
-		kernel_neon_begin();
+		nblocks = nbytes / SM4_BLOCK_SIZE;
+		if (nblocks) {
+			kernel_neon_begin();
-		nblks = BYTES2BLK8(nbytes);
-		if (nblks) {
-			sm4_neon_cbc_dec_blk8(ctx->rkey_dec, dst, src,
-					      walk.iv, nblks);
-			dst += nblks * SM4_BLOCK_SIZE;
-			src += nblks * SM4_BLOCK_SIZE;
-			nbytes -= nblks * SM4_BLOCK_SIZE;
-		}
+			sm4_neon_cbc_dec(ctx->rkey_dec, dst, src,
+					 walk.iv, nblocks);
-		nblks = BYTES2BLKS(nbytes);
-		if (nblks) {
-			u8 keystream[SM4_BLOCK_SIZE * 8];
-			u8 iv[SM4_BLOCK_SIZE];
-			int i;
-			sm4_neon_crypt_blk1_8(ctx->rkey_dec, keystream,
-					      src, nblks);
-			src += ((int)nblks - 2) * SM4_BLOCK_SIZE;
-			dst += (nblks - 1) * SM4_BLOCK_SIZE;
-			memcpy(iv, src + SM4_BLOCK_SIZE, SM4_BLOCK_SIZE);
-			for (i = nblks - 1; i > 0; i--) {
-				crypto_xor_cpy(dst, src,
-					&keystream[i * SM4_BLOCK_SIZE],
-					SM4_BLOCK_SIZE);
-				src -= SM4_BLOCK_SIZE;
-				dst -= SM4_BLOCK_SIZE;
-			}
-			crypto_xor_cpy(dst, walk.iv,
-					keystream, SM4_BLOCK_SIZE);
-			memcpy(walk.iv, iv, SM4_BLOCK_SIZE);
-			nbytes -= nblks * SM4_BLOCK_SIZE;
+			kernel_neon_end();
 		}
-		kernel_neon_end();
-		err = skcipher_walk_done(&walk, nbytes);
+		err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE);
 	}
 	return err;
@@ -238,41 +195,21 @@ static int sm4_cfb_decrypt(struct skcipher_request *req)
 	while ((nbytes = walk.nbytes) > 0) {
 		const u8 *src = walk.src.virt.addr;
 		u8 *dst = walk.dst.virt.addr;
-		unsigned int nblks;
+		unsigned int nblocks;
-		kernel_neon_begin();
+		nblocks = nbytes / SM4_BLOCK_SIZE;
+		if (nblocks) {
+			kernel_neon_begin();
-		nblks = BYTES2BLK8(nbytes);
-		if (nblks) {
-			sm4_neon_cfb_dec_blk8(ctx->rkey_enc, dst, src,
-					      walk.iv, nblks);
-			dst += nblks * SM4_BLOCK_SIZE;
-			src += nblks * SM4_BLOCK_SIZE;
-			nbytes -= nblks * SM4_BLOCK_SIZE;
-		}
+			sm4_neon_cfb_dec(ctx->rkey_enc, dst, src,
+					 walk.iv, nblocks);
-		nblks = BYTES2BLKS(nbytes);
-		if (nblks) {
-			u8 keystream[SM4_BLOCK_SIZE * 8];
-			memcpy(keystream, walk.iv, SM4_BLOCK_SIZE);
-			if (nblks > 1)
-				memcpy(&keystream[SM4_BLOCK_SIZE], src,
-				       (nblks - 1) * SM4_BLOCK_SIZE);
-			memcpy(walk.iv, src + (nblks - 1) * SM4_BLOCK_SIZE,
-			       SM4_BLOCK_SIZE);
-			sm4_neon_crypt_blk1_8(ctx->rkey_enc, keystream,
-					      keystream, nblks);
-			crypto_xor_cpy(dst, src, keystream,
-				       nblks * SM4_BLOCK_SIZE);
-			dst += nblks * SM4_BLOCK_SIZE;
-			src += nblks * SM4_BLOCK_SIZE;
-			nbytes -= nblks * SM4_BLOCK_SIZE;
-		}
+			kernel_neon_end();
-		kernel_neon_end();
+			dst += nblocks * SM4_BLOCK_SIZE;
+			src += nblocks * SM4_BLOCK_SIZE;
+			nbytes -= nblocks * SM4_BLOCK_SIZE;
+		}
 		/* tail */
 		if (walk.nbytes == walk.total && nbytes > 0) {
@@ -302,40 +239,21 @@ static int sm4_ctr_crypt(struct skcipher_request *req)
 	while ((nbytes = walk.nbytes) > 0) {
 		const u8 *src = walk.src.virt.addr;
 		u8 *dst = walk.dst.virt.addr;
-		unsigned int nblks;
+		unsigned int nblocks;
-		kernel_neon_begin();
+		nblocks = nbytes / SM4_BLOCK_SIZE;
+		if (nblocks) {
+			kernel_neon_begin();
-		nblks = BYTES2BLK8(nbytes);
-		if (nblks) {
-			sm4_neon_ctr_enc_blk8(ctx->rkey_enc, dst, src,
-					      walk.iv, nblks);
-			dst += nblks * SM4_BLOCK_SIZE;
-			src += nblks * SM4_BLOCK_SIZE;
-			nbytes -= nblks * SM4_BLOCK_SIZE;
-		}
+			sm4_neon_ctr_crypt(ctx->rkey_enc, dst, src,
+					   walk.iv, nblocks);
-		nblks = BYTES2BLKS(nbytes);
-		if (nblks) {
-			u8 keystream[SM4_BLOCK_SIZE * 8];
-			int i;
-			for (i = 0; i < nblks; i++) {
-				memcpy(&keystream[i * SM4_BLOCK_SIZE],
-				       walk.iv, SM4_BLOCK_SIZE);
-				crypto_inc(walk.iv, SM4_BLOCK_SIZE);
-			}
-			sm4_neon_crypt_blk1_8(ctx->rkey_enc, keystream,
-					      keystream, nblks);
-			crypto_xor_cpy(dst, src, keystream,
-				       nblks * SM4_BLOCK_SIZE);
-			dst += nblks * SM4_BLOCK_SIZE;
-			src += nblks * SM4_BLOCK_SIZE;
-			nbytes -= nblks * SM4_BLOCK_SIZE;
-		}
+			kernel_neon_end();
-		kernel_neon_end();
+			dst += nblocks * SM4_BLOCK_SIZE;
+			src += nblocks * SM4_BLOCK_SIZE;
+			nbytes -= nblocks * SM4_BLOCK_SIZE;
+		}
 		/* tail */
 		if (walk.nbytes == walk.total && nbytes > 0) {
@@ -82,7 +82,6 @@ static int __init rng_init (void)
 	sigio_broken(random_fd);
 	hwrng.name = RNG_MODULE_NAME;
 	hwrng.read = rng_dev_read;
-	hwrng.quality = 1024;
 	err = hwrng_register(&hwrng);
 	if (err) {
@@ -107,3 +107,6 @@ quiet_cmd_perlasm = PERLASM $@
       cmd_perlasm = $(PERL) $< > $@
 $(obj)/%.S: $(src)/%.pl FORCE
 	$(call if_changed,perlasm)
+# Disable GCOV in odd or sensitive code
+GCOV_PROFILE_curve25519-x86_64.o := n
@@ -7,6 +7,7 @@
  */
 #include <linux/linkage.h>
+#include <linux/cfi_types.h>
 #include <asm/frame.h>
 #define STATE0 %xmm0
@@ -402,7 +403,7 @@ SYM_FUNC_END(crypto_aegis128_aesni_ad)
  * void crypto_aegis128_aesni_enc(void *state, unsigned int length,
  *                                const void *src, void *dst);
  */
-SYM_FUNC_START(crypto_aegis128_aesni_enc)
+SYM_TYPED_FUNC_START(crypto_aegis128_aesni_enc)
 	FRAME_BEGIN
 	cmp $0x10, LEN
@@ -499,7 +500,7 @@ SYM_FUNC_END(crypto_aegis128_aesni_enc)
  * void crypto_aegis128_aesni_enc_tail(void *state, unsigned int length,
  *                                     const void *src, void *dst);
  */
-SYM_FUNC_START(crypto_aegis128_aesni_enc_tail)
+SYM_TYPED_FUNC_START(crypto_aegis128_aesni_enc_tail)
 	FRAME_BEGIN
 	/* load the state: */
@@ -556,7 +557,7 @@ SYM_FUNC_END(crypto_aegis128_aesni_enc_tail)
  * void crypto_aegis128_aesni_dec(void *state, unsigned int length,
  *                                const void *src, void *dst);
  */
-SYM_FUNC_START(crypto_aegis128_aesni_dec)
+SYM_TYPED_FUNC_START(crypto_aegis128_aesni_dec)
 	FRAME_BEGIN
 	cmp $0x10, LEN
@@ -653,7 +654,7 @@ SYM_FUNC_END(crypto_aegis128_aesni_dec)
  * void crypto_aegis128_aesni_dec_tail(void *state, unsigned int length,
  *                                     const void *src, void *dst);
  */
-SYM_FUNC_START(crypto_aegis128_aesni_dec_tail)
+SYM_TYPED_FUNC_START(crypto_aegis128_aesni_dec_tail)
 	FRAME_BEGIN
 	/* load the state: */
......
@@ -7,6 +7,7 @@
 */
 #include <linux/linkage.h>
+#include <linux/cfi_types.h>
 #include <asm/frame.h>
 /* struct aria_ctx: */
@@ -913,7 +914,7 @@ SYM_FUNC_START_LOCAL(__aria_aesni_avx_crypt_16way)
 RET;
 SYM_FUNC_END(__aria_aesni_avx_crypt_16way)
-SYM_FUNC_START(aria_aesni_avx_encrypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_encrypt_16way)
 /* input:
 * %rdi: ctx, CTX
 * %rsi: dst
@@ -938,7 +939,7 @@ SYM_FUNC_START(aria_aesni_avx_encrypt_16way)
 RET;
 SYM_FUNC_END(aria_aesni_avx_encrypt_16way)
-SYM_FUNC_START(aria_aesni_avx_decrypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_decrypt_16way)
 /* input:
 * %rdi: ctx, CTX
 * %rsi: dst
@@ -1039,7 +1040,7 @@ SYM_FUNC_START_LOCAL(__aria_aesni_avx_ctr_gen_keystream_16way)
 RET;
 SYM_FUNC_END(__aria_aesni_avx_ctr_gen_keystream_16way)
-SYM_FUNC_START(aria_aesni_avx_ctr_crypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_ctr_crypt_16way)
 /* input:
 * %rdi: ctx
 * %rsi: dst
@@ -1208,7 +1209,7 @@ SYM_FUNC_START_LOCAL(__aria_aesni_avx_gfni_crypt_16way)
 RET;
 SYM_FUNC_END(__aria_aesni_avx_gfni_crypt_16way)
-SYM_FUNC_START(aria_aesni_avx_gfni_encrypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_gfni_encrypt_16way)
 /* input:
 * %rdi: ctx, CTX
 * %rsi: dst
@@ -1233,7 +1234,7 @@ SYM_FUNC_START(aria_aesni_avx_gfni_encrypt_16way)
 RET;
 SYM_FUNC_END(aria_aesni_avx_gfni_encrypt_16way)
-SYM_FUNC_START(aria_aesni_avx_gfni_decrypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_gfni_decrypt_16way)
 /* input:
 * %rdi: ctx, CTX
 * %rsi: dst
@@ -1258,7 +1259,7 @@ SYM_FUNC_START(aria_aesni_avx_gfni_decrypt_16way)
 RET;
 SYM_FUNC_END(aria_aesni_avx_gfni_decrypt_16way)
-SYM_FUNC_START(aria_aesni_avx_gfni_ctr_crypt_16way)
+SYM_TYPED_FUNC_START(aria_aesni_avx_gfni_ctr_crypt_16way)
 /* input:
 * %rdi: ctx
 * %rsi: dst
......
@@ -8,6 +8,7 @@
 */
 #include <linux/linkage.h>
+#include <linux/cfi_types.h>
 #define PASS0_SUMS %ymm0
 #define PASS1_SUMS %ymm1
@@ -65,11 +66,11 @@
 /*
 * void nh_avx2(const u32 *key, const u8 *message, size_t message_len,
-* u8 hash[NH_HASH_BYTES])
+* __le64 hash[NH_NUM_PASSES])
 *
 * It's guaranteed that message_len % 16 == 0.
 */
-SYM_FUNC_START(nh_avx2)
+SYM_TYPED_FUNC_START(nh_avx2)
 vmovdqu 0x00(KEY), K0
 vmovdqu 0x10(KEY), K1
......
@@ -54,6 +54,7 @@
 */
 #include <linux/linkage.h>
+#include <linux/cfi_types.h>
 #define DIGEST_PTR %rdi /* 1st arg */
 #define DATA_PTR %rsi /* 2nd arg */
@@ -93,7 +94,7 @@
 */
 .text
 .align 32
-SYM_FUNC_START(sha1_ni_transform)
+SYM_TYPED_FUNC_START(sha1_ni_transform)
 push %rbp
 mov %rsp, %rbp
 sub $FRAME_SIZE, %rsp
......