1. 27 Jan, 2023 8 commits
  2. 20 Jan, 2023 14 commits
  3. 18 Jan, 2023 1 commit
    • crypto: p10-aes-gcm - Revert implementation · 596f674d
      Herbert Xu authored
      Revert the changes that added p10-aes-gcm:
      
      	0781bbd7 ("crypto: p10-aes-gcm - A perl script to process PowerPC assembler source")
      	41a6437a ("crypto: p10-aes-gcm - Supporting functions for ghash")
      	3b47ecca ("crypto: p10-aes-gcm - Supporting functions for AES")
      	ca68a96c ("crypto: p10-aes-gcm - An accelerated AES/GCM stitched implementation")
      	cc40379b ("crypto: p10-aes-gcm - Glue code for AES/GCM stitched implementation")
      	3c657e86 ("crypto: p10-aes-gcm - Update Kconfig and Makefile")
      
      These changes fail to build in many configurations and are not ready
      for prime time.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  4. 13 Jan, 2023 13 commits
  5. 06 Jan, 2023 4 commits
    • crypto: x86/aria - implement aria-avx512 · c970d420
      Taehee Yoo authored
      The aria-avx512 implementation uses AVX512 and GFNI and processes
      64 blocks in parallel, so the byteslicing code is changed to support
      64-way parallelism.
      This patch also exports some aria-avx2 functions, such as encrypt()
      and decrypt().
      
      AVX and AVX2 have 16 registers, which is not enough to hold the
      state, so they must store it to memory and load it back.
      AVX512 provides 32 registers, so no store/load is needed in the
      s-box layer and that overhead goes away.
      The code also becomes much simpler.
      
      Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:
      
      ARIA-AVX512 (128-bit and 256-bit)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx512) encryption
      tcrypt: 1 operation in 1504 cycles (1024 bytes)
      tcrypt: 1 operation in 4595 cycles (4096 bytes)
      tcrypt: 1 operation in 1763 cycles (1024 bytes)
      tcrypt: 1 operation in 5540 cycles (4096 bytes)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx512) decryption
      tcrypt: 1 operation in 1502 cycles (1024 bytes)
      tcrypt: 1 operation in 4615 cycles (4096 bytes)
      tcrypt: 1 operation in 1759 cycles (1024 bytes)
      tcrypt: 1 operation in 5554 cycles (4096 bytes)
      
      ARIA-AVX2 with GFNI (128-bit and 256-bit)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
      tcrypt: 1 operation in 2003 cycles (1024 bytes)
      tcrypt: 1 operation in 5867 cycles (4096 bytes)
      tcrypt: 1 operation in 2358 cycles (1024 bytes)
      tcrypt: 1 operation in 7295 cycles (4096 bytes)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
      tcrypt: 1 operation in 2004 cycles (1024 bytes)
      tcrypt: 1 operation in 5956 cycles (4096 bytes)
      tcrypt: 1 operation in 2409 cycles (1024 bytes)
      tcrypt: 1 operation in 7564 cycles (4096 bytes)
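The raw cycle counts above are easier to compare once normalized. The following is a minimal sketch, not part of the patch, that turns two of the logged rows into cycles per byte and a relative speedup. The struct and function names are hypothetical; only the numbers (1504 and 2003 cycles for 1024-byte buffers, the first encryption row of each run) come from the log.

```c
#include <assert.h>
#include <stddef.h>

/* One tcrypt result: cycles spent on one operation over `bytes` bytes. */
struct tcrypt_sample {
	unsigned long cycles;
	size_t bytes;
};

/* Cycles spent per byte processed; lower is better. */
static double cycles_per_byte(struct tcrypt_sample s)
{
	assert(s.bytes > 0);
	return (double)s.cycles / (double)s.bytes;
}

/* How many times faster `fast` is than `slow` for the same buffer size. */
static double speedup(struct tcrypt_sample slow, struct tcrypt_sample fast)
{
	assert(slow.bytes == fast.bytes && fast.cycles > 0);
	return (double)slow.cycles / (double)fast.cycles;
}

/* First encryption row of each run in the log (1024-byte buffers). */
static const struct tcrypt_sample avx512_1k = { 1504, 1024 };
static const struct tcrypt_sample avx2_1k   = { 2003, 1024 };
```

For these two rows, `cycles_per_byte(avx512_1k)` is about 1.47 and `speedup(avx2_1k, avx512_1k)` is about 1.33.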
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: x86/aria - implement aria-avx2 · 37d8d3ae
      Taehee Yoo authored
      The aria-avx2 implementation uses AVX2, AES-NI, and GFNI and
      processes 32 blocks in parallel, so the byteslicing code is changed
      to support 32-way parallelism.
      This patch also exports some aria-avx functions, such as encrypt()
      and decrypt().
      
      There are two main parts, the s-box layer and the diffusion layer.
      Their logic is the same as in the aria-avx implementation, but some
      instructions are replaced because they do not support 256-bit
      registers.
      AES-NI does not support 256-bit registers either, so aesenclast and
      aesdeclast are each issued twice, once per 128-bit half, as below:
      	vextracti128 $1, ymm0, xmm6;
      	vaesenclast xmm7, xmm0, xmm0;
      	vaesenclast xmm7, xmm6, xmm6;
      	vinserti128 $1, xmm6, ymm0, ymm0;
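The same split can be sketched with C intrinsics. This is an illustration only, not the kernel code: the function names are hypothetical, and the sketch assumes GCC or Clang on x86 (it relies on the `target` function attribute and `__builtin_cpu_supports`, which are compiler extensions). It applies the 128-bit AESENCLAST to each half of a 256-bit value and checks the result against two independent 128-bit operations.

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Reference: AESENCLAST on a single 16-byte block (128-bit only). */
__attribute__((target("aes,sse2")))
static void aesenclast_1block(uint8_t out[16], const uint8_t in[16],
			      const uint8_t key[16])
{
	__m128i k = _mm_loadu_si128((const __m128i *)key);
	__m128i s = _mm_loadu_si128((const __m128i *)in);

	_mm_storeu_si128((__m128i *)out, _mm_aesenclast_si128(s, k));
}

/* Two blocks held in one 256-bit register, processed half by half. */
__attribute__((target("avx2,aes")))
static void aesenclast_2blocks(uint8_t out[32], const uint8_t in[32],
			       const uint8_t key[16])
{
	__m128i k = _mm_loadu_si128((const __m128i *)key);
	__m256i state = _mm256_loadu_si256((const __m256i *)in);

	__m128i lo = _mm256_castsi256_si128(state);      /* low half (xmm0) */
	__m128i hi = _mm256_extracti128_si256(state, 1); /* vextracti128 $1 */

	lo = _mm_aesenclast_si128(lo, k);                /* vaesenclast */
	hi = _mm_aesenclast_si128(hi, k);                /* vaesenclast */

	/* vinserti128 $1: place the high half back above the low half. */
	state = _mm256_inserti128_si256(_mm256_castsi128_si256(lo), hi, 1);
	_mm256_storeu_si256((__m256i *)out, state);
}

/*
 * Returns 1 if the CPU lacks AVX2 or AES-NI (nothing to check), or if
 * the 256-bit path matches two independent 128-bit operations.
 */
static int split_matches_reference(void)
{
	uint8_t in[32], key[16], wide[32], ref[32];

	if (!__builtin_cpu_supports("avx2") || !__builtin_cpu_supports("aes"))
		return 1;

	for (int i = 0; i < 32; i++)
		in[i] = (uint8_t)(i * 7 + 1);
	for (int i = 0; i < 16; i++)
		key[i] = (uint8_t)(0xa5 ^ i);

	aesenclast_2blocks(wide, in, key);
	aesenclast_1block(ref, in, key);
	aesenclast_1block(ref + 16, in + 16, key);
	return memcmp(wide, ref, sizeof(wide)) == 0;
}
```

Calling `split_matches_reference()` on a machine with AVX2 and AES-NI exercises both paths; on other x86 machines it returns early without running either.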
      
      Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:
      
      ARIA-AVX2 with GFNI (128-bit and 256-bit)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
      tcrypt: 1 operation in 2003 cycles (1024 bytes)
      tcrypt: 1 operation in 5867 cycles (4096 bytes)
      tcrypt: 1 operation in 2358 cycles (1024 bytes)
      tcrypt: 1 operation in 7295 cycles (4096 bytes)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
      tcrypt: 1 operation in 2004 cycles (1024 bytes)
      tcrypt: 1 operation in 5956 cycles (4096 bytes)
      tcrypt: 1 operation in 2409 cycles (1024 bytes)
      tcrypt: 1 operation in 7564 cycles (4096 bytes)
      
      ARIA-AVX with GFNI (128-bit and 256-bit)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx) encryption
      tcrypt: 1 operation in 2761 cycles (1024 bytes)
      tcrypt: 1 operation in 9390 cycles (4096 bytes)
      tcrypt: 1 operation in 3401 cycles (1024 bytes)
      tcrypt: 1 operation in 11876 cycles (4096 bytes)
          testing speed of multibuffer ecb(aria) (ecb-aria-avx) decryption
      tcrypt: 1 operation in 2735 cycles (1024 bytes)
      tcrypt: 1 operation in 9424 cycles (4096 bytes)
      tcrypt: 1 operation in 3369 cycles (1024 bytes)
      tcrypt: 1 operation in 11954 cycles (4096 bytes)
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: x86/aria - do not use magic number offsets of aria_ctx · 35344cf3
      Taehee Yoo authored
      The aria-avx assembly code accesses members of aria_ctx using
      magic-number offsets. If the layout of struct aria_ctx changes,
      aria-avx will stop working.
      Members of aria_ctx must therefore be accessed with correct,
      generated offset values, not with magic numbers.
      
      This patch adds ARIA_CTX_enc_key, ARIA_CTX_dec_key, and
      ARIA_CTX_rounds to asm-offsets.c so that correct offset definitions
      are generated.
      The aria-avx assembly code can then access members of aria_ctx
      safely through these definitions.
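The idea of deriving offsets from the struct instead of hard-coding them can be sketched in plain C with `offsetof`. This is a userspace illustration, not the kernel's asm-offsets machinery: the struct layout below is an assumption made for the example and may not match the real `struct aria_ctx`, and `emit_offset()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed layout, for illustration only; the real struct may differ. */
#define ARIA_MAX_RD_KEYS  17
#define ARIA_RD_KEY_WORDS 4

struct aria_ctx {
	uint32_t enc_key[ARIA_MAX_RD_KEYS][ARIA_RD_KEY_WORDS];
	uint32_t dec_key[ARIA_MAX_RD_KEYS][ARIA_RD_KEY_WORDS];
	int rounds;
	int key_length;
};

/* Offsets computed by the compiler, never typed in by hand. */
enum {
	ARIA_CTX_enc_key = offsetof(struct aria_ctx, enc_key),
	ARIA_CTX_dec_key = offsetof(struct aria_ctx, dec_key),
	ARIA_CTX_rounds  = offsetof(struct aria_ctx, rounds),
};

/*
 * Format one "#define NAME value" line, as a generated header might
 * contain. Returns the number of characters written, or -1 if the
 * buffer is too small.
 */
static int emit_offset(char *buf, size_t len, const char *name, size_t off)
{
	int n;

	assert(buf && name);
	n = snprintf(buf, len, "#define %s %zu", name, off);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

With this assumed layout, reordering or resizing a member changes the generated constants automatically, whereas a hand-written offset in assembly would keep pointing at the old location.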
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: x86/aria - add keystream array into request ctx · 8e7d7ce2
      Taehee Yoo authored
      The AVX-accelerated aria module used a local keystream array, but
      that array is too large for a local variable.
      So, move the keystream array into the request ctx.
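A userspace sketch of the pattern follows. It is not the kernel API: `struct request`, `request_alloc()`, `request_ctx()`, and the batch width of 16 blocks are all hypothetical and chosen for illustration. The point it shows is that the keystream buffer lives in memory allocated with the request, so the function that fills it needs no large local array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE      16   /* ARIA block size in bytes */
#define PARALLEL_BLOCKS 16   /* assumed batch width, for illustration */

/* Per-request scratch space: holds what used to be a local array. */
struct req_ctx {
	uint8_t keystream[PARALLEL_BLOCKS * BLOCK_SIZE];
};

/* Hypothetical request object with a trailing context area. */
struct request {
	size_t ctx_size;
	unsigned char ctx[];   /* sized when the request is allocated */
};

static struct request *request_alloc(size_t ctx_size)
{
	struct request *req = calloc(1, sizeof(*req) + ctx_size);

	if (req)
		req->ctx_size = ctx_size;
	return req;
}

static struct req_ctx *request_ctx(struct request *req)
{
	assert(req && req->ctx_size >= sizeof(struct req_ctx));
	return (struct req_ctx *)req->ctx;
}

/*
 * Write consecutive big-endian counter blocks into the request's
 * keystream buffer. A real CTR mode would then encrypt these blocks.
 */
static void fill_counter_blocks(struct request *req, uint64_t counter)
{
	struct req_ctx *rctx = request_ctx(req);

	for (size_t b = 0; b < PARALLEL_BLOCKS; b++, counter++) {
		uint8_t *blk = &rctx->keystream[b * BLOCK_SIZE];

		for (size_t i = 0; i < BLOCK_SIZE; i++)
			blk[i] = 0;
		for (size_t i = 0; i < 8; i++)
			blk[BLOCK_SIZE - 1 - i] = (uint8_t)(counter >> (8 * i));
	}
}
```

A caller would allocate with `request_alloc(sizeof(struct req_ctx))`, call `fill_counter_blocks()`, and release the request with `free()` when done.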
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>