linux/arch/x86/crypto
Taehee Yoo 37d8d3ae7a crypto: x86/aria - implement aria-avx2
aria-avx2 implementation uses AVX2, AES-NI, and GFNI.
It supports 32way parallel processing.
So, byteslicing code is changed to support 32way parallel.
And it exports some aria-avx functions such as encrypt() and decrypt().

There are two main logics, s-box layer and diffusion layer.
These codes are the same as aria-avx implementation.
But some instruction are exchanged because they don't support 256bit
registers.
Also, AES-NI doesn't support 256bit register.
So, aesenclast and aesdeclast are used twice like below:
	vextracti128 $1, ymm0, xmm6;
	vaesenclast xmm7, xmm0, xmm0;
	vaesenclast xmm7, xmm6, xmm6;
	vinserti128 $1, xmm6, ymm0, ymm0;

Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:

ARIA-AVX2 with GFNI(128bit and 256bit)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
tcrypt: 1 operation in 2003 cycles (1024 bytes)
tcrypt: 1 operation in 5867 cycles (4096 bytes)
tcrypt: 1 operation in 2358 cycles (1024 bytes)
tcrypt: 1 operation in 7295 cycles (4096 bytes)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
tcrypt: 1 operation in 2004 cycles (1024 bytes)
tcrypt: 1 operation in 5956 cycles (4096 bytes)
tcrypt: 1 operation in 2409 cycles (1024 bytes)
tcrypt: 1 operation in 7564 cycles (4096 bytes)

ARIA-AVX with GFNI(128bit and 256bit)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx) encryption
tcrypt: 1 operation in 2761 cycles (1024 bytes)
tcrypt: 1 operation in 9390 cycles (4096 bytes)
tcrypt: 1 operation in 3401 cycles (1024 bytes)
tcrypt: 1 operation in 11876 cycles (4096 bytes)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx) decryption
tcrypt: 1 operation in 2735 cycles (1024 bytes)
tcrypt: 1 operation in 9424 cycles (4096 bytes)
tcrypt: 1 operation in 3369 cycles (1024 bytes)
tcrypt: 1 operation in 11954 cycles (4096 bytes)

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
..
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
aegis128-aesni-asm.S crypto: x86/aegis128 - fix possible crash with CFI enabled 2022-11-25 17:39:18 +08:00
aegis128-aesni-glue.c crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN 2020-01-09 11:30:53 +08:00
aes_ctrby8_avx-x86_64.S crypto: x86/aesni-xctr - Add accelerated implementation of XCTR 2022-06-10 16:40:17 +08:00
aesni-intel_asm.S x86: clean up symbol aliasing 2022-02-22 16:21:34 +00:00
aesni-intel_avx-x86_64.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
aesni-intel_glue.c crypto: x86/aesni-xctr - Add accelerated implementation of XCTR 2022-06-10 16:40:17 +08:00
aria_aesni_avx2_glue.c crypto: x86/aria - implement aria-avx2 2023-01-06 17:15:47 +08:00
aria_aesni_avx_glue.c crypto: x86/aria - implement aria-avx2 2023-01-06 17:15:47 +08:00
aria-aesni-avx2-asm_64.S crypto: x86/aria - implement aria-avx2 2023-01-06 17:15:47 +08:00
aria-aesni-avx-asm_64.S crypto: x86/aria - do not use magic number offsets of aria_ctx 2023-01-06 17:15:47 +08:00
aria-avx.h crypto: x86/aria - implement aria-avx2 2023-01-06 17:15:47 +08:00
blake2s-core.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
blake2s-glue.c crypto: blake2s - remove shash module 2022-06-10 16:43:49 +08:00
blowfish_glue.c crypto: x86/blowfish - remove redundant assignment to variable nytes 2022-07-15 16:43:22 +08:00
blowfish-x86_64-asm_64.S x86: Add types to indirectly called assembly functions 2022-09-26 10:13:15 -07:00
camellia_aesni_avx2_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
camellia_aesni_avx_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
camellia_glue.c crypto: x86 - eliminate anonymous module_init & module_exit 2022-04-08 16:13:31 +08:00
camellia-aesni-avx2-asm_64.S crypto: x86/camellia: Remove redundant alignments 2022-10-17 16:41:00 +02:00
camellia-aesni-avx-asm_64.S crypto: x86/camellia: Remove redundant alignments 2022-10-17 16:41:00 +02:00
camellia-x86_64-asm_64.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
camellia.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
cast5_avx_glue.c crypto: x86/cast5 - drop dependency on glue helper 2021-01-14 17:10:29 +11:00
cast5-avx-x86_64-asm_64.S crypto: x86/cast5: Remove redundant alignments 2022-10-17 16:41:00 +02:00
cast6_avx_glue.c crypto: x86/cast6 - drop dependency on glue helper 2021-01-14 17:10:29 +11:00
cast6-avx-x86_64-asm_64.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
chacha_glue.c crypto: algapi - Remove skbuff.h inclusion 2020-08-20 14:04:28 +10:00
chacha-avx2-x86_64.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
chacha-avx512vl-x86_64.S crypto: x86/chacha20 - Avoid spurious jumps to other functions 2022-03-25 16:21:05 +12:00
chacha-ssse3-x86_64.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
crc32-pclmul_asm.S x86/crypto: Remove stray comment terminator 2022-06-13 09:47:58 +02:00
crc32-pclmul_glue.c x86: Fix various typos in comments, take #2 2021-03-21 23:50:28 +01:00
crc32c-intel_glue.c crypto: x86/crc32c-intel - Use CRC32 mnemonic 2020-08-21 14:45:28 +10:00
crc32c-pcl-intel-asm_64.S x86/ibt,crypto: Add ENDBR for the jump-table entries 2022-03-15 10:32:36 +01:00
crct10dif-pcl-asm_64.S crypto: x86/crct10dif-pcl: Remove redundant alignments 2022-10-17 16:41:01 +02:00
crct10dif-pclmul_glue.c crypto: Convert to new CPU match macros 2020-03-24 21:36:06 +01:00
curve25519-x86_64.c crypto: x86/curve25519 - use in/out register constraints more precisely 2021-12-24 14:18:22 +11:00
des3_ede_glue.c crypto: x86/des3 - Remove unused inline function des3_ede_enc_blk_3way() 2022-02-23 15:28:32 +12:00
des3_ede-asm_64.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
ecb_cbc_helpers.h crypto: x86 - add some helper macros for ECB and CBC modes 2021-01-14 17:10:29 +11:00
ghash-clmulni-intel_asm.S crypto: x86/ghash - add comment and fix broken link 2022-12-30 17:57:42 +08:00
ghash-clmulni-intel_glue.c crypto: x86/ghash - add comment and fix broken link 2022-12-30 17:57:42 +08:00
glue_helper-asm-avx2.S crypto: x86/glue-helper - drop CTR helper routines 2021-01-14 17:10:28 +11:00
glue_helper-asm-avx.S crypto: x86/glue-helper - drop CTR helper routines 2021-01-14 17:10:28 +11:00
Kconfig crypto: x86/aria - implement aria-avx2 2023-01-06 17:15:47 +08:00
Makefile crypto: x86/aria - implement aria-avx2 2023-01-06 17:15:47 +08:00
nh-avx2-x86_64.S crypto: x86/nhpoly1305 - eliminate unnecessary CFI wrappers 2022-11-25 17:39:19 +08:00
nh-sse2-x86_64.S crypto: x86/nhpoly1305 - eliminate unnecessary CFI wrappers 2022-11-25 17:39:19 +08:00
nhpoly1305-avx2-glue.c crypto: x86/nhpoly1305 - eliminate unnecessary CFI wrappers 2022-11-25 17:39:19 +08:00
nhpoly1305-sse2-glue.c crypto: x86/nhpoly1305 - eliminate unnecessary CFI wrappers 2022-11-25 17:39:19 +08:00
poly1305_glue.c crypto: poly1305 - fix poly1305_core_setkey() declaration 2021-04-02 18:28:12 +11:00
poly1305-x86_64-cryptogams.pl crypto: x86/poly1305: Remove custom function alignment 2022-10-17 16:41:03 +02:00
polyval-clmulni_asm.S crypto: x86/polyval - Add PCLMULQDQ accelerated implementation of POLYVAL 2022-06-10 16:40:17 +08:00
polyval-clmulni_glue.c crypto: x86/polyval - Fix crashes when keys are not 16-byte aligned 2022-10-21 19:05:05 +08:00
serpent_avx2_glue.c crypto: x86 - eliminate anonymous module_init & module_exit 2022-04-08 16:13:31 +08:00
serpent_avx_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent_sse2_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent-avx2-asm_64.S crypto: x86/serpent: Remove redundant alignments 2022-10-17 16:41:01 +02:00
serpent-avx-x86_64-asm_64.S crypto: x86/serpent: Remove redundant alignments 2022-10-17 16:41:01 +02:00
serpent-avx.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
serpent-sse2-i586-asm_32.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
serpent-sse2-x86_64-asm_64.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
serpent-sse2.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
sha1_avx2_x86_64_asm.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
sha1_ni_asm.S - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
sha1_ssse3_asm.S crypto: x86/sha1 - fix possible crash with CFI enabled 2022-11-25 17:39:19 +08:00
sha1_ssse3_glue.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
sha256_ni_asm.S - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
sha256_ssse3_glue.c crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
sha256-avx2-asm.S - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
sha256-avx-asm.S - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
sha256-ssse3-asm.S - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
sha512_ssse3_glue.c crypto: x86/sha512 - load based on CPU features 2022-08-19 18:39:39 +08:00
sha512-avx2-asm.S crypto: x86/sha512 - fix possible crash with CFI enabled 2022-11-25 17:39:19 +08:00
sha512-avx-asm.S crypto: x86/sha512 - fix possible crash with CFI enabled 2022-11-25 17:39:19 +08:00
sha512-ssse3-asm.S crypto: x86/sha512 - fix possible crash with CFI enabled 2022-11-25 17:39:19 +08:00
sm3_avx_glue.c crypto: x86/sm3 - add AVX assembly implementation 2022-01-28 16:51:11 +11:00
sm3-avx-asm_64.S - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
sm4_aesni_avx2_glue.c crypto: x86/sm4 - add AES-NI/AVX2/x86_64 implementation 2021-08-27 16:30:18 +08:00
sm4_aesni_avx_glue.c crypto: x86/sm4 - export reusable AESNI/AVX functions 2021-08-27 16:30:18 +08:00
sm4-aesni-avx2-asm_64.S - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
sm4-aesni-avx-asm_64.S - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
sm4-avx.h crypto: x86/sm4 - export reusable AESNI/AVX functions 2021-08-27 16:30:18 +08:00
twofish_avx_glue.c crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00
twofish_glue_3way.c crypto: x86 - eliminate anonymous module_init & module_exit 2022-04-08 16:13:31 +08:00
twofish_glue.c crypto: Prepare to move crypto_tfm_ctx 2022-12-02 18:12:40 +08:00
twofish-avx-x86_64-asm_64.S crypto: twofish: Remove redundant alignments 2022-10-17 16:41:03 +02:00
twofish-i586-asm_32.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
twofish-x86_64-asm_64-3way.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
twofish-x86_64-asm_64.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
twofish.h crypto: x86 - use local headers for x86 specific shared declarations 2021-01-14 17:10:30 +11:00