linux/arch/arm64/crypto
Ard Biesheuvel 4edd7d015b crypto: arm64/aes-neon-blk - tweak performance for low end cores
The non-bitsliced AES implementation using NEON is highly sensitive
to micro-architectural details, and, as it turns out, the Cortex-A53 on
the Raspberry Pi 3 is a core that can benefit from this code, given that
its scalar AES performance is abysmal (32.9 cycles per byte).

The new bitsliced AES code manages 19.8 cycles per byte on this core,
but can only operate on 8 blocks at a time, which is not supported by
all chaining modes. With a bit of tweaking, we can get the plain NEON
code to run at 22.0 cycles per byte, making it useful for sequential
modes like CBC encryption. (Like bitsliced NEON, the plain NEON
implementation does not use any lookup tables, which makes it easy on
the D-cache, and invulnerable to cache timing attacks.)
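
As a rough illustration of why CBC encryption is sequential, and so cannot feed an implementation that needs 8 independent blocks at a time, the following sketch chains each block to the previous ciphertext block. The names `toy_encrypt_block` and `cbc_encrypt` are hypothetical, and the toy cipher is an insecure placeholder standing in for AES; none of this is taken from the kernel sources.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16

/*
 * Hypothetical stand-in for a single-block AES encryption.
 * It only permutes and XORs bytes; it is NOT a secure cipher.
 */
static void toy_encrypt_block(uint8_t out[BLOCK_SIZE],
                              const uint8_t in[BLOCK_SIZE], uint8_t key)
{
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        out[i] = (uint8_t)(in[(i + 1) % BLOCK_SIZE] ^ key);
}

/*
 * CBC encryption: block b is XORed with ciphertext block b-1 (or the IV)
 * before being encrypted, so block b cannot start until block b-1 is done.
 */
static void cbc_encrypt(uint8_t *dst, const uint8_t *src, size_t nblocks,
                        const uint8_t iv[BLOCK_SIZE], uint8_t key)
{
    uint8_t chain[BLOCK_SIZE];
    uint8_t tmp[BLOCK_SIZE];

    memcpy(chain, iv, BLOCK_SIZE);
    for (size_t b = 0; b < nblocks; b++) {
        for (size_t i = 0; i < BLOCK_SIZE; i++)
            tmp[i] = (uint8_t)(src[b * BLOCK_SIZE + i] ^ chain[i]);
        toy_encrypt_block(chain, tmp, key);
        memcpy(dst + b * BLOCK_SIZE, chain, BLOCK_SIZE);
    }
}
```

Because of the loop-carried dependency on `chain`, identical plaintext blocks produce different ciphertext blocks, and there is never more than one block ready to be encrypted at any time.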

So tweak the plain NEON AES code to use tbl instructions rather than
shl/sri pairs, and to avoid the need to reload permutation vectors or
other constants from memory in every round. Also, improve the decryption
performance by switching to 16x8 pmul instructions for performing the
multiplications in GF(2^8).
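
The two ideas in this paragraph can be sketched in scalar C. This is only an illustration under stated assumptions, not the NEON code from aes-neon.S: it assumes the shl/sri pairs implement 32-bit rotations by a multiple of 8 bits (which a tbl byte permutation can also express), and it models the GF(2^8) multiplication that pmul accelerates using the AES reduction polynomial x^8 + x^4 + x^3 + x + 1. All function names are hypothetical.

```c
#include <stdint.h>

/* Rotate left by 8 bits as a shift pair, analogous to shl followed by sri. */
static uint32_t rotl8_shift_pair(uint32_t x)
{
    return (x << 8) | (x >> 24);
}

/* The same rotation written as a byte permutation, analogous to tbl. */
static uint32_t rotl8_byte_perm(uint32_t x)
{
    /* perm[i] names the source byte (0 = least significant) for byte i. */
    static const uint8_t perm[4] = { 3, 0, 1, 2 };
    uint8_t in[4];
    uint32_t r = 0;

    for (int i = 0; i < 4; i++)
        in[i] = (uint8_t)(x >> (8 * i));
    for (int i = 0; i < 4; i++)
        r |= (uint32_t)in[perm[i]] << (8 * i);
    return r;
}

/* Multiply by x (that is, by {02}) in GF(2^8), reducing by 0x11b. */
static uint8_t gf_xtime(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0x00));
}

/* Shift-and-XOR multiplication of two elements of GF(2^8). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;

    while (b) {
        if (b & 1)
            r ^= a;
        a = gf_xtime(a);
        b >>= 1;
    }
    return r;
}
```

For example, `gf_mul(0x57, 0x83)` yields `0xc1`, the worked example from FIPS-197. Decryption needs such multiplications by the larger InvMixColumns constants ({09}, {0b}, {0d}, {0e}), which is presumably why a carry-less multiply instruction helps there more than on the encryption side.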

To allow the ECB and CBC encrypt routines to be reused by the bitsliced
NEON code in a subsequent patch, export them from the module.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-02-03 18:16:20 +08:00
.gitignore crypto: arm64/sha2 - add generated .S files to .gitignore 2016-11-29 16:06:56 +08:00
aes-ce-ccm-core.S crypto: arm64/aes-ccm-ce: fix for big endian 2016-10-21 11:03:43 +08:00
aes-ce-ccm-glue.c crypto: arm64/aes-ce-ccm - remove cra_alignmask 2017-02-03 18:16:19 +08:00
aes-ce-cipher.c crypto: arm64/aes-ce - fix for big endian 2016-10-21 11:03:42 +08:00
aes-ce-setkey.h arm64/crypto: use crypto instructions to generate AES key schedule 2014-11-06 17:25:28 +00:00
aes-ce.S crypto: arm64/aes-xts-ce: fix for big endian 2016-10-21 11:03:45 +08:00
aes-cipher-core.S crypto: arm64/aes - performance tweak 2017-02-03 18:16:20 +08:00
aes-cipher-glue.c crypto: arm64/aes - add scalar implementation 2017-01-13 00:26:49 +08:00
aes-glue.c crypto: arm64/aes-neon-blk - tweak performance for low end cores 2017-02-03 18:16:20 +08:00
aes-modes.S crypto: arm64/aes-blk - remove cra_alignmask 2017-02-03 18:16:19 +08:00
aes-neon.S crypto: arm64/aes-neon-blk - tweak performance for low end cores 2017-02-03 18:16:20 +08:00
aes-neonbs-core.S crypto: arm64/aes - reimplement bit-sliced ARM/NEON implementation for arm64 2017-01-13 00:26:51 +08:00
aes-neonbs-glue.c crypto: arm64/aes - reimplement bit-sliced ARM/NEON implementation for arm64 2017-01-13 00:26:51 +08:00
chacha20-neon-core.S crypto: arm64/chacha20 - implement NEON version based on SSE3 code 2017-01-13 00:26:48 +08:00
chacha20-neon-glue.c crypto: arm64/chacha20 - remove cra_alignmask 2017-02-03 18:16:19 +08:00
crc32-arm64.c crypto: arm64/crc32 - bring in line with generic CRC32 2015-05-07 11:16:24 +08:00
crc32-ce-core.S crypto: arm64/crc32 - accelerated support based on x86 SSE implementation 2016-12-07 20:01:22 +08:00
crc32-ce-glue.c crypto: arm64/crc32 - accelerated support based on x86 SSE implementation 2016-12-07 20:01:22 +08:00
crct10dif-ce-core.S crypto: arm64/crct10dif - port x86 SSE implementation to arm64 2016-12-07 20:01:17 +08:00
crct10dif-ce-glue.c crypto: arm64/crct10dif - port x86 SSE implementation to arm64 2016-12-07 20:01:17 +08:00
ghash-ce-core.S crypto: arm64/ghash-ce - fix for big endian 2016-10-21 11:03:43 +08:00
ghash-ce-glue.c arm64/crypto: improve performance of GHASH algorithm 2014-06-18 12:40:54 +01:00
Kconfig crypto: arm64/aes - reimplement bit-sliced ARM/NEON implementation for arm64 2017-01-13 00:26:51 +08:00
Makefile crypto: arm64/aes - reimplement bit-sliced ARM/NEON implementation for arm64 2017-01-13 00:26:51 +08:00
sha1-ce-core.S crypto: arm64/sha1-ce - fix for big endian 2016-10-21 11:03:43 +08:00
sha1-ce-glue.c crypto: arm64/sha1-ce - prevent asm code finalization in final() path 2015-05-07 11:16:25 +08:00
sha2-ce-core.S crypto: arm64/sha2-ce - fix for big endian 2016-10-21 11:03:43 +08:00
sha2-ce-glue.c crypto: arm64/sha2-ce - prevent asm code finalization in final() path 2015-05-07 11:16:26 +08:00
sha256-core.S_shipped crypto: arm64/sha2 - integrate OpenSSL implementations of SHA256/SHA512 2016-11-28 19:58:05 +08:00
sha256-glue.c crypto: arm64/sha2 - integrate OpenSSL implementations of SHA256/SHA512 2016-11-28 19:58:05 +08:00
sha512-armv8.pl crypto: arm64/sha2 - integrate OpenSSL implementations of SHA256/SHA512 2016-11-28 19:58:05 +08:00
sha512-core.S_shipped crypto: arm64/sha2 - integrate OpenSSL implementations of SHA256/SHA512 2016-11-28 19:58:05 +08:00
sha512-glue.c crypto: arm64/sha2 - integrate OpenSSL implementations of SHA256/SHA512 2016-11-28 19:58:05 +08:00