linux/arch/x86
Taehee Yoo 37d8d3ae7a crypto: x86/aria - implement aria-avx2
aria-avx2 implementation uses AVX2, AES-NI, and GFNI.
It supports 32way parallel processing.
So, byteslicing code is changed to support 32way parallel.
And it exports some aria-avx functions such as encrypt() and decrypt().

There are two main logics, s-box layer and diffusion layer.
These codes are the same as aria-avx implementation.
But some instruction are exchanged because they don't support 256bit
registers.
Also, AES-NI doesn't support 256bit register.
So, aesenclast and aesdeclast are used twice like below:
	vextracti128 $1, ymm0, xmm6;
	vaesenclast xmm7, xmm0, xmm0;
	vaesenclast xmm7, xmm6, xmm6;
	vinserti128 $1, xmm6, ymm0, ymm0;

Benchmark with modprobe tcrypt mode=610 num_mb=8192, i3-12100:

ARIA-AVX2 with GFNI(128bit and 256bit)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx2) encryption
tcrypt: 1 operation in 2003 cycles (1024 bytes)
tcrypt: 1 operation in 5867 cycles (4096 bytes)
tcrypt: 1 operation in 2358 cycles (1024 bytes)
tcrypt: 1 operation in 7295 cycles (4096 bytes)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx2) decryption
tcrypt: 1 operation in 2004 cycles (1024 bytes)
tcrypt: 1 operation in 5956 cycles (4096 bytes)
tcrypt: 1 operation in 2409 cycles (1024 bytes)
tcrypt: 1 operation in 7564 cycles (4096 bytes)

ARIA-AVX with GFNI(128bit and 256bit)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx) encryption
tcrypt: 1 operation in 2761 cycles (1024 bytes)
tcrypt: 1 operation in 9390 cycles (4096 bytes)
tcrypt: 1 operation in 3401 cycles (1024 bytes)
tcrypt: 1 operation in 11876 cycles (4096 bytes)
    testing speed of multibuffer ecb(aria) (ecb-aria-avx) decryption
tcrypt: 1 operation in 2735 cycles (1024 bytes)
tcrypt: 1 operation in 9424 cycles (4096 bytes)
tcrypt: 1 operation in 3369 cycles (1024 bytes)
tcrypt: 1 operation in 11954 cycles (4096 bytes)

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
..
boot - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
coco x86/tdx: Add a wrapper to get TDREPORT0 from the TDX Module 2022-11-17 11:03:09 -08:00
configs
crypto crypto: x86/aria - implement aria-avx2 2023-01-06 17:15:47 +08:00
entry - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
events ARM64: 2022-12-15 11:12:21 -08:00
hyperv x86/hyperv: Remove unregister syscore call from Hyper-V cleanup 2022-11-29 17:55:29 +00:00
ia32
include New Feature: 2022-12-17 14:06:53 -06:00
kernel crypto: x86/aria - do not use magic number offsets of aria_ctx 2023-01-06 17:15:47 +08:00
kvm ARM64: 2022-12-15 11:12:21 -08:00
lib - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
math-emu
mm prandom: remove prandom_u32_max() 2022-12-20 03:13:45 +01:00
net - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
pci x86/PCI: Use pr_info() when possible 2022-12-10 10:33:18 -06:00
platform pci-v6.2-changes 2022-12-14 09:54:10 -08:00
power - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
purgatory x86/purgatory: disable KMSAN instrumentation 2022-10-28 13:37:23 -07:00
ras
realmode x86/boot: Skip realmode init code when running as Xen PV guest 2022-11-25 12:05:22 +01:00
tools
um [elf][non-regset] uninline elf_core_copy_task_fpregs() (and lose pt_regs argument) 2022-11-24 23:24:23 -05:00
video
virt/vmx/tdx
xen - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
.gitignore
Kbuild
Kconfig powerpc updates for 6.2 2022-12-19 07:13:33 -06:00
Kconfig.assembler
Kconfig.cpu
Kconfig.debug
Makefile Kbuild updates for v6.2 2022-12-19 12:33:32 -06:00
Makefile_32.cpu
Makefile.um