linux/arch/x86
Ard Biesheuvel 86ad60a65f crypto: x86/aes-ni-xts - use direct calls to and 4-way stride
The XTS asm helper arrangement is a bit odd: the 8-way stride helper
consists of back-to-back calls to the 4-way core transforms, which
are called indirectly, based on a boolean that indicates whether we
are performing encryption or decryption.

Given how costly indirect calls are on x86, let's switch to direct
calls, and given how the 8-way stride doesn't really add anything
substantial, use a 4-way stride instead, and make the asm core
routine deal with any multiple of 4 blocks. Since 512 byte sectors
or 4 KB blocks are the typical quantities XTS operates on, increase
the stride exported to the glue helper to 512 bytes as well.
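
The change in call structure can be sketched in plain C. This is only an illustration of the shape described above, not the kernel's assembly: every name here (`toy_enc4`, `toy_dec4`, `toy_xts_8way`, `toy_xts_enc`, `toy_xts_dec`) is hypothetical, and the "core transform" is a plain XOR standing in for AES.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { BLOCK = 16, STRIDE4 = 4 * BLOCK };

/* Stand-in 4-block core transforms: a plain XOR, NOT AES.
 * XOR is its own inverse, so both directions look the same here. */
static void toy_enc4(uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < STRIDE4; i++)
        dst[i] = src[i] ^ 0x5a;
}

static void toy_dec4(uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < STRIDE4; i++)
        dst[i] = src[i] ^ 0x5a;
}

/* Old shape (hypothetical): one 8-way helper that picks the 4-way core
 * from a boolean and calls it twice through a function pointer. */
static void toy_xts_8way(uint8_t *dst, const uint8_t *src, bool enc)
{
    void (*core)(uint8_t *, const uint8_t *) = enc ? toy_enc4 : toy_dec4;

    core(dst, src);
    core(dst + STRIDE4, src + STRIDE4);
}

/* New shape (hypothetical): one routine per direction that calls the
 * 4-way core directly and handles any multiple of 4 blocks. */
static void toy_xts_enc(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t off = 0; off + STRIDE4 <= len; off += STRIDE4)
        toy_enc4(dst + off, src + off);
}

static void toy_xts_dec(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t off = 0; off + STRIDE4 <= len; off += STRIDE4)
        toy_dec4(dst + off, src + off);
}
```

In this sketch a caller hands a whole 512-byte sector to `toy_xts_enc()` in one call, whereas the old shape needs one `toy_xts_8way()` call per 128 bytes, each of which dispatches through a pointer.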

As a result, the number of indirect calls is reduced from 3 per 64 bytes
of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup
when operating on 1 KB blocks (measured on an Intel(R) Core(TM) i7-8650U CPU).
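
The call-count arithmetic behind those figures can be worked through for the 1 KB case. The helper below is a hypothetical illustration (`indirect_calls` is not a kernel function); it simply applies the two rates quoted above, 3 indirect calls per 64 bytes before the change and 1 per 512 bytes after it.

```c
#include <stddef.h>

/* Number of indirect calls needed to process `len` bytes when
 * `calls_per_unit` indirect calls are made for every `unit_bytes`
 * bytes of data (a partial trailing unit counts as a full one). */
static size_t indirect_calls(size_t len, size_t unit_bytes,
                             size_t calls_per_unit)
{
    size_t units = (len + unit_bytes - 1) / unit_bytes;

    return units * calls_per_unit;
}
```

For a 1 KB block, `indirect_calls(1024, 64, 3)` gives 48 and `indirect_calls(1024, 512, 1)` gives 2, so the change removes 46 of the 48 indirect calls for that block size.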

Fixes: 9697fa39ef ("x86/retpoline/crypto: Convert crypto assembler indirect jumps")
Tested-by: Eric Biggers <ebiggers@google.com> # x86_64
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-01-08 15:39:47 +11:00
boot EFI updates collected by Ard Biesheuvel: 2020-12-24 12:40:07 -08:00
configs * A defconfig fix, from Daniel Díaz. 2020-09-20 15:06:43 -07:00
crypto crypto: x86/aes-ni-xts - use direct calls to and 4-way stride 2021-01-08 15:39:47 +11:00
entry epoll: wire up syscall epoll_pwait2 2020-12-19 11:18:38 -08:00
events Perf updates: 2020-12-14 17:34:12 -08:00
hyperv hyperv-fixes for 5.10-rc3 2020-11-05 11:32:03 -08:00
ia32 x86/ia32_signal: Propagate __user annotation properly 2020-12-11 19:44:31 +01:00
include EFI updates collected by Ard Biesheuvel: 2020-12-24 12:40:07 -08:00
kernel A treewide cleanup of interrupt descriptor (ab)use with all sorts of racy 2020-12-24 13:50:23 -08:00
kvm ARM: 2020-12-20 10:44:05 -08:00
lib Scheduler updates: 2020-12-14 18:29:11 -08:00
math-emu treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
mm Merge branch 'stable/for-linus-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb 2020-12-16 13:51:34 -08:00
net bpf: x64: Do not emit sub/add 0, %rsp when !stack_depth 2020-09-29 16:47:39 -07:00
oprofile x86/oprofile: Avoid TIF_IA32 when checking 64bit mode 2020-10-26 13:46:46 +01:00
pci ARM: SoC drivers for v5.11 2020-12-16 16:38:41 -08:00
platform Yet another large set of x86 interrupt management updates: 2020-12-14 18:59:53 -08:00
power Kbuild updates for v5.9 2020-08-09 14:10:26 -07:00
purgatory crypto: sha - split sha.h into sha1.h and sha2.h 2020-11-20 14:45:33 +11:00
ras
realmode x86/head/64: Don't call verify_cpu() on starting APs 2020-09-09 11:33:20 +02:00
tools x86/insn: Make inat-tables.c suitable for pre-decompression code 2020-09-07 19:45:24 +02:00
um arch/um: partially revert the conversion to __section() macro 2020-10-26 15:39:37 -07:00
video
xen EFI updates collected by Ard Biesheuvel: 2020-12-24 12:40:07 -08:00
.gitignore
Kbuild
Kconfig Tracing updates for 5.11 2020-12-17 13:22:17 -08:00
Kconfig.assembler
Kconfig.cpu
Kconfig.debug x86, libnvdimm/test: Remove COPY_MC_TEST 2020-10-26 18:08:35 +01:00
Makefile - Fix the vmlinux size check on 64-bit along with adding useful clarifications on the topic 2020-12-14 13:54:50 -08:00
Makefile_32.cpu
Makefile.um