86ad60a65f
The XTS asm helper arrangement is a bit odd: the 8-way stride helper
consists of back-to-back calls to the 4-way core transforms, which
are called indirectly, based on a boolean that indicates whether we
are performing encryption or decryption.
Given how costly indirect calls are on x86, let's switch to direct
calls, and given how the 8-way stride doesn't really add anything
substantial, use a 4-way stride instead, and make the asm core
routine deal with any multiple of 4 blocks. Since 512 byte sectors
or 4 KB blocks are the typical quantities XTS operates on, increase
the stride exported to the glue helper to 512 bytes as well.
As a result, the number of indirect calls is reduced from 3 per 64 bytes
of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup
when operating on 1 KB blocks (measured on a Intel(R) Core(TM) i7-8650U CPU)
Fixes:
|
||
---|---|---|
.. | ||
boot | ||
configs | ||
crypto | ||
entry | ||
events | ||
hyperv | ||
ia32 | ||
include | ||
kernel | ||
kvm | ||
lib | ||
math-emu | ||
mm | ||
net | ||
oprofile | ||
pci | ||
platform | ||
power | ||
purgatory | ||
ras | ||
realmode | ||
tools | ||
um | ||
video | ||
xen | ||
.gitignore | ||
Kbuild | ||
Kconfig | ||
Kconfig.assembler | ||
Kconfig.cpu | ||
Kconfig.debug | ||
Makefile | ||
Makefile_32.cpu | ||
Makefile.um |