linux/arch
Ard Biesheuvel 11031c0d7d crypto: arm64/gcm-ce - implement 4 way interleave
To improve performance on cores with deep pipelines such as ThunderX2,
reimplement gcm(aes) using a 4-way interleave rather than the 2-way
interleave we use currently.

This comes down to a complete rewrite of the GCM part of the combined
GCM/GHASH driver, and instead of interleaving two invocations of AES
with the GHASH handling at the instruction level, the new version
uses a more coarse grained approach where each chunk of 64 bytes is
encrypted first and then ghashed (or ghashed and then decrypted in
the converse case).

The core NEON routine is now able to consume inputs of any size,
and tail blocks of less than 64 bytes are handled using overlapping
loads and stores, and processed by the same 4-way encryption and
hashing routines. This gets rid of most of the branches, and avoids
having to return to the C code to handle the tail block using a
stack buffer.

The table below compares the performance of the old driver and the new
one on various micro-architectures and running in various modes.

        |     AES-128      |     AES-192      |     AES-256      |
 #bytes | 512 | 1500 |  4k | 512 | 1500 |  4k | 512 | 1500 |  4k |
 -------+-----+------+-----+-----+------+-----+-----+------+-----+
    TX2 | 35% |  23% | 11% | 34% |  20% |  9% | 38% |  25% | 16% |
   EMAG | 11% |   6% |  3% | 12% |   4% |  2% | 11% |   4% |  2% |
    A72 |  8% |   5% | -4% |  9% |   4% | -5% |  7% |   4% | -5% |
    A53 | 11% |   6% | -1% | 10% |   8% | -1% | 10% |   8% | -2% |

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-10-05 01:04:31 +10:00
..
alpha mm: introduce MADV_PAGEOUT 2019-09-25 17:51:41 -07:00
arc mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
arm ARM: SoC fixes 2019-09-30 10:04:28 -07:00
arm64 crypto: arm64/gcm-ce - implement 4 way interleave 2019-10-05 01:04:31 +10:00
c6x mm: consolidate pgtable_cache_init() and pgd_cache_init() 2019-09-24 15:54:09 -07:00
csky csky-for-linus-5.4-rc1: arch/csky patches for 5.4-rc1 2019-09-30 10:16:17 -07:00
h8300 mm: consolidate pgtable_cache_init() and pgd_cache_init() 2019-09-24 15:54:09 -07:00
hexagon mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
ia64 Merge branch 'akpm' (patches from Andrew) 2019-09-24 16:10:23 -07:00
m68k mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
microblaze Merge branch 'akpm' (patches from Andrew) 2019-09-24 16:10:23 -07:00
mips mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
nds32 mm: consolidate pgtable_cache_init() and pgd_cache_init() 2019-09-24 15:54:09 -07:00
nios2 nios2 update for v5.4-rc1 2019-09-27 13:02:19 -07:00
openrisc mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
parisc mm: introduce MADV_PAGEOUT 2019-09-25 17:51:41 -07:00
powerpc libnvdimm fixes v5.4-rc1 2019-09-29 10:33:41 -07:00
riscv RISC-V additional updates for v5.4-rc1 2019-09-27 13:08:36 -07:00
s390 Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-09-28 08:14:15 -07:00
sh mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
sparc arch/sparc/include/asm/pgtable_64.h: fix build 2019-09-26 10:27:06 -07:00
um mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
unicore32 mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
x86 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-09-28 13:37:41 -07:00
xtensa mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
.gitignore
Kconfig arm64, mm: make randomization selected by generic topdown mmap layout 2019-09-24 15:54:11 -07:00