linux/crypto
Johannes Goetzfried 4ea1277d30 crypto: cast6 - add x86_64/avx assembler implementation
This patch adds a x86_64/avx assembler implementation of the Cast6 block
cipher. The implementation processes eight blocks in parallel (two 4 block
chunk AVX operations). The table-lookups are done in general-purpose registers.
For small blocksizes the functions from the generic module are called. A good
performance increase is provided for blocksizes greater or equal to 128B.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results:

Intel Core i5-2500 CPU (fam:6, model:42, step:7)

cast6-avx-x86_64 vs. cast6-generic
128bit key:                                             (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.97x   1.00x   1.01x   1.01x   0.99x   0.97x   0.98x   1.01x   0.96x   0.98x
64B     0.98x   0.99x   1.02x   1.01x   0.99x   1.00x   1.01x   0.99x   1.00x   0.99x
256B    1.77x   1.84x   0.99x   1.85x   1.77x   1.77x   1.70x   1.74x   1.69x   1.72x
1024B   1.93x   1.95x   0.99x   1.96x   1.93x   1.93x   1.84x   1.85x   1.89x   1.87x
8192B   1.91x   1.95x   0.99x   1.97x   1.95x   1.91x   1.86x   1.87x   1.93x   1.90x

256bit key:                                             (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.97x   0.99x   1.02x   1.01x   0.98x   0.99x   1.00x   1.00x   0.98x   0.98x
64B     0.98x   0.99x   1.01x   1.00x   1.00x   1.00x   1.01x   1.01x   0.97x   1.00x
256B    1.77x   1.83x   1.00x   1.86x   1.79x   1.78x   1.70x   1.76x   1.71x   1.69x
1024B   1.92x   1.95x   0.99x   1.96x   1.93x   1.93x   1.83x   1.86x   1.89x   1.87x
8192B   1.94x   1.95x   0.99x   1.97x   1.95x   1.95x   1.87x   1.87x   1.93x   1.91x

Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-08-01 17:47:30 +08:00
..
async_tx crypto: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:16 +08:00
ablkcipher.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-10 14:30:45 -04:00
aead.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-10 14:30:45 -04:00
aes_generic.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
af_alg.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
ahash.c crypto: Stop using NLA_PUT*(). 2012-04-02 04:33:42 -04:00
algapi.c crypto: algapi - Move larval completion into algboss 2012-06-22 20:08:29 +08:00
algboss.c crypto: algapi - Fix hang on crypto allocation 2012-06-27 20:59:12 +08:00
algif_hash.c crypto: algif_hash - Handle initial af_alg_make_sg error correctly 2011-06-30 07:44:06 +08:00
algif_skcipher.c crypto: algif_skcipher - Handle unaligned receive buffer 2010-11-30 17:04:31 +08:00
ansi_cprng.c crypto: ansi_cprng - use crypto_[un]register_algs 2012-08-01 17:47:25 +08:00
anubis.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
api.c crypto: api - Fix checkpatch errors 2010-02-16 20:26:46 +08:00
arc4.c crypto: arc4 - improve performance by using u32 for ctx and variables 2012-06-14 10:07:23 +08:00
authenc.c crypto: Use scatterwalk_crypto_chain 2010-12-02 14:47:16 +08:00
authencesn.c crypto: authencesn - Add algorithm to handle IPsec extended sequence numbers 2011-03-13 20:22:27 -07:00
blkcipher.c crypto: Stop using NLA_PUT*(). 2012-04-02 04:33:42 -04:00
blowfish_common.c crypto: blowfish - split generic and common c code 2011-09-22 21:25:25 +10:00
blowfish_generic.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
camellia_generic.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
cast5_generic.c crypto: cast5 - prepare generic module for optimized implementations 2012-08-01 17:47:29 +08:00
cast6_generic.c crypto: cast6 - prepare generic module for optimized implementations 2012-08-01 17:47:30 +08:00
cbc.c
ccm.c crypto: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:16 +08:00
chainiv.c crypto: chainiv - Use kcrypto_wq instead of keventd_wq 2009-02-19 14:44:02 +08:00
cipher.c crypto: cipher - Fix checkpatch errors 2010-02-16 20:31:37 +08:00
compress.c crypto: compress - Fix checkpatch errors 2010-02-16 20:31:04 +08:00
crc32c.c crypto: crc32c should use library implementation 2012-03-23 16:58:38 -07:00
cryptd.c crypto: cryptd - Use subsys_initcall to prevent races with aesni 2011-08-20 16:08:03 +08:00
crypto_null.c crypto: crypto_null - use crypto_[un]register_algs 2012-08-01 17:47:24 +08:00
crypto_user.c netlink: add netlink_kernel_cfg parameter to netlink_kernel_create 2012-06-29 16:46:02 -07:00
crypto_wq.c crypto: add module.h to those files that are explicitly using it 2011-10-31 19:31:11 -04:00
ctr.c crypto: Use ERR_CAST 2010-05-26 10:36:51 +10:00
cts.c [CRYPTO] cts: Init SG tables 2008-06-02 15:46:51 +10:00
deflate.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
des_generic.c crypto: des - use crypto_[un]register_algs 2012-08-01 17:47:24 +08:00
ecb.c crypto: ecb - Fix checkpatch errors 2010-02-16 20:33:49 +08:00
eseqiv.c crypto: Use scatterwalk_crypto_chain 2010-12-02 14:47:16 +08:00
fcrypt.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
fips.c crypto: api - Add fips_enable flag 2008-08-29 15:50:02 +10:00
gcm.c crypto: Use scatterwalk_crypto_chain 2010-12-02 14:47:16 +08:00
gf128mul.c crypto: gf128mul - fix call to memset() 2011-07-08 17:21:21 +08:00
ghash-generic.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
hmac.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
internal.h crypto: algapi - Move larval completion into algboss 2012-06-22 20:08:29 +08:00
Kconfig crypto: cast6 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
khazad.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
krng.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
lrw.c crypto: lrw - add interface for parallelized cipher implementions 2011-11-09 11:50:31 +08:00
lzo.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
Makefile crypto: cast6 - prepare generic module for optimized implementations 2012-08-01 17:47:30 +08:00
md4.c crypto: add module.h to those files that are explicitly using it 2011-10-31 19:31:11 -04:00
md5.c crypto: Move md5_transform to lib/md5.c 2011-08-06 18:32:45 -07:00
michael_mic.c crypto: michael_mic - Switch to shash 2008-12-25 11:02:24 +11:00
pcbc.c
pcompress.c crypto: Stop using NLA_PUT*(). 2012-04-02 04:33:42 -04:00
pcrypt.c crypto: pcrypt - Use the online cpumask as the default 2012-03-29 19:52:47 +08:00
proc.c crypto: add module.h to those files that are explicitly using it 2011-10-31 19:31:11 -04:00
ripemd.h [CRYPTO] ripemd: Put all common RIPEMD values in header file 2008-07-10 20:35:12 +08:00
rmd128.c crypto: ripemd - Set module author and update email address 2011-01-04 23:34:03 +11:00
rmd160.c crypto: ripemd - Set module author and update email address 2011-01-04 23:34:03 +11:00
rmd256.c crypto: ripemd - Set module author and update email address 2011-01-04 23:34:03 +11:00
rmd320.c crypto: ripemd - Set module author and update email address 2011-01-04 23:34:03 +11:00
rng.c crypto: Stop using NLA_PUT*(). 2012-04-02 04:33:42 -04:00
salsa20_generic.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
scatterwalk.c crypto: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:16 +08:00
seed.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
seqiv.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
serpent_generic.c crypto: serpent - use crypto_[un]register_algs 2012-08-01 17:47:25 +08:00
sha1_generic.c crypto: sha1 - export sha1_update for reuse 2011-08-10 19:00:28 +08:00
sha256_generic.c crypto: sha256 - use crypto_[un]register_shashes 2012-08-01 17:47:26 +08:00
sha512_generic.c crypto: sha512 - use crypto_[un]register_shashes 2012-08-01 17:47:26 +08:00
shash.c crypto: add crypto_[un]register_shashes for [un]registering multiple shash entries at once 2012-08-01 17:47:26 +08:00
tcrypt.c crypto: testmgr - add larger cast6 testvectors 2012-08-01 17:47:30 +08:00
tcrypt.h crypto: testmgr - add larger cast5 testvectors 2012-08-01 17:47:29 +08:00
tea.c crypto: tea - use crypto_[un]register_algs 2012-08-01 17:47:24 +08:00
testmgr.c crypto: cast6 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
testmgr.h crypto: testmgr - add larger cast6 testvectors 2012-08-01 17:47:30 +08:00
tgr192.c crypto: tiger - use crypto_[un]register_shashes 2012-08-01 17:47:26 +08:00
twofish_common.c crypto: twofish-x86_64-3way - add lrw support 2011-11-09 11:53:32 +08:00
twofish_generic.c crypto: cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
vmac.c crypto: add module.h to those files that are explicitly using it 2011-10-31 19:31:11 -04:00
wp512.c crypto: whirlpool - use crypto_[un]register_shashes 2012-08-01 17:47:27 +08:00
xcbc.c crypto: add module.h to those files that are explicitly using it 2011-10-31 19:31:11 -04:00
xor.c md updates for 3.5 2012-05-23 17:08:40 -07:00
xts.c crypto: xts: add interface for parallelized cipher implementations 2011-11-09 11:56:06 +08:00
zlib.c net+crypto: Use vmalloc for zlib inflate buffers. 2011-06-29 05:48:41 -07:00