linux/arch
Mathias Krause 66be895158 crypto: sha1 - SSSE3 based SHA1 implementation for x86-64
This is an assembler implementation of the SHA1 algorithm using the
Supplemental SSE3 (SSSE3) instructions or, when available, the
Advanced Vector Extensions (AVX).

Testing with the tcrypt module shows the raw hash performance is up to
2.3 times faster than the C implementation, using 8k data blocks on a
Core 2 Duo T5500. For the smalest data set (16 byte) it is still 25%
faster.

Since this implementation uses SSE/YMM registers it cannot safely be
used in every situation, e.g. while an IRQ interrupts a kernel thread.
The implementation falls back to the generic SHA1 variant, if using
the SSE/YMM registers is not possible.

With this algorithm I was able to increase the throughput of a single
IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using
the SSSE3 variant -- a speedup of +34.8%.

Saving and restoring SSE/YMM state might make the actual throughput
fluctuate when there are FPU intensive userland applications running.
For example, meassuring the performance using iperf2 directly on the
machine under test gives wobbling numbers because iperf2 uses the FPU
for each packet to check if the reporting interval has expired (in the
above test I got min/max/avg: 402/484/464 MBit/s).

Using this algorithm on a IPsec gateway gives much more reasonable and
stable numbers, albeit not as high as in the directly connected case.
Here is the result from an RFC 2544 test run with a EXFO Packet Blazer
FTB-8510:

 frame size    sha1-generic     sha1-ssse3    delta
    64 byte     37.5 MBit/s    37.5 MBit/s     0.0%
   128 byte     56.3 MBit/s    62.5 MBit/s   +11.0%
   256 byte     87.5 MBit/s   100.0 MBit/s   +14.3%
   512 byte    131.3 MBit/s   150.0 MBit/s   +14.2%
  1024 byte    162.5 MBit/s   193.8 MBit/s   +19.3%
  1280 byte    175.0 MBit/s   212.5 MBit/s   +21.4%
  1420 byte    175.0 MBit/s   218.7 MBit/s   +25.0%
  1518 byte    150.0 MBit/s   181.2 MBit/s   +20.8%

The throughput for the largest frame size is lower than for the
previous size because the IP packets need to be fragmented in this
case to make there way through the IPsec tunnel.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Maxim Locktyukhin <maxim.locktyukhin@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-08-10 19:00:29 +08:00
..
alpha Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2011-07-29 23:35:05 -07:00
arm Merge branch 'gpio/next' of git://git.secretlab.ca/git/linux-2.6 2011-08-01 06:13:48 -10:00
avr32 atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
blackfin atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
cris atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
frv frv: remove unnecessary code 2011-07-29 23:41:09 -07:00
h8300 atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
ia64 Merge branch 'gpiolib' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 2011-08-01 13:41:43 -10:00
m32r atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
m68k Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k 2011-07-31 14:30:59 -10:00
microblaze Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze 2011-07-27 09:24:20 -07:00
mips atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
mn10300 atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
openrisc OpenRISC: Miscellaneous 2011-07-22 18:46:41 +02:00
parisc atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
powerpc Merge branch 'next/cross-platform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc 2011-07-26 17:12:10 -07:00
s390 atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
score modules: make arch's use default loader hooks 2011-07-24 22:06:04 +09:30
sh Merge branch 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-3.x 2011-08-01 06:10:16 -10:00
sparc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k 2011-07-31 14:30:59 -10:00
tile Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2011-07-29 23:35:05 -07:00
um ptrace: unify show_regs() prototype 2011-07-26 16:49:43 -07:00
unicore32 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2011-07-29 23:35:05 -07:00
x86 crypto: sha1 - SSSE3 based SHA1 implementation for x86-64 2011-08-10 19:00:29 +08:00
xtensa atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
.gitignore
Kconfig