/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
 * Implement AES algorithm in Intel AES-NI instructions.
 *
 * The white paper of AES-NI instructions can be downloaded from:
 *   http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf
 *
 * Copyright (C) 2008, Intel Corp.
 *    Author: Huang Ying <ying.huang@intel.com>
 *            Vinodh Gopal <vinodh.gopal@intel.com>
 *            Kahraman Akdemir
 *
 * Added RFC4106 AES-GCM support for 128-bit keys under the AEAD
 * interface for 64-bit kernels.
 *    Authors: Erdinc Ozturk (erdinc.ozturk@intel.com)
 *             Aidan O'Mahony (aidan.o.mahony@intel.com)
 *             Adrian Hoban <adrian.hoban@intel.com>
 *             James Guilford (james.guilford@intel.com)
 *             Gabriele Paoloni <gabriele.paoloni@intel.com>
 *             Tadeusz Struk (tadeusz.struk@intel.com)
 *             Wajdi Feghali (wajdi.k.feghali@intel.com)
 *    Copyright (c) 2010, Intel Corporation.
 *
crypto: aesni-intel - Ported implementation to x86-32
The AES-NI instructions are also available in legacy mode so the 32-bit
architecture may profit from those, too.
To illustrate the performance gain here's a short summary of a dm-crypt
speed test on a Core i7 M620 running at 2.67GHz comparing both assembler
implementations:
x86:                   i586       aes-ni    delta
ECB, 256 bit:     93.8 MB/s   123.3 MB/s   +31.4%
CBC, 256 bit:     84.8 MB/s   262.3 MB/s  +209.3%
LRW, 256 bit:    108.6 MB/s   222.1 MB/s  +104.5%
XTS, 256 bit:    105.0 MB/s   205.5 MB/s   +95.7%
Additionally, due to some minor optimizations, the 64-bit version also
got a minor performance gain as seen below:
x86-64:           old impl.    new impl.    delta
ECB, 256 bit:    121.1 MB/s   123.0 MB/s    +1.5%
CBC, 256 bit:    285.3 MB/s   290.8 MB/s    +1.9%
LRW, 256 bit:    263.7 MB/s   265.3 MB/s    +0.6%
XTS, 256 bit:    251.1 MB/s   255.3 MB/s    +1.7%
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2010-11-27 11:34:46 +03:00
 * Ported x86_64 version to x86:
 *    Author: Mathias Krause <minipli@googlemail.com>
 */

#include <linux/linkage.h>
#include <asm/frame.h>
#include <asm/nospec-branch.h>

/*
 * The following macros are used to move an (un)aligned 16 byte value to/from
 * an XMM register.  This can be done for either FP or integer values: for FP
 * use movaps (move aligned packed single), for integer use movdqa (move double
 * quad aligned).  It doesn't make a performance difference which instruction
 * is used since Nehalem (original Core i7) was released.  However, movaps is
 * a byte shorter, so that is the one we'll use for now (same for unaligned).
 */
#define MOVADQ	movaps
#define MOVUDQ	movups
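
The "byte shorter" remark can be checked against the instruction encodings. The sketch below is not part of the kernel source: it lists the register-to-register encodings of `movaps`/`movdqa` and `movups`/`movdqu` for `%xmm1 -> %xmm0` and compares their lengths. The array and function names are made up for illustration. The integer forms carry a mandatory prefix byte (0x66 or 0xF3) that the packed-single forms lack.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Encodings of "<op> %xmm1, %xmm0" (AT&T operand order, load form).
 * Array names are illustrative, not kernel identifiers.
 */
static const unsigned char enc_movaps[] = { 0x0F, 0x28, 0xC1 };
static const unsigned char enc_movdqa[] = { 0x66, 0x0F, 0x6F, 0xC1 };
static const unsigned char enc_movups[] = { 0x0F, 0x10, 0xC1 };
static const unsigned char enc_movdqu[] = { 0xF3, 0x0F, 0x6F, 0xC1 };

/* The packed-single forms are one prefix byte shorter. */
static_assert(sizeof(enc_movaps) + 1 == sizeof(enc_movdqa), "aligned forms");
static_assert(sizeof(enc_movups) + 1 == sizeof(enc_movdqu), "unaligned forms");

/* Bytes saved per aligned move by picking movaps over movdqa. */
size_t bytes_saved_aligned(void)
{
	return sizeof(enc_movdqa) - sizeof(enc_movaps);
}

/* Bytes saved per unaligned move by picking movups over movdqu. */
size_t bytes_saved_unaligned(void)
{
	return sizeof(enc_movdqu) - sizeof(enc_movups);
}
```
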

#ifdef __x86_64__

crypto: x86 - make constants readonly, allow linker to merge them
A lot of asm-optimized routines in arch/x86/crypto/ keep their
constants in .data. This is wrong, they should be in .rodata.
Many of these constants are the same in different modules.
For example, 128-bit shuffle mask 0x000102030405060708090A0B0C0D0E0F
exists in at least half a dozen places.
There is a way to let linker merge them and use just one copy.
The rules are as follows: mergeable objects of different sizes
should not share sections. You can't put them all in one .rodata
section, they will lose "mergeability".
GCC puts its mergeable constants in ".rodata.cstSIZE" sections,
or ".rodata.cstSIZE.<object_name>" if -fdata-sections is used.
This patch does the same:
.section .rodata.cst16.SHUF_MASK, "aM", @progbits, 16
It is important that all data in such a section consists of
16-byte elements, not larger ones, and that there is no implicit
use of one element from another.
When this is not the case, use non-mergeable section:
.section .rodata[.VAR_NAME], "a", @progbits
This reduces .data by ~15 kbytes:
    text     data      bss       dec     hex filename
11097415  2705840  2630712  16433967  fac32f vmlinux-prev.o
11112095  2690672  2630712  16433479  fac147 vmlinux.o
Merged objects are visible in System.map:
ffffffff81a28810 r POLY
ffffffff81a28810 r POLY
ffffffff81a28820 r TWOONE
ffffffff81a28820 r TWOONE
ffffffff81a28830 r PSHUFFLE_BYTE_FLIP_MASK <- merged regardless of
ffffffff81a28830 r SHUF_MASK <------------- the name difference
ffffffff81a28830 r SHUF_MASK
ffffffff81a28830 r SHUF_MASK
..
ffffffff81a28d00 r K512 <- merged three identical 640-byte tables
ffffffff81a28d00 r K512
ffffffff81a28d00 r K512
Use of object names in section name suffixes is not strictly necessary,
but might help if the link stage someday uses garbage collection
to eliminate unused sections (ld --gc-sections).
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Josh Poimboeuf <jpoimboe@redhat.com>
CC: Xiaodong Liu <xiaodong.liu@intel.com>
CC: Megha Dey <megha.dey@intel.com>
CC: linux-crypto@vger.kernel.org
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
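
To make the merging rule concrete, here is a toy model in C of what happens to a mergeable section with 16-byte entries: identical entries collapse into a single copy. This is only a sketch of the idea described in the commit message, not of the linker's actual algorithm, and `ENTSIZE` and `count_after_merge` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENTSIZE 16	/* entity size: the trailing "16" of the .section line */

/*
 * Toy model of constant merging: return how many ENTSIZE-byte entries
 * remain once identical ones are folded into one copy.  `entries` holds
 * `n` consecutive ENTSIZE-byte constants.
 */
size_t count_after_merge(const unsigned char *entries, size_t n)
{
	size_t unique = 0;

	assert(entries != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int seen = 0;

		/* Compare against every earlier entry, whole entity at a time. */
		for (size_t j = 0; j < i; j++) {
			if (!memcmp(entries + i * ENTSIZE,
				    entries + j * ENTSIZE, ENTSIZE)) {
				seen = 1;
				break;
			}
		}
		if (!seen)
			unique++;
	}
	return unique;
}
```

The comparison works on whole entities, which is why the commit message insists that every object in such a section is exactly 16 bytes and that no code relies on one constant sitting right after another.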
2017-01-20 00:33:04 +03:00
# constants in mergeable sections, linker can reorder and merge
.section	.rodata.cst16.gf128mul_x_ble_mask, "aM", @progbits, 16
.align 16
.Lgf128mul_x_ble_mask:
	.octa 0x00000000000000010000000000000087
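
The mask name suggests multiplication by x in GF(2^128) with little-endian block ("ble") ordering, as used for the XTS tweak update. A minimal C sketch of that operation follows. It assumes that 0x87 in the low quadword is the reduction constant and that the 1 in the high quadword stands for the bit carried from the low half into the high half. The type and function names are illustrative, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit value held as two quadwords, low half first (illustrative type). */
struct u128_le {
	uint64_t lo;
	uint64_t hi;
};

static_assert(sizeof(struct u128_le) == 16, "two quadwords");

/* Multiply by x in GF(2^128), reducing by x^128 + x^7 + x^2 + x + 1. */
struct u128_le gf128_mul_x_ble_sketch(struct u128_le v)
{
	struct u128_le r;
	uint64_t carry_out = v.hi >> 63;	/* bit shifted out of the top */
	uint64_t carry_mid = v.lo >> 63;	/* bit crossing from lo into hi */

	r.hi = (v.hi << 1) ^ (carry_mid * 0x01);	/* high quadword of the mask */
	r.lo = (v.lo << 1) ^ (carry_out * 0x87);	/* low quadword of the mask */
	return r;
}
```
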
.section	.rodata.cst16.POLY, "aM", @progbits, 16
.align 16
POLY:   .octa 0xC2000000000000000000000000000001
.section	.rodata.cst16.TWOONE, "aM", @progbits, 16
.align 16
TWOONE: .octa 0x00000001000000000000000000000001
.section	.rodata.cst16.SHUF_MASK, "aM", @progbits, 16
.align 16
SHUF_MASK:  .octa 0x000102030405060708090A0B0C0D0E0F
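
Assuming SHUF_MASK is used as the control operand of a byte shuffle (PSHUFB), it reverses the 16 bytes of a register, which converts between big-endian block order and little-endian register order. The following models that shuffle in plain C. The function names are hypothetical and the model is a sketch, not the kernel's code path.

```c
#include <assert.h>
#include <stdint.h>

/* SHUF_MASK as it sits in memory: .octa stores the least significant byte first. */
static const uint8_t shuf_mask_bytes[16] = {
	0x0F, 0x0E, 0x0D, 0x0C, 0x0B, 0x0A, 0x09, 0x08,
	0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01, 0x00,
};

static_assert(sizeof(shuf_mask_bytes) == 16, "one XMM register");

/*
 * Plain-C model of PSHUFB: dst[i] = src[ctl[i] & 15], or 0 when bit 7 of
 * ctl[i] is set.  A temporary allows dst and src to alias.
 */
void pshufb_model(uint8_t dst[16], const uint8_t src[16], const uint8_t ctl[16])
{
	uint8_t tmp[16];

	for (int i = 0; i < 16; i++)
		tmp[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0F];
	for (int i = 0; i < 16; i++)
		dst[i] = tmp[i];
}

/* Reverse the byte order of a 16-byte block using the mask above. */
void byte_reverse16(uint8_t dst[16], const uint8_t src[16])
{
	pshufb_model(dst, src, shuf_mask_bytes);
}
```
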
.section	.rodata.cst16.MASK1, "aM", @progbits, 16
.align 16
MASK1:      .octa 0x0000000000000000ffffffffffffffff
.section	.rodata.cst16.MASK2, "aM", @progbits, 16
.align 16
MASK2:      .octa 0xffffffffffffffff0000000000000000
.section	.rodata.cst16.ONE, "aM", @progbits, 16
.align 16
ONE:        .octa 0x00000000000000000000000000000001
.section	.rodata.cst16.F_MIN_MASK, "aM", @progbits, 16
.align 16
F_MIN_MASK: .octa 0xf1f2f3f4f5f6f7f8f9fafbfcfdfeff0
.section	.rodata.cst16.dec, "aM", @progbits, 16
.align 16
dec:        .octa 0x1
.section	.rodata.cst16.enc, "aM", @progbits, 16
.align 16
enc:        .octa 0x2

# order of these constants should not change.
# more specifically, ALL_F should follow SHIFT_MASK,
# and zero should follow ALL_F
.section	.rodata, "a", @progbits
.align 16
SHIFT_MASK: .octa 0x0f0e0d0c0b0a09080706050403020100
ALL_F:      .octa 0xffffffffffffffffffffffffffffffff
            .octa 0x00000000000000000000000000000000
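
The ordering requirement makes sense if the code reads 16-byte windows that straddle adjacent constants. As an assumed illustration (the access pattern is not shown in this part of the file), an unaligned 16-byte load that starts `16 - n` bytes into ALL_F yields `n` bytes of 0xff followed by `16 - n` zero bytes, which only works when the zero block directly follows ALL_F. The names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ALL_F followed immediately by the zero block, mirroring the layout above. */
static const uint8_t all_f_then_zero[32] = {
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	/* remaining 16 bytes are implicitly zero */
};

/*
 * Build a mask that keeps the first n bytes (0 <= n <= 16) of a block by
 * copying a 16-byte window that slides across the two adjacent constants.
 */
void low_bytes_mask(uint8_t mask[16], unsigned int n)
{
	assert(n <= 16);
	memcpy(mask, all_f_then_zero + (16 - n), 16);
}
```

If the linker were free to reorder or merge these three constants, the window would run into unrelated data, which is why they live in a plain `.rodata` section rather than a mergeable one.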

.text


#define	STACK_OFFSET	8*3

#define AadHash 16*0
#define AadLen 16*1
#define InLen (16*1)+8
#define PBlockEncKey 16*2
#define OrigIV 16*3
#define CurCount 16*4
#define PBlockLen 16*5
#define	HashKey		16*6	// store HashKey <<1 mod poly here
#define	HashKey_2	16*7	// store HashKey^2 <<1 mod poly here
#define	HashKey_3	16*8	// store HashKey^3 <<1 mod poly here
#define	HashKey_4	16*9	// store HashKey^4 <<1 mod poly here
#define	HashKey_k	16*10	// store XOR of High 64 bits and Low 64
				// bits of HashKey <<1 mod poly here
				// (for Karatsuba purposes)
#define	HashKey_2_k	16*11	// store XOR of High 64 bits and Low 64
				// bits of HashKey^2 <<1 mod poly here
				// (for Karatsuba purposes)
#define	HashKey_3_k	16*12	// store XOR of High 64 bits and Low 64
				// bits of HashKey^3 <<1 mod poly here
				// (for Karatsuba purposes)
#define	HashKey_4_k	16*13	// store XOR of High 64 bits and Low 64
				// bits of HashKey^4 <<1 mod poly here
				// (for Karatsuba purposes)
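
The offsets above describe a context block laid out in 16-byte units. One way to read them is as byte offsets into a C structure. The sketch below is an assumption for illustration only: the struct, its field names, and the `pad` member are hypothetical, chosen so that each field lands on the offset the corresponding macro names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_LEN 16	/* hypothetical name for the 16-byte unit */

/* Hypothetical C view of the context the assembly offsets index into. */
struct gcm_ctx_sketch {
	uint8_t  aad_hash[BLOCK_LEN];		/* AadHash       16*0     */
	uint64_t aad_len;			/* AadLen        16*1     */
	uint64_t in_len;			/* InLen         (16*1)+8 */
	uint8_t  pblock_enc_key[BLOCK_LEN];	/* PBlockEncKey  16*2     */
	uint8_t  orig_iv[BLOCK_LEN];		/* OrigIV        16*3     */
	uint8_t  cur_count[BLOCK_LEN];		/* CurCount      16*4     */
	uint64_t pblock_len;			/* PBlockLen     16*5     */
	uint64_t pad;				/* fills the rest of slot 5 */
	uint8_t  hash_keys[8][BLOCK_LEN];	/* HashKey .. HashKey_4_k */
};

static_assert(offsetof(struct gcm_ctx_sketch, in_len) == 16 * 1 + 8, "InLen");
static_assert(offsetof(struct gcm_ctx_sketch, cur_count) == 16 * 4, "CurCount");
static_assert(offsetof(struct gcm_ctx_sketch, pblock_len) == 16 * 5, "PBlockLen");
static_assert(offsetof(struct gcm_ctx_sketch, hash_keys) == 16 * 6, "HashKey");

/*
 * Byte offset of the idx-th hash key slot: 0..3 are HashKey..HashKey_4,
 * 4..7 are the matching Karatsuba values HashKey_k..HashKey_4_k.
 */
size_t hash_key_offset(unsigned int idx)
{
	assert(idx < 8);
	return offsetof(struct gcm_ctx_sketch, hash_keys) + (size_t)idx * BLOCK_LEN;
}
```
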

#define arg1 rdi
#define arg2 rsi
#define arg3 rdx
#define arg4 rcx
#define arg5 r8
#define arg6 r9
#define arg7 STACK_OFFSET+8(%rsp)
#define arg8 STACK_OFFSET+16(%rsp)
#define arg9 STACK_OFFSET+24(%rsp)
#define arg10 STACK_OFFSET+32(%rsp)
#define arg11 STACK_OFFSET+40(%rsp)
#define keysize 2*15*16(%arg1)
#endif
2010-11-04 22:00:45 +03:00
2009-01-18 08:28:34 +03:00
# define S T A T E 1 % x m m 0
# define S T A T E 2 % x m m 4
# define S T A T E 3 % x m m 5
# define S T A T E 4 % x m m 6
# define S T A T E S T A T E 1
# define I N 1 % x m m 1
# define I N 2 % x m m 7
# define I N 3 % x m m 8
# define I N 4 % x m m 9
# define I N I N 1
# define K E Y % x m m 2
# define I V % x m m 3

#define BSWAP_MASK %xmm10
#define CTR	%xmm11
#define INC	%xmm12

#define GF128MUL_MASK %xmm10

#ifdef __x86_64__
#define AREG	%rax
#define KEYP	%rdi
#define OUTP	%rsi
#define UKEYP	OUTP
#define INP	%rdx
#define LEN	%rcx
#define IVP	%r8
#define KLEN	%r9d
#define T1	%r10
#define TKEYP	T1
#define T2	%r11
#define TCTR_LOW T2
#else
#define AREG	%eax
#define KEYP	%edi
#define OUTP	AREG
#define UKEYP	OUTP
#define INP	%edx
#define LEN	%esi
#define IVP	%ebp
#define KLEN	%ebx
#define T1	%ecx
#define TKEYP	T1
#endif

.macro FUNC_SAVE
	push	%r12
	push	%r13
	push	%r14
#
# states of %xmm registers %xmm6:%xmm15 not saved
# all %xmm registers are clobbered
#
.endm


.macro FUNC_RESTORE
	pop	%r14
	pop	%r13
	pop	%r12
.endm

# Precompute hashkeys.
# Input: Hash subkey.
# Output: HashKeys stored in gcm_context_data.  Only needs to be called
# once per key.
# clobbers r12, and tmp xmm registers.
.macro PRECOMPUTE SUBKEY TMP1 TMP2 TMP3 TMP4 TMP5 TMP6 TMP7
	mov	\SUBKEY, %r12
	movdqu	(%r12), \TMP3
	movdqa	SHUF_MASK(%rip), \TMP2
crypto: x86 - Remove include/asm/inst.h
Current minimum required version of binutils is 2.23,
which supports PSHUFB, PCLMULQDQ, PEXTRD, AESKEYGENASSIST,
AESIMC, AESENC, AESENCLAST, AESDEC, AESDECLAST and MOVQ
instruction mnemonics.
Substitute macros from include/asm/inst.h with a proper
instruction mnemonics in various assmbly files from
x86/crypto directory, and remove now unneeded file.
The patch was tested by calculating and comparing sha256sum
hashes of stripped object files before and after the patch,
to be sure that executable code didn't change.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: "David S. Miller" <davem@davemloft.net>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-09 18:08:57 +03:00
	pshufb	\TMP2, \TMP3

	# precompute HashKey<<1 mod poly from the HashKey (required for GHASH)

	movdqa	\TMP3, \TMP2
	psllq	$1, \TMP3
	psrlq	$63, \TMP2
	movdqa	\TMP2, \TMP1
	pslldq	$8, \TMP2
	psrldq	$8, \TMP1
	por	\TMP2, \TMP3

	# reduce HashKey<<1

	pshufd	$0x24, \TMP1, \TMP2
	pcmpeqd	TWOONE(%rip), \TMP2
	pand	POLY(%rip), \TMP2
	pxor	\TMP2, \TMP3
	movdqu	\TMP3, HashKey(%arg2)

	movdqa	\TMP3, \TMP5
	pshufd	$78, \TMP3, \TMP1
	pxor	\TMP3, \TMP1
	movdqu	\TMP1, HashKey_k(%arg2)

	GHASH_MUL \TMP5, \TMP3, \TMP1, \TMP2, \TMP4, \TMP6, \TMP7
# TMP5 = HashKey^2<<1 (mod poly)
	movdqu	\TMP5, HashKey_2(%arg2)
# HashKey_2 = HashKey^2<<1 (mod poly)
	pshufd	$78, \TMP5, \TMP1
	pxor	\TMP5, \TMP1
	movdqu	\TMP1, HashKey_2_k(%arg2)

	GHASH_MUL \TMP5, \TMP3, \TMP1, \TMP2, \TMP4, \TMP6, \TMP7
# TMP5 = HashKey^3<<1 (mod poly)
	movdqu	\TMP5, HashKey_3(%arg2)
	pshufd	$78, \TMP5, \TMP1
	pxor	\TMP5, \TMP1
	movdqu	\TMP1, HashKey_3_k(%arg2)

	GHASH_MUL \TMP5, \TMP3, \TMP1, \TMP2, \TMP4, \TMP6, \TMP7
# TMP5 = HashKey^4<<1 (mod poly)
	movdqu	\TMP5, HashKey_4(%arg2)
	pshufd	$78, \TMP5, \TMP1
	pxor	\TMP5, \TMP1
	movdqu	\TMP1, HashKey_4_k(%arg2)
.endm
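
The first half of `PRECOMPUTE` can be read as a 128-bit left shift by one followed by a conditional XOR with the reduction constant. The C model below is a sketch under one stated assumption: `POLY` is defined outside this section, and the value used here (`0xC2000000000000000000000000000001`) is taken on trust rather than from the lines above. `struct u128` and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit value split into the two qwords of an xmm register. */
struct u128 {
    uint64_t hi;
    uint64_t lo;
};

/* Assumed value of POLY; the constant itself is defined elsewhere. */
#define POLY_HI 0xC200000000000000ULL
#define POLY_LO 0x0000000000000001ULL

/* Model of "HashKey<<1 mod poly": shift left by one, reduce on carry out. */
static struct u128 hashkey_shl1_mod_poly(struct u128 h)
{
    uint64_t carry_lo = h.lo >> 63;      /* bit crossing from lo into hi */
    uint64_t carry_hi = h.hi >> 63;      /* bit shifted out of the value */
    uint64_t mask = 0 - carry_hi;        /* all ones if a bit fell out   */
    struct u128 r;

    r.hi = (h.hi << 1) | carry_lo;       /* psllq/psrlq/pslldq/por       */
    r.lo = h.lo << 1;
    r.hi ^= POLY_HI & mask;              /* pcmpeqd/pand/pxor, no branch */
    r.lo ^= POLY_LO & mask;
    return r;
}

/*
 * Model of the "_k" slots: pshufd $78 swaps the two qwords, and the
 * following pxor leaves hi ^ lo in both halves.
 */
static uint64_t karatsuba_term(struct u128 h)
{
    return h.hi ^ h.lo;
}
```

Like the assembly, the model applies the reduction through a mask instead of a branch, so the instruction sequence does not depend on the key bits.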

# GCM_INIT initializes a gcm_context struct to prepare for encoding/decoding.
# Clobbers rax, r10-r13 and xmm0-xmm6, %xmm13
.macro GCM_INIT Iv SUBKEY AAD AADLEN
	mov	\AADLEN, %r11
	mov	%r11, AadLen(%arg2)		# ctx_data.aad_length = aad_length
	xor	%r11d, %r11d
	mov	%r11, InLen(%arg2)		# ctx_data.in_length = 0
	mov	%r11, PBlockLen(%arg2)		# ctx_data.partial_block_length = 0
	mov	%r11, PBlockEncKey(%arg2)	# ctx_data.partial_block_enc_key = 0
	mov	\Iv, %rax
	movdqu	(%rax), %xmm0
	movdqu	%xmm0, OrigIV(%arg2)		# ctx_data.orig_IV = iv
	movdqa	SHUF_MASK(%rip), %xmm2
	pshufb	%xmm2, %xmm0
	movdqu	%xmm0, CurCount(%arg2)		# ctx_data.current_counter = iv
crypto: aesni - Fix build with LLVM_IAS=1
When building with LLVM_IAS=1 means using Clang's Integrated Assembly (IAS)
from LLVM/Clang >= v10.0.1-rc1+ instead of GNU/as from GNU/binutils
I see the following breakage in Debian/testing AMD64:
<instantiation>:15:74: error: too many positional arguments
PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
^
arch/x86/crypto/aesni-intel_asm.S:1598:2: note: while in macro instantiation
GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp)
^
<instantiation>:47:2: error: unknown use of instruction mnemonic without a size suffix
GHASH_4_ENCRYPT_4_PARALLEL_dec %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc
^
arch/x86/crypto/aesni-intel_asm.S:1599:2: note: while in macro instantiation
GCM_ENC_DEC dec
^
<instantiation>:15:74: error: too many positional arguments
PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
^
arch/x86/crypto/aesni-intel_asm.S:1686:2: note: while in macro instantiation
GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp)
^
<instantiation>:47:2: error: unknown use of instruction mnemonic without a size suffix
GHASH_4_ENCRYPT_4_PARALLEL_enc %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc
^
arch/x86/crypto/aesni-intel_asm.S:1687:2: note: while in macro instantiation
GCM_ENC_DEC enc
Craig Topper suggested me in ClangBuiltLinux issue #1050:
> I think the "too many positional arguments" is because the parser isn't able
> to handle the trailing commas.
>
> The "unknown use of instruction mnemonic" is because the macro was named
> GHASH_4_ENCRYPT_4_PARALLEL_DEC but its being instantiated with
> GHASH_4_ENCRYPT_4_PARALLEL_dec I guess gas ignores case on the
> macro instantiation, but llvm doesn't.
First, I removed the trailing comma in the PRECOMPUTE line.
Second, I substituted:
1. GHASH_4_ENCRYPT_4_PARALLEL_DEC -> GHASH_4_ENCRYPT_4_PARALLEL_dec
2. GHASH_4_ENCRYPT_4_PARALLEL_ENC -> GHASH_4_ENCRYPT_4_PARALLEL_enc
With these changes I was able to build with LLVM_IAS=1 and boot on bare metal.
I confirmed that this works with Linux-kernel v5.7.5 final.
NOTE: This patch is on top of Linux v5.7 final.
Thanks to Craig and especially Nick for double-checking and his comments.
Suggested-by: Craig Topper <craig.topper@intel.com>
Suggested-by: Craig Topper <craig.topper@gmail.com>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: "ClangBuiltLinux" <clang-built-linux@googlegroups.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1050
Link: https://bugs.llvm.org/show_bug.cgi?id=24494
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-03 17:32:06 +03:00
	PRECOMPUTE \SUBKEY, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7
	movdqu	HashKey(%arg2), %xmm13

	CALC_AAD_HASH %xmm13, \AAD, \AADLEN, %xmm0, %xmm1, %xmm2, %xmm3, \
	%xmm4, %xmm5, %xmm6
.endm

# GCM_ENC_DEC Encodes/Decodes given data. Assumes that the passed gcm_context
# struct has been initialized by GCM_INIT.
# Requires the input data be at least 1 byte long because of READ_PARTIAL_BLOCK
# Clobbers rax, r10-r13, and xmm0-xmm15
.macro GCM_ENC_DEC operation
	movdqu	AadHash(%arg2), %xmm8
	movdqu	HashKey(%arg2), %xmm13
	add	%arg5, InLen(%arg2)

	xor	%r11d, %r11d	# initialise the data pointer offset as zero
	PARTIAL_BLOCK %arg3 %arg4 %arg5 %r11 %xmm8 \operation

	sub	%r11, %arg5	# sub partial block data used
	mov	%arg5, %r13	# save the number of bytes

	and	$-16, %r13	# %r13 = %r13 - (%r13 mod 16)
	mov	%r13, %r12
	# Encrypt/Decrypt first few blocks

	and	$(3<<4), %r12
	jz	_initial_num_blocks_is_0_\@
	cmp	$(2<<4), %r12
	jb	_initial_num_blocks_is_1_\@
	je	_initial_num_blocks_is_2_\@
_initial_num_blocks_is_3_\@:
	INITIAL_BLOCKS_ENC_DEC	%xmm9, %xmm10, %xmm13, %xmm11, %xmm12, %xmm0, \
%xmm1, %xmm2, %xmm3, %xmm4, %xmm8, %xmm5, %xmm6, 5, 678, \operation
	sub	$48, %r13
	jmp	_initial_blocks_\@
_initial_num_blocks_is_2_\@:
	INITIAL_BLOCKS_ENC_DEC	%xmm9, %xmm10, %xmm13, %xmm11, %xmm12, %xmm0, \
%xmm1, %xmm2, %xmm3, %xmm4, %xmm8, %xmm5, %xmm6, 6, 78, \operation
	sub	$32, %r13
	jmp	_initial_blocks_\@
_initial_num_blocks_is_1_\@:
	INITIAL_BLOCKS_ENC_DEC	%xmm9, %xmm10, %xmm13, %xmm11, %xmm12, %xmm0, \
%xmm1, %xmm2, %xmm3, %xmm4, %xmm8, %xmm5, %xmm6, 7, 8, \operation
	sub	$16, %r13
	jmp	_initial_blocks_\@
_initial_num_blocks_is_0_\@:
	INITIAL_BLOCKS_ENC_DEC	%xmm9, %xmm10, %xmm13, %xmm11, %xmm12, %xmm0, \
%xmm1, %xmm2, %xmm3, %xmm4, %xmm8, %xmm5, %xmm6, 8, 0, \operation
_initial_blocks_\@:

	# Main loop - Encrypt/Decrypt remaining blocks

	test	%r13, %r13
	je	_zero_cipher_left_\@
	sub	$64, %r13
	je	_four_cipher_left_\@
_crypt_by_4_\@:
	GHASH_4_ENCRYPT_4_PARALLEL_\operation	%xmm9, %xmm10, %xmm11, %xmm12, \
	%xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, \
	%xmm7, %xmm8, enc
	add	$64, %r11
	sub	$64, %r13
	jne	_crypt_by_4_\@
_four_cipher_left_\@:
	GHASH_LAST_4	%xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, \
%xmm15, %xmm1, %xmm2, %xmm3, %xmm4, %xmm8
_zero_cipher_left_\@:
	movdqu	%xmm8, AadHash(%arg2)
	movdqu	%xmm0, CurCount(%arg2)

	mov	%arg5, %r13
	and	$15, %r13		# %r13 = arg5 (mod 16)
	je	_multiple_of_16_bytes_\@

	mov	%r13, PBlockLen(%arg2)

	# Handle the last <16 Byte block separately
	paddd	ONE(%rip), %xmm0	# INCR CNT to get Yn
	movdqu	%xmm0, CurCount(%arg2)
	movdqa	SHUF_MASK(%rip), %xmm10
	pshufb	%xmm10, %xmm0

	ENCRYPT_SINGLE_BLOCK	%xmm0, %xmm1	# Encrypt(K, Yn)
	movdqu	%xmm0, PBlockEncKey(%arg2)

	cmp	$16, %arg5
	jge	_large_enough_update_\@

	lea	(%arg4,%r11,1), %r10
	mov	%r13, %r12
	READ_PARTIAL_BLOCK %r10 %r12 %xmm2 %xmm1
	jmp	_data_read_\@

_large_enough_update_\@:
	sub	$16, %r11
	add	%r13, %r11

	# receive the last <16 Byte block
	movdqu	(%arg4, %r11, 1), %xmm1

	sub	%r13, %r11
	add	$16, %r11

	lea	SHIFT_MASK+16(%rip), %r12
	# adjust the shuffle mask pointer to be able to shift 16-r13 bytes
	# (r13 is the number of bytes in plaintext mod 16)
	sub	%r13, %r12
	# get the appropriate shuffle mask
	movdqu	(%r12), %xmm2
	# shift right 16-r13 bytes
	pshufb	%xmm2, %xmm1

_data_read_\@:
	lea	ALL_F+16(%rip), %r12
	sub	%r13, %r12

.ifc \operation, dec
	movdqa	%xmm1, %xmm2
.endif
	pxor	%xmm1, %xmm0		# XOR Encrypt(K, Yn)
	movdqu	(%r12), %xmm1
	# get the appropriate mask to mask out top 16-r13 bytes of xmm0
	pand	%xmm1, %xmm0		# mask out top 16-r13 bytes of xmm0
.ifc \operation, dec
	pand	%xmm1, %xmm2
	movdqa	SHUF_MASK(%rip), %xmm10
	pshufb	%xmm10, %xmm2

	pxor	%xmm2, %xmm8
.else
	movdqa	SHUF_MASK(%rip), %xmm10
	pshufb	%xmm10, %xmm0

	pxor	%xmm0, %xmm8
.endif

	movdqu	%xmm8, AadHash(%arg2)
.ifc \operation, enc
	# GHASH computation for the last <16 byte block
	movdqa	SHUF_MASK(%rip), %xmm10
	# shuffle xmm0 back to output as ciphertext
	pshufb	%xmm10, %xmm0
.endif

	# Output %r13 bytes
	movq	%xmm0, %rax
	cmp	$8, %r13
	jle	_less_than_8_bytes_left_\@
	mov	%rax, (%arg3, %r11, 1)
	add	$8, %r11
	psrldq	$8, %xmm0
	movq	%xmm0, %rax
	sub	$8, %r13
_less_than_8_bytes_left_\@:
	mov	%al, (%arg3, %r11, 1)
	add	$1, %r11
	shr	$8, %rax
	sub	$1, %r13
	jne	_less_than_8_bytes_left_\@
_multiple_of_16_bytes_\@:
.endm
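
The way `GCM_ENC_DEC` carves up its input can be summarised with a small byte-accounting model. The C sketch below is illustrative only: `struct gcm_split` and `split_length` are hypothetical names, `len` stands for the value left in `%arg5` after `PARTIAL_BLOCK` has run (the sketch assumes no bytes were consumed there), and it models only the counts, not the way the macros pipeline encryption and GHASH.

```c
#include <assert.h>
#include <stddef.h>

/* How one call's input is divided between the three processing stages. */
struct gcm_split {
    size_t initial_blocks;  /* 0..3 blocks handled before the 4-wide loop */
    size_t groups_of_four;  /* number of 64-byte groups                   */
    size_t tail_bytes;      /* final partial block, 0..15 bytes           */
};

static struct gcm_split split_length(size_t len)
{
    struct gcm_split s;
    size_t full = len & ~(size_t)15;               /* and $-16, %r13      */

    s.initial_blocks = (full & (3u << 4)) >> 4;    /* and $(3<<4), %r12   */
    full -= s.initial_blocks * 16;                 /* sub $48/$32/$16     */
    s.groups_of_four = full / 64;                  /* 64 bytes per group  */
    s.tail_bytes = len & 15;                       /* and $15, %r13       */
    return s;
}
```

For a 100-byte input the model yields two initial blocks, one group of four blocks, and a 4-byte tail: 32 + 64 + 4 = 100.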

# GCM_COMPLETE Finishes update of tag of last partial block
# Output: Authentication Tag (AUTH_TAG)
# Clobbers rax, r10-r12, and xmm0, xmm1, xmm5-xmm15
.macro GCM_COMPLETE AUTHTAG AUTHTAGLEN
	movdqu	AadHash(%arg2), %xmm8
	movdqu	HashKey(%arg2), %xmm13

	mov	PBlockLen(%arg2), %r12

	test	%r12, %r12
	je	_partial_done\@

	GHASH_MUL %xmm8, %xmm13, %xmm9, %xmm10, %xmm11, %xmm5, %xmm6

_partial_done\@:
	mov	AadLen(%arg2), %r12	# %r12 = aadLen (number of bytes)
	shl	$3, %r12		# convert into number of bits
	movd	%r12d, %xmm15		# len(A) in %xmm15
	mov	InLen(%arg2), %r12
	shl	$3, %r12		# len(C) in bits (*128)
	movq	%r12, %xmm1

	pslldq	$8, %xmm15		# %xmm15 = len(A)||0x0000000000000000
	pxor	%xmm1, %xmm15		# %xmm15 = len(A)||len(C)
	pxor	%xmm15, %xmm8
	GHASH_MUL %xmm8, %xmm13, %xmm9, %xmm10, %xmm11, %xmm5, %xmm6
	# final GHASH computation
	movdqa	SHUF_MASK(%rip), %xmm10
	pshufb	%xmm10, %xmm8
2018-02-14 20:38:57 +03:00
2018-02-14 20:39:45 +03:00
	movdqu	OrigIV(%arg2), %xmm0	# %xmm0 = Y0
2018-02-14 20:38:57 +03:00
	ENCRYPT_SINGLE_BLOCK %xmm0, %xmm1	# E(K, Y0)
	pxor	%xmm8, %xmm0
_return_T_\@:
2018-02-14 20:40:47 +03:00
	mov	\AUTHTAG, %r10	# %r10 = authTag
	mov	\AUTHTAGLEN, %r11	# %r11 = auth_tag_len
2018-02-14 20:38:57 +03:00
	cmp	$16, %r11
	je	_T_16_\@
	cmp	$8, %r11
	jl	_T_4_\@
_T_8_\@:
2020-07-09 18:08:57 +03:00
	movq	%xmm0, %rax
2018-02-14 20:38:57 +03:00
	mov	%rax, (%r10)
	add	$8, %r10
	sub	$8, %r11
	psrldq	$8, %xmm0
2020-11-27 12:44:52 +03:00
	test	%r11, %r11
2018-02-14 20:38:57 +03:00
	je	_return_T_done_\@
_T_4_\@:
	movd	%xmm0, %eax
	mov	%eax, (%r10)
	add	$4, %r10
	sub	$4, %r11
	psrldq	$4, %xmm0
2020-11-27 12:44:52 +03:00
	test	%r11, %r11
2018-02-14 20:38:57 +03:00
	je	_return_T_done_\@
_T_123_\@:
	movd	%xmm0, %eax
	cmp	$2, %r11
	jl	_T_1_\@
	mov	%ax, (%r10)
	cmp	$2, %r11
	je	_return_T_done_\@
	add	$2, %r10
	sar	$16, %eax
_T_1_\@:
	mov	%al, (%r10)
	jmp	_return_T_done_\@
_T_16_\@:
	movdqu	%xmm0, (%r10)
_return_T_done_\@:
.endm
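The tag write-out above stores the first auth_tag_len bytes of %xmm0 using progressively narrower stores (16, then 8, 4, 2 and 1 bytes). A minimal Python model of that sequence, together with the len(A)||len(C) block in its GCM-specification byte order, might look like the sketch below. The function names are illustrative and not part of the kernel, and the model assumes auth_tag_len is one of 4, 8, 12, 13, 14, 15 or 16, which is what the branch structure appears to expect. The assembly keeps the length block byte-reflected; the sketch shows the unreflected form.

```python
import struct

# Assumption: only these tag lengths reach the store sequence.
VALID_TAG_LENGTHS = (4, 8, 12, 13, 14, 15, 16)


def length_block(aad_len: int, text_len: int) -> bytes:
    """len(A) || len(C): byte counts converted to bits, two big-endian 64-bit fields."""
    return struct.pack(">QQ", aad_len * 8, text_len * 8)


def store_tag(tag16: bytes, tag_len: int) -> bytes:
    """Model of the _T_16 / _T_8 / _T_4 / _T_123 stores: emit the first tag_len bytes."""
    if len(tag16) != 16 or tag_len not in VALID_TAG_LENGTHS:
        raise ValueError("unsupported tag or tag length")
    if tag_len == 16:                      # _T_16: a single 16-byte store
        return bytes(tag16)
    reg = int.from_bytes(tag16, "little")  # %xmm0 viewed as a 128-bit integer
    out = bytearray()
    remaining = tag_len
    for width in (8, 4):                   # _T_8, then _T_4
        if remaining >= width:
            out += (reg & ((1 << (8 * width)) - 1)).to_bytes(width, "little")
            reg >>= 8 * width              # psrldq $width, %xmm0
            remaining -= width
    # _T_123: zero to three trailing bytes (a 2-byte store, then a 1-byte store)
    out += (reg & ((1 << (8 * remaining)) - 1)).to_bytes(remaining, "little")
    return bytes(out)
```

For every supported length the model returns the same bytes as slicing the tag, which is the property the store sequence is meant to preserve.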
2010-11-29 03:35:39 +03:00
#ifdef __x86_64__
2010-11-04 22:00:45 +03:00
/* GHASH_MUL MACRO to implement: Data*HashKey mod (128,127,126,121,0)
*
*
* Input: A and B (128-bits each, bit-reflected)
* Output: C = A*B*x mod poly, (i.e. >>1 )
* To compute GH = GH*HashKey mod poly, give HK = HashKey<<1 mod poly as input
* GH = GH * HK * x mod poly which is equivalent to GH*HashKey mod poly.
*
*/
.macro GHASH_MUL GH HK TMP1 TMP2 TMP3 TMP4 TMP5
	movdqa	\GH, \TMP1
	pshufd	$78, \GH, \TMP2
	pshufd	$78, \HK, \TMP3
	pxor	\GH, \TMP2	# TMP2 = a1+a0
	pxor	\HK, \TMP3	# TMP3 = b1+b0
2020-07-09 18:08:57 +03:00
	pclmulqdq $0x11, \HK, \TMP1	# TMP1 = a1*b1
	pclmulqdq $0x00, \HK, \GH	# GH = a0*b0
	pclmulqdq $0x00, \TMP3, \TMP2	# TMP2 = (a0+a1)*(b1+b0)
2010-11-04 22:00:45 +03:00
	pxor	\GH, \TMP2
	pxor	\TMP1, \TMP2	# TMP2 = (a0*b1)+(a1*b0)
	movdqa	\TMP2, \TMP3
	pslldq	$8, \TMP3	# left shift TMP3 2 DWs
	psrldq	$8, \TMP2	# right shift TMP2 2 DWs
	pxor	\TMP3, \GH
	pxor	\TMP2, \TMP1	# TMP1:GH holds the result of GH*HK
	# first phase of the reduction
	movdqa	\GH, \TMP2
	movdqa	\GH, \TMP3
	movdqa	\GH, \TMP4	# copy GH into TMP2, TMP3 and TMP4
	# in order to perform
	# independent shifts
	pslld	$31, \TMP2	# packed left shift << 31
	pslld	$30, \TMP3	# packed left shift << 30
	pslld	$25, \TMP4	# packed left shift << 25
	pxor	\TMP3, \TMP2	# xor the shifted versions
	pxor	\TMP4, \TMP2
	movdqa	\TMP2, \TMP5
	psrldq	$4, \TMP5	# right shift TMP5 1 DW
	pslldq	$12, \TMP2	# left shift TMP2 3 DWs
	pxor	\TMP2, \GH
	# second phase of the reduction
	movdqa	\GH,\TMP2	# copy GH into TMP2, TMP3 and TMP4
	# in order to perform
	# independent shifts
	movdqa	\GH,\TMP3
	movdqa	\GH,\TMP4
	psrld	$1,\TMP2	# packed right shift >> 1
	psrld	$2,\TMP3	# packed right shift >> 2
	psrld	$7,\TMP4	# packed right shift >> 7
	pxor	\TMP3,\TMP2	# xor the shifted versions
	pxor	\TMP4,\TMP2
	pxor	\TMP5, \TMP2
	pxor	\TMP2, \GH
	pxor	\TMP1, \GH	# result is in GH
.endm
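The first half of GHASH_MUL builds a 256-bit carry-less product from three 64x64 PCLMULQDQ multiplications (Karatsuba) instead of four. The Python sketch below illustrates only that step: it checks that the three-multiplication form equals a plain schoolbook carry-less product. The helper names clmul and karatsuba_clmul128 are hypothetical, the reduction phases are not modelled, and the operands are treated as ordinary integers rather than in the bit-reflected form the macro uses.

```python
MASK64 = (1 << 64) - 1


def clmul(a: int, b: int) -> int:
    """Carry-less (XOR) product of two non-negative integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def karatsuba_clmul128(a: int, b: int) -> int:
    """128x128 -> 256-bit carry-less product using three 64x64 products."""
    a1, a0 = a >> 64, a & MASK64
    b1, b0 = b >> 64, b & MASK64
    hi = clmul(a1, b1)                        # pclmulqdq $0x11: a1*b1
    lo = clmul(a0, b0)                        # pclmulqdq $0x00: a0*b0
    mid = clmul(a0 ^ a1, b0 ^ b1) ^ hi ^ lo   # (a0+a1)*(b0+b1) + a1*b1 + a0*b0 = a0*b1 + a1*b0
    # pslldq/psrldq $8 split mid across the two halves; here it is one shift
    return (hi << 128) ^ (mid << 64) ^ lo
```

Because addition in GF(2) is XOR, the middle term needs no subtraction: XORing out the high and low products leaves exactly the two cross products.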
2017-12-21 04:08:37 +03:00
# Reads DLEN bytes starting at DPTR and stores in XMMDst
# where 0 < DLEN < 16
# Clobbers %rax, DLEN and XMM1
.macro READ_PARTIAL_BLOCK DPTR DLEN XMM1 XMMDst
	cmp	$8, \DLEN
	jl	_read_lt8_\@
	mov	(\DPTR), %rax
2020-07-09 18:08:57 +03:00
	movq	%rax, \XMMDst
2017-12-21 04:08:37 +03:00
	sub	$8, \DLEN
	jz	_done_read_partial_block_\@
	xor	%eax, %eax
_read_next_byte_\@:
	shl	$8, %rax
	mov	7(\DPTR, \DLEN, 1), %al
	dec	\DLEN
	jnz	_read_next_byte_\@
2020-07-09 18:08:57 +03:00
	movq	%rax, \XMM1
2017-12-21 04:08:37 +03:00
	pslldq	$8, \XMM1
	por	\XMM1, \XMMDst
	jmp	_done_read_partial_block_\@
_read_lt8_\@:
	xor	%eax, %eax
_read_next_byte_lt8_\@:
	shl	$8, %rax
	mov	-1(\DPTR, \DLEN, 1), %al
	dec	\DLEN
	jnz	_read_next_byte_lt8_\@
2020-07-09 18:08:57 +03:00
	movq	%rax, \XMMDst
2017-12-21 04:08:37 +03:00
_done_read_partial_block_\@:
.endm
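READ_PARTIAL_BLOCK gathers between 1 and 15 bytes into a zero-padded 16-byte register without touching memory past the end of the data: one 8-byte load when at least 8 bytes are available, then single-byte loads walking backwards from the last byte. A Python model of that access pattern could look like the following sketch. The function name and the offset parameter are illustrative and not taken from the kernel.

```python
def read_partial_block(buf: bytes, offset: int, dlen: int) -> bytes:
    """Return buf[offset:offset+dlen] zero-padded to 16 bytes, reading no other bytes."""
    if not 0 < dlen < 16:
        raise ValueError("dlen must satisfy 0 < dlen < 16")
    lo = 0  # low quadword of the destination register
    hi = 0  # high quadword of the destination register
    if dlen >= 8:
        # mov (DPTR), %rax: one 8-byte little-endian load
        lo = int.from_bytes(buf[offset:offset + 8], "little")
        remaining = dlen - 8
        while remaining:
            # shl $8, %rax ; mov 7(DPTR, DLEN, 1), %al
            hi = (hi << 8) | buf[offset + 7 + remaining]
            remaining -= 1
    else:
        remaining = dlen
        while remaining:
            # shl $8, %rax ; mov -1(DPTR, DLEN, 1), %al
            lo = (lo << 8) | buf[offset + remaining - 1]
            remaining -= 1
    # pslldq $8 / por: combine the two halves
    return (lo | (hi << 64)).to_bytes(16, "little")
```

Reading the tail from the last byte towards the first means each shift pushes earlier-read bytes upward, so the bytes end up in memory order once the loop finishes.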
2018-02-14 20:39:36 +03:00
# CALC_AAD_HASH: Calculates the hash of the data which will not be encrypted.
# clobbers r10-11, xmm14
2018-02-14 20:40:47 +03:00
.macro CALC_AAD_HASH HASHKEY AAD AADLEN TMP1 TMP2 TMP3 TMP4 TMP5 \
2018-02-14 20:39:36 +03:00
	TMP6 TMP7
	MOVADQ	SHUF_MASK(%rip), %xmm14
2018-02-14 20:40:47 +03:00
	mov	\AAD, %r10	# %r10 = AAD
	mov	\AADLEN, %r11	# %r11 = aadLen
2018-02-14 20:39:36 +03:00
	pxor	\TMP7, \TMP7
	pxor	\TMP6, \TMP6
2017-04-28 19:11:56 +03:00
	cmp	$16, %r11
2018-02-14 20:38:12 +03:00
	jl	_get_AAD_rest\@
_get_AAD_blocks\@:
2018-02-14 20:39:36 +03:00
	movdqu	(%r10), \TMP7
2020-07-09 18:08:57 +03:00
	pshufb	%xmm14, \TMP7	# byte-reflect the AAD data
2018-02-14 20:39:36 +03:00
	pxor	\TMP7, \TMP6
	GHASH_MUL \TMP6, \HASHKEY, \TMP1, \TMP2, \TMP3, \TMP4, \TMP5
2017-04-28 19:11:56 +03:00
	add	$16, %r10
	sub	$16, %r11
	cmp	$16, %r11
2018-02-14 20:38:12 +03:00
	jge	_get_AAD_blocks\@
2017-04-28 19:11:56 +03:00
2018-02-14 20:39:36 +03:00
	movdqu	\TMP6, \TMP7
2017-12-21 04:08:38 +03:00
	/* read the last <16B of AAD */
2018-02-14 20:38:12 +03:00
_get_AAD_rest\@:
2020-11-27 12:44:52 +03:00
	test	%r11, %r11
2018-02-14 20:38:12 +03:00
	je	_get_AAD_done\@
2017-04-28 19:11:56 +03:00
2018-02-14 20:39:36 +03:00
	READ_PARTIAL_BLOCK %r10, %r11, \TMP1, \TMP7
2020-07-09 18:08:57 +03:00
	pshufb	%xmm14, \TMP7	# byte-reflect the AAD data
2018-02-14 20:39:36 +03:00
	pxor	\TMP6, \TMP7
	GHASH_MUL \TMP7, \HASHKEY, \TMP1, \TMP2, \TMP3, \TMP4, \TMP5
	movdqu	\TMP7, \TMP6
2010-12-13 14:51:15 +03:00
2018-02-14 20:38:12 +03:00
_get_AAD_done\@:
2018-02-14 20:39:36 +03:00
	movdqu	\TMP6, AadHash(%arg2)
.endm
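CALC_AAD_HASH folds the additional authenticated data into the GHASH state one 16-byte block at a time, zero-padding a final short block. The sketch below is a bit-serial Python reference for that recurrence, written from the GCM specification rather than translated from the assembly: it works on unreflected big-endian blocks and takes the plain hash key H, whereas the macro uses byte-reflected data and HashKey<<1 mod poly. The names gf128_mul and ghash_update are illustrative.

```python
# Reduction constant for x^128 = x^7 + x^2 + x + 1 in GCM's bit order
# (the most significant bit of the integer is the coefficient of x^0).
R = 0xE1 << 120


def gf128_mul(x: int, y: int) -> int:
    """Multiply two GF(2^128) elements held as 128-bit big-endian integers."""
    z = 0
    v = y
    for i in range(128):
        if (x >> (127 - i)) & 1:   # walk the bits of x from x^0 upward
            z ^= v
        # multiply v by x: shift towards higher powers, reduce on overflow
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z


def ghash_update(h: bytes, state: bytes, data: bytes) -> bytes:
    """Fold data into the 16-byte GHASH state; a short last block is zero-padded."""
    h_int = int.from_bytes(h, "big")
    acc = int.from_bytes(state, "big")
    for off in range(0, len(data), 16):
        block = data[off:off + 16].ljust(16, b"\x00")
        acc = gf128_mul(acc ^ int.from_bytes(block, "big"), h_int)
    return acc.to_bytes(16, "big")
```

Starting from an all-zero state and calling ghash_update with the AAD gives the value the macro stores in AadHash, up to the byte reflection mentioned above.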
2018-02-14 20:40:19 +03:00
# PARTIAL_BLOCK: Handles encryption/decryption and the tag partial blocks
# between update calls.
# Requires the input data be at least 1 byte long due to READ_PARTIAL_BLOCK
# Outputs encrypted bytes, and updates hash and partial info in gcm_data_context
# Clobbers rax, r10, r12, r13, xmm0-6, xmm9-13
.macro PARTIAL_BLOCK CYPH_PLAIN_OUT PLAIN_CYPH_IN PLAIN_CYPH_LEN DATA_OFFSET \
	AAD_HASH operation
	mov	PBlockLen(%arg2), %r13
2020-11-27 12:44:52 +03:00
	test	%r13, %r13
2018-02-14 20:40:19 +03:00
	je	_partial_block_done_\@	# Leave Macro if no partial blocks
	# Read in input data without overreading
	cmp	$16, \PLAIN_CYPH_LEN
	jl	_fewer_than_16_bytes_\@
	movups	(\PLAIN_CYPH_IN), %xmm1	# If at least 16 bytes, just fill xmm
	jmp	_data_read_\@
_fewer_than_16_bytes_\@:
	lea	(\PLAIN_CYPH_IN, \DATA_OFFSET, 1), %r10
	mov	\PLAIN_CYPH_LEN, %r12
	READ_PARTIAL_BLOCK %r10 %r12 %xmm0 %xmm1
	mov	PBlockLen(%arg2), %r13
_data_read_\@:	# Finished reading in data
	movdqu	PBlockEncKey(%arg2), %xmm9
	movdqu	HashKey(%arg2), %xmm13
	lea	SHIFT_MASK(%rip), %r12
	# adjust the shuffle mask pointer to be able to shift r13 bytes
	# (16-r13 is the number of bytes in plaintext mod 16)
	add	%r13, %r12
	movdqu	(%r12), %xmm2	# get the appropriate shuffle mask
2020-07-09 18:08:57 +03:00
	pshufb	%xmm2, %xmm9	# shift right r13 bytes
2018-02-14 20:40:19 +03:00
.ifc \operation, dec
	movdqa	%xmm1, %xmm3
	pxor	%xmm1, %xmm9	# Ciphertext XOR E(K, Yn)
	mov	\PLAIN_CYPH_LEN, %r10
	add	%r13, %r10
	# Set r10 to be the amount of data left in PLAIN_CYPH_IN after filling
	sub	$16, %r10
	# Determine if partial block is not being filled and
	# shift mask accordingly
	jge	_no_extra_mask_1_\@
	sub	%r10, %r12
_no_extra_mask_1_\@:
	movdqu	ALL_F-SHIFT_MASK(%r12), %xmm1
	# get the appropriate mask to mask out bottom r13 bytes of xmm9
	pand	%xmm1, %xmm9	# mask out bottom r13 bytes of xmm9
	pand	%xmm1, %xmm3
	movdqa	SHUF_MASK(%rip), %xmm10
2020-07-09 18:08:57 +03:00
	pshufb	%xmm10, %xmm3
	pshufb	%xmm2, %xmm3
2018-02-14 20:40:19 +03:00
	pxor	%xmm3, \AAD_HASH
2020-11-27 12:44:52 +03:00
	test	%r10, %r10
2018-02-14 20:40:19 +03:00
	jl	_partial_incomplete_1_\@
	# GHASH computation for the last <16 Byte block
	GHASH_MUL \AAD_HASH, %xmm13, %xmm0, %xmm10, %xmm11, %xmm5, %xmm6
2018-07-02 13:31:54 +03:00
	xor	%eax, %eax
2018-02-14 20:40:19 +03:00
	mov	%rax, PBlockLen(%arg2)
	jmp	_dec_done_\@
_partial_incomplete_1_\@:
	add	\PLAIN_CYPH_LEN, PBlockLen(%arg2)
_dec_done_\@:
	movdqu	\AAD_HASH, AadHash(%arg2)
.else
	pxor	%xmm1, %xmm9	# Plaintext XOR E(K, Yn)
	mov	\PLAIN_CYPH_LEN, %r10
	add	%r13, %r10
	# Set r10 to be the amount of data left in PLAIN_CYPH_IN after filling
	sub	$16, %r10
	# Determine if partial block is not being filled and
	# shift mask accordingly
	jge	_no_extra_mask_2_\@
	sub	%r10, %r12
_no_extra_mask_2_\@:
	movdqu	ALL_F-SHIFT_MASK(%r12), %xmm1
	# get the appropriate mask to mask out bottom r13 bytes of xmm9
	pand	%xmm1, %xmm9
	movdqa	SHUF_MASK(%rip), %xmm1
2020-07-09 18:08:57 +03:00
	pshufb	%xmm1, %xmm9
	pshufb	%xmm2, %xmm9
2018-02-14 20:40:19 +03:00
	pxor	%xmm9, \AAD_HASH
2020-11-27 12:44:52 +03:00
	test	%r10, %r10
2018-02-14 20:40:19 +03:00
	jl	_partial_incomplete_2_\@
	# GHASH computation for the last <16 Byte block
	GHASH_MUL \AAD_HASH, %xmm13, %xmm0, %xmm10, %xmm11, %xmm5, %xmm6
2018-07-02 13:31:54 +03:00
	xor	%eax, %eax
2018-02-14 20:40:19 +03:00
	mov	%rax, PBlockLen(%arg2)
	jmp	_encode_done_\@
_partial_incomplete_2_\@:
	add	\PLAIN_CYPH_LEN, PBlockLen(%arg2)
_encode_done_\@:
	movdqu	\AAD_HASH, AadHash(%arg2)
	movdqa	SHUF_MASK(%rip), %xmm10
	# shuffle xmm9 back to output as ciphertext
2020-07-09 18:08:57 +03:00
	pshufb	%xmm10, %xmm9
	pshufb	%xmm2, %xmm9
2018-02-14 20:40:19 +03:00
.endif
	# output encrypted Bytes
2020-11-27 12:44:52 +03:00
	test	%r10, %r10
2018-02-14 20:40:19 +03:00
	jl	_partial_fill_\@
	mov	%r13, %r12
	mov	$16, %r13
	# Set r13 to be the number of bytes to write out
	sub	%r12, %r13
	jmp	_count_set_\@
_partial_fill_\@:
	mov	\PLAIN_CYPH_LEN, %r13
_count_set_\@:
	movdqa	%xmm9, %xmm0
2020-07-09 18:08:57 +03:00
	movq	%xmm0, %rax
2018-02-14 20:40:19 +03:00
	cmp	$8, %r13
	jle	_less_than_8_bytes_left_\@
	mov	%rax, (\CYPH_PLAIN_OUT, \DATA_OFFSET, 1)
	add	$8, \DATA_OFFSET
	psrldq	$8, %xmm0
2020-07-09 18:08:57 +03:00
	movq	%xmm0, %rax
2018-02-14 20:40:19 +03:00
	sub	$8, %r13
_less_than_8_bytes_left_\@:
	movb	%al, (\CYPH_PLAIN_OUT, \DATA_OFFSET, 1)
	add	$1, \DATA_OFFSET
	shr	$8, %rax
	sub	$1, %r13
	jne	_less_than_8_bytes_left_\@
_partial_block_done_\@:
.endm # PARTIAL_BLOCK
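The length bookkeeping in PARTIAL_BLOCK reduces to one comparison: whether the new input, added to the bytes already buffered, reaches a full 16-byte block. A small Python model of that decision, covering only the counters and not the encryption or GHASH work, might read as follows. The function name and the returned tuple are illustrative choices, not kernel interfaces.

```python
def partial_block_step(pblock_len: int, in_len: int) -> tuple[int, int]:
    """Return (bytes_written, new_pblock_len) for one PARTIAL_BLOCK invocation.

    pblock_len models PBlockLen (bytes already buffered, 0..15);
    in_len models PLAIN_CYPH_LEN (at least 1 when a partial block exists).
    """
    if pblock_len == 0:
        return 0, 0                        # je _partial_block_done_: nothing to do
    left_over = in_len + pblock_len - 16   # %r10 after "sub $16, %r10"
    if left_over >= 0:
        # The block is completed: write the missing 16 - pblock_len bytes
        # and reset PBlockLen to zero.
        return 16 - pblock_len, 0
    # _partial_incomplete / _partial_fill: consume all input, block still open.
    return in_len, pblock_len + in_len
```

For example, with 5 bytes buffered, a 3-byte update leaves 8 bytes buffered, while an update of 11 bytes or more completes the block and consumes exactly 11 of them.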
2018-02-14 20:39:36 +03:00
/*
* if a = number of total plaintext bytes
* b = floor(a/16)
* num_initial_blocks = b mod 4
* encrypt the initial num_initial_blocks blocks and apply ghash on
* the ciphertext
* %r10, %r11, %r12, %rax, %xmm5, %xmm6, %xmm7, %xmm8, %xmm9 registers
* are clobbered
2018-02-14 20:40:10 +03:00
* %arg1, %arg2, %arg3 are used as a pointer only, not modified
2018-02-14 20:39:36 +03:00
*/
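The two small calculations used by INITIAL_BLOCKS_ENC_DEC, the number of initial blocks and the number of AESENC rounds derived from the key size, can be written out as the Python sketch below. It assumes that keysize holds the key length in bytes (16, 24 or 32), which is what the "shr $2" / "add $5" arithmetic and its 4/6/8 and 9/11/13 comments imply. Both function names are illustrative.

```python
def num_initial_blocks(total_bytes: int) -> int:
    """b = floor(a / 16); num_initial_blocks = b mod 4."""
    return (total_bytes // 16) % 4


def aesenc_rounds(key_bytes: int) -> int:
    """Rounds run with AESENC before the final AESENCLAST.

    Mirrors "shr $2, %eax" followed by "add $5, %eax".
    Assumption: key_bytes is 16, 24 or 32 (AES-128/192/256).
    """
    if key_bytes not in (16, 24, 32):
        raise ValueError("unsupported AES key length")
    return (key_bytes >> 2) + 5
```

Together with the initial whitening XOR and the final AESENCLAST, 9, 11 or 13 AESENC rounds give the usual 10, 12 or 14 AES rounds.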
.macro INITIAL_BLOCKS_ENC_DEC TMP1 TMP2 TMP3 TMP4 TMP5 XMM0 XMM1 \
	XMM2 XMM3 XMM4 XMMDst TMP6 TMP7 i i_seq operation
2018-02-14 20:39:45 +03:00
	MOVADQ	SHUF_MASK(%rip), %xmm14
2018-02-14 20:39:36 +03:00
	movdqu	AadHash(%arg2), %xmm\i	# load the running GHASH value (AadHash)
2017-04-28 19:11:56 +03:00
	# start AES for num_initial_blocks blocks
2010-12-13 14:51:15 +03:00
2018-02-14 20:39:45 +03:00
	movdqu	CurCount(%arg2), \XMM0	# XMM0 = Y0
2010-12-13 14:51:15 +03:00
.if (\i == 5) || (\i == 6) || (\i == 7)
2015-01-13 21:16:43 +03:00
	MOVADQ	ONE(%RIP),\TMP1
	MOVADQ	0(%arg1),\TMP2
2010-12-13 14:51:15 +03:00
.irpc index, \i_seq
2015-01-13 21:16:43 +03:00
	paddd	\TMP1, \XMM0	# INCR Y0
2018-02-14 20:38:12 +03:00
.ifc \operation, dec
	movdqa	\XMM0, %xmm\index
.else
2015-01-13 21:16:43 +03:00
	MOVADQ	\XMM0, %xmm\index
2018-02-14 20:38:12 +03:00
.endif
2020-07-09 18:08:57 +03:00
	pshufb	%xmm14, %xmm\index	# perform a 16 byte swap
2015-01-13 21:16:43 +03:00
	pxor	\TMP2, %xmm\index
2010-12-13 14:51:15 +03:00
.endr
2015-01-13 21:16:43 +03:00
	lea	0x10(%arg1),%r10
	mov	keysize,%eax
	shr	$2,%eax	# 128->4, 192->6, 256->8
	add	$5,%eax	# 128->9, 192->11, 256->13
2018-02-14 20:38:12 +03:00
aes_loop_initial_\@:
2015-01-13 21:16:43 +03:00
	MOVADQ	(%r10),\TMP1
.irpc index, \i_seq
2020-07-09 18:08:57 +03:00
	aesenc	\TMP1, %xmm\index
2010-12-13 14:51:15 +03:00
.endr
2015-01-13 21:16:43 +03:00
	add	$16,%r10
	sub	$1,%eax
2018-02-14 20:38:12 +03:00
	jnz	aes_loop_initial_\@
2015-01-13 21:16:43 +03:00
	MOVADQ	(%r10), \TMP1
2010-12-13 14:51:15 +03:00
.irpc index, \i_seq
2020-07-09 18:08:57 +03:00
	aesenclast \TMP1, %xmm\index	# Last Round
2010-12-13 14:51:15 +03:00
.endr
.irpc index, \i_seq
2018-02-14 20:39:23 +03:00
	movdqu	(%arg4, %r11, 1), \TMP1
2010-12-13 14:51:15 +03:00
	pxor	\TMP1, %xmm\index
2018-02-14 20:39:23 +03:00
	movdqu	%xmm\index, (%arg3, %r11, 1)
2010-12-13 14:51:15 +03:00
	# write back plaintext/ciphertext for num_initial_blocks
	add	$16, %r11
2018-02-14 20:38:12 +03:00
.ifc \operation, dec
	movdqa	\TMP1, %xmm\index
.endif
2020-07-09 18:08:57 +03:00
	pshufb %xmm14, %xmm\index
2010-12-13 14:51:15 +03:00
	# prepare plaintext/ciphertext for GHASH computation
.endr
.endif
2017-04-28 19:11:56 +03:00
2010-12-13 14:51:15 +03:00
	# apply GHASH on num_initial_blocks blocks
.if \i == 5
	pxor %xmm5, %xmm6
	GHASH_MUL %xmm6, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
	pxor %xmm6, %xmm7
	GHASH_MUL %xmm7, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
	pxor %xmm7, %xmm8
	GHASH_MUL %xmm8, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
.elseif \i == 6
	pxor %xmm6, %xmm7
	GHASH_MUL %xmm7, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
	pxor %xmm7, %xmm8
	GHASH_MUL %xmm8, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
.elseif \i == 7
	pxor %xmm7, %xmm8
	GHASH_MUL %xmm8, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
.endif
	cmp $64, %r13
2018-02-14 20:38:12 +03:00
	jl _initial_blocks_done\@
2010-12-13 14:51:15 +03:00
	# no need for precomputed values
/*
 *
 * Precomputations for HashKey parallel with encryption of first 4 blocks.
 * HashKey_i_k holds XORed values of the low and high parts of the HashKey_i
 */
2015-01-13 21:16:43 +03:00
	MOVADQ ONE(%RIP),\TMP1
	paddd \TMP1, \XMM0 # INCR Y0
	MOVADQ \XMM0, \XMM1
2020-07-09 18:08:57 +03:00
	pshufb %xmm14, \XMM1 # perform a 16 byte swap
2010-12-13 14:51:15 +03:00
2015-01-13 21:16:43 +03:00
	paddd \TMP1, \XMM0 # INCR Y0
	MOVADQ \XMM0, \XMM2
2020-07-09 18:08:57 +03:00
	pshufb %xmm14, \XMM2 # perform a 16 byte swap
2010-12-13 14:51:15 +03:00
2015-01-13 21:16:43 +03:00
	paddd \TMP1, \XMM0 # INCR Y0
	MOVADQ \XMM0, \XMM3
2020-07-09 18:08:57 +03:00
	pshufb %xmm14, \XMM3 # perform a 16 byte swap
2010-12-13 14:51:15 +03:00
2015-01-13 21:16:43 +03:00
	paddd \TMP1, \XMM0 # INCR Y0
	MOVADQ \XMM0, \XMM4
2020-07-09 18:08:57 +03:00
	pshufb %xmm14, \XMM4 # perform a 16 byte swap
2010-12-13 14:51:15 +03:00
2015-01-13 21:16:43 +03:00
	MOVADQ 0(%arg1),\TMP1
	pxor \TMP1, \XMM1
	pxor \TMP1, \XMM2
	pxor \TMP1, \XMM3
	pxor \TMP1, \XMM4
2010-12-13 14:51:15 +03:00
.irpc index, 1234 # do 4 rounds
	movaps 0x10*\index(%arg1), \TMP1
2020-07-09 18:08:57 +03:00
	aesenc \TMP1, \XMM1
	aesenc \TMP1, \XMM2
	aesenc \TMP1, \XMM3
	aesenc \TMP1, \XMM4
2010-12-13 14:51:15 +03:00
.endr
.irpc index, 56789 # do next 5 rounds
	movaps 0x10*\index(%arg1), \TMP1
2020-07-09 18:08:57 +03:00
	aesenc \TMP1, \XMM1
	aesenc \TMP1, \XMM2
	aesenc \TMP1, \XMM3
	aesenc \TMP1, \XMM4
2010-12-13 14:51:15 +03:00
.endr
2015-01-13 21:16:43 +03:00
	lea 0xa0(%arg1),%r10
	mov keysize,%eax
	shr $2,%eax # 128->4, 192->6, 256->8
	sub $4,%eax # 128->0, 192->2, 256->4
2018-02-14 20:38:12 +03:00
	jz aes_loop_pre_done\@
2015-01-13 21:16:43 +03:00
2018-02-14 20:38:12 +03:00
aes_loop_pre_\@:
2015-01-13 21:16:43 +03:00
	MOVADQ (%r10),\TMP2
.irpc index, 1234
2020-07-09 18:08:57 +03:00
	aesenc \TMP2, %xmm\index
2015-01-13 21:16:43 +03:00
.endr
	add $16,%r10
	sub $1,%eax
2018-02-14 20:38:12 +03:00
	jnz aes_loop_pre_\@
2015-01-13 21:16:43 +03:00
2018-02-14 20:38:12 +03:00
aes_loop_pre_done\@:
2015-01-13 21:16:43 +03:00
	MOVADQ (%r10), \TMP2
2020-07-09 18:08:57 +03:00
	aesenclast \TMP2, \XMM1
	aesenclast \TMP2, \XMM2
	aesenclast \TMP2, \XMM3
	aesenclast \TMP2, \XMM4
2018-02-14 20:39:23 +03:00
	movdqu 16*0(%arg4, %r11, 1), \TMP1
2010-12-13 14:51:15 +03:00
	pxor \TMP1, \XMM1
2018-02-14 20:38:12 +03:00
.ifc \operation, dec
2018-02-14 20:39:23 +03:00
	movdqu \XMM1, 16*0(%arg3, %r11, 1)
2018-02-14 20:38:12 +03:00
	movdqa \TMP1, \XMM1
.endif
2018-02-14 20:39:23 +03:00
	movdqu 16*1(%arg4, %r11, 1), \TMP1
2010-12-13 14:51:15 +03:00
	pxor \TMP1, \XMM2
2018-02-14 20:38:12 +03:00
.ifc \operation, dec
2018-02-14 20:39:23 +03:00
	movdqu \XMM2, 16*1(%arg3, %r11, 1)
2018-02-14 20:38:12 +03:00
	movdqa \TMP1, \XMM2
.endif
2018-02-14 20:39:23 +03:00
	movdqu 16*2(%arg4, %r11, 1), \TMP1
2010-12-13 14:51:15 +03:00
	pxor \TMP1, \XMM3
2018-02-14 20:38:12 +03:00
.ifc \operation, dec
2018-02-14 20:39:23 +03:00
	movdqu \XMM3, 16*2(%arg3, %r11, 1)
2018-02-14 20:38:12 +03:00
	movdqa \TMP1, \XMM3
.endif
2018-02-14 20:39:23 +03:00
	movdqu 16*3(%arg4, %r11, 1), \TMP1
2010-12-13 14:51:15 +03:00
	pxor \TMP1, \XMM4
2018-02-14 20:38:12 +03:00
.ifc \operation, dec
2018-02-14 20:39:23 +03:00
	movdqu \XMM4, 16*3(%arg3, %r11, 1)
2018-02-14 20:38:12 +03:00
	movdqa \TMP1, \XMM4
.else
2018-02-14 20:39:23 +03:00
	movdqu \XMM1, 16*0(%arg3, %r11, 1)
	movdqu \XMM2, 16*1(%arg3, %r11, 1)
	movdqu \XMM3, 16*2(%arg3, %r11, 1)
	movdqu \XMM4, 16*3(%arg3, %r11, 1)
2018-02-14 20:38:12 +03:00
.endif
2010-12-13 14:51:15 +03:00
2010-11-04 22:00:45 +03:00
	add $64, %r11
2020-07-09 18:08:57 +03:00
	pshufb %xmm14, \XMM1 # perform a 16 byte swap
2010-11-04 22:00:45 +03:00
	pxor \XMMDst, \XMM1
	# combine GHASHed value with the corresponding ciphertext
2020-07-09 18:08:57 +03:00
	pshufb %xmm14, \XMM2 # perform a 16 byte swap
	pshufb %xmm14, \XMM3 # perform a 16 byte swap
	pshufb %xmm14, \XMM4 # perform a 16 byte swap
2010-12-13 14:51:15 +03:00
2018-02-14 20:38:12 +03:00
_initial_blocks_done\@:
2010-12-13 14:51:15 +03:00
2010-11-04 22:00:45 +03:00
.endm
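The key-size arithmetic in the macro above (`shr $2`, `sub $4`, then the `aes_loop_pre_` loop starting at offset `0xa0`) can be restated outside assembly. The following is only an illustrative sketch: it assumes `keysize` holds the AES key length in bytes (16, 24 or 32), which is what the `128->4, 192->6, 256->8` comment implies, and the function names are hypothetical and do not exist in the kernel source.

```python
def extra_rounds(key_len_bytes: int) -> int:
    """Mirror `shr $2` followed by `sub $4`: 16 -> 0, 24 -> 2, 32 -> 4."""
    if key_len_bytes not in (16, 24, 32):
        raise ValueError("AES key length must be 16, 24 or 32 bytes")
    return (key_len_bytes >> 2) - 4


def total_rounds(key_len_bytes: int) -> int:
    """9 unrolled aesenc rounds, the looped extra rounds, then 1 aesenclast."""
    return 9 + extra_rounds(key_len_bytes) + 1


def last_round_key_offset(key_len_bytes: int) -> int:
    """Byte offset of the round key used by aesenclast.

    The loop starts at 0xa0 and advances 16 bytes per extra round.
    """
    return 0xA0 + 16 * extra_rounds(key_len_bytes)


if __name__ == "__main__":
    for bits in (128, 192, 256):
        n = bits // 8
        print(bits, total_rounds(n), hex(last_round_key_offset(n)))
```

Under that assumption the totals come out as the standard 10, 12 and 14 AES rounds, which is why the loop is skipped entirely (`jz aes_loop_pre_done`) for 128-bit keys.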
/*
 * encrypt 4 blocks at a time
 * ghash the 4 previously encrypted ciphertext blocks
2018-02-14 20:39:23 +03:00
 * %arg1, %arg3, %arg4 are used as pointers only, not modified
2010-11-04 22:00:45 +03:00
 * %r11 is the data offset value
 */
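The GHASH part of this macro relies on Karatsuba multiplication: each 128-bit carry-less product is built from three 64-bit products, annotated in the code as `a1*b1`, `a0*b0` and `(a1+a0)*(b1+b0)`, where `+` is XOR. The sketch below illustrates only that identity. It is not the kernel's implementation: it leaves out GCM's bit reflection and the reduction modulo the GHASH polynomial, and all function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def clmul(a: int, b: int) -> int:
    """Carry-less (GF(2)[x]) product of two non-negative integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def clmul128_karatsuba(a: int, b: int) -> int:
    """128x128 carry-less product from three 64x64 products."""
    a1, a0 = a >> 64, a & MASK64
    b1, b0 = b >> 64, b & MASK64
    hi = clmul(a1, b1)                 # a1*b1
    lo = clmul(a0, b0)                 # a0*b0
    # (a1+a0)*(b1+b0) minus hi and lo leaves the middle term a1*b0 + a0*b1
    mid = clmul(a1 ^ a0, b1 ^ b0) ^ hi ^ lo
    return (hi << 128) ^ (mid << 64) ^ lo


if __name__ == "__main__":
    x = 0x0123456789ABCDEFFEDCBA9876543210
    y = 0xDEADBEEFCAFEBABE0123456789ABCDEF
    # The Karatsuba form must agree with the direct product.
    print(clmul128_karatsuba(x, y) == clmul(x, y))
```

The `b1 ^ b0` operand depends only on the hash key, which is presumably why the `HashKey_i_k` values described earlier are precomputed once rather than recomputed per block.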
crypto: aesni - Fix build with LLVM_IAS=1
Building with LLVM_IAS=1 means using Clang's Integrated Assembler (IAS)
from LLVM/Clang >= v10.0.1-rc1+ instead of GNU as from GNU binutils.
Doing so, I see the following breakage in Debian/testing AMD64:
<instantiation>:15:74: error: too many positional arguments
PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
^
arch/x86/crypto/aesni-intel_asm.S:1598:2: note: while in macro instantiation
GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp)
^
<instantiation>:47:2: error: unknown use of instruction mnemonic without a size suffix
GHASH_4_ENCRYPT_4_PARALLEL_dec %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc
^
arch/x86/crypto/aesni-intel_asm.S:1599:2: note: while in macro instantiation
GCM_ENC_DEC dec
^
<instantiation>:15:74: error: too many positional arguments
PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
^
arch/x86/crypto/aesni-intel_asm.S:1686:2: note: while in macro instantiation
GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp)
^
<instantiation>:47:2: error: unknown use of instruction mnemonic without a size suffix
GHASH_4_ENCRYPT_4_PARALLEL_enc %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc
^
arch/x86/crypto/aesni-intel_asm.S:1687:2: note: while in macro instantiation
GCM_ENC_DEC enc
Craig Topper suggested in ClangBuiltLinux issue #1050:
> I think the "too many positional arguments" is because the parser isn't able
> to handle the trailing commas.
>
> The "unknown use of instruction mnemonic" is because the macro was named
> GHASH_4_ENCRYPT_4_PARALLEL_DEC but its being instantiated with
> GHASH_4_ENCRYPT_4_PARALLEL_dec I guess gas ignores case on the
> macro instantiation, but llvm doesn't.
First, I removed the trailing comma in the PRECOMPUTE line.
Second, I substituted:
1. GHASH_4_ENCRYPT_4_PARALLEL_DEC -> GHASH_4_ENCRYPT_4_PARALLEL_dec
2. GHASH_4_ENCRYPT_4_PARALLEL_ENC -> GHASH_4_ENCRYPT_4_PARALLEL_enc
With these changes I was able to build with LLVM_IAS=1 and boot on bare metal.
I confirmed that this works with Linux-kernel v5.7.5 final.
NOTE: This patch is on top of Linux v5.7 final.
Thanks to Craig and especially Nick for double-checking and his comments.
Suggested-by: Craig Topper <craig.topper@intel.com>
Suggested-by: Craig Topper <craig.topper@gmail.com>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: "ClangBuiltLinux" <clang-built-linux@googlegroups.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1050
Link: https://bugs.llvm.org/show_bug.cgi?id=24494
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-03 17:32:06 +03:00
.macro GHASH_4_ENCRYPT_4_PARALLEL_enc TMP1 TMP2 TMP3 TMP4 TMP5 \
2010-12-13 14:51:15 +03:00
TMP6 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 operation
	movdqa \XMM1, \XMM5
	movdqa \XMM2, \XMM6
	movdqa \XMM3, \XMM7
	movdqa \XMM4, \XMM8
	movdqa SHUF_MASK(%rip), %xmm15
	# multiply TMP5 * HashKey using karatsuba
	movdqa \XMM5, \TMP4
	pshufd $78, \XMM5, \TMP6
	pxor \XMM5, \TMP6
	paddd ONE(%rip), \XMM0 # INCR CNT
2018-08-15 20:29:42 +03:00
	movdqu HashKey_4(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
	pclmulqdq $0x11, \TMP5, \TMP4 # TMP4 = a1*b1
2010-12-13 14:51:15 +03:00
	movdqa \XMM0, \XMM1
	paddd ONE(%rip), \XMM0 # INCR CNT
	movdqa \XMM0, \XMM2
	paddd ONE(%rip), \XMM0 # INCR CNT
	movdqa \XMM0, \XMM3
	paddd ONE(%rip), \XMM0 # INCR CNT
	movdqa \XMM0, \XMM4
2020-07-09 18:08:57 +03:00
	pshufb %xmm15, \XMM1 # perform a 16 byte swap
	pclmulqdq $0x00, \TMP5, \XMM5 # XMM5 = a0*b0
	pshufb %xmm15, \XMM2 # perform a 16 byte swap
	pshufb %xmm15, \XMM3 # perform a 16 byte swap
	pshufb %xmm15, \XMM4 # perform a 16 byte swap
2010-12-13 14:51:15 +03:00
	pxor (%arg1), \XMM1
	pxor (%arg1), \XMM2
	pxor (%arg1), \XMM3
	pxor (%arg1), \XMM4
2018-08-15 20:29:42 +03:00
	movdqu HashKey_4_k(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
	pclmulqdq $0x00, \TMP5, \TMP6 # TMP6 = (a1+a0)*(b1+b0)
2010-12-13 14:51:15 +03:00
	movaps 0x10(%arg1), \TMP1
2020-07-09 18:08:57 +03:00
	aesenc \TMP1, \XMM1 # Round 1
	aesenc \TMP1, \XMM2
	aesenc \TMP1, \XMM3
	aesenc \TMP1, \XMM4
2010-12-13 14:51:15 +03:00
	movaps 0x20(%arg1), \TMP1
2020-07-09 18:08:57 +03:00
	aesenc \TMP1, \XMM1 # Round 2
	aesenc \TMP1, \XMM2
	aesenc \TMP1, \XMM3
	aesenc \TMP1, \XMM4
2010-12-13 14:51:15 +03:00
	movdqa \XMM6, \TMP1
	pshufd $78, \XMM6, \TMP2
	pxor \XMM6, \TMP2
2018-08-15 20:29:42 +03:00
	movdqu HashKey_3(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
	pclmulqdq $0x11, \TMP5, \TMP1 # TMP1 = a1*b1
2010-12-13 14:51:15 +03:00
	movaps 0x30(%arg1), \TMP3
2020-07-09 18:08:57 +03:00
	aesenc \TMP3, \XMM1 # Round 3
	aesenc \TMP3, \XMM2
	aesenc \TMP3, \XMM3
	aesenc \TMP3, \XMM4
	pclmulqdq $0x00, \TMP5, \XMM6 # XMM6 = a0*b0
2010-12-13 14:51:15 +03:00
	movaps 0x40(%arg1), \TMP3
2020-07-09 18:08:57 +03:00
	aesenc \TMP3, \XMM1 # Round 4
	aesenc \TMP3, \XMM2
	aesenc \TMP3, \XMM3
	aesenc \TMP3, \XMM4
2018-08-15 20:29:42 +03:00
	movdqu HashKey_3_k(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
	pclmulqdq $0x00, \TMP5, \TMP2 # TMP2 = (a1+a0)*(b1+b0)
2010-12-13 14:51:15 +03:00
	movaps 0x50(%arg1), \TMP3
2020-07-09 18:08:57 +03:00
	aesenc \TMP3, \XMM1 # Round 5
	aesenc \TMP3, \XMM2
	aesenc \TMP3, \XMM3
	aesenc \TMP3, \XMM4
2010-12-13 14:51:15 +03:00
	pxor \TMP1, \TMP4
	# accumulate the results in TMP4:XMM5, TMP6 holds the middle part
	pxor \XMM6, \XMM5
	pxor \TMP2, \TMP6
	movdqa \XMM7, \TMP1
	pshufd $78, \XMM7, \TMP2
	pxor \XMM7, \TMP2
2018-08-15 20:29:42 +03:00
	movdqu HashKey_2(%arg2), \TMP5
2010-12-13 14:51:15 +03:00
	# Multiply TMP5 * HashKey using karatsuba
2020-07-09 18:08:57 +03:00
pclmulqdq $0x11, \TMP5, \TMP1  # TMP1 = a1*b1
2010-12-13 14:51:15 +03:00
movaps 0x60(%arg1), \TMP3
2020-07-09 18:08:57 +03:00
aesenc \TMP3, \XMM1  # Round 6
aesenc \TMP3, \XMM2
aesenc \TMP3, \XMM3
aesenc \TMP3, \XMM4
pclmulqdq $0x00, \TMP5, \XMM7  # XMM7 = a0*b0
2010-12-13 14:51:15 +03:00
movaps 0x70(%arg1), \TMP3
2020-07-09 18:08:57 +03:00
aesenc \TMP3, \XMM1  # Round 7
aesenc \TMP3, \XMM2
aesenc \TMP3, \XMM3
aesenc \TMP3, \XMM4
2018-08-15 20:29:42 +03:00
movdqu HashKey_2_k(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
pclmulqdq $0x00, \TMP5, \TMP2  # TMP2 = (a1+a0)*(b1+b0)
2010-12-13 14:51:15 +03:00
movaps 0x80(%arg1), \TMP3
2020-07-09 18:08:57 +03:00
aesenc \TMP3, \XMM1  # Round 8
aesenc \TMP3, \XMM2
aesenc \TMP3, \XMM3
aesenc \TMP3, \XMM4
2010-12-13 14:51:15 +03:00
pxor \TMP1, \TMP4
# accumulate the results in TMP4:XMM5, TMP6 holds the middle part
pxor \XMM7, \XMM5
pxor \TMP2, \TMP6
# Multiply XMM8 * HashKey
# XMM8 and TMP5 hold the values for the two operands
movdqa \XMM8, \TMP1
pshufd $78, \XMM8, \TMP2
pxor \XMM8, \TMP2
2018-08-15 20:29:42 +03:00
movdqu HashKey(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
pclmulqdq $0x11, \TMP5, \TMP1  # TMP1 = a1*b1
2010-12-13 14:51:15 +03:00
movaps 0x90(%arg1), \TMP3
2020-07-09 18:08:57 +03:00
aesenc \TMP3, \XMM1  # Round 9
aesenc \TMP3, \XMM2
aesenc \TMP3, \XMM3
aesenc \TMP3, \XMM4
pclmulqdq $0x00, \TMP5, \XMM8  # XMM8 = a0*b0
2015-01-13 21:16:43 +03:00
lea 0xa0(%arg1),%r10
mov keysize,%eax
shr $2,%eax  # 128->4, 192->6, 256->8
sub $4,%eax  # 128->0, 192->2, 256->4
2018-02-14 20:40:47 +03:00
jz aes_loop_par_enc_done\@
2015-01-13 21:16:43 +03:00
2018-02-14 20:40:47 +03:00
aes_loop_par_enc\@:
2015-01-13 21:16:43 +03:00
MOVADQ (%r10),\TMP3
.irpc index, 1234
2020-07-09 18:08:57 +03:00
aesenc \TMP3, %xmm\index
2015-01-13 21:16:43 +03:00
.endr
add $16,%r10
sub $1,%eax
2018-02-14 20:40:47 +03:00
jnz aes_loop_par_enc\@
2015-01-13 21:16:43 +03:00
2018-02-14 20:40:47 +03:00
aes_loop_par_enc_done\@:
2015-01-13 21:16:43 +03:00
MOVADQ (%r10), \TMP3
2020-07-09 18:08:57 +03:00
aesenclast \TMP3, \XMM1  # last round (Round 10 for 128-bit keys)
aesenclast \TMP3, \XMM2
aesenclast \TMP3, \XMM3
aesenclast \TMP3, \XMM4
2018-08-15 20:29:42 +03:00
movdqu HashKey_k(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
pclmulqdq $0x00, \TMP5, \TMP2  # TMP2 = (a1+a0)*(b1+b0)
2018-02-14 20:39:23 +03:00
movdqu (%arg4,%r11,1), \TMP3
2010-12-13 14:51:15 +03:00
pxor \TMP3, \XMM1  # Ciphertext/Plaintext XOR EK
2018-02-14 20:39:23 +03:00
movdqu 16(%arg4,%r11,1), \TMP3
2010-12-13 14:51:15 +03:00
pxor \TMP3, \XMM2  # Ciphertext/Plaintext XOR EK
2018-02-14 20:39:23 +03:00
movdqu 32(%arg4,%r11,1), \TMP3
2010-12-13 14:51:15 +03:00
pxor \TMP3, \XMM3  # Ciphertext/Plaintext XOR EK
2018-02-14 20:39:23 +03:00
movdqu 48(%arg4,%r11,1), \TMP3
2010-12-13 14:51:15 +03:00
pxor \TMP3, \XMM4  # Ciphertext/Plaintext XOR EK
2018-02-14 20:39:23 +03:00
movdqu \XMM1, (%arg3,%r11,1)  # Write to the ciphertext buffer
movdqu \XMM2, 16(%arg3,%r11,1)  # Write to the ciphertext buffer
movdqu \XMM3, 32(%arg3,%r11,1)  # Write to the ciphertext buffer
movdqu \XMM4, 48(%arg3,%r11,1)  # Write to the ciphertext buffer
2020-07-09 18:08:57 +03:00
pshufb %xmm15, \XMM1  # perform a 16 byte swap
pshufb %xmm15, \XMM2  # perform a 16 byte swap
pshufb %xmm15, \XMM3  # perform a 16 byte swap
pshufb %xmm15, \XMM4  # perform a 16 byte swap
2010-12-13 14:51:15 +03:00
pxor \TMP4, \TMP1
pxor \XMM8, \XMM5
pxor \TMP6, \TMP2
pxor \TMP1, \TMP2
pxor \XMM5, \TMP2
movdqa \TMP2, \TMP3
pslldq $8, \TMP3  # left shift TMP3 2 DWs
psrldq $8, \TMP2  # right shift TMP2 2 DWs
pxor \TMP3, \XMM5
pxor \TMP2, \TMP1  # accumulate the results in TMP1:XMM5
# first phase of reduction
movdqa \XMM5, \TMP2
movdqa \XMM5, \TMP3
movdqa \XMM5, \TMP4
# move XMM5 into TMP2, TMP3, TMP4 in order to perform shifts independently
pslld $31, \TMP2  # packed left shift << 31
pslld $30, \TMP3  # packed left shift << 30
pslld $25, \TMP4  # packed left shift << 25
pxor \TMP3, \TMP2  # xor the shifted versions
pxor \TMP4, \TMP2
movdqa \TMP2, \TMP5
psrldq $4, \TMP5  # right shift TMP5 1 DW
pslldq $12, \TMP2  # left shift TMP2 3 DWs
pxor \TMP2, \XMM5
# second phase of reduction
movdqa \XMM5,\TMP2  # make 3 copies of XMM5 into TMP2, TMP3, TMP4
movdqa \XMM5,\TMP3
movdqa \XMM5,\TMP4
psrld $1, \TMP2  # packed right shift >> 1
psrld $2, \TMP3  # packed right shift >> 2
psrld $7, \TMP4  # packed right shift >> 7
pxor \TMP3,\TMP2  # xor the shifted versions
pxor \TMP4,\TMP2
pxor \TMP5, \TMP2
pxor \TMP2, \XMM5
pxor \TMP1, \XMM5  # result is in XMM5
pxor \XMM5, \XMM1
.endm
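The `shr $2` / `sub $4` pair in the macro above turns the key size into the number of extra AES rounds that the `aes_loop_par_enc` loop executes between the nine unrolled `aesenc` rounds and the final `aesenclast`. The sketch below restates that arithmetic in Python. It assumes `keysize` holds the key length in bytes (16, 24 or 32), which is consistent with the `128->4, 192->6, 256->8` comments but is not stated in this section; the function names are illustrative only.

```python
def extra_rounds(key_bytes: int) -> int:
    """Loop trip count: mirrors `shr $2,%eax` followed by `sub $4,%eax`."""
    if key_bytes not in (16, 24, 32):
        raise ValueError("AES keys are 16, 24 or 32 bytes long")
    return (key_bytes >> 2) - 4


def total_rounds(key_bytes: int) -> int:
    """Nine unrolled aesenc rounds, the looped rounds, then one aesenclast."""
    return 9 + extra_rounds(key_bytes) + 1


if __name__ == "__main__":
    for bits in (128, 192, 256):
        print(bits, extra_rounds(bits // 8), total_rounds(bits // 8))
```

For a 128-bit key the trip count is zero, which is why the assembly tests the result with `jz` and skips the loop entirely.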
/*
* decrypt 4 blocks at a time
* ghash the 4 previously decrypted ciphertext blocks
2018-02-14 20:39:23 +03:00
* arg1, %arg3, %arg4 are used as pointers only, not modified
2010-12-13 14:51:15 +03:00
* %r11 is the data offset value
*/
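The comments in the four-block macros name three partial products per block: `a1*b1`, `a0*b0` and `(a1+a0)*(b1+b0)`, where `+` is XOR. The following is a minimal Python sketch of that Karatsuba step, assuming `a` and `b` are 128-bit polynomials over GF(2) split into 64-bit halves. `clmul64` models a single PCLMULQDQ, and every function name here is hypothetical rather than taken from the kernel source. The sketch leaves out the byte swapping and the two-phase reduction that the assembly performs afterwards.

```python
MASK64 = (1 << 64) - 1


def clmul64(a: int, b: int) -> int:
    """Carry-less product of two 64-bit values (what one PCLMULQDQ computes)."""
    result = 0
    for i in range(64):
        if (b >> i) & 1:
            result ^= a << i
    return result


def clmul128_karatsuba(a: int, b: int) -> int:
    """256-bit carry-less product of two 128-bit values using three clmul64 calls."""
    a1, a0 = a >> 64, a & MASK64
    b1, b0 = b >> 64, b & MASK64
    hi = clmul64(a1, b1)                # a1*b1
    lo = clmul64(a0, b0)                # a0*b0
    mid = clmul64(a1 ^ a0, b1 ^ b0)     # (a1+a0)*(b1+b0)
    mid ^= hi ^ lo                      # leaves a1*b0 + a0*b1
    return (hi << 128) ^ (mid << 64) ^ lo


def clmul_schoolbook(a: int, b: int) -> int:
    """Reference bit-by-bit carry-less product, used only to check the sketch."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result
```

Karatsuba needs three carry-less multiplications per block instead of four, at the cost of a few extra XORs. The separate `HashKey_<i>_k` loads before each middle-term multiply suggest that the XOR of the two hash-key halves is precomputed, though that is an inference from the comments, not something this section states.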
crypto: aesni - Fix build with LLVM_IAS=1
When building with LLVM_IAS=1, which means using Clang's Integrated Assembler (IAS)
from LLVM/Clang >= v10.0.1-rc1+ instead of GNU as from GNU binutils,
I see the following breakage in Debian/testing AMD64:
<instantiation>:15:74: error: too many positional arguments
PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
^
arch/x86/crypto/aesni-intel_asm.S:1598:2: note: while in macro instantiation
GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp)
^
<instantiation>:47:2: error: unknown use of instruction mnemonic without a size suffix
GHASH_4_ENCRYPT_4_PARALLEL_dec %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc
^
arch/x86/crypto/aesni-intel_asm.S:1599:2: note: while in macro instantiation
GCM_ENC_DEC dec
^
<instantiation>:15:74: error: too many positional arguments
PRECOMPUTE 8*3+8(%rsp), %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7,
^
arch/x86/crypto/aesni-intel_asm.S:1686:2: note: while in macro instantiation
GCM_INIT %r9, 8*3 +8(%rsp), 8*3 +16(%rsp), 8*3 +24(%rsp)
^
<instantiation>:47:2: error: unknown use of instruction mnemonic without a size suffix
GHASH_4_ENCRYPT_4_PARALLEL_enc %xmm9, %xmm10, %xmm11, %xmm12, %xmm13, %xmm14, %xmm0, %xmm1, %xmm2, %xmm3, %xmm4, %xmm5, %xmm6, %xmm7, %xmm8, enc
^
arch/x86/crypto/aesni-intel_asm.S:1687:2: note: while in macro instantiation
GCM_ENC_DEC enc
Craig Topper suggested the following in ClangBuiltLinux issue #1050:
> I think the "too many positional arguments" is because the parser isn't able
> to handle the trailing commas.
>
> The "unknown use of instruction mnemonic" is because the macro was named
> GHASH_4_ENCRYPT_4_PARALLEL_DEC but its being instantiated with
> GHASH_4_ENCRYPT_4_PARALLEL_dec I guess gas ignores case on the
> macro instantiation, but llvm doesn't.
First, I removed the trailing comma in the PRECOMPUTE line.
Second, I substituted:
1. GHASH_4_ENCRYPT_4_PARALLEL_DEC -> GHASH_4_ENCRYPT_4_PARALLEL_dec
2. GHASH_4_ENCRYPT_4_PARALLEL_ENC -> GHASH_4_ENCRYPT_4_PARALLEL_enc
With these changes I was able to build with LLVM_IAS=1 and boot on bare metal.
I confirmed that this works with Linux-kernel v5.7.5 final.
NOTE: This patch is on top of Linux v5.7 final.
Thanks to Craig and especially Nick for double-checking and his comments.
Suggested-by: Craig Topper <craig.topper@intel.com>
Suggested-by: Craig Topper <craig.topper@gmail.com>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: "ClangBuiltLinux" <clang-built-linux@googlegroups.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1050
Link: https://bugs.llvm.org/show_bug.cgi?id=24494
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-03 17:32:06 +03:00
.macro GHASH_4_ENCRYPT_4_PARALLEL_dec TMP1 TMP2 TMP3 TMP4 TMP5 \
2010-11-04 22:00:45 +03:00
TMP6 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 operation
movdqa \XMM1, \XMM5
movdqa \XMM2, \XMM6
movdqa \XMM3, \XMM7
movdqa \XMM4, \XMM8
2010-12-13 14:51:15 +03:00
movdqa SHUF_MASK(%rip), %xmm15
2010-11-04 22:00:45 +03:00
# multiply TMP5 * HashKey using karatsuba
movdqa \XMM5, \TMP4
pshufd $78, \XMM5, \TMP6
pxor \XMM5, \TMP6
paddd ONE(%rip), \XMM0  # INCR CNT
2018-08-15 20:29:42 +03:00
movdqu HashKey_4(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
pclmulqdq $0x11, \TMP5, \TMP4  # TMP4 = a1*b1
2010-11-04 22:00:45 +03:00
movdqa \XMM0, \XMM1
paddd ONE(%rip), \XMM0  # INCR CNT
movdqa \XMM0, \XMM2
paddd ONE(%rip), \XMM0  # INCR CNT
movdqa \XMM0, \XMM3
paddd ONE(%rip), \XMM0  # INCR CNT
movdqa \XMM0, \XMM4
2020-07-09 18:08:57 +03:00
pshufb %xmm15, \XMM1  # perform a 16 byte swap
pclmulqdq $0x00, \TMP5, \XMM5  # XMM5 = a0*b0
pshufb %xmm15, \XMM2  # perform a 16 byte swap
pshufb %xmm15, \XMM3  # perform a 16 byte swap
pshufb %xmm15, \XMM4  # perform a 16 byte swap
2010-12-13 14:51:15 +03:00
2010-11-04 22:00:45 +03:00
pxor (%arg1), \XMM1
pxor (%arg1), \XMM2
pxor (%arg1), \XMM3
pxor (%arg1), \XMM4
2018-08-15 20:29:42 +03:00
movdqu HashKey_4_k(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
pclmulqdq $0x00, \TMP5, \TMP6  # TMP6 = (a1+a0)*(b1+b0)
2010-11-04 22:00:45 +03:00
movaps 0x10(%arg1), \TMP1
2020-07-09 18:08:57 +03:00
aesenc \TMP1, \XMM1  # Round 1
aesenc \TMP1, \XMM2
aesenc \TMP1, \XMM3
aesenc \TMP1, \XMM4
2010-11-04 22:00:45 +03:00
movaps 0x20(%arg1), \TMP1
2020-07-09 18:08:57 +03:00
aesenc \TMP1, \XMM1  # Round 2
aesenc \TMP1, \XMM2
aesenc \TMP1, \XMM3
aesenc \TMP1, \XMM4
2010-11-04 22:00:45 +03:00
movdqa \XMM6, \TMP1
pshufd $78, \XMM6, \TMP2
pxor \XMM6, \TMP2
2018-08-15 20:29:42 +03:00
movdqu HashKey_3(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
pclmulqdq $0x11, \TMP5, \TMP1  # TMP1 = a1*b1
2010-11-04 22:00:45 +03:00
movaps 0x30(%arg1), \TMP3
2020-07-09 18:08:57 +03:00
aesenc \TMP3, \XMM1  # Round 3
aesenc \TMP3, \XMM2
aesenc \TMP3, \XMM3
aesenc \TMP3, \XMM4
pclmulqdq $0x00, \TMP5, \XMM6  # XMM6 = a0*b0
2010-11-04 22:00:45 +03:00
movaps 0x40(%arg1), \TMP3
2020-07-09 18:08:57 +03:00
aesenc \TMP3, \XMM1  # Round 4
aesenc \TMP3, \XMM2
aesenc \TMP3, \XMM3
aesenc \TMP3, \XMM4
2018-08-15 20:29:42 +03:00
movdqu HashKey_3_k(%arg2), \TMP5
2020-07-09 18:08:57 +03:00
pclmulqdq $0x00, \TMP5, \TMP2  # TMP2 = (a1+a0)*(b1+b0)
2010-11-04 22:00:45 +03:00
movaps 0x50(%arg1), \TMP3
2020-07-09 18:08:57 +03:00
aesenc \TMP3, \XMM1  # Round 5
aesenc \TMP3, \XMM2
aesenc \TMP3, \XMM3
aesenc \TMP3, \XMM4
2010-11-04 22:00:45 +03:00
pxor \TMP1, \TMP4
# accumulate the results in TMP4:XMM5, TMP6 holds the middle part
pxor \XMM6, \XMM5
pxor \TMP2, \TMP6
movdqa \XMM7, \TMP1
pshufd $78, \XMM7, \TMP2
pxor \XMM7, \TMP2
2018-08-15 20:29:42 +03:00
movdqu HashKey_2(%arg2), \TMP5
2010-11-04 22:00:45 +03:00
# Multiply TMP5 * HashKey using karatsuba
crypto: x86 - Remove include/asm/inst.h
Current minimum required version of binutils is 2.23,
which supports PSHUFB, PCLMULQDQ, PEXTRD, AESKEYGENASSIST,
AESIMC, AESENC, AESENCLAST, AESDEC, AESDECLAST and MOVQ
instruction mnemonics.
Substitute macros from include/asm/inst.h with a proper
instruction mnemonics in various assmbly files from
x86/crypto directory, and remove now unneeded file.
The patch was tested by calculating and comparing sha256sum
hashes of stripped object files before and after the patch,
to be sure that executable code didn't change.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: "David S. Miller" <davem@davemloft.net>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-09 18:08:57 +03:00
	pclmulqdq $0x11, \TMP5, \TMP1	# TMP1 = a1*b1
	movaps 0x60(%arg1), \TMP3
	aesenc \TMP3, \XMM1	# Round 6
	aesenc \TMP3, \XMM2
	aesenc \TMP3, \XMM3
	aesenc \TMP3, \XMM4
	pclmulqdq $0x00, \TMP5, \XMM7	# XMM7 = a0*b0
	movaps 0x70(%arg1), \TMP3
	aesenc \TMP3, \XMM1	# Round 7
	aesenc \TMP3, \XMM2
	aesenc \TMP3, \XMM3
	aesenc \TMP3, \XMM4
	movdqu HashKey_2_k(%arg2), \TMP5
	pclmulqdq $0x00, \TMP5, \TMP2	# TMP2 = (a1+a0)*(b1+b0)
	movaps 0x80(%arg1), \TMP3
	aesenc \TMP3, \XMM1	# Round 8
	aesenc \TMP3, \XMM2
	aesenc \TMP3, \XMM3
	aesenc \TMP3, \XMM4
	pxor \TMP1, \TMP4
	# accumulate the results in TMP4:XMM5, TMP6 holds the middle part
	pxor \XMM7, \XMM5
	pxor \TMP2, \TMP6
	# Multiply XMM8 * HashKey
	# XMM8 and TMP5 hold the values for the two operands
	movdqa \XMM8, \TMP1
	pshufd $78, \XMM8, \TMP2
	pxor \XMM8, \TMP2
	movdqu HashKey(%arg2), \TMP5
	pclmulqdq $0x11, \TMP5, \TMP1	# TMP1 = a1*b1
	movaps 0x90(%arg1), \TMP3
	aesenc \TMP3, \XMM1	# Round 9
	aesenc \TMP3, \XMM2
	aesenc \TMP3, \XMM3
	aesenc \TMP3, \XMM4
	pclmulqdq $0x00, \TMP5, \XMM8	# XMM8 = a0*b0
	lea 0xa0(%arg1),%r10
	mov keysize,%eax
	shr $2,%eax	# 128->4, 192->6, 256->8
	sub $4,%eax	# 128->0, 192->2, 256->4
	jz aes_loop_par_dec_done\@
aes_loop_par_dec\@:
	MOVADQ (%r10),\TMP3
	.irpc index, 1234
	aesenc \TMP3, %xmm\index
	.endr
	add $16,%r10
	sub $1,%eax
	jnz aes_loop_par_dec\@
aes_loop_par_dec_done\@:
	MOVADQ (%r10), \TMP3
	aesenclast \TMP3, \XMM1	# last round
	aesenclast \TMP3, \XMM2
	aesenclast \TMP3, \XMM3
	aesenclast \TMP3, \XMM4
	movdqu HashKey_k(%arg2), \TMP5
	pclmulqdq $0x00, \TMP5, \TMP2	# TMP2 = (a1+a0)*(b1+b0)
	movdqu (%arg4,%r11,1), \TMP3
	pxor \TMP3, \XMM1	# Ciphertext/Plaintext XOR EK
	movdqu \XMM1, (%arg3,%r11,1)	# Write to plaintext buffer
	movdqa \TMP3, \XMM1
	movdqu 16(%arg4,%r11,1), \TMP3
	pxor \TMP3, \XMM2	# Ciphertext/Plaintext XOR EK
	movdqu \XMM2, 16(%arg3,%r11,1)	# Write to plaintext buffer
	movdqa \TMP3, \XMM2
	movdqu 32(%arg4,%r11,1), \TMP3
	pxor \TMP3, \XMM3	# Ciphertext/Plaintext XOR EK
	movdqu \XMM3, 32(%arg3,%r11,1)	# Write to plaintext buffer
	movdqa \TMP3, \XMM3
	movdqu 48(%arg4,%r11,1), \TMP3
	pxor \TMP3, \XMM4	# Ciphertext/Plaintext XOR EK
	movdqu \XMM4, 48(%arg3,%r11,1)	# Write to plaintext buffer
	movdqa \TMP3, \XMM4
	pshufb %xmm15, \XMM1	# perform a 16 byte swap
	pshufb %xmm15, \XMM2	# perform a 16 byte swap
	pshufb %xmm15, \XMM3	# perform a 16 byte swap
	pshufb %xmm15, \XMM4	# perform a 16 byte swap
	pxor \TMP4, \TMP1
	pxor \XMM8, \XMM5
	pxor \TMP6, \TMP2
	pxor \TMP1, \TMP2
	pxor \XMM5, \TMP2
	movdqa \TMP2, \TMP3
	pslldq $8, \TMP3	# left shift TMP3 2 DWs
	psrldq $8, \TMP2	# right shift TMP2 2 DWs
	pxor \TMP3, \XMM5
	pxor \TMP2, \TMP1	# accumulate the results in TMP1:XMM5
	# first phase of reduction
	movdqa \XMM5, \TMP2
	movdqa \XMM5, \TMP3
	movdqa \XMM5, \TMP4
	# move XMM5 into TMP2, TMP3, TMP4 in order to perform shifts independently
	pslld $31, \TMP2	# packed left shift << 31
	pslld $30, \TMP3	# packed left shift << 30
	pslld $25, \TMP4	# packed left shift << 25
	pxor \TMP3, \TMP2	# xor the shifted versions
	pxor \TMP4, \TMP2
	movdqa \TMP2, \TMP5
	psrldq $4, \TMP5	# right shift TMP5 1 DW
	pslldq $12, \TMP2	# left shift TMP2 3 DWs
	pxor \TMP2, \XMM5
	# second phase of reduction
	movdqa \XMM5,\TMP2	# make 3 copies of XMM5 into TMP2, TMP3, TMP4
	movdqa \XMM5,\TMP3
	movdqa \XMM5,\TMP4
	psrld $1, \TMP2	# packed right shift >> 1
	psrld $2, \TMP3	# packed right shift >> 2
	psrld $7, \TMP4	# packed right shift >> 7
	pxor \TMP3,\TMP2	# xor the shifted versions
	pxor \TMP4,\TMP2
	pxor \TMP5, \TMP2
	pxor \TMP2, \XMM5
	pxor \TMP1, \XMM5	# result is in XMM5
	pxor \XMM5, \XMM1
.endm
/* GHASH the last 4 ciphertext blocks. */
.macro GHASH_LAST_4 TMP1 TMP2 TMP3 TMP4 TMP5 TMP6 \
TMP7 XMM1 XMM2 XMM3 XMM4 XMMDst
	# Multiply TMP6 * HashKey (using Karatsuba)
	movdqa \XMM1, \TMP6
	pshufd $78, \XMM1, \TMP2
	pxor \XMM1, \TMP2
	movdqu HashKey_4(%arg2), \TMP5
	pclmulqdq $0x11, \TMP5, \TMP6	# TMP6 = a1*b1
	pclmulqdq $0x00, \TMP5, \XMM1	# XMM1 = a0*b0
	movdqu HashKey_4_k(%arg2), \TMP4
	pclmulqdq $0x00, \TMP4, \TMP2	# TMP2 = (a1+a0)*(b1+b0)
	movdqa \XMM1, \XMMDst
	movdqa \TMP2, \XMM1	# result in TMP6, XMMDst, XMM1
	# Multiply TMP1 * HashKey (using Karatsuba)
	movdqa \XMM2, \TMP1
	pshufd $78, \XMM2, \TMP2
	pxor \XMM2, \TMP2
	movdqu HashKey_3(%arg2), \TMP5
	pclmulqdq $0x11, \TMP5, \TMP1	# TMP1 = a1*b1
	pclmulqdq $0x00, \TMP5, \XMM2	# XMM2 = a0*b0
	movdqu HashKey_3_k(%arg2), \TMP4
	pclmulqdq $0x00, \TMP4, \TMP2	# TMP2 = (a1+a0)*(b1+b0)
	pxor \TMP1, \TMP6
	pxor \XMM2, \XMMDst
	pxor \TMP2, \XMM1
	# results accumulated in TMP6, XMMDst, XMM1
	# Multiply TMP1 * HashKey (using Karatsuba)
	movdqa \XMM3, \TMP1
	pshufd $78, \XMM3, \TMP2
	pxor \XMM3, \TMP2
	movdqu HashKey_2(%arg2), \TMP5
	pclmulqdq $0x11, \TMP5, \TMP1	# TMP1 = a1*b1
	pclmulqdq $0x00, \TMP5, \XMM3	# XMM3 = a0*b0
	movdqu HashKey_2_k(%arg2), \TMP4
	pclmulqdq $0x00, \TMP4, \TMP2	# TMP2 = (a1+a0)*(b1+b0)
	pxor \TMP1, \TMP6
	pxor \XMM3, \XMMDst
	pxor \TMP2, \XMM1	# results accumulated in TMP6, XMMDst, XMM1
	# Multiply TMP1 * HashKey (using Karatsuba)
	movdqa \XMM4, \TMP1
	pshufd $78, \XMM4, \TMP2
	pxor \XMM4, \TMP2
	movdqu HashKey(%arg2), \TMP5
	pclmulqdq $0x11, \TMP5, \TMP1	# TMP1 = a1*b1
	pclmulqdq $0x00, \TMP5, \XMM4	# XMM4 = a0*b0
	movdqu HashKey_k(%arg2), \TMP4
	pclmulqdq $0x00, \TMP4, \TMP2	# TMP2 = (a1+a0)*(b1+b0)
	pxor \TMP1, \TMP6
	pxor \XMM4, \XMMDst
	pxor \XMM1, \TMP2
	pxor \TMP6, \TMP2
	pxor \XMMDst, \TMP2
	# middle section of the temp results combined as in Karatsuba algorithm
	movdqa \TMP2, \TMP4
	pslldq $8, \TMP4	# left shift TMP4 2 DWs
	psrldq $8, \TMP2	# right shift TMP2 2 DWs
	pxor \TMP4, \XMMDst
	pxor \TMP2, \TMP6
	# TMP6:XMMDst holds the result of the accumulated carry-less multiplications
	# first phase of the reduction
	movdqa \XMMDst, \TMP2
	movdqa \XMMDst, \TMP3
	movdqa \XMMDst, \TMP4
	# move XMMDst into TMP2, TMP3, TMP4 in order to perform 3 shifts independently
	pslld $31, \TMP2	# packed left shift << 31
	pslld $30, \TMP3	# packed left shift << 30
	pslld $25, \TMP4	# packed left shift << 25
	pxor \TMP3, \TMP2	# xor the shifted versions
	pxor \TMP4, \TMP2
	movdqa \TMP2, \TMP7
	psrldq $4, \TMP7	# right shift TMP7 1 DW
	pslldq $12, \TMP2	# left shift TMP2 3 DWs
	pxor \TMP2, \XMMDst
	# second phase of the reduction
	movdqa \XMMDst, \TMP2
	# make 3 copies of XMMDst for doing 3 shift operations
	movdqa \XMMDst, \TMP3
	movdqa \XMMDst, \TMP4
	psrld $1, \TMP2	# packed right shift >> 1
	psrld $2, \TMP3	# packed right shift >> 2
	psrld $7, \TMP4	# packed right shift >> 7
	pxor \TMP3, \TMP2	# xor the shifted versions
	pxor \TMP4, \TMP2
	pxor \TMP7, \TMP2
	pxor \TMP2, \XMMDst
	pxor \TMP6, \XMMDst	# reduced result is in XMMDst
.endm
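The Karatsuba pattern used in the macro above (a1*b1, a0*b0 and (a1+a0)*(b1+b0), with the middle term recovered by XOR and split across the two halves) can be sketched in portable C. This is only an illustration, not the kernel's code: `struct u128`, `clmul64`, `clmul128_karatsuba` and `clmul128_schoolbook` are hypothetical names, and `clmul64` is a slow bit-by-bit stand-in for a single PCLMULQDQ. The polynomial reduction that follows in the macro is not covered here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a 128-bit value as two 64-bit halves. */
struct u128 {
	uint64_t lo, hi;
};

/* Software stand-in for one PCLMULQDQ: carry-less 64x64 -> 128 multiply. */
static struct u128 clmul64(uint64_t a, uint64_t b)
{
	struct u128 r = { 0, 0 };

	for (int i = 0; i < 64; i++) {
		if ((b >> i) & 1) {
			r.lo ^= a << i;
			if (i > 0)
				r.hi ^= a >> (64 - i);
		}
	}
	return r;
}

/*
 * Karatsuba: 128x128 -> 256 bit carry-less product using three multiplies.
 * out[0] is the least significant 64-bit word.
 */
static void clmul128_karatsuba(struct u128 a, struct u128 b, uint64_t out[4])
{
	struct u128 hi  = clmul64(a.hi, b.hi);               /* a1*b1 */
	struct u128 lo  = clmul64(a.lo, b.lo);               /* a0*b0 */
	struct u128 mid = clmul64(a.hi ^ a.lo, b.hi ^ b.lo); /* (a1+a0)*(b1+b0) */

	/* In GF(2) addition is XOR, so this leaves a1*b0 + a0*b1. */
	mid.lo ^= hi.lo ^ lo.lo;
	mid.hi ^= hi.hi ^ lo.hi;

	out[0] = lo.lo;
	out[1] = lo.hi ^ mid.lo; /* low half of the middle term, shifted up */
	out[2] = hi.lo ^ mid.hi; /* high half of the middle term, shifted down */
	out[3] = hi.hi;
}

/* Schoolbook reference with four multiplies, used only for comparison. */
static void clmul128_schoolbook(struct u128 a, struct u128 b, uint64_t out[4])
{
	struct u128 hi = clmul64(a.hi, b.hi);
	struct u128 lo = clmul64(a.lo, b.lo);
	struct u128 m1 = clmul64(a.hi, b.lo);
	struct u128 m2 = clmul64(a.lo, b.hi);

	out[0] = lo.lo;
	out[1] = lo.hi ^ m1.lo ^ m2.lo;
	out[2] = hi.lo ^ m1.hi ^ m2.hi;
	out[3] = hi.hi;
}
```

In the assembly, `pshufd $78` followed by `pxor` appears to produce the a1+a0 operand, and the `HashKey_*_k` values are loaded as the matching (b1+b0) operand, which is why only three PCLMULQDQ instructions are needed per block.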
/* Encryption of a single block
* uses eax & r10
*/
.macro ENCRYPT_SINGLE_BLOCK XMM0 TMP1
	pxor (%arg1), \XMM0
	mov keysize,%eax
	shr $2,%eax	# 128->4, 192->6, 256->8
	add $5,%eax	# 128->9, 192->11, 256->13
	lea 16(%arg1), %r10	# get first expanded key address
_esb_loop_\@:
	MOVADQ (%r10),\TMP1
	aesenc \TMP1,\XMM0
	add $16,%r10
	sub $1,%eax
	jnz _esb_loop_\@
	MOVADQ (%r10),\TMP1
	aesenclast \TMP1,\XMM0
.endm
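The loop-count arithmetic in ENCRYPT_SINGLE_BLOCK (`shr $2` then `add $5`) can be checked with a small C sketch. It assumes that `keysize` holds the AES key length in bytes (16, 24 or 32), which is what the `128->4, 192->6, 256->8` comment suggests; the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Number of aesenc iterations run by the loop, before the final aesenclast.
 * key_len is assumed to be the key length in bytes: 16, 24 or 32. */
static unsigned int esb_aesenc_rounds(unsigned int key_len)
{
	return (key_len >> 2) + 5; /* shr $2 ; add $5 */
}

/* Round keys consumed in total: the initial pxor with the first key,
 * one key per aesenc, and one key for aesenclast. */
static unsigned int esb_round_keys(unsigned int key_len)
{
	return 1 + esb_aesenc_rounds(key_len) + 1;
}

/* Extra iterations of the parallel loop that runs after nine unrolled
 * rounds (shr $2 ; sub $4 in the four-block macro). */
static unsigned int par_extra_rounds(unsigned int key_len)
{
	return (key_len >> 2) - 4;
}
```

For a 16-byte key this gives 9 `aesenc` rounds and 11 round keys in total, and the parallel loop is skipped entirely because its count is zero.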
/*****************************************************************************
* void aesni_gcm_dec(void *aes_ctx,      // AES Key schedule. Starts on a 16 byte boundary.
*                    struct gcm_context_data *data,
*                                        // Context data
*                    u8 *out,            // Plaintext output. Decrypt in-place is allowed.
*                    const u8 *in,       // Ciphertext input
*                    u64 plaintext_len,  // Length of data in bytes for decryption.
*                    u8 *iv,             // Pre-counter block j0: 4 byte salt (from Security Association)
*                                        // concatenated with 8 byte Initialisation Vector (from IPSec ESP Payload)
*                                        // concatenated with 0x00000001. 16-byte aligned pointer.
*                    u8 *hash_subkey,    // H, the Hash sub key input. Data starts on a 16-byte boundary.
*                    const u8 *aad,      // Additional Authentication Data (AAD)
*                    u64 aad_len,        // Length of AAD in bytes. With RFC4106 this is going to be 8 or 12 bytes
*                    u8 *auth_tag,       // Authenticated Tag output. The driver will compare this to the
*                                        // given authentication tag and only return the plaintext if they match.
*                    u64 auth_tag_len);  // Authenticated Tag Length in bytes. Valid values are 16
*                                        // (most likely), 12 or 8.
*
* Assumptions:
*
* keys:
*       keys are pre-expanded and aligned to 16 bytes. we are using the first
*       set of 11 keys in the data structure void *aes_ctx
*
* iv:
*        0                   1                   2                   3
*        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                      Salt (From the SA)                       |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                     Initialization Vector                     |
*       |        (This is the sequence number from IPSec header)        |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                              0x1                              |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*
* AAD:
*       AAD padded to 128 bits with 0
*       for example, assume AAD is a u32 vector
*
*       if AAD is 8 bytes:
*       AAD[3] = {A0, A1};
*       padded AAD in xmm register = {A1 A0 0 0}
*
*        0                   1                   2                   3
*        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                           SPI (A1)                            |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                  32-bit Sequence Number (A0)                  |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                              0x0                              |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*
*               AAD Format with 32-bit Sequence Number
*
*       if AAD is 12 bytes:
*       AAD[3] = {A0, A1, A2};
*       padded AAD in xmm register = {A2 A1 A0 0}
*
*        0                   1                   2                   3
*        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                           SPI (A2)                            |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |            64-bit Extended Sequence Number {A1,A0}            |
*       |                                                               |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                              0x0                              |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*
*               AAD Format with 64-bit Extended Sequence Number
*
* poly = x^128 + x^127 + x^126 + x^121 + 1
*
*****************************************************************************/
SYM_FUNC_START(aesni_gcm_dec)
	FUNC_SAVE

	GCM_INIT %arg6, arg7, arg8, arg9
	GCM_ENC_DEC dec
	GCM_COMPLETE arg10, arg11
	FUNC_RESTORE
	ret
SYM_FUNC_END(aesni_gcm_dec)
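
/*
 * Illustration only, not part of the original source. The iv layout documented
 * for aesni_gcm_dec() above (4 byte salt, then 8 byte IV, then 0x00000001) can
 * be sketched in C as follows. build_j0() is a hypothetical helper name chosen
 * for this sketch; it assumes the caller holds the salt and the IV separately
 * and provides a suitably aligned 16-byte destination buffer.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helper: assemble the 16-byte pre-counter block j0 as
// salt (4 bytes) || IV (8 bytes) || 0x00000001 (most significant byte first).
static void build_j0(uint8_t j0[16], const uint8_t salt[4],
		     const uint8_t iv[8])
{
	memcpy(j0, salt, 4);	// bytes 0..3: salt from the Security Association
	memcpy(j0 + 4, iv, 8);	// bytes 4..11: IV from the ESP payload
	j0[12] = 0x00;		// bytes 12..15: the constant 0x00000001
	j0[13] = 0x00;
	j0[14] = 0x00;
	j0[15] = 0x01;
}
```

 * The sketch only shows the byte order of the block; the alignment of the
 * buffer handed to the assembly routine remains the caller's responsibility.
 */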
/*****************************************************************************
* void aesni_gcm_enc(void *aes_ctx,      // AES Key schedule. Starts on a 16 byte boundary.
*                    struct gcm_context_data *data,
*                                        // Context data
*                    u8 *out,            // Ciphertext output. Encrypt in-place is allowed.
*                    const u8 *in,       // Plaintext input
*                    u64 plaintext_len,  // Length of data in bytes for encryption.
*                    u8 *iv,             // Pre-counter block j0: 4 byte salt (from Security Association)
*                                        // concatenated with 8 byte Initialisation Vector (from IPSec ESP Payload)
*                                        // concatenated with 0x00000001. 16-byte aligned pointer.
*                    u8 *hash_subkey,    // H, the Hash sub key input. Data starts on a 16-byte boundary.
*                    const u8 *aad,      // Additional Authentication Data (AAD)
*                    u64 aad_len,        // Length of AAD in bytes. With RFC4106 this is going to be 8 or 12 bytes
*                    u8 *auth_tag,       // Authenticated Tag output.
*                    u64 auth_tag_len);  // Authenticated Tag Length in bytes. Valid values are 16 (most likely),
*                                        // 12 or 8.
*
* Assumptions:
*
* keys:
*       keys are pre-expanded and aligned to 16 bytes. we are using the
*       first set of 11 keys in the data structure void *aes_ctx
*
* iv:
*        0                   1                   2                   3
*        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                      Salt (From the SA)                       |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                     Initialization Vector                     |
*       |        (This is the sequence number from IPSec header)        |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                              0x1                              |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*
* AAD:
*       AAD padded to 128 bits with 0
*       for example, assume AAD is a u32 vector
*
*       if AAD is 8 bytes:
*       AAD[3] = {A0, A1};
*       padded AAD in xmm register = {A1 A0 0 0}
*
*        0                   1                   2                   3
*        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                           SPI (A1)                            |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                  32-bit Sequence Number (A0)                  |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                              0x0                              |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*
*               AAD Format with 32-bit Sequence Number
*
*       if AAD is 12 bytes:
*       AAD[3] = {A0, A1, A2};
*       padded AAD in xmm register = {A2 A1 A0 0}
*
*        0                   1                   2                   3
*        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                           SPI (A2)                            |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |            64-bit Extended Sequence Number {A1,A0}            |
*       |                                                               |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*       |                              0x0                              |
*       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*
*               AAD Format with 64-bit Extended Sequence Number
*
* poly = x^128 + x^127 + x^126 + x^121 + 1
*****************************************************************************/
SYM_FUNC_START(aesni_gcm_enc)
	FUNC_SAVE

	GCM_INIT %arg6, arg7, arg8, arg9
	GCM_ENC_DEC enc
	GCM_COMPLETE arg10, arg11
	FUNC_RESTORE
	ret
SYM_FUNC_END(aesni_gcm_enc)
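
/*
 * Illustration only, not part of the original source. The "AAD padded to 128
 * bits with 0" rule documented for aesni_gcm_enc() above can be sketched in C
 * as follows. pad_aad() is a hypothetical helper name chosen for this sketch.
 * It models only the zero padding of the byte stream; the byte reordering that
 * produces the xmm register view {A1 A0 0 0} is not modelled here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: copy an 8- or 12-byte RFC4106 AAD into a 16-byte
// block and zero-fill the remainder. Returns 0 on success, or -1 when
// aad_len is neither of the two lengths the comment describes.
static int pad_aad(uint8_t block[16], const uint8_t *aad, size_t aad_len)
{
	if (aad_len != 8 && aad_len != 12)
		return -1;
	memset(block, 0, 16);		// trailing bytes stay zero
	memcpy(block, aad, aad_len);	// SPI and sequence number bytes
	return 0;
}
```

 * Rejecting other lengths is a choice made for this sketch; it simply mirrors
 * the statement that with RFC4106 the AAD is 8 or 12 bytes long.
 */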

/*****************************************************************************
* void aesni_gcm_init(void *aes_ctx,     // AES Key schedule. Starts on a 16 byte boundary.
*                     struct gcm_context_data *data,
*                                        // context data
*                     u8 *iv,            // Pre-counter block j0: 4 byte salt (from Security Association)
*                                        // concatenated with 8 byte Initialisation Vector (from IPSec ESP Payload)
*                                        // concatenated with 0x00000001. 16-byte aligned pointer.
*                     u8 *hash_subkey,   // H, the Hash sub key input. Data starts on a 16-byte boundary.
*                     const u8 *aad,     // Additional Authentication Data (AAD)
*                     u64 aad_len)       // Length of AAD in bytes.
*/
SYM_FUNC_START(aesni_gcm_init)
	FUNC_SAVE
	GCM_INIT %arg3, %arg4,%arg5, %arg6
	FUNC_RESTORE
	ret
SYM_FUNC_END(aesni_gcm_init)

/*****************************************************************************
* void aesni_gcm_enc_update(void *aes_ctx, // AES Key schedule. Starts on a 16 byte boundary.
*                    struct gcm_context_data *data,
*                                        // context data
*                    u8 *out,            // Ciphertext output. Encrypt in-place is allowed.
*                    const u8 *in,       // Plaintext input
*                    u64 plaintext_len); // Length of data in bytes for encryption.
*/
SYM_FUNC_START(aesni_gcm_enc_update)
	FUNC_SAVE
	GCM_ENC_DEC enc
	FUNC_RESTORE
	ret
SYM_FUNC_END(aesni_gcm_enc_update)

/*****************************************************************************
* void aesni_gcm_dec_update(void *aes_ctx, // AES Key schedule. Starts on a 16 byte boundary.
*                    struct gcm_context_data *data,
*                                        // context data
*                    u8 *out,            // Plaintext output. Decrypt in-place is allowed.
*                    const u8 *in,       // Ciphertext input
*                    u64 plaintext_len); // Length of data in bytes for decryption.
*/
SYM_FUNC_START(aesni_gcm_dec_update)
	FUNC_SAVE
	GCM_ENC_DEC dec
	FUNC_RESTORE
	ret
SYM_FUNC_END(aesni_gcm_dec_update)

/*****************************************************************************
* void aesni_gcm_finalize(void *aes_ctx, // AES Key schedule. Starts on a 16 byte boundary.
*                    struct gcm_context_data *data,
*                                        // context data
*                    u8 *auth_tag,       // Authenticated Tag output.
*                    u64 auth_tag_len);  // Authenticated Tag Length in bytes. Valid values are 16 (most likely),
*                                        // 12 or 8.
*/
SYM_FUNC_START(aesni_gcm_finalize)
	FUNC_SAVE
	GCM_COMPLETE %arg3 %arg4
	FUNC_RESTORE
	ret
SYM_FUNC_END(aesni_gcm_finalize)

#endif

SYM_FUNC_START_LOCAL_ALIAS(_key_expansion_128)
SYM_FUNC_START_LOCAL(_key_expansion_256a)
	pshufd $0b11111111, %xmm1, %xmm1
	shufps $0b00010000, %xmm0, %xmm4
	pxor %xmm4, %xmm0
	shufps $0b10001100, %xmm0, %xmm4
	pxor %xmm4, %xmm0
	pxor %xmm1, %xmm0
	movaps %xmm0, (TKEYP)
	add $0x10, TKEYP
	ret
SYM_FUNC_END(_key_expansion_256a)
SYM_FUNC_END_ALIAS(_key_expansion_128)

SYM_FUNC_START_LOCAL(_key_expansion_192a)
	pshufd $0b01010101, %xmm1, %xmm1
	shufps $0b00010000, %xmm0, %xmm4
	pxor %xmm4, %xmm0
	shufps $0b10001100, %xmm0, %xmm4
	pxor %xmm4, %xmm0
	pxor %xmm1, %xmm0
	movaps %xmm2, %xmm5
	movaps %xmm2, %xmm6
	pslldq $4, %xmm5
	pshufd $0b11111111, %xmm0, %xmm3
	pxor %xmm3, %xmm2
	pxor %xmm5, %xmm2
	movaps %xmm0, %xmm1
	shufps $0b01000100, %xmm0, %xmm6
	movaps %xmm6, (TKEYP)
	shufps $0b01001110, %xmm2, %xmm1
	movaps %xmm1, 0x10(TKEYP)
	add $0x20, TKEYP
	ret
SYM_FUNC_END(_key_expansion_192a)

SYM_FUNC_START_LOCAL(_key_expansion_192b)
	pshufd $0b01010101, %xmm1, %xmm1
	shufps $0b00010000, %xmm0, %xmm4
	pxor %xmm4, %xmm0
	shufps $0b10001100, %xmm0, %xmm4
	pxor %xmm4, %xmm0
	pxor %xmm1, %xmm0
	movaps %xmm2, %xmm5
	pslldq $4, %xmm5
	pshufd $0b11111111, %xmm0, %xmm3
	pxor %xmm3, %xmm2
	pxor %xmm5, %xmm2
	movaps %xmm0, (TKEYP)
	add $0x10, TKEYP
	ret
SYM_FUNC_END(_key_expansion_192b)

SYM_FUNC_START_LOCAL(_key_expansion_256b)
	pshufd $0b10101010, %xmm1, %xmm1
	shufps $0b00010000, %xmm2, %xmm4
	pxor %xmm4, %xmm2
	shufps $0b10001100, %xmm2, %xmm4
	pxor %xmm4, %xmm2
	pxor %xmm1, %xmm2
	movaps %xmm2, (TKEYP)
	add $0x10, TKEYP
	ret
SYM_FUNC_END(_key_expansion_256b)
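
/*
 * Illustration only, not part of the original source. aesni_set_key() below
 * stores key_len at byte offset 480 of ctx (movl %edx, 480(KEYP)), and the GCM
 * comments earlier in this file speak of 11 round keys for a 128-bit key. The
 * C sketch below shows where those two numbers can come from. The struct is a
 * hypothetical mirror written for this sketch, assuming two 60-word key
 * arrays followed by the key length; the helper names are likewise made up.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical mirror of the context layout assumed by the assembly:
// two arrays of 60 32-bit words (240 bytes each), then the key length.
struct aes_ctx_sketch {
	uint32_t key_enc[60];	// room for up to 15 round keys of 16 bytes
	uint32_t key_dec[60];
	uint32_t key_length;	// key length in bytes: 16, 24 or 32
};

// Byte offset of the key length field within the sketch layout.
static size_t key_length_offset(void)
{
	return offsetof(struct aes_ctx_sketch, key_length);
}

// AES round count for a key of key_len bytes: 10, 12 or 14.
static unsigned int aes_rounds(unsigned int key_len)
{
	return key_len / 4 + 6;
}

// Number of 16-byte round keys in the expanded schedule.
static unsigned int aes_round_keys(unsigned int key_len)
{
	return aes_rounds(key_len) + 1;
}
```

 * Under these assumptions key_length_offset() evaluates to 480 and
 * aes_round_keys(16) to 11, which matches the constants used in this file.
 */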

/*
 * int aesni_set_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
 *                   unsigned int key_len)
 */
SYM_FUNC_START(aesni_set_key)
	FRAME_BEGIN
#ifndef __x86_64__
	pushl KEYP
	movl (FRAME_OFFSET+8)(%esp), KEYP	# ctx
	movl (FRAME_OFFSET+12)(%esp), UKEYP	# in_key
	movl (FRAME_OFFSET+16)(%esp), %edx	# key_len
#endif
	movups (UKEYP), %xmm0		# user key (first 16 bytes)
	movaps %xmm0, (KEYP)
	lea 0x10(KEYP), TKEYP		# key addr
	movl %edx, 480(KEYP)
	pxor %xmm4, %xmm4		# xmm4 is assumed 0 in _key_expansion_x
	cmp $24, %dl
	jb .Lenc_key128
	je .Lenc_key192
	movups 0x10(UKEYP), %xmm2	# other user key
	movaps %xmm2, (TKEYP)
	add $0x10, TKEYP
	aeskeygenassist $0x1, %xmm2, %xmm1	# round 1
	call _key_expansion_256a
	aeskeygenassist $0x1, %xmm0, %xmm1
	call _key_expansion_256b
	aeskeygenassist $0x2, %xmm2, %xmm1	# round 2
	call _key_expansion_256a
	aeskeygenassist $0x2, %xmm0, %xmm1
	call _key_expansion_256b
aeskeygenassist $0x4, %xmm2, %xmm1 # round 3
2009-01-18 08:28:34 +03:00
call _key_expansion_256a
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x4, %xmm0, %xmm1
2009-01-18 08:28:34 +03:00
call _key_expansion_256b
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x8, %xmm2, %xmm1 # round 4
2009-01-18 08:28:34 +03:00
call _key_expansion_256a
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x8, %xmm0, %xmm1
2009-01-18 08:28:34 +03:00
call _key_expansion_256b
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x10, %xmm2, %xmm1 # round 5
2009-01-18 08:28:34 +03:00
call _key_expansion_256a
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x10, %xmm0, %xmm1
2009-01-18 08:28:34 +03:00
call _key_expansion_256b
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x20, %xmm2, %xmm1 # round 6
2009-01-18 08:28:34 +03:00
call _key_expansion_256a
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x20, %xmm0, %xmm1
2009-01-18 08:28:34 +03:00
call _key_expansion_256b
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x40, %xmm2, %xmm1 # round 7
2009-01-18 08:28:34 +03:00
call _key_expansion_256a
jmp .Ldec_key
.Lenc_key192:
2010-11-27 11:34:46 +03:00
movq 0x10(UKEYP), %xmm2 # other user key
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x1, %xmm2, %xmm1 # round 1
2009-01-18 08:28:34 +03:00
call _key_expansion_192a
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x2, %xmm2, %xmm1 # round 2
2009-01-18 08:28:34 +03:00
call _key_expansion_192b
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x4, %xmm2, %xmm1 # round 3
2009-01-18 08:28:34 +03:00
call _key_expansion_192a
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x8, %xmm2, %xmm1 # round 4
2009-01-18 08:28:34 +03:00
call _key_expansion_192b
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x10, %xmm2, %xmm1 # round 5
2009-01-18 08:28:34 +03:00
call _key_expansion_192a
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x20, %xmm2, %xmm1 # round 6
2009-01-18 08:28:34 +03:00
call _key_expansion_192b
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x40, %xmm2, %xmm1 # round 7
2009-01-18 08:28:34 +03:00
call _key_expansion_192a
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x80, %xmm2, %xmm1 # round 8
2009-01-18 08:28:34 +03:00
call _key_expansion_192b
jmp .Ldec_key
.Lenc_key128:
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x1, %xmm0, %xmm1 # round 1
2009-01-18 08:28:34 +03:00
call _key_expansion_128
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x2, %xmm0, %xmm1 # round 2
2009-01-18 08:28:34 +03:00
call _key_expansion_128
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x4, %xmm0, %xmm1 # round 3
2009-01-18 08:28:34 +03:00
call _key_expansion_128
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x8, %xmm0, %xmm1 # round 4
2009-01-18 08:28:34 +03:00
call _key_expansion_128
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x10, %xmm0, %xmm1 # round 5
2009-01-18 08:28:34 +03:00
call _key_expansion_128
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x20, %xmm0, %xmm1 # round 6
2009-01-18 08:28:34 +03:00
call _key_expansion_128
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x40, %xmm0, %xmm1 # round 7
2009-01-18 08:28:34 +03:00
call _key_expansion_128
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x80, %xmm0, %xmm1 # round 8
2009-01-18 08:28:34 +03:00
call _key_expansion_128
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x1b, %xmm0, %xmm1 # round 9
2009-01-18 08:28:34 +03:00
call _key_expansion_128
2020-07-09 18:08:57 +03:00
aeskeygenassist $0x36, %xmm0, %xmm1 # round 10
2009-01-18 08:28:34 +03:00
call _key_expansion_128
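The immediates passed to aeskeygenassist in the 128-bit path above (0x1, 0x2, 0x4, ... 0x80, 0x1b, 0x36) match the AES round constants, which are successive doublings in GF(2^8). The following Python sketch is not part of the kernel source; the helper names `xtime` and `round_constants` are hypothetical and exist only to illustrate where the sequence comes from.

```python
def xtime(b):
    """Multiply a byte by x (i.e. by 2) in GF(2^8), reducing by the
    AES polynomial x^8 + x^4 + x^3 + x + 1 (0x11b)."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11b
    return b


def round_constants(count):
    """Return the first `count` AES round constants, starting at 0x01."""
    rc = 0x01
    constants = []
    for _ in range(count):
        constants.append(rc)
        rc = xtime(rc)
    return constants


if __name__ == "__main__":
    # Print the ten constants used by the 128-bit key expansion.
    print([hex(rc) for rc in round_constants(10)])
```

The 128-bit path uses all ten values; the 192-bit and 256-bit paths shown earlier stop at 0x80 and 0x40 respectively, which are the eighth and seventh entries of the same sequence.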
.Ldec_key:
2010-11-27 11:34:46 +03:00
sub $0x10, TKEYP
movaps (KEYP), %xmm0
movaps (TKEYP), %xmm1
movaps %xmm0, 240(TKEYP)
movaps %xmm1, 240(KEYP)
add $0x10, KEYP
lea 240-16(TKEYP), UKEYP
2009-01-18 08:28:34 +03:00
.align 4
.Ldec_key_loop:
2010-11-27 11:34:46 +03:00
movaps (KEYP), %xmm0
2020-07-09 18:08:57 +03:00
aesimc %xmm0, %xmm1
2010-11-27 11:34:46 +03:00
movaps %xmm1, (UKEYP)
add $0x10, KEYP
sub $0x10, UKEYP
cmp TKEYP, KEYP
2009-01-18 08:28:34 +03:00
jb .Ldec_key_loop
2010-11-27 11:34:46 +03:00
xor AREG, AREG
#ifndef __x86_64__
popl KEYP
#endif
2016-01-22 01:49:19 +03:00
FRAME_END
2009-01-18 08:28:34 +03:00
ret
2019-10-11 14:51:04 +03:00
SYM_FUNC_END(aesni_set_key)
2009-01-18 08:28:34 +03:00
/*
2019-11-27 09:08:02 +03:00
* void aesni_enc(const void *ctx, u8 *dst, const u8 *src)
2009-01-18 08:28:34 +03:00
*/
2019-10-11 14:51:04 +03:00
SYM_FUNC_START(aesni_enc)
2016-01-22 01:49:19 +03:00
FRAME_BEGIN
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
pushl KEYP
pushl KLEN
2016-01-22 01:49:19 +03:00
movl (FRAME_OFFSET+12)(%esp), KEYP # ctx
movl (FRAME_OFFSET+16)(%esp), OUTP # dst
movl (FRAME_OFFSET+20)(%esp), INP # src
2010-11-27 11:34:46 +03:00
#endif
2009-01-18 08:28:34 +03:00
movl 480(KEYP), KLEN # key length
movups (INP), STATE # input
call _aesni_enc1
movups STATE, (OUTP) # output
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
popl KLEN
popl KEYP
#endif
2016-01-22 01:49:19 +03:00
FRAME_END
2009-01-18 08:28:34 +03:00
ret
2019-10-11 14:51:04 +03:00
SYM_FUNC_END(aesni_enc)
2009-01-18 08:28:34 +03:00
/*
* _aesni_enc1: internal ABI
* input:
* KEYP: key struct pointer
* KLEN: key length in bytes
* STATE: initial state (input)
* output:
* STATE: final state (output)
* changed:
* KEY
* TKEYP (T1)
*/
2019-10-11 14:50:46 +03:00
SYM_FUNC_START_LOCAL(_aesni_enc1)
2009-01-18 08:28:34 +03:00
movaps (KEYP), KEY # key
mov KEYP, TKEYP
pxor KEY, STATE # round 0
add $0x30, TKEYP
cmp $24, KLEN
jb .Lenc128
lea 0x20(TKEYP), TKEYP
je .Lenc192
add $0x20, TKEYP
movaps -0x60(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps -0x50(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
.align 4
.Lenc192:
movaps -0x40(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps -0x30(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
.align 4
.Lenc128:
movaps -0x20(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps -0x10(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps (TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x10(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x20(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x30(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x40(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x50(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x60(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x70(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenclast KEY, STATE
2009-01-18 08:28:34 +03:00
ret
2019-10-11 14:50:46 +03:00
SYM_FUNC_END(_aesni_enc1)
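The displacement arithmetic in _aesni_enc1 can be checked with a small C model. This is only a sketch of the address computation, not kernel code: `round_key_offsets` is a hypothetical helper, and it assumes KLEN holds the key length in bytes (16, 24 or 32), as the `cmp $24, KLEN` test suggests. It lists the byte offset from KEYP of each round key consumed after the round-0 `pxor`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the kernel source): models only the
 * TKEYP bias and the displacements used by _aesni_enc1. Returns the number
 * of rounds after round 0 and stores each round key's offset from KEYP.
 */
static size_t round_key_offsets(unsigned klen, size_t out[14])
{
    /* Assumption: klen is the AES key length in bytes. */
    assert(klen == 16 || klen == 24 || klen == 32);

    ptrdiff_t tkeyp = 0x30;   /* add $0x30, TKEYP */
    ptrdiff_t first = -0x20;  /* first displacement at .Lenc128 */

    if (klen >= 24) {         /* lea 0x20(TKEYP), TKEYP */
        tkeyp += 0x20;
        first = -0x40;        /* first displacement at .Lenc192 */
    }
    if (klen > 24) {          /* add $0x20, TKEYP (256-bit fall-through) */
        tkeyp += 0x20;
        first = -0x60;
    }

    size_t n = 0;
    /* The last displacement, 0x70, is the key used by aesenclast. */
    for (ptrdiff_t d = first; d <= 0x70; d += 0x10)
        out[n++] = (size_t)(tkeyp + d);
    return n;
}
```

Under these assumptions the model yields 10, 12 and 14 rounds for 16-, 24- and 32-byte keys, with the first round key always at offset 0x10. Biasing TKEYP by key length is what lets the three key sizes share the tail starting at .Lenc128 with fixed displacements.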
2009-01-18 08:28:34 +03:00
/*
* _aesni_enc4: internal ABI
* input:
* KEYP: key struct pointer
* KLEN: key length in bytes
* STATE1: initial state (input)
* STATE2
* STATE3
* STATE4
* output:
* STATE1: final state (output)
* STATE2
* STATE3
* STATE4
* changed:
* KEY
* TKEYP (T1)
*/
2019-10-11 14:50:46 +03:00
SYM_FUNC_START_LOCAL(_aesni_enc4)
2009-01-18 08:28:34 +03:00
movaps (KEYP), KEY # key
mov KEYP, TKEYP
pxor KEY, STATE1 # round 0
pxor KEY, STATE2
pxor KEY, STATE3
pxor KEY, STATE4
add $0x30, TKEYP
cmp $24, KLEN
jb .L4enc128
lea 0x20(TKEYP), TKEYP
je .L4enc192
add $0x20, TKEYP
movaps -0x60(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps -0x50(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
#.align 4
.L4enc192:
movaps -0x40(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps -0x30(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
#.align 4
.L4enc128:
movaps -0x20(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps -0x10(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps (TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x10(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x20(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x30(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x40(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x50(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x60(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenc KEY, STATE1
aesenc KEY, STATE2
aesenc KEY, STATE3
aesenc KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x70(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesenclast KEY, STATE1 # last round
aesenclast KEY, STATE2
aesenclast KEY, STATE3
aesenclast KEY, STATE4
2009-01-18 08:28:34 +03:00
ret
2019-10-11 14:50:46 +03:00
SYM_FUNC_END(_aesni_enc4)
2009-01-18 08:28:34 +03:00
/*
2019-11-27 09:08:02 +03:00
 * void aesni_dec (const void *ctx, u8 *dst, const u8 *src)
2009-01-18 08:28:34 +03:00
 */
2019-10-11 14:51:04 +03:00
SYM_FUNC_START(aesni_dec)
2016-01-22 01:49:19 +03:00
FRAME_BEGIN
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
pushl KEYP
pushl KLEN
2016-01-22 01:49:19 +03:00
movl (FRAME_OFFSET+12)(%esp), KEYP # ctx
movl (FRAME_OFFSET+16)(%esp), OUTP # dst
movl (FRAME_OFFSET+20)(%esp), INP # src
2010-11-27 11:34:46 +03:00
#endif
2009-01-18 08:28:34 +03:00
mov 480(KEYP), KLEN # key length
add $240, KEYP
movups (INP), STATE # input
call _aesni_dec1
movups STATE, (OUTP) # output
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
popl KLEN
popl KEYP
#endif
2016-01-22 01:49:19 +03:00
FRAME_END
2009-01-18 08:28:34 +03:00
ret
2019-10-11 14:51:04 +03:00
SYM_FUNC_END(aesni_dec)
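The two constants used by aesni_dec, 480 and 240, read as byte offsets into the context structure: the key length is loaded from offset 480, and KEYP is advanced by 240 before the round keys are used. The following is a minimal sketch of a layout that would produce those offsets, assuming the context matches the kernel's `struct crypto_aes_ctx` (an encryption schedule of 60 32-bit words, a decryption schedule of the same size, then a 32-bit key length). The Python class and field names are illustrative stand-ins, not part of this file.

```python
import ctypes

# Assumption: 15 round keys of 4 words each, i.e. room for AES-256.
AES_MAX_KEYLENGTH_U32 = 60


class CryptoAesCtx(ctypes.Structure):
    """Hypothetical mirror of the context layout assumed above."""
    _fields_ = [
        ("key_enc", ctypes.c_uint32 * AES_MAX_KEYLENGTH_U32),
        ("key_dec", ctypes.c_uint32 * AES_MAX_KEYLENGTH_U32),
        ("key_length", ctypes.c_uint32),
    ]


# Offsets that the assembly hard-codes as immediates.
KEY_DEC_OFFSET = CryptoAesCtx.key_dec.offset        # matches "add $240, KEYP"
KEY_LENGTH_OFFSET = CryptoAesCtx.key_length.offset  # matches "mov 480(KEYP), KLEN"

if __name__ == "__main__":
    print(KEY_DEC_OFFSET, KEY_LENGTH_OFFSET)
```

Under this assumption, the key length has to be read before KEYP is advanced, which is the order the function uses.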
2009-01-18 08:28:34 +03:00
/*
 * _aesni_dec1: internal ABI
 * input:
 *   KEYP: key struct pointer
 *   KLEN: key length
 *   STATE: initial state (input)
 * output:
 *   STATE: final state (output)
 * changed:
 *   KEY
 *   TKEYP (T1)
 */
2019-10-11 14:50:46 +03:00
SYM_FUNC_START_LOCAL(_aesni_dec1)
2009-01-18 08:28:34 +03:00
movaps (KEYP), KEY # key
mov KEYP, TKEYP
pxor KEY, STATE # round 0
add $0x30, TKEYP
cmp $24, KLEN
jb .Ldec128
lea 0x20(TKEYP), TKEYP
je .Ldec192
add $0x20, TKEYP
movaps -0x60(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps -0x50(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
.align 4
.Ldec192:
movaps -0x40(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps -0x30(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
.align 4
.Ldec128:
movaps -0x20(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps -0x10(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps (TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x10(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x20(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x30(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x40(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x50(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x60(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE
2009-01-18 08:28:34 +03:00
movaps 0x70(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdeclast KEY, STATE
2009-01-18 08:28:34 +03:00
ret
2019-10-11 14:50:46 +03:00
SYM_FUNC_END(_aesni_dec1)
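The key-length dispatch in _aesni_dec1 relies on fall-through: TKEYP is biased by a different amount for each key size, and execution enters the shared chain of round-key loads at a different label. Note that `lea` does not modify the flags, so the `je` after it still tests the result of the earlier `cmp $24, KLEN`. The sketch below models only the address arithmetic, as read from the listing above; the function name and the use of plain integers for KEYP-relative offsets are illustrative assumptions.

```python
def round_key_offsets(klen: int) -> list[int]:
    """Return the KEYP-relative byte offsets of the round keys loaded
    after round 0, in order. The last entry is the key consumed by
    aesdeclast. klen is the key length in bytes (16, 24 or 32)."""
    tkeyp = 0x30                 # add $0x30, TKEYP
    if klen < 24:                # jb .Ldec128
        first_disp = -0x20
    else:
        tkeyp += 0x20            # lea 0x20(TKEYP), TKEYP (flags untouched)
        if klen == 24:           # je .Ldec192
            first_disp = -0x40
        else:
            tkeyp += 0x20        # add $0x20, TKEYP
            first_disp = -0x60
    # Displacements step by one 16-byte round key up to 0x70(TKEYP).
    return [tkeyp + disp for disp in range(first_disp, 0x70 + 1, 0x10)]


if __name__ == "__main__":
    for klen in (16, 24, 32):
        offsets = round_key_offsets(klen)
        print(klen, len(offsets), [hex(o) for o in offsets])
```

In this model every key size starts at offset 0x10, directly after the round-0 key at 0x00, and only the number of loads differs: 10, 12 and 14 for 16-, 24- and 32-byte keys.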
2009-01-18 08:28:34 +03:00
/*
 * _aesni_dec4: internal ABI
 * input:
 *   KEYP: key struct pointer
 *   KLEN: key length
 *   STATE1: initial state (input)
 *   STATE2
 *   STATE3
 *   STATE4
 * output:
 *   STATE1: final state (output)
 *   STATE2
 *   STATE3
 *   STATE4
 * changed:
 *   KEY
 *   TKEYP (T1)
 */
2019-10-11 14:50:46 +03:00
SYM_FUNC_START_LOCAL(_aesni_dec4)
2009-01-18 08:28:34 +03:00
movaps (KEYP), KEY # key
mov KEYP, TKEYP
pxor KEY, STATE1 # round 0
pxor KEY, STATE2
pxor KEY, STATE3
pxor KEY, STATE4
add $0x30, TKEYP
cmp $24, KLEN
jb .L4dec128
lea 0x20(TKEYP), TKEYP
je .L4dec192
add $0x20, TKEYP
movaps -0x60(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps -0x50(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
.align 4
.L4dec192:
movaps -0x40(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps -0x30(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
.align 4
.L4dec128:
movaps -0x20(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps -0x10(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps (TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x10(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x20(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x30(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x40(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x50(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x60(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdec KEY, STATE1
aesdec KEY, STATE2
aesdec KEY, STATE3
aesdec KEY, STATE4
2009-01-18 08:28:34 +03:00
movaps 0x70(TKEYP), KEY
2020-07-09 18:08:57 +03:00
aesdeclast KEY, STATE1	# last round
aesdeclast KEY, STATE2
aesdeclast KEY, STATE3
aesdeclast KEY, STATE4
2009-01-18 08:28:34 +03:00
ret
2019-10-11 14:50:46 +03:00
SYM_FUNC_END(_aesni_dec4)
2009-01-18 08:28:34 +03:00
/*
 * void aesni_ecb_enc(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
 *		      size_t len)
 */
2019-10-11 14:51:04 +03:00
SYM_FUNC_START(aesni_ecb_enc)
2016-01-22 01:49:19 +03:00
FRAME_BEGIN
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
pushl LEN
pushl KEYP
pushl KLEN
2016-01-22 01:49:19 +03:00
movl (FRAME_OFFSET+16)(%esp), KEYP	# ctx
movl (FRAME_OFFSET+20)(%esp), OUTP	# dst
movl (FRAME_OFFSET+24)(%esp), INP	# src
movl (FRAME_OFFSET+28)(%esp), LEN	# len
2010-11-27 11:34:46 +03:00
#endif
2009-01-18 08:28:34 +03:00
test LEN, LEN	# check length
jz .Lecb_enc_ret
mov 480(KEYP), KLEN
cmp $16, LEN
jb .Lecb_enc_ret
cmp $64, LEN
jb .Lecb_enc_loop1
.align 4
.Lecb_enc_loop4:
movups (INP), STATE1
movups 0x10(INP), STATE2
movups 0x20(INP), STATE3
movups 0x30(INP), STATE4
call _aesni_enc4
movups STATE1, (OUTP)
movups STATE2, 0x10(OUTP)
movups STATE3, 0x20(OUTP)
movups STATE4, 0x30(OUTP)
sub $64, LEN
add $64, INP
add $64, OUTP
cmp $64, LEN
jge .Lecb_enc_loop4
cmp $16, LEN
jb .Lecb_enc_ret
.align 4
.Lecb_enc_loop1:
movups (INP), STATE1
call _aesni_enc1
movups STATE1, (OUTP)
sub $16, LEN
add $16, INP
add $16, OUTP
cmp $16, LEN
jge .Lecb_enc_loop1
.Lecb_enc_ret:
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
popl KLEN
popl KEYP
popl LEN
#endif
2016-01-22 01:49:19 +03:00
FRAME_END
2009-01-18 08:28:34 +03:00
ret
2019-10-11 14:51:04 +03:00
SYM_FUNC_END(aesni_ecb_enc)
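The length handling in aesni_ecb_enc above (four blocks per iteration while at least 64 bytes remain, then single blocks, with any trailing partial block left untouched) can be modelled in a few lines. The sketch below is an illustration only: `toy_encrypt_block` is a hypothetical stand-in that is not AES, and the `trace` argument exists purely to make the loop structure observable.

```python
BLOCK = 16  # AES block size in bytes


def toy_encrypt_block(block, key):
    """Hypothetical stand-in for one block encryption (NOT AES)."""
    return bytes(b ^ k for b, k in zip(block, key))


def ecb_encrypt(src, key, trace=None):
    """Model the 4-block loop followed by the 1-block loop."""
    out = bytearray()
    pos, remaining = 0, len(src)
    # Four blocks (64 bytes) per iteration while enough input remains.
    while remaining >= 4 * BLOCK:
        for i in range(4):
            chunk = src[pos + i * BLOCK: pos + (i + 1) * BLOCK]
            out += toy_encrypt_block(chunk, key)
        if trace is not None:
            trace.append(4)
        pos += 4 * BLOCK
        remaining -= 4 * BLOCK
    # Then one block (16 bytes) at a time.
    while remaining >= BLOCK:
        out += toy_encrypt_block(src[pos:pos + BLOCK], key)
        if trace is not None:
            trace.append(1)
        pos += BLOCK
        remaining -= BLOCK
    # A trailing partial block (fewer than 16 bytes) is not processed.
    return bytes(out)
```

For a 100-byte input this model performs one four-block pass and two single-block passes, leaving the last 4 bytes unprocessed.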
2009-01-18 08:28:34 +03:00
/*
 * void aesni_ecb_dec(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
 *		      size_t len);
 */
2019-10-11 14:51:04 +03:00
SYM_FUNC_START(aesni_ecb_dec)
2016-01-22 01:49:19 +03:00
FRAME_BEGIN
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
pushl LEN
pushl KEYP
pushl KLEN
2016-01-22 01:49:19 +03:00
movl (FRAME_OFFSET+16)(%esp), KEYP	# ctx
movl (FRAME_OFFSET+20)(%esp), OUTP	# dst
movl (FRAME_OFFSET+24)(%esp), INP	# src
movl (FRAME_OFFSET+28)(%esp), LEN	# len
2010-11-27 11:34:46 +03:00
#endif
2009-01-18 08:28:34 +03:00
test LEN, LEN
jz .Lecb_dec_ret
mov 480(KEYP), KLEN
add $240, KEYP
cmp $16, LEN
jb .Lecb_dec_ret
cmp $64, LEN
jb .Lecb_dec_loop1
.align 4
.Lecb_dec_loop4:
movups (INP), STATE1
movups 0x10(INP), STATE2
movups 0x20(INP), STATE3
movups 0x30(INP), STATE4
call _aesni_dec4
movups STATE1, (OUTP)
movups STATE2, 0x10(OUTP)
movups STATE3, 0x20(OUTP)
movups STATE4, 0x30(OUTP)
sub $64, LEN
add $64, INP
add $64, OUTP
cmp $64, LEN
jge .Lecb_dec_loop4
cmp $16, LEN
jb .Lecb_dec_ret
.align 4
.Lecb_dec_loop1:
movups (INP), STATE1
call _aesni_dec1
movups STATE1, (OUTP)
sub $16, LEN
add $16, INP
add $16, OUTP
cmp $16, LEN
jge .Lecb_dec_loop1
.Lecb_dec_ret:
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
popl KLEN
popl KEYP
popl LEN
#endif
2016-01-22 01:49:19 +03:00
FRAME_END
2009-01-18 08:28:34 +03:00
ret
2019-10-11 14:51:04 +03:00
SYM_FUNC_END(aesni_ecb_dec)
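The constants used in aesni_ecb_dec above (`add $240, KEYP` to reach the decryption round keys, `mov 480(KEYP), KLEN` to read the key length) are consistent with a context made of two 60-word round-key arrays followed by a 32-bit length. The layout below is an assumption modelled on the kernel's `struct crypto_aes_ctx`; it is not defined in this file, and the Python class name and helper are hypothetical.

```python
import ctypes

AES_MAX_KEYLENGTH = 15 * 16                     # 15 round keys of 16 bytes
AES_MAX_KEYLENGTH_U32 = AES_MAX_KEYLENGTH // 4  # the same, in 32-bit words


class CryptoAesCtx(ctypes.Structure):
    """Assumed layout; field names follow the kernel struct."""
    _fields_ = [
        ("key_enc", ctypes.c_uint32 * AES_MAX_KEYLENGTH_U32),
        ("key_dec", ctypes.c_uint32 * AES_MAX_KEYLENGTH_U32),
        ("key_length", ctypes.c_uint32),
    ]


def field_offsets():
    """Return the byte offset of each field within the structure."""
    return {name: getattr(CryptoAesCtx, name).offset
            for name, _ in CryptoAesCtx._fields_}
```

Under this assumed layout the decryption keys start at byte 240 and the key length sits at byte 480, matching the two displacements in the assembly.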
2009-01-18 08:28:34 +03:00
/*
 * void aesni_cbc_enc(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
 *		      size_t len, u8 *iv)
 */
2019-10-11 14:51:04 +03:00
SYM_FUNC_START(aesni_cbc_enc)
2016-01-22 01:49:19 +03:00
FRAME_BEGIN
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
pushl IVP
pushl LEN
pushl KEYP
pushl KLEN
2016-01-22 01:49:19 +03:00
movl (FRAME_OFFSET+20)(%esp), KEYP	# ctx
movl (FRAME_OFFSET+24)(%esp), OUTP	# dst
movl (FRAME_OFFSET+28)(%esp), INP	# src
movl (FRAME_OFFSET+32)(%esp), LEN	# len
movl (FRAME_OFFSET+36)(%esp), IVP	# iv
2010-11-27 11:34:46 +03:00
#endif
2009-01-18 08:28:34 +03:00
cmp $16, LEN
jb .Lcbc_enc_ret
mov 480(KEYP), KLEN
movups (IVP), STATE	# load iv as initial state
.align 4
.Lcbc_enc_loop:
movups (INP), IN	# load input
pxor IN, STATE
call _aesni_enc1
movups STATE, (OUTP)	# store output
sub $16, LEN
add $16, INP
add $16, OUTP
cmp $16, LEN
jge .Lcbc_enc_loop
movups STATE, (IVP)
.Lcbc_enc_ret:
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
popl KLEN
popl KEYP
popl LEN
popl IVP
#endif
2016-01-22 01:49:19 +03:00
FRAME_END
2009-01-18 08:28:34 +03:00
ret
2019-10-11 14:51:04 +03:00
SYM_FUNC_END(aesni_cbc_enc)
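The chaining performed by aesni_cbc_enc above (load the IV as the initial state, XOR each input block into the state, encrypt, store, and finally write the last state back as the new IV) can be modelled as follows. This is a sketch under stated assumptions: `toy_encrypt_block` and `toy_decrypt_block` are hypothetical invertible stand-ins and not AES, and `cbc_decrypt` is included only so the round trip can be checked.

```python
BLOCK = 16  # AES block size in bytes


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(block, key):
    """Hypothetical stand-in (NOT AES): XOR with key, rotate left one byte."""
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]


def toy_decrypt_block(block, key):
    """Inverse of toy_encrypt_block: rotate right one byte, XOR with key."""
    unrotated = block[-1:] + block[:-1]
    return xor_bytes(unrotated, key)


def cbc_encrypt(src, key, iv):
    """Return (ciphertext, updated_iv); a partial trailing block is skipped."""
    state = iv
    out = bytearray()
    for pos in range(0, len(src) - BLOCK + 1, BLOCK):
        state = toy_encrypt_block(xor_bytes(src[pos:pos + BLOCK], state), key)
        out += state
    # With fewer than 16 bytes of input the IV is returned unchanged.
    return bytes(out), state


def cbc_decrypt(src, key, iv):
    """Return (plaintext, updated_iv) for whole blocks of src."""
    out = bytearray()
    for pos in range(0, len(src) - BLOCK + 1, BLOCK):
        block = src[pos:pos + BLOCK]
        out += xor_bytes(toy_decrypt_block(block, key), iv)
        iv = block  # the ciphertext block chains into the next one
    return bytes(out), iv
```

In this model the updated IV is always the last ciphertext block, which is what lets a caller continue the same CBC stream across several calls.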
2009-01-18 08:28:34 +03:00
/*
 * void aesni_cbc_dec(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
 *		      size_t len, u8 *iv)
 */
2019-10-11 14:51:04 +03:00
SYM_FUNC_START(aesni_cbc_dec)
2016-01-22 01:49:19 +03:00
FRAME_BEGIN
2010-11-27 11:34:46 +03:00
#ifndef __x86_64__
pushl IVP
pushl LEN
pushl KEYP
pushl KLEN
2016-01-22 01:49:19 +03:00
movl (FRAME_OFFSET+20)(%esp), KEYP	# ctx
movl (FRAME_OFFSET+24)(%esp), OUTP	# dst
movl (FRAME_OFFSET+28)(%esp), INP	# src
movl (FRAME_OFFSET+32)(%esp), LEN	# len
movl (FRAME_OFFSET+36)(%esp), IVP	# iv
2010-11-27 11:34:46 +03:00
#endif
2009-01-18 08:28:34 +03:00
cmp $16, LEN
2009-06-18 15:33:57 +04:00
jb .Lcbc_dec_just_ret
2009-01-18 08:28:34 +03:00
mov 480(KEYP), KLEN
add $240, KEYP
movups (IVP), IV
cmp $64, LEN
jb .Lcbc_dec_loop1
.align 4
.Lcbc_dec_loop4:
movups (INP), IN1
movaps IN1, STATE1
movups 0x10(INP), IN2
movaps IN2, STATE2
2010-11-27 11:34:46 +03:00
#ifdef __x86_64__
2009-01-18 08:28:34 +03:00
movups 0x20(INP), IN3
movaps IN3, STATE3
movups 0x30(INP), IN4
movaps IN4, STATE4
2010-11-27 11:34:46 +03:00
#else
movups 0x20(INP), IN1
movaps IN1, STATE3
movups 0x30(INP), IN2
movaps IN2, STATE4
#endif
2009-01-18 08:28:34 +03:00
call _aesni_dec4
pxor IV, STATE1
2010-11-27 11:34:46 +03:00
#ifdef __x86_64__
2009-01-18 08:28:34 +03:00
pxor IN1, STATE2
pxor IN2, STATE3
pxor IN3, STATE4
movaps IN4, IV
2010-11-27 11:34:46 +03:00
#else
pxor IN1, STATE4
movaps IN2, IV
2012-05-30 03:43:08 +04:00
movups (INP), IN1
pxor IN1, STATE2
movups 0x10(INP), IN2
pxor IN2, STATE3
2010-11-27 11:34:46 +03:00
#endif
	movups STATE1, (OUTP)
	movups STATE2, 0x10(OUTP)
	movups STATE3, 0x20(OUTP)
	movups STATE4, 0x30(OUTP)
	sub $64, LEN
	add $64, INP
	add $64, OUTP
	cmp $64, LEN
	jge .Lcbc_dec_loop4
	cmp $16, LEN
	jb .Lcbc_dec_ret
.align 4
.Lcbc_dec_loop1:
	movups (INP), IN
	movaps IN, STATE
	call _aesni_dec1
	pxor IV, STATE
	movups STATE, (OUTP)
	movaps IN, IV
	sub $16, LEN
	add $16, INP
	add $16, OUTP
	cmp $16, LEN
	jge .Lcbc_dec_loop1
.Lcbc_dec_ret:
	movups IV, (IVP)
.Lcbc_dec_just_ret:
#ifndef __x86_64__
	popl KLEN
	popl KEYP
	popl LEN
	popl IVP
#endif
	FRAME_END
	ret
SYM_FUNC_END(aesni_cbc_dec)

/*
 * void aesni_cts_cbc_enc(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
 *			  size_t len, u8 *iv)
 */
SYM_FUNC_START(aesni_cts_cbc_enc)
	FRAME_BEGIN
#ifndef __x86_64__
	pushl IVP
	pushl LEN
	pushl KEYP
	pushl KLEN
	movl (FRAME_OFFSET+20)(%esp), KEYP	# ctx
	movl (FRAME_OFFSET+24)(%esp), OUTP	# dst
	movl (FRAME_OFFSET+28)(%esp), INP	# src
	movl (FRAME_OFFSET+32)(%esp), LEN	# len
	movl (FRAME_OFFSET+36)(%esp), IVP	# iv
	lea .Lcts_permute_table, T1
#else
	lea .Lcts_permute_table(%rip), T1
#endif
	mov 480(KEYP), KLEN
	movups (IVP), STATE
	sub $16, LEN				# LEN = len - 16 (tail length)
	mov T1, IVP
	add $32, IVP
	add LEN, T1
	sub LEN, IVP
	movups (T1), %xmm4			# mask: first LEN bytes to the back
	movups (IVP), %xmm5			# mask: last LEN bytes to the front

	movups (INP), IN1
	add LEN, INP
	movups (INP), IN2

	pxor IN1, STATE
	call _aesni_enc1

	pshufb %xmm5, IN2
	pxor STATE, IN2
	pshufb %xmm4, STATE
	add OUTP, LEN
	movups STATE, (LEN)

	movaps IN2, STATE
	call _aesni_enc1
	movups STATE, (OUTP)

#ifndef __x86_64__
	popl KLEN
	popl KEYP
	popl LEN
	popl IVP
#endif
	FRAME_END
	ret
SYM_FUNC_END(aesni_cts_cbc_enc)

/*
 * void aesni_cts_cbc_dec(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
 *			  size_t len, u8 *iv)
 */
SYM_FUNC_START(aesni_cts_cbc_dec)
	FRAME_BEGIN
#ifndef __x86_64__
	pushl IVP
	pushl LEN
	pushl KEYP
	pushl KLEN
	movl (FRAME_OFFSET+20)(%esp), KEYP	# ctx
	movl (FRAME_OFFSET+24)(%esp), OUTP	# dst
	movl (FRAME_OFFSET+28)(%esp), INP	# src
	movl (FRAME_OFFSET+32)(%esp), LEN	# len
	movl (FRAME_OFFSET+36)(%esp), IVP	# iv
	lea .Lcts_permute_table, T1
#else
	lea .Lcts_permute_table(%rip), T1
#endif
	mov 480(KEYP), KLEN
	add $240, KEYP
	movups (IVP), IV
	sub $16, LEN
	mov T1, IVP
	add $32, IVP
	add LEN, T1
	sub LEN, IVP
	movups (T1), %xmm4

	movups (INP), STATE
	add LEN, INP
	movups (INP), IN1

	call _aesni_dec1
	movaps STATE, IN2
	pshufb %xmm4, STATE
	pxor IN1, STATE

	add OUTP, LEN
	movups STATE, (LEN)

	movups (IVP), %xmm0
	pshufb %xmm0, IN1
	pblendvb IN2, IN1
	movaps IN1, STATE
	call _aesni_dec1

	pxor IV, STATE
	movups STATE, (OUTP)

#ifndef __x86_64__
	popl KLEN
	popl KEYP
	popl LEN
	popl IVP
#endif
	FRAME_END
	ret
SYM_FUNC_END(aesni_cts_cbc_dec)

x86/asm/crypto: Move .Lbswap_mask data to .rodata section
stacktool reports the following warning:
stacktool: arch/x86/crypto/aesni-intel_asm.o: _aesni_inc_init(): can't find starting instruction
stacktool gets confused when it tries to disassemble the following data
in the .text section:
.Lbswap_mask:
.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
Move it to .rodata which is a more appropriate section for read-only
data.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/b6a2f3f8bda705143e127c025edb2b53c86e6eb4.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-01-22 01:49:15 +03:00
.pushsection .rodata
.align 16
.Lcts_permute_table:
	.byte		0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80
	.byte		0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80
	.byte		0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07
	.byte		0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
	.byte		0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80
	.byte		0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80
#ifdef __x86_64__
.Lbswap_mask:
	.byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
#endif
.popsection
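/*
 * Illustrative note, not part of the original source. The C sketch below
 * models how a 16-byte window slid over .Lcts_permute_table behaves when it
 * is used as a PSHUFB mask, as the CTS routines do: a window starting at
 * offset (32 - tail) moves the last `tail` bytes of a block to the front,
 * and a window starting at offset `tail` moves the first `tail` bytes to
 * the back; 0x80 entries produce zero bytes. The names pshufb_model,
 * cts_tail_to_front and cts_head_to_back are hypothetical helpers, and
 * `tail` is assumed to lie in the range 1..16.
 *
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Same 48 bytes as .Lcts_permute_table: 16 x 0x80, then 0..15, then 16 x 0x80.
static const uint8_t cts_permute_table[48] = {
	0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80,
	0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80,
	0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
	0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
	0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80,
	0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80, 0x80,
};

// Model of PSHUFB on 16 bytes: a mask byte with bit 7 set yields 0,
// otherwise its low 4 bits select a source byte.
static void pshufb_model(uint8_t dst[16], const uint8_t src[16],
			 const uint8_t mask[16])
{
	uint8_t tmp[16];

	for (int i = 0; i < 16; i++)
		tmp[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0f];
	memcpy(dst, tmp, 16);	// tmp lets dst and src alias
}

// Move the last `tail` bytes of blk to the front; the rest becomes zero.
static void cts_tail_to_front(uint8_t out[16], const uint8_t blk[16],
			      size_t tail)
{
	assert(tail >= 1 && tail <= 16);
	pshufb_model(out, blk, cts_permute_table + 32 - tail);
}

// Move the first `tail` bytes of blk to the back; the rest becomes zero.
static void cts_head_to_back(uint8_t out[16], const uint8_t blk[16],
			     size_t tail)
{
	assert(tail >= 1 && tail <= 16);
	pshufb_model(out, blk, cts_permute_table + tail);
}
```
 *
 * With tail == 16 both windows start at offset 16, which is the identity
 * permutation, so a full final block passes through unchanged.
 */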

#ifdef __x86_64__
/*
 * _aesni_inc_init:	internal ABI
 *	setup registers used by _aesni_inc
 * input:
 *	IV
 * output:
 *	CTR:	== IV, in little endian
 *	TCTR_LOW: == lower qword of CTR
 *	INC:	== 1, in little endian
 *	BSWAP_MASK == endian swapping mask
 */
SYM_FUNC_START_LOCAL(_aesni_inc_init)
	movaps .Lbswap_mask, BSWAP_MASK
	movaps IV, CTR
crypto: x86 - Remove include/asm/inst.h
Current minimum required version of binutils is 2.23,
which supports PSHUFB, PCLMULQDQ, PEXTRD, AESKEYGENASSIST,
AESIMC, AESENC, AESENCLAST, AESDEC, AESDECLAST and MOVQ
instruction mnemonics.
Substitute macros from include/asm/inst.h with proper
instruction mnemonics in various assembly files from the
x86/crypto directory, and remove the now unneeded file.
The patch was tested by calculating and comparing sha256sum
hashes of stripped object files before and after the patch,
to be sure that executable code didn't change.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: "David S. Miller" <davem@davemloft.net>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-07-09 18:08:57 +03:00
	pshufb BSWAP_MASK, CTR
	mov $1, TCTR_LOW
	movq TCTR_LOW, INC
	movq CTR, TCTR_LOW
	ret
SYM_FUNC_END(_aesni_inc_init)

/*
 * _aesni_inc:		internal ABI
 *	Increase IV by 1, IV is in big endian
 * input:
 *	IV
 *	CTR:	== IV, in little endian
 *	TCTR_LOW: == lower qword of CTR
 *	INC:	== 1, in little endian
 *	BSWAP_MASK == endian swapping mask
 * output:
 *	IV:	Increase by 1
 * changed:
 *	CTR:	== output IV, in little endian
 *	TCTR_LOW: == lower qword of CTR
 */
SYM_FUNC_START_LOCAL(_aesni_inc)
	paddq INC, CTR
	add $1, TCTR_LOW
	jnc .Linc_low
	pslldq $8, INC
	paddq INC, CTR
	psrldq $8, INC
.Linc_low:
	movaps CTR, IV
	pshufb BSWAP_MASK, IV
	ret
SYM_FUNC_END(_aesni_inc)
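/*
 * Illustrative note, not part of the original source. The C sketch below
 * mirrors what _aesni_inc computes: the big-endian 128-bit counter block is
 * viewed as two qwords, the low qword is incremented, and a carry out of it
 * is propagated into the high qword before the block is swapped back to big
 * endian. The name ctr128_inc_be is a hypothetical helper; the assembly
 * keeps the swapped value in CTR across calls instead of converting on
 * every increment as this sketch does.
 *
```c
#include <assert.h>
#include <stdint.h>

// Increment a 16-byte big-endian counter block in place.
static void ctr128_inc_be(uint8_t iv[16])
{
	uint64_t hi = 0, lo = 0;

	assert(iv);
	// Gather the big-endian bytes into two native qwords (the CTR view).
	for (int i = 0; i < 8; i++) {
		hi = (hi << 8) | iv[i];
		lo = (lo << 8) | iv[8 + i];
	}
	lo += 1;		// paddq INC, CTR together with add $1, TCTR_LOW
	if (lo == 0)		// carry out of the low qword (jnc not taken)
		hi += 1;	// second paddq, with INC shifted to the high qword
	// Scatter back to big endian (pshufb BSWAP_MASK, IV).
	for (int i = 7; i >= 0; i--) {
		iv[i] = (uint8_t)hi;
		iv[8 + i] = (uint8_t)lo;
		hi >>= 8;
		lo >>= 8;
	}
}
```
 */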

/*
 * void aesni_ctr_enc(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src,
 *		      size_t len, u8 *iv)
 */
SYM_FUNC_START(aesni_ctr_enc)
	FRAME_BEGIN
	cmp $16, LEN
	jb .Lctr_enc_just_ret
	mov 480(KEYP), KLEN
	movups (IVP), IV
	call _aesni_inc_init
	cmp $64, LEN
	jb .Lctr_enc_loop1
.align 4
.Lctr_enc_loop4:
	movaps IV, STATE1
	call _aesni_inc
	movups (INP), IN1
	movaps IV, STATE2
	call _aesni_inc
	movups 0x10(INP), IN2
	movaps IV, STATE3
	call _aesni_inc
	movups 0x20(INP), IN3
	movaps IV, STATE4
	call _aesni_inc
	movups 0x30(INP), IN4
	call _aesni_enc4
	pxor IN1, STATE1
	movups STATE1, (OUTP)
	pxor IN2, STATE2
	movups STATE2, 0x10(OUTP)
	pxor IN3, STATE3
	movups STATE3, 0x20(OUTP)
	pxor IN4, STATE4
	movups STATE4, 0x30(OUTP)
	sub $64, LEN
	add $64, INP
	add $64, OUTP
	cmp $64, LEN
	jge .Lctr_enc_loop4
	cmp $16, LEN
	jb .Lctr_enc_ret
.align 4
.Lctr_enc_loop1:
	movaps IV, STATE
	call _aesni_inc
	movups (INP), IN
	call _aesni_enc1
	pxor IN, STATE
	movups STATE, (OUTP)
	sub $16, LEN
	add $16, INP
	add $16, OUTP
	cmp $16, LEN
	jge .Lctr_enc_loop1
.Lctr_enc_ret:
	movups IV, (IVP)
.Lctr_enc_just_ret:
	FRAME_END
	ret
SYM_FUNC_END(aesni_ctr_enc)

/*
 * _aesni_gf128mul_x_ble:		internal ABI
 *	Multiply in GF(2^128) for XTS IVs
 * input:
 *	IV:	current IV
 *	GF128MUL_MASK == mask with 0x87 and 0x01
 * output:
 *	IV:	next IV
 * changed:
 *	CTR:	== temporary value
 */
#define _aesni_gf128mul_x_ble() \
	pshufd $0x13, IV, CTR; \
	paddq IV, IV; \
	psrad $31, CTR; \
	pand GF128MUL_MASK, CTR; \
	pxor CTR, IV;
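
/*
 * Illustrative note, not part of the original source. The C sketch below
 * models the tweak update that the macro above performs: the 128-bit value
 * is doubled, bit 63 is carried into the high qword, and a carry out of
 * bit 127 is folded back by XORing 0x87 into the low byte. The struct
 * tweak128 and the function gf128mul_x_ble_model are hypothetical names,
 * and the sketch assumes GF128MUL_MASK holds 0x87 in its low qword and
 * 0x01 in its high qword.
 *
```c
#include <assert.h>
#include <stdint.h>

// 128-bit tweak as two qwords: lo holds bits 0..63, hi holds bits 64..127.
struct tweak128 {
	uint64_t lo;
	uint64_t hi;
};

// Multiply the tweak by x in GF(2^128), little-endian block convention.
static void gf128mul_x_ble_model(struct tweak128 *t)
{
	assert(t);
	uint64_t carry_lo = t->lo >> 63;	// bit 63 moves into the high qword
	uint64_t carry_hi = t->hi >> 63;	// bit 127 is reduced via 0x87

	// paddq doubles each qword on its own, so both carries are patched in.
	t->hi = (t->hi << 1) ^ carry_lo;
	t->lo = (t->lo << 1) ^ (carry_hi ? 0x87 : 0);
}
```
 */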

/*
 * void aesni_xts_encrypt(const struct crypto_aes_ctx *ctx, u8 *dst,
 *			  const u8 *src, unsigned int len, le128 *iv)
 */
SYM_FUNC_START(aesni_xts_encrypt)
	FRAME_BEGIN

	movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
	movups (IVP), IV

	mov 480(KEYP), KLEN

.Lxts_enc_loop4:
	movdqa IV, STATE1
	movdqu 0x00(INP), INC
	pxor INC, STATE1
	movdqu IV, 0x00(OUTP)

	_aesni_gf128mul_x_ble()
	movdqa IV, STATE2
	movdqu 0x10(INP), INC
	pxor INC, STATE2
	movdqu IV, 0x10(OUTP)

	_aesni_gf128mul_x_ble()
	movdqa IV, STATE3
	movdqu 0x20(INP), INC
	pxor INC, STATE3
	movdqu IV, 0x20(OUTP)

	_aesni_gf128mul_x_ble()
	movdqa IV, STATE4
	movdqu 0x30(INP), INC
	pxor INC, STATE4
	movdqu IV, 0x30(OUTP)

	call _aesni_enc4

	movdqu 0x00(OUTP), INC
	pxor INC, STATE1
	movdqu STATE1, 0x00(OUTP)

	movdqu 0x10(OUTP), INC
	pxor INC, STATE2
	movdqu STATE2, 0x10(OUTP)

	movdqu 0x20(OUTP), INC
	pxor INC, STATE3
	movdqu STATE3, 0x20(OUTP)

	movdqu 0x30(OUTP), INC
	pxor INC, STATE4
	movdqu STATE4, 0x30(OUTP)

	_aesni_gf128mul_x_ble()

	add $64, INP
	add $64, OUTP
	sub $64, LEN
	ja .Lxts_enc_loop4

	movups IV, (IVP)

	FRAME_END
	ret
SYM_FUNC_END(aesni_xts_encrypt)

/*
 * void aesni_xts_decrypt(const struct crypto_aes_ctx *ctx, u8 *dst,
 *			  const u8 *src, unsigned int len, le128 *iv)
 */
SYM_FUNC_START(aesni_xts_decrypt)
	FRAME_BEGIN

	movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
	movups (IVP), IV

	mov 480(KEYP), KLEN
	add $240, KEYP

.Lxts_dec_loop4:
	movdqa IV, STATE1
	movdqu 0x00(INP), INC
	pxor INC, STATE1
	movdqu IV, 0x00(OUTP)

	_aesni_gf128mul_x_ble()
	movdqa IV, STATE2
	movdqu 0x10(INP), INC
	pxor INC, STATE2
	movdqu IV, 0x10(OUTP)

	_aesni_gf128mul_x_ble()
	movdqa IV, STATE3
	movdqu 0x20(INP), INC
	pxor INC, STATE3
	movdqu IV, 0x20(OUTP)

	_aesni_gf128mul_x_ble()
	movdqa IV, STATE4
	movdqu 0x30(INP), INC
	pxor INC, STATE4
	movdqu IV, 0x30(OUTP)

	call _aesni_dec4

	movdqu 0x00(OUTP), INC
	pxor INC, STATE1
	movdqu STATE1, 0x00(OUTP)

	movdqu 0x10(OUTP), INC
	pxor INC, STATE2
	movdqu STATE2, 0x10(OUTP)

	movdqu 0x20(OUTP), INC
	pxor INC, STATE3
	movdqu STATE3, 0x20(OUTP)

	movdqu 0x30(OUTP), INC
	pxor INC, STATE4
	movdqu STATE4, 0x30(OUTP)

	_aesni_gf128mul_x_ble()

	add $64, INP
	add $64, OUTP
	sub $64, LEN
	ja .Lxts_dec_loop4

	movups IV, (IVP)

	FRAME_END
	ret
SYM_FUNC_END(aesni_xts_decrypt)

#endif