//
// Accelerated CRC-T10DIF using ARM NEON and Crypto Extensions instructions
//
// Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
// Copyright (C) 2019 Google LLC <ebiggers@google.com>
//
// This program is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License version 2 as
// published by the Free Software Foundation.
//
// Derived from the x86 version:
//
// Implement fast CRC-T10DIF computation with SSE and PCLMULQDQ instructions
//
// Copyright (c) 2013, Intel Corporation
//
// Authors:
//	Erdinc Ozturk <erdinc.ozturk@intel.com>
//	Vinodh Gopal <vinodh.gopal@intel.com>
//	James Guilford <james.guilford@intel.com>
//	Tim Chen <tim.c.chen@linux.intel.com>
//
// This software is available to you under a choice of one of two
// licenses.  You may choose to be licensed under the terms of the GNU
// General Public License (GPL) Version 2, available from the file
// COPYING in the main directory of this source tree, or the
// OpenIB.org BSD license below:
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
//   notice, this list of conditions and the following disclaimer.
//
// * Redistributions in binary form must reproduce the above copyright
//   notice, this list of conditions and the following disclaimer in the
//   documentation and/or other materials provided with the
//   distribution.
//
// * Neither the name of the Intel Corporation nor the names of its
//   contributors may be used to endorse or promote products derived from
//   this software without specific prior written permission.
//
//
// THIS SOFTWARE IS PROVIDED BY INTEL CORPORATION ""AS IS"" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL INTEL CORPORATION OR
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
// LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Reference paper titled "Fast CRC Computation for Generic
// Polynomials Using PCLMULQDQ Instruction"
// URL: http://www.intel.com/content/dam/www/public/us/en/documents
// /white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
//
#include <linux/linkage.h>
#include <asm/assembler.h>

#ifdef CONFIG_CPU_ENDIAN_BE8
#define CPU_LE(code...)
#else
#define CPU_LE(code...)		code
#endif

	.text
	.arch		armv8-a
	.fpu		crypto-neon-fp-armv8

	init_crc	.req	r0
	buf		.req	r1
	len		.req	r2

	fold_consts_ptr	.req	ip

	q0l	.req	d0
	q0h	.req	d1
	q1l	.req	d2
	q1h	.req	d3
	q2l	.req	d4
	q2h	.req	d5
	q3l	.req	d6
	q3h	.req	d7
	q4l	.req	d8
	q4h	.req	d9
	q5l	.req	d10
	q5h	.req	d11
	q6l	.req	d12
	q6h	.req	d13
	q7l	.req	d14
	q7h	.req	d15

	q8l	.req	d16
	q8h	.req	d17
	q9l	.req	d18
	q9h	.req	d19
	q10l	.req	d20
	q10h	.req	d21
	q11l	.req	d22
	q11h	.req	d23
	q12l	.req	d24
	q12h	.req	d25

	FOLD_CONSTS	.req	q10
	FOLD_CONST_L	.req	q10l
	FOLD_CONST_H	.req	q10h

	// Fold reg1, reg2 into the next 32 data bytes, storing the result back
	// into reg1, reg2.
	.macro		fold_32_bytes, reg1, reg2
	vld1.64		{q11-q12}, [buf]!

	vmull.p64	q8, \reg1\()h, FOLD_CONST_H
	vmull.p64	\reg1, \reg1\()l, FOLD_CONST_L
	vmull.p64	q9, \reg2\()h, FOLD_CONST_H
	vmull.p64	\reg2, \reg2\()l, FOLD_CONST_L

CPU_LE(	vrev64.8	q11, q11	)
CPU_LE(	vrev64.8	q12, q12	)
	vswp		q11l, q11h
	vswp		q12l, q12h

	veor.8		\reg1, \reg1, q8
	veor.8		\reg2, \reg2, q9
	veor.8		\reg1, \reg1, q11
	veor.8		\reg2, \reg2, q12
	.endm

	// Fold src_reg into dst_reg, optionally loading the next fold constants
	.macro		fold_16_bytes, src_reg, dst_reg, load_next_consts
	vmull.p64	q8, \src_reg\()l, FOLD_CONST_L
	vmull.p64	\src_reg, \src_reg\()h, FOLD_CONST_H
	.ifnb		\load_next_consts
	vld1.64		{FOLD_CONSTS}, [fold_consts_ptr, :128]!
	.endif
	veor.8		\dst_reg, \dst_reg, q8
	veor.8		\dst_reg, \dst_reg, \src_reg
	.endm

	.macro		__adrl, out, sym
	movw		\out, #:lower16:\sym
	movt		\out, #:upper16:\sym
	.endm

//
// u16 crc_t10dif_pmull(u16 init_crc, const u8 *buf, size_t len);
//
// Assumes len >= 16.
//
ENTRY(crc_t10dif_pmull)

	// For sizes less than 256 bytes, we can't fold 128 bytes at a time.
	cmp		len, #256
	blt		.Lless_than_256_bytes

	__adrl		fold_consts_ptr, .Lfold_across_128_bytes_consts

	// Load the first 128 data bytes.  Byte swapping is necessary to make
	// the bit order match the polynomial coefficient order.
	vld1.64		{q0-q1}, [buf]!
	vld1.64		{q2-q3}, [buf]!
	vld1.64		{q4-q5}, [buf]!
	vld1.64		{q6-q7}, [buf]!
CPU_LE(	vrev64.8	q0, q0	)
CPU_LE(	vrev64.8	q1, q1	)
CPU_LE(	vrev64.8	q2, q2	)
CPU_LE(	vrev64.8	q3, q3	)
CPU_LE(	vrev64.8	q4, q4	)
CPU_LE(	vrev64.8	q5, q5	)
CPU_LE(	vrev64.8	q6, q6	)
CPU_LE(	vrev64.8	q7, q7	)
	vswp		q0l, q0h
	vswp		q1l, q1h
	vswp		q2l, q2h
	vswp		q3l, q3h
	vswp		q4l, q4h
	vswp		q5l, q5h
	vswp		q6l, q6h
	vswp		q7l, q7h

	// XOR the first 16 data *bits* with the initial CRC value.
	vmov.i8		q8h, #0
	vmov.u16	q8h[3], init_crc
	veor		q0h, q0h, q8h

	// Load the constants for folding across 128 bytes.
	vld1.64		{FOLD_CONSTS}, [fold_consts_ptr, :128]!

	// Subtract 128 for the 128 data bytes just consumed.  Subtract another
	// 128 to simplify the termination condition of the following loop.
	sub		len, len, #256

	// While >= 128 data bytes remain (not counting q0-q7), fold the 128
	// bytes q0-q7 into them, storing the result back into q0-q7.
.Lfold_128_bytes_loop:
	fold_32_bytes	q0, q1
	fold_32_bytes	q2, q3
	fold_32_bytes	q4, q5
	fold_32_bytes	q6, q7
	subs		len, len, #128
	bge		.Lfold_128_bytes_loop

	// Now fold the 112 bytes in q0-q6 into the 16 bytes in q7.

	// Fold across 64 bytes.
	vld1.64		{FOLD_CONSTS}, [fold_consts_ptr, :128]!
	fold_16_bytes	q0, q4
	fold_16_bytes	q1, q5
	fold_16_bytes	q2, q6
	fold_16_bytes	q3, q7, 1
	// Fold across 32 bytes.
	fold_16_bytes	q4, q6
	fold_16_bytes	q5, q7, 1
	// Fold across 16 bytes.
	fold_16_bytes	q6, q7

	// Add 128 to get the correct number of data bytes remaining in 0...127
	// (not counting q7), following the previous extra subtraction by 128.
	// Then subtract 16 to simplify the termination condition of the
	// following loop.
	adds		len, len, #(128-16)

	// While >= 16 data bytes remain (not counting q7), fold the 16 bytes q7
	// into them, storing the result back into q7.
	blt		.Lfold_16_bytes_loop_done
.Lfold_16_bytes_loop:
	vmull.p64	q8, q7l, FOLD_CONST_L
	vmull.p64	q7, q7h, FOLD_CONST_H
	veor.8		q7, q7, q8
	vld1.64		{q0}, [buf]!
CPU_LE(	vrev64.8	q0, q0	)
	vswp		q0l, q0h
	veor.8		q7, q7, q0
	subs		len, len, #16
	bge		.Lfold_16_bytes_loop

.Lfold_16_bytes_loop_done:
	// Add 16 to get the correct number of data bytes remaining in 0...15
	// (not counting q7), following the previous extra subtraction by 16.
	adds		len, len, #16
	beq		.Lreduce_final_16_bytes

.Lhandle_partial_segment:
	// Reduce the last '16 + len' bytes where 1 <= len <= 15 and the first
	// 16 bytes are in q7 and the rest are the remaining data in 'buf'.  To
	// do this without needing a fold constant for each possible 'len',
	// redivide the bytes into a first chunk of 'len' bytes and a second
	// chunk of 16 bytes, then fold the first chunk into the second.

	// q0 = last 16 original data bytes
	add		buf, buf, len
	sub		buf, buf, #16
	vld1.64		{q0}, [buf]
CPU_LE(	vrev64.8	q0, q0	)
	vswp		q0l, q0h

	// q1 = high order part of second chunk: q7 left-shifted by 'len' bytes.
	__adrl		r3, .Lbyteshift_table + 16
	sub		r3, r3, len
	vld1.8		{q2}, [r3]
	vtbl.8		q1l, {q7l-q7h}, q2l
	vtbl.8		q1h, {q7l-q7h}, q2h

	// q3 = first chunk: q7 right-shifted by '16-len' bytes.
	vmov.i8		q3, #0x80
	veor.8		q2, q2, q3
	vtbl.8		q3l, {q7l-q7h}, q2l
	vtbl.8		q3h, {q7l-q7h}, q2h

	// Convert to 8-bit masks: 'len' 0x00 bytes, then '16-len' 0xff bytes.
	vshr.s8		q2, q2, #7

	// q2 = second chunk: 'len' bytes from q0 (low-order bytes),
	// then '16-len' bytes from q1 (high-order bytes).
	vbsl.8		q2, q1, q0

	// Fold the first chunk into the second chunk, storing the result in q7.
	vmull.p64	q0, q3l, FOLD_CONST_L
	vmull.p64	q7, q3h, FOLD_CONST_H
	veor.8		q7, q7, q0
	veor.8		q7, q7, q2

.Lreduce_final_16_bytes:
	// Reduce the 128-bit value M(x), stored in q7, to the final 16-bit CRC.

	// Load 'x^48 * (x^48 mod G(x))' and 'x^48 * (x^80 mod G(x))'.
	vld1.64		{FOLD_CONSTS}, [fold_consts_ptr, :128]!

	// Fold the high 64 bits into the low 64 bits, while also multiplying by
	// x^64.  This produces a 128-bit value congruent to x^64 * M(x) and
	// whose low 48 bits are 0.
	vmull.p64	q0, q7h, FOLD_CONST_H	// high bits * x^48 * (x^80 mod G(x))
	veor.8		q0h, q0h, q7l		// + low bits * x^64

	// Fold the high 32 bits into the low 96 bits.  This produces a 96-bit
	// value congruent to x^64 * M(x) and whose low 48 bits are 0.
	vmov.i8		q1, #0
	vmov		s4, s3		// extract high 32 bits
	vmov		s3, s5		// zero high 32 bits
	vmull.p64	q1, q1l, FOLD_CONST_L	// high 32 bits * x^48 * (x^48 mod G(x))
	veor.8		q0, q0, q1		// + low bits

	// Load G(x) and floor(x^48 / G(x)).
	vld1.64		{FOLD_CONSTS}, [fold_consts_ptr, :128]

	// Use Barrett reduction to compute the final CRC value.
	vmull.p64	q1, q0h, FOLD_CONST_H	// high 32 bits * floor(x^48 / G(x))
	vshr.u64	q1l, q1l, #32		// /= x^32
	vmull.p64	q1, q1l, FOLD_CONST_L	// *= G(x)
	vshr.u64	q0l, q0l, #48
	veor.8		q0l, q0l, q1l		// + low 16 nonzero bits
	// Final CRC value (x^16 * M(x)) mod G(x) is in low 16 bits of q0.

	vmov.u16	r0, q0l[0]
	bx		lr

.Lless_than_256_bytes:
	// Checksumming a buffer of length 16...255 bytes

	__adrl		fold_consts_ptr, .Lfold_across_16_bytes_consts

	// Load the first 16 data bytes.
	vld1.64		{q7}, [buf]!
CPU_LE(	vrev64.8	q7, q7	)
	vswp		q7l, q7h

	// XOR the first 16 data *bits* with the initial CRC value.
	vmov.i8		q0h, #0
	vmov.u16	q0h[3], init_crc
	veor.8		q7h, q7h, q0h

	// Load the fold-across-16-bytes constants.
	vld1.64		{FOLD_CONSTS}, [fold_consts_ptr, :128]!

	cmp		len, #16
	beq		.Lreduce_final_16_bytes		// len == 16
	subs		len, len, #32
	addlt		len, len, #16
	blt		.Lhandle_partial_segment	// 17 <= len <= 31
	b		.Lfold_16_bytes_loop		// 32 <= len <= 255
ENDPROC(crc_t10dif_pmull)

	.section	".rodata", "a"
	.align		4

// Fold constants precomputed from the polynomial 0x18bb7
// G(x) = x^16 + x^15 + x^11 + x^9 + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + x^0
.Lfold_across_128_bytes_consts:
	.quad		0x0000000000006123	// x^(8*128)	mod G(x)
	.quad		0x0000000000002295	// x^(8*128+64)	mod G(x)
// .Lfold_across_64_bytes_consts:
	.quad		0x0000000000001069	// x^(4*128)	mod G(x)
	.quad		0x000000000000dd31	// x^(4*128+64)	mod G(x)
// .Lfold_across_32_bytes_consts:
	.quad		0x000000000000857d	// x^(2*128)	mod G(x)
	.quad		0x0000000000007acc	// x^(2*128+64)	mod G(x)
.Lfold_across_16_bytes_consts:
	.quad		0x000000000000a010	// x^(1*128)	mod G(x)
	.quad		0x0000000000001faa	// x^(1*128+64)	mod G(x)
// .Lfinal_fold_consts:
	.quad		0x1368000000000000	// x^48 * (x^48 mod G(x))
	.quad		0x2d56000000000000	// x^48 * (x^80 mod G(x))
// .Lbarrett_reduction_consts:
	.quad		0x0000000000018bb7	// G(x)
	.quad		0x00000001f65a57f8	// floor(x^48 / G(x))

// For 1 <= len <= 15, the 16-byte vector beginning at &byteshift_table[16 -
// len] is the index vector to shift left by 'len' bytes, and is also {0x80,
// ..., 0x80} XOR the index vector to shift right by '16-len' bytes.
.Lbyteshift_table:
	.byte		 0x0, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87
	.byte		0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f
	.byte		 0x0,  0x1,  0x2,  0x3,  0x4,  0x5,  0x6,  0x7
	.byte		 0x8,  0x9,  0xa,  0xb,  0xc,  0xd,  0xe , 0x0
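
// For readers working on or testing this code: the function above computes
// plain CRC-T10DIF (16-bit CRC, polynomial G(x) = 0x18bb7, MSB-first, zero
// initial value, no final XOR).  A minimal bitwise Python reference, useful
// for cross-checking the PMULL output, is sketched below.  The function name
// `crc_t10dif_ref` is illustrative, not part of the kernel source.

```python
def crc_t10dif_ref(data: bytes, crc: int = 0) -> int:
    """Bitwise (non-accelerated) CRC-T10DIF reference.

    Polynomial 0x8BB7 (0x18bb7 with the implicit x^16 term dropped),
    MSB-first, init 0, no reflection, no final XOR.
    """
    for byte in data:
        crc ^= byte << 8              # bring next 8 data bits into the top
        for _ in range(8):
            if crc & 0x8000:          # top bit set: reduce by G(x)
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

# Standard check value for CRC-16/T10-DIF.
assert crc_t10dif_ref(b"123456789") == 0xD0DB
# The running CRC chains across buffer segments, matching how the assembly
# XORs the first 16 data bits with the incoming init_crc argument.
assert crc_t10dif_ref(b"456789", crc_t10dif_ref(b"123")) == 0xD0DB
```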