|
|	setox.sa 3.1 12/10/90
|
|	The entry point setox computes the exponential of a value.
|	setoxd does the same except the input value is a denormalized
|	number. setoxm1 computes exp(X)-1, and setoxm1d computes
|	exp(X)-1 for denormalized X.
|
|	INPUT
|	-----
|	Double-extended value in memory location pointed to by address
|	register a0.
|
|	OUTPUT
|	------
|	exp(X) or exp(X)-1 returned in floating-point register fp0.
|
|	ACCURACY and MONOTONICITY
|	-------------------------
|	The returned result is within 0.85 ulps in 64 significant bits, i.e.
|	within 0.5001 ulp to 53 bits if the result is subsequently rounded
|	to double precision. The result is provably monotonic in double
|	precision.
|
|	SPEED
|	-----
|	Two timings are measured, both in the copy-back mode. The
|	first one is measured when the function is invoked the first time
|	(so the instructions and data are not in cache), and the
|	second one is measured when the function is reinvoked at the same
|	input argument.
|
|	The program setox takes approximately 210/190 cycles for input
|	argument X whose magnitude is less than 16380 log2, which
|	is the usual situation. For the less common arguments,
|	depending on their values, the program may run faster or slower --
|	but no worse than 10% slower even in the extreme cases.
|
|	The program setoxm1 takes approximately ???/??? cycles for input
|	argument X, 0.25 <= |X| < 70log2. For |X| < 0.25, it takes
|	approximately ???/??? cycles. For the less common arguments,
|	depending on their values, the program may run faster or slower --
|	but no worse than 10% slower even in the extreme cases.
|
|	ALGORITHM and IMPLEMENTATION NOTES
|	----------------------------------
|
|	setoxd
|	------
|	Step 1.	Set ans := 1.0
|
|	Step 2.	Return ans := ans + sign(X)*2^(-126). Exit.
|	Notes:	This will always generate one exception -- inexact.
|
|
|	setox
|	-----
|
|	Step 1.	Filter out extreme cases of input argument.
|		1.1	If |X| >= 2^(-65), go to Step 1.3.
|		1.2	Go to Step 7.
|		1.3	If |X| < 16380 log(2), go to Step 2.
|		1.4	Go to Step 8.
|	Notes:	The usual case should take the branches 1.1 -> 1.3 -> 2.
|		To avoid the use of floating-point comparisons, a
|		compact representation of |X| is used. This format is a
|		32-bit integer, the upper (more significant) 16 bits are
|		the sign and biased exponent field of |X|; the lower 16
|		bits are the 16 most significant fraction (including the
|		explicit bit) bits of |X|. Consequently, the comparisons
|		in Steps 1.1 and 1.3 can be performed by integer comparison.
|		Note also that the constant 16380 log(2) used in Step 1.3
|		is also in the compact form. Thus taking the branch
|		to Step 2 guarantees |X| < 16380 log(2). There is no harm
|		in having a small number of cases where |X| is less than,
|		but close to, 16380 log(2) and the branch to Step 8 is
|		taken.
|
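|	Illustration (not part of the original package): the compact-form
|	test of Step 1 can be sketched in Python as below. The names
|	compact() and setox_branch() are hypothetical, and a Python double
|	stands in for the double-extended input, so only inputs within
|	double range are modelled. The two thresholds are the immediates
|	used by the cmpil instructions in setox.
|
```python
import math

EXP_2_M65 = 0x3FBE0000      # biased exponent of 2^(-65) in the upper 16 bits
BIG_16380LOG2 = 0x400CB167  # 16380 log2, truncated to 16 significand bits


def compact(x: float) -> int:
    """Pack |x| as (biased exponent << 16) | top 16 significand bits."""
    mant, exp = math.frexp(abs(x))   # |x| = mant * 2**exp, 0.5 <= mant < 1
    biased = (exp - 1) + 0x3FFF      # extended-precision exponent bias
    top16 = int(mant * 65536)        # explicit leading bit kept, rest truncated
    return (biased << 16) | top16


def setox_branch(x: float) -> str:
    """Return the step that Step 1 of setox dispatches to (x nonzero, finite)."""
    c = compact(x)
    if (c & 0x7FFF0000) < EXP_2_M65:  # Step 1.1 fails, so Step 1.2 applies
        return "Step 7"
    if c < BIG_16380LOG2:             # Step 1.3
        return "Step 2"
    return "Step 8"                   # Step 1.4
```
|
|	Step 1.1 looks only at the exponent field, so the sketch masks
|	off the low 16 bits before comparing; Step 1.3 compares all 32
|	bits of the compact form.
|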
|	Step 2.	Calculate N = round-to-nearest-int( X * 64/log2 ).
|		2.1	Set AdjFlag := 0 (indicates the branch 1.3 -> 2 was taken)
|		2.2	N := round-to-nearest-integer( X * 64/log2 ).
|		2.3	Calculate J = N mod 64; so J = 0,1,2,..., or 63.
|		2.4	Calculate M = (N - J)/64; so N = 64M + J.
|		2.5	Calculate the address of the stored value of 2^(J/64).
|		2.6	Create the value Scale = 2^M.
|	Notes:	The calculation in 2.2 is really performed by
|
|			Z := X * constant
|			N := round-to-nearest-integer(Z)
|
|		where
|
|			constant := single-precision( 64/log 2 ).
|
|		Using a single-precision constant avoids memory access.
|		Another effect of using a single-precision "constant" is
|		that the calculated value Z is
|
|			Z = X*(64/log2)*(1+eps), |eps| <= 2^(-24).
|
|		This error has to be considered later in Steps 3 and 4.
|
|	Step 3.	Calculate X - N*log2/64.
|		3.1	R := X + N*L1, where L1 := single-precision(-log2/64).
|		3.2	R := R + N*L2, L2 := extended-precision(-log2/64 - L1).
|	Notes:	a) The way L1 and L2 are chosen ensures L1+L2 approximate
|		the value -log2/64 to 88 bits of accuracy.
|		b) N*L1 is exact because N is no longer than 22 bits and
|		L1 is no longer than 24 bits.
|		c) The calculation X+N*L1 is also exact due to cancellation.
|		Thus, R is practically X+N(L1+L2) to full 64 bits.
|		d) It is important to estimate how large can |R| be after
|		Step 3.2.
|
|			N = rnd-to-int( X*64/log2 (1+eps) ), |eps| <= 2^(-24)
|			X*64/log2 (1+eps)	= N + f,	|f| <= 0.5
|			X*64/log2 - N		= f - eps*X 64/log2
|			X - N*log2/64		= f*log2/64 - eps*X
|
|
|		Now |X| <= 16446 log2, thus
|
|			|X - N*log2/64| <= (0.5 + 16446/2^(18))*log2/64
|					<= 0.57 log2/64.
|		This bound will be used in Step 4.
|
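|	Illustration (not part of the original package): Steps 2 and 3
|	can be sketched in Python as below. The names to_single() and
|	reduce_arg() are hypothetical. Python doubles replace the
|	extended-precision arithmetic, so L2 carries far fewer bits than
|	in setox; the sketch shows the structure of the reduction, not
|	its full accuracy.
|
```python
import math
import struct


def to_single(x: float) -> float:
    """Round a double to IEEE single precision and widen it back."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


LOG2 = math.log(2.0)
C = to_single(64.0 / LOG2)    # matches the fmuls immediate 0x42B8AA3B
L1 = to_single(-LOG2 / 64.0)  # matches the fmuls immediate 0xBC317218
L2 = -LOG2 / 64.0 - L1        # tail of -log2/64; a double here, extended in setox


def reduce_arg(x: float):
    """Return (N, J, M, R) with x ~= (64*M + J)*log2/64 + R."""
    n = round(x * C)           # Step 2.2; Python's round() sends ties to even
    j = n % 64                 # Step 2.3, always in 0..63
    m = (n - j) // 64          # Step 2.4, so n == 64*m + j
    r = (x + n * L1) + n * L2  # Steps 3.1 and 3.2
    return n, j, m, r
```
|
|	For X = 1.0 this gives N = 92, J = 28, M = 1, and |R| stays below
|	the 0.57 log2/64 bound derived in note d) of Step 3.
|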
|	Step 4.	Approximate exp(R)-1 by a polynomial
|			p = R + R*R*(A1 + R*(A2 + R*(A3 + R*(A4 + R*A5))))
|	Notes:	a) In order to reduce memory access, the coefficients are
|		made as "short" as possible: A1 (which is 1/2), A4 and A5
|		are single precision; A2 and A3 are double precision.
|		b) Even with the restrictions above,
|			|p - (exp(R)-1)| < 2^(-68.8) for all |R| <= 0.0062.
|		Note that 0.0062 is slightly bigger than 0.57 log2/64.
|		c) To fully utilize the pipeline, p is separated into
|		two independent pieces of roughly equal complexities
|			p = [ R + R*S*(A2 + S*A4) ] +
|			    [ S*(A1 + S*(A3 + S*A5)) ]
|		where S = R*R.
|
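|	Illustration (not part of the original package): the two
|	arrangements of p in Step 4 can be compared in Python as below.
|	The coefficient bit patterns are copied from this file (the
|	EXPA2/EXPA3 tables and the fmuls/fadds/fmoves immediates); the
|	helper names are hypothetical, and evaluation is in double rather
|	than extended precision, so the 2^(-68.8) bound cannot be observed.
|
```python
import math
import struct


def f32(bits: int) -> float:
    """Decode a 32-bit pattern as an IEEE single."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def f64(hi: int, lo: int) -> float:
    """Decode two 32-bit words (high, low) as an IEEE double."""
    return struct.unpack(">d", struct.pack(">II", hi, lo))[0]


A1 = f32(0x3F000000)              # 1/2
A2 = f64(0x3FC55555, 0x55554018)  # EXPA2, close to 1/6
A3 = f64(0x3FA55555, 0x55554431)  # EXPA3, close to 1/24
A4 = f32(0x3C088895)              # close to 1/120
A5 = f32(0x3AB60B70)              # close to 1/720


def p_horner(r: float) -> float:
    """The nested form given at the top of Step 4."""
    return r + r * r * (A1 + r * (A2 + r * (A3 + r * (A4 + r * A5))))


def p_split(r: float) -> float:
    """The two independent pieces of note c), with S = R*R."""
    s = r * r
    odd = r + r * s * (A2 + s * A4)      # R + A2*R^3 + A4*R^5
    even = s * (A1 + s * (A3 + s * A5))  # A1*R^2 + A3*R^4 + A5*R^6
    return odd + even
```
|
|	Both forms expand to the same degree-6 polynomial; the split form
|	only regroups odd and even powers so the two halves do not depend
|	on each other.
|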
|	Step 5.	Compute 2^(J/64)*exp(R) = 2^(J/64)*(1+p) by
|			ans := T + ( T*p + t)
|		where T and t are the stored values for 2^(J/64).
|	Notes:	2^(J/64) is stored as T and t where T+t approximates
|		2^(J/64) to roughly 85 bits; T is in extended precision
|		and t is in single precision. Note also that T is rounded
|		to 62 bits so that the last two bits of T are zero. The
|		reason for such a special form is that T-1, T-2, and T-8
|		will all be exact --- a property that will give much
|		more accurate computation of the function EXPM1.
|
|	Step 6.	Reconstruction of exp(X)
|			exp(X) = 2^M * 2^(J/64) * exp(R).
|		6.1	If AdjFlag = 0, go to 6.3
|		6.2	ans := ans * AdjScale
|		6.3	Restore the user FPCR
|		6.4	Return ans := ans * Scale. Exit.
|	Notes:	If AdjFlag = 0, we have X = Mlog2 + Jlog2/64 + R,
|		|M| <= 16380, and Scale = 2^M. Moreover, exp(X) will
|		neither overflow nor underflow. If AdjFlag = 1, that
|		means that
|			X = (M1+M)log2 + Jlog2/64 + R, |M1+M| >= 16380.
|		Hence, exp(X) may overflow or underflow or neither.
|		When that is the case, AdjScale = 2^(M1) where M1 is
|		approximately M. Thus 6.2 will never cause over/underflow.
|		Possible exception in 6.4 is overflow or underflow.
|		The inexact exception is not generated in 6.4. Although
|		one can argue that the inexact flag should always be
|		raised, simulating that exception costs more than the
|		flag is worth in practical uses.
|
|	Step 7.	Return 1 + X.
|		7.1	ans := X
|		7.2	Restore user FPCR.
|		7.3	Return ans := 1 + ans. Exit
|	Notes:	For non-zero X, the inexact exception will always be
|		raised by 7.3. That is the only exception raised by 7.3.
|		Note also that we use the FMOVEM instruction to move X
|		in Step 7.1 to avoid unnecessary trapping. (Although
|		the FMOVEM may not seem relevant since X is normalized,
|		the precaution will be useful in the library version of
|		this code where the separate entry for denormalized inputs
|		will be done away with.)
|
|	Step 8.	Handle exp(X) where |X| >= 16380log2.
|		8.1	If |X| > 16480 log2, go to Step 9.
|		(mimic 2.2 - 2.6)
|		8.2	N := round-to-integer( X * 64/log2 )
|		8.3	Calculate J = N mod 64, J = 0,1,...,63
|		8.4	K := (N-J)/64, M1 := truncate(K/2), M = K-M1, AdjFlag := 1.
|		8.5	Calculate the address of the stored value 2^(J/64).
|		8.6	Create the values Scale = 2^M, AdjScale = 2^M1.
|		8.7	Go to Step 3.
|	Notes:	Refer to notes for 2.2 - 2.6.
|
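|	Illustration (not part of the original package): the exponent
|	split of Step 8.4 and the two-stage scaling of Steps 6.2 and 6.4
|	can be sketched in Python as below. The function names are
|	hypothetical, and doubles (exponent range about +-1023) stand in
|	for extended precision (about +-16383), so the example exponents
|	are scaled down accordingly.
|
```python
import math


def split_exponent(k: int):
    """Step 8.4: split K into M1 = truncate(K/2) and M = K - M1."""
    m1 = int(k / 2)  # truncation toward zero, as the notes state
    m = k - m1
    return m, m1


def scale_two_step(ans: float, k: int) -> float:
    """Apply 2^K as two multiplications: AdjScale = 2^M1, then Scale = 2^M."""
    m, m1 = split_exponent(k)
    adj_scale = math.ldexp(1.0, m1)   # always representable
    scale = math.ldexp(1.0, m)        # always representable
    return (ans * adj_scale) * scale  # Step 6.2 followed by Step 6.4
```
|
|	In double precision 2^1024 is not representable, yet
|	scale_two_step(0.75, 1024) still yields the finite value
|	0.75 * 2^1024, because neither factor overflows on its own. The
|	assembly computes M1 with asrl #1, which rounds toward minus
|	infinity rather than truncating; either way M + M1 = K.
|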
|	Step 9.	Handle exp(X), |X| > 16480 log2.
|		9.1	If X < 0, go to 9.3
|		9.2	ans := Huge, go to 9.4
|		9.3	ans := Tiny.
|		9.4	Restore user FPCR.
|		9.5	Return ans := ans * ans. Exit.
|	Notes:	Exp(X) will surely overflow or underflow, depending on
|		X's sign. "Huge" and "Tiny" are respectively large/tiny
|		extended-precision numbers whose square over/underflow
|		with an inexact result. Thus, 9.5 always raises the
|		inexact together with either overflow or underflow.
|
|
|	setoxm1d
|	--------
|
|	Step 1.	Set ans := 0
|
|	Step 2.	Return ans := X + ans. Exit.
|	Notes:	This will return X with the appropriate rounding
|		precision prescribed by the user FPCR.
|
|	setoxm1
|	-------
|
|	Step 1.	Check |X|
|		1.1	If |X| >= 1/4, go to Step 1.3.
|		1.2	Go to Step 7.
|		1.3	If |X| < 70 log(2), go to Step 2.
|		1.4	Go to Step 10.
|	Notes:	The usual case should take the branches 1.1 -> 1.3 -> 2.
|		However, it is conceivable |X| can be small very often
|		because EXPM1 is intended to evaluate exp(X)-1 accurately
|		when |X| is small. For further details on the comparisons,
|		see the notes on Step 1 of setox.
|
|	Step 2.	Calculate N = round-to-nearest-int( X * 64/log2 ).
|		2.1	N := round-to-nearest-integer( X * 64/log2 ).
|		2.2	Calculate J = N mod 64; so J = 0,1,2,..., or 63.
|		2.3	Calculate M = (N - J)/64; so N = 64M + J.
|		2.4	Calculate the address of the stored value of 2^(J/64).
|		2.5	Create the values Sc = 2^M and OnebySc := -2^(-M).
|	Notes:	See the notes on Step 2 of setox.
|
|	Step 3.	Calculate X - N*log2/64.
|		3.1	R := X + N*L1, where L1 := single-precision(-log2/64).
|		3.2	R := R + N*L2, L2 := extended-precision(-log2/64 - L1).
|	Notes:	Applying the analysis of Step 3 of setox in this case
|		shows that |R| <= 0.0055 (note that |X| <= 70 log2 in
|		this case).
|
|	Step 4.	Approximate exp(R)-1 by a polynomial
|			p = R+R*R*(A1+R*(A2+R*(A3+R*(A4+R*(A5+R*A6)))))
|	Notes:	a) In order to reduce memory access, the coefficients are
|		made as "short" as possible: A1 (which is 1/2), A5 and A6
|		are single precision; A2, A3 and A4 are double precision.
|		b) Even with the restriction above,
|			|p - (exp(R)-1)| < |R| * 2^(-72.7)
|		for all |R| <= 0.0055.
|		c) To fully utilize the pipeline, p is separated into
|		two independent pieces of roughly equal complexity
|			p = [ R*S*(A2 + S*(A4 + S*A6)) ] +
|			    [ R + S*(A1 + S*(A3 + S*A5)) ]
|		where S = R*R.
|
|	Step 5.	Compute 2^(J/64)*p by
|			p := T*p
|		where T and t are the stored values for 2^(J/64).
|	Notes:	2^(J/64) is stored as T and t where T+t approximates
|		2^(J/64) to roughly 85 bits; T is in extended precision
|		and t is in single precision. Note also that T is rounded
|		to 62 bits so that the last two bits of T are zero. The
|		reason for such a special form is that T-1, T-2, and T-8
|		will all be exact --- a property that will be exploited
|		in Step 6 below. The total relative error in p is no
|		bigger than 2^(-67.7) compared to the final result.
|
|	Step 6.	Reconstruction of exp(X)-1
|			exp(X)-1 = 2^M * ( 2^(J/64) + p - 2^(-M) ).
|		6.1	If M <= 63, go to Step 6.3.
|		6.2	ans := T + (p + (t + OnebySc)). Go to 6.6
|		6.3	If M >= -3, go to 6.5.
|		6.4	ans := (T + (p + t)) + OnebySc. Go to 6.6
|		6.5	ans := (T + OnebySc) + (p + t).
|		6.6	Restore user FPCR.
|		6.7	Return ans := Sc * ans. Exit.
|	Notes:	The various arrangements of the expressions give accurate
|		evaluations.
|
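|	Illustration (not part of the original package): the three
|	arrangements of Step 6 can be sketched in Python as below. The
|	name expm1_sketch() is hypothetical. Several simplifications are
|	assumed: doubles replace extended precision, the reduction uses a
|	single constant instead of L1+L2, math.expm1 stands in for the
|	Step 4 polynomial, T is computed as 2.0**(J/64) instead of being
|	read from EXPTBL, and the correction t is taken as 0.
|
```python
import math

LOG2_BY_64 = math.log(2.0) / 64.0


def expm1_sketch(x: float) -> float:
    """exp(x)-1 for 1/4 <= |x| < 70 log2, following Steps 2 to 6 of setoxm1."""
    n = round(x / LOG2_BY_64)        # Step 2.1
    j = n % 64                       # Step 2.2
    m = (n - j) // 64                # Step 2.3
    r = x - n * LOG2_BY_64           # Step 3 (simplified, see the assumptions)
    big_t = 2.0 ** (j / 64)          # stands in for the table value T
    t = 0.0                          # assumed correction term
    p = big_t * math.expm1(r)        # Steps 4 and 5: p := T*p
    sc = math.ldexp(1.0, m)          # Sc = 2^M
    onebysc = -math.ldexp(1.0, -m)   # OnebySc = -2^(-M)
    if m > 63:
        ans = big_t + (p + (t + onebysc))  # 6.2: 2^(-M) is negligible
    elif m < -3:
        ans = (big_t + (p + t)) + onebysc  # 6.4: 2^(-M) dominates
    else:
        ans = (big_t + onebysc) + (p + t)  # 6.5: T + OnebySc taken first
    return sc * ans                  # 6.7
```
|
|	Each branch adds the terms of smallest magnitude first, so the
|	large term absorbs a single rounding error at the end.
|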
|	Step 7.	exp(X)-1 for |X| < 1/4.
|		7.1	If |X| >= 2^(-65), go to Step 9.
|		7.2	Go to Step 8.
|
|	Step 8.	Calculate exp(X)-1, |X| < 2^(-65).
|		8.1	If |X| < 2^(-16312), goto 8.3
|		8.2	Restore FPCR; return ans := X - 2^(-16382). Exit.
|		8.3	X := X * 2^(140).
|		8.4	Restore FPCR; ans := X - 2^(-16382).
|			Return ans := ans*2^(-140). Exit
|	Notes:	The idea is to return "X - tiny" under the user
|		precision and rounding modes. To avoid unnecessary
|		inefficiency, we stay away from denormalized numbers the
|		best we can. For |X| >= 2^(-16312), the straightforward
|		8.2 generates the inexact exception as the case warrants.
|
|	Step 9.	Calculate exp(X)-1, |X| < 1/4, by a polynomial
|			p = X + X*X*(B1 + X*(B2 + ... + X*B12))
|	Notes:	a) In order to reduce memory access, the coefficients are
|		made as "short" as possible: B1 (which is 1/2), B9 to B12
|		are single precision; B3 to B8 are double precision; and
|		B2 is double extended.
|		b) Even with the restriction above,
|			|p - (exp(X)-1)| < |X| 2^(-70.6)
|		for all |X| <= 0.251.
|		Note that 0.251 is slightly bigger than 1/4.
|		c) To fully preserve accuracy, the polynomial is computed
|		as X + ( S*B1 + Q ) where S = X*X and
|			Q = X*S*(B2 + X*(B3 + ... + X*B12))
|		d) To fully utilize the pipeline, Q is separated into
|		two independent pieces of roughly equal complexity
|			Q = [ X*S*(B2 + S*(B4 + ... + S*B12)) ] +
|			    [ S*S*(B3 + S*(B5 + ... + S*B11)) ]
|
|	Step 10.	Calculate exp(X)-1 for |X| >= 70 log 2.
|		10.1	If X >= 70log2, exp(X) - 1 = exp(X) for all practical
|			purposes. Therefore, go to Step 1 of setox.
|		10.2	If X <= -70log2, exp(X) - 1 = -1 for all practical purposes.
|			ans := -1
|			Restore user FPCR
|			Return ans := ans + 2^(-126). Exit.
|	Notes:	10.2 will always create an inexact and return -1 + tiny
|		in the user rounding precision and mode.
|
|
|		Copyright (C) Motorola, Inc. 1990
|			All Rights Reserved
|
|	For details on the license for this file, please see the
|	file, README, in this same directory.
|setox	idnt	2,1 | Motorola 040 Floating Point Software Package
|section	8
#include "fpsp.h"
L2:	.long	0x3FDC0000,0x82E30865,0x4361C4C6,0x00000000
EXPA3:	.long	0x3FA55555,0x55554431
EXPA2:	.long	0x3FC55555,0x55554018
HUGE:	.long	0x7FFE0000,0xFFFFFFFF,0xFFFFFFFF,0x00000000
TINY:	.long	0x00010000,0xFFFFFFFF,0xFFFFFFFF,0x00000000
EM1A4:	.long	0x3F811111,0x11174385
EM1A3:	.long	0x3FA55555,0x55554F5A
EM1A2:	.long	0x3FC55555,0x55555555,0x00000000,0x00000000
EM1B8:	.long	0x3EC71DE3,0xA5774682
EM1B7:	.long	0x3EFA01A0,0x19D7CB68
EM1B6:	.long	0x3F2A01A0,0x1A019DF3
EM1B5:	.long	0x3F56C16C,0x16C170E2
EM1B4:	.long	0x3F811111,0x11111111
EM1B3:	.long	0x3FA55555,0x55555555
EM1B2:	.long	0x3FFC0000,0xAAAAAAAA,0xAAAAAAAB
	.long	0x00000000
TWO140:	.long	0x48B00000,0x00000000
TWON140:	.long	0x37300000,0x00000000
EXPTBL:
	.long	0x3FFF0000,0x80000000,0x00000000,0x00000000
	.long	0x3FFF0000,0x8164D1F3,0xBC030774,0x9F841A9B
	.long	0x3FFF0000,0x82CD8698,0xAC2BA1D8,0x9FC1D5B9
	.long	0x3FFF0000,0x843A28C3,0xACDE4048,0xA0728369
	.long	0x3FFF0000,0x85AAC367,0xCC487B14,0x1FC5C95C
	.long	0x3FFF0000,0x871F6196,0x9E8D1010,0x1EE85C9F
	.long	0x3FFF0000,0x88980E80,0x92DA8528,0x9FA20729
	.long	0x3FFF0000,0x8A14D575,0x496EFD9C,0xA07BF9AF
	.long	0x3FFF0000,0x8B95C1E3,0xEA8BD6E8,0xA0020DCF
	.long	0x3FFF0000,0x8D1ADF5B,0x7E5BA9E4,0x205A63DA
	.long	0x3FFF0000,0x8EA4398B,0x45CD53C0,0x1EB70051
	.long	0x3FFF0000,0x9031DC43,0x1466B1DC,0x1F6EB029
	.long	0x3FFF0000,0x91C3D373,0xAB11C338,0xA0781494
	.long	0x3FFF0000,0x935A2B2F,0x13E6E92C,0x9EB319B0
	.long	0x3FFF0000,0x94F4EFA8,0xFEF70960,0x2017457D
	.long	0x3FFF0000,0x96942D37,0x20185A00,0x1F11D537
	.long	0x3FFF0000,0x9837F051,0x8DB8A970,0x9FB952DD
	.long	0x3FFF0000,0x99E04593,0x20B7FA64,0x1FE43087
	.long	0x3FFF0000,0x9B8D39B9,0xD54E5538,0x1FA2A818
	.long	0x3FFF0000,0x9D3ED9A7,0x2CFFB750,0x1FDE494D
	.long	0x3FFF0000,0x9EF53260,0x91A111AC,0x20504890
	.long	0x3FFF0000,0xA0B0510F,0xB9714FC4,0xA073691C
	.long	0x3FFF0000,0xA2704303,0x0C496818,0x1F9B7A05
	.long	0x3FFF0000,0xA43515AE,0x09E680A0,0xA0797126
	.long	0x3FFF0000,0xA5FED6A9,0xB15138EC,0xA071A140
	.long	0x3FFF0000,0xA7CD93B4,0xE9653568,0x204F62DA
	.long	0x3FFF0000,0xA9A15AB4,0xEA7C0EF8,0x1F283C4A
	.long	0x3FFF0000,0xAB7A39B5,0xA93ED338,0x9F9A7FDC
	.long	0x3FFF0000,0xAD583EEA,0x42A14AC8,0xA05B3FAC
	.long	0x3FFF0000,0xAF3B78AD,0x690A4374,0x1FDF2610
	.long	0x3FFF0000,0xB123F581,0xD2AC2590,0x9F705F90
	.long	0x3FFF0000,0xB311C412,0xA9112488,0x201F678A
	.long	0x3FFF0000,0xB504F333,0xF9DE6484,0x1F32FB13
	.long	0x3FFF0000,0xB6FD91E3,0x28D17790,0x20038B30
	.long	0x3FFF0000,0xB8FBAF47,0x62FB9EE8,0x200DC3CC
	.long	0x3FFF0000,0xBAFF5AB2,0x133E45FC,0x9F8B2AE6
	.long	0x3FFF0000,0xBD08A39F,0x580C36C0,0xA02BBF70
	.long	0x3FFF0000,0xBF1799B6,0x7A731084,0xA00BF518
	.long	0x3FFF0000,0xC12C4CCA,0x66709458,0xA041DD41
	.long	0x3FFF0000,0xC346CCDA,0x24976408,0x9FDF137B
	.long	0x3FFF0000,0xC5672A11,0x5506DADC,0x201F1568
	.long	0x3FFF0000,0xC78D74C8,0xABB9B15C,0x1FC13A2E
	.long	0x3FFF0000,0xC9B9BD86,0x6E2F27A4,0xA03F8F03
	.long	0x3FFF0000,0xCBEC14FE,0xF2727C5C,0x1FF4907D
	.long	0x3FFF0000,0xCE248C15,0x1F8480E4,0x9E6E53E4
	.long	0x3FFF0000,0xD06333DA,0xEF2B2594,0x1FD6D45C
	.long	0x3FFF0000,0xD2A81D91,0xF12AE45C,0xA076EDB9
	.long	0x3FFF0000,0xD4F35AAB,0xCFEDFA20,0x9FA6DE21
	.long	0x3FFF0000,0xD744FCCA,0xD69D6AF4,0x1EE69A2F
	.long	0x3FFF0000,0xD99D15C2,0x78AFD7B4,0x207F439F
	.long	0x3FFF0000,0xDBFBB797,0xDAF23754,0x201EC207
	.long	0x3FFF0000,0xDE60F482,0x5E0E9124,0x9E8BE175
	.long	0x3FFF0000,0xE0CCDEEC,0x2A94E110,0x20032C4B
	.long	0x3FFF0000,0xE33F8972,0xBE8A5A50,0x2004DFF5
	.long	0x3FFF0000,0xE5B906E7,0x7C8348A8,0x1E72F47A
	.long	0x3FFF0000,0xE8396A50,0x3C4BDC68,0x1F722F22
	.long	0x3FFF0000,0xEAC0C6E7,0xDD243930,0xA017E945
	.long	0x3FFF0000,0xED4F301E,0xD9942B84,0x1F401A5B
	.long	0x3FFF0000,0xEFE4B99B,0xDCDAF5CC,0x9FB9A9E3
	.long	0x3FFF0000,0xF281773C,0x59FFB138,0x20744C05
	.long	0x3FFF0000,0xF5257D15,0x2486CC2C,0x1F773A19
	.long	0x3FFF0000,0xF7D0DF73,0x0AD13BB8,0x1FFE90D5
	.long	0x3FFF0000,0xFA83B2DB,0x722A033C,0xA041ED22
	.long	0x3FFF0000,0xFD3E0C0C,0xF486C174,0x1F853F3A
	.set	ADJFLAG,L_SCR2
	.set	SCALE,FP_SCR1
	.set	ADJSCALE,FP_SCR2
	.set	SC,FP_SCR3
	.set	ONEBYSC,FP_SCR4
	|xref	t_frcinx
	|xref	t_extdnrm
	|xref	t_unfl
	|xref	t_ovfl
	.global	setoxd
setoxd:
|--entry point for EXP(X), X is denormalized
	movel	(%a0),%d0
	andil	#0x80000000,%d0
	oril	#0x00800000,%d0	| ...sign(X)*2^(-126)
	movel	%d0,-(%sp)
	fmoves	#0x3F800000,%fp0
	fmovel	%d1,%fpcr
	fadds	(%sp)+,%fp0
	bra	t_frcinx
	.global	setox
setox:
|--entry point for EXP(X), here X is finite, non-zero, and not NaN's
|--Step 1.
	movel	(%a0),%d0	| ...load part of input X
	andil	#0x7FFF0000,%d0	| ...biased expo. of X
	cmpil	#0x3FBE0000,%d0	| ...2^(-65)
	bges	EXPC1	| ...normal case
	bra	EXPSM
EXPC1:
|--The case |X| >= 2^(-65)
	movew	4(%a0),%d0	| ...expo. and partial sig. of |X|
	cmpil	#0x400CB167,%d0	| ...16380 log2 trunc. 16 bits
	blts	EXPMAIN	| ...normal case
	bra	EXPBIG
EXPMAIN:
|--Step 2.
|--This is the normal branch: 2^(-65) <= |X| < 16380 log2.
	fmovex	(%a0),%fp0	| ...load input from (a0)
	fmovex	%fp0,%fp1
	fmuls	#0x42B8AA3B,%fp0	| ...64/log2 * X
	fmovemx	%fp2-%fp2/%fp3,-(%a7)	| ...save fp2
	movel	#0,ADJFLAG(%a6)
	fmovel	%fp0,%d0	| ...N = int( X * 64/log2 )
	lea	EXPTBL,%a1
	fmovel	%d0,%fp0	| ...convert to floating-format
	movel	%d0,L_SCR1(%a6)	| ...save N temporarily
	andil	#0x3F,%d0	| ...D0 is J = N mod 64
	lsll	#4,%d0
	addal	%d0,%a1	| ...address of 2^(J/64)
	movel	L_SCR1(%a6),%d0
	asrl	#6,%d0	| ...D0 is M
	addiw	#0x3FFF,%d0	| ...biased expo. of 2^(M)
	movew	L2,L_SCR1(%a6)	| ...prefetch L2, no need in CB
EXPCONT1:
|--Step 3.
|--fp2,fp3 saved on the stack. fp0 is N, fp1 is X,
|--a1 points to 2^(J/64), D0 is biased expo. of 2^(M)
	fmovex	%fp0,%fp2
	fmuls	#0xBC317218,%fp0	| ...N * L1, L1 = lead(-log2/64)
	fmulx	L2,%fp2	| ...N * L2, L1+L2 = -log2/64
	faddx	%fp1,%fp0	| ...X + N*L1
	faddx	%fp2,%fp0	| ...fp0 is R, reduced arg.
|	MOVE.W	#$3FA5,EXPA3	...load EXPA3 in cache
|--Step 4.
|--WE NOW COMPUTE EXP(R)-1 BY A POLYNOMIAL
|-- R + R*R*(A1 + R*(A2 + R*(A3 + R*(A4 + R*A5))))
|--TO FULLY UTILIZE THE PIPELINE, WE COMPUTE S = R*R
|--[R+R*S*(A2+S*A4)] + [S*(A1+S*(A3+S*A5))]
	fmovex	%fp0,%fp1
	fmulx	%fp1,%fp1	| ...fp1 IS S = R*R
	fmoves	#0x3AB60B70,%fp2	| ...fp2 IS A5
|	MOVE.W	#0,2(%a1)	...load 2^(J/64) in cache
	fmulx	%fp1,%fp2	| ...fp2 IS S*A5
	fmovex	%fp1,%fp3
	fmuls	#0x3C088895,%fp3	| ...fp3 IS S*A4
	faddd	EXPA3,%fp2	| ...fp2 IS A3+S*A5
	faddd	EXPA2,%fp3	| ...fp3 IS A2+S*A4
	fmulx	%fp1,%fp2	| ...fp2 IS S*(A3+S*A5)
	movew	%d0,SCALE(%a6)	| ...SCALE is 2^(M) in extended
	clrw	SCALE+2(%a6)
	movel	#0x80000000,SCALE+4(%a6)
	clrl	SCALE+8(%a6)
	fmulx	%fp1,%fp3	| ...fp3 IS S*(A2+S*A4)
	fadds	#0x3F000000,%fp2	| ...fp2 IS A1+S*(A3+S*A5)
	fmulx	%fp0,%fp3	| ...fp3 IS R*S*(A2+S*A4)
	fmulx	%fp1,%fp2	| ...fp2 IS S*(A1+S*(A3+S*A5))
	faddx	%fp3,%fp0	| ...fp0 IS R+R*S*(A2+S*A4),
|				...fp3 released
	fmovex	(%a1)+,%fp1	| ...fp1 is lead. pt. of 2^(J/64)
	faddx	%fp2,%fp0	| ...fp0 is EXP(R) - 1
|				...fp2 released
|--Step 5
|--final reconstruction process
|--EXP(X) = 2^M * ( 2^(J/64) + 2^(J/64)*(EXP(R)-1) )
	fmulx	%fp1,%fp0	| ...2^(J/64)*(Exp(R)-1)
	fmovemx	(%a7)+,%fp2-%fp2/%fp3	| ...fp2 restored
	fadds	(%a1),%fp0	| ...accurate 2^(J/64)
	faddx	%fp1,%fp0	| ...2^(J/64) + 2^(J/64)*...
	movel	ADJFLAG(%a6),%d0
|--Step 6
	tstl	%d0
	beqs	NORMAL
ADJUST:
	fmulx	ADJSCALE(%a6),%fp0
NORMAL:
	fmovel	%d1,%FPCR	| ...restore user FPCR
	fmulx	SCALE(%a6),%fp0	| ...multiply 2^(M)
	bra	t_frcinx
EXPSM:
|--Step 7
	fmovemx	(%a0),%fp0-%fp0	| ...in case X is denormalized
	fmovel	%d1,%FPCR
	fadds	#0x3F800000,%fp0	| ...1+X in user mode
	bra	t_frcinx
EXPBIG:
|--Step 8
	cmpil	#0x400CB27C,%d0	| ...16480 log2
	bgts	EXP2BIG
|--Steps 8.2 -- 8.6
	fmovex	(%a0),%fp0	| ...load input from (a0)
	fmovex	%fp0,%fp1
	fmuls	#0x42B8AA3B,%fp0	| ...64/log2 * X
	fmovemx	%fp2-%fp2/%fp3,-(%a7)	| ...save fp2
	movel	#1,ADJFLAG(%a6)
	fmovel	%fp0,%d0	| ...N = int( X * 64/log2 )
	lea	EXPTBL,%a1
	fmovel	%d0,%fp0	| ...convert to floating-format
	movel	%d0,L_SCR1(%a6)	| ...save N temporarily
	andil	#0x3F,%d0	| ...D0 is J = N mod 64
	lsll	#4,%d0
	addal	%d0,%a1	| ...address of 2^(J/64)
	movel	L_SCR1(%a6),%d0
	asrl	#6,%d0	| ...D0 is K
	movel	%d0,L_SCR1(%a6)	| ...save K temporarily
	asrl	#1,%d0	| ...D0 is M1
	subl	%d0,L_SCR1(%a6)	| ...L_SCR1 is M = K - M1
	addiw	#0x3FFF,%d0	| ...biased expo. of 2^(M1)
	movew	%d0,ADJSCALE(%a6)	| ...ADJSCALE := 2^(M1)
	clrw	ADJSCALE+2(%a6)
	movel	#0x80000000,ADJSCALE+4(%a6)
	clrl	ADJSCALE+8(%a6)
	movel	L_SCR1(%a6),%d0	| ...D0 is M
	addiw	#0x3FFF,%d0	| ...biased expo. of 2^(M)
	bra	EXPCONT1	| ...go back to Step 3
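|--Worked example of the K = M1 + M split in Steps 8.2 -- 8.6. This is
|--an illustrative sketch only; the input below is hypothetical and
|--N is assumed to convert exactly.
|--  X = -16400 log2   gives  N = -16400*64 = -1049600
|--  J  = N mod 64     = 0
|--  K  = N asr 6      = -16400
|--  M1 = K asr 1      = -8200
|--  M  = K - M1       = -8200
|--A single scale factor 2^K would need the biased exponent
|--0x3FFF - 16400 < 0, which does not fit the exponent field.
|--Each of 2^(M1) and 2^(M) has biased exponent 0x3FFF - 8200 = 0x1FF7,
|--so ADJSCALE and SCALE are both ordinary normalized numbers and only
|--the final multiply by SCALE, done under the user FPCR, can underflow.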
EXP2BIG:
|--Step 9
	fmovel	%d1,%FPCR
	movel	(%a0),%d0
	bclrb	#sign_bit,(%a0)	| ...setox always returns positive
	cmpil	#0,%d0
	blt	t_unfl
	bra	t_ovfl
	.global	setoxm1d
setoxm1d:
|--entry point for EXPM1(X), here X is denormalized
|--Step 0.
	bra	t_extdnrm
	.global	setoxm1
setoxm1:
|--entry point for EXPM1(X), here X is finite, non-zero, non-NaN
|--Step 1.
|--Step 1.1
	movel	(%a0),%d0	| ...load part of input X
	andil	#0x7FFF0000,%d0	| ...biased expo. of X
	cmpil	#0x3FFD0000,%d0	| ...1/4
	bges	EM1CON1	| ...|X| >= 1/4
	bra	EM1SM
EM1CON1:
|--Step 1.3
|--The case |X| >= 1/4
	movew	4(%a0),%d0	| ...expo. and partial sig. of |X|
	cmpil	#0x4004C215,%d0	| ...70log2 rounded up to 16 bits
	bles	EM1MAIN	| ...1/4 <= |X| <= 70log2
	bra	EM1BIG
EM1MAIN:
|--Step 2.
|--This is the case:	1/4 <= |X| <= 70 log2.
	fmovex	(%a0),%fp0	| ...load input from (a0)
	fmovex	%fp0,%fp1
	fmuls	#0x42B8AA3B,%fp0	| ...64/log2 * X
	fmovemx	%fp2-%fp2/%fp3,-(%a7)	| ...save fp2
|	MOVE.W	#$3F81,EM1A4	...prefetch in CB mode
	fmovel	%fp0,%d0	| ...N = int( X * 64/log2 )
	lea	EXPTBL,%a1
	fmovel	%d0,%fp0	| ...convert to floating-format
	movel	%d0,L_SCR1(%a6)	| ...save N temporarily
	andil	#0x3F,%d0	| ...D0 is J = N mod 64
	lsll	#4,%d0
	addal	%d0,%a1	| ...address of 2^(J/64)
	movel	L_SCR1(%a6),%d0
	asrl	#6,%d0	| ...D0 is M
	movel	%d0,L_SCR1(%a6)	| ...save a copy of M
|	MOVE.W	#$3FDC,L2	...prefetch L2 in CB mode
|--Step 3.
|--fp2,fp3 saved on the stack. fp0 is N, fp1 is X,
|--a1 points to 2^(J/64), D0 and L_SCR1 both contain M
	fmovex	%fp0,%fp2
	fmuls	#0xBC317218,%fp0	| ...N * L1, L1 = lead(-log2/64)
	fmulx	L2,%fp2	| ...N * L2, L1+L2 = -log2/64
	faddx	%fp1,%fp0	| ...X + N*L1
	faddx	%fp2,%fp0	| ...fp0 is R, reduced arg.
|	MOVE.W	#$3FC5,EM1A2	...load EM1A2 in cache
	addiw	#0x3FFF,%d0	| ...D0 is biased expo. of 2^M
|--Step 4.
|--WE NOW COMPUTE EXP(R)-1 BY A POLYNOMIAL
|-- R + R*R*(A1 + R*(A2 + R*(A3 + R*(A4 + R*(A5 + R*A6)))))
|--TO FULLY UTILIZE THE PIPELINE, WE COMPUTE S = R*R
|--[R*S*(A2+S*(A4+S*A6))] + [R+S*(A1+S*(A3+S*A5))]
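|--A short check of the split above (an added sketch, not part of the
|--original algorithm notes). With S = R*R the two brackets expand to
|--  R*S*(A2+S*(A4+S*A6))   = A2*R^3 + A4*R^5 + A6*R^7
|--  R + S*(A1+S*(A3+S*A5)) = R + A1*R^2 + A3*R^4 + A5*R^6
|--and their sum matches the nested form
|--  R + A1*R^2 + A2*R^3 + A3*R^4 + A4*R^5 + A5*R^6 + A6*R^7
|--term by term. The two brackets share only S and R, so their
|--multiply/add chains can be interleaved, which is what the
|--alternating fp2/fp3 instructions in this step do.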
	fmovex	%fp0,%fp1
	fmulx	%fp1,%fp1	| ...fp1 IS S = R*R
	fmoves	#0x3950097B,%fp2	| ...fp2 IS A6
|	MOVE.W	#0,2(%a1)	...load 2^(J/64) in cache
	fmulx	%fp1,%fp2	| ...fp2 IS S*A6
	fmovex	%fp1,%fp3
	fmuls	#0x3AB60B6A,%fp3	| ...fp3 IS S*A5
	faddd	EM1A4,%fp2	| ...fp2 IS A4+S*A6
	faddd	EM1A3,%fp3	| ...fp3 IS A3+S*A5
	movew	%d0,SC(%a6)	| ...SC is 2^(M) in extended
	clrw	SC+2(%a6)
	movel	#0x80000000,SC+4(%a6)
	clrl	SC+8(%a6)
	fmulx	%fp1,%fp2	| ...fp2 IS S*(A4+S*A6)
	movel	L_SCR1(%a6),%d0	| ...D0 is M
	negw	%d0	| ...D0 is -M
	fmulx	%fp1,%fp3	| ...fp3 IS S*(A3+S*A5)
	addiw	#0x3FFF,%d0	| ...biased expo. of 2^(-M)
	faddd	EM1A2,%fp2	| ...fp2 IS A2+S*(A4+S*A6)
	fadds	#0x3F000000,%fp3	| ...fp3 IS A1+S*(A3+S*A5)
	fmulx	%fp1,%fp2	| ...fp2 IS S*(A2+S*(A4+S*A6))
	oriw	#0x8000,%d0	| ...signed/expo. of -2^(-M)
	movew	%d0,ONEBYSC(%a6)	| ...OnebySc is -2^(-M)
	clrw	ONEBYSC+2(%a6)
	movel	#0x80000000,ONEBYSC+4(%a6)
	clrl	ONEBYSC+8(%a6)
	fmulx	%fp3,%fp1	| ...fp1 IS S*(A1+S*(A3+S*A5))
				| ...fp3 released
	fmulx	%fp0,%fp2	| ...fp2 IS R*S*(A2+S*(A4+S*A6))
	faddx	%fp1,%fp0	| ...fp0 IS R+S*(A1+S*(A3+S*A5))
				| ...fp1 released
	faddx	%fp2,%fp0	| ...fp0 IS EXP(R)-1
				| ...fp2 released
	fmovemx	(%a7)+,%fp2-%fp2/%fp3	| ...fp2 restored
|--Step 5
|--Compute 2^(J/64)*p
	fmulx	(%a1),%fp0	| ...2^(J/64)*(Exp(R)-1)
|--Step 6
|--Step 6.1
	movel	L_SCR1(%a6),%d0	| ...retrieve M
	cmpil	#63,%d0
	bles	MLE63
|--Step 6.2	M >= 64
	fmoves	12(%a1),%fp1	| ...fp1 is t
	faddx	ONEBYSC(%a6),%fp1	| ...fp1 is t+OnebySc
	faddx	%fp1,%fp0	| ...p+(t+OnebySc), fp1 released
	faddx	(%a1),%fp0	| ...T+(p+(t+OnebySc))
	bras	EM1SCALE
MLE63:
|--Step 6.3	M <= 63
	cmpil	#-3,%d0
	bges	MGEN3
MLTN3:
|--Step 6.4	M <= -4
	fadds	12(%a1),%fp0	| ...p+t
	faddx	(%a1),%fp0	| ...T+(p+t)
	faddx	ONEBYSC(%a6),%fp0	| ...OnebySc + (T+(p+t))
	bras	EM1SCALE
MGEN3:
|--Step 6.5	-3 <= M <= 63
	fmovex	(%a1)+,%fp1	| ...fp1 is T
	fadds	(%a1),%fp0	| ...fp0 is p+t
	faddx	ONEBYSC(%a6),%fp1	| ...fp1 is T+OnebySc
	faddx	%fp1,%fp0	| ...(T+OnebySc)+(p+t)
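|--Why the three cases of Step 6 compute the same quantity (an added
|--sketch; the remark on ordering is a reading of the code, not a
|--statement from the original notes). With
|--  p = 2^(J/64)*(exp(R)-1),  T+t ~ 2^(J/64),  OnebySc = -2^(-M),
|--every branch forms T + t + p + OnebySc, and Step 6.6 scales by
|--SC = 2^M:
|--  2^M * (T + t + p - 2^(-M)) = 2^M * 2^(J/64) * exp(R) - 1
|--                             = exp(X) - 1.
|--Only the order of the additions differs. For M >= 64, OnebySc is
|--tiny and is folded into t first; for M <= -4, OnebySc has the
|--largest magnitude and is added last; in between it is paired with
|--T, the term closest to it in size.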
EM1SCALE:
|--Step 6.6
	fmovel	%d1,%FPCR
	fmulx	SC(%a6),%fp0
	bra	t_frcinx
EM1SM:
|--Step 7	|X| < 1/4.
	cmpil	#0x3FBE0000,%d0	| ...2^(-65)
	bges	EM1POLY
EM1TINY:
|--Step 8	|X| < 2^(-65)
	cmpil	#0x00330000,%d0	| ...2^(-16312)
	blts	EM12TINY
|--Step 8.2
	movel	#0x80010000,SC(%a6)	| ...SC is -2^(-16382)
	movel	#0x80000000,SC+4(%a6)
	clrl	SC+8(%a6)
	fmovex	(%a0),%fp0
	fmovel	%d1,%FPCR
	faddx	SC(%a6),%fp0
	bra	t_frcinx
EM12TINY:
|--Step 8.3
	fmovex	(%a0),%fp0
	fmuld	TWO140,%fp0
	movel	#0x80010000,SC(%a6)
	movel	#0x80000000,SC+4(%a6)
	clrl	SC+8(%a6)
	faddx	SC(%a6),%fp0
	fmovel	%d1,%FPCR
	fmuld	TWON140,%fp0
	bra	t_frcinx
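|--Arithmetic behind Steps 8.2 and 8.3 (an added sketch; it assumes
|--TWO140 and TWON140 hold 2^140 and 2^(-140), as their names
|--suggest, since their definitions are outside this section).
|--  Step 8.2 returns  X - 2^(-16382)
|--  Step 8.3 returns  2^(-140) * (2^140 * X - 2^(-16382))
|--                  = X - 2^(-16522)
|--For |X| < 2^(-65) the next series term X*X/2 is below 2^(-66)*|X|,
|--so exp(X)-1 agrees with X to 64 significant bits. Both steps
|--therefore return X less a tiny amount, and the last operation runs
|--under the user FPCR so that rounding and the inexact flag follow
|--the caller's mode.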
EM1POLY:
|--Step 9	exp(X)-1 by a simple polynomial
	fmovex	(%a0),%fp0	| ...fp0 is X
	fmulx	%fp0,%fp0	| ...fp0 is S := X*X
	fmovemx	%fp2-%fp2/%fp3,-(%a7)	| ...save fp2
	fmoves	#0x2F30CAA8,%fp1	| ...fp1 is B12
	fmulx	%fp0,%fp1	| ...fp1 is S*B12
	fmoves	#0x310F8290,%fp2	| ...fp2 is B11
	fadds	#0x32D73220,%fp1	| ...fp1 is B10+S*B12
	fmulx	%fp0,%fp2	| ...fp2 is S*B11
	fmulx	%fp0,%fp1	| ...fp1 is S*(B10 + ...
	fadds	#0x3493F281,%fp2	| ...fp2 is B9+S*...
	faddd	EM1B8,%fp1	| ...fp1 is B8+S*...
	fmulx	%fp0,%fp2	| ...fp2 is S*(B9+...
	fmulx	%fp0,%fp1	| ...fp1 is S*(B8+...
	faddd	EM1B7,%fp2	| ...fp2 is B7+S*...
	faddd	EM1B6,%fp1	| ...fp1 is B6+S*...
	fmulx	%fp0,%fp2	| ...fp2 is S*(B7+...
	fmulx	%fp0,%fp1	| ...fp1 is S*(B6+...
	faddd	EM1B5,%fp2	| ...fp2 is B5+S*...
	faddd	EM1B4,%fp1	| ...fp1 is B4+S*...
	fmulx	%fp0,%fp2	| ...fp2 is S*(B5+...
	fmulx	%fp0,%fp1	| ...fp1 is S*(B4+...
	faddd	EM1B3,%fp2	| ...fp2 is B3+S*...
	faddx	EM1B2,%fp1	| ...fp1 is B2+S*...
	fmulx	%fp0,%fp2	| ...fp2 is S*(B3+...
	fmulx	%fp0,%fp1	| ...fp1 is S*(B2+...
	fmulx	%fp0,%fp2	| ...fp2 is S*S*(B3+...)
	fmulx	(%a0),%fp1	| ...fp1 is X*S*(B2...
	fmuls	#0x3F000000,%fp0	| ...fp0 is S*B1
	faddx	%fp2,%fp1	| ...fp1 is Q
				| ...fp2 released
	fmovemx	(%a7)+,%fp2-%fp2/%fp3	| ...fp2 restored
	faddx	%fp1,%fp0	| ...fp0 is S*B1+Q
				| ...fp1 released
	fmovel	%d1,%FPCR
	faddx	(%a0),%fp0
	bra	t_frcinx
EM1BIG:
|--Step 10	|X| > 70 log2
	movel	(%a0),%d0
	cmpil	#0,%d0
	bgt	EXPC1
|--Step 10.2
	fmoves	#0xBF800000,%fp0	| ...fp0 is -1
	fmovel	%d1,%FPCR
	fadds	#0x00800000,%fp0	| ...-1 + 2^(-126)
	bra	t_frcinx
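|--Note on Step 10.2 (an added sketch; the stated purpose is a
|--reading of the code). The two single-precision immediates decode as
|--  0xBF800000 = -1.0
|--  0x00800000 = 2^(-126), the smallest normalized single
|--For X < -70 log2, exp(X) < 2^(-70), so exp(X)-1 differs from -1 by
|--less than one unit in the last place of an extended result. Adding
|--2^(-126) to -1 under the user FPCR presumably stands in for that
|--difference: the sum is inexact, and it rounds to -1 or to its
|--neighbour toward zero according to the caller's rounding mode.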
| end