/*
        Asm versions of Xen pv-ops, suitable for either direct use or
        inlining.  The inline versions are the same as the direct-use
        versions, with the pre- and post-amble chopped off.

        This code is encoded for size rather than absolute efficiency,
        with a view to being able to inline as much as possible.

        We only bother with direct forms (ie, vcpu in pda) of the operations
        here; the indirect forms are better handled in C, since they're
        generally too large to inline anyway.
 */

#include <linux/linkage.h>
#include <asm/asm-offsets.h>
#include <asm/thread_info.h>
#include <asm/percpu.h>
#include <asm/processor-flags.h>
#include <asm/segment.h>

#include <xen/interface/xen.h>
#define RELOC(x, v)     .globl x##_reloc; x##_reloc=v
#define ENDPATCH(x)     .globl x##_end; x##_end=.
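
/*
        ENDPATCH marks where each patchable sequence ends, so the
        patching machinery knows how many bytes it may copy when
        inlining.  RELOC records the position of a relocation within
        the sequence: 2b+1 below points at the 32-bit displacement of
        a "call", which has to be fixed up if the code is copied
        elsewhere; 0 means no relocation is needed.
 */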
/* Pseudo-flag used for virtual NMI, which we don't implement yet */
#define XEN_EFLAGS_NMI  0x80000000
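/* (Bit 31 of eflags is architecturally reserved and reads as zero, so
   this pseudo-flag can't collide with a real flag.) */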
/*
        Enable events.  This clears the event mask and tests the
        pending event status with a single "and" operation.  If there
        are pending events, then enter the hypervisor to get them
        handled.
 */
ENTRY(xen_irq_enable_direct)
        /* Clear mask and test pending */
        andw $0x00ff, PER_CPU_VAR(xen_vcpu_info)+XEN_vcpu_info_pending
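        /* In struct vcpu_info the evtchn_upcall_pending and
           evtchn_upcall_mask bytes are adjacent, so this 16-bit "and"
           clears the mask (high byte) while leaving pending (low
           byte) untouched, and sets ZF iff no event is pending. */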
        /* Preempt here doesn't matter because that will deal with
           any pending interrupts.  The pending check may end up being
           run on the wrong CPU, but that doesn't hurt. */
        jz 1f
2:      call check_events
1:
ENDPATCH(xen_irq_enable_direct)
        ret
        ENDPROC(xen_irq_enable_direct)
        RELOC(xen_irq_enable_direct, 2b+1)
/*
        Disabling events is simply a matter of making the event mask
        non-zero.
 */
ENTRY(xen_irq_disable_direct)
        movb $1, PER_CPU_VAR(xen_vcpu_info)+XEN_vcpu_info_mask
ENDPATCH(xen_irq_disable_direct)
        ret
        ENDPROC(xen_irq_disable_direct)
        RELOC(xen_irq_disable_direct, 0)
/*
        (xen_)save_fl is used to get the current interrupt enable status.
        Callers expect the status to be in X86_EFLAGS_IF, and other bits
        may be set in the return value.  We take advantage of this by
        making sure that X86_EFLAGS_IF has the right value (and other
        bits in that byte are 0), but other bits in the return value are
        undefined.  We need to toggle the state of the bit, because
        Xen and x86 use opposite senses (mask vs enable).
 */
ENTRY(xen_save_fl_direct)
        testb $0xff, PER_CPU_VAR(xen_vcpu_info)+XEN_vcpu_info_mask
        setz %ah
        addb %ah,%ah
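        /* setz leaves 1 in %ah when the mask byte is zero (events
           enabled); doubling it gives 2, which sits in bit 9 of %eax
           -- exactly X86_EFLAGS_IF (0x200). */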
ENDPATCH(xen_save_fl_direct)
        ret
        ENDPROC(xen_save_fl_direct)
        RELOC(xen_save_fl_direct, 0)
/*
        In principle the caller should be passing us a value returned
        from xen_save_fl_direct, but for robustness' sake we test only
        the X86_EFLAGS_IF flag rather than the whole byte.  After
        setting the interrupt mask state, it checks for unmasked
        pending events and enters the hypervisor to get them delivered
        if so.
 */
ENTRY(xen_restore_fl_direct)
        testb $X86_EFLAGS_IF>>8, %ah
        setz PER_CPU_VAR(xen_vcpu_info)+XEN_vcpu_info_mask
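        /* The testb above sets ZF iff IF is clear in the value being
           restored, so this stores 1 (masked) when interrupts should
           be off and 0 when they should be on -- Xen's mask has the
           opposite sense to the x86 flag. */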
        /* Preempt here doesn't matter because that will deal with
           any pending interrupts.  The pending check may end up being
           run on the wrong CPU, but that doesn't hurt. */
        /* check for unmasked and pending */
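        /* (The pending/mask word reads 0x0001 exactly when
           pending == 1 and mask == 0, i.e. an unmasked event is
           waiting; in that case we fall through to check_events.) */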
        cmpw $0x0001, PER_CPU_VAR(xen_vcpu_info)+XEN_vcpu_info_pending
        jnz 1f
2:      call check_events
1:
ENDPATCH(xen_restore_fl_direct)
        ret
        ENDPROC(xen_restore_fl_direct)
        RELOC(xen_restore_fl_direct, 2b+1)
/*
        This is run where a normal iret would be run, with the same
        stack setup:
              8: eflags
              4: cs
        esp-> 0: eip

        This attempts to make sure that any pending events are dealt
        with on return to usermode, but there is a small window in
        which an event can happen just before entering usermode.  If
        the nested interrupt ends up setting one of the TIF_WORK_MASK
        pending work flags, they will not be tested again before
        returning to usermode.  This means that a process can end up
        with pending work, which will be unprocessed until the process
        enters and leaves the kernel again, which could be an
        unbounded amount of time.  This means that a pending signal or
        reschedule event could be indefinitely delayed.

        The fix is to notice a nested interrupt in the critical
        window, and if one occurs, then fold the nested interrupt into
        the current interrupt stack frame, and re-process it
        iteratively rather than recursively.  This means that it will
        exit via the normal path, and all pending work will be dealt
        with appropriately.

        Because the nested interrupt handler needs to deal with the
        current stack state in whatever form it's in, we keep things
        simple by only using a single register which is pushed/popped
        on the stack.

        Non-direct iret could be done in the same way, but it would
        require an annoying amount of code duplication.  We'll assume
        that direct mode will be the common case once the hypervisor
        support becomes commonplace.
 */
ENTRY(xen_iret_direct)
        /* test eflags for special cases */
        testl $(X86_EFLAGS_VM | XEN_EFLAGS_NMI), 8(%esp)
        jnz hyper_iret

        push %eax
        ESP_OFFSET=4    # bytes pushed onto stack
        /* Store vcpu_info pointer for easy access.  Do it this
           way to avoid having to reload %fs */
#ifdef CONFIG_SMP
        GET_THREAD_INFO(%eax)
        movl TI_cpu(%eax),%eax
        movl __per_cpu_offset(,%eax,4),%eax
        lea per_cpu__xen_vcpu_info(%eax),%eax
#else
        movl $per_cpu__xen_vcpu_info, %eax
#endif
        /* check IF state we're restoring */
        testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
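        /* 8+1+ESP_OFFSET addresses the second byte of the saved
           eflags (eip/cs take 8 bytes, plus 4 for the pushed %eax);
           IF is bit 9 of eflags, i.e. bit 1 of that byte. */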
        /* Maybe enable events.  Once this happens we could get a
           recursive event, so the critical region starts immediately
           afterwards.  However, if that happens we don't end up
           resuming the code, so we don't have to be worried about
           being preempted to another CPU. */
        setz XEN_vcpu_info_mask(%eax)
xen_iret_start_crit:
        /* check for unmasked and pending */
        cmpw $0x0001, XEN_vcpu_info_pending(%eax)

        /* If there's something pending, mask events again so we
           can jump back into xen_hypervisor_callback */
        sete XEN_vcpu_info_mask(%eax)
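        /* The ZF from the cmpw above survives both the sete and the
           popl below, and is finally consumed by the "je" that
           chooses between iret and the hypervisor callback. */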
        popl %eax

        /* From this point on the registers are restored and the stack
           updated, so we don't need to worry about it if we're
           preempted */
iret_restore_end:
        /* Jump to hypervisor_callback after fixing up the stack.
           Events are masked, so jumping out of the critical
           region is OK. */
        je xen_hypervisor_callback

        iret
xen_iret_end_crit:

hyper_iret:
        /* put this out of line since it's very rarely used */
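        /* Each hypercall stub in the Xen-provided hypercall page
           occupies a 32-byte slot, indexed by hypercall number. */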
        jmp hypercall_page + __HYPERVISOR_iret * 32

        .globl xen_iret_start_crit, xen_iret_end_crit
/*
   This is called by xen_hypervisor_callback in entry.S when it sees
   that the EIP at the time of interrupt was between xen_iret_start_crit
   and xen_iret_end_crit.  We're passed the EIP in %eax so we can do
   a more refined determination of what to do.

   The stack format at this point is:
        ----------------
         ss             : (ss/esp may be present if we came from usermode)
         esp            :
         eflags         }  outer exception info
         cs             }
         eip            }
        ---------------- <- edi (copy dest)
         eax            :  outer eax if it hasn't been restored
        ----------------
         eflags         }  nested exception info
         cs             }  (no ss/esp because we're nested
         eip            }   from the same ring)
         orig_eax       } <- esi (copy src)
         - - - - - - - -
         fs             }
         es             }
         ds             }  SAVE_ALL state
         eax            }
          :             :
         ebx            }
        ----------------
         return addr     <- esp
        ----------------

   In order to deliver the nested exception properly, we need to shift
   everything from the return addr up to the error code so it sits
   just under the outer exception info.  This means that when we
   handle the exception, we do it in the context of the outer
   exception rather than starting a new one.

   The only caveat is that if the outer eax hasn't been restored yet
   (ie, it's still on the stack), we need to insert its value into the
   SAVE_ALL state before going on, since it's usermode state which we
   eventually need to restore.
 */
ENTRY(xen_iret_crit_fixup)
        /* offsets +4 for return address */
        /*
           Paranoia: Make sure we're really coming from kernel space.
           One could imagine a case where userspace jumps into the
           critical range address, but just before the CPU delivers a
           GP, it decides to deliver an interrupt instead.  Unlikely?
           Definitely.  Easy to avoid?  Yes.  The Intel documents
           explicitly say that the reported EIP for a bad jump is the
           jump instruction itself, not the destination, but some
           virtual environments get this wrong.
         */
        movl PT_CS+4(%esp), %ecx
        andl $SEGMENT_RPL_MASK, %ecx
        cmpl $USER_RPL, %ecx
        je 2f

        lea PT_ORIG_EAX+4(%esp), %esi
        lea PT_EFLAGS+4(%esp), %edi

        /* If eip is before iret_restore_end then stack
           hasn't been restored yet. */
        cmp $iret_restore_end, %eax
        jae 1f

        movl 0+4(%edi),%eax             /* copy EAX */
        movl %eax, PT_EAX+4(%esp)

        lea ESP_OFFSET(%edi),%edi       /* move dest up over saved regs */
        /* set up the copy */
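        /* std makes movsl walk downwards in memory, copying from
           orig_eax (%esi) down through the return address into the
           region just below the outer exception info (%edi);
           (PT_EIP+4)/4 is the dword count from the return address up
           through orig_eax. */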
1:      std
        mov $(PT_EIP+4) / 4, %ecx       /* copy ret+saved regs up to orig_eax */
        rep movsl
        cld

        lea 4(%edi),%esp                /* point esp to new frame */
2:      ret
/*
        Force an event check by making a hypercall,
        but preserve regs before making the call.
 */
check_events:
        push %eax
        push %ecx
        push %edx
        call force_evtchn_callback
        pop %edx
        pop %ecx
        pop %eax
        ret