2005-04-17 02:20:36 +04:00
/*
 * Linux/PA-RISC Project (http://www.parisc-linux.org/)
 *
 * System call entry code / Linux gateway page
 * Copyright (c) Matthew Wilcox 1999 <willy@infradead.org>
 * Licensed under the GNU GPL.
 * thanks to Philipp Rumpf, Mike Shaver and various others
 * sorry about the wall, puffin..
 */

/*
How does the Linux gateway page on PA-RISC work?
------------------------------------------------
The Linux gateway page on PA-RISC is "special".
It actually has PAGE_GATEWAY bits set (this is linux terminology; in parisc
terminology it's Execute, promote to PL0) in the page map.  So anything
executing on this page executes with kernel level privilege (there's more to it
than that: to have this happen, you also have to use a branch with a ,gate
completer to activate the privilege promotion).  The upshot is that everything
that runs on the gateway page runs at kernel privilege but with the current
user process address space (although you have access to kernel space via %sr2).
For the 0x100 syscall entry, we redo the space registers to point to the kernel
address space (preserving the user address space in %sr3), move to wide mode if
required, save the user registers and branch into the kernel syscall entry
point.  For all the other functions, we execute at kernel privilege but don't
flip address spaces. The basic upshot of this is that these code snippets are
executed atomically (because the kernel can't be pre-empted) and they may
perform architecturally forbidden (to PL3) operations (like setting control
registers).
*/

#include <asm/asm-offsets.h>
#include <asm/unistd.h>
#include <asm/errno.h>
#include <asm/page.h>
#include <asm/psw.h>
#include <asm/thread_info.h>
#include <asm/assembly.h>
#include <asm/processor.h>
#include <asm/cache.h>

#include <linux/linkage.h>

	/* We fill the empty parts of the gateway page with
	 * something that will kill the kernel or a
	 * userspace application.
	 */
#define KILL_INSN	break	0,0

	.level		PA_ASM_LEVEL

parisc: Rewrite light-weight syscall and futex code
The parisc architecture lacks general hardware support for compare and swap.
Particularly for userspace, it is difficult to implement software atomic
support. Page faults in critical regions can cause processes to sleep and
block the forward progress of other processes. Thus, it is essential that
page faults be disabled in critical regions. For performance reasons, we
also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the
critical region. Fortunately, parisc has the "stbys,e" instruction. When
the leftmost byte of a word is addressed, this instruction triggers all
the exceptions of a normal store but it does not write to memory. Thus,
we can use it to trigger COW breaks outside the critical region without
modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have previously executed a "stbys,e"
instruction, we still need to disable pagefaults around the critical region.
If a fault occurs in the critical region, we return -EAGAIN. I had to add
a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that
returning -EAGAIN caused problems for some processes even though it is
listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with
interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the
futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the
flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS
routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable
and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it
might be possible to allow processes to be scheduled when they are on the
gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test
time by about 10%.
In v2, I removed the lws_atomic_xchg and lws_atomic_store calls. I
also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from
arch_futex_atomic_op_inuser(). It is always called with page faults
disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-05 00:44:32 +03:00
	.macro	lws_pagefault_disable reg1,reg2
	mfctl	%cr30, \reg2
	ldo	TASK_PAGEFAULT_DISABLED(\reg2), \reg2
	ldw	0(%sr2,\reg2), \reg1
	ldo	1(\reg1), \reg1
	stw	\reg1, 0(%sr2,\reg2)
	.endm

	.macro	lws_pagefault_enable reg1,reg2
	mfctl	%cr30, \reg2
	ldo	TASK_PAGEFAULT_DISABLED(\reg2), \reg2
	ldw	0(%sr2,\reg2), \reg1
	ldo	-1(\reg1), \reg1
	stw	\reg1, 0(%sr2,\reg2)
	.endm

	.text

	.import syscall_exit,code
	.import syscall_exit_rfi,code

	/* Linux gateway page is aliased to virtual page 0 in the kernel
	 * address space. Since it is a gateway page it cannot be
	 * dereferenced, so null pointers will still fault. We start
	 * the actual entry point at 0x100. We put break instructions
	 * at the beginning of the page to trap null indirect function
	 * pointers.
	 */

	.align PAGE_SIZE
ENTRY(linux_gateway_page)

	/* ADDRESS 0x00 to 0xb0 = 176 bytes / 4 bytes per insn = 44 insns */
	.rept 44
	KILL_INSN
	.endr

	/* ADDRESS 0xb0 to 0xb8, lws uses two insns for entry */
	/* Light-weight-syscall entry must always be located at 0xb0 */
	/* WARNING: Keep this number updated with table size changes */
#define __NR_lws_entries (5)

lws_entry:
	gate	lws_start, %r0		/* increase privilege */
	depi	PRIV_USER, 31, 2, %r31	/* Ensure we return into user mode. */

	/* Fill from 0xb8 to 0xe0 */
	.rept 10
	KILL_INSN
	.endr

	/* This function MUST be located at 0xe0 for glibc's threading
	mechanism to work. DO NOT MOVE THIS CODE EVER! */
set_thread_pointer:
	gate	.+8, %r0		/* increase privilege */
	depi	PRIV_USER, 31, 2, %r31	/* Ensure we return into user mode. */
	be	0(%sr7,%r31)		/* return to user space */
	mtctl	%r26, %cr27		/* move arg0 to the control register */

	/* Increase the chance of trapping if random jumps occur to this
	address, fill from 0xf0 to 0x100 */
	.rept 4
	KILL_INSN
	.endr
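
The fixed entry offsets on the gateway page (0xb0, 0xe0 and 0x100) and the `.rept` counts that pad between them can be cross-checked with a little arithmetic. The following is a minimal host-side C sketch, not kernel code: the helper names `fill_insns`, `end_of` and `check_gateway_layout` are hypothetical, and the instruction counts (two for `lws_entry`, four for `set_thread_pointer`) are read off the listing in this file.

```c
#include <assert.h>

/* Fixed gateway-page offsets, taken from the comments in this file. */
enum {
	INSN_BYTES         = 4,     /* each instruction occupies 4 bytes */
	LWS_ENTRY          = 0xb0,  /* lws_entry: gate + depi */
	SET_THREAD_POINTER = 0xe0,  /* set_thread_pointer: gate, depi, be, mtctl */
	SYSCALL_ENTRY      = 0x100  /* linux_gateway_entry */
};

/* Hypothetical helper: KILL_INSN slots needed to pad the range [from, to). */
static unsigned int fill_insns(unsigned int from, unsigned int to)
{
	return (to - from) / INSN_BYTES;
}

/* Hypothetical helper: first address after `count` instructions at `start`. */
static unsigned int end_of(unsigned int start, unsigned int count)
{
	return start + count * INSN_BYTES;
}

/* Re-derives the three .rept counts and the two "fill from" addresses. */
static void check_gateway_layout(void)
{
	assert(fill_insns(0x00, LWS_ENTRY) == 44);
	assert(end_of(LWS_ENTRY, 2) == 0xb8);
	assert(fill_insns(0xb8, SET_THREAD_POINTER) == 10);
	assert(end_of(SET_THREAD_POINTER, 4) == 0xf0);
	assert(fill_insns(0xf0, SYSCALL_ENTRY) == 4);
}
```

If an instruction were ever added to one of the fixed-address stubs, the matching `.rept` count would have to shrink by the same amount for the following entry point to stay put.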

/* This address must remain fixed at 0x100 for glibc's syscalls to work */
	.align LINUX_GATEWAY_ADDR
linux_gateway_entry:
	gate	.+8, %r0			/* become privileged */
	mtsp	%r0,%sr4			/* get kernel space into sr4 */
	mtsp	%r0,%sr5			/* get kernel space into sr5 */
	mtsp	%r0,%sr6			/* get kernel space into sr6 */

#ifdef CONFIG_64BIT
	/* Store W bit on entry to the syscall in case it's a wide userland
	 * process. */
	ssm	PSW_SM_W, %r1
	extrd,u	%r1,PSW_W_BIT,1,%r1
	/* sp must be aligned on 4, so deposit the W bit setting into
	 * the bottom of sp temporarily */
	or,ev	%r1,%r30,%r30
	b,n	1f
	/* The top halves of argument registers must be cleared on syscall
	 * entry from narrow executable.
	 */
	depdi	0, 31, 32, %r26
	depdi	0, 31, 32, %r25
	depdi	0, 31, 32, %r24
	depdi	0, 31, 32, %r23
	depdi	0, 31, 32, %r22
	depdi	0, 31, 32, %r21
1:
#endif

	/* We use a rsm/ssm pair to prevent sr3 from being clobbered
	 * by external interrupts.
	 */
	mfsp	%sr7,%r1			/* save user sr7 */
	rsm	PSW_SM_I, %r0			/* disable interrupts */
	mtsp	%r1,%sr3			/* and store it in sr3 */

	mfctl	%cr30,%r1
	xor	%r1,%r30,%r30			/* ye olde xor trick */
	xor	%r1,%r30,%r1
	xor	%r1,%r30,%r30

	LDREG	TASK_STACK(%r30),%r30		/* set up kernel stack */
	ldo	FRAME_SIZE(%r30),%r30
	/* N.B.: It is critical that we don't set sr7 to 0 until r30
	 *       contains a valid kernel stack pointer. It is also
	 *       critical that we don't start using the kernel stack
	 *       until after sr7 has been set to 0.
	 */

	mtsp	%r0,%sr7			/* get kernel space into sr7 */
	ssm	PSW_SM_I, %r0			/* enable interrupts */
	STREGM	%r1,FRAME_SIZE(%r30)		/* save r1 (usp) here for now */
	mfctl	%cr30,%r1			/* get task ptr in %r1 */

	/* Save some registers for sigcontext and potential task
	   switch (see entry.S for the details of which ones are
	   saved/restored).  TASK_PT_PSW is zeroed so we can see whether
	   a process is on a syscall or not.  For an interrupt the real
	   PSW value is stored.  This is needed for gdb and sys_ptrace. */
	STREG	%r0,  TASK_PT_PSW(%r1)
	STREG	%r2,  TASK_PT_GR2(%r1)		/* preserve rp */
	STREG	%r19, TASK_PT_GR19(%r1)

	LDREGM	-FRAME_SIZE(%r30), %r2		/* get users sp back */
#ifdef CONFIG_64BIT
	extrd,u	%r2,63,1,%r19			/* W hidden in bottom bit */
#if 0
	xor	%r19,%r2,%r2			/* clear bottom bit */
	depd,z	%r19,1,1,%r19
	std	%r19,TASK_PT_PSW(%r1)
#endif
#endif
	STREG	%r2,  TASK_PT_GR30(%r1)		/* ... and save it */

	STREG	%r20, TASK_PT_GR20(%r1)		/* Syscall number */
	STREG	%r21, TASK_PT_GR21(%r1)
	STREG	%r22, TASK_PT_GR22(%r1)
	STREG	%r23, TASK_PT_GR23(%r1)		/* 4th argument */
	STREG	%r24, TASK_PT_GR24(%r1)		/* 3rd argument */
	STREG	%r25, TASK_PT_GR25(%r1)		/* 2nd argument */
	STREG	%r26, TASK_PT_GR26(%r1)		/* 1st argument */
	STREG	%r27, TASK_PT_GR27(%r1)		/* user dp */
	STREG	%r28, TASK_PT_GR28(%r1)		/* return value 0 */
	STREG	%r0,  TASK_PT_ORIG_R28(%r1)	/* don't prohibit restarts */
	STREG	%r29, TASK_PT_GR29(%r1)		/* return value 1 */
	STREG	%r31, TASK_PT_GR31(%r1)		/* preserve syscall return ptr */

	ldo	TASK_PT_FR0(%r1), %r27		/* save fpregs from the kernel */
	save_fp	%r27				/* or potential task switch */

	mfctl	%cr11, %r27			/* i.e. SAR */
	STREG	%r27, TASK_PT_SAR(%r1)

	loadgp

#ifdef CONFIG_64BIT
	ldo	-16(%r30),%r29			/* Reference param save area */
	copy	%r19,%r2			/* W bit back to r2 */
#else
	/* no need to save these on stack in wide mode because the first 8
	 * args are passed in registers */
	stw	%r22, -52(%r30)			/* 5th argument */
	stw	%r21, -56(%r30)			/* 6th argument */
#endif

	/* Are we being ptraced? */
	mfctl	%cr30, %r1
	LDREG	TASK_TI_FLAGS(%r1),%r1
	ldi	_TIF_SYSCALL_TRACE_MASK, %r19
	and,COND(=) %r1, %r19, %r0
	b,n	.Ltracesys

	/* Note!  We cannot use the syscall table that is mapped
	nearby since the gateway page is mapped execute-only. */

#ifdef CONFIG_64BIT
	ldil	L%sys_call_table, %r1
	or,=	%r2,%r2,%r2
	addil	L%(sys_call_table64-sys_call_table), %r1
	ldo	R%sys_call_table(%r1), %r19
	or,=	%r2,%r2,%r2
	ldo	R%sys_call_table64(%r1), %r19
#else
	load32	sys_call_table, %r19
#endif
	comiclr,>>	__NR_Linux_syscalls, %r20, %r0
	b,n	.Lsyscall_nosys

	LDREGX	%r20(%r19), %r19

	/* If this is a sys_rt_sigreturn call, and the signal was received
	 * when not in_syscall, then we want to return via syscall_exit_rfi,
	 * not syscall_exit.  Signal no. in r20, in_syscall in r25 (see
	 * trampoline code in signal.c).
	 */
	ldi	__NR_rt_sigreturn,%r2
	comb,=	%r2,%r20,.Lrt_sigreturn
.Lin_syscall:
	ldil	L%syscall_exit,%r2
	be	0(%sr7,%r19)
	ldo	R%syscall_exit(%r2),%r2
.Lrt_sigreturn:
	comib,<> 0,%r25,.Lin_syscall
	ldil	L%syscall_exit_rfi,%r2
	be	0(%sr7,%r19)
	ldo	R%syscall_exit_rfi(%r2),%r2

	/* Note!  Because we are not running where we were linked, any
	calls to functions external to this file must be indirect.  To
	be safe, we apply the opposite rule to functions within this
	file, with local labels given to them to ensure correctness. */

.Lsyscall_nosys:
syscall_nosys:
	ldil	L%syscall_exit,%r1
	be	R%syscall_exit(%sr7,%r1)
	ldo	-ENOSYS(%r0),%r28		/* set errno */


/* Warning! This trace code is a virtual duplicate of the code above so be
 * sure to maintain both! */
.Ltracesys:
tracesys:
	/* Need to save more registers so the debugger can see where we
	 * are.  This saves only the lower 8 bits of PSW, so that the C
	 * bit is still clear on syscalls, and the D bit is set if this
	 * full register save path has been executed.  We check the D
	 * bit on syscall_return_rfi to determine which registers to
	 * restore.  An interrupt results in a full PSW saved with the
	 * C bit set, a non-straced syscall entry results in C and D clear
	 * in the saved PSW.
	 */
	mfctl	%cr30,%r1			/* get task ptr */
	ssm	0,%r2
	STREG	%r2,TASK_PT_PSW(%r1)		/* Lower 8 bits only!! */
	mfsp	%sr0,%r2
	STREG	%r2,TASK_PT_SR0(%r1)
	mfsp	%sr1,%r2
	STREG	%r2,TASK_PT_SR1(%r1)
	mfsp	%sr2,%r2
	STREG	%r2,TASK_PT_SR2(%r1)
	mfsp	%sr3,%r2
	STREG	%r2,TASK_PT_SR3(%r1)
	STREG	%r2,TASK_PT_SR4(%r1)
	STREG	%r2,TASK_PT_SR5(%r1)
	STREG	%r2,TASK_PT_SR6(%r1)
	STREG	%r2,TASK_PT_SR7(%r1)
	STREG	%r2,TASK_PT_IASQ0(%r1)
	STREG	%r2,TASK_PT_IASQ1(%r1)
	LDREG	TASK_PT_GR31(%r1),%r2
	STREG	%r2,TASK_PT_IAOQ0(%r1)
	ldo	4(%r2),%r2
	STREG	%r2,TASK_PT_IAOQ1(%r1)
	ldo	TASK_REGS(%r1),%r2
	/* reg_save %r2 */
	STREG	%r3,PT_GR3(%r2)
	STREG	%r4,PT_GR4(%r2)
	STREG	%r5,PT_GR5(%r2)
	STREG	%r6,PT_GR6(%r2)
	STREG	%r7,PT_GR7(%r2)
	STREG	%r8,PT_GR8(%r2)
	STREG	%r9,PT_GR9(%r2)
	STREG	%r10,PT_GR10(%r2)
	STREG	%r11,PT_GR11(%r2)
	STREG	%r12,PT_GR12(%r2)
	STREG	%r13,PT_GR13(%r2)
	STREG	%r14,PT_GR14(%r2)
	STREG	%r15,PT_GR15(%r2)
	STREG	%r16,PT_GR16(%r2)
	STREG	%r17,PT_GR17(%r2)
	STREG	%r18,PT_GR18(%r2)
	/* Finished saving things for the debugger */

	copy	%r2,%r26
	ldil	L%do_syscall_trace_enter,%r1
	ldil	L%tracesys_next,%r2
	be	R%do_syscall_trace_enter(%sr7,%r1)
	ldo	R%tracesys_next(%r2),%r2

tracesys_next:
	/* do_syscall_trace_enter either returned the syscallno, or -1L,
	 * so we skip restoring the PT_GR20 below, since we pulled it from
	 * task->thread.regs.gr[20] above.
	 */
	copy	%ret0,%r20

	mfctl	%cr30,%r1			/* get task ptr */
	LDREG	TASK_PT_GR28(%r1), %r28		/* Restore return value */
	LDREG	TASK_PT_GR26(%r1), %r26		/* Restore the users args */
	LDREG	TASK_PT_GR25(%r1), %r25
	LDREG	TASK_PT_GR24(%r1), %r24
	LDREG	TASK_PT_GR23(%r1), %r23
	LDREG	TASK_PT_GR22(%r1), %r22
	LDREG	TASK_PT_GR21(%r1), %r21
#ifdef CONFIG_64BIT
	ldo	-16(%r30),%r29			/* Reference param save area */
#else
	stw	%r22, -52(%r30)			/* 5th argument */
	stw	%r21, -56(%r30)			/* 6th argument */
#endif

	cmpib,COND(=),n -1,%r20,tracesys_exit	/* seccomp may have returned -1 */
	comiclr,>>	__NR_Linux_syscalls, %r20, %r0
	b,n	.Ltracesys_nosys

	/* Note!  We cannot use the syscall table that is mapped
	nearby since the gateway page is mapped execute-only. */

#ifdef CONFIG_64BIT
	LDREG	TASK_PT_GR30(%r1), %r19		/* get users sp back */
	extrd,u	%r19,63,1,%r2			/* W hidden in bottom bit */

	ldil	L%sys_call_table, %r1
	or,=	%r2,%r2,%r2
	addil	L%(sys_call_table64-sys_call_table), %r1
	ldo	R%sys_call_table(%r1), %r19
	or,=	%r2,%r2,%r2
	ldo	R%sys_call_table64(%r1), %r19
#else
	load32	sys_call_table, %r19
#endif

	LDREGX	%r20(%r19), %r19

	/* If this is a sys_rt_sigreturn call, and the signal was received
	 * when not in_syscall, then we want to return via syscall_exit_rfi,
	 * not syscall_exit.  Signal no. in r20, in_syscall in r25 (see
	 * trampoline code in signal.c).
	 */
	ldi	__NR_rt_sigreturn,%r2
	comb,=	%r2,%r20,.Ltrace_rt_sigreturn
.Ltrace_in_syscall:
	ldil	L%tracesys_exit,%r2
	be	0(%sr7,%r19)
	ldo	R%tracesys_exit(%r2),%r2

.Ltracesys_nosys:
	ldo	-ENOSYS(%r0),%r28		/* set errno */

	/* Do *not* call this function on the gateway page, because it
	makes a direct call to syscall_trace. */

tracesys_exit:
	mfctl	%cr30,%r1			/* get task ptr */
#ifdef CONFIG_64BIT
	ldo	-16(%r30),%r29			/* Reference param save area */
#endif
	ldo	TASK_REGS(%r1),%r26
	BL	do_syscall_trace_exit,%r2
	STREG	%r28,TASK_PT_GR28(%r1)		/* save return value now */
	mfctl	%cr30,%r1			/* get task ptr */
	LDREG	TASK_PT_GR28(%r1), %r28		/* Restore return val. */

	ldil	L%syscall_exit,%r1
	be,n	R%syscall_exit(%sr7,%r1)

.Ltrace_rt_sigreturn:
	comib,<> 0,%r25,.Ltrace_in_syscall
	ldil	L%tracesys_sigexit,%r2
	be	0(%sr7,%r19)
	ldo	R%tracesys_sigexit(%r2),%r2

tracesys_sigexit:
	mfctl	%cr30,%r1			/* get task ptr */
#ifdef CONFIG_64BIT
	ldo	-16(%r30),%r29			/* Reference param save area */
#endif
	BL	do_syscall_trace_exit,%r2
	ldo	TASK_REGS(%r1),%r26

	ldil	L%syscall_exit_rfi,%r1
	be,n	R%syscall_exit_rfi(%sr7,%r1)

	/*********************************************************
		32/64-bit Light-Weight-Syscall ABI

		* - Indicates a hint for userspace inline asm
		implementations.

		Syscall number (caller-saves)
		- %r20
		* In asm clobber.

		Argument registers (caller-saves)
		- %r26, %r25, %r24, %r23, %r22
		* In asm input.

		Return registers (caller-saves)
		- %r28 (return), %r21 (errno)
		* In asm output.

		Caller-saves registers
		- %r1, %r27, %r29
		- %r2 (return pointer)
		- %r31 (ble link register)
		* In asm clobber.

		Callee-saves registers
		- %r3-%r18
		- %r30 (stack pointer)
		* Not in asm clobber.

		If userspace is 32-bit:
		Callee-saves registers
		- %r19 (32-bit PIC register)

		Differences from 32-bit calling convention:
		- Syscall number in %r20
		- Additional argument register %r22 (arg4)
		- Callee-saves %r19.

		If userspace is 64-bit:
		Callee-saves registers
		- %r27 (64-bit PIC register)

		Differences from 64-bit calling convention:
		- Syscall number in %r20
		- Additional argument register %r22 (arg4)
		- Callee-saves %r27.

		Error codes returned by entry path:

		ENOSYS - r20 was an invalid LWS number.

	*********************************************************/
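
The entry-path contract in the ABI comment (LWS number in %r20, result in %r28, error in %r21, ENOSYS for an invalid number) can be modelled on the host in a few lines of C. This is only a sketch under stated assumptions: `struct lws_result`, `model_handler`, `model_lws_table` and `model_lws_entry` are hypothetical names, the handlers are placeholders rather than real LWS routines, and zeroing the return value on the error path is a modelling choice (the assembly does not set %r28 there).

```c
#include <assert.h>
#include <errno.h>

#define MODEL_NR_LWS_ENTRIES 5UL	/* mirrors __NR_lws_entries in this file */

/* Model of the two result registers named in the ABI comment. */
struct lws_result {
	long ret;	/* stands in for %r28 (return) */
	long err;	/* stands in for %r21 (errno): 0 or a negative errno */
};

/* Hypothetical handler type; the real table entries are assembly routines. */
typedef long (*lws_handler_t)(long arg0, long arg1, long arg2);

static long model_handler(long arg0, long arg1, long arg2)
{
	return arg0 + arg1 + arg2;	/* placeholder work, not a real LWS */
}

static const lws_handler_t model_lws_table[MODEL_NR_LWS_ENTRIES] = {
	model_handler, model_handler, model_handler,
	model_handler, model_handler,
};

/* Validate the LWS number, then dispatch through the table. */
static struct lws_result model_lws_entry(unsigned long nr,
					 long arg0, long arg1, long arg2)
{
	struct lws_result res = { 0, 0 };

	/* The entry path compares unsigned, so "negative" numbers fail too. */
	if (nr >= MODEL_NR_LWS_ENTRIES) {
		res.err = -ENOSYS;
		return res;
	}

	res.ret = model_lws_table[nr](arg0, arg1, arg2);
	return res;
}
```

A caller would inspect `err` first and only then trust `ret`, which matches the split between %r21 and %r28 in the register list above.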
lws_start:

#ifdef CONFIG_64BIT
	ssm	PSW_SM_W, %r1
	extrd,u	%r1,PSW_W_BIT,1,%r1
	/* sp must be aligned on 4, so deposit the W bit setting into
	 * the bottom of sp temporarily */
	or,od	%r1,%r30,%r30

	/* Clip LWS number to a 32-bit value for 32-bit processes */
	depdi	0, 31, 32, %r20
#endif

	/* Is the lws entry number valid? */
	comiclr,>>	__NR_lws_entries, %r20, %r0
	b,n	lws_exit_nosys

	/* Load table start */
	ldil	L%lws_table, %r1
	ldo	R%lws_table(%r1), %r28	/* Scratch use of r28 */
	LDREGX	%r20(%sr2,r28), %r21	/* Scratch use of r21 */

	/* Jump to lws, lws table pointers already relocated */
	be,n	0(%sr2,%r21)

lws_exit_noerror:
	lws_pagefault_enable	%r1,%r21
	stw,ma	%r20, 0(%sr2,%r20)
	ssm	PSW_SM_I, %r0
	b	lws_exit
	copy	%r0, %r21

lws_wouldblock:
	ssm	PSW_SM_I, %r0
	ldo	2(%r0), %r28
	b	lws_exit
	ldo	-EAGAIN(%r0), %r21

lws_pagefault:
	lws_pagefault_enable	%r1,%r21
	stw,ma	%r20, 0(%sr2,%r20)
	ssm	PSW_SM_I, %r0
	ldo	3(%r0),%r28
	b	lws_exit
	ldo	-EAGAIN(%r0),%r21

lws_fault:
	ldo	1(%r0),%r28
	b	lws_exit
	ldo	-EFAULT(%r0),%r21

lws_exit_nosys:
parisc: Rewrite light-weight syscall and futex code
The parisc architecture lacks general hardware support for compare and swap.
Particularly for userspace, it is difficult to implement software atomic
support. Page faults in critical regions can cause processes to sleep and
block the forward progress of other processes. Thus, it is essential that
page faults be disabled in critical regions. For performance reasons, we
also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the
critical region. Fortunately, parisc has the "stbys,e" instruction. When
the leftmost byte of a word is addressed, this instruction triggers all
the exceptions of a normal store but it does not write to memory. Thus,
we can use it to trigger COW breaks outside the critical region without
modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have priviously executed a "stbys,e"
instruction, we still need to disable pagefaults around the critical region.
If a fault occurs in the critical region, we return -EAGAIN. I had to add
a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that
returning -EAGAIN caused problems for some processes even though it is
listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with
interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the
futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the
flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS
routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable
and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it
might be possible to allow processes to be scheduled when they are on the
gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test
time by about 10%.
In v2, I removed the lws_atomic_xchg and lws_atomic_store calls. I
also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from
arch_futex_atomic_op_inuser(). It is always called with page faults
disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-05 00:44:32 +03:00
ldo -ENOSYS(%r0),%r21
2005-04-17 02:20:36 +04:00
/* Fall through: Return to userspace */
lws_exit:
2005-10-22 06:46:48 +04:00
#ifdef CONFIG_64BIT
2005-04-17 02:20:36 +04:00
/* decide whether to reset the wide mode bit
 *
 * For a syscall, the W bit is stored in the lowest bit
 * of sp. Extract it and reset W if it is zero */
extrd,u,*<> %r30,63,1,%r1
rsm PSW_SM_W, %r0
/* now reset the lowest bit of sp if it was set */
xor %r30,%r1,%r30
#endif
2010-04-11 21:26:34 +04:00
be,n 0(%sr7, %r31)
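The stack-pointer side of the wide-mode handling in lws_exit can be modelled in C. This is only an illustrative sketch: the function names are hypothetical, and the PSW W-bit update performed by rsm has no C equivalent, so only the flag extraction and the clearing of the lowest sp bit are shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: models "extrd,u,*<> %r30,63,1,%r1", which copies
 * the lowest bit of sp into a scratch register. In the listing the
 * following rsm is skipped when that bit is set, so W is reset only
 * when the bit is zero. */
static bool lws_sp_wide_flag(uint64_t sp)
{
    return (sp & 1u) != 0;
}

/* Hypothetical helper: models "xor %r30,%r1,%r30", which clears the
 * lowest bit of sp if it was set and leaves sp unchanged otherwise. */
static uint64_t lws_sp_clear_wide_flag(uint64_t sp)
{
    uint64_t flag = sp & 1u;
    return sp ^ flag;
}
```

Using xor with the extracted bit avoids a branch: the same instruction is correct whether or not the flag was set.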
2005-04-17 02:20:36 +04:00
/***************************************************
2014-09-12 20:02:34 +04:00
Implementing 32bit CAS as an atomic operation:
2005-04-17 02:20:36 +04:00
%r26 - Address to examine
%r25 - Old value to check (old)
%r24 - New value to set (new)
%r28 - Return prev through this register.
%r21 - Kernel error code
2022-01-05 00:44:32 +03:00
%r21 returns the following error codes:
2005-04-17 02:20:36 +04:00
EAGAIN - CAS is busy, ldcw failed, try again.
EFAULT - Read or write failed.
2022-01-05 00:44:32 +03:00
If EAGAIN is returned, %r28 indicates the busy reason:
r28 == 1 - CAS is busy. lock contended.
r28 == 2 - CAS is busy. ldcw failed.
r28 == 3 - CAS is busy. page fault.
2005-04-17 02:20:36 +04:00
Scratch: r20, r28, r1
****************************************************/
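A caller of the CAS entry point has to decide what to do with the error code returned in %r21. The following C sketch shows one way to classify it, assuming the value has already been copied into a C variable; the enum and function names are hypothetical and are not part of the kernel or of any library.

```c
#include <errno.h>

/* Hypothetical classification of the %r21 result. */
enum lws_action {
    LWS_DONE,   /* %r21 == 0: the operation completed, prev is in %r28 */
    LWS_RETRY,  /* %r21 == -EAGAIN: busy, the caller may try again */
    LWS_FAIL    /* any other error, e.g. -EFAULT or -ENOSYS */
};

static enum lws_action lws_classify(long r21)
{
    if (r21 == 0)
        return LWS_DONE;
    if (r21 == -EAGAIN)
        return LWS_RETRY;
    return LWS_FAIL;
}
```

Only -EAGAIN is treated as retryable here, because the header comment describes it as a busy condition, whereas -EFAULT reports a failed read or write.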
/* ELF64 Process entry path */
lws_compare_and_swap64:
2005-10-22 06:46:48 +04:00
#ifdef CONFIG_64BIT
2005-04-17 02:20:36 +04:00
b,n lws_compare_and_swap
#else
/* If we are not a 64-bit kernel, then we don't
2008-12-30 05:47:38 +03:00
 * have 64-bit input registers, and calling
 * the 64-bit LWS CAS returns ENOSYS.
2005-04-17 02:20:36 +04:00
 */
b,n lws_exit_nosys
#endif
2022-01-05 00:44:32 +03:00
/* ELF32/ELF64 Process entry path */
2005-04-17 02:20:36 +04:00
lws_compare_and_swap32:
2005-10-22 06:46:48 +04:00
#ifdef CONFIG_64BIT
2022-01-05 00:44:32 +03:00
/* Wide mode user process? */
bb,<,n %sp, 31, lws_compare_and_swap
/* Clip all the input registers for 32-bit processes */
2005-04-17 02:20:36 +04:00
depdi 0, 31, 32, %r26
depdi 0, 31, 32, %r25
depdi 0, 31, 32, %r24
#endif
lws_compare_and_swap:
2022-01-05 00:44:32 +03:00
/* Trigger memory reference interruptions without writing to memory */
1: ldw 0(%r26), %r28
2: stbys,e %r0, 0(%r26)
/* Calculate 8-bit hash index from virtual address */
extru_safe %r26, 27, 8, %r20
2005-04-17 02:20:36 +04:00
2022-01-05 00:44:32 +03:00
/* Load start of lock table */
ldil L%lws_lock_start, %r28
ldo R%lws_lock_start(%r28), %r28
2005-04-17 02:20:36 +04:00
2022-01-05 00:44:32 +03:00
/* Find lock to use, the hash index is one of 0 to
255, multiplied by 16 (keep it 16-byte aligned)
2005-04-17 02:20:36 +04:00
and add to the lock table offset. */
shlw %r20, 4, %r20
add %r20, %r28, %r20
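The lock selection above (8-bit hash index, scaled by 16, added to the table base) can be written out in C as a rough sketch. The constant and function names are hypothetical. The bit positions assume PA-RISC's numbering, where bit 0 is the most significant bit of the 32-bit word, so an 8-bit field ending at bit 27 corresponds to bits 4..11 counted from the least significant end.

```c
#include <stdint.h>

/* Hypothetical constants: the listing implies a table of 256 slots,
 * 16 bytes each, starting at lws_lock_start. */
#define LWS_LOCK_SLOTS     256u
#define LWS_LOCK_SLOT_SIZE 16u

/* Models "extru_safe %r26, 27, 8, %r20": take 8 bits of the user
 * address, skipping the low 4 bits. */
static inline uint32_t lws_hash_index(uint32_t addr)
{
    return (addr >> 4) & (LWS_LOCK_SLOTS - 1u);
}

/* Models "shlw %r20, 4, %r20" followed by "add %r20, %r28, %r20":
 * scale the index by the slot size and add the table base. */
static inline uintptr_t lws_lock_address(uintptr_t lock_start, uint32_t addr)
{
    return lock_start + (uintptr_t)lws_hash_index(addr) * LWS_LOCK_SLOT_SIZE;
}
```

Because the low 4 bits are ignored, all words within the same 16-byte block of user memory map to the same lock, and addresses 4096 bytes apart collide on the same slot.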
2022-01-05 00:44:32 +03:00
rsm PSW_SM_I, %r0 /* Disable interrupts */
/* Try to acquire the lock */
LDCW 0(%sr2,%r20), %r28
comclr,<> %r0, %r28, %r0
b,n lws_wouldblock
/* Disable page faults to prevent sleeping in critical region */
lws_pagefault_disable %r21,%r28
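LDCW (load and clear word) returns the previous contents of the lock word and leaves zero behind, so a nonzero result means the lock was free and is now held. The C sketch below models that convention sequentially; the function names are hypothetical, and the code is deliberately not atomic, since it only illustrates the meaning of the values.

```c
#include <stdint.h>

/* Hypothetical sequential model of LDCW: return the old lock word
 * and leave zero in memory. */
static uint32_t lws_ldcw_model(uint32_t *lock)
{
    uint32_t old = *lock;
    *lock = 0;
    return old;
}

/* Returns 1 if the lock was acquired. A zero result from LDCW means
 * another holder already cleared the word, which is the case the
 * listing sends to lws_wouldblock. */
static int lws_try_lock_model(uint32_t *lock)
{
    return lws_ldcw_model(lock) != 0;
}

/* Release by storing any nonzero value; the exit paths in the listing
 * store the contents of %r20 with "stw,ma %r20, 0(%sr2,%r20)". */
static void lws_unlock_model(uint32_t *lock, uint32_t nonzero)
{
    *lock = nonzero;
}
```

Note the inverted sense compared with many lock implementations: here zero means "held" and nonzero means "free".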
2005-04-17 02:20:36 +04:00
/*
prev = *addr;
if ( prev == old )
*addr = new;
return prev;
*/
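The pseudocode in the comment above can be expressed as a small C function. This is a sequential model under a hypothetical name: in the listing, atomicity comes from the hashed lock and from running with interrupts and page faults disabled, not from anything in this C code.

```c
#include <stdint.h>

/* Hypothetical model of the 32-bit compare-and-swap semantics:
 * always return the previous value, and store new_val only when the
 * previous value equals old. */
static uint32_t lws_cas32_model(uint32_t *addr, uint32_t old, uint32_t new_val)
{
    uint32_t prev = *addr;

    if (prev == old)
        *addr = new_val;
    return prev;
}
```

A caller can tell whether the swap happened by comparing the returned value with old: they are equal exactly when the store was performed.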
/* NOTES:
2022-01-05 00:44:32 +03:00
This all works because intr_do_signal
2005-04-17 02:20:36 +04:00
and schedule both check the return iasq
and see that we are on the kernel page
so this process is never scheduled off
or is ever sent any signal of any sort,
parisc: Rewrite light-weight syscall and futex code
The parisc architecture lacks general hardware support for compare and swap.
Particularly for userspace, it is difficult to implement software atomic
support. Page faults in critical regions can cause processes to sleep and
block the forward progress of other processes. Thus, it is essential that
page faults be disabled in critical regions. For performance reasons, we
also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the
critical region. Fortunately, parisc has the "stbys,e" instruction. When
the leftmost byte of a word is addressed, this instruction triggers all
the exceptions of a normal store but it does not write to memory. Thus,
we can use it to trigger COW breaks outside the critical region without
modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have priviously executed a "stbys,e"
instruction, we still need to disable pagefaults around the critical region.
If a fault occurs in the critical region, we return -EAGAIN. I had to add
a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that
returning -EAGAIN caused problems for some processes even though it is
listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with
interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the
futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the
flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS
routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable
and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it
might be possible to allow processes to be scheduled when they are on the
gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test
time by about 10%.
In v2, I removed the lws_atomic_xchg and lws_atomic_store calls. I
also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from
arch_futex_atomic_op_inuser(). It is always called with page faults
disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-05 00:44:32 +03:00
thus it is wholly atomic from usrspace's
perspective
*/
/* The load and store could fail */
3:	ldw	0(%r26), %r28
	sub,<>	%r28, %r25, %r0
4:	stw	%r24, 0(%r26)
	b,n	lws_exit_noerror
/* A fault occurred on load or stbys,e store */
5:	b,n	lws_fault
	ASM_EXCEPTIONTABLE_ENTRY(1b-linux_gateway_page, 5b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(2b-linux_gateway_page, 5b-linux_gateway_page)
/* A page fault occurred in critical region */
6:	b,n	lws_pagefault
	ASM_EXCEPTIONTABLE_ENTRY(3b-linux_gateway_page, 6b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(4b-linux_gateway_page, 6b-linux_gateway_page)
/***************************************************
New CAS implementation which uses pointers and variable size
information. The value pointed by old and new MUST NOT change
while performing CAS. The lock only protects the value at %r26.
%r26 - Address to examine
%r25 - Pointer to the value to check (old)
%r24 - Pointer to the value to set (new)
%r23 - Size of the variable (0/1/2/3 for 8/16/32/64 bit)
%r28 - Return non-zero on failure
%r21 - Kernel error code
%r21 returns the following error codes:
EAGAIN - CAS is busy, ldcw failed, try again.
EFAULT - Read or write failed.
If EAGAIN is returned, %r28 indicates the busy reason:
r28 == 1 - CAS is busy. lock contended.
r28 == 2 - CAS is busy. ldcw failed.
r28 == 3 - CAS is busy. page fault.
Scratch: r20, r22, r28, r29, r1, fr4 (32bit for 64bit CAS only)
****************************************************/
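/* The register interface described above can be modelled in C. This is
   only a hedged user-space sketch: cas2_model, its parameter names, and
   the use of -ENOSYS for an invalid size code are assumptions made for
   illustration. It takes no lock and disables nothing, so unlike the LWS
   routine it is not atomic; it only shows the compare/store contract. */

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the documented interface:
 *   addr      ~ %r26, address to examine
 *   old       ~ %r25, pointer to the expected value
 *   new_value ~ %r24, pointer to the replacement value
 *   size_code ~ %r23, 0/1/2/3 for 8/16/32/64 bit
 *   *err      ~ %r21, kernel error code (0 when the operation ran)
 * The return value plays the role of %r28: zero on success, non-zero
 * when the comparison fails.
 */
static long cas2_model(void *addr, const void *old, const void *new_value,
                       unsigned int size_code, long *err)
{
	size_t nbytes;

	if (size_code > 3) {
		*err = -ENOSYS;	/* assumed value for an invalid size */
		return 1;
	}

	nbytes = (size_t)1 << size_code;	/* 1, 2, 4 or 8 bytes */
	*err = 0;

	/* Compare the current contents with the expected value. */
	if (memcmp(addr, old, nbytes) != 0)
		return 1;

	/* Values match: store the replacement. */
	memcpy(addr, new_value, nbytes);
	return 0;
}
```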
lws_compare_and_swap_2:
#ifdef CONFIG_64BIT
/* Wide mode user process? */
	bb,<,n	%sp, 31, cas2_begin
/* Clip the input registers for 32-bit processes. We don't
   need to clip %r23 as we only use it for word operations */
	depdi	0, 31, 32, %r26
	depdi	0, 31, 32, %r25
	depdi	0, 31, 32, %r24
#endif
cas2_begin:
/* Check the validity of the size pointer */
	subi,>>=	3, %r23, %r0
	b,n	lws_exit_nosys
/* Jump to the functions which will load the old and new values into
   registers depending on their size */
	shlw	%r23, 2, %r29
	blr	%r29, %r0
	nop
/* 8-bit load */
1:	ldb	0(%r25), %r25
	b	cas2_lock_start
parisc: Rewrite light-weight syscall and futex code
The parisc architecture lacks general hardware support for compare and swap.
Particularly for userspace, it is difficult to implement software atomic
support. Page faults in critical regions can cause processes to sleep and
block the forward progress of other processes. Thus, it is essential that
page faults be disabled in critical regions. For performance reasons, we
also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the
critical region. Fortunately, parisc has the "stbys,e" instruction. When
the leftmost byte of a word is addressed, this instruction triggers all
the exceptions of a normal store but it does not write to memory. Thus,
we can use it to trigger COW breaks outside the critical region without
modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have previously executed a "stbys,e"
instruction, we still need to disable pagefaults around the critical region.
If a fault occurs in the critical region, we return -EAGAIN. I had to add
a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that
returning -EAGAIN caused problems for some processes even though it is
listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with
interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the
futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the
flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS
routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable
and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it
might be possible to allow processes to be scheduled when they are on the
gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test
time by about 10%.
In v2, I removed the lws_atomic_xchg and lws_atomic_store calls. I
also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from
arch_futex_atomic_op_inuser(). It is always called with page faults
disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2:	ldb	0(%r24), %r24
nop
nop
nop
nop
nop
/* 16-bit load */
3:	ldh	0(%r25), %r25
b	cas2_lock_start
4:	ldh	0(%r24), %r24
nop
nop
nop
nop
nop
/* 32-bit load */
5:	ldw	0(%r25), %r25
b	cas2_lock_start
6:	ldw	0(%r24), %r24
nop
nop
nop
nop
nop
/* 64-bit load */
#ifdef CONFIG_64BIT
7:	ldd	0(%r25), %r25
8:	ldd	0(%r24), %r24
#else
/* Load old value into r22/r23 - high/low */
7:	ldw	0(%r25), %r22
8:	ldw	4(%r25), %r23
/* Load new value into fr4 for atomic store later */
9:	flddx	0(%r24), %fr4
#endif
cas2_lock_start:
/* Trigger memory reference interruptions without writing to memory */
copy	%r26, %r28
depi_safe	0, 31, 2, %r28
10:	ldw	0(%r28), %r1
11:	stbys,e	%r0, 0(%r28)
/* Calculate 8-bit hash index from virtual address */
extru_safe	%r26, 27, 8, %r20
parisc: Rewrite light-weight syscall and futex code
The parisc architecture lacks general hardware support for compare and swap.
Particularly for userspace, it is difficult to implement software atomic
support. Page faults in critical regions can cause processes to sleep and
block the forward progress of other processes. Thus, it is essential that
page faults be disabled in critical regions. For performance reasons, we
also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the
critical region. Fortunately, parisc has the "stbys,e" instruction. When
the leftmost byte of a word is addressed, this instruction triggers all
the exceptions of a normal store but it does not write to memory. Thus,
we can use it to trigger COW breaks outside the critical region without
modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have priviously executed a "stbys,e"
instruction, we still need to disable pagefaults around the critical region.
If a fault occurs in the critical region, we return -EAGAIN. I had to add
a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that
returning -EAGAIN caused problems for some processes even though it is
listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with
interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the
futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the
flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS
routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable
and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it
might be possible to allow processes to be scheduled when they are on the
gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test
time by about 10%.
In v2, I removed the lws_atomic_xchg and lws_atomic_store calls. I
also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from
arch_futex_atomic_op_inuser(). It is always called with page faults
disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-05 00:44:32 +03:00
/* Load start of lock table */
ldil L%lws_lock_start, %r28
ldo R%lws_lock_start(%r28), %r28
/* Find lock to use, the hash index is one of 0 to
255, multiplied by 16 (keep it 16-byte aligned)
2014-09-12 20:02:34 +04:00
and add to the lock table offset. */
shlw %r20, 4, %r20
add %r20, %r28, %r20
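The lock selection above can be sketched in C. This is only an illustration under stated assumptions: the names `lws_hash_index`, `lws_lock_address`, `LWS_NUM_LOCKS` and `LWS_LOCK_STRIDE` are hypothetical and do not appear in the kernel source, and the bit positions follow from reading `extru_safe %r26, 27, 8, %r20` as "extract 8 bits whose rightmost bit is PA-RISC bit 27", which corresponds to address bits 4..11 in LSB-0 numbering.

```c
#include <stdint.h>

/* Hypothetical constants for illustration; the values mirror the comments
 * in the assembly (hash index 0..255, locks kept 16-byte aligned). */
#define LWS_NUM_LOCKS   256u
#define LWS_LOCK_STRIDE 16u

/* Models extru_safe %r26, 27, 8, %r20: take 8 bits of the user address,
 * skipping the low 4 bits, so that addresses within the same 16-byte
 * block map to the same lock. */
static inline uint32_t lws_hash_index(uintptr_t addr)
{
    return (uint32_t)((addr >> 4) & (LWS_NUM_LOCKS - 1u));
}

/* Models shlw %r20, 4, %r20 followed by add %r20, %r28, %r20:
 * scale the index by the lock stride and add the table base. */
static inline uintptr_t lws_lock_address(uintptr_t lock_table, uintptr_t addr)
{
    return lock_table + (uintptr_t)lws_hash_index(addr) * LWS_LOCK_STRIDE;
}
```

With this mapping, two user addresses that differ only in their low four bits contend for the same lock, while addresses 16 bytes apart use adjacent locks.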
2022-01-05 00:44:32 +03:00
rsm PSW_SM_I, %r0 /* Disable interrupts */
/* Try to acquire the lock */
LDCW 0(%sr2,%r20), %r28
comclr,<> %r0, %r28, %r0
b,n lws_wouldblock
/* Disable page faults to prevent sleeping in critical region */
lws_pagefault_disable %r21,%r28
2014-09-12 20:02:34 +04:00
/*
prev = *addr;
if ( prev == old )
  *addr = new;
return prev;
*/
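The lock acquisition and the pseudocode above can be combined into a small C model. This is a sketch, not the kernel's implementation: `ldcw` and `lws_cas32_sketch` are hypothetical names, C11 atomics stand in for the LDCW instruction (which reads a word and leaves zero behind, so a non-zero result means the lock was free), and returning `-EAGAIN` for a busy lock is an assumption about what the `lws_wouldblock` path reports. Interrupt and page-fault disabling are not modeled.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Models LDCW: atomically read the lock word and leave zero behind.
 * A non-zero result means the lock was free and now belongs to us. */
static inline uint32_t ldcw(atomic_uint *lock)
{
    return atomic_exchange_explicit(lock, 0u, memory_order_acquire);
}

/* Hypothetical helper: on success returns 0 and stores the previous
 * value of *addr in *prev; returns -EAGAIN if the lock is busy
 * (assumed to correspond to the lws_wouldblock branch). */
static int lws_cas32_sketch(atomic_uint *lock, uint32_t *addr,
                            uint32_t old, uint32_t new_val, uint32_t *prev)
{
    if (ldcw(lock) == 0u)
        return -EAGAIN;          /* lock held by someone else */

    *prev = *addr;               /* prev = *addr;        */
    if (*prev == old)            /* if (prev == old)     */
        *addr = new_val;         /*     *addr = new;     */

    /* Release the lock by storing a non-zero value again. */
    atomic_store_explicit(lock, 1u, memory_order_release);
    return 0;                    /* caller inspects *prev */
}
```

A caller compares `*prev` with `old` to learn whether the store happened, and is expected to retry when the helper reports a busy lock.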
/* NOTES:
2022-01-05 00:44:32 +03:00
This all works because intr_do_signal
2014-09-12 20:02:34 +04:00
and schedule both check the return iasq
and see that we are on the kernel page
so this process is never scheduled off
or is ever sent any signal of any sort,
2022-01-05 00:44:32 +03:00
thus it is wholly atomic from usrspace's
2014-09-12 20:02:34 +04:00
perspective
*/
2022-01-05 00:44:32 +03:00
2014-09-12 20:02:34 +04:00
/* Jump to the correct function */
blr %r29, %r0
/* Set %r28 as non-zero for now */
ldo 1(%r0),%r28
2022-01-05 00:44:32 +03:00
/* 8-bit CAS */
12: ldb 0(%r26), %r29
2014-09-12 20:02:34 +04:00
sub,= %r29, %r25, %r0
2022-01-05 00:44:32 +03:00
b,n lws_exit_noerror
13: stb %r24, 0(%r26)
b lws_exit_noerror
2014-09-12 20:02:34 +04:00
copy %r0, %r28
nop
nop
2022-01-05 00:44:32 +03:00
/* 16-bit CAS */
14: ldh 0(%r26), %r29
2014-09-12 20:02:34 +04:00
sub,= %r29, %r25, %r0
2022-01-05 00:44:32 +03:00
b,n lws_exit_noerror
15: sth %r24, 0(%r26)
b lws_exit_noerror
2014-09-12 20:02:34 +04:00
copy %r0, %r28
nop
nop
2022-01-05 00:44:32 +03:00
/* 32-bit CAS */
16: ldw 0(%r26), %r29
2014-09-12 20:02:34 +04:00
sub,= %r29, %r25, %r0
2022-01-05 00:44:32 +03:00
b,n lws_exit_noerror
17: stw %r24, 0(%r26)
b lws_exit_noerror
2014-09-12 20:02:34 +04:00
copy %r0, %r28
nop
nop
parisc: Rewrite light-weight syscall and futex code
The parisc architecture lacks general hardware support for compare and swap.
Particularly for userspace, it is difficult to implement software atomic
support. Page faults in critical regions can cause processes to sleep and
block the forward progress of other processes. Thus, it is essential that
page faults be disabled in critical regions. For performance reasons, we
also need to disable external interrupts in critical regions.
In order to do this, we need a mechanism to trigger COW breaks outside the
critical region. Fortunately, parisc has the "stbys,e" instruction. When
the leftmost byte of a word is addressed, this instruction triggers all
the exceptions of a normal store but it does not write to memory. Thus,
we can use it to trigger COW breaks outside the critical region without
modifying the data that is to be updated atomically.
COW breaks occur randomly. So even if we have priviously executed a "stbys,e"
instruction, we still need to disable pagefaults around the critical region.
If a fault occurs in the critical region, we return -EAGAIN. I had to add
a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that
returning -EAGAIN caused problems for some processes even though it is
listed as a possible return value.
The patch implements the above. The code no longer attempts to sleep with
interrupts disabled and I haven't seen any stalls with the change.
I have attempted to merge common code and streamline the fast path. In the
futex code, we only compute the spinlock address once.
I eliminated some debug code in the original CAS routine that just made the
flow more complicated.
I don't clip the arguments when called from wide mode. As a result, the LWS
routines should work when called from 64-bit processes.
I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable
and lws_pagefault_enable macros.
Since we now disable interrupts on the gateway page where necessary, it
might be possible to allow processes to be scheduled when they are on the
gateway page.
Change has been tested on c8000 and rp3440. It improves glibc build and test
time by about 10%.
In v2, I removed the lws_atomic_xchg and lws_atomic_store calls. I
also removed the bug fixes that were not directly related to this patch.
In v3, I removed the code to force interruptions from
arch_futex_atomic_op_inuser(). It is always called with page faults
disabled, so this code had no effect.
In v4, I fixed a typo in depi_safe line.
In v5, I moved the code to disable/enable page faults inside the spinlocks.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-05 00:44:32 +03:00
/* 64-bit CAS */
2014-09-12 20:02:34 +04:00
#ifdef CONFIG_64BIT
2022-01-05 00:44:32 +03:00
18:	ldd	0(%r26), %r29
2015-09-08 03:13:28 +03:00
	sub,*=	%r29, %r25, %r0
2022-01-05 00:44:32 +03:00
	b,n	lws_exit_noerror
19:	std	%r24, 0(%r26)
2014-09-12 20:02:34 +04:00
	copy	%r0, %r28
#else
/* Compare first word */
2022-01-05 00:44:32 +03:00
18:	ldw	0(%r26), %r29
2014-09-12 20:02:34 +04:00
	sub,=	%r29, %r22, %r0
2022-01-05 00:44:32 +03:00
	b,n	lws_exit_noerror
2014-09-12 20:02:34 +04:00
/* Compare second word */
2022-01-05 00:44:32 +03:00
19:	ldw	4(%r26), %r29
2014-09-12 20:02:34 +04:00
	sub,=	%r29, %r23, %r0
2022-01-05 00:44:32 +03:00
	b,n	lws_exit_noerror
2014-09-12 20:02:34 +04:00
/* Perform the store */
2022-01-05 00:44:32 +03:00
20:	fstdx	%fr4, 0(%r26)
2014-09-12 20:02:34 +04:00
	copy	%r0, %r28
#endif
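The two branches above perform the same 64-bit compare and swap: the 64-bit kernel compares the whole doubleword at once, while the other branch compares the word at offset 0 and then the word at offset 4 before storing the new value with a single fstdx. As a rough illustration only, the word-by-word variant can be modeled in C as below. The `dword` struct and the `cas_two_words` name are hypothetical, the non-zero failure value is an assumption (the assembly shown here only sets %r28 to zero on success), and the sketch does not model the locking or the disabled page faults that make the real sequence atomic.

```c
#include <stdint.h>

/* Hypothetical model of a doubleword as the two words the assembly
 * addresses at offsets 0 and 4. */
struct dword {
	uint32_t first;   /* word at offset 0 */
	uint32_t second;  /* word at offset 4 */
};

/*
 * Hypothetical sketch of the word-by-word compare and swap.
 * Returns 0 when the swap was performed. Returning 1 on a mismatch
 * is an assumption made for this illustration.
 * This is NOT atomic on its own; the real code relies on a lock and
 * on page faults being disabled around the sequence.
 */
static int cas_two_words(struct dword *mem, struct dword expected,
			 struct dword desired)
{
	/* Compare first word */
	if (mem->first != expected.first)
		return 1;

	/* Compare second word */
	if (mem->second != expected.second)
		return 1;

	/* Perform the store (both words written together) */
	*mem = desired;
	return 0;
}
```

A caller would typically retry with a freshly read expected value when the function reports a mismatch.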
2022-01-05 00:44:32 +03:00
	b	lws_exit_noerror
	copy	%r0, %r28
2014-09-12 20:02:34 +04:00
2022-01-05 00:44:32 +03:00
/* A fault occurred on load or stbys,e store */
30:	b,n	lws_fault
	ASM_EXCEPTIONTABLE_ENTRY(1b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(2b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(3b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(4b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(5b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(6b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(7b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(8b-linux_gateway_page, 30b-linux_gateway_page)
#ifndef CONFIG_64BIT
	ASM_EXCEPTIONTABLE_ENTRY(9b-linux_gateway_page, 30b-linux_gateway_page)
#endif
2014-09-12 20:02:34 +04:00
2022-01-05 00:44:32 +03:00
	ASM_EXCEPTIONTABLE_ENTRY(10b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(11b-linux_gateway_page, 30b-linux_gateway_page)
	/* A page fault occurred in critical region */
31:	b,n	lws_pagefault
	ASM_EXCEPTIONTABLE_ENTRY(12b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(13b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(14b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(15b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(16b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(17b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(18b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(19b-linux_gateway_page, 31b-linux_gateway_page)
2014-09-12 20:02:34 +04:00
# ifndef C O N F I G _ 6 4 B I T
	ASM_EXCEPTIONTABLE_ENTRY(20b-linux_gateway_page, 31b-linux_gateway_page)
#endif

	/***************************************************
		LWS atomic exchange.

		%r26 - Exchange address
		%r25 - Size of the variable (0/1/2/3 for 8/16/32/64 bit)
		%r24 - Address of new value
		%r23 - Address of old value
		%r28 - Return non-zero on failure
		%r21 - Kernel error code

		%r21 returns the following error codes:
		EAGAIN - CAS is busy, ldcw failed, try again.
		EFAULT - Read or write failed.

		If EAGAIN is returned, %r28 indicates the busy reason:
		r28 == 1 - CAS is busy. lock contended.
		r28 == 2 - CAS is busy. ldcw failed.
		r28 == 3 - CAS is busy. page fault.

		Scratch: r20, r1

	****************************************************/
lws_atomic_xchg:
#ifdef CONFIG_64BIT
	/* Wide mode user process? */
	bb,<,n	%sp, 31, atomic_xchg_begin

	/* Clip the input registers for 32-bit processes. We don't
	   need to clip %r23 as we only use it for word operations */
	depdi	0, 31, 32, %r26
	depdi	0, 31, 32, %r25
	depdi	0, 31, 32, %r24
	depdi	0, 31, 32, %r23
#endif

atomic_xchg_begin:
	/* Check the validity of the size argument */
	subi,>>= 3, %r25, %r0
	b,n	lws_exit_nosys

	/* Jump to the functions which will load the old and new values into
	   registers depending on their size */
	shlw	%r25, 2, %r1
	blr	%r1, %r0
	nop

	/* Perform exception checks */

	/* 8-bit exchange */
1:	ldb	0(%r24), %r20
	copy	%r23, %r20
	depi_safe	0, 31, 2, %r20
	b	atomic_xchg_start
2:	stbys,e	%r0, 0(%r20)
	nop
	nop
	nop

	/* 16-bit exchange */
3:	ldh	0(%r24), %r20
	copy	%r23, %r20
	depi_safe	0, 31, 2, %r20
	b	atomic_xchg_start
4:	stbys,e	%r0, 0(%r20)
	nop
	nop
	nop

	/* 32-bit exchange */
5:	ldw	0(%r24), %r20
	b	atomic_xchg_start
6:	stbys,e	%r0, 0(%r23)
	nop
	nop
	nop
	nop
	nop

	/* 64-bit exchange */
#ifdef CONFIG_64BIT
7:	ldd	0(%r24), %r20
8:	stdby,e	%r0, 0(%r23)
#else
7:	ldw	0(%r24), %r20
8:	ldw	4(%r24), %r20
	copy	%r23, %r20
	depi_safe	0, 31, 2, %r20
9:	stbys,e	%r0, 0(%r20)
10:	stbys,e	%r0, 4(%r20)
#endif

atomic_xchg_start:
	/* Trigger memory reference interruptions without writing to memory */
	copy	%r26, %r28
	depi_safe	0, 31, 2, %r28
11:	ldw	0(%r28), %r1
12:	stbys,e	%r0, 0(%r28)

	/* Calculate 8-bit hash index from virtual address */
	extru_safe	%r26, 27, 8, %r20

	/* Load start of lock table */
	ldil	L%lws_lock_start, %r28
	ldo	R%lws_lock_start(%r28), %r28

	/* Find lock to use, the hash index is one of 0 to
	   255, multiplied by 16 (keep it 16-byte aligned)
	   and add to the lock table offset. */
	shlw	%r20, 4, %r20
	add	%r20, %r28, %r20

	rsm	PSW_SM_I, %r0			/* Disable interrupts */

	/* Try to acquire the lock */
	LDCW	0(%sr2,%r20), %r28
	comclr,<>	%r0, %r28, %r0
	b,n	lws_wouldblock

	/* Disable page faults to prevent sleeping in critical region */
	lws_pagefault_disable	%r21,%r28

	/* NOTES:
		This all works because intr_do_signal
		and schedule both check the return iasq
		and see that we are on the kernel page
		so this process is never scheduled off
		or is ever sent any signal of any sort,
		thus it is wholly atomic from userspace's
		perspective
	*/

	/* Jump to the correct function */
	blr	%r1, %r0
	/* Set %r28 as non-zero for now */
	ldo	1(%r0),%r28

	/* 8-bit exchange */
14:	ldb	0(%r26), %r1
15:	stb	%r1, 0(%r23)
16:	ldb	0(%r24), %r1
17:	stb	%r1, 0(%r26)
	b	lws_exit_noerror
	copy	%r0, %r28
	nop
	nop

	/* 16-bit exchange */
18:	ldh	0(%r26), %r1
19:	sth	%r1, 0(%r23)
20:	ldh	0(%r24), %r1
21:	sth	%r1, 0(%r26)
	b	lws_exit_noerror
	copy	%r0, %r28
	nop
	nop

	/* 32-bit exchange */
22:	ldw	0(%r26), %r1
23:	stw	%r1, 0(%r23)
24:	ldw	0(%r24), %r1
25:	stw	%r1, 0(%r26)
	b	lws_exit_noerror
	copy	%r0, %r28
	nop
	nop

	/* 64-bit exchange */
#ifdef CONFIG_64BIT
26:	ldd	0(%r26), %r1
27:	std	%r1, 0(%r23)
28:	ldd	0(%r24), %r1
29:	std	%r1, 0(%r26)
#else
26:	flddx	0(%r26), %fr4
27:	fstdx	%fr4, 0(%r23)
28:	flddx	0(%r24), %fr4
29:	fstdx	%fr4, 0(%r26)
#endif
	b	lws_exit_noerror
	copy	%r0, %r28

	/* A fault occurred on load or stbys,e store */
30:	b,n	lws_fault
	ASM_EXCEPTIONTABLE_ENTRY(1b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(2b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(3b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(4b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(5b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(6b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(7b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(8b-linux_gateway_page, 30b-linux_gateway_page)
#ifndef CONFIG_64BIT
	ASM_EXCEPTIONTABLE_ENTRY(9b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(10b-linux_gateway_page, 30b-linux_gateway_page)
#endif
	ASM_EXCEPTIONTABLE_ENTRY(11b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(12b-linux_gateway_page, 30b-linux_gateway_page)

	/* A page fault occurred in critical region */
31:	b,n	lws_pagefault
	ASM_EXCEPTIONTABLE_ENTRY(14b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(15b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(16b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(17b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(18b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(19b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(20b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(21b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(22b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(23b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(24b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(25b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(26b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(27b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(28b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(29b-linux_gateway_page, 31b-linux_gateway_page)
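
The register contract documented for the LWS atomic exchange can be read as a small C model. The sketch below is a user-space illustration only, not the kernel routine: `model_atomic_xchg` and its parameter names are hypothetical, and mapping the rejected-size path to `-ENOSYS` is an assumption based on the `lws_exit_nosys` label. It runs as plain sequential code, whereas the assembly performs the same two copies with the hashed lock held, interrupts off, and page faults disabled.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the exchange contract.
 * size_code follows the %r25 encoding: 0/1/2/3 for 8/16/32/64 bit. */
static int model_atomic_xchg(void *addr, unsigned int size_code,
                             const void *new_val, void *old_val)
{
    size_t len;

    assert(addr != NULL && new_val != NULL && old_val != NULL);

    if (size_code > 3)
        return -ENOSYS;            /* assumed result of the nosys exit */

    len = (size_t)1 << size_code;  /* 1, 2, 4 or 8 bytes */

    /* Same order as the assembly: save the old value, then store the new. */
    memcpy(old_val, addr, len);    /* *old = *addr */
    memcpy(addr, new_val, len);    /* *addr = *new */
    return 0;
}
```

Passing a size code of 2 exchanges a 32-bit word; any size code above 3 is rejected before memory is touched.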

	/***************************************************
		LWS atomic store.

		%r26 - Address to store
		%r25 - Size of the variable (0/1/2/3 for 8/16/32/64 bit)
		%r24 - Address of value to store
		%r28 - Return non-zero on failure
		%r21 - Kernel error code

		%r21 returns the following error codes:
		EAGAIN - CAS is busy, ldcw failed, try again.
		EFAULT - Read or write failed.

		If EAGAIN is returned, %r28 indicates the busy reason:
		r28 == 1 - CAS is busy. lock contended.
		r28 == 2 - CAS is busy. ldcw failed.
		r28 == 3 - CAS is busy. page fault.

		Scratch: r20, r1

	****************************************************/
lws_atomic_store:
#ifdef CONFIG_64BIT
	/* Wide mode user process? */
	bb,<,n	%sp, 31, atomic_store_begin

	/* Clip the input registers for 32-bit processes. We don't
	   need to clip %r23 as we only use it for word operations */
	depdi	0, 31, 32, %r26
	depdi	0, 31, 32, %r25
	depdi	0, 31, 32, %r24
#endif

atomic_store_begin:
	/* Check the validity of the size argument */
	subi,>>= 3, %r25, %r0
	b,n	lws_exit_nosys

	shlw	%r25, 1, %r1
	blr	%r1, %r0
	nop

	/* Perform exception checks */

	/* 8-bit store */
1:	ldb	0(%r24), %r20
	b,n	atomic_store_start
	nop
	nop

	/* 16-bit store */
2:	ldh	0(%r24), %r20
	b,n	atomic_store_start
	nop
	nop

	/* 32-bit store */
3:	ldw	0(%r24), %r20
	b,n	atomic_store_start
	nop
	nop

	/* 64-bit store */
#ifdef CONFIG_64BIT
4:	ldd	0(%r24), %r20
#else
4:	ldw	0(%r24), %r20
5:	ldw	4(%r24), %r20
#endif

atomic_store_start:
	/* Trigger memory reference interruptions without writing to memory */
	copy	%r26, %r28
	depi_safe	0, 31, 2, %r28
6:	ldw	0(%r28), %r1
7:	stbys,e	%r0, 0(%r28)

	/* Calculate 8-bit hash index from virtual address */
	extru_safe	%r26, 27, 8, %r20

	/* Load start of lock table */
	ldil	L%lws_lock_start, %r28
	ldo	R%lws_lock_start(%r28), %r28

	/* Find lock to use, the hash index is one of 0 to
	   255, multiplied by 16 (keep it 16-byte aligned)
	   and add to the lock table offset. */
	shlw	%r20, 4, %r20
	add	%r20, %r28, %r20

	rsm	PSW_SM_I, %r0			/* Disable interrupts */

	/* Try to acquire the lock */
	LDCW	0(%sr2,%r20), %r28
	comclr,<>	%r0, %r28, %r0
	b,n	lws_wouldblock

	/* Disable page faults to prevent sleeping in critical region */
	lws_pagefault_disable	%r21,%r28

	/* NOTES:
		This all works because intr_do_signal
		and schedule both check the return iasq
		and see that we are on the kernel page
		so this process is never scheduled off
		or is ever sent any signal of any sort,
		thus it is wholly atomic from userspace's
		perspective
	*/

	/* Jump to the correct function */
	blr	%r1, %r0
	/* Set %r28 as non-zero for now */
	ldo	1(%r0),%r28

	/* 8-bit store */
9:	ldb	0(%r24), %r1
10:	stb	%r1, 0(%r26)
	b	lws_exit_noerror
	copy	%r0, %r28

	/* 16-bit store */
11:	ldh	0(%r24), %r1
12:	sth	%r1, 0(%r26)
	b	lws_exit_noerror
	copy	%r0, %r28

	/* 32-bit store */
13:	ldw	0(%r24), %r1
14:	stw	%r1, 0(%r26)
	b	lws_exit_noerror
	copy	%r0, %r28

	/* 64-bit store */
#ifdef CONFIG_64BIT
15:	ldd	0(%r24), %r1
16:	std	%r1, 0(%r26)
#else
15:	flddx	0(%r24), %fr4
16:	fstdx	%fr4, 0(%r26)
#endif
	b	lws_exit_noerror
	copy	%r0, %r28

	/* A fault occurred on load or stbys,e store */
30:	b,n	lws_fault
	ASM_EXCEPTIONTABLE_ENTRY(1b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(2b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(3b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(4b-linux_gateway_page, 30b-linux_gateway_page)
#ifndef CONFIG_64BIT
	ASM_EXCEPTIONTABLE_ENTRY(5b-linux_gateway_page, 30b-linux_gateway_page)
#endif
	ASM_EXCEPTIONTABLE_ENTRY(6b-linux_gateway_page, 30b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(7b-linux_gateway_page, 30b-linux_gateway_page)

	/* A page fault occurred in critical region */
31:	b,n	lws_pagefault
	ASM_EXCEPTIONTABLE_ENTRY(9b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(10b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(11b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(12b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(13b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(14b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(15b-linux_gateway_page, 31b-linux_gateway_page)
	ASM_EXCEPTIONTABLE_ENTRY(16b-linux_gateway_page, 31b-linux_gateway_page)

	/* Make sure nothing else is placed on this page */
	.align PAGE_SIZE
END(linux_gateway_page)
ENTRY(end_linux_gateway_page)

	/* Relocate symbols assuming linux_gateway_page is mapped
	   to virtual address 0x0 */

#define LWS_ENTRY(_name_) ASM_ULONG_INSN (lws_##_name_ - linux_gateway_page)

	.section .rodata,"a"

	.align 8
	/* Light-weight-syscall table */
	/* Start of lws table. */
ENTRY(lws_table)
	LWS_ENTRY(compare_and_swap32)		/* 0 - ELF32 Atomic 32bit CAS */
	LWS_ENTRY(compare_and_swap64)		/* 1 - ELF64 Atomic 32bit CAS */
	LWS_ENTRY(compare_and_swap_2)		/* 2 - Atomic 64bit CAS */
	LWS_ENTRY(atomic_xchg)			/* 3 - Atomic Exchange */
	LWS_ENTRY(atomic_store)			/* 4 - Atomic Store */
END(lws_table)
	/* End of lws table */
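
Each `LWS_ENTRY()` slot stores a routine's address relative to the start of the gateway page, which is why the table stays valid wherever the page is mapped. A minimal C sketch of that encoding follows. The helper names `lws_entry_value` and `lws_resolve` are hypothetical, and the out-of-range check is an assumption about what a caller of the table would need, not a transcription of the kernel's dispatch code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value stored in a table slot, as LWS_ENTRY() builds it:
 * routine address minus the start of the gateway page. */
static inline uintptr_t lws_entry_value(uintptr_t routine, uintptr_t page_start)
{
    return routine - page_start;
}

/* Hypothetical lookup: resolve LWS number nr against a table of
 * page-relative offsets. Returns 0 when nr is out of range. */
static inline uintptr_t lws_resolve(const uintptr_t *table, size_t entries,
                                    size_t nr, uintptr_t mapped_base)
{
    assert(table != NULL);
    if (nr >= entries)
        return 0;
    return mapped_base + table[nr];
}
```

With the page mapped at virtual address 0x0, as the relocation comment assumes, the stored offset is already the branch target.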

#ifdef CONFIG_64BIT
#define __SYSCALL_WITH_COMPAT(nr, native, compat)	__SYSCALL(nr, compat)
#else
#define __SYSCALL_WITH_COMPAT(nr, native, compat)	__SYSCALL(nr, native)
#endif
#define __SYSCALL(nr, entry)	ASM_ULONG_INSN entry
	.align 8
ENTRY(sys_call_table)
	.export sys_call_table,data
#include <asm/syscall_table_32.h>    /* 32-bit syscalls */
END(sys_call_table)

#ifdef CONFIG_64BIT
	.align 8
ENTRY(sys_call_table64)
#include <asm/syscall_table_64.h>    /* 64-bit syscalls */
END(sys_call_table64)
#endif

	/*
		All light-weight-syscall atomic operations
		will use this set of locks

		NOTE: The lws_lock_start symbol must be
		at least 16-byte aligned for safe use
		with ldcw.
	*/
	.section .data
	.align	L1_CACHE_BYTES
ENTRY(lws_lock_start)
	/* lws locks */
	.rept 256
	/* Keep locks aligned at 16-bytes */
	.word 1
	.word 0
	.word 0
	.word 0
	.endr
END(lws_lock_start)
	.previous

.end
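
The lock table layout and the hash step used by the LWS routines can be summarised in C. This is a sketch under stated assumptions: the names `lws_lock_offset`, `lws_model_trylock`, and `lws_model_unlock` are hypothetical, and C11 `atomic_exchange` stands in for the `ldcw` load-and-clear instruction. The shift and mask follow from `extru_safe %r26, 27, 8`, which takes the 8-bit field whose rightmost bit is bit 27 in PA-RISC numbering, i.e. address bits 4 through 11.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define LWS_NUM_LOCKS   256u  /* matches ".rept 256" */
#define LWS_LOCK_STRIDE 16u   /* four 32-bit words per lock slot */

/* Byte offset of the lock slot for a user address: address bits 4..11
 * pick one of 256 slots, and each slot is 16 bytes wide. */
static inline uint32_t lws_lock_offset(uintptr_t uaddr)
{
    uint32_t index = (uint32_t)(uaddr >> 4) & (LWS_NUM_LOCKS - 1u);
    return index * LWS_LOCK_STRIDE;
}

/* ldcw-style try-lock: atomically read the word and leave 0 behind.
 * A non-zero old value means the lock was free and is now held. */
static inline int lws_model_trylock(atomic_uint *lock_word)
{
    assert(lock_word != NULL);
    return atomic_exchange(lock_word, 0u) != 0u;
}

/* Release by writing a non-zero value; the sketch uses 1 to match
 * the ".word 1" initializer in the table. */
static inline void lws_model_unlock(atomic_uint *lock_word)
{
    assert(lock_word != NULL);
    atomic_store(lock_word, 1u);
}
```

Because the first word of every slot starts at 1, all 256 locks are initially free, and two addresses that differ only above bit 11 or below bit 4 share a lock.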