2012-03-05 11:49:27 +00:00
/*
 * Low-level CPU initialisation
 * Based on arch/arm/kernel/head.S
 *
 * Copyright (C) 1994-2002 Russell King
 * Copyright (C) 2003-2012 ARM Ltd.
 * Authors:	Catalin Marinas <catalin.marinas@arm.com>
 *		Will Deacon <will.deacon@arm.com>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */
#include <linux/linkage.h>
#include <linux/init.h>
#include <linux/irqchip/arm-gic-v3.h>

#include <asm/assembler.h>
#include <asm/boot.h>
#include <asm/ptrace.h>
#include <asm/asm-offsets.h>
#include <asm/cache.h>
#include <asm/cputype.h>
#include <asm/elf.h>
#include <asm/kernel-pgtable.h>
#include <asm/kvm_arm.h>
#include <asm/memory.h>
#include <asm/pgtable-hwdef.h>
#include <asm/pgtable.h>
#include <asm/page.h>
#include <asm/smp.h>
#include <asm/sysreg.h>
#include <asm/thread_info.h>
#include <asm/virt.h>

#include "efi-header.S"

#define __PHYS_OFFSET	(KERNEL_START - TEXT_OFFSET)

#if (TEXT_OFFSET & 0xfff) != 0
#error TEXT_OFFSET must be at least 4KB aligned
#elif (PAGE_OFFSET & 0x1fffff) != 0
#error PAGE_OFFSET must be at least 2MB aligned
#elif TEXT_OFFSET > 0x1fffff
#error TEXT_OFFSET must be less than 2MB
#endif
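
The three build-time checks on TEXT_OFFSET and PAGE_OFFSET can be restated as an ordinary C function, which may be handy for validating candidate values outside the build. This is a minimal sketch: `check_offsets` and the `offset_check` result codes are hypothetical names used for illustration only and are not part of the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes, one per #error branch of the checks. */
enum offset_check {
	OFFSETS_OK = 0,
	TEXT_OFFSET_NOT_4K_ALIGNED,
	PAGE_OFFSET_NOT_2M_ALIGNED,
	TEXT_OFFSET_TOO_LARGE,
};

/* Mirrors the #if/#elif chain: the first failing test wins. */
static enum offset_check check_offsets(uint64_t text_offset, uint64_t page_offset)
{
	if ((text_offset & 0xfff) != 0)		/* low 12 bits set: not 4 KB aligned */
		return TEXT_OFFSET_NOT_4K_ALIGNED;
	if ((page_offset & 0x1fffff) != 0)	/* low 21 bits set: not 2 MB aligned */
		return PAGE_OFFSET_NOT_2M_ALIGNED;
	if (text_offset > 0x1fffff)		/* must stay below 2 MB */
		return TEXT_OFFSET_TOO_LARGE;
	return OFFSETS_OK;
}
```

Because the chain uses `#elif`, only the first violated condition is reported, and the sketch keeps that ordering.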

/*
 * Kernel startup entry point.
 * ---------------------------
 *
 * The requirements are:
 *   MMU = off, D-cache = off, I-cache = on or off,
 *   x0 = physical address to the FDT blob.
 *
 * This code is mostly position independent so you call this at
 * __pa(PAGE_OFFSET + TEXT_OFFSET).
 *
 * Note that the callee-saved registers are used for storing variables
 * that are useful before the MMU is enabled. The allocations are described
 * in the entry routines.
 */
	__HEAD
_head:
	/*
	 * DO NOT MODIFY. Image header expected by Linux boot-loaders.
	 */
#ifdef CONFIG_EFI
	/*
	 * This add instruction has no meaningful effect except that
	 * its opcode forms the magic "MZ" signature required by UEFI.
	 */
	add	x13, x18, #0x16
	b	stext
#else
	b	stext				// branch to kernel start, magic
	.long	0				// reserved
#endif
	le64sym	_kernel_offset_le		// Image load offset from start of RAM, little-endian
	le64sym	_kernel_size_le			// Effective size of kernel image, little-endian
	le64sym	_kernel_flags_le		// Informative flags, little-endian
	.quad	0				// reserved
	.quad	0				// reserved
	.quad	0				// reserved
	.ascii	"ARM\x64"			// Magic number
#ifdef CONFIG_EFI
	.long	pe_header - _head		// Offset to the PE header.

pe_header:
	__EFI_PE_HEADER
#else
	.long	0				// reserved
#endif
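
Counting the directives at `_head` gives a fixed 64-byte layout: two 32-bit words, three 64-bit fields, three reserved quads, the 4-byte magic and a final 32-bit word. The sketch below expresses that layout as a C struct so a loader-side tool could check it. The struct name, field names and `has_arm64_magic` are assumptions made for illustration; only the sizes, order and the "ARM\x64" magic are taken from the header itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of the 64-byte header emitted at _head. */
struct arm64_image_header {
	uint32_t code0;		/* "add x13, x18, #0x16" (EFI) or "b stext" */
	uint32_t code1;		/* "b stext" (EFI) or reserved zero */
	uint64_t text_offset;	/* _kernel_offset_le */
	uint64_t image_size;	/* _kernel_size_le */
	uint64_t flags;		/* _kernel_flags_le */
	uint64_t res2;		/* reserved */
	uint64_t res3;		/* reserved */
	uint64_t res4;		/* reserved */
	uint8_t  magic[4];	/* "ARM\x64" */
	uint32_t res5;		/* PE header offset (EFI) or reserved zero */
};

/* Compile-time checks that the struct matches the assembled layout. */
static_assert(sizeof(struct arm64_image_header) == 64, "header is 64 bytes");
static_assert(offsetof(struct arm64_image_header, magic) == 56, "magic at offset 56");

/* Returns 1 if the header carries the "ARM\x64" magic, 0 otherwise. */
static int has_arm64_magic(const struct arm64_image_header *h)
{
	return h->magic[0] == 'A' && h->magic[1] == 'R' &&
	       h->magic[2] == 'M' && h->magic[3] == 0x64;
}
```

The multi-byte fields are stored little-endian in the image, so a big-endian host would need to byte-swap them before use; the sketch does not handle that.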

	__INIT

	/*
	 * The following callee saved general purpose registers are used on the
	 * primary lowlevel boot path:
	 *
	 *  Register   Scope                      Purpose
	 *  x21        stext() .. start_kernel()  FDT pointer passed at boot in x0
	 *  x23        stext() .. start_kernel()  physical misalignment/KASLR offset
	 *  x28        __create_page_tables()     callee preserved temp register
	 *  x19/x20    __primary_switch()         callee preserved temp registers
	 */
ENTRY(stext)
	bl	preserve_boot_args
	bl	el2_setup			// Drop to EL1, w0=cpu_boot_mode
	adrp	x23, __PHYS_OFFSET
	and	x23, x23, MIN_KIMG_ALIGN - 1	// KASLR offset, defaults to 0
	bl	set_cpu_boot_mode_flag
	bl	__create_page_tables
	/*
	 * The following calls CPU setup code, see arch/arm64/mm/proc.S for
	 * details.
	 * On return, the CPU will be ready for the MMU to be turned on and
	 * the TCR will have been set.
	 */
	bl	__cpu_setup			// initialise processor
	b	__primary_switch
ENDPROC(stext)

/*
 * Preserve the arguments passed by the bootloader in x0 .. x3
 */
preserve_boot_args:
	mov	x21, x0				// x21=FDT

	adr_l	x0, boot_args			// record the contents of
	stp	x21, x1, [x0]			// x0 .. x3 at kernel entry
	stp	x2, x3, [x0, #16]

	dmb	sy				// needed before dc ivac with
						// MMU off

	mov	x1, #0x20			// 4 x 8 bytes
	b	__inval_dcache_area		// tail call
ENDPROC(preserve_boot_args)

/*
 * Macro to create a table entry to the next page.
 *
 *	tbl:	page table address
 *	virt:	virtual address
 *	shift:	#imm page table shift
 *	ptrs:	#imm pointers per table page
 *
 * Preserves:	virt
arm64: allow ID map to be extended to 52 bits
Currently, when using VA_BITS < 48, if the ID map text happens to be
placed in physical memory above VA_BITS, we increase the VA size (up to
48) and create a new table level, in order to map in the ID map text.
This is okay because the system always supports 48 bits of VA.
This patch extends the code such that if the system supports 52 bits of
VA, and the ID map text is placed that high up, then we increase the VA
size accordingly, up to 52.
One difference from the current implementation is that so far the
condition of VA_BITS < 48 has meant that the top level table is always
"full", with the maximum number of entries, and an extra table level is
always needed. Now, when VA_BITS = 48 (and using 64k pages), the top
level table is not full, and we simply need to increase the number of
entries in it, instead of creating a new table level.
Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
[catalin.marinas@arm.com: reduce arguments to __create_hyp_mappings()]
[catalin.marinas@arm.com: reworked/renamed __cpu_uses_extended_idmap_level()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-12-13 17:07:24 +00:00
 * Corrupts:	ptrs, tmp1, tmp2
 * Returns:	tbl -> next level table page address
 */
	.macro	create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2
	add	\tmp1, \tbl, #PAGE_SIZE
	phys_to_pte \tmp2, \tmp1
	orr	\tmp2, \tmp2, #PMD_TYPE_TABLE	// address of next table and entry type
	lsr	\tmp1, \virt, #\shift
	sub	\ptrs, \ptrs, #1
	and	\tmp1, \tmp1, \ptrs		// table index
	str	\tmp2, [\tbl, \tmp1, lsl #3]
	add	\tbl, \tbl, #PAGE_SIZE		// next level table page
	.endm
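
The arithmetic performed by `create_table_entry` can be modelled in C as follows. This is a sketch under stated assumptions rather than the kernel's implementation: a 4 KB page size, table-descriptor type bits of 3, and `phys_to_pte` treated as a plain copy. `make_table_entry`, `struct table_entry` and the `SKETCH_*` constants are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumption: 4 KB granule */
#define SKETCH_TYPE_TABLE	3u	/* assumption: table descriptor type bits */

struct table_entry {
	uint64_t index;	/* slot within the current table */
	uint64_t desc;	/* descriptor value stored in that slot */
	uint64_t next;	/* address of the next-level table page */
};

/* Computes what the macro stores and where, without touching memory. */
static struct table_entry make_table_entry(uint64_t tbl, uint64_t virt,
					   unsigned shift, uint64_t ptrs)
{
	struct table_entry e;

	/* The "& (ptrs - 1)" mask only works for a power-of-two table size. */
	assert(ptrs != 0 && (ptrs & (ptrs - 1)) == 0);

	e.next  = tbl + SKETCH_PAGE_SIZE;	/* next table sits one page on */
	e.desc  = e.next | SKETCH_TYPE_TABLE;	/* address plus entry type */
	e.index = (virt >> shift) & (ptrs - 1);	/* table index */
	return e;
}
```

The macro then stores `desc` at `tbl + index * 8` and leaves `tbl` pointing at `next`.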

/*
 * Macro to populate page table entries, these entries can be pointers to the next level
 * or last level entries pointing to physical memory.
 *
 *	tbl:	page table address
 *	rtbl:	pointer to page table or physical memory
 *	index:	start index to write
 *	eindex:	end index to write - [index, eindex] written to
 *	flags:	flags for page table entry to or in
 *	inc:	increment to rtbl between each entry
 *	tmp1:	temporary variable
 *
 * Preserves:	tbl, eindex, flags, inc
 * Corrupts:	index, tmp1
 * Returns:	rtbl
 */
	.macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1
.Lpe\@:	phys_to_pte \tmp1, \rtbl
	orr	\tmp1, \tmp1, \flags	// tmp1 = table entry
	str	\tmp1, [\tbl, \index, lsl #3]
	add	\rtbl, \rtbl, \inc	// rtbl = pa next level
	add	\index, \index, #1
	cmp	\index, \eindex
	b.ls	.Lpe\@
	.endm
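
The loop in `populate_entries` is a do-while over an inclusive index range. The C sketch below models it on an in-memory array, assuming `phys_to_pte` is a plain copy; the C function name simply reuses the macro's name for readability and is not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fills tbl[index..eindex] (inclusive) with rtbl | flags, stepping rtbl by
 * inc after each store, and returns the final rtbl as the macro does.
 * Like the assembly, the body runs at least once even if index > eindex.
 */
static uint64_t populate_entries(uint64_t *tbl, uint64_t rtbl, uint64_t index,
				 uint64_t eindex, uint64_t flags, uint64_t inc)
{
	do {
		tbl[index] = rtbl | flags;	/* phys_to_pte assumed to be a plain copy */
		rtbl += inc;			/* advance to the next target address */
		index++;
	} while (index <= eindex);		/* b.ls: unsigned lower-or-same */
	return rtbl;
}
```

For example, filling slots 2 to 4 from address 0x1000 with flags 3 and a 0x1000 increment writes 0x1003, 0x2003 and 0x3003 and returns 0x4000.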

/*
 * Compute indices of table entries from virtual address range. If multiple entries
 * were needed in the previous page table level then the next page table level is assumed
 * to be composed of multiple pages. (This effectively scales the end index).
 *
 *	vstart:	virtual address of start of range
 *	vend:	virtual address of end of range
 *	shift:	shift used to transform virtual address into index
 *	ptrs:	number of entries in page table
 *	istart:	index in table corresponding to vstart
 *	iend:	index in table corresponding to vend
 *	count:	On entry: how many extra entries were required in previous level, scales
 *			  our end index.
 *		On exit: returns how many extra entries required for next page table level
 *
 * Preserves:	vstart, vend, shift, ptrs
 * Returns:	istart, iend, count
 */
	.macro compute_indices, vstart, vend, shift, ptrs, istart, iend, count
	lsr	\iend, \vend, \shift
	mov	\istart, \ptrs
	sub	\istart, \istart, #1
	and	\iend, \iend, \istart	// iend = (vend >> shift) & (ptrs - 1)
	mov	\istart, \ptrs
	mul	\istart, \istart, \count
	add	\iend, \iend, \istart	// iend += count * ptrs
					// our entries span multiple tables

	lsr	\istart, \vstart, \shift
	mov	\count, \ptrs
	sub	\count, \count, #1
	and	\istart, \istart, \count

	sub	\count, \iend, \istart
	.endm
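
A C model of `compute_indices` makes the scaling by `count` easier to follow. This is an illustrative sketch only; `struct indices` is a hypothetical type, and the function reuses the macro's name purely for readability.

```c
#include <assert.h>
#include <stdint.h>

struct indices {
	uint64_t istart;	/* index corresponding to vstart */
	uint64_t iend;		/* index corresponding to vend, scaled by count */
	uint64_t count;		/* extra entries needed at the next level */
};

/* count is the number of extra entries the previous level required. */
static struct indices compute_indices(uint64_t vstart, uint64_t vend,
				      unsigned shift, uint64_t ptrs, uint64_t count)
{
	struct indices r;

	/* The index mask relies on ptrs being a power of two. */
	assert(ptrs != 0 && (ptrs & (ptrs - 1)) == 0);

	r.iend   = ((vend >> shift) & (ptrs - 1)) + ptrs * count;
	r.istart = (vstart >> shift) & (ptrs - 1);
	r.count  = r.iend - r.istart;
	return r;
}
```

As a worked case, mapping the 4 MB range [0x40000000, 0x403fffff] with shift 21 and 512 entries gives istart 0, iend 1 and count 1. Feeding that count into the next level (shift 12) gives iend 511 + 512 = 1023, so the range spans two consecutive table pages.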

/*
 * Map memory for specified virtual address range. Each level of page table needed supports
 * multiple entries. If a level requires n entries the next page table level is assumed to be
 * formed from n pages.
 *
 *	tbl:	location of page table
 *	rtbl:	address to be used for first level page table entry (typically tbl + PAGE_SIZE)
 *	vstart:	start address to map
 *	vend:	end address to map - we map [vstart, vend]
 *	flags:	flags to use to map last level entries
 *	phys:	physical address corresponding to vstart - physical memory is contiguous
 *	pgds:	the number of pgd entries
 *
 * Temporaries:	istart, iend, tmp, count, sv - these need to be different registers
 * Preserves:	vstart, vend, flags
 * Corrupts:	tbl, rtbl, istart, iend, tmp, count, sv
 */
	.macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, tmp, count, sv
	add \rtbl, \tbl, #PAGE_SIZE
	mov \sv, \rtbl
	mov \count, #0
	compute_indices \vstart, \vend, #PGDIR_SHIFT, \pgds, \istart, \iend, \count
	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
	mov \tbl, \sv
	mov \sv, \rtbl

#if SWAPPER_PGTABLE_LEVELS > 3
	compute_indices \vstart, \vend, #PUD_SHIFT, #PTRS_PER_PUD, \istart, \iend, \count
	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
	mov \tbl, \sv
	mov \sv, \rtbl
#endif

#if SWAPPER_PGTABLE_LEVELS > 2
	compute_indices \vstart, \vend, #SWAPPER_TABLE_SHIFT, #PTRS_PER_PMD, \istart, \iend, \count
	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
	mov \tbl, \sv
#endif

	compute_indices \vstart, \vend, #SWAPPER_BLOCK_SHIFT, #PTRS_PER_PTE, \istart, \iend, \count
	bic \count, \phys, #SWAPPER_BLOCK_SIZE - 1
	populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp
	.endm

/*
 * Setup the initial page tables. We only setup the barest amount which is
 * required to get the kernel running. The following sections are required:
 *   - identity mapping to enable the MMU (low address, TTBR0)
 *   - first few MB of the kernel linear mapping to jump to once the MMU has
 *     been enabled
 */
__create_page_tables:
arm64: add support for kernel ASLR
This adds support for KASLR, based on entropy provided by
the bootloader in the /chosen/kaslr-seed DT property. Depending on the size
of the address space (VA_BITS) and the page size, the entropy in the
virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all
4 levels), with the sidenote that displacements that result in the kernel
image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB
granule kernels, respectively) are not allowed, and will be rounded up to
an acceptable value.
If CONFIG_RANDOMIZE_MODULE_REGION_FULL is enabled, the module region is
randomized independently from the core kernel. This makes it less likely
that the location of core kernel data structures can be determined by an
adversary, but causes all function calls from modules into the core kernel
to be resolved via entries in the module PLTs.
If CONFIG_RANDOMIZE_MODULE_REGION_FULL is not enabled, the module region is
randomized by choosing a page aligned 128 MB region inside the interval
[_etext - 128 MB, _stext + 128 MB). This gives between 10 and 14 bits of
entropy (depending on page size), independently of the kernel randomization,
but still guarantees that modules are within the range of relative branch
and jump instructions (with the caveat that, since the module region is
shared with other uses of the vmalloc area, modules may need to be loaded
further away if the module region is exhausted)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-01-26 14:12:01 +01:00
	mov	x28, lr

	/*
	 * Invalidate the init page tables to avoid potential dirty cache lines
	 * being evicted. Other page tables are allocated in rodata as part of
	 * the kernel image, and thus are clean to the PoC per the boot
	 * protocol.
	 */
	adrp	x0, init_pg_dir
arm64/mm: Separate boot-time page tables from swapper_pg_dir
Since the address of swapper_pg_dir is fixed for a given kernel image,
it is an attractive target for manipulation via an arbitrary write. To
mitigate this we'd like to make it read-only by moving it into the
rodata section.
We require that swapper_pg_dir is at a fixed offset from tramp_pg_dir
and reserved_ttbr0, so these will also need to move into rodata.
However, swapper_pg_dir is allocated along with some transient page
tables used for boot which we do not want to move into rodata.
As a step towards this, this patch separates the boot-time page tables
into a new init_pg_dir, and reduces swapper_pg_dir to the single page it
needs to be. This allows us to retain the relationship between
swapper_pg_dir, tramp_pg_dir, and reserved_ttbr0, while cleanly
separating these from the boot-time page tables.
The init_pg_dir holds all of the pgd/pud/pmd/pte levels needed during
boot, and all of these levels will be freed when we switch to the
swapper_pg_dir, which is initialized by the existing code in
paging_init(). Since we start off on the init_pg_dir, we no longer need
to allocate a transient page table in paging_init() in order to ensure
that swapper_pg_dir isn't live while we initialize it.
There should be no functional change as a result of this patch.
Signed-off-by: Jun Yao <yaojun8558363@gmail.com>
Reviewed-by: James Morse <james.morse@arm.com>
[Mark: place init_pg_dir after BSS, fold mm changes, commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-24 15:47:49 +01:00
	adrp	x1, init_pg_end
	sub	x1, x1, x0
	bl	__inval_dcache_area

	/*
	 * Clear the init page tables.
	 */
	adrp	x0, init_pg_dir
	adrp	x1, init_pg_end
	sub	x1, x1, x0
1:	stp	xzr, xzr, [x0], #16
	stp	xzr, xzr, [x0], #16
	stp	xzr, xzr, [x0], #16
	stp	xzr, xzr, [x0], #16
	subs	x1, x1, #64
	b.ne	1b

	mov	x7, SWAPPER_MM_MMUFLAGS

	/*
	 * Create the identity mapping.
	 */
	adrp	x0, idmap_pg_dir
	adrp	x3, __idmap_text_start		// __pa(__idmap_text_start)
arm64: mm: increase VA range of identity map
The page size and the number of translation levels, and hence the supported
virtual address range, are build-time configurables on arm64 whose optimal
values are use case dependent. However, in the current implementation, if
the system's RAM is located at a very high offset, the virtual address range
needs to reflect that merely because the identity mapping, which is only used
to enable or disable the MMU, requires the extended virtual range to map the
physical memory at an equal virtual offset.
This patch relaxes that requirement, by increasing the number of translation
levels for the identity mapping only, and only when actually needed, i.e.,
when system RAM's offset is found to be out of reach at runtime.
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-03-19 16:42:27 +00:00
	/*
	 * VA_BITS may be too small to allow for an ID mapping to be created
	 * that covers system RAM if that is located sufficiently high in the
	 * physical address space. So for the ID map, use an extended virtual
	 * range in that case, and configure an additional translation level
	 * if needed.
	 *
	 * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
	 * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
	 * this number conveniently equals the number of leading zeroes in
	 * the physical address of __idmap_text_end.
	 */
	adrp	x5, __idmap_text_end
	clz	x5, x5
	cmp	x5, TCR_T0SZ(VA_BITS)	// default T0SZ small enough?
arm64: allow ID map to be extended to 52 bits
Currently, when using VA_BITS < 48, if the ID map text happens to be
placed in physical memory above VA_BITS, we increase the VA size (up to
48) and create a new table level, in order to map in the ID map text.
This is okay because the system always supports 48 bits of VA.
This patch extends the code such that if the system supports 52 bits of
VA, and the ID map text is placed that high up, then we increase the VA
size accordingly, up to 52.
One difference from the current implementation is that so far the
condition of VA_BITS < 48 has meant that the top level table is always
"full", with the maximum number of entries, and an extra table level is
always needed. Now, when VA_BITS = 48 (and using 64k pages), the top
level table is not full, and we simply need to increase the number of
entries in it, instead of creating a new table level.
Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
[catalin.marinas@arm.com: reduce arguments to __create_hyp_mappings()]
[catalin.marinas@arm.com: reworked/renamed __cpu_uses_extended_idmap_level()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
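The T0SZ check that this range extension hinges on can be sketched in C. This is only an illustration under stated assumptions: `clz64`, `default_t0sz`, and `idmap_t0sz_for` are hypothetical names, not kernel functions, and the sketch models only the comparison made by the `clz`/`cmp`/`b.ge` sequence, not the page table updates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: count the leading zero bits of a 64-bit value. */
static unsigned int clz64(uint64_t v)
{
	unsigned int n = 0;

	if (v == 0)
		return 64;
	while (!(v & (UINT64_C(1) << 63))) {
		v <<= 1;
		n++;
	}
	return n;
}

/* T0SZ == 64 - (number of VA bits in use), as in TCR_T0SZ(VA_BITS). */
static unsigned int default_t0sz(unsigned int va_bits)
{
	assert(va_bits >= 1 && va_bits <= 64);
	return 64 - va_bits;
}

/*
 * Pick the T0SZ for the ID map: keep the default when the end of the
 * ID map text is already reachable, otherwise use the smaller value
 * given by the leading-zero count of its physical address.
 */
static unsigned int idmap_t0sz_for(uint64_t idmap_text_end_pa,
				   unsigned int va_bits)
{
	unsigned int needed = clz64(idmap_text_end_pa);
	unsigned int dflt = default_t0sz(va_bits);

	return needed >= dflt ? dflt : needed;
}
```

With `va_bits = 39`, a physical address just above 2 GB keeps the default T0SZ of 25, while an address at 1 TB needs a T0SZ of 23 and therefore a wider ID map.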
2017-12-13 17:07:24 +00:00
	b.ge	1f			// .. then skip VA range extension
2015-03-19 16:42:27 +00:00
2015-03-24 15:10:21 +00:00
	adr_l	x6, idmap_t0sz
	str	x5, [x6]
	dmb	sy
	dc	ivac, x6		// Invalidate potentially stale cache line
2015-03-19 16:42:27 +00:00
2017-12-13 17:07:24 +00:00
#if (VA_BITS < 48)
#define EXTRA_SHIFT	(PGDIR_SHIFT + PAGE_SHIFT - 3)
#define EXTRA_PTRS	(1 << (PHYS_MASK_SHIFT - EXTRA_SHIFT))
	/*
	 * If VA_BITS < 48, we have to configure an additional table level.
	 * First, we have to verify our assumption that the current value of
	 * VA_BITS was chosen such that all translation levels are fully
	 * utilised, and that lowering T0SZ will always result in an additional
	 * translation level to be configured.
	 */
#if VA_BITS != EXTRA_SHIFT
#error "Mismatch between VA_BITS and page size/number of translation levels"
2015-03-19 16:42:27 +00:00
#endif
2017-12-13 17:07:24 +00:00
	mov	x4, EXTRA_PTRS
	create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6
#else
	/*
	 * If VA_BITS == 48, we don't have to configure an additional
	 * translation level, but the top-level table has more entries.
	 */
	mov	x4, #1 << (PHYS_MASK_SHIFT - PGDIR_SHIFT)
	str_l	x4, idmap_ptrs_per_pgd, x5
#endif
1:
	ldr_l	x4, idmap_ptrs_per_pgd
2015-06-01 13:40:33 +02:00
	mov	x5, x3				// __pa(__idmap_text_start)
	adr_l	x6, __idmap_text_end		// __pa(__idmap_text_end)
2018-01-11 10:11:59 +00:00
	map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x13, x14
2014-11-21 13:50:41 -08:00
	/*
	 * Map the kernel image (starting with PHYS_OFFSET).
	 */
arm64/mm: Separate boot-time page tables from swapper_pg_dir
Since the address of swapper_pg_dir is fixed for a given kernel image,
it is an attractive target for manipulation via an arbitrary write. To
mitigate this we'd like to make it read-only by moving it into the
rodata section.
We require that swapper_pg_dir is at a fixed offset from tramp_pg_dir
and reserved_ttbr0, so these will also need to move into rodata.
However, swapper_pg_dir is allocated along with some transient page
tables used for boot which we do not want to move into rodata.
As a step towards this, this patch separates the boot-time page tables
into a new init_pg_dir, and reduces swapper_pg_dir to the single page it
needs to be. This allows us to retain the relationship between
swapper_pg_dir, tramp_pg_dir, and reserved_ttbr0, while cleanly
separating these from the boot-time page tables.
The init_pg_dir holds all of the pgd/pud/pmd/pte levels needed during
boot, and all of these levels will be freed when we switch to the
swapper_pg_dir, which is initialized by the existing code in
paging_init(). Since we start off on the init_pg_dir, we no longer need
to allocate a transient page table in paging_init() in order to ensure
that swapper_pg_dir isn't live while we initialize it.
There should be no functional change as a result of this patch.
Signed-off-by: Jun Yao <yaojun8558363@gmail.com>
Reviewed-by: James Morse <james.morse@arm.com>
[Mark: place init_pg_dir after BSS, fold mm changes, commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-24 15:47:49 +01:00
	adrp	x0, init_pg_dir
arm64: don't map TEXT_OFFSET bytes below the kernel if we can avoid it
For historical reasons, the kernel Image must be loaded into physical
memory at a 512 KB offset above a 2 MB aligned base address. The region
between the base address and the start of the kernel Image has no
significance to the kernel itself, but it is currently mapped explicitly
into the early kernel VMA range for all translation granules.
In some cases (i.e., 4 KB granule), this is unavoidable, due to the 2 MB
granularity of the early kernel mappings. However, in other cases, e.g.,
when running with larger page sizes, or in the future, with more granular
KASLR, there is no reason to map it explicitly like we do currently.
So update the logic so that the region is mapped only if that happens as
a side effect of rounding the start address of the kernel to swapper block
size, and leave it unmapped otherwise.
Since the symbol kernel_img_size now simply resolves to the memory
footprint of the kernel Image, we can drop its definition from image.h
and opencode its calculation.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
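The rounding side effect described in this change can be illustrated with a short C sketch. It assumes the 512 KB offset and 2 MB base alignment stated in the commit message; the 64 KB block size used for comparison is a hypothetical value for a larger-granule configuration, and `round_down_pow2` and `bytes_mapped_below` are illustrative names rather than kernel helpers.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M_SKETCH		UINT64_C(0x200000)
#define TEXT_OFFSET_SKETCH	UINT64_C(0x80000)	/* 512 KB, per the commit message */

/* Round addr down to a power-of-two block size. */
static uint64_t round_down_pow2(uint64_t addr, uint64_t block)
{
	assert(block != 0 && (block & (block - 1)) == 0);
	return addr & ~(block - 1);
}

/*
 * Number of bytes below the Image start that end up mapped purely
 * because the start address is rounded down to the block size.
 */
static uint64_t bytes_mapped_below(uint64_t base_2m_aligned, uint64_t block)
{
	uint64_t start = base_2m_aligned + TEXT_OFFSET_SKETCH;

	return start - round_down_pow2(start, block);
}
```

With a 2 MB block the full 512 KB below the Image is still covered, whereas with a 64 KB block the start address is already aligned and nothing below it is mapped.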
2016-04-18 17:09:46 +02:00
	mov_q	x5, KIMAGE_VADDR + TEXT_OFFSET	// compile time __va(_text)
arm64: add support for kernel ASLR
This adds support for KASLR, based on entropy provided by
the bootloader in the /chosen/kaslr-seed DT property. Depending on the size
of the address space (VA_BITS) and the page size, the entropy in the
virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all
4 levels), with the sidenote that displacements that result in the kernel
image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB
granule kernels, respectively) are not allowed, and will be rounded up to
an acceptable value.
If CONFIG_RANDOMIZE_MODULE_REGION_FULL is enabled, the module region is
randomized independently from the core kernel. This makes it less likely
that the location of core kernel data structures can be determined by an
adversary, but causes all function calls from modules into the core kernel
to be resolved via entries in the module PLTs.
If CONFIG_RANDOMIZE_MODULE_REGION_FULL is not enabled, the module region is
randomized by choosing a page aligned 128 MB region inside the interval
[_etext - 128 MB, _stext + 128 MB). This gives between 10 and 14 bits of
entropy (depending on page size), independently of the kernel randomization,
but still guarantees that modules are within the range of relative branch
and jump instructions (with the caveat that, since the module region is
shared with other uses of the vmalloc area, modules may need to be loaded
further away if the module region is exhausted)
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-01-26 14:12:01 +01:00
	add	x5, x5, x23			// add KASLR displacement
2017-12-13 17:07:24 +00:00
	mov	x4, PTRS_PER_PGD
2016-04-18 17:09:46 +02:00
	adrp	x6, _end			// runtime __pa(_end)
	adrp	x3, _text			// runtime __pa(_text)
	sub	x6, x6, x3			// _end - _text
	add	x6, x6, x5			// runtime __va(_end)
2018-01-11 10:11:59 +00:00
	map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x13, x14
2014-11-21 13:50:41 -08:00
	/*
	 * Since the page tables have been populated with non-cacheable
	 * accesses (MMU disabled), invalidate the idmap and swapper page
	 * tables again to remove any speculatively loaded cache lines.
	 */
2016-08-16 21:02:32 +02:00
	adrp	x0, idmap_pg_dir
2018-09-24 15:47:49 +01:00
	adrp	x1, init_pg_end
2018-01-11 10:11:59 +00:00
	sub	x1, x1, x0
2015-03-24 13:50:27 +00:00
	dmb	sy
2017-07-25 11:55:39 +01:00
	bl	__inval_dcache_area
2014-11-21 13:50:41 -08:00
2016-01-26 14:12:01 +01:00
	ret	x28
2014-11-21 13:50:41 -08:00
ENDPROC(__create_page_tables)
	.ltorg
/*
2015-03-04 11:51:48 +01:00
 * The following fragment of code is executed with the MMU enabled.
2016-08-31 12:05:15 +01:00
 *
 *   x0 = __PHYS_OFFSET
2014-11-21 13:50:41 -08:00
 */
2016-04-18 17:09:43 +02:00
__primary_switched:
2016-08-31 12:05:16 +01:00
	adrp	x4, init_thread_union
	add	sp, x4, #THREAD_SIZE
arm64: split thread_info from task stack
This patch moves arm64's struct thread_info from the task stack into
task_struct. This protects thread_info from corruption in the case of
stack overflows, and makes its address harder to determine if stack
addresses are leaked, making a number of attacks more difficult. Precise
detection and handling of overflow is left for subsequent patches.
Largely, this involves changing code to store the task_struct in sp_el0,
and acquire the thread_info from the task struct. Core code now
implements current_thread_info(), and as noted in <linux/sched.h> this
relies on offsetof(task_struct, thread_info) == 0, enforced by core
code.
This change means that the 'tsk' register used in entry.S now points to
a task_struct, rather than a thread_info as it used to. To make this
clear, the TI_* field offsets are renamed to TSK_TI_*, with asm-offsets
appropriately updated to account for the structural change.
Userspace clobbers sp_el0, and we can no longer restore this from the
stack. Instead, the current task is cached in a per-cpu variable that we
can safely access from early assembly as interrupts are disabled (and we
are thus not preemptible).
Both secondary entry and idle are updated to stash the sp and task
pointer separately.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
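The layout requirement mentioned in this commit message, offsetof(task_struct, thread_info) == 0, can be illustrated with a small C sketch. The structures below are heavily simplified stand-ins, and `task_thread_info_sketch` is a hypothetical name; the real kernel definitions contain many more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct thread_info. */
struct thread_info {
	unsigned long flags;
};

/* Simplified stand-in for task_struct; thread_info must stay first. */
struct task_struct {
	struct thread_info thread_info;
	int pid;
};

/* Compile-time check of the layout assumption the commit relies on. */
static_assert(offsetof(struct task_struct, thread_info) == 0,
	      "thread_info must be the first member of task_struct");

/*
 * Because thread_info sits at offset 0, a pointer to the task (the
 * value kept in sp_el0) can be reinterpreted as a thread_info pointer.
 */
static struct thread_info *task_thread_info_sketch(struct task_struct *tsk)
{
	return (struct thread_info *)tsk;
}
```

This is why a single register holding the task pointer is enough to reach both structures.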
2016-11-03 20:23:13 +00:00
	adr_l	x5, init_task
	msr	sp_el0, x5			// Save thread_info
2016-08-31 12:05:16 +01:00
2015-12-26 12:46:40 +01:00
	adr_l	x8, vectors			// load VBAR_EL1 with virtual
	msr	vbar_el1, x8			// vector table address
	isb
2016-08-31 12:05:16 +01:00
	stp	xzr, x30, [sp, #-16]!
	mov	x29, sp
2016-08-31 12:05:15 +01:00
	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
	ldr_l	x4, kimage_vaddr		// Save the offset between
	sub	x4, x4, x0			// the kernel virtual and
	str_l	x4, kimage_voffset, x5		// physical mappings
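The offset saved by the three instructions above can be modelled in C as follows. This is a sketch only: the function names are hypothetical, and the addresses used when exercising it are made-up examples rather than values taken from any particular configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Offset between the kernel's virtual and physical mappings. */
static uint64_t compute_voffset(uint64_t kimage_vaddr, uint64_t phys_offset)
{
	/* Unsigned subtraction; the result is well defined even if it wraps. */
	return kimage_vaddr - phys_offset;
}

/* Translate a kernel image virtual address to a physical address. */
static uint64_t kimg_to_phys(uint64_t va, uint64_t voffset)
{
	return va - voffset;
}

/* Translate a physical address inside the image back to its virtual address. */
static uint64_t phys_to_kimg(uint64_t pa, uint64_t voffset)
{
	return pa + voffset;
}
```

Storing the difference once means later translations between the two views of the image need only a single addition or subtraction.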
2016-01-06 11:05:27 +00:00
	// Clear BSS
	adr_l	x0, __bss_start
	mov	x1, xzr
	adr_l	x2, __bss_stop
	sub	x2, x2, x0
	bl	__pi_memset
arm64: mm: place empty_zero_page in bss
Currently the zero page is set up in paging_init, and thus we cannot use
the zero page earlier. We use the zero page as a reserved TTBR value
from which no TLB entries may be allocated (e.g. when uninstalling the
idmap). To enable such usage earlier (as may be required for invasive
changes to the kernel page tables), and to minimise the time that the
idmap is active, we need to be able to use the zero page before
paging_init.
This patch follows the example set by x86, by allocating the zero page
at compile time, in .bss. This means that the zero page itself is
available immediately upon entry to start_kernel (as we zero .bss before
this), and also means that the zero page takes up no space in the raw
Image binary. The associated struct page is allocated in bootmem_init,
and remains unavailable until this time.
Outside of arch code, the only users of empty_zero_page assume that the
empty_zero_page symbol refers to the zeroed memory itself, and that
ZERO_PAGE(x) must be used to acquire the associated struct page,
following the example of x86. This patch also brings arm64 inline with
these assumptions.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-01-25 11:44:57 +00:00
	dsb	ishst				// Make zero page visible to PTW
2016-01-06 11:05:27 +00:00
2015-10-12 18:52:58 +03:00
#ifdef CONFIG_KASAN
	bl	kasan_early_init
2016-01-26 14:12:01 +01:00
#endif
#ifdef CONFIG_RANDOMIZE_BASE
2016-04-18 17:09:47 +02:00
	tst	x23, ~(MIN_KIMG_ALIGN - 1)	// already running randomized?
	b.ne	0f
2016-01-26 14:12:01 +01:00
	mov	x0, x21				// pass FDT address in x0
	bl	kaslr_early_init		// parse FDT for KASLR options
	cbz	x0, 0f				// KASLR disabled? just proceed
2016-04-18 17:09:47 +02:00
	orr	x23, x23, x0			// record KASLR offset
2016-08-31 12:05:16 +01:00
	ldp	x29, x30, [sp], #16		// we must enable KASLR, return
	ret					// to __primary_switch()
2016-01-26 14:12:01 +01:00
0:
2015-10-12 18:52:58 +03:00
#endif
arm64: unwind: reference pt_regs via embedded stack frame
As it turns out, the unwind code is slightly broken, and probably has
been for a while. The problem is in the dumping of the exception stack,
which is intended to dump the contents of the pt_regs struct at each
level in the call stack where an exception was taken and routed to a
routine marked as __exception (which means its stack frame is right
below the pt_regs struct on the stack).
'Right below the pt_regs struct' is ill defined, though: the unwind
code assigns 'frame pointer + 0x10' to the .sp member of the stackframe
struct at each level, and dump_backtrace() happily dereferences that as
the pt_regs pointer when encountering an __exception routine. However,
the actual size of the stack frame created by this routine (which could
be one of many __exception routines we have in the kernel) is not known,
and so frame.sp is pretty useless to figure out where struct pt_regs
really is.
So it seems the only way to ensure that we can find our struct pt_regs
when walking the stack frames is to put it at a known fixed offset of
the stack frame pointer that is passed to such __exception routines.
The simplest way to do that is to put it inside pt_regs itself, which is
the main change implemented by this patch. As a bonus, doing this allows
us to get rid of a fair amount of cruft related to walking from one stack
to the other, which is especially nice since we intend to introduce yet
another stack for overflow handling once we add support for vmapped
stacks. It also fixes an inconsistency where we only add a stack frame
pointing to ELR_EL1 if we are executing from the IRQ stack but not when
we are executing from the task stack.
To consistently identify exception regs even in the presence of exceptions
taken from entry code, we must check whether the next frame was created
by entry text, rather than whether the current frame was created by
exception text.
To avoid backtracing using PCs that fall in the idmap, or are controlled
by userspace, we must explicitly zero the FP and LR in startup paths, and
must ensure that the frame embedded in pt_regs is zeroed upon entry from
EL0. To avoid these NULL entries showing in the backtrace, unwind_frame()
is updated to avoid them.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: compare current frame against .entry.text, avoid bogus PCs]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
2017-07-22 18:45:33 +01:00
	add	sp, sp, #16
	mov	x29, #0
	mov	x30, #0
2014-11-21 13:50:41 -08:00
	b	start_kernel
2016-04-18 17:09:43 +02:00
ENDPROC(__primary_switched)
2014-11-21 13:50:41 -08:00
/*
 * end early head section, begin head code that is also used for
 * hotplug and needs to have the same protections as the text region
 */
2018-01-29 12:00:00 +00:00
	.section ".idmap.text","awx"
2016-01-26 14:12:01 +01:00
ENTRY(kimage_vaddr)
	.quad		_text - TEXT_OFFSET
2012-03-05 11:49:27 +00:00
/*
 * If we're fortunate enough to boot at EL2, ensure that the world is
 * sane before dropping to EL1.
2013-10-11 14:52:16 +01:00
 *
2017-01-09 14:31:55 +00:00
 * Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in w0 if
2013-10-11 14:52:16 +01:00
 * booted in EL1 or EL2 respectively.
2012-03-05 11:49:27 +00:00
 */
ENTRY(el2_setup)
2017-09-26 15:57:16 +01:00
	msr	SPsel, #1			// We want to use SP_EL{1,2}
2012-03-05 11:49:27 +00:00
	mrs	x0, CurrentEL
2014-06-06 14:16:21 +01:00
	cmp	x0, #CurrentEL_EL2
2017-02-15 14:54:16 +00:00
	b.eq	1f
2018-01-15 19:38:55 +00:00
	mov_q	x0, (SCTLR_EL1_RES1 | ENDIAN_SET_EL1)
2013-10-11 14:52:17 +01:00
	msr	sctlr_el1, x0
2016-08-31 12:05:12 +01:00
	mov	w0, #BOOT_CPU_MODE_EL1		// This cpu booted in EL1
2013-10-11 14:52:17 +01:00
	isb
2012-03-05 11:49:27 +00:00
	ret
2018-01-15 19:38:55 +00:00
1:	mov_q	x0, (SCTLR_EL2_RES1 | ENDIAN_SET_EL2)
2017-02-15 14:54:16 +00:00
	msr	sctlr_el2, x0
2014-02-19 09:33:14 +00:00
#ifdef CONFIG_ARM64_VHE
	/*
	 * Check for VHE being present. For the rest of the EL2 setup,
	 * x2 being non-zero indicates that we do have VHE, and that the
	 * kernel is intended to run at EL2.
	 */
	mrs	x2, id_aa64mmfr1_el1
	ubfx	x2, x2, #8, #4
#else
	mov	x2, xzr
#endif
2012-03-05 11:49:27 +00:00
	/* Hyp configuration. */
2014-02-19 09:33:14 +00:00
	mov	x0, #HCR_RW			// 64-bit EL1
	cbz	x2, set_hcr
	orr	x0, x0, #HCR_TGE		// Enable Host Extensions
	orr	x0, x0, #HCR_E2H
set_hcr:
2012-03-05 11:49:27 +00:00
	msr	hcr_el2, x0
2014-02-19 09:33:14 +00:00
	isb
2012-03-05 11:49:27 +00:00
2016-11-28 21:13:02 -05:00
	/*
	 * Allow Non-secure EL1 and EL0 to access physical timer and counter.
	 * This is not necessary for VHE, since the host kernel runs in EL2,
	 * and EL0 accesses are configured in the later stage of boot process.
	 * Note that when HCR_EL2.E2H == 1, CNTHCTL_EL2 has the same bit layout
	 * as CNTKCTL_EL1, and CNTKCTL_EL1 accessing instructions are redefined
	 * to access CNTHCTL_EL2. This allows the kernel designed to run at EL1
	 * to transparently mess with the EL0 bits via CNTKCTL_EL1 access in
	 * EL2.
	 */
	cbnz	x2, 1f
2012-03-05 11:49:27 +00:00
	mrs	x0, cnthctl_el2
	orr	x0, x0, #3			// Enable EL1 physical timers
	msr	cnthctl_el2, x0
2016-11-28 21:13:02 -05:00
1:
2012-11-29 22:48:31 +00:00
	msr	cntvoff_el2, xzr		// Clear virtual offset
2012-03-05 11:49:27 +00:00
2014-06-30 16:01:31 +01:00
#ifdef CONFIG_ARM_GIC_V3
	/* GICv3 system register access */
	mrs	x0, id_aa64pfr0_el1
	ubfx	x0, x0, #24, #4
	cmp	x0, #1
	b.ne	3f
2017-01-19 17:57:43 +00:00
	mrs_s	x0, SYS_ICC_SRE_EL2
2014-06-30 16:01:31 +01:00
	orr	x0, x0, #ICC_SRE_EL2_SRE	// Set ICC_SRE_EL2.SRE==1
	orr	x0, x0, #ICC_SRE_EL2_ENABLE	// Set ICC_SRE_EL2.Enable==1
2017-01-19 17:57:43 +00:00
	msr_s	SYS_ICC_SRE_EL2, x0
2014-06-30 16:01:31 +01:00
	isb					// Make sure SRE is now set
2017-01-19 17:57:43 +00:00
	mrs_s	x0, SYS_ICC_SRE_EL2		// Read SRE back,
2015-09-30 11:39:59 +01:00
	tbz	x0, #0, 3f			// and check that it sticks
2017-01-19 17:57:43 +00:00
	msr_s	SYS_ICH_HCR_EL2, xzr		// Reset ICH_HCR_EL2 to defaults
2014-06-30 16:01:31 +01:00
3:
#endif
2012-03-05 11:49:27 +00:00
/* Populate ID registers. */
	mrs	x0, midr_el1
	mrs	x1, mpidr_el1
	msr	vpidr_el2, x0
	msr	vmpidr_el2, x1
#ifdef CONFIG_COMPAT
	msr	hstr_el2, xzr			// Disable CP15 traps to EL2
#endif
2015-09-02 18:49:28 +01:00
/* EL2 debug */
2016-09-22 11:25:25 +01:00
	mrs	x1, id_aa64dfr0_el1		// Check ID_AA64DFR0_EL1 PMUVer
	sbfx	x0, x1, #8, #4
2016-01-13 14:50:03 +00:00
	cmp	x0, #1
	b.lt	4f				// Skip if no PMU present
2015-09-02 18:49:28 +01:00
	mrs	x0, pmcr_el0			// Disable debug access traps
	ubfx	x0, x0, #11, #5			// to EL2 and allow access to
2016-01-13 14:50:03 +00:00
4:
2016-09-22 11:25:25 +01:00
	csel	x3, xzr, x0, lt			// all PMU counters from EL1
	/* Statistical profiling */
	ubfx	x0, x1, #32, #4			// Check ID_AA64DFR0_EL1 PMSVer
2017-07-07 13:47:02 +01:00
	cbz	x0, 7f				// Skip if SPE not present
	cbnz	x2, 6f				// VHE?
	mrs_s	x4, SYS_PMBIDR_EL1		// If SPE available at EL2,
	and	x4, x4, #(1 << SYS_PMBIDR_EL1_P_SHIFT)
	cbnz	x4, 5f				// then permit sampling of physical
	mov	x4, #(1 << SYS_PMSCR_EL2_PCT_SHIFT | \
		      1 << SYS_PMSCR_EL2_PA_SHIFT)
	msr_s	SYS_PMSCR_EL2, x4		// addresses and physical counter
5:
2016-09-22 11:25:25 +01:00
	mov	x1, #(MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT)
	orr	x3, x3, x1			// If we don't have VHE, then
2017-07-07 13:47:02 +01:00
	b	7f				// use EL1&0 translation.
6:						// For VHE, use EL2 translation
2016-09-22 11:25:25 +01:00
	orr	x3, x3, #MDCR_EL2_TPMS		// and disable access from EL1
2017-07-07 13:47:02 +01:00
7:
2016-09-22 11:25:25 +01:00
	msr	mdcr_el2, x3			// Configure debug traps
2015-09-02 18:49:28 +01:00
arm64/kvm: Prohibit guest LOR accesses
We don't currently limit guest accesses to the LOR registers, which we
neither virtualize nor context-switch. As such, guests are provided with
unusable information/controls, and are not isolated from each other (or
the host).
To prevent these issues, we can trap register accesses and present the
illusion LORegions are unsupported by the CPU. To do this, we mask
ID_AA64MMFR1.LO, and set HCR_EL2.TLOR to trap accesses to the following
registers:
* LORC_EL1
* LOREA_EL1
* LORID_EL1
* LORN_EL1
* LORSA_EL1
... when trapped, we inject an UNDEFINED exception to EL1, simulating
their non-existence.
As noted in D7.2.67, when no LORegions are implemented, LoadLOAcquire
and StoreLORelease must behave as LoadAcquire and StoreRelease
respectively. We can ensure this by clearing LORC_EL1.EN when a CPU's
EL2 is first initialized, as the host kernel will not modify this.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
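The EL2 setup code added by this commit decides whether to clear LORC_EL1 by extracting a 4-bit ID field with ubfx. A minimal C sketch of that test follows; the helper names are hypothetical, and the shift value of 16 is an assumption based on the LO field occupying bits [19:16] of ID_AA64MMFR1_EL1.

```c
#include <stdint.h>

/* Assumed field position: ID_AA64MMFR1_EL1.LO in bits [19:16]. */
#define ID_AA64MMFR1_LOR_SHIFT	16

/*
 * C equivalent of "ubfx dst, src, #lsb, #width": unsigned bitfield
 * extract. Only valid for width < 64.
 */
static uint64_t ubfx(uint64_t src, unsigned int lsb, unsigned int width)
{
	return (src >> lsb) & ((UINT64_C(1) << width) - 1);
}

/* Non-zero when the ID register reports that LORegions are implemented. */
static int cpu_has_lor(uint64_t id_aa64mmfr1)
{
	return ubfx(id_aa64mmfr1, ID_AA64MMFR1_LOR_SHIFT, 4) != 0;
}
```

When the field reads as zero the assembly branches past the write, since LORC_EL1 does not exist on such a CPU and accessing it would be UNDEFINED.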
2018-02-13 13:39:23 +00:00
/* LORegions */
	mrs	x1, id_aa64mmfr1_el1
	ubfx	x0, x1, #ID_AA64MMFR1_LOR_SHIFT, 4
	cbz	x0, 1f
	msr_s	SYS_LORC_EL1, xzr
1:
2012-11-06 19:27:59 +00:00
/* Stage-2 translation */
	msr	vttbr_el2, xzr
2014-02-19 09:33:14 +00:00
	cbz	x2, install_el2_stub
2016-08-31 12:05:12 +01:00
	mov	w0, #BOOT_CPU_MODE_EL2		// This CPU booted in EL2
2014-02-19 09:33:14 +00:00
	isb
	ret
install_el2_stub:
2017-02-15 14:54:17 +00:00
	/*
	 * When VHE is not in use, early init of EL2 and EL1 needs to be
	 * done here.
	 * When VHE _is_ in use, EL1 will not be used in the host and
	 * requires no configuration, and all non-hyp-specific EL2 setup
	 * will be done via the _EL1 system register aliases in __cpu_setup.
	 */
2018-01-15 19:38:55 +00:00
	mov_q	x0, (SCTLR_EL1_RES1 | ENDIAN_SET_EL1)
2017-02-15 14:54:17 +00:00
	msr	sctlr_el1, x0
	/* Coprocessor traps. */
	mov	x0, #0x33ff
	msr	cptr_el2, x0			// Disable copro. traps to EL2
2017-10-31 15:51:04 +00:00
	/* SVE register access */
	mrs	x1, id_aa64pfr0_el1
	ubfx	x1, x1, #ID_AA64PFR0_SVE_SHIFT, #4
	cbz	x1, 7f
	bic	x0, x0, #CPTR_EL2_TZ		// Also disable SVE traps
	msr	cptr_el2, x0			// Disable copro. traps to EL2
	isb
	mov	x1, #ZCR_ELx_LEN_MASK		// SVE: Enable full vector
	msr_s	SYS_ZCR_EL2, x1			// length for EL1.
2012-10-19 17:46:27 +01:00
/* Hypervisor stub */
2017-10-31 15:51:04 +00:00
7:	adr_l	x0, __hyp_stub_vectors
2012-10-19 17:46:27 +01:00
	msr	vbar_el2, x0
2012-03-05 11:49:27 +00:00
	/* spsr */
	mov	x0, #(PSR_F_BIT | PSR_I_BIT | PSR_A_BIT | PSR_D_BIT |\
		      PSR_MODE_EL1h)
	msr	spsr_el2, x0
	msr	elr_el2, lr
2016-08-31 12:05:12 +01:00
	mov	w0, #BOOT_CPU_MODE_EL2		// This CPU booted in EL2
2012-03-05 11:49:27 +00:00
	eret
ENDPROC(el2_setup)
2013-10-11 14:52:16 +01:00
/*
 * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
2017-01-09 14:31:55 +00:00
 * in w0. See arch/arm64/include/asm/virt.h for more info.
2013-10-11 14:52:16 +01:00
 */
2016-04-18 17:09:41 +02:00
set_cpu_boot_mode_flag:
2015-03-17 09:14:29 +01:00
	adr_l	x1, __boot_cpu_mode
2016-08-31 12:05:12 +01:00
	cmp	w0, #BOOT_CPU_MODE_EL2
2013-10-11 14:52:16 +01:00
	b.ne	1f
	add	x1, x1, #4
2016-08-31 12:05:12 +01:00
1:	str	w0, [x1]			// This CPU has booted in EL1
2014-05-02 16:24:13 +01:00
	dmb	sy
	dc	ivac, x1			// Invalidate potentially stale cache line
2013-10-11 14:52:16 +01:00
	ret
ENDPROC(set_cpu_boot_mode_flag)
2016-08-24 18:27:29 +01:00
/*
 * These values are written with the MMU off, but read with the MMU on.
 * Writers will invalidate the corresponding address, discarding up to a
 * 'Cache Writeback Granule' (CWG) worth of data. The linker script ensures
 * sufficient alignment that the CWG doesn't overlap another section.
 */
	.pushsection ".mmuoff.data.write", "aw"
2012-10-26 15:40:05 +01:00
/*
 * We need to find out the CPU boot mode long after boot, so we need to
 * store it in a writable variable.
 *
 * This is not in .bss, because we set it sufficiently early that the boot-time
 * zeroing of .bss would clobber it.
 */
2015-03-13 16:21:18 +01:00
ENTRY(__boot_cpu_mode)
2012-10-26 15:40:05 +01:00
.long BOOT_CPU_MODE_EL2
2015-03-13 16:14:36 +00:00
.long BOOT_CPU_MODE_EL1
2016-08-24 18:27:29 +01:00
/*
 * The booting CPU updates the failed status @__early_cpu_boot_status,
 * with MMU turned off.
 */
ENTRY(__early_cpu_boot_status)
.long 0
2012-10-26 15:40:05 +01:00
.popsection
2012-03-05 11:49:27 +00:00
	/*
	 * This provides a "holding pen" for platforms to hold all secondary
	 * cores until we're ready for them to initialise.
	 */
ENTRY(secondary_holding_pen)
2016-08-31 12:05:12 +01:00
	bl	el2_setup			// Drop to EL1, w0=cpu_boot_mode
2013-10-11 14:52:16 +01:00
	bl	set_cpu_boot_mode_flag
2012-03-05 11:49:27 +00:00
	mrs	x0, mpidr_el1
2016-04-18 17:09:45 +02:00
	mov_q	x1, MPIDR_HWID_BITMASK
2012-08-29 18:32:18 +01:00
	and	x0, x0, x1
2015-03-10 15:00:03 +01:00
	adr_l	x3, secondary_holding_pen_release
2012-03-05 11:49:27 +00:00
pen:	ldr	x4, [x3]
	cmp	x4, x0
	b.eq	secondary_startup
	wfe
	b	pen
ENDPROC(secondary_holding_pen)
arm64: factor out spin-table boot method
The arm64 kernel has an internal holding pen, which is necessary for
some systems where we can't bring CPUs online individually and must hold
multiple CPUs in a safe area until the kernel is able to handle them.
The current SMP infrastructure for arm64 is closely coupled to this
holding pen, and alternative boot methods must launch CPUs into the pen,
where they sit before they are launched into the kernel proper.
With PSCI (and possibly other future boot methods), we can bring CPUs
online individually, and need not perform the secondary_holding_pen
dance. Instead, this patch factors the holding pen management code out
to the spin-table boot method code, as it is the only boot method
requiring the pen.
A new entry point for secondaries, secondary_entry is added for other
boot methods to use, which bypasses the holding pen and its associated
overhead when bringing CPUs online. The smp.pen.text section is also
removed, as the pen can live in head.text without problem.
The cpu_operations structure is extended with two new functions,
cpu_boot and cpu_postboot, for bringing a cpu into the kernel and
performing any post-boot cleanup required by a bootmethod (e.g.
resetting the secondary_holding_pen_release to INVALID_HWID).
Documentation is added for cpu_operations.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
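The release test that each CPU in the holding pen repeats can be sketched in C as below. This is only an illustration of the comparison made by the pen loop: the function name is hypothetical, and the MPIDR_HWID_BITMASK value (affinity fields only, bits [39:32] and [23:0]) is stated here as an assumption rather than taken from this file.

```c
#include <stdint.h>

/* Assumed value: keeps Aff3 (bits 39:32) and Aff2..Aff0 (bits 23:0). */
#define MPIDR_HWID_BITMASK	UINT64_C(0xff00ffffff)
/* Value the release variable is reset to once a CPU has left the pen. */
#define INVALID_HWID		UINT64_MAX

/*
 * A CPU leaves the pen only when secondary_holding_pen_release holds its
 * own hardware ID, that is, MPIDR_EL1 with the non-affinity bits masked.
 */
static int pen_should_release(uint64_t mpidr, uint64_t pen_release)
{
	return (mpidr & MPIDR_HWID_BITMASK) == pen_release;
}
```

Because INVALID_HWID has bits set outside the mask, no masked MPIDR value can ever match it, so resetting the variable to INVALID_HWID keeps every remaining CPU parked.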
2013-10-24 20:30:16 +01:00
	/*
	 * Secondary entry point that jumps straight into the kernel. Only to
	 * be used where CPUs are brought online dynamically by the kernel.
	 */
ENTRY(secondary_entry)
	bl	el2_setup			// Drop to EL1
2013-11-18 18:56:42 +00:00
	bl	set_cpu_boot_mode_flag
2013-10-24 20:30:16 +01:00
	b	secondary_startup
ENDPROC(secondary_entry)
2012-03-05 11:49:27 +00:00
2016-04-18 17:09:41 +02:00
secondary_startup:
2012-03-05 11:49:27 +00:00
	/*
	 * Common entry point for secondary CPUs.
	 */
2015-03-18 14:55:20 +00:00
	bl	__cpu_setup			// initialise processor
2018-09-24 14:51:13 +01:00
	adrp	x1, swapper_pg_dir
2016-08-31 12:05:14 +01:00
	bl	__enable_mmu
	ldr	x8, =__secondary_switched
	br	x8
2012-03-05 11:49:27 +00:00
ENDPROC(secondary_startup)
2016-04-18 17:09:41 +02:00
__secondary_switched:
2015-12-26 12:46:40 +01:00
	adr_l	x5, vectors
	msr	vbar_el1, x5
	isb
2016-02-23 10:31:42 +00:00
	adr_l	x0, secondary_data
arm64: split thread_info from task stack
This patch moves arm64's struct thread_info from the task stack into
task_struct. This protects thread_info from corruption in the case of
stack overflows, and makes its address harder to determine if stack
addresses are leaked, making a number of attacks more difficult. Precise
detection and handling of overflow is left for subsequent patches.
Largely, this involves changing code to store the task_struct in sp_el0,
and acquire the thread_info from the task struct. Core code now
implements current_thread_info(), and as noted in <linux/sched.h> this
relies on offsetof(task_struct, thread_info) == 0, enforced by core
code.
This change means that the 'tsk' register used in entry.S now points to
a task_struct, rather than a thread_info as it used to. To make this
clear, the TI_* field offsets are renamed to TSK_TI_*, with asm-offsets
appropriately updated to account for the structural change.
Userspace clobbers sp_el0, and we can no longer restore this from the
stack. Instead, the current task is cached in a per-cpu variable that we
can safely access from early assembly as interrupts are disabled (and we
are thus not preemptible).
Both secondary entry and idle are updated to stash the sp and task
pointer separately.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: James Morse <james.morse@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-03 20:23:13 +00:00
	ldr	x1, [x0, #CPU_BOOT_STACK]	// get secondary_data.stack
	mov	sp, x1
	ldr	x2, [x0, #CPU_BOOT_TASK]
	msr	sp_el0, x2
2012-03-05 11:49:27 +00:00
	mov	x29, #0
2017-07-22 18:45:33 +01:00
	mov	x30, #0
2012-03-05 11:49:27 +00:00
	b	secondary_start_kernel
ENDPROC(__secondary_switched)
2016-02-23 10:31:42 +00:00
/*
 * The booting CPU updates the failed status @__early_cpu_boot_status,
 * with MMU turned off.
 *
 * update_early_cpu_boot_status status, tmp1, tmp2
 *  - Corrupts tmp1, tmp2
 *  - Writes 'status' to __early_cpu_boot_status and makes sure
 *    it is committed to memory.
 */
	.macro	update_early_cpu_boot_status status, tmp1, tmp2
	mov	\tmp2, #\status
arm64: fix invalidation of wrong __early_cpu_boot_status cacheline
In head.S, the str_l macro, which takes a source register, a symbol name
and a temp register, is used to store a status value to the variable
__early_cpu_boot_status. Subsequently, the value of the temp register is
reused to invalidate any cachelines covering this variable.
However, since str_l resolves to
adrp \tmp, \sym
str \src, [\tmp, :lo12:\sym]
the temp register never actually holds the address of the variable but
only of the 4 KB window that covers it, and reusing it leads to the
wrong cacheline being invalidated. So instead, take the address
explicitly before doing the store, and reuse that value to perform
the cache invalidation.
Fixes: bb9052744f4b ("arm64: Handle early CPU boot failures")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Suzuki K Poulose <Suzuki.Poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
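The bug described in this commit message comes down to address arithmetic: adrp leaves the base of the 4 KB page in the temp register, and the low 12 bits are only supplied by the :lo12: operand of the store. The C sketch below models that split with plain integers; the helper names and the 64-byte line size in the usage note are illustrative assumptions.

```c
#include <stdint.h>

/* What "adrp tmp, sym" leaves in tmp: the 4 KB page that contains sym. */
static uint64_t adrp_result(uint64_t sym)
{
	return sym & ~UINT64_C(0xfff);
}

/* What ":lo12:sym" contributes: the offset of sym within that page. */
static uint64_t lo12(uint64_t sym)
{
	return sym & UINT64_C(0xfff);
}

/* Index of the cache line covering addr, for a given line size in bytes. */
static uint64_t cacheline_of(uint64_t addr, uint64_t line_size)
{
	return addr / line_size;
}
```

For a variable at the hypothetical address 0x40081234 and 64-byte lines, the page base 0x40081000 falls in a different cache line from the variable itself, so a "dc ivac" on the temp register would miss it. Taking the full address first, as the fix does, avoids the mismatch.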
2016-04-15 12:11:21 +02:00
	adr_l	\tmp1, __early_cpu_boot_status
	str	\tmp2, [\tmp1]
2016-02-23 10:31:42 +00:00
	dmb	sy
	dc	ivac, \tmp1			// Invalidate potentially stale cache line
	.endm
2012-03-05 11:49:27 +00:00
/*
2015-03-17 08:59:53 +01:00
 * Enable the MMU.
2012-03-05 11:49:27 +00:00
 *
2015-03-17 08:59:53 +01:00
 *  x0  = SCTLR_EL1 value for turning on the MMU.
2018-09-24 14:51:13 +01:00
 *  x1  = TTBR1_EL1 value
2015-03-17 08:59:53 +01:00
 *
2016-08-31 12:05:14 +01:00
 * Returns to the caller via x30/lr. This requires the caller to be covered
 * by the .idmap.text section.
2015-10-19 14:19:35 +01:00
 *
 * Checks if the selected granule size is supported by the CPU.
 * If it isn't, park the CPU
2012-03-05 11:49:27 +00:00
 */
2016-04-27 17:47:07 +01:00
ENTRY(__enable_mmu)
2018-09-24 14:51:13 +01:00
	mrs	x2, ID_AA64MMFR0_EL1
	ubfx	x2, x2, #ID_AA64MMFR0_TGRAN_SHIFT, 4
2015-10-19 14:19:35 +01:00
	cmp	x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
	b.ne	__no_granule_support
2018-09-24 14:51:13 +01:00
	update_early_cpu_boot_status 0, x2, x3
	adrp	x2, idmap_pg_dir
	phys_to_ttbr x1, x1
	phys_to_ttbr x2, x2
	msr	ttbr0_el1, x2			// load TTBR0
arm64: mm: Offset TTBR1 to allow 52-bit PTRS_PER_PGD
Enabling 52-bit VAs on arm64 requires that the PGD table expands from 64
entries (for the 48-bit case) to 1024 entries. This quantity,
PTRS_PER_PGD is used as follows to compute which PGD entry corresponds
to a given virtual address, addr:
pgd_index(addr) -> (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)
Userspace addresses are prefixed by 0's, so for a 48-bit userspace
address, uva, the following is true:
(uva >> PGDIR_SHIFT) & (1024 - 1) == (uva >> PGDIR_SHIFT) & (64 - 1)
In other words, a 48-bit userspace address will have the same pgd_index
when using PTRS_PER_PGD = 64 and 1024.
Kernel addresses are prefixed by 1's so, given a 48-bit kernel address,
kva, we have the following inequality:
(kva >> PGDIR_SHIFT) & (1024 - 1) != (kva >> PGDIR_SHIFT) & (64 - 1)
In other words a 48-bit kernel virtual address will have a different
pgd_index when using PTRS_PER_PGD = 64 and 1024.
If, however, we note that:
kva = 0xFFFF << 48 + lower (where lower[63:48] == 0b)
and, PGDIR_SHIFT = 42 (as we are dealing with 64KB PAGE_SIZE)
We can consider:
(kva >> PGDIR_SHIFT) & (1024 - 1) - (kva >> PGDIR_SHIFT) & (64 - 1)
= (0xFFFF << 6) & 0x3FF - (0xFFFF << 6) & 0x3F // "lower" cancels out
= 0x3C0
In other words, one can switch PTRS_PER_PGD to the 52-bit value globally
provided that they increment ttbr1_el1 by 0x3C0 * 8 = 0x1E00 bytes when
running with 48-bit kernel VAs (TCR_EL1.T1SZ = 16).
For kernel configuration where 52-bit userspace VAs are possible, this
patch offsets ttbr1_el1 and sets PTRS_PER_PGD corresponding to the
52-bit value.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
[will: added comment to TTBR1_BADDR_4852_OFFSET calculation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
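The arithmetic in this commit message can be checked with a short C sketch. It assumes the 64 KB page configuration the message uses (PGDIR_SHIFT = 42, 8-byte PGD entries); the function names are hypothetical and only mirror the pgd_index() expression quoted above.

```c
#include <stdint.h>

#define PGDIR_SHIFT	42	/* 64 KB pages, as in the commit message */

/* pgd_index(addr) for a table with ptrs_per_pgd entries (a power of two). */
static uint64_t pgd_index(uint64_t addr, uint64_t ptrs_per_pgd)
{
	return (addr >> PGDIR_SHIFT) & (ptrs_per_pgd - 1);
}

/*
 * Byte offset to add to ttbr1_el1 so that a 48-bit kernel address indexed
 * with the 52-bit PTRS_PER_PGD (1024) lands on the entry it would have
 * used with the 48-bit PTRS_PER_PGD (64). Each entry is 8 bytes.
 */
static uint64_t ttbr1_offset_bytes(uint64_t kva)
{
	uint64_t delta = pgd_index(kva, 1024) - pgd_index(kva, 64);

	return delta * sizeof(uint64_t);
}
```

For any 48-bit kernel address (top 16 bits all ones) the index difference is 0x3C0 and the byte offset is 0x1E00, while a 48-bit userspace address yields the same index with either table size.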
2018-12-06 22:50:39 +00:00
offset_ t t b r1 x1
2018-09-24 14:51:13 +01:00
msr t t b r1 _ e l 1 , x1 / / l o a d T T B R 1
2012-03-05 11:49:27 +00:00
isb
msr s c t l r _ e l 1 , x0
isb
2015-08-04 17:49:36 +01:00
/ *
* Invalidate t h e l o c a l I - c a c h e s o t h a t a n y i n s t r u c t i o n s f e t c h e d
* speculatively f r o m t h e P o C a r e d i s c a r d e d , s i n c e t h e y m a y h a v e
* been d y n a m i c a l l y p a t c h e d a t t h e P o U .
* /
ic i a l l u
dsb n s h
isb
2016-08-31 12:05:14 +01:00
ret
2015-03-17 08:59:53 +01:00
ENDPROC( _ _ e n a b l e _ m m u )
2015-10-19 14:19:35 +01:00
__no_granule_support :
2016-02-23 10:31:42 +00:00
/* Indicate that this CPU can't boot and is stuck in the kernel */
update_early_cpu_boot_status CPU_STUCK_IN_KERNEL, x1, x2
1:
2015-10-19 14:19:35 +01:00
wfe
2016-02-23 10:31:42 +00:00
wfi
2016-08-31 12:05:13 +01:00
b 1b
2015-10-19 14:19:35 +01:00
ENDPROC(__no_granule_support)
2016-04-18 17:09:42 +02:00
2016-04-18 17:09:43 +02:00
#ifdef CONFIG_RELOCATABLE
2016-08-31 12:05:13 +01:00
__relocate_kernel:
2016-04-18 17:09:43 +02:00
/*
 * Iterate over each entry in the relocation table, and apply the
 * relocations in place.
 */
ldr w9, =__rela_offset // offset to reloc table
ldr w10, =__rela_size // size of reloc table
2016-04-18 17:09:45 +02:00
mov_q x11, KIMAGE_VADDR // default virtual offset
2016-04-18 17:09:43 +02:00
add x11, x11, x23 // actual virtual offset
add x9, x9, x11 // __va(.rela)
add x10, x9, x10 // __va(.rela) + sizeof(.rela)
0: cmp x9, x10
arm64: relocatable: suppress R_AARCH64_ABS64 relocations in vmlinux
The linker routines that we rely on to produce a relocatable PIE binary
treat it as a shared ELF object in some ways, i.e., it emits symbol based
R_AARCH64_ABS64 relocations into the final binary since doing so would be
appropriate when linking a shared library that is subject to symbol
preemption. (This means that an executable can override certain symbols
that are exported by a shared library it is linked with, and that the
shared library *must* update all its internal references as well, and point
them to the version provided by the executable.)
Symbol preemption does not occur for OS hosted PIE executables, let alone
for vmlinux, and so we would prefer to get rid of these symbol based
relocations. This would allow us to simplify the relocation routines, and
to strip the .dynsym, .dynstr and .hash sections from the binary. (Note
that these are tiny, and are placed in the .init segment, but they clutter
up the vmlinux binary.)
Note that these R_AARCH64_ABS64 relocations are only emitted for absolute
references to symbols defined in the linker script, all other relocatable
quantities are covered by anonymous R_AARCH64_RELATIVE relocations that
simply list the offsets to all 64-bit values in the binary that need to be
fixed up based on the offset between the link time and run time addresses.
Fortunately, GNU ld has a -Bsymbolic option, which is intended for shared
libraries to allow them to ignore symbol preemption, and unconditionally
bind all internal symbol references to its own definitions. So set it for
our PIE binary as well, and get rid of the associated sections and the
relocation code that processes them.
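As a rough illustration of what remains once only R_AARCH64_RELATIVE entries need handling, the following C sketch applies such relocations to an image held in a buffer. It is not the kernel's code: struct rela_entry, apply_relative_relocs and the link_base parameter are hypothetical names introduced here, and the buffer stands in for the mapped image. The 24-byte entry layout follows the ELF64 RELA format, and 1027 is the ELF value of R_AARCH64_RELATIVE.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define R_AARCH64_RELATIVE 1027	/* ELF relocation type number */

/* ELF64 RELA entry layout: three 64-bit fields, 24 bytes in total. */
struct rela_entry {
	uint64_t r_offset;	/* link-time address of the value to patch */
	uint64_t r_info;	/* relocation type in the low 32 bits */
	int64_t  r_addend;	/* link-time value of the patched quantity */
};

/*
 * Hypothetical helper: for every R_AARCH64_RELATIVE entry, store
 * (addend + displacement) at the slot named by r_offset. Entries of any
 * other type are skipped. link_base is the link-time address of image[0].
 * Returns the number of relocations applied.
 */
static size_t apply_relative_relocs(uint8_t *image, uint64_t link_base,
				    const struct rela_entry *rela,
				    size_t count, uint64_t displacement)
{
	size_t applied = 0;

	for (size_t i = 0; i < count; i++) {
		uint64_t value;

		if ((uint32_t)rela[i].r_info != R_AARCH64_RELATIVE)
			continue;

		value = (uint64_t)rela[i].r_addend + displacement;
		/* memcpy avoids unaligned or aliasing stores into the buffer */
		memcpy(image + (rela[i].r_offset - link_base), &value,
		       sizeof(value));
		applied++;
	}
	return applied;
}
```

Because each entry carries its own addend, no symbol table lookup is needed, which is why the .dynsym, .dynstr and .hash sections can be stripped.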
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: fixed conflict with __dynsym_offset linker script entry]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-07-24 14:00:13 +02:00
b.hs 1f
2016-04-18 17:09:43 +02:00
ldp x11, x12, [x9], #24
ldr x13, [x9, #-8]
cmp w12, #R_AARCH64_RELATIVE
2016-07-24 14:00:13 +02:00
b.ne 0b
2016-04-18 17:09:43 +02:00
add x13, x13, x23 // relocate
str x13, [x11, x23]
b 0b
2016-08-31 12:05:13 +01:00
1: ret
ENDPROC(__relocate_kernel)
#endif
2016-04-18 17:09:43 +02:00
2016-08-31 12:05:13 +01:00
__primary_switch:
#ifdef CONFIG_RANDOMIZE_BASE
mov x19, x0 // preserve new SCTLR_EL1 value
mrs x20, sctlr_el1 // preserve old SCTLR_EL1 value
#endif
arm64/mm: Separate boot-time page tables from swapper_pg_dir
Since the address of swapper_pg_dir is fixed for a given kernel image,
it is an attractive target for manipulation via an arbitrary write. To
mitigate this we'd like to make it read-only by moving it into the
rodata section.
We require that swapper_pg_dir is at a fixed offset from tramp_pg_dir
and reserved_ttbr0, so these will also need to move into rodata.
However, swapper_pg_dir is allocated along with some transient page
tables used for boot which we do not want to move into rodata.
As a step towards this, this patch separates the boot-time page tables
into a new init_pg_dir, and reduces swapper_pg_dir to the single page it
needs to be. This allows us to retain the relationship between
swapper_pg_dir, tramp_pg_dir, and reserved_ttbr0, while cleanly
separating these from the boot-time page tables.
The init_pg_dir holds all of the pgd/pud/pmd/pte levels needed during
boot, and all of these levels will be freed when we switch to the
swapper_pg_dir, which is initialized by the existing code in
paging_init(). Since we start off on the init_pg_dir, we no longer need
to allocate a transient page table in paging_init() in order to ensure
that swapper_pg_dir isn't live while we initialize it.
There should be no functional change as a result of this patch.
Signed-off-by: Jun Yao <yaojun8558363@gmail.com>
Reviewed-by: James Morse <james.morse@arm.com>
[Mark: place init_pg_dir after BSS, fold mm changes, commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-24 15:47:49 +01:00
adrp x1, init_pg_dir
2016-08-31 12:05:14 +01:00
bl __enable_mmu
2016-08-31 12:05:13 +01:00
#ifdef CONFIG_RELOCATABLE
bl __relocate_kernel
#ifdef CONFIG_RANDOMIZE_BASE
ldr x8, =__primary_switched
2016-08-31 12:05:15 +01:00
adrp x0, __PHYS_OFFSET
2016-08-31 12:05:13 +01:00
blr x8
/*
 * If we return here, we have a KASLR displacement in x23 which we need
 * to take into account by discarding the current kernel mapping and
 * creating a new one.
 */
2018-01-29 11:59:52 +00:00
pre_disable_mmu_workaround
2016-08-31 12:05:13 +01:00
msr sctlr_el1, x20 // disable the MMU
isb
bl __create_page_tables // recreate kernel mapping
tlbi vmalle1 // Remove any stale TLB entries
dsb nsh
msr sctlr_el1, x19 // re-enable the MMU
isb
ic iallu // flush instructions fetched
dsb nsh // via old mapping
isb
bl __relocate_kernel
#endif
2016-04-18 17:09:43 +02:00
#endif
ldr x8, =__primary_switched
2016-08-31 12:05:15 +01:00
adrp x0, __PHYS_OFFSET
2016-04-18 17:09:43 +02:00
br x8
ENDPROC(__primary_switch)