License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied
to a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                                # files
------------------------------------------------------|-------
GPL-2.0                                                  11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". The results were:
SPDX license identifier                                # files
------------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                            930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                                # files
------------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                            270
GPL-2.0+ WITH Linux-syscall-note                           169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)         21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)         17
LGPL-2.1+ WITH Linux-syscall-note                           15
GPL-1.0+ WITH Linux-syscall-note                            14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)         5
LGPL-2.0+ WITH Linux-syscall-note                            4
LGPL-2.1 WITH Linux-syscall-note                             3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                   3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)                  1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
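The scanner-comparison rules described above can be sketched in C roughly as follows. This is only an illustration of the decision logic as stated in this message; the enum, the function names and the string-based representation of a scan result are hypothetical, and the real analysis was done in a spreadsheet, not with this code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Outcome of comparing the two scanner results for one file (hypothetical). */
enum verdict {
	VERDICT_DEFAULT_LICENSE, /* neither scanner found anything: COPYING applies */
	VERDICT_AGREED,          /* both scanners reported the same license(s) */
	VERDICT_MANUAL_REVIEW    /* one-sided or conflicting detection */
};

/* NULL or "" stands for "this scanner found no license traces". */
static int found_nothing(const char *scan)
{
	return scan == NULL || scan[0] == '\0';
}

enum verdict compare_scans(const char *scan_a, const char *scan_b)
{
	if (found_nothing(scan_a) && found_nothing(scan_b))
		return VERDICT_DEFAULT_LICENSE;
	if (!found_nothing(scan_a) && !found_nothing(scan_b) &&
	    strcmp(scan_a, scan_b) == 0)
		return VERDICT_AGREED;
	return VERDICT_MANUAL_REVIEW;
}

/* Identifier for a file with no license information in it. */
const char *default_identifier(const char *path)
{
	assert(path != NULL);
	/* Files under a uapi directory additionally carry the syscall note. */
	return strstr(path, "/uapi/") ? "GPL-2.0 WITH Linux-syscall-note"
				      : "GPL-2.0";
}
```

A file that lands in VERDICT_MANUAL_REVIEW corresponds to the cases that were inspected by hand and, where still unclear, confirmed with lawyers or flagged for later.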
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
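The comment-type distinction mentioned above can be illustrated with a short C sketch. It is not the script written by Thomas and Greg, which is not reproduced in this message; the helper names and the suffix test are assumptions. The `//` form for .c files follows the kernel's documented SPDX placement rules, and the `/* */` form matches the tag at the top of this file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if path ends with the given suffix. */
static int has_suffix(const char *path, const char *suffix)
{
	size_t lp = strlen(path), ls = strlen(suffix);

	return lp >= ls && strcmp(path + lp - ls, suffix) == 0;
}

/*
 * Write the SPDX tag line for a file into buf (hypothetical helper).
 * .c sources get a C++-style comment; headers and assembly files get a
 * C-style comment. Returns what snprintf() returns.
 */
int format_spdx_line(char *buf, size_t len, const char *path, const char *id)
{
	assert(buf != NULL && path != NULL && id != NULL);
	if (has_suffix(path, ".c"))
		return snprintf(buf, len, "// SPDX-License-Identifier: %s", id);
	return snprintf(buf, len, "/* SPDX-License-Identifier: %s */", id);
}
```

A real tool would also need to handle scripts and Makefiles (which use `#` comments) and files whose first line must stay first, such as those starting with `#!`.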
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 17:07:57 +03:00
/* SPDX-License-Identifier: GPL-2.0 */
2005-04-17 02:20:36 +04:00
/*
 *
 *  Copyright (C) 1991, 1992  Linus Torvalds
 *
 *  Enhanced CPU detection and feature setting code by Mike Jagdis
 *  and Martin Mares, November 1997.
 */
.text
#include <linux/threads.h>
2008-01-30 15:33:28 +03:00
#include <linux/init.h>
2005-04-17 02:20:36 +04:00
#include <linux/linkage.h>
#include <asm/segment.h>
2009-02-13 22:14:01 +03:00
#include <asm/page_types.h>
#include <asm/pgtable_types.h>
2005-04-17 02:20:36 +04:00
#include <asm/cache.h>
#include <asm/thread_info.h>
2005-09-09 21:28:28 +04:00
#include <asm/asm-offsets.h>
2005-04-17 02:20:36 +04:00
#include <asm/setup.h>
2008-02-10 01:24:09 +03:00
#include <asm/processor-flags.h>
2009-11-14 02:28:13 +03:00
#include <asm/msr-index.h>
2016-01-27 00:12:04 +03:00
#include <asm/cpufeatures.h>
2009-02-09 16:17:40 +03:00
#include <asm/percpu.h>
2012-04-19 04:16:50 +04:00
#include <asm/nops.h>
2015-02-19 10:34:58 +03:00
#include <asm/bootparam.h>
2016-01-11 19:04:34 +03:00
#include <asm/export.h>
2016-12-08 19:44:31 +03:00
#include <asm/pgtable_32.h>
2008-02-10 01:24:09 +03:00
/* Physical address */
#define pa(X) ((X) - __PAGE_OFFSET)
2005-04-17 02:20:36 +04:00
/*
 * References to members of the new_cpu_data structure.
 */
#define X86 new_cpu_data+CPUINFO_x86
#define X86_VENDOR new_cpu_data+CPUINFO_x86_vendor
#define X86_MODEL new_cpu_data+CPUINFO_x86_model
2018-01-01 04:52:10 +03:00
#define X86_STEPPING new_cpu_data+CPUINFO_x86_stepping
2005-04-17 02:20:36 +04:00
#define X86_HARD_MATH new_cpu_data+CPUINFO_hard_math
#define X86_CPUID new_cpu_data+CPUINFO_cpuid_level
#define X86_CAPABILITY new_cpu_data+CPUINFO_x86_capability
#define X86_VENDOR_ID new_cpu_data+CPUINFO_x86_vendor_id
2007-05-02 21:27:16 +04:00
2016-09-22 00:04:06 +03:00
#define SIZEOF_PTREGS 17*4
2009-03-16 22:07:54 +03:00
/*
 * Worst-case size of the kernel mapping we need to make:
2010-12-17 06:11:09 +03:00
 * a relocatable kernel can live anywhere in lowmem, so we need to be able
 * to map all of lowmem.
2009-03-16 22:07:54 +03:00
 */
2010-12-17 06:11:09 +03:00
KERNEL_PAGES = LOWMEM_PAGES
2009-03-16 22:07:54 +03:00
2011-02-25 23:46:13 +03:00
INIT_MAP_SIZE = PAGE_TABLE_SIZE(KERNEL_PAGES) * PAGE_SIZE
2009-03-09 11:15:57 +03:00
RESERVE_BRK(pagetables, INIT_MAP_SIZE)
2009-03-13 02:09:49 +03:00
2005-04-17 02:20:36 +04:00
/*
 * 32-bit kernel entry point; only used by the boot CPU.  On entry,
 * %esi points to the real-mode code as a 32-bit pointer.
 * CS and DS must be 4 GB flat segments, but we don't depend on
 * any particular GDT layout, because we load our own as soon as we
 * can.
 */
2009-09-17 00:44:28 +04:00
__HEAD
2019-10-11 14:51:05 +03:00
SYM_CODE_START(startup_32)
2016-08-18 18:59:03 +03:00
movl pa(initial_stack),%ecx
2011-02-05 03:14:11 +03:00
2005-04-17 02:20:36 +04:00
/*
 * Set segments to known values.
 */
2008-02-10 01:24:09 +03:00
lgdt pa(boot_gdt_descr)
2005-04-17 02:20:36 +04:00
movl $(__BOOT_DS),%eax
movl %eax,%ds
movl %eax,%es
movl %eax,%fs
movl %eax,%gs
2011-02-05 03:14:11 +03:00
movl %eax,%ss
leal -__PAGE_OFFSET(%ecx),%esp
2005-04-17 02:20:36 +04:00
/*
 * Clear BSS first so that there are no surprises...
 */
2007-10-22 03:41:35 +04:00
cld
2005-04-17 02:20:36 +04:00
xorl %eax,%eax
2008-02-10 01:24:09 +03:00
movl $pa(__bss_start),%edi
movl $pa(__bss_stop),%ecx
2005-04-17 02:20:36 +04:00
subl %edi,%ecx
shrl $2,%ecx
rep ; stosl
2005-09-04 02:56:31 +04:00
/*
 * Copy bootup parameters out of the way.
 * Note: %esi still has the pointer to the real-mode data.
 * With the kexec as boot loader, parameter segment might be loaded beyond
 * kernel image and might not even be addressable by early boot page tables.
 * (kexec on panic case). Hence copy out the parameters before initializing
 * page tables.
 */
2008-02-10 01:24:09 +03:00
movl $pa(boot_params),%edi
2005-09-04 02:56:31 +04:00
movl $(PARAM_SIZE/4),%ecx
cld
rep
movsl
2008-02-10 01:24:09 +03:00
movl pa(boot_params) + NEW_CL_POINTER,%esi
2005-09-04 02:56:31 +04:00
andl %esi,%esi
tree-wide: fix comment/printk typos
"gadget", "through", "command", "maintain", "maintain", "controller", "address",
"between", "initiali[zs]e", "instead", "function", "select", "already",
"equal", "access", "management", "hierarchy", "registration", "interest",
"relative", "memory", "offset", "already",
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-11-01 22:38:34 +03:00
jz 1f # No command line
2008-02-10 01:24:09 +03:00
movl $pa(boot_command_line),%edi
2005-09-04 02:56:31 +04:00
movl $(COMMAND_LINE_SIZE/4),%ecx
rep
movsl
1:
2005-04-17 02:20:36 +04:00
2011-02-23 12:08:31 +03:00
#ifdef CONFIG_OLPC
2010-06-19 01:46:53 +04:00
/* save OFW's pgdir table for later use when calling into OFW */
movl %cr3, %eax
movl %eax, pa(olpc_ofw_pgd)
#endif
2015-10-20 12:54:45 +03:00
#ifdef CONFIG_MICROCODE
2012-12-21 11:44:29 +04:00
/* Early load ucode on BSP. */
call load_ucode_bsp
#endif
2016-12-08 19:44:31 +03:00
/* Create early pagetables. */
call mk_early_pgtbl_32
2008-02-10 01:24:09 +03:00
/* Do early initialization of the fixmap area */
2010-08-28 17:58:33 +04:00
movl $pa(initial_pg_fixmap)+PDE_IDENT_ATTR,%eax
2016-12-08 19:44:31 +03:00
#ifdef CONFIG_X86_PAE
#define KPMDS (((-__PAGE_OFFSET) >> 30) & 3) /* Number of kernel PMDs */
2010-08-28 17:58:33 +04:00
movl %eax,pa(initial_pg_pmd+0x1000*KPMDS-8)
2016-12-08 19:44:31 +03:00
#else
2010-08-28 17:58:33 +04:00
movl %eax,pa(initial_page_table+0xffc)
2008-02-10 01:24:09 +03:00
#endif
2011-01-04 09:50:54 +03:00
2016-09-22 00:03:59 +03:00
jmp .Ldefault_entry
2019-10-11 14:51:05 +03:00
SYM_CODE_END(startup_32)
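The address arithmetic in startup_32 (the pa() conversion and the slot used for the fixmap page directory entry) can be checked with a small C sketch. It assumes a __PAGE_OFFSET of 0xC0000000 for the worked values, which is a common but configurable choice; the function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Non-PAE: 0xffc is the last of 1024 four-byte page directory entries. */
static_assert(0xffc == 1023 * 4, "last 4-byte entry in a 4 KiB table");

/* Physical address of a kernel virtual address, as in the pa(X) macro. */
uint32_t pa(uint32_t vaddr, uint32_t page_offset)
{
	return vaddr - page_offset;
}

/* KPMDS: number of 1 GiB PMD pages that cover the kernel address range. */
uint32_t kpmds(uint32_t page_offset)
{
	return ((0u - page_offset) >> 30) & 3;
}

/*
 * PAE: byte offset, from initial_pg_pmd, of the last 8-byte entry in the
 * last kernel PMD page. This is where the fixmap entry is stored.
 */
uint32_t fixmap_pde_offset_pae(uint32_t page_offset)
{
	return 0x1000u * kpmds(page_offset) - 8;
}
```

With the assumed 0xC0000000 offset, kpmds() yields 1 and the PAE entry lands at offset 0xff8; in both the PAE and non-PAE case the entry written is the one that maps the top of the address space.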
2011-01-04 09:50:54 +03:00
2012-11-13 23:32:45 +04:00
#ifdef CONFIG_HOTPLUG_CPU
/*
 * Boot CPU0 entry point. It's called from play_dead(). Everything has been set
 * up already except stack. We just set up stack here. Then call
 * start_secondary().
 */
2019-10-11 14:51:07 +03:00
SYM_FUNC_START(start_cpu0)
2016-08-18 18:59:03 +03:00
movl initial_stack, %ecx
2012-11-13 23:32:45 +04:00
movl %ecx, %esp
2016-09-22 00:04:02 +03:00
call *(initial_code)
1: jmp 1b
2019-10-11 14:51:07 +03:00
SYM_FUNC_END(start_cpu0)
2012-11-13 23:32:45 +04:00
#endif
2005-04-17 02:20:36 +04:00
/*
 * Non-boot CPU entry point; entered from trampoline.S
 * We can't lgdt here, because lgdt itself uses a data segment, but
2007-05-02 21:27:10 +04:00
 * we know the trampoline has already loaded the boot_gdt for us.
2007-02-13 15:26:22 +03:00
 *
 * If cpu hotplug is not supported then this code can go in init section
 * which will be freed later
2005-04-17 02:20:36 +04:00
 */
2019-10-11 14:51:07 +03:00
SYM_FUNC_START(startup_32_smp)
2005-04-17 02:20:36 +04:00
cld
movl $(__BOOT_DS),%eax
movl %eax,%ds
movl %eax,%es
movl %eax,%fs
movl %eax,%gs
2016-08-18 18:59:03 +03:00
movl pa(initial_stack),%ecx
2011-02-05 03:14:11 +03:00
movl %eax,%ss
leal -__PAGE_OFFSET(%ecx),%esp
2012-05-08 22:22:28 +04:00
2015-10-20 12:54:45 +03:00
#ifdef CONFIG_MICROCODE
2012-12-21 11:44:29 +04:00
/* Early load ucode on AP. */
call load_ucode_ap
#endif
2016-09-22 00:03:59 +03:00
.Ldefault_entry:
x86-32: Start out cr0 clean, disable paging before modifying cr3/4
Patch
5a5a51db78e x86-32: Start out eflags and cr4 clean
... made x86-32 match x86-64 in that we initialize %eflags and %cr4
from scratch. This broke OLPC XO-1.5, because the XO enters the
kernel with paging enabled, which the kernel doesn't expect.
Since we no longer support 386 (the source of most of the variability
in %cr0 configuration), we can simply match further x86-64 and
initialize %cr0 to a fixed value -- the one variable part remaining in
%cr0 is for FPU control, but all that is handled later on in
initialization; in particular, configuring %cr0 as if the FPU is
present until proven otherwise is correct and necessary for the probe
to work.
To deal with the XO case sanely, explicitly disable paging in %cr0
before we muck with %cr3, %cr4 or EFER -- those operations are
inherently unsafe with paging enabled.
NOTE: There is still a lot of 386-related junk in head_32.S which we
can and should get rid of, however, this is intended as a minimal fix
whereas the cleanup can be deferred to the next merge window.
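The effect of this change on %cr0 can be sketched in C. The bit positions are the architectural CR0 flag positions; the composition of CR0_STATE shown here is an assumption about the kernel's definition and should be checked against the processor-flags header rather than taken from this example.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR0 flag bits. */
#define X86_CR0_PE 0x00000001u /* protection enable */
#define X86_CR0_MP 0x00000002u /* monitor coprocessor */
#define X86_CR0_ET 0x00000010u /* extension type */
#define X86_CR0_NE 0x00000020u /* numeric error */
#define X86_CR0_WP 0x00010000u /* write protect */
#define X86_CR0_AM 0x00040000u /* alignment mask */
#define X86_CR0_PG 0x80000000u /* paging */

/* Assumed composition of the fixed %cr0 value. */
#define CR0_STATE (X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | X86_CR0_NE | \
		   X86_CR0_WP | X86_CR0_AM | X86_CR0_PG)

/* Paging must stay off while %cr3, %cr4 and EFER are being written. */
static_assert((CR0_STATE & ~X86_CR0_PG & X86_CR0_PG) == 0, "PG cleared");

/* Value loaded first: the fixed state with paging explicitly disabled. */
uint32_t cr0_before_paging(void)
{
	return CR0_STATE & ~X86_CR0_PG;
}

/* Value loaded once the page tables are in place. */
uint32_t cr0_with_paging(void)
{
	return CR0_STATE;
}
```

The two values differ only in the PG bit, which is the point of the fix: a machine that enters the kernel with paging already enabled has it turned off before any of the unsafe register updates.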
Reported-by: Andres Salomon <dilinger@queued.net>
Tested-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/50FA0661.2060400@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-19 22:29:37 +04:00
movl $(CR0_STATE & ~X86_CR0_PG),%eax
movl %eax,%cr0
2005-04-17 02:20:36 +04:00
/*
2013-02-11 18:22:16 +04:00
 * We want to start out with EFLAGS unambiguously cleared. Some BIOSes leave
 * bits like NT set. This would confuse the debugger if this code is traced. So
 * initialize them properly now before switching to protected mode. That means
 * DF in particular (even though we have cleared it earlier after copying the
 * command line) because GCC expects it.
 */
pushl $0
popfl
/*
 * New page tables may be in 4Mbyte page mode and may be using the global pages.
2005-04-17 02:20:36 +04:00
 *
2013-02-11 18:22:16 +04:00
 * NOTE! If we are on a 486 we may have no cr4 at all! Specifically, cr4 exists
 * if and only if CPUID exists and has flags other than the FPU flag set.
2005-04-17 02:20:36 +04:00
 */
2013-02-11 18:22:16 +04:00
movl $-1,pa(X86_CPUID) # preset CPUID level
2012-09-25 03:05:48 +04:00
movl $X86_EFLAGS_ID,%ecx
pushl %ecx
2013-02-11 18:22:16 +04:00
popfl # set EFLAGS=ID
2012-09-25 03:05:48 +04:00
pushfl
2013-02-11 18:22:16 +04:00
popl %eax # get EFLAGS
testl $X86_EFLAGS_ID,%eax # did EFLAGS.ID remain set?
2016-09-22 00:03:59 +03:00
jz .Lenable_paging # hw disallowed setting of ID bit
2013-02-11 18:22:16 +04:00
# which means no CPUID and no CR4
xorl %eax,%eax
cpuid
movl %eax,pa(X86_CPUID) # save largest std CPUID function
2012-09-25 03:05:48 +04:00
2012-11-27 20:54:36 +04:00
movl $1,%eax
cpuid
2013-02-11 18:22:16 +04:00
andl $~1,%edx # Ignore CPUID.FPU
2016-09-22 00:03:59 +03:00
jz .Lenable_paging # No flags or only CPUID.FPU = no CR4
2012-11-27 20:54:36 +04:00
2012-09-25 03:05:48 +04:00
movl pa(mmu_cr4_features),%eax
2005-04-17 02:20:36 +04:00
movl %eax,%cr4
2009-11-14 02:28:13 +03:00
testb $X86_CR4_PAE, %al # check if PAE is enabled
2016-09-22 00:03:59 +03:00
jz .Lenable_paging
2005-04-17 02:20:36 +04:00
/* Check if extended functions are implemented */
movl $0x80000000, %eax
cpuid
2009-11-14 02:28:13 +03:00
/* Value must be in the range 0x80000001 to 0x8000ffff */
subl $0x80000001, %eax
cmpl $(0x8000ffff-0x80000001), %eax
2016-09-22 00:03:59 +03:00
ja .Lenable_paging
2010-11-10 21:35:53 +03:00
/* Clear bogus XD_DISABLE bits */
call verify_cpu
2005-04-17 02:20:36 +04:00
mov $0x80000001, %eax
cpuid
/* Execute Disable bit supported? */
2009-11-14 02:28:13 +03:00
btl $(X86_FEATURE_NX & 31), %edx
2016-09-22 00:03:59 +03:00
jnc .Lenable_paging
2005-04-17 02:20:36 +04:00
/* Setup EFER (Extended Feature Enable Register) */
2009-11-14 02:28:13 +03:00
movl $MSR_EFER, %ecx
2005-04-17 02:20:36 +04:00
rdmsr
2009-11-14 02:28:13 +03:00
btsl $_EFER_NX, %eax
2005-04-17 02:20:36 +04:00
/* Make changes effective */
wrmsr
2016-09-22 00:03:59 +03:00
.Lenable_paging:
2005-04-17 02:20:36 +04:00
/*
 * Enable paging
 */
2010-08-28 17:58:33 +04:00
movl $pa(initial_page_table), %eax
2005-04-17 02:20:36 +04:00
movl %eax,%cr3 /* set the page table pointer.. */
2013-01-19 22:29:37 +04:00
movl $CR0_STATE,%eax
2005-04-17 02:20:36 +04:00
movl %eax,%cr0 /* ..and set paging (PG) bit */
ljmp $__BOOT_CS,$1f /* Clear prefetch and normalize %eip */
1:
2011-02-05 03:14:11 +03:00
/* Shift the stack pointer to a virtual address */
addl $__PAGE_OFFSET, %esp
2005-04-17 02:20:36 +04:00
/*
 * start system 32-bit setup. We need to re-do some of the things done
 * in 16-bit mode for the "real" operations.
 */
2012-04-19 04:16:50 +04:00
movl setup_once_ref,%eax
andl %eax,%eax
jz 1f # Did we do this already?
call *%eax
1:
2013-02-11 18:22:15 +04:00
2005-04-17 02:20:36 +04:00
/*
2013-02-11 18:22:15 +04:00
 * Check if it is 486
2005-04-17 02:20:36 +04:00
 */
2013-06-28 18:45:16 +04:00
movb $4,X86 # at least 486
2013-02-11 18:22:17 +04:00
cmpl $-1,X86_CPUID
2016-09-22 00:03:59 +03:00
je .Lis486
2005-04-17 02:20:36 +04:00
/* get vendor info */
xorl %eax,%eax # call CPUID with 0 -> return vendor ID
cpuid
movl %eax,X86_CPUID # save CPUID level
movl %ebx,X86_VENDOR_ID # lo 4 chars
movl %edx,X86_VENDOR_ID+4 # next 4 chars
movl %ecx,X86_VENDOR_ID+8 # last 4 chars
orl %eax,%eax # do we have processor info as well?
2016-09-22 00:03:59 +03:00
je .Lis486
2005-04-17 02:20:36 +04:00
movl $1,%eax # Use the CPUID instruction to get CPU type
cpuid
movb %al,%cl # save reg for future use
andb $0x0f,%ah # mask processor family
movb %ah,X86
andb $0xf0,%al # mask model
shrb $4,%al
movb %al,X86_MODEL
andb $0x0f,%cl # mask revision
2018-01-01 04:52:10 +03:00
movb %cl,X86_STEPPING
2005-04-17 02:20:36 +04:00
movl %edx,X86_CAPABILITY
2016-09-22 00:03:59 +03:00
.Lis486:
2013-02-11 18:22:17 +04:00
movl $0x50022,%ecx # set AM, WP, NE and MP
2013-02-11 18:22:15 +04:00
movl %cr0,%eax
2005-04-17 02:20:36 +04:00
andl $0x80000011,%eax # Save PG,PE,ET
orl %ecx,%eax
movl %eax,%cr0
2007-02-13 15:26:26 +03:00
lgdt early_gdt_descr
2005-04-17 02:20:36 +04:00
ljmp $(__KERNEL_CS),$1f
1: movl $(__KERNEL_DS),%eax # reload all the segment registers
movl %eax,%ss # after changing gdt.
movl $(__USER_DS),%eax # DS/ES contains default USER segment
movl %eax,%ds
movl %eax,%es
2009-01-21 11:26:05 +03:00
movl $(__KERNEL_PERCPU), %eax
movl %eax,%fs # set this cpu's percpu
x86/stackprotector/32: Make the canary into a regular percpu variable
On 32-bit kernels, the stackprotector canary is quite nasty -- it is
stored at %gs:(20), which is nasty because 32-bit kernels use %fs for
percpu storage. It's even nastier because it means that whether %gs
contains userspace state or kernel state while running kernel code
depends on whether stackprotector is enabled (this is
CONFIG_X86_32_LAZY_GS), and this setting radically changes the way
that segment selectors work. Supporting both variants is a
maintenance and testing mess.
Merely rearranging so that percpu and the stack canary
share the same segment would be messy as the 32-bit percpu address
layout isn't currently compatible with putting a variable at a fixed
offset.
Fortunately, GCC 8.1 added options that allow the stack canary to be
accessed as %fs:__stack_chk_guard, effectively turning it into an ordinary
percpu variable. This lets us get rid of all of the code to manage the
stack canary GDT descriptor and the CONFIG_X86_32_LAZY_GS mess.
(That name is special. We could use any symbol we want for the
%fs-relative mode, but for CONFIG_SMP=n, gcc refuses to let us use any
name other than __stack_chk_guard.)
Forcibly disable stackprotector on older compilers that don't support
the new options and turn the stack canary into a percpu variable. The
"lazy GS" approach is now used for all 32-bit configurations.
Also makes load_gs_index() work on 32-bit kernels. On 64-bit kernels,
it loads the GS selector and updates the user GSBASE accordingly. (This
is unchanged.) On 32-bit kernels, it loads the GS selector and updates
GSBASE, which is now always the user base. This means that the overall
effect is the same on 32-bit and 64-bit, which avoids some ifdeffery.
[ bp: Massage commit message. ]
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/c0ff7dba14041c7e5d1cae5d4df052f03759bef3.1613243844.git.luto@kernel.org
2021-02-13 22:19:44 +03:00
xorl %eax,%eax
movl %eax,%gs # clear possible garbage in %gs
2009-02-09 16:17:40 +03:00
xorl %eax,%eax # Clear LDT
2005-04-17 02:20:36 +04:00
lldt %ax
[PATCH] i386: Use %gs as the PDA base-segment in the kernel
This patch is the meat of the PDA change. This patch makes several related
changes:
1: Most significantly, %gs is now used in the kernel. This means that on
entry, the old value of %gs is saved away, and it is reloaded with
__KERNEL_PDA.
2: entry.S constructs the stack in the shape of struct pt_regs, and this
is passed around the kernel so that the process's saved register
state can be accessed.
Unfortunately struct pt_regs doesn't currently have space for %gs
(or %fs). This patch extends pt_regs to add space for gs (no space
is allocated for %fs, since it won't be used, and it would just
complicate the code in entry.S to work around the space).
3: Because %gs is now saved on the stack like %ds, %es and the integer
registers, there are a number of places where it no longer needs to
be handled specially; namely context switch, and saving/restoring the
register state in a signal context.
4: And since kernel threads run in kernel space and call normal kernel
code, they need to be created with their %gs == __KERNEL_PDA.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 04:14:02 +03:00
2016-09-22 00:04:02 +03:00
call *(initial_code)
1: jmp 1b
2019-10-11 14:51:07 +03:00
SYM_FUNC_END(startup_32_smp)
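The family, model and stepping extraction performed in startup_32_smp (the andb/shrb sequence on the CPUID leaf 1 result) can be restated in C. This is a sketch of the same bit slicing only; the struct and function names are hypothetical, and like the assembly it ignores the extended family and model fields.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded CPUID leaf 1 signature (hypothetical type for this sketch). */
struct cpu_sig {
	uint8_t family;   /* stored to X86 */
	uint8_t model;    /* stored to X86_MODEL */
	uint8_t stepping; /* stored to X86_STEPPING */
};

struct cpu_sig decode_cpuid_eax(uint32_t eax)
{
	struct cpu_sig s;

	s.family = (eax >> 8) & 0x0f; /* andb $0x0f,%ah */
	s.model = (eax >> 4) & 0x0f;  /* andb $0xf0,%al ; shrb $4,%al */
	s.stepping = eax & 0x0f;      /* andb $0x0f,%cl */
	return s;
}

/* Example: signature 0x0000065a is family 6, model 5, stepping 10. */
void decode_example(void)
{
	struct cpu_sig s = decode_cpuid_eax(0x0000065au);

	assert(s.family == 6 && s.model == 5 && s.stepping == 10);
}
```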
2005-04-17 02:20:36 +04:00
2012-04-19 04:16:50 +04:00
#include "verify_cpu.S"
2005-04-17 02:20:36 +04:00
/*
2012-04-19 04:16:50 +04:00
 * setup_once
2005-04-17 02:20:36 +04:00
 *
2012-04-19 04:16:50 +04:00
 * The setup work we only want to run on the BSP.
2005-04-17 02:20:36 +04:00
 *
 * Warning: %esi is live across this function.
 */
2012-04-19 04:16:50 +04:00
__INIT
setup_once:
andl $0,setup_once_ref /* Once is enough, thanks */
2021-12-04 16:43:40 +03:00
RET
2005-04-17 02:20:36 +04:00
2019-10-11 14:51:07 +03:00
SYM_FUNC_START(early_idt_handler_array)
2012-04-19 04:16:50 +04:00
# 36(%esp) %eflags
# 32(%esp) %cs
# 28(%esp) %eip
# 24(%esp) error code
i = 0
.rept NUM_EXCEPTION_VECTORS
2017-10-20 19:21:35 +03:00
.if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0
2012-04-19 04:16:50 +04:00
pushl $0 # Dummy error code, to make stack frame uniform
.endif
pushl $i # 20(%esp) Vector number
2015-05-23 02:15:47 +03:00
jmp early_idt_handler_common
2012-04-19 04:16:50 +04:00
i = i + 1
2015-05-23 02:15:47 +03:00
.fill early_idt_handler_array + i*EARLY_IDT_HANDLER_SIZE - ., 1, 0xcc
2012-04-19 04:16:50 +04:00
.endr
2019-10-11 14:51:07 +03:00
SYM_FUNC_END(early_idt_handler_array)
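The .if test inside the .rept loop of early_idt_handler_array decides, per vector, whether the stub must push a dummy error code. A C restatement of that test is sketched below. The mask value is an assumption chosen for illustration (bits set for vectors 8, 10 to 14 and 17, the classic exceptions for which the CPU pushes an error code itself); the kernel's actual EXCEPTION_ERRCODE_MASK may contain further bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mask for illustration: vectors 8, 10-14 and 17. */
#define EXCEPTION_ERRCODE_MASK 0x00027d00u

/*
 * Returns 1 if the stub for this vector pushes a dummy zero error code,
 * i.e. the hardware does not push one, so that every stack frame ends up
 * with the same layout.
 */
int stub_pushes_dummy_errcode(unsigned int vector)
{
	assert(vector < 32);
	return ((EXCEPTION_ERRCODE_MASK >> vector) & 1) == 0;
}
```

For example, vector 14 (page fault) already has a hardware error code, so its stub only pushes the vector number, while vector 0 (divide error) gets the extra `pushl $0`.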
2012-04-19 04:16:50 +04:00
2019-10-11 14:50:45 +03:00
SYM_CODE_START_LOCAL(early_idt_handler_common)
2015-05-23 02:15:47 +03:00
/*
 * The stack is the hardware frame, an error code or zero, and the
 * vector number.
 */
2012-04-19 04:16:50 +04:00
cld
2014-03-08 03:05:20 +04:00
2012-04-19 04:16:50 +04:00
incl %ss:early_recursion_flag
2006-09-26 12:52:39 +04:00
2016-04-02 17:01:32 +03:00
/* The vector number is in pt_regs->gs */
2006-09-26 12:52:39 +04:00
2016-04-02 17:01:32 +03:00
cld
2017-07-28 16:00:31 +03:00
pushl %fs /* pt_regs->fs (__fsh varies by model) */
pushl %es /* pt_regs->es (__esh varies by model) */
pushl %ds /* pt_regs->ds (__dsh varies by model) */
2016-04-02 17:01:32 +03:00
pushl %eax /* pt_regs->ax */
pushl %ebp /* pt_regs->bp */
pushl %edi /* pt_regs->di */
pushl %esi /* pt_regs->si */
pushl %edx /* pt_regs->dx */
pushl %ecx /* pt_regs->cx */
pushl %ebx /* pt_regs->bx */
/* Fix up DS and ES */
movl $(__KERNEL_DS), %ecx
movl %ecx, %ds
movl %ecx, %es
/* Load the vector number into EDX */
movl PT_GS(%esp), %edx
2017-07-28 16:00:31 +03:00
/* Load GS into pt_regs->gs (and maybe clobber __gsh) */
2016-04-02 17:01:32 +03:00
movw %gs, PT_GS(%esp)
movl %esp, %eax /* args are pt_regs (EAX), trapnr (EDX) */
call early_fixup_exception
popl % e b x / * p t _ r e g s - > b x * /
popl % e c x / * p t _ r e g s - > c x * /
popl % e d x / * p t _ r e g s - > d x * /
popl % e s i / * p t _ r e g s - > s i * /
popl % e d i / * p t _ r e g s - > d i * /
popl % e b p / * p t _ r e g s - > b p * /
popl % e a x / * p t _ r e g s - > a x * /
2017-07-28 16:00:31 +03:00
popl % d s / * p t _ r e g s - > d s ( a l w a y s i g n o r e s _ _ d s h ) * /
popl % e s / * p t _ r e g s - > e s ( a l w a y s i g n o r e s _ _ e s h ) * /
popl % f s / * p t _ r e g s - > f s ( a l w a y s i g n o r e s _ _ f s h ) * /
popl % g s / * p t _ r e g s - > g s ( a l w a y s i g n o r e s _ _ g s h ) * /
2016-04-02 17:01:32 +03:00
decl % s s : e a r l y _ r e c u r s i o n _ f l a g
addl $ 4 , % e s p / * p o p p t _ r e g s - > o r i g _ a x * /
iret
2019-10-11 14:50:45 +03:00
SYM_ C O D E _ E N D ( e a r l y _ i d t _ h a n d l e r _ c o m m o n )
2012-04-19 04:16:50 +04:00
2005-04-17 02:20:36 +04:00
/* This is the default interrupt "handler" :-) */
2019-10-11 14:51:07 +03:00
SYM_FUNC_START(early_ignore_irq)
2005-04-17 02:20:36 +04:00
cld
2005-05-01 19:59:02 +04:00
#ifdef CONFIG_PRINTK
2005-04-17 02:20:36 +04:00
pushl %eax
pushl %ecx
pushl %edx
pushl %es
pushl %ds
movl $(__KERNEL_DS),%eax
movl %eax,%ds
movl %eax,%es
2006-09-26 12:52:39 +04:00
cmpl $2,early_recursion_flag
je hlt_loop
incl early_recursion_flag
2005-04-17 02:20:36 +04:00
pushl 16(%esp)
pushl 24(%esp)
pushl 32(%esp)
pushl 40(%esp)
pushl $int_msg
printk: Userspace format indexing support
We have a number of systems industry-wide that have a subset of their
functionality that works as follows:
1. Receive a message from local kmsg, serial console, or netconsole;
2. Apply a set of rules to classify the message;
3. Do something based on this classification (like scheduling a
remediation for the machine), rinse, and repeat.
As a couple of examples of places we have this implemented just inside
Facebook, although this isn't a Facebook-specific problem, we have this
inside our netconsole processing (for alarm classification), and as part
of our machine health checking. We use these messages to determine
fairly important metrics around production health, and it's important
that we get them right.
While for some kinds of issues we have counters, tracepoints, or metrics
with a stable interface which can reliably indicate the issue, in order
to react to production issues quickly we need to work with the interface
which most kernel developers naturally use when developing: printk.
Most production issues come from unexpected phenomena, and as such
usually the code in question doesn't have easily usable tracepoints or
other counters available for the specific problem being mitigated. We
have a number of lines of monitoring defence against problems in
production (host metrics, process metrics, service metrics, etc), and
where it's not feasible to reliably monitor at another level, this kind
of pragmatic netconsole monitoring is essential.
As one would expect, monitoring using printk is rather brittle for a
number of reasons -- most notably that the message might disappear
entirely in a new version of the kernel, or that the message may change
in some way that the regex or other classification methods start to
silently fail.
One factor that makes this even harder is that, under normal operation,
many of these messages are never expected to be hit. For example, there
may be a rare hardware bug which one wants to detect if it was to ever
happen again, but its recurrence is not likely or anticipated. This
precludes using something like checking whether the printk in question
was printed somewhere fleetwide recently to determine whether the
message in question is still present or not, since we don't anticipate
that it should be printed anywhere, but still need to monitor for its
future presence in the long-term.
This class of issue has happened on a number of occasions, causing
unhealthy machines with hardware issues to remain in production for
longer than ideal. As a recent example, some monitoring around
blk_update_request fell out of date and caused semi-broken machines to
remain in production for longer than would be desirable.
Searching through the codebase to find the message is also extremely
fragile, because many of the messages are further constructed beyond
their callsite (eg. btrfs_printk and other module-specific wrappers,
each with their own functionality). Even if they aren't, guessing the
format and formulation of the underlying message based on the aesthetics
of the message emitted is not a recipe for success at scale, and our
previous issues with fleetwide machine health checking demonstrate as
much.
This provides a solution to the issue of silently changed or deleted
printks: we record pointers to all printk format strings known at
compile time into a new .printk_index section, both in vmlinux and
modules. At runtime, this can then be iterated by looking at
<debugfs>/printk/index/<module>, which emits the following format, both
readable by humans and able to be parsed by machines:
$ head -1 vmlinux; shuf -n 5 vmlinux
# <level[,flags]> filename:line function "format"
<5> block/blk-settings.c:661 disk_stack_limits "%s: Warning: Device %s is misaligned\n"
<4> kernel/trace/trace.c:8296 trace_create_file "Could not create tracefs '%s' entry\n"
<6> arch/x86/kernel/hpet.c:144 _hpet_print_config "hpet: %s(%d):\n"
<6> init/do_mounts.c:605 prepare_namespace "Waiting for root device %s...\n"
<6> drivers/acpi/osl.c:1410 acpi_no_auto_serialize_setup "ACPI: auto-serialization disabled\n"
This mitigates the majority of cases where we have a highly-specific
printk which we want to match on, as we can now enumerate and check
whether the format changed or the printk callsite disappeared entirely
in userspace. This allows us to catch changes to printks we monitor
earlier and decide what to do about it before it becomes problematic.
There is no additional runtime cost for printk callers or printk itself,
and the assembly generated is exactly the same.
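As a rough illustration of how userspace tooling might consume this index, the sketch below parses lines in the `<level[,flags]> filename:line function "format"` layout shown above and reports which monitored formats are no longer present. It is inferred only from the sample output in this message; the exact flag syntax and escaping rules are assumptions, and `parse_index_line` and `missing_formats` are hypothetical helper names.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class IndexEntry(NamedTuple):
    level: str
    flags: Optional[str]
    filename: str
    line: int
    function: str
    fmt: str  # format string exactly as printed, escapes left untouched


# Layout assumed from the sample: <level[,flags]> file:line function "format"
_ENTRY_RE = re.compile(
    r'^<(?P<level>[^,>]+)(?:,(?P<flags>[^>]*))?> '
    r'(?P<filename>\S+):(?P<line>\d+) '
    r'(?P<function>\S+) '
    r'"(?P<fmt>.*)"$'
)


def parse_index_line(text: str) -> Optional[IndexEntry]:
    """Parse one index line; return None for comments and unparsable lines."""
    text = text.rstrip("\n")
    if not text or text.startswith("#"):
        return None
    m = _ENTRY_RE.match(text)
    if m is None:
        return None
    return IndexEntry(
        m.group("level"),
        m.group("flags"),
        m.group("filename"),
        int(m.group("line")),
        m.group("function"),
        m.group("fmt"),
    )


def missing_formats(index_lines: Iterable[str], monitored: Iterable[str]) -> List[str]:
    """Return the monitored format strings that no longer appear in the index."""
    entries = (parse_index_line(line) for line in index_lines)
    present = {e.fmt for e in entries if e is not None}
    return [fmt for fmt in monitored if fmt not in present]
```

A monitoring job could run `missing_formats` against each new kernel build and flag any rule whose format string has changed or vanished before that kernel reaches production.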
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Jessica Yu <jeyu@kernel.org> # for module.{c,h}
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/e42070983637ac5e384f17fbdbe86d19c7b212a5.1623775748.git.chris@chrisdown.name
2021-06-15 19:52:53 +03:00
call _printk
2009-01-26 08:09:00 +03:00
call dump_stack
2005-04-17 02:20:36 +04:00
addl $(5*4),%esp
popl %ds
popl %es
popl %edx
popl %ecx
popl %eax
2005-05-01 19:59:02 +04:00
#endif
2005-04-17 02:20:36 +04:00
iret
2016-04-02 17:01:34 +03:00
hlt_loop:
hlt
jmp hlt_loop
2019-10-11 14:51:07 +03:00
SYM_FUNC_END(early_ignore_irq)
2017-08-31 15:16:53 +03:00
2012-04-19 04:16:50 +04:00
__INITDATA
.align 4
2019-10-11 14:50:51 +03:00
SYM_DATA(early_recursion_flag, .long 0)
2005-04-17 02:20:36 +04:00
2012-04-19 04:16:50 +04:00
__REFDATA
.align 4
2019-10-11 14:50:51 +03:00
SYM_DATA(initial_code, .long i386_start_kernel)
SYM_DATA(setup_once_ref, .long setup_once)
2008-07-27 23:43:11 +04:00
2018-07-18 12:40:54 +03:00
#ifdef CONFIG_PAGE_TABLE_ISOLATION
#define PGD_ALIGN (2 * PAGE_SIZE)
#define PTI_USER_PGD_FILL 1024
#else
#define PGD_ALIGN (PAGE_SIZE)
#define PTI_USER_PGD_FILL 0
#endif
2005-04-17 02:20:36 +04:00
/*
 * BSS section
 */
2009-09-21 02:14:14 +04:00
__PAGE_ALIGNED_BSS
2018-07-18 12:40:54 +03:00
.align PGD_ALIGN
2008-02-10 01:24:09 +03:00
#ifdef CONFIG_X86_PAE
2016-12-08 19:44:31 +03:00
.globl initial_pg_pmd
2011-01-04 09:50:54 +03:00
initial_pg_pmd:
2008-02-10 01:24:09 +03:00
.fill 1024*KPMDS,4,0
#else
2016-11-16 17:17:09 +03:00
.globl initial_page_table
initial_page_table:
2005-04-17 02:20:36 +04:00
.fill 1024,4,0
2008-02-10 01:24:09 +03:00
#endif
2018-07-18 12:40:54 +03:00
.align PGD_ALIGN
2011-01-04 09:50:54 +03:00
initial_pg_fixmap:
2007-07-16 10:37:28 +04:00
.fill 1024,4,0
2016-11-16 17:17:09 +03:00
.globl swapper_pg_dir
2018-07-18 12:40:54 +03:00
.align PGD_ALIGN
2016-11-16 17:17:09 +03:00
swapper_pg_dir:
2010-08-28 17:58:33 +04:00
.fill 1024,4,0
2018-07-18 12:40:54 +03:00
.fill PTI_USER_PGD_FILL,4,0
.globl empty_zero_page
empty_zero_page:
.fill 4096,1,0
2016-01-11 19:04:34 +03:00
EXPORT_SYMBOL(empty_zero_page)
2009-03-09 11:15:57 +03:00
2005-04-17 02:20:36 +04:00
/*
 * This starts the data section.
 */
2008-02-10 01:24:09 +03:00
#ifdef CONFIG_X86_PAE
2009-09-21 02:14:15 +04:00
__PAGE_ALIGNED_DATA
2008-02-10 01:24:09 +03:00
/* Page-aligned for the benefit of paravirt? */
2018-07-18 12:40:54 +03:00
.align PGD_ALIGN
2019-10-11 14:50:51 +03:00
SYM_DATA_START(initial_page_table)
2010-08-28 17:58:33 +04:00
.long pa(initial_pg_pmd+PGD_IDENT_ATTR),0 /* low identity map */
2008-02-10 01:24:09 +03:00
# if KPMDS == 3
2010-08-28 17:58:33 +04:00
.long pa(initial_pg_pmd+PGD_IDENT_ATTR),0
.long pa(initial_pg_pmd+PGD_IDENT_ATTR+0x1000),0
.long pa(initial_pg_pmd+PGD_IDENT_ATTR+0x2000),0
2008-02-10 01:24:09 +03:00
# elif KPMDS == 2
.long 0,0
2010-08-28 17:58:33 +04:00
.long pa(initial_pg_pmd+PGD_IDENT_ATTR),0
.long pa(initial_pg_pmd+PGD_IDENT_ATTR+0x1000),0
2008-02-10 01:24:09 +03:00
# elif KPMDS == 1
.long 0,0
.long 0,0
2010-08-28 17:58:33 +04:00
.long pa(initial_pg_pmd+PGD_IDENT_ATTR),0
2008-02-10 01:24:09 +03:00
# else
# error "Kernel PMDs should be 1, 2 or 3"
# endif
2011-02-25 23:46:13 +03:00
.align PAGE_SIZE /* needs to be page-sized too */
2019-11-21 02:40:23 +03:00
#ifdef CONFIG_PAGE_TABLE_ISOLATION
/*
 * PTI needs another page so sync_initial_pagetable() works correctly
 * and does not scribble over the data which is placed behind the
 * actual initial_page_table. See clone_pgd_range().
 */
.fill 1024,4,0
#endif
2019-10-11 14:50:51 +03:00
SYM_DATA_END(initial_page_table)
2008-02-10 01:24:09 +03:00
#endif
2005-04-17 02:20:36 +04:00
.data
2011-02-05 03:14:11 +03:00
.balign 4
2019-10-11 14:50:51 +03:00
/*
 * The SIZEOF_PTREGS gap is a convention which helps the in-kernel unwinder
 * reliably detect the end of the stack.
 */
SYM_DATA(initial_stack,
.long init_thread_union + THREAD_SIZE -
SIZEOF_PTREGS - TOP_OF_KERNEL_STACK_PADDING)
2005-04-17 02:20:36 +04:00
2012-04-19 04:16:50 +04:00
__INITRODATA
2005-04-17 02:20:36 +04:00
int_msg:
2009-01-26 08:09:00 +03:00
.asciz "Unknown interrupt or fault at: %p %p %p\n"
2005-04-17 02:20:36 +04:00
2007-10-11 13:16:51 +04:00
#include "../../x86/xen/xen-head.S"
xen: Core Xen implementation
This patch is a rollup of all the core pieces of the Xen
implementation, including:
- booting and setup
- pagetable setup
- privileged instructions
- segmentation
- interrupt flags
- upcalls
- multicall batching
BOOTING AND SETUP
The vmlinux image is decorated with ELF notes which tell the Xen
domain builder what the kernel's requirements are; the domain builder
then constructs the address space accordingly and starts the kernel.
Xen has its own entrypoint for the kernel (contained in an ELF note).
The ELF notes are set up by xen-head.S, which is included into head.S.
In principle it could be linked separately, but it seems to provoke
lots of binutils bugs.
Because the domain builder starts the kernel in a fairly sane state
(32-bit protected mode, paging enabled, flat segments set up), there's
not a lot of setup needed before starting the kernel proper. The main
steps are:
1. Install the Xen paravirt_ops, which is simply a matter of a
structure assignment.
2. Set init_mm to use the Xen-supplied pagetables (analogous to the
head.S generated pagetables in a native boot).
3. Reserve address space for Xen, since it takes a chunk at the top
of the address space for its own use.
4. Call start_kernel()
PAGETABLE SETUP
Once we hit the main kernel boot sequence, it will end up calling back
via paravirt_ops to set up various pieces of Xen specific state. One
of the critical things which requires a bit of extra care is the
construction of the initial init_mm pagetable. Because Xen places
tight constraints on pagetables (an active pagetable must always be
valid, and must always be mapped read-only to the guest domain), we
need to be careful when constructing the new pagetable to keep these
constraints in mind. It turns out that the easiest way to do this is to
use the initial Xen-provided pagetable as a template, and then just
insert new mappings for memory where a mapping doesn't already exist.
This means that during pagetable setup, it uses a special version of
xen_set_pte which ignores any attempt to remap a read-only page as
read-write (since Xen will map its own initial pagetable as RO), but
lets other changes to the ptes happen, so that things like NX are set
properly.
PRIVILEGED INSTRUCTIONS AND SEGMENTATION
When the kernel runs under Xen, it runs in ring 1 rather than ring 0.
This means that it is more privileged than user-mode in ring 3, but it
still can't run privileged instructions directly. Non-performance
critical instructions are dealt with by taking a privilege exception
and trapping into the hypervisor and emulating the instruction, but
more performance-critical instructions have their own specific
paravirt_ops. In many cases we can avoid having to do any hypercalls
for these instructions, or the Xen implementation is quite different
from the normal native version.
The privileged instructions fall into the broad classes of:
Segmentation: setting up the GDT and the GDT entries, LDT,
TLS and so on. Xen doesn't allow the GDT to be directly
modified; all GDT updates are done via hypercalls where the new
entries can be validated. This is important because Xen uses
segment limits to prevent the guest kernel from damaging the
hypervisor itself.
Traps and exceptions: Xen uses a special format for trap entrypoints,
so when the kernel wants to set an IDT entry, it needs to be
converted to the form Xen expects. Xen sets int 0x80 up specially
so that the trap goes straight from userspace into the guest kernel
without going via the hypervisor. sysenter isn't supported.
Kernel stack: The esp0 entry is extracted from the tss and provided to
Xen.
TLB operations: the various TLB calls are mapped into corresponding
Xen hypercalls.
Control registers: all the control registers are privileged. The most
important is cr3, which points to the base of the current pagetable,
and we handle it specially.
Another instruction we treat specially is CPUID, even though it's not
privileged. We want to control what CPU features are visible to the
rest of the kernel, and so CPUID ends up going into a paravirt_op.
Xen implements this mainly to disable the ACPI and APIC subsystems.
INTERRUPT FLAGS
Xen maintains its own separate flag for masking events, which is
contained within the per-cpu vcpu_info structure. Because the guest
kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely
ignored (and must be, because even if a guest domain disables
interrupts for itself, it can't disable them overall).
(A note on terminology: "events" and interrupts are effectively
synonymous. However, rather than using an "enable flag", Xen uses a
"mask flag", which blocks event delivery when it is non-zero.)
There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which
are implemented to manage the Xen event mask state. The only thing
worth noting is that when events are unmasked, we need to explicitly
see if there's a pending event and call into the hypervisor to make
sure it gets delivered.
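The event-mask behaviour described above can be sketched as follows. This is a toy model under stated assumptions, not the Xen implementation: the `VcpuInfo` class, its field names, and the `force_callback` parameter are hypothetical stand-ins for the per-cpu vcpu_info structure and the call into the hypervisor.

```python
class VcpuInfo:
    """Stand-in for the per-cpu vcpu_info structure (field names assumed)."""

    def __init__(self):
        self.upcall_mask = 0     # non-zero blocks event delivery
        self.upcall_pending = 0  # non-zero means an event is waiting


def irq_disable(vcpu):
    """cli equivalent: set the mask flag instead of clearing EFLAGS.IF."""
    vcpu.upcall_mask = 1


def irq_enable(vcpu, force_callback):
    """sti equivalent: unmask, then explicitly check for a pending event."""
    vcpu.upcall_mask = 0
    if vcpu.upcall_pending:
        force_callback()  # hypothetical hook: ask the hypervisor to deliver


def save_fl(vcpu):
    """Return the current mask state so it can be restored later."""
    return vcpu.upcall_mask


def restore_fl(vcpu, flags, force_callback):
    """Restore a saved mask state, delivering pending events if unmasked."""
    vcpu.upcall_mask = flags
    if flags == 0 and vcpu.upcall_pending:
        force_callback()
```

The explicit pending check after unmasking is the point the text calls out: an event that arrived while masked is not delivered automatically once the mask is cleared.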
UPCALLS
Xen needs a couple of upcall (or callback) functions to be implemented
by each guest. One is the event upcalls, which is how events
(interrupts, effectively) are delivered to the guests. The other is
the failsafe callback, which is used to report errors in either
reloading a segment register, or caused by iret. These are
implemented in i386/kernel/entry.S so they can jump into the normal
iret_exc path when necessary.
MULTICALL BATCHING
Xen provides a multicall mechanism, which allows multiple hypercalls
to be issued at once in order to mitigate the cost of trapping into
the hypervisor. This is particularly useful for context switches,
since the 4-5 hypercalls they would normally need (reload cr3, update
TLS, maybe update LDT) can be reduced to one. This patch implements a
generic batching mechanism for hypercalls, which gets used in many
places in the Xen code.
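A minimal sketch of the batching idea, assuming a hypothetical `issue` callback that stands in for the single trap into the hypervisor. The class name, the capacity limit, and the operation names in the usage note are illustrative only and are not taken from the patch.

```python
from typing import Callable, List, Tuple

Hypercall = Tuple[str, tuple]  # (operation name, arguments)


class MulticallBatch:
    """Queue hypercalls and hand them over in one batch (illustrative only)."""

    def __init__(self, issue: Callable[[List[Hypercall]], None], capacity: int = 32):
        self._issue = issue        # stands in for one trap into the hypervisor
        self._capacity = capacity  # assumed limit on batch size
        self._pending: List[Hypercall] = []

    def queue(self, op: str, *args) -> None:
        """Add a call to the batch, flushing first if the batch is full."""
        if len(self._pending) >= self._capacity:
            self.flush()
        self._pending.append((op, args))

    def flush(self) -> None:
        """Issue everything queued so far as a single batch."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._issue(batch)
```

For a context switch, a caller would queue the cr3 reload, the TLS update, and any LDT update, then call `flush()` once, so the cost of trapping is paid a single time.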
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ian Pratt <ian.pratt@xensource.com>
Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Cc: Adrian Bunk <bunk@stusta.de>
2007-07-18 05:37:04 +04:00
2005-04-17 02:20:36 +04:00
/*
 * The IDT and GDT 'descriptors' are a strange 48-bit object
 * only used by the lidt and lgdt instructions. They are not
 * like usual segment descriptors - they consist of a 16-bit
 * segment size, and 32-bit linear address value:
 */
2012-04-19 04:16:50 +04:00
.data
2005-04-17 02:20:36 +04:00
ALIGN
# early boot GDT descriptor (must use 1:1 address mapping)
.word 0 # 32 bit align gdt_desc.address
2019-10-11 14:50:51 +03:00
SYM_DATA_START_LOCAL(boot_gdt_descr)
2005-04-17 02:20:36 +04:00
.word __BOOT_DS+7
2007-05-02 21:27:10 +04:00
.long boot_gdt - __PAGE_OFFSET
2019-10-11 14:50:51 +03:00
SYM_DATA_END(boot_gdt_descr)
2005-04-17 02:20:36 +04:00
# boot GDT descriptor (later on used by CPU#0):
.word 0 # 32 bit align gdt_desc.address
2019-10-11 14:50:51 +03:00
SYM_DATA_START(early_gdt_descr)
2005-04-17 02:20:36 +04:00
.word GDT_ENTRIES*8-1
2009-10-29 16:34:15 +03:00
.long gdt_page /* Overwritten for secondary CPUs */
2019-10-11 14:50:51 +03:00
SYM_DATA_END(early_gdt_descr)
2005-04-17 02:20:36 +04:00
/*
2007-05-02 21:27:10 +04:00
 * The boot_gdt must mirror the equivalent in setup.S and is
2005-04-17 02:20:36 +04:00
 * used only for booting.
 */
.align L1_CACHE_BYTES
2019-10-11 14:50:51 +03:00
SYM_DATA_START(boot_gdt)
2005-04-17 02:20:36 +04:00
.fill GDT_ENTRY_BOOT_CS,8,0
.quad 0x00cf9a000000ffff /* kernel 4GB code at 0x00000000 */
.quad 0x00cf92000000ffff /* kernel 4GB data at 0x00000000 */
2019-10-11 14:50:51 +03:00
SYM_DATA_END(boot_gdt)
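To make the two kinds of descriptor above concrete, the sketch below decodes an 8-byte segment descriptor using the standard x86 bit layout and packs the 48-bit limit/address object consumed by lgdt and lidt. The helper names are hypothetical, and the numbers in the usage note are examples rather than values taken from the kernel headers.

```python
import struct


def decode_segment_descriptor(raw: int) -> dict:
    """Split an 8-byte x86 segment descriptor into its fields."""
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)
    access = (raw >> 40) & 0xFF
    flags = (raw >> 52) & 0xF
    granular = bool(flags & 0x8)  # G bit: limit counted in 4 KiB pages
    return {
        "base": base,
        "limit": limit,
        "access": access,
        "flags": flags,
        "size_bytes": (limit + 1) * (4096 if granular else 1),
        "is_code": bool(access & 0x08),  # executable bit in the access byte
    }


def pack_table_descriptor(limit: int, address: int) -> bytes:
    """Pack the 48-bit object for lgdt/lidt: 16-bit size, 32-bit address."""
    return struct.pack("<HI", limit, address)
```

Decoding `0x00cf9a000000ffff` gives base 0, a page-granular limit of `0xfffff` (4 GiB), and an executable segment; `0x00cf92000000ffff` differs only in the access byte, which marks it as data. A table of 32 eight-byte entries would be described by `pack_table_descriptor(32 * 8 - 1, address)`, matching the `entries*8-1` form used for the limit above.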