License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that were:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                            # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                        270
GPL-2.0+ WITH Linux-syscall-note                       169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)     21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)     17
LGPL-2.1+ WITH Linux-syscall-note                       15
GPL-1.0+ WITH Linux-syscall-note                        14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)     5
LGPL-2.0+ WITH Linux-syscall-note                        4
LGPL-2.1 WITH Linux-syscall-note                         3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)               3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)              1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
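The comment-type distinction described above can be sketched in C. This is not the script Thomas and Greg used, which is not reproduced here; the helper names `has_suffix` and `spdx_line` and the suffix-based dispatch are assumptions made for illustration. The sketch follows the kernel convention of a `//` comment for .c sources and a `/* */` comment for headers and assembly files.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if path ends with the given suffix, 0 otherwise. */
static int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path), slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Hypothetical helper: write the SPDX tag line for a file into buf.
 * Returns 0 on success, -1 if the file type is not handled by this sketch.
 */
static int spdx_line(const char *path, const char *license,
		     char *buf, size_t len)
{
	if (has_suffix(path, ".c"))
		snprintf(buf, len, "// SPDX-License-Identifier: %s", license);
	else if (has_suffix(path, ".h") || has_suffix(path, ".S"))
		snprintf(buf, len, "/* SPDX-License-Identifier: %s */", license);
	else
		return -1;
	return 0;
}
```

For an assembly file such as head_32.S, this produces the same form as the tag line at the top of the file.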
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 17:07:57 +03:00
/* SPDX-License-Identifier: GPL-2.0 */
2005-04-17 02:20:36 +04:00
/*
 *  linux/boot/head.S
 *
 *  Copyright (C) 1991, 1992, 1993  Linus Torvalds
 */
/*
 *  head.S contains the 32-bit startup code.
 *
 *  NOTE!!! Startup happens at absolute address 0x00001000, which is also where
 *  the page directory will exist. The startup code will be overwritten by
 *  the page directory. [According to comments etc elsewhere on a compressed
 *  kernel it will end up at 0x1000 + 1Mb I hope so as I assume this. - AC]
 *
2009-05-09 02:45:17 +04:00
 * Page 0 is deliberately kept safe, since System Management Mode code in
2005-04-17 02:20:36 +04:00
 * laptops may need to access the BIOS data stored there.  This is also
2009-05-09 02:45:17 +04:00
 * useful for future device drivers that either access the BIOS via VM86
2005-04-17 02:20:36 +04:00
 * mode.
 */
/*
 * High loaded stuff by Hans Lermen & Werner Almesberger, Feb. 1996
 */
2009-05-09 02:45:17 +04:00
.text
2005-04-17 02:20:36 +04:00
2009-09-17 00:44:27 +04:00
#include <linux/init.h>
2005-04-17 02:20:36 +04:00
#include <linux/linkage.h>
#include <asm/segment.h>
2009-02-13 22:14:01 +03:00
#include <asm/page_types.h>
2006-12-07 04:14:04 +03:00
#include <asm/boot.h>
2007-10-22 03:41:35 +04:00
#include <asm/asm-offsets.h>
2015-02-19 10:34:58 +03:00
#include <asm/bootparam.h>
2005-04-17 02:20:36 +04:00
x86/build: Build compressed x86 kernels as PIE
The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
relocation to get the symbol address in PIC. When the compressed x86
kernel isn't built as PIC, the linker optimizes R_386_GOT32X relocations
to their fixed symbol addresses. However, when the compressed x86
kernel is loaded at a different address, it leads to the following
load failure:
Failed to allocate space for phdrs
during the decompression stage.
If the compressed x86 kernel is relocatable at run-time, it should be
compiled with -fPIE, instead of -fPIC, if possible and should be built as
Position Independent Executable (PIE) so that linker won't optimize
R_386_GOT32X relocation to its fixed symbol address.
Older linkers generate R_386_32 relocations against locally defined
symbols, _bss, _ebss, _got and _egot, in PIE. It isn't wrong, just less
optimal than R_386_RELATIVE. But the x86 kernel fails to properly handle
R_386_32 relocations when relocating the kernel. To generate
R_386_RELATIVE relocations, we mark _bss, _ebss, _got and _egot as
hidden in both 32-bit and 64-bit x86 kernels.
To build a 64-bit compressed x86 kernel as PIE, we need to disable the
relocation overflow check to avoid relocation overflow errors. We do
this with a new linker command-line option, -z noreloc-overflow, which
got added recently:
commit 4c10bbaa0912742322f10d9d5bb630ba4e15dfa7
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Mar 15 11:07:06 2016 -0700
Add -z noreloc-overflow option to x86-64 ld
Add -z noreloc-overflow command-line option to the x86-64 ELF linker to
disable relocation overflow check. This can be used to avoid relocation
overflow check if there will be no dynamic relocation overflow at
run-time.
The 64-bit compressed x86 kernel is built as PIE only if the linker supports
-z noreloc-overflow. So far 64-bit relocatable compressed x86 kernel
boots fine even when it is built as a normal executable.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
[ Edited the changelog and comments. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-17 06:04:35 +03:00
/*
 * The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
 * relocation to get the symbol address in PIC.  When the compressed x86
 * kernel isn't built as PIC, the linker optimizes R_386_GOT32X
 * relocations to their fixed symbol addresses.  However, when the
 * compressed x86 kernel is loaded at a different address, it leads
 * to the following load failure:
 *
 *   Failed to allocate space for phdrs
 *
 * during the decompression stage.
 *
 * If the compressed x86 kernel is relocatable at run-time, it should be
 * compiled with -fPIE, instead of -fPIC, if possible and should be built as
 * Position Independent Executable (PIE) so that linker won't optimize
 * R_386_GOT32X relocation to its fixed symbol address.  Older
 * linkers generate R_386_32 relocations against locally defined symbols,
 * _bss, _ebss, _got and _egot, in PIE.  It isn't wrong, just less
 * optimal than R_386_RELATIVE.  But the x86 kernel fails to properly handle
 * R_386_32 relocations when relocating the kernel.  To generate
 * R_386_RELATIVE relocations, we mark _bss, _ebss, _got and _egot as
 * hidden:
 */
.hidden _bss
.hidden _ebss
.hidden _got
.hidden _egot
2009-09-17 00:44:27 +04:00
	__HEAD
2019-10-11 14:51:07 +03:00
SYM_FUNC_START(startup_32)
2007-10-26 21:29:04 +04:00
	cld
	cli
2007-10-22 03:41:35 +04:00
2009-05-09 02:45:17 +04:00
/*
 * Calculate the delta between where we were compiled to run
2006-12-07 04:14:04 +03:00
 * at and where we were actually loaded at.  This can only be done
 * with a short local call on x86.  Nothing else will tell us what
 * address we are running at.  The reserved chunk of the real-mode
2007-07-11 23:18:33 +04:00
 * data at 0x1e4 (defined as a scratch field) are used as the stack
 * for this calculation. Only 4 bytes are needed.
2006-12-07 04:14:04 +03:00
 */
2009-05-09 02:45:17 +04:00
	leal	(BP_scratch+4)(%esi), %esp
	call	1f
2020-03-08 11:08:46 +03:00
1:	popl	%edx
	subl	$1b, %edx
2006-12-07 04:14:04 +03:00
2020-02-02 20:13:51 +03:00
	/* Load new GDT */
2020-03-08 11:08:46 +03:00
	leal	gdt(%edx), %eax
2020-02-02 20:13:51 +03:00
	movl	%eax, 2(%eax)
	lgdt	(%eax)
	/* Load segment registers with our descriptors */
	movl	$__BOOT_DS, %eax
	movl	%eax, %ds
	movl	%eax, %es
	movl	%eax, %fs
	movl	%eax, %gs
	movl	%eax, %ss
2009-05-09 02:45:17 +04:00
/*
2020-03-08 11:08:46 +03:00
 * %edx contains the address we are loaded at by the boot loader and %ebx
2006-12-07 04:14:04 +03:00
 * contains the address where we should move the kernel image temporarily
2020-03-08 11:08:46 +03:00
 * for safe in-place decompression. %ebp contains the address that the kernel
 * will be decompressed to.
2006-12-07 04:14:04 +03:00
 */
2006-12-07 04:14:04 +03:00
2006-12-07 04:14:04 +03:00
#ifdef CONFIG_RELOCATABLE
2020-03-08 11:08:46 +03:00
	movl	%edx, %ebx
2020-03-08 11:08:47 +03:00
#ifdef CONFIG_EFI_STUB
/*
 * If we were loaded via the EFI LoadImage service, startup_32() will be at an
 * offset to the start of the space allocated for the image. efi_pe_entry() will
 * set up image_offset to tell us where the image actually starts, so that we
 * can use the full available buffer.
 *	image_offset = startup_32 - image_base
 * Otherwise image_offset will be zero and has no effect on the calculations.
 */
	subl	image_offset(%edx), %ebx
#endif
2009-05-12 02:56:08 +04:00
	movl	BP_kernel_alignment(%esi), %eax
	decl	%eax
	addl	%eax, %ebx
	notl	%eax
	andl	%eax, %ebx
2013-10-11 04:18:14 +04:00
	cmpl	$LOAD_PHYSICAL_ADDR, %ebx
2020-03-08 11:08:44 +03:00
	jae	1f
2006-12-07 04:14:04 +03:00
#endif
2013-10-11 04:18:14 +04:00
	movl	$LOAD_PHYSICAL_ADDR, %ebx
1:
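The address selection performed by the CONFIG_RELOCATABLE block above can be restated in C as a reading aid. This is only a sketch: the function and parameter names are invented for illustration, the alignment is assumed to be a power of two, and the CONFIG_EFI_STUB `image_offset` adjustment is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round load_addr up to a multiple of align (assumed to be a power of
 * two), then fall back to min_addr if the result lies below it.
 * All names here are illustrative, not kernel symbols.
 */
static uint32_t choose_output_addr(uint32_t load_addr, uint32_t align,
				   uint32_t min_addr)
{
	uint32_t mask = align - 1;		/* decl %eax */
	uint32_t addr = load_addr + mask;	/* addl %eax, %ebx */

	addr &= ~mask;				/* notl %eax; andl %eax, %ebx */

	/* cmpl $LOAD_PHYSICAL_ADDR, %ebx; jae 1f; otherwise use the minimum */
	return addr >= min_addr ? addr : min_addr;
}
```

With a hypothetical 2 MiB alignment and a 16 MiB minimum, a load address of 0x1234000 rounds up to 0x1400000, while a load address of 0x100000 is replaced by the minimum.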
2006-12-07 04:14:04 +03:00
2020-03-08 11:08:46 +03:00
	movl	%ebx, %ebp	// Save the output address for later
2009-05-09 04:42:16 +04:00
	/* Target address to relocate to for decompression */
2020-03-08 11:08:46 +03:00
	addl	BP_init_size(%esi), %ebx
	subl	$_end, %ebx
2006-12-07 04:14:04 +03:00
2009-05-09 03:27:41 +04:00
	/* Set up the stack */
	leal	boot_stack_end(%ebx), %esp
2009-05-07 04:56:51 +04:00
	/* Zero EFLAGS */
	pushl	$0
	popfl
2009-05-09 02:45:17 +04:00
/*
 * Copy the compressed kernel to the end of our buffer
2006-12-07 04:14:04 +03:00
 * where decompression in place becomes safe.
 */
2009-05-09 02:45:17 +04:00
	pushl	%esi
2020-03-08 11:08:46 +03:00
	leal	(_bss-4)(%edx), %esi
2009-05-09 03:45:15 +04:00
	leal	(_bss-4)(%ebx), %edi
2009-05-09 03:20:34 +04:00
	movl	$(_bss - startup_32), %ecx
2009-05-09 03:45:15 +04:00
	shrl	$2, %ecx
2006-12-07 04:14:04 +03:00
	std
2009-05-09 03:45:15 +04:00
	rep	movsl
2006-12-07 04:14:04 +03:00
	cld
2009-05-09 02:45:17 +04:00
	popl	%esi
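The copy above runs from the highest word down to the lowest (std, then rep movsl), which is what keeps it correct when the destination lies above the source and the two ranges overlap. A minimal C sketch of that idea follows; the function name is hypothetical and it works on word indices rather than on the %esi/%edi/%ecx registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy nwords 32-bit words from src to dst, highest index first.
 * Safe for overlapping ranges when dst is above src, because each
 * source word is read before the copy can overwrite it.
 */
static void copy_words_backward(uint32_t *dst, const uint32_t *src,
				size_t nwords)
{
	for (size_t i = nwords; i-- > 0; )
		dst[i] = src[i];
}
```

A forward copy over the same overlapping ranges would overwrite source words before they were read.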
2006-12-07 04:14:04 +03:00
2020-02-02 20:13:51 +03:00
/*
 * The GDT may get overwritten either during the copy we just did or
 * during extract_kernel below. To avoid any issues, repoint the GDTR
2020-02-27 02:00:31 +03:00
 * to the new copy of the GDT.
2020-02-02 20:13:51 +03:00
 */
2020-02-27 02:00:31 +03:00
	leal	gdt(%ebx), %eax
	movl	%eax, 2(%eax)
	lgdt	(%eax)
2020-02-02 20:13:51 +03:00
2005-04-17 02:20:36 +04:00
/*
2006-12-07 04:14:04 +03:00
 * Jump to the relocated address.
2005-04-17 02:20:36 +04:00
 */
2019-09-06 10:55:50 +03:00
	leal	.Lrelocated(%ebx), %eax
2009-05-09 02:45:17 +04:00
	jmp	*%eax
2019-10-11 14:51:07 +03:00
SYM_FUNC_END(startup_32)
2009-02-14 00:50:23 +03:00
2017-08-24 10:33:26 +03:00
#ifdef CONFIG_EFI_STUB
2019-10-11 14:51:07 +03:00
SYM_FUNC_START(efi32_stub_entry)
2019-12-24 18:10:17 +03:00
SYM_FUNC_START_ALIAS(efi_stub_entry)
2017-08-24 10:33:26 +03:00
	add	$0x4, %esp
2020-03-08 11:08:43 +03:00
	movl	8(%esp), %esi	/* save boot_params pointer */
2017-08-24 10:33:26 +03:00
	call	efi_main
	leal	startup_32(%eax), %eax
	jmp	*%eax
2019-10-11 14:51:07 +03:00
SYM_FUNC_END(efi32_stub_entry)
2019-12-24 18:10:17 +03:00
SYM_FUNC_END_ALIAS(efi_stub_entry)
2017-08-24 10:33:26 +03:00
#endif
2009-05-09 02:45:17 +04:00
	.text
2019-10-11 14:50:47 +03:00
SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
2006-12-07 04:14:04 +03:00
2005-04-17 02:20:36 +04:00
/*
2009-05-09 03:27:41 +04:00
 * Clear BSS (stack is currently empty)
2005-04-17 02:20:36 +04:00
 */
2009-05-09 02:45:17 +04:00
	xorl	%eax, %eax
2009-05-09 03:20:34 +04:00
	leal	_bss(%ebx), %edi
2009-05-09 02:45:17 +04:00
	leal	_ebss(%ebx), %ecx
	subl	%edi, %ecx
2009-05-09 03:45:15 +04:00
	shrl	$2, %ecx
	rep	stosl
2006-12-07 04:14:04 +03:00
2014-09-23 10:05:49 +04:00
/*
 * Adjust our own GOT
 */
	leal	_got(%ebx), %edx
	leal	_egot(%ebx), %ecx
1:
	cmpl	%ecx, %edx
	jae	2f
	addl	%ebx, (%edx)
	addl	$4, %edx
	jmp	1b
2:
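The GOT fix-up loop above adds the relocation offset held in %ebx to every 32-bit entry between _got and _egot. A C sketch of the same loop, using a hypothetical function name and plain pointers in place of the linker symbols:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Add offset to every 32-bit entry in the half-open range [got, egot).
 * Mirrors the cmpl/jae/addl/addl/jmp loop; the names are illustrative.
 */
static void adjust_got(uint32_t *got, uint32_t *egot, uint32_t offset)
{
	for (uint32_t *p = got; p < egot; p++)
		*p += offset;
}
```

An empty range (got == egot) leaves memory untouched, matching the `jae 2f` exit taken on the first comparison.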
2005-04-17 02:20:36 +04:00
/*
2016-04-18 19:42:13 +03:00
 * Do the extraction, and jump to the new kernel..
2005-04-17 02:20:36 +04:00
 */
2016-04-18 19:42:13 +03:00
	/* push arguments for extract_kernel: */
2014-10-31 16:40:38 +03:00
	pushl	$z_output_len	/* decompressed length, end of relocs */
x86/boot: Move compressed kernel to the end of the decompression buffer
This change makes later calculations about where the kernel is located
easier to reason about. To better understand this change, we must first
clarify what 'VO' and 'ZO' are. These values were introduced in commits
by hpa:
77d1a4999502 ("x86, boot: make symbols from the main vmlinux available")
37ba7ab5e33c ("x86, boot: make kernel_alignment adjustable; new bzImage fields")
Specifically:
All names prefixed with 'VO_':
- relate to the uncompressed kernel image
- the size of the VO image is: VO__end-VO__text ("VO_INIT_SIZE" define)
All names prefixed with 'ZO_':
- relate to the bootable compressed kernel image (boot/compressed/vmlinux),
which is composed of the following memory areas:
- head text
- compressed kernel (VO image and relocs table)
- decompressor code
- the size of the ZO image is: ZO__end - ZO_startup_32 ("ZO_INIT_SIZE" define, though see below)
The 'INIT_SIZE' value is used to find the larger of the two image sizes:
#define ZO_INIT_SIZE (ZO__end - ZO_startup_32 + ZO_z_extract_offset)
#define VO_INIT_SIZE (VO__end - VO__text)
#if ZO_INIT_SIZE > VO_INIT_SIZE
# define INIT_SIZE ZO_INIT_SIZE
#else
# define INIT_SIZE VO_INIT_SIZE
#endif
The current code uses extract_offset to decide where to position the
copied ZO (i.e. ZO starts at extract_offset). (This is why ZO_INIT_SIZE
currently includes the extract_offset.)
Why does z_extract_offset exist? It's needed because we are trying to minimize
the amount of RAM used for the whole act of creating an uncompressed, executable,
properly relocation-linked kernel image in system memory. We do this so that
kernels can be booted on even very small systems.
To achieve the goal of minimal memory consumption we have implemented an in-place
decompression strategy: instead of cleanly separating the VO and ZO images and
also allocating some memory for the decompression code's runtime needs, we instead
create this elaborate layout of memory buffers where the output (decompressed)
stream, as it progresses, overlaps with and destroys the input (compressed)
stream. This can only be done safely if the ZO image is placed to the end of the
VO range, plus a certain amount of safety distance to make sure that when the last
bytes of the VO range are decompressed, the compressed stream pointer is safely
beyond the end of the VO range.
z_extract_offset is calculated in arch/x86/boot/compressed/mkpiggy.c during
the build process, at a point when we know the exact compressed and
uncompressed size of the kernel images and can calculate this safe minimum
offset value. (Note that the mkpiggy.c calculation is not perfect, because
we don't know the decompressor used at that stage, so the z_extract_offset
calculation is necessarily imprecise and is mostly based on gzip internals -
we'll improve that in the next patch.)
When INIT_SIZE is bigger than VO_INIT_SIZE (uncommon but possible),
the copied ZO occupies the memory from extract_offset to the end of
decompression buffer. It overlaps with the soon-to-be-uncompressed kernel
like this:
                            |-----compressed kernel image------|
                            V                                  V
  0                       extract_offset                      +INIT_SIZE
  |-----------|---------------|-------------------------|--------|
              |               |                         |        |
            VO__text      startup_32 of ZO          VO__end    ZO__end
              ^                                         ^
              |-------uncompressed kernel image---------|
When INIT_SIZE is equal to VO_INIT_SIZE (likely) there's still space
left from end of ZO to the end of decompressing buffer, like below.
                            |-compressed kernel image-|
                            V                         V
  0                       extract_offset                      +INIT_SIZE
  |-----------|---------------|-------------------------|--------|
              |               |                         |        |
            VO__text      startup_32 of ZO          ZO__end    VO__end
              ^                                                  ^
              |------------uncompressed kernel image-------------|
To simplify calculations and avoid special cases, it is cleaner to
always place the compressed kernel image in memory so that ZO__end
is at the end of the decompression buffer, instead of placing it at
the start of extract_offset as is currently done.
This patch adds BP_init_size (which is the INIT_SIZE as passed in from
the boot_params) into asm-offsets.c to make it visible to the assembly
code.
Then when moving the ZO, it calculates the starting position of
the copied ZO (via BP_init_size and the ZO run size) so that ZO__end
will be at the end of the decompression buffer. To make the position
calculation safe, the end of ZO is page aligned (and a comment is added
to the existing VO alignment for good measure).
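The placement rule can be written out as arithmetic. The sketch below is a C rendering under stated assumptions: `output` stands for the start of the decompression buffer, `init_size` for the value read through BP_init_size, and `zo_size` for the ZO run size (the `_end` offset relative to startup_32). The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Start address for the copied ZO such that the ZO ends exactly at the
 * end of the decompression buffer (output + init_size).
 * Illustrative names only; assumes zo_size <= init_size.
 */
static uint32_t zo_copy_start(uint32_t output, uint32_t init_size,
			      uint32_t zo_size)
{
	return output + init_size - zo_size;
}
```

For a hypothetical buffer at 0x1000000 with init_size 0x900000 and a ZO of 0x300000 bytes, the copy starts at 0x1600000 and ends at 0x1900000, the end of the buffer.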
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
[ Rewrote changelog and comments. ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: lasse.collin@tukaani.org
Link: http://lkml.kernel.org/r/1461888548-32439-3-git-send-email-keescook@chromium.org
[ Rewrote the changelog some more. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29 03:09:04 +03:00
2020-03-08 11:08:46 +03:00
	pushl	%ebp		/* output address */
2009-05-09 04:42:16 +04:00
	pushl	$z_input_len	/* input_len */
2009-05-09 02:45:17 +04:00
	leal	input_data(%ebx), %eax
	pushl	%eax		/* input_data */
	leal	boot_heap(%ebx), %eax
	pushl	%eax		/* heap area */
	pushl	%esi		/* real mode pointer */
2016-04-18 19:42:13 +03:00
	call	extract_kernel	/* returns kernel location in %eax */
2016-04-29 03:09:07 +03:00
	addl	$24, %esp
2005-04-17 02:20:36 +04:00
/*
2016-04-18 19:42:13 +03:00
 * Jump to the extracted kernel.
2005-04-17 02:20:36 +04:00
 */
2009-05-09 02:45:17 +04:00
	xorl	%ebx, %ebx
2013-10-11 04:18:14 +04:00
	jmp	*%eax
2019-10-11 14:50:47 +03:00
SYM_FUNC_END(.Lrelocated)
2006-12-07 04:14:04 +03:00
2020-02-02 20:13:51 +03:00
.data
.balign 8
SYM_DATA_START_LOCAL(gdt)
	.word	gdt_end - gdt - 1
	.long	0
	.word	0
	.quad	0x0000000000000000	/* Reserved */
	.quad	0x00cf9a000000ffff	/* __KERNEL_CS */
	.quad	0x00cf92000000ffff	/* __KERNEL_DS */
SYM_DATA_END_LABEL(gdt, SYM_L_LOCAL, gdt_end)
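The two non-zero .quad values in the table above are packed x86 segment descriptors. The C sketch below splits such a value into its fields; the struct and function names are made up for illustration, and the bit positions follow the standard 32-bit segment descriptor layout.

```c
#include <assert.h>
#include <stdint.h>

/* Unpacked view of an 8-byte segment descriptor (illustrative names). */
struct seg_desc {
	uint32_t base;		/* 32-bit segment base address */
	uint32_t limit;		/* 20-bit limit, in units selected by G */
	uint8_t access;		/* present bit, DPL, S bit, type */
	uint8_t flags;		/* G, D/B, L, AVL */
};

static struct seg_desc decode_desc(uint64_t d)
{
	struct seg_desc s;

	/* limit: bits 0-15 and 48-51 */
	s.limit = (uint32_t)(d & 0xffff) | ((uint32_t)((d >> 48) & 0xf) << 16);
	/* base: bits 16-39 and 56-63 */
	s.base = (uint32_t)((d >> 16) & 0xffffff) |
		 ((uint32_t)((d >> 56) & 0xff) << 24);
	s.access = (uint8_t)((d >> 40) & 0xff);	/* bits 40-47 */
	s.flags = (uint8_t)((d >> 52) & 0xf);	/* bits 52-55 */
	return s;
}
```

Decoding 0x00cf9a000000ffff gives base 0, limit 0xfffff, access 0x9a (a present, ring 0, readable code segment) and flags 0xc (4 KiB granularity, 32-bit), which is a flat 4 GiB segment. The data descriptor differs only in its access byte, 0x92.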
2020-03-08 11:08:47 +03:00
#ifdef CONFIG_EFI_STUB
SYM_DATA(image_offset, .long 0)
#endif
2009-05-09 02:45:17 +04:00
/*
 * Stack and heap for uncompression
 */
	.bss
	.balign 4
2008-04-08 14:54:30 +04:00
boot_heap:
	.fill BOOT_HEAP_SIZE, 1, 0
boot_stack:
	.fill BOOT_STACK_SIZE, 1, 0
boot_stack_end: