License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 15:07:57 +01:00
// SPDX-License-Identifier: GPL-2.0
2005-04-16 15:20:36 -07:00
# include <linux/init.h>
init/initramfs.c: do unpacking asynchronously
Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3.
These two patches are independent, but better-together.
The second is a rather trivial patch that simply allows the developer to
change "/sbin/modprobe" to something else - e.g. the empty string, so
that all request_module() during early boot return -ENOENT early, without
even spawning a usermode helper, needlessly synchronizing with the
initramfs unpacking.
The first patch delegates decompressing the initramfs to a worker thread,
allowing do_initcalls() in main.c to proceed to the device_ and late_
initcalls without waiting for that decompression (and populating of
rootfs) to finish. Obviously, some of those later calls may rely on the
initramfs being available, so I've added synchronization points in the
firmware loader and usermodehelper paths - there might be other places
that would need this, but so far no one has been able to think of any
places I have missed.
There's not much to win if most of the functionality needed during boot is
only available as modules. But systems with a custom-made .config and
initramfs can boot faster, partly due to utilizing more than one cpu
earlier, partly by avoiding known-futile modprobe calls (which would still
trigger synchronization with the initramfs unpacking, thus eliminating
most of the first benefit).
This patch (of 2):
Most of the boot process doesn't actually need anything from the
initramfs, until of course PID1 is to be executed. So instead of doing
the decompressing and populating of the initramfs synchronously in
populate_rootfs() itself, push that off to a worker thread.
This is primarily motivated by an embedded ppc target, where unpacking
even the rather modest sized initramfs takes 0.6 seconds, which is long
enough that the external watchdog becomes unhappy that it doesn't get
attention soon enough. By doing the initramfs decompression in a worker
thread, we get to do the device_initcalls and hence start petting the
watchdog much sooner.
Normal desktops might benefit as well. On my mostly stock Ubuntu kernel,
my initramfs is a 26M xz-compressed blob, decompressing to around 126M.
That takes almost two seconds:
[ 0.201454] Trying to unpack rootfs image as initramfs...
[ 1.976633] Freeing initrd memory: 29416K
Before this patch, these lines occur consecutively in dmesg. With this
patch, the timestamps on these two lines is roughly the same as above, but
with 172 lines inbetween - so more than one cpu has been kept busy doing
work that would otherwise only happen after the populate_rootfs()
finished.
Should one of the initcalls done after rootfs_initcall time (i.e., device_
and late_ initcalls) need something from the initramfs (say, a kernel
module or a firmware blob), it will simply wait for the initramfs
unpacking to be done before proceeding, which should in theory make this
completely safe.
But if some driver pokes around in the filesystem directly and not via one
of the official kernel interfaces (i.e. request_firmware*(),
call_usermodehelper*) that theory may not hold - also, I certainly might
have missed a spot when sprinkling wait_for_initramfs(). So there is an
escape hatch in the form of an initramfs_async= command line parameter.
Link: https://lkml.kernel.org/r/20210313212528.2956377-1-linux@rasmusvillemoes.dk
Link: https://lkml.kernel.org/r/20210313212528.2956377-2-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 18:05:42 -07:00
# include <linux/async.h>
2005-04-16 15:20:36 -07:00
# include <linux/fs.h>
# include <linux/slab.h>
# include <linux/types.h>
# include <linux/fcntl.h>
# include <linux/delay.h>
# include <linux/string.h>
2008-08-13 17:26:01 +08:00
# include <linux/dirent.h>
2005-04-16 15:20:36 -07:00
# include <linux/syscalls.h>
2008-10-15 22:01:40 -07:00
# include <linux/utime.h>
2017-02-27 14:28:12 -08:00
# include <linux/file.h>
2019-09-28 11:02:26 +03:00
# include <linux/memblock.h>
2021-02-25 17:22:46 -08:00
# include <linux/mm.h>
2020-06-06 13:59:54 +02:00
# include <linux/namei.h>
2020-07-23 08:23:40 +02:00
# include <linux/init_syscalls.h>
init: move usermodehelper_enable() to populate_rootfs()
Currently, usermodehelper is enabled right before PID1 starts going
through the initcalls. However, any call of a usermodehelper from a
pure_, core_, postcore_, arch_, subsys_ or fs_ initcall is futile, as
there is no filesystem contents yet.
Up until commit e7cb072eb988 ("init/initramfs.c: do unpacking
asynchronously"), such calls, whether via some request_module(), a
legacy uevent "/sbin/hotplug" notification or something else, would
just fail silently with (presumably) -ENOENT from
kernel_execve(). However, that commit introduced the
wait_for_initramfs() synchronization hook which must be called from
the usermodehelper exec path right before the kernel_execve, in order
that request_module() et al done from *after* rootfs_initcall()
time (i.e. device_ and late_ initcalls) would continue to find a
populated initramfs as they used to.
Any call of wait_for_initramfs() done before the unpacking has been
scheduled (i.e. before rootfs_initcall time) must just return
immediately [and let the caller find an empty file system] in order
not to deadlock the machine. I mistakenly thought, and my limited
testing confirmed, that there were no such calls, so I added a
pr_warn_once() in wait_for_initramfs(). It turns out that one can
indeed hit request_module() as well as kobject_uevent_env() during
those early init calls, leading to a user-visible warning in the
kernel log emitted consistently for certain configurations.
We could just remove the pr_warn_once(), but I think it's better to
postpone enabling the usermodehelper framework until there is at least
some chance of finding the executable. That is also a little more
efficient in that a lot of work done in umh.c will be elided. However,
it does change the error seen by those early callers from -ENOENT to
-EBUSY, so there is a risk of a regression if any caller care about
the exact error value.
Link: https://lkml.kernel.org/r/20210728134638.329060-1-linux@rasmusvillemoes.dk
Fixes: e7cb072eb988 ("init/initramfs.c: do unpacking asynchronously")
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-07 20:00:03 -07:00
# include <linux/umh.h>
2005-04-16 15:20:36 -07:00
2020-07-14 08:56:19 +02:00
static ssize_t __init xwrite ( struct file * file , const char * p , size_t count ,
loff_t * pos )
initramfs: support initrd that is bigger than 2GiB
When initrd (compressed or not) is used, kernel report data corrupted with
/dev/ram0.
The root cause:
During initramfs checking, if it is initrd, it will be transferred to
/initrd.image with sys_write.
sys_write only support 2G-4K write, so if the initrd ram is more than
that, /initrd.image will not complete at all.
Add local xwrite to loop calling sys_write to workaround the problem.
Also need to use xwrite in write_buffer() to handle:
image is uncompressed cpio and there is one big file (>2G) in it.
unpack_to_rootfs ===> write_buffer ===> actions[]/do_copy
At the same time, we don't need to worry about sys_read/sys_write in
do_mounts_rd.c::crd_load. As decompressor will have fill/flush and local
buffer that is smaller than 2G.
Test with uncompressed initrd, and compressed ones with gz, bz2, lzma,xz,
lzop.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:12 -07:00
{
ssize_t out = 0 ;
/* sys_write only can write MAX_RW_COUNT aka 2G-4K bytes at most */
while ( count ) {
2020-07-14 08:56:19 +02:00
ssize_t rv = kernel_write ( file , p , count , pos ) ;
initramfs: support initrd that is bigger than 2GiB
When initrd (compressed or not) is used, kernel report data corrupted with
/dev/ram0.
The root cause:
During initramfs checking, if it is initrd, it will be transferred to
/initrd.image with sys_write.
sys_write only support 2G-4K write, so if the initrd ram is more than
that, /initrd.image will not complete at all.
Add local xwrite to loop calling sys_write to workaround the problem.
Also need to use xwrite in write_buffer() to handle:
image is uncompressed cpio and there is one big file (>2G) in it.
unpack_to_rootfs ===> write_buffer ===> actions[]/do_copy
At the same time, we don't need to worry about sys_read/sys_write in
do_mounts_rd.c::crd_load. As decompressor will have fill/flush and local
buffer that is smaller than 2G.
Test with uncompressed initrd, and compressed ones with gz, bz2, lzma,xz,
lzop.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:12 -07:00
if ( rv < 0 ) {
if ( rv = = - EINTR | | rv = = - EAGAIN )
continue ;
return out ? out : rv ;
} else if ( rv = = 0 )
break ;
p + = rv ;
out + = rv ;
count - = rv ;
}
return out ;
}
2005-04-16 15:20:36 -07:00
static __initdata char * message ;
static void __init error ( char * x )
{
if ( ! message )
message = x ;
}
2021-02-25 17:22:46 -08:00
static void panic_show_mem ( const char * fmt , . . . )
{
va_list args ;
show_mem ( 0 , NULL ) ;
va_start ( args , fmt ) ;
panic ( fmt , args ) ;
va_end ( args ) ;
}
2005-04-16 15:20:36 -07:00
/* link hash */
2006-05-15 09:44:03 -07:00
# define N_ALIGN(len) ((((len) + 1) & ~3) + 2)
2005-04-16 15:20:36 -07:00
static __initdata struct hash {
int ino , minor , major ;
2011-07-26 04:34:13 -04:00
umode_t mode ;
2005-04-16 15:20:36 -07:00
struct hash * next ;
2006-05-15 09:44:03 -07:00
char name [ N_ALIGN ( PATH_MAX ) ] ;
2005-04-16 15:20:36 -07:00
} * head [ 32 ] ;
static inline int hash ( int major , int minor , int ino )
{
unsigned long tmp = ino + minor + ( major < < 3 ) ;
tmp + = tmp > > 5 ;
return tmp & 31 ;
}
2006-06-26 00:28:02 -07:00
static char __init * find_link ( int major , int minor , int ino ,
2011-07-26 04:34:13 -04:00
umode_t mode , char * name )
2005-04-16 15:20:36 -07:00
{
struct hash * * p , * q ;
for ( p = head + hash ( major , minor , ino ) ; * p ; p = & ( * p ) - > next ) {
if ( ( * p ) - > ino ! = ino )
continue ;
if ( ( * p ) - > minor ! = minor )
continue ;
if ( ( * p ) - > major ! = major )
continue ;
2006-06-26 00:28:02 -07:00
if ( ( ( * p ) - > mode ^ mode ) & S_IFMT )
continue ;
2005-04-16 15:20:36 -07:00
return ( * p ) - > name ;
}
2008-04-29 00:59:43 -07:00
q = kmalloc ( sizeof ( struct hash ) , GFP_KERNEL ) ;
2005-04-16 15:20:36 -07:00
if ( ! q )
2021-02-25 17:22:46 -08:00
panic_show_mem ( " can't allocate link hash entry " ) ;
2005-04-16 15:20:36 -07:00
q - > major = major ;
2006-06-26 00:28:02 -07:00
q - > minor = minor ;
q - > ino = ino ;
q - > mode = mode ;
2006-05-15 09:44:03 -07:00
strcpy ( q - > name , name ) ;
2005-04-16 15:20:36 -07:00
q - > next = NULL ;
* p = q ;
return NULL ;
}
static void __init free_hash ( void )
{
struct hash * * p , * q ;
for ( p = head ; p < head + 32 ; p + + ) {
while ( * p ) {
q = * p ;
* p = q - > next ;
2008-04-29 00:59:43 -07:00
kfree ( q ) ;
2005-04-16 15:20:36 -07:00
}
}
}
2017-11-17 15:31:04 -08:00
static long __init do_utime ( char * filename , time64_t mtime )
2008-10-15 22:01:40 -07:00
{
2017-08-02 19:51:15 -07:00
struct timespec64 t [ 2 ] ;
2008-10-15 22:01:40 -07:00
t [ 0 ] . tv_sec = mtime ;
t [ 0 ] . tv_nsec = 0 ;
t [ 1 ] . tv_sec = mtime ;
t [ 1 ] . tv_nsec = 0 ;
2020-07-21 16:05:31 +02:00
return init_utimes ( filename , t ) ;
2008-10-15 22:01:40 -07:00
}
static __initdata LIST_HEAD ( dir_list ) ;
struct dir_entry {
struct list_head list ;
char * name ;
2017-11-17 15:31:04 -08:00
time64_t mtime ;
2008-10-15 22:01:40 -07:00
} ;
2017-11-17 15:31:04 -08:00
static void __init dir_add ( const char * name , time64_t mtime )
2008-10-15 22:01:40 -07:00
{
struct dir_entry * de = kmalloc ( sizeof ( struct dir_entry ) , GFP_KERNEL ) ;
if ( ! de )
2021-02-25 17:22:46 -08:00
panic_show_mem ( " can't allocate dir_entry buffer " ) ;
2008-10-15 22:01:40 -07:00
INIT_LIST_HEAD ( & de - > list ) ;
de - > name = kstrdup ( name , GFP_KERNEL ) ;
de - > mtime = mtime ;
list_add ( & de - > list , & dir_list ) ;
}
static void __init dir_utime ( void )
{
struct dir_entry * de , * tmp ;
list_for_each_entry_safe ( de , tmp , & dir_list , list ) {
list_del ( & de - > list ) ;
do_utime ( de - > name , de - > mtime ) ;
kfree ( de - > name ) ;
kfree ( de ) ;
}
}
2017-11-17 15:31:04 -08:00
static __initdata time64_t mtime ;
2008-10-15 22:01:40 -07:00
2005-04-16 15:20:36 -07:00
/* cpio header parsing */
static __initdata unsigned long ino , major , minor , nlink ;
2011-07-26 04:34:13 -04:00
static __initdata umode_t mode ;
2005-04-16 15:20:36 -07:00
static __initdata unsigned long body_len , name_len ;
static __initdata uid_t uid ;
static __initdata gid_t gid ;
static __initdata unsigned rdev ;
static void __init parse_header ( char * s )
{
unsigned long parsed [ 12 ] ;
char buf [ 9 ] ;
int i ;
buf [ 8 ] = ' \0 ' ;
for ( i = 0 , s + = 6 ; i < 12 ; i + + , s + = 8 ) {
memcpy ( buf , s , 8 ) ;
parsed [ i ] = simple_strtoul ( buf , NULL , 16 ) ;
}
ino = parsed [ 0 ] ;
mode = parsed [ 1 ] ;
uid = parsed [ 2 ] ;
gid = parsed [ 3 ] ;
nlink = parsed [ 4 ] ;
2017-11-17 15:31:04 -08:00
mtime = parsed [ 5 ] ; /* breaks in y2106 */
2005-04-16 15:20:36 -07:00
body_len = parsed [ 6 ] ;
major = parsed [ 7 ] ;
minor = parsed [ 8 ] ;
rdev = new_encode_dev ( MKDEV ( parsed [ 9 ] , parsed [ 10 ] ) ) ;
name_len = parsed [ 11 ] ;
}
/* FSM */
static __initdata enum state {
Start ,
Collect ,
GotHeader ,
SkipIt ,
GotName ,
CopyFile ,
GotSymlink ,
Reset
} state , next_state ;
static __initdata char * victim ;
2014-10-13 15:54:07 -07:00
static unsigned long byte_count __initdata ;
2005-04-16 15:20:36 -07:00
static __initdata loff_t this_header , next_header ;
2007-07-26 17:33:59 +01:00
static inline void __init eat ( unsigned n )
2005-04-16 15:20:36 -07:00
{
victim + = n ;
this_header + = n ;
2014-10-13 15:54:07 -07:00
byte_count - = n ;
2005-04-16 15:20:36 -07:00
}
static __initdata char * collected ;
initramfs: support initramfs that is bigger than 2GiB
Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.
Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.
Change len to long, that should be ok as on 32 bit platform long is
32bits.
Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rashika Kheria <rashika.kheria@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: P J P <ppandit@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:14 -07:00
static long remains __initdata ;
2005-04-16 15:20:36 -07:00
static __initdata char * collect ;
static void __init read_into ( char * buf , unsigned size , enum state next )
{
2014-10-13 15:54:07 -07:00
if ( byte_count > = size ) {
2005-04-16 15:20:36 -07:00
collected = victim ;
eat ( size ) ;
state = next ;
} else {
collect = collected = buf ;
remains = size ;
next_state = next ;
state = Collect ;
}
}
static __initdata char * header_buf , * symlink_buf , * name_buf ;
static int __init do_start ( void )
{
read_into ( header_buf , 110 , GotHeader ) ;
return 0 ;
}
static int __init do_collect ( void )
{
initramfs: support initramfs that is bigger than 2GiB
Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.
Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.
Change len to long, that should be ok as on 32 bit platform long is
32bits.
Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rashika Kheria <rashika.kheria@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: P J P <ppandit@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:14 -07:00
unsigned long n = remains ;
2014-10-13 15:54:07 -07:00
if ( byte_count < n )
n = byte_count ;
2005-04-16 15:20:36 -07:00
memcpy ( collect , victim , n ) ;
eat ( n ) ;
collect + = n ;
if ( ( remains - = n ) ! = 0 )
return 1 ;
state = next_state ;
return 0 ;
}
static int __init do_header ( void )
{
2006-12-06 20:37:19 -08:00
if ( memcmp ( collected , " 070707 " , 6 ) = = 0 ) {
error ( " incorrect cpio method used: use -H newc option " ) ;
return 1 ;
}
2005-04-16 15:20:36 -07:00
if ( memcmp ( collected , " 070701 " , 6 ) ) {
error ( " no cpio magic " ) ;
return 1 ;
}
parse_header ( collected ) ;
next_header = this_header + N_ALIGN ( name_len ) + body_len ;
next_header = ( next_header + 3 ) & ~ 3 ;
state = SkipIt ;
if ( name_len < = 0 | | name_len > PATH_MAX )
return 0 ;
if ( S_ISLNK ( mode ) ) {
if ( body_len > PATH_MAX )
return 0 ;
collect = collected = symlink_buf ;
remains = N_ALIGN ( name_len ) + body_len ;
next_state = GotSymlink ;
state = Collect ;
return 0 ;
}
if ( S_ISREG ( mode ) | | ! body_len )
read_into ( name_buf , N_ALIGN ( name_len ) , GotName ) ;
return 0 ;
}
static int __init do_skip ( void )
{
2014-10-13 15:54:07 -07:00
if ( this_header + byte_count < next_header ) {
eat ( byte_count ) ;
2005-04-16 15:20:36 -07:00
return 1 ;
} else {
eat ( next_header - this_header ) ;
state = next_state ;
return 0 ;
}
}
static int __init do_reset ( void )
{
2014-10-13 15:54:07 -07:00
while ( byte_count & & * victim = = ' \0 ' )
2005-04-16 15:20:36 -07:00
eat ( 1 ) ;
2014-10-13 15:54:07 -07:00
if ( byte_count & & ( this_header & 3 ) )
2005-04-16 15:20:36 -07:00
error ( " broken padding " ) ;
return 1 ;
}
2014-10-13 15:54:07 -07:00
static void __init clean_path ( char * path , umode_t fmode )
2006-06-26 00:28:02 -07:00
{
2017-05-08 15:57:00 -07:00
struct kstat st ;
2006-06-26 00:28:02 -07:00
2020-09-04 09:53:32 -04:00
if ( ! init_stat ( path , & st , AT_SYMLINK_NOFOLLOW ) & &
2020-07-22 11:15:40 +02:00
( st . mode ^ fmode ) & S_IFMT ) {
2017-05-08 15:57:00 -07:00
if ( S_ISDIR ( st . mode ) )
2020-07-22 11:11:45 +02:00
init_rmdir ( path ) ;
2006-06-26 00:28:02 -07:00
else
2020-07-23 08:23:40 +02:00
init_unlink ( path ) ;
2006-06-26 00:28:02 -07:00
}
}
initramfs: clean old path before creating a hardlink
sys_link() can fail due to the new path already existing. This case
ofen occurs when we use a concated initrd, for example:
1) prepare a basic rootfs, it contains a regular files rc.local
lizhijian@:~/yocto-tiny-i386-2016-04-22$ cat etc/rc.local
#!/bin/sh
echo "Running /etc/rc.local..."
yocto-tiny-i386-2016-04-22$ find . | sed 's,^\./,,' | cpio -o -H newc | gzip -n -9 >../rootfs.cgz
2) create a extra initrd which also includes a etc/rc.local
lizhijian@:~/lkp-x86_64/etc$ echo "append initrd" >rc.local
lizhijian@:~/lkp/lkp-x86_64/etc$ cat rc.local
append initrd
lizhijian@:~/lkp/lkp-x86_64/etc$ ln rc.local rc.local.hardlink
append initrd
lizhijian@:~/lkp/lkp-x86_64/etc$ stat rc.local rc.local.hardlink
File: 'rc.local'
Size: 14 Blocks: 8 IO Block: 4096 regular file
Device: 801h/2049d Inode: 11296086 Links: 2
Access: (0664/-rw-rw-r--) Uid: ( 1002/lizhijian) Gid: ( 1002/lizhijian)
Access: 2018-11-15 16:08:28.654464815 +0800
Modify: 2018-11-15 16:07:57.514903210 +0800
Change: 2018-11-15 16:08:24.180228872 +0800
Birth: -
File: 'rc.local.hardlink'
Size: 14 Blocks: 8 IO Block: 4096 regular file
Device: 801h/2049d Inode: 11296086 Links: 2
Access: (0664/-rw-rw-r--) Uid: ( 1002/lizhijian) Gid: ( 1002/lizhijian)
Access: 2018-11-15 16:08:28.654464815 +0800
Modify: 2018-11-15 16:07:57.514903210 +0800
Change: 2018-11-15 16:08:24.180228872 +0800
Birth: -
lizhijian@:~/lkp/lkp-x86_64$ find . | sed 's,^\./,,' | cpio -o -H newc | gzip -n -9 >../rc-local.cgz
lizhijian@:~/lkp/lkp-x86_64$ gzip -dc ../rc-local.cgz | cpio -t
.
etc
etc/rc.local.hardlink <<< it will be extracted first at this initrd
etc/rc.local
3) concate 2 initrds and boot
lizhijian@:~/lkp$ cat rootfs.cgz rc-local.cgz >concate-initrd.cgz
lizhijian@:~/lkp$ qemu-system-x86_64 -nographic -enable-kvm -cpu host -smp 1 -m 1024 -kernel ~/lkp/linux/arch/x86/boot/bzImage -append "console=ttyS0 earlyprint=ttyS0 ignore_loglevel" -initrd ./concate-initr.cgz -serial stdio -nodefaults
In this case, sys_link(2) will fail and return -EEXIST, so we can only get
the rc.local at rootfs.cgz instead of rc-local.cgz
[akpm@linux-foundation.org: move code to avoid forward declaration]
Link: http://lkml.kernel.org/r/1542352368-13299-1-git-send-email-lizhijian@cn.fujitsu.com
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: Philip Li <philip.li@intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Li Zhijian <zhijianx.li@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-30 14:10:09 -08:00
static int __init maybe_link ( void )
{
if ( nlink > = 2 ) {
char * old = find_link ( major , minor , ino , mode , collected ) ;
if ( old ) {
clean_path ( collected , 0 ) ;
2020-07-22 11:14:19 +02:00
return ( init_link ( old , collected ) < 0 ) ? - 1 : 1 ;
initramfs: clean old path before creating a hardlink
sys_link() can fail due to the new path already existing. This case
ofen occurs when we use a concated initrd, for example:
1) prepare a basic rootfs, it contains a regular files rc.local
lizhijian@:~/yocto-tiny-i386-2016-04-22$ cat etc/rc.local
#!/bin/sh
echo "Running /etc/rc.local..."
yocto-tiny-i386-2016-04-22$ find . | sed 's,^\./,,' | cpio -o -H newc | gzip -n -9 >../rootfs.cgz
2) create a extra initrd which also includes a etc/rc.local
lizhijian@:~/lkp-x86_64/etc$ echo "append initrd" >rc.local
lizhijian@:~/lkp/lkp-x86_64/etc$ cat rc.local
append initrd
lizhijian@:~/lkp/lkp-x86_64/etc$ ln rc.local rc.local.hardlink
append initrd
lizhijian@:~/lkp/lkp-x86_64/etc$ stat rc.local rc.local.hardlink
File: 'rc.local'
Size: 14 Blocks: 8 IO Block: 4096 regular file
Device: 801h/2049d Inode: 11296086 Links: 2
Access: (0664/-rw-rw-r--) Uid: ( 1002/lizhijian) Gid: ( 1002/lizhijian)
Access: 2018-11-15 16:08:28.654464815 +0800
Modify: 2018-11-15 16:07:57.514903210 +0800
Change: 2018-11-15 16:08:24.180228872 +0800
Birth: -
File: 'rc.local.hardlink'
Size: 14 Blocks: 8 IO Block: 4096 regular file
Device: 801h/2049d Inode: 11296086 Links: 2
Access: (0664/-rw-rw-r--) Uid: ( 1002/lizhijian) Gid: ( 1002/lizhijian)
Access: 2018-11-15 16:08:28.654464815 +0800
Modify: 2018-11-15 16:07:57.514903210 +0800
Change: 2018-11-15 16:08:24.180228872 +0800
Birth: -
lizhijian@:~/lkp/lkp-x86_64$ find . | sed 's,^\./,,' | cpio -o -H newc | gzip -n -9 >../rc-local.cgz
lizhijian@:~/lkp/lkp-x86_64$ gzip -dc ../rc-local.cgz | cpio -t
.
etc
etc/rc.local.hardlink <<< it will be extracted first at this initrd
etc/rc.local
3) concate 2 initrds and boot
lizhijian@:~/lkp$ cat rootfs.cgz rc-local.cgz >concate-initrd.cgz
lizhijian@:~/lkp$ qemu-system-x86_64 -nographic -enable-kvm -cpu host -smp 1 -m 1024 -kernel ~/lkp/linux/arch/x86/boot/bzImage -append "console=ttyS0 earlyprint=ttyS0 ignore_loglevel" -initrd ./concate-initr.cgz -serial stdio -nodefaults
In this case, sys_link(2) will fail and return -EEXIST, so we can only get
the rc.local at rootfs.cgz instead of rc-local.cgz
[akpm@linux-foundation.org: move code to avoid forward declaration]
Link: http://lkml.kernel.org/r/1542352368-13299-1-git-send-email-lizhijian@cn.fujitsu.com
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: Philip Li <philip.li@intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Li Zhijian <zhijianx.li@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-30 14:10:09 -08:00
}
}
return 0 ;
}
2020-07-14 08:56:19 +02:00
static __initdata struct file * wfile ;
static __initdata loff_t wfile_pos ;
2005-04-16 15:20:36 -07:00
static int __init do_name ( void )
{
state = SkipIt ;
next_state = Reset ;
if ( strcmp ( collected , " TRAILER!!! " ) = = 0 ) {
free_hash ( ) ;
return 0 ;
}
2006-06-26 00:28:02 -07:00
clean_path ( collected , mode ) ;
2005-04-16 15:20:36 -07:00
if ( S_ISREG ( mode ) ) {
2006-06-26 00:28:02 -07:00
int ml = maybe_link ( ) ;
if ( ml > = 0 ) {
int openflags = O_WRONLY | O_CREAT ;
if ( ml ! = 1 )
openflags | = O_TRUNC ;
2020-07-14 08:56:19 +02:00
wfile = filp_open ( collected , openflags , mode ) ;
if ( IS_ERR ( wfile ) )
return 0 ;
wfile_pos = 0 ;
vfs_fchown ( wfile , uid , gid ) ;
vfs_fchmod ( wfile , mode ) ;
if ( body_len )
vfs_truncate ( & wfile - > f_path , body_len ) ;
state = CopyFile ;
2005-04-16 15:20:36 -07:00
}
} else if ( S_ISDIR ( mode ) ) {
2020-07-22 11:14:59 +02:00
init_mkdir ( collected , mode ) ;
2020-07-22 11:13:26 +02:00
init_chown ( collected , uid , gid , 0 ) ;
2020-07-22 11:41:02 +02:00
init_chmod ( collected , mode ) ;
2008-10-15 22:01:40 -07:00
dir_add ( collected , mtime ) ;
2005-04-16 15:20:36 -07:00
} else if ( S_ISBLK ( mode ) | | S_ISCHR ( mode ) | |
S_ISFIFO ( mode ) | | S_ISSOCK ( mode ) ) {
if ( maybe_link ( ) = = 0 ) {
2020-07-22 11:41:20 +02:00
init_mknod ( collected , mode , rdev ) ;
2020-07-22 11:13:26 +02:00
init_chown ( collected , uid , gid , 0 ) ;
2020-07-22 11:41:02 +02:00
init_chmod ( collected , mode ) ;
2008-10-15 22:01:40 -07:00
do_utime ( collected , mtime ) ;
2005-04-16 15:20:36 -07:00
}
}
return 0 ;
}
static int __init do_copy ( void )
{
2014-10-13 15:54:07 -07:00
if ( byte_count > = body_len ) {
2020-07-30 08:19:00 +02:00
struct timespec64 t [ 2 ] = { } ;
2020-07-14 08:56:19 +02:00
if ( xwrite ( wfile , victim , body_len , & wfile_pos ) ! = body_len )
2014-08-08 14:23:16 -07:00
error ( " write error " ) ;
2020-07-30 08:19:00 +02:00
t [ 0 ] . tv_sec = mtime ;
t [ 1 ] . tv_sec = mtime ;
vfs_utimes ( & wfile - > f_path , t ) ;
2020-07-14 08:56:19 +02:00
fput ( wfile ) ;
2005-04-16 15:20:36 -07:00
eat ( body_len ) ;
state = SkipIt ;
return 0 ;
} else {
2020-07-14 08:56:19 +02:00
if ( xwrite ( wfile , victim , byte_count , & wfile_pos ) ! = byte_count )
2014-08-08 14:23:16 -07:00
error ( " write error " ) ;
2014-10-13 15:54:07 -07:00
body_len - = byte_count ;
eat ( byte_count ) ;
2005-04-16 15:20:36 -07:00
return 1 ;
}
}
static int __init do_symlink ( void )
{
collected [ N_ALIGN ( name_len ) + body_len ] = ' \0 ' ;
2006-06-26 00:28:02 -07:00
clean_path ( collected , 0 ) ;
2020-07-22 11:14:36 +02:00
init_symlink ( collected + N_ALIGN ( name_len ) , collected ) ;
2020-07-22 11:13:26 +02:00
init_chown ( collected , uid , gid , AT_SYMLINK_NOFOLLOW ) ;
2008-10-15 22:01:40 -07:00
do_utime ( collected , mtime ) ;
2005-04-16 15:20:36 -07:00
state = SkipIt ;
next_state = Reset ;
return 0 ;
}
static __initdata int ( * actions [ ] ) ( void ) = {
[ Start ] = do_start ,
[ Collect ] = do_collect ,
[ GotHeader ] = do_header ,
[ SkipIt ] = do_skip ,
[ GotName ] = do_name ,
[ CopyFile ] = do_copy ,
[ GotSymlink ] = do_symlink ,
[ Reset ] = do_reset ,
} ;
initramfs: support initramfs that is bigger than 2GiB
Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.
Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.
Change len to long, that should be ok as on 32 bit platform long is
32bits.
Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rashika Kheria <rashika.kheria@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: P J P <ppandit@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:14 -07:00
static long __init write_buffer ( char * buf , unsigned long len )
2005-04-16 15:20:36 -07:00
{
2014-10-13 15:54:07 -07:00
byte_count = len ;
2005-04-16 15:20:36 -07:00
victim = buf ;
while ( ! actions [ state ] ( ) )
;
2014-10-13 15:54:07 -07:00
return len - byte_count ;
2005-04-16 15:20:36 -07:00
}
initramfs: support initramfs that is bigger than 2GiB
Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.
Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.
Change len to long, that should be ok as on 32 bit platform long is
32bits.
Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rashika Kheria <rashika.kheria@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: P J P <ppandit@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:14 -07:00
static long __init flush_buffer ( void * bufv , unsigned long len )
2005-04-16 15:20:36 -07:00
{
2009-01-04 22:46:17 +01:00
char * buf = ( char * ) bufv ;
initramfs: support initramfs that is bigger than 2GiB
Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.
Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.
Change len to long, that should be ok as on 32 bit platform long is
32bits.
Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rashika Kheria <rashika.kheria@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: P J P <ppandit@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:14 -07:00
long written ;
long origLen = len ;
2005-04-16 15:20:36 -07:00
if ( message )
2009-01-04 22:46:17 +01:00
return - 1 ;
2005-04-16 15:20:36 -07:00
while ( ( written = write_buffer ( buf , len ) ) < len & & ! message ) {
char c = buf [ written ] ;
if ( c = = ' 0 ' ) {
buf + = written ;
len - = written ;
state = Start ;
} else if ( c = = 0 ) {
buf + = written ;
len - = written ;
state = Reset ;
} else
2019-03-07 16:30:19 -08:00
error ( " junk within compressed archive " ) ;
2005-04-16 15:20:36 -07:00
}
2009-01-04 22:46:17 +01:00
return origLen ;
2005-04-16 15:20:36 -07:00
}
initramfs: support initramfs that is bigger than 2GiB
Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.
Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.
Change len to long, that should be ok as on 32 bit platform long is
32bits.
Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rashika Kheria <rashika.kheria@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: P J P <ppandit@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:14 -07:00
static unsigned long my_inptr ; /* index of next byte to be processed in inbuf */
2005-04-16 15:20:36 -07:00
2009-01-08 15:14:17 -08:00
# include <linux/decompress/generic.h>
2005-04-16 15:20:36 -07:00
initramfs: support initramfs that is bigger than 2GiB
Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.
Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.
Change len to long, that should be ok as on 32 bit platform long is
32bits.
Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rashika Kheria <rashika.kheria@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: P J P <ppandit@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:14 -07:00
static char * __init unpack_to_rootfs ( char * buf , unsigned long len )
2005-04-16 15:20:36 -07:00
{
initramfs: support initramfs that is bigger than 2GiB
Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.
Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.
Change len to long, that should be ok as on 32 bit platform long is
32bits.
Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rashika Kheria <rashika.kheria@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: P J P <ppandit@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:14 -07:00
long written ;
2009-01-08 15:14:17 -08:00
decompress_fn decompress ;
2009-01-12 14:24:04 -08:00
const char * compress_name ;
static __initdata char msg_buf [ 64 ] ;
2009-01-08 15:14:17 -08:00
2008-04-29 00:59:43 -07:00
header_buf = kmalloc ( 110 , GFP_KERNEL ) ;
symlink_buf = kmalloc ( PATH_MAX + N_ALIGN ( PATH_MAX ) + 1 , GFP_KERNEL ) ;
name_buf = kmalloc ( N_ALIGN ( PATH_MAX ) , GFP_KERNEL ) ;
2009-01-04 22:46:17 +01:00
if ( ! header_buf | | ! symlink_buf | | ! name_buf )
2021-02-25 17:22:46 -08:00
panic_show_mem ( " can't allocate buffers " ) ;
2009-01-04 22:46:17 +01:00
2005-04-16 15:20:36 -07:00
state = Start ;
this_header = 0 ;
message = NULL ;
while ( ! message & & len ) {
loff_t saved_offset = this_header ;
if ( * buf = = ' 0 ' & & ! ( this_header & 3 ) ) {
state = Start ;
written = write_buffer ( buf , len ) ;
buf + = written ;
len - = written ;
continue ;
}
if ( ! * buf ) {
buf + + ;
len - - ;
this_header + + ;
continue ;
}
this_header = 0 ;
2009-01-12 14:24:04 -08:00
decompress = decompress_method ( buf , len , & compress_name ) ;
2014-04-07 15:39:16 -07:00
pr_debug ( " Detected %s compressed data \n " , compress_name ) ;
2009-12-14 21:45:19 +00:00
if ( decompress ) {
initramfs: support initramfs that is bigger than 2GiB
Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.
Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.
Change len to long, that should be ok as on 32 bit platform long is
32bits.
Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rashika Kheria <rashika.kheria@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: P J P <ppandit@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: "Daniel M. Weeks" <dan@danweeks.net>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 14:23:14 -07:00
int res = decompress ( buf , len , NULL , flush_buffer , NULL ,
2009-01-08 15:14:17 -08:00
& my_inptr , error ) ;
2009-12-14 21:45:19 +00:00
if ( res )
error ( " decompressor failed " ) ;
} else if ( compress_name ) {
2009-01-12 14:24:04 -08:00
if ( ! message ) {
snprintf ( msg_buf , sizeof msg_buf ,
" compression method %s not configured " ,
compress_name ) ;
message = msg_buf ;
}
2010-04-23 13:18:11 -04:00
} else
2019-03-07 16:30:19 -08:00
error ( " invalid magic at start of compressed archive " ) ;
2005-04-16 15:20:36 -07:00
if ( state ! = Reset )
2019-03-07 16:30:19 -08:00
error ( " junk at the end of compressed archive " ) ;
2009-01-04 22:46:17 +01:00
this_header = saved_offset + my_inptr ;
buf + = my_inptr ;
len - = my_inptr ;
2005-04-16 15:20:36 -07:00
}
2008-10-15 22:01:40 -07:00
dir_utime ( ) ;
2008-04-29 00:59:43 -07:00
kfree ( name_buf ) ;
kfree ( symlink_buf ) ;
kfree ( header_buf ) ;
2005-04-16 15:20:36 -07:00
return message ;
}
2007-02-10 01:44:33 -08:00
static int __initdata do_retain_initrd ;
static int __init retain_initrd_param ( char * str )
{
if ( * str )
return 0 ;
do_retain_initrd = 1 ;
return 1 ;
}
__setup ( " retain_initrd " , retain_initrd_param ) ;
2019-05-13 17:18:30 -07:00
# ifdef CONFIG_ARCH_HAS_KEEPINITRD
static int __init keepinitrd_setup ( char * __unused )
{
do_retain_initrd = 1 ;
return 1 ;
}
__setup ( " keepinitrd " , keepinitrd_setup ) ;
# endif
init/initramfs.c: do unpacking asynchronously
Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3.
These two patches are independent, but better-together.
The second is a rather trivial patch that simply allows the developer to
change "/sbin/modprobe" to something else - e.g. the empty string, so
that all request_module() during early boot return -ENOENT early, without
even spawning a usermode helper, needlessly synchronizing with the
initramfs unpacking.
The first patch delegates decompressing the initramfs to a worker thread,
allowing do_initcalls() in main.c to proceed to the device_ and late_
initcalls without waiting for that decompression (and populating of
rootfs) to finish. Obviously, some of those later calls may rely on the
initramfs being available, so I've added synchronization points in the
firmware loader and usermodehelper paths - there might be other places
that would need this, but so far no one has been able to think of any
places I have missed.
There's not much to win if most of the functionality needed during boot is
only available as modules. But systems with a custom-made .config and
initramfs can boot faster, partly due to utilizing more than one cpu
earlier, partly by avoiding known-futile modprobe calls (which would still
trigger synchronization with the initramfs unpacking, thus eliminating
most of the first benefit).
This patch (of 2):
Most of the boot process doesn't actually need anything from the
initramfs, until of course PID1 is to be executed. So instead of doing
the decompressing and populating of the initramfs synchronously in
populate_rootfs() itself, push that off to a worker thread.
This is primarily motivated by an embedded ppc target, where unpacking
even the rather modest sized initramfs takes 0.6 seconds, which is long
enough that the external watchdog becomes unhappy that it doesn't get
attention soon enough. By doing the initramfs decompression in a worker
thread, we get to do the device_initcalls and hence start petting the
watchdog much sooner.
Normal desktops might benefit as well. On my mostly stock Ubuntu kernel,
my initramfs is a 26M xz-compressed blob, decompressing to around 126M.
That takes almost two seconds:
[ 0.201454] Trying to unpack rootfs image as initramfs...
[ 1.976633] Freeing initrd memory: 29416K
Before this patch, these lines occur consecutively in dmesg. With this
patch, the timestamps on these two lines is roughly the same as above, but
with 172 lines inbetween - so more than one cpu has been kept busy doing
work that would otherwise only happen after the populate_rootfs()
finished.
Should one of the initcalls done after rootfs_initcall time (i.e., device_
and late_ initcalls) need something from the initramfs (say, a kernel
module or a firmware blob), it will simply wait for the initramfs
unpacking to be done before proceeding, which should in theory make this
completely safe.
But if some driver pokes around in the filesystem directly and not via one
of the official kernel interfaces (i.e. request_firmware*(),
call_usermodehelper*) that theory may not hold - also, I certainly might
have missed a spot when sprinkling wait_for_initramfs(). So there is an
escape hatch in the form of an initramfs_async= command line parameter.
Link: https://lkml.kernel.org/r/20210313212528.2956377-1-linux@rasmusvillemoes.dk
Link: https://lkml.kernel.org/r/20210313212528.2956377-2-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 18:05:42 -07:00
static bool __initdata initramfs_async = true ;
static int __init initramfs_async_setup ( char * str )
{
strtobool ( str , & initramfs_async ) ;
return 1 ;
}
__setup ( " initramfs_async= " , initramfs_async_setup ) ;
initramfs: fix initramfs size calculation
The size of a built-in initramfs is calculated in init/initramfs.c by
"__initramfs_end - __initramfs_start". Those symbols are defined in the
linker script include/asm-generic/vmlinux.lds.h:
#define INIT_RAM_FS \
. = ALIGN(PAGE_SIZE); \
VMLINUX_SYMBOL(__initramfs_start) = .; \
*(.init.ramfs) \
VMLINUX_SYMBOL(__initramfs_end) = .;
If the initramfs file has an odd number of bytes, the "__initramfs_end"
symbol points to an odd address, for example, the symbols in the
System.map might look like:
0000000000572000 T __initramfs_start
00000000005bcd05 T __initramfs_end <-- odd address
At least on s390 this causes a problem:
Certain s390 instructions, especially instructions for loading addresses
(larl) or branch addresses must be on even addresses. The compiler loads
the symbol addresses with the "larl" instruction. This instruction sets
the last bit to 0 and, therefore, for odd size files, the calculated size
is one byte less than it should be:
0000000000540a9c <populate_rootfs>:
540a9c: eb cf f0 78 00 24 stmg %r12,%r15,120(%r15),
540aa2: c0 10 00 01 8a af larl %r1,572000 <__initramfs_start>
540aa8: c0 c0 00 03 e1 2e larl %r12,5bcd04 <initramfs_end>
(Instead of 5bcd05)
...
540abe: 1b c1 sr %r12,%r1
To fix the problem, this patch introduces the global variable
__initramfs_size, which is calculated in the "usr/initramfs_data.S" file.
The populate_rootfs() function can then use the start marker of the
.init.ramfs section and the value of __initramfs_size for loading the
initramfs. Because the start marker and size is sufficient, the
__initramfs_end symbol is no longer needed and is removed.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Michal Marek <mmarek@suse.cz>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-09-17 15:24:11 -07:00
extern char __initramfs_start [ ] ;
extern unsigned long __initramfs_size ;
2005-04-16 15:20:36 -07:00
# include <linux/initrd.h>
2006-02-10 01:51:05 -08:00
# include <linux/kexec.h>
2005-09-13 01:25:12 -07:00
2021-01-15 13:46:04 +08:00
void __init reserve_initrd_mem ( void )
{
phys_addr_t start ;
unsigned long size ;
/* Ignore the virtul address computed during device tree parsing */
initrd_start = initrd_end = 0 ;
if ( ! phys_initrd_size )
return ;
/*
* Round the memory region to page boundaries as per free_initrd_mem ( )
* This allows us to detect whether the pages overlapping the initrd
* are in use , but more importantly , reserves the entire set of pages
* as we don ' t want these pages allocated for other purposes .
*/
start = round_down ( phys_initrd_start , PAGE_SIZE ) ;
size = phys_initrd_size + ( phys_initrd_start - start ) ;
size = round_up ( size , PAGE_SIZE ) ;
if ( ! memblock_is_region_memory ( start , size ) ) {
pr_err ( " INITRD: 0x%08llx+0x%08lx is not a memory region " ,
( u64 ) start , size ) ;
goto disable ;
}
if ( memblock_is_region_reserved ( start , size ) ) {
pr_err ( " INITRD: 0x%08llx+0x%08lx overlaps in-use memory region \n " ,
( u64 ) start , size ) ;
goto disable ;
}
memblock_reserve ( start , size ) ;
/* Now convert initrd to virtual addresses */
initrd_start = ( unsigned long ) __va ( phys_initrd_start ) ;
initrd_end = initrd_start + phys_initrd_size ;
initrd_below_start_ok = 1 ;
return ;
disable :
pr_cont ( " - disabling initrd \n " ) ;
initrd_start = 0 ;
initrd_end = 0 ;
}
2020-12-11 13:36:42 -08:00
void __weak __init free_initrd_mem ( unsigned long start , unsigned long end )
2019-05-13 17:18:34 -07:00
{
2019-09-28 11:02:26 +03:00
# ifdef CONFIG_ARCH_KEEP_MEMBLOCK
unsigned long aligned_start = ALIGN_DOWN ( start , PAGE_SIZE ) ;
unsigned long aligned_end = ALIGN ( end , PAGE_SIZE ) ;
2021-11-05 13:43:22 -07:00
memblock_free ( ( void * ) aligned_start , aligned_end - aligned_start ) ;
2019-09-28 11:02:26 +03:00
# endif
2019-05-13 17:18:37 -07:00
free_reserved_area ( ( void * ) start , ( void * ) end , POISON_FREE_INITMEM ,
" initrd " ) ;
2019-05-13 17:18:34 -07:00
}
2015-09-09 15:38:55 -07:00
# ifdef CONFIG_KEXEC_CORE
2020-05-09 17:50:03 -07:00
static bool __init kexec_free_initrd ( void )
2019-05-13 17:18:21 -07:00
{
2006-02-10 01:51:05 -08:00
unsigned long crashk_start = ( unsigned long ) __va ( crashk_res . start ) ;
unsigned long crashk_end = ( unsigned long ) __va ( crashk_res . end ) ;
/*
* If the initrd region is overlapped with crashkernel reserved region ,
* free only memory that is not part of crashkernel region .
*/
2019-05-13 17:18:21 -07:00
if ( initrd_start > = crashk_end | | initrd_end < = crashk_start )
return false ;
/*
* Initialize initrd memory region since the kexec boot does not do .
*/
memset ( ( void * ) initrd_start , 0 , initrd_end - initrd_start ) ;
if ( initrd_start < crashk_start )
free_initrd_mem ( initrd_start , crashk_start ) ;
if ( initrd_end > crashk_end )
free_initrd_mem ( crashk_end , initrd_end ) ;
return true ;
2005-09-13 01:25:12 -07:00
}
2019-05-13 17:18:21 -07:00
# else
static inline bool kexec_free_initrd ( void )
{
return false ;
}
# endif /* CONFIG_KEXEC_CORE */
2005-09-13 01:25:12 -07:00
2019-02-20 22:18:52 -08:00
# ifdef CONFIG_BLK_DEV_RAM
2019-06-28 12:07:03 -07:00
static void __init populate_initrd_image ( char * err )
2019-05-13 17:18:24 -07:00
{
ssize_t written ;
2020-07-14 08:56:19 +02:00
struct file * file ;
loff_t pos = 0 ;
2019-05-13 17:18:24 -07:00
unpack_to_rootfs ( __initramfs_start , __initramfs_size ) ;
printk ( KERN_INFO " rootfs image is not initramfs (%s); looks like an initrd \n " ,
err ) ;
2020-07-14 08:56:19 +02:00
file = filp_open ( " /initrd.image " , O_WRONLY | O_CREAT , 0700 ) ;
if ( IS_ERR ( file ) )
2019-05-13 17:18:24 -07:00
return ;
2020-07-14 08:56:19 +02:00
written = xwrite ( file , ( char * ) initrd_start , initrd_end - initrd_start ,
& pos ) ;
2019-05-13 17:18:24 -07:00
if ( written ! = initrd_end - initrd_start )
pr_err ( " /initrd.image: incomplete write (%zd != %ld) \n " ,
written , initrd_end - initrd_start ) ;
2020-07-14 08:56:19 +02:00
fput ( file ) ;
2019-05-13 17:18:24 -07:00
}
# endif /* CONFIG_BLK_DEV_RAM */
init/initramfs.c: do unpacking asynchronously
Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3.
These two patches are independent, but better-together.
The second is a rather trivial patch that simply allows the developer to
change "/sbin/modprobe" to something else - e.g. the empty string, so
that all request_module() during early boot return -ENOENT early, without
even spawning a usermode helper, needlessly synchronizing with the
initramfs unpacking.
The first patch delegates decompressing the initramfs to a worker thread,
allowing do_initcalls() in main.c to proceed to the device_ and late_
initcalls without waiting for that decompression (and populating of
rootfs) to finish. Obviously, some of those later calls may rely on the
initramfs being available, so I've added synchronization points in the
firmware loader and usermodehelper paths - there might be other places
that would need this, but so far no one has been able to think of any
places I have missed.
There's not much to win if most of the functionality needed during boot is
only available as modules. But systems with a custom-made .config and
initramfs can boot faster, partly due to utilizing more than one cpu
earlier, partly by avoiding known-futile modprobe calls (which would still
trigger synchronization with the initramfs unpacking, thus eliminating
most of the first benefit).
This patch (of 2):
Most of the boot process doesn't actually need anything from the
initramfs, until of course PID1 is to be executed. So instead of doing
the decompressing and populating of the initramfs synchronously in
populate_rootfs() itself, push that off to a worker thread.
This is primarily motivated by an embedded ppc target, where unpacking
even the rather modest sized initramfs takes 0.6 seconds, which is long
enough that the external watchdog becomes unhappy that it doesn't get
attention soon enough. By doing the initramfs decompression in a worker
thread, we get to do the device_initcalls and hence start petting the
watchdog much sooner.
Normal desktops might benefit as well. On my mostly stock Ubuntu kernel,
my initramfs is a 26M xz-compressed blob, decompressing to around 126M.
That takes almost two seconds:
[ 0.201454] Trying to unpack rootfs image as initramfs...
[ 1.976633] Freeing initrd memory: 29416K
Before this patch, these lines occur consecutively in dmesg. With this
patch, the timestamps on these two lines is roughly the same as above, but
with 172 lines inbetween - so more than one cpu has been kept busy doing
work that would otherwise only happen after the populate_rootfs()
finished.
Should one of the initcalls done after rootfs_initcall time (i.e., device_
and late_ initcalls) need something from the initramfs (say, a kernel
module or a firmware blob), it will simply wait for the initramfs
unpacking to be done before proceeding, which should in theory make this
completely safe.
But if some driver pokes around in the filesystem directly and not via one
of the official kernel interfaces (i.e. request_firmware*(),
call_usermodehelper*) that theory may not hold - also, I certainly might
have missed a spot when sprinkling wait_for_initramfs(). So there is an
escape hatch in the form of an initramfs_async= command line parameter.
Link: https://lkml.kernel.org/r/20210313212528.2956377-1-linux@rasmusvillemoes.dk
Link: https://lkml.kernel.org/r/20210313212528.2956377-2-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 18:05:42 -07:00
static void __init do_populate_rootfs ( void * unused , async_cookie_t cookie )
2005-04-16 15:20:36 -07:00
{
2017-05-04 21:15:56 +09:00
/* Load the built in initramfs */
initramfs: fix initramfs size calculation
The size of a built-in initramfs is calculated in init/initramfs.c by
"__initramfs_end - __initramfs_start". Those symbols are defined in the
linker script include/asm-generic/vmlinux.lds.h:
#define INIT_RAM_FS \
. = ALIGN(PAGE_SIZE); \
VMLINUX_SYMBOL(__initramfs_start) = .; \
*(.init.ramfs) \
VMLINUX_SYMBOL(__initramfs_end) = .;
If the initramfs file has an odd number of bytes, the "__initramfs_end"
symbol points to an odd address, for example, the symbols in the
System.map might look like:
0000000000572000 T __initramfs_start
00000000005bcd05 T __initramfs_end <-- odd address
At least on s390 this causes a problem:
Certain s390 instructions, especially instructions for loading addresses
(larl) or branch addresses must be on even addresses. The compiler loads
the symbol addresses with the "larl" instruction. This instruction sets
the last bit to 0 and, therefore, for odd size files, the calculated size
is one byte less than it should be:
0000000000540a9c <populate_rootfs>:
540a9c: eb cf f0 78 00 24 stmg %r12,%r15,120(%r15),
540aa2: c0 10 00 01 8a af larl %r1,572000 <__initramfs_start>
540aa8: c0 c0 00 03 e1 2e larl %r12,5bcd04 <initramfs_end>
(Instead of 5bcd05)
...
540abe: 1b c1 sr %r12,%r1
To fix the problem, this patch introduces the global variable
__initramfs_size, which is calculated in the "usr/initramfs_data.S" file.
The populate_rootfs() function can then use the start marker of the
.init.ramfs section and the value of __initramfs_size for loading the
initramfs. Because the start marker and size is sufficient, the
__initramfs_end symbol is no longer needed and is removed.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Michal Marek <mmarek@suse.cz>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-09-17 15:24:11 -07:00
char * err = unpack_to_rootfs ( __initramfs_start , __initramfs_size ) ;
2005-04-16 15:20:36 -07:00
if ( err )
2021-02-25 17:22:46 -08:00
panic_show_mem ( " %s " , err ) ; /* Failed to decompress INTERNAL initramfs */
2019-05-13 17:18:27 -07:00
if ( ! initrd_start | | IS_ENABLED ( CONFIG_INITRAMFS_FORCE ) )
goto done ;
if ( IS_ENABLED ( CONFIG_BLK_DEV_RAM ) )
2009-05-06 16:03:06 -07:00
printk ( KERN_INFO " Trying to unpack rootfs image as initramfs... \n " ) ;
2019-05-13 17:18:27 -07:00
else
printk ( KERN_INFO " Unpacking initramfs... \n " ) ;
2019-05-13 17:18:17 -07:00
2019-05-13 17:18:27 -07:00
err = unpack_to_rootfs ( ( char * ) initrd_start , initrd_end - initrd_start ) ;
if ( err ) {
2020-06-06 13:51:52 +02:00
# ifdef CONFIG_BLK_DEV_RAM
2019-05-13 17:18:24 -07:00
populate_initrd_image ( err ) ;
2020-06-06 13:51:52 +02:00
# else
printk ( KERN_EMERG " Initramfs unpacking failed: %s \n " , err ) ;
# endif
2005-04-16 15:20:36 -07:00
}
2019-05-13 17:18:21 -07:00
2019-05-13 17:18:27 -07:00
done :
2019-05-13 17:18:21 -07:00
/*
* If the initrd region is overlapped with crashkernel reserved region ,
* free only memory that is not part of crashkernel region .
*/
2019-05-17 14:31:47 -07:00
if ( ! do_retain_initrd & & initrd_start & & ! kexec_free_initrd ( ) )
2019-05-13 17:18:21 -07:00
free_initrd_mem ( initrd_start , initrd_end ) ;
initrd_start = 0 ;
initrd_end = 0 ;
2017-05-04 21:15:56 +09:00
flush_delayed_fput ( ) ;
init/initramfs.c: do unpacking asynchronously
Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3.
These two patches are independent, but better-together.
The second is a rather trivial patch that simply allows the developer to
change "/sbin/modprobe" to something else - e.g. the empty string, so
that all request_module() during early boot return -ENOENT early, without
even spawning a usermode helper, needlessly synchronizing with the
initramfs unpacking.
The first patch delegates decompressing the initramfs to a worker thread,
allowing do_initcalls() in main.c to proceed to the device_ and late_
initcalls without waiting for that decompression (and populating of
rootfs) to finish. Obviously, some of those later calls may rely on the
initramfs being available, so I've added synchronization points in the
firmware loader and usermodehelper paths - there might be other places
that would need this, but so far no one has been able to think of any
places I have missed.
There's not much to win if most of the functionality needed during boot is
only available as modules. But systems with a custom-made .config and
initramfs can boot faster, partly due to utilizing more than one cpu
earlier, partly by avoiding known-futile modprobe calls (which would still
trigger synchronization with the initramfs unpacking, thus eliminating
most of the first benefit).
This patch (of 2):
Most of the boot process doesn't actually need anything from the
initramfs, until of course PID1 is to be executed. So instead of doing
the decompressing and populating of the initramfs synchronously in
populate_rootfs() itself, push that off to a worker thread.
This is primarily motivated by an embedded ppc target, where unpacking
even the rather modest sized initramfs takes 0.6 seconds, which is long
enough that the external watchdog becomes unhappy that it doesn't get
attention soon enough. By doing the initramfs decompression in a worker
thread, we get to do the device_initcalls and hence start petting the
watchdog much sooner.
Normal desktops might benefit as well. On my mostly stock Ubuntu kernel,
my initramfs is a 26M xz-compressed blob, decompressing to around 126M.
That takes almost two seconds:
[ 0.201454] Trying to unpack rootfs image as initramfs...
[ 1.976633] Freeing initrd memory: 29416K
Before this patch, these lines occur consecutively in dmesg. With this
patch, the timestamps on these two lines is roughly the same as above, but
with 172 lines inbetween - so more than one cpu has been kept busy doing
work that would otherwise only happen after the populate_rootfs()
finished.
Should one of the initcalls done after rootfs_initcall time (i.e., device_
and late_ initcalls) need something from the initramfs (say, a kernel
module or a firmware blob), it will simply wait for the initramfs
unpacking to be done before proceeding, which should in theory make this
completely safe.
But if some driver pokes around in the filesystem directly and not via one
of the official kernel interfaces (i.e. request_firmware*(),
call_usermodehelper*) that theory may not hold - also, I certainly might
have missed a spot when sprinkling wait_for_initramfs(). So there is an
escape hatch in the form of an initramfs_async= command line parameter.
Link: https://lkml.kernel.org/r/20210313212528.2956377-1-linux@rasmusvillemoes.dk
Link: https://lkml.kernel.org/r/20210313212528.2956377-2-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 18:05:42 -07:00
}
static ASYNC_DOMAIN_EXCLUSIVE ( initramfs_domain ) ;
static async_cookie_t initramfs_cookie ;
void wait_for_initramfs ( void )
{
if ( ! initramfs_cookie ) {
/*
* Something before rootfs_initcall wants to access
* the filesystem / initramfs . Probably a bug . Make a
* note , avoid deadlocking the machine , and let the
* caller ' s access fail as it used to .
*/
pr_warn_once ( " wait_for_initramfs() called before rootfs_initcalls \n " ) ;
return ;
}
async_synchronize_cookie_domain ( initramfs_cookie + 1 , & initramfs_domain ) ;
}
EXPORT_SYMBOL_GPL ( wait_for_initramfs ) ;
static int __init populate_rootfs ( void )
{
initramfs_cookie = async_schedule_domain ( do_populate_rootfs , NULL ,
& initramfs_domain ) ;
init: move usermodehelper_enable() to populate_rootfs()
Currently, usermodehelper is enabled right before PID1 starts going
through the initcalls. However, any call of a usermodehelper from a
pure_, core_, postcore_, arch_, subsys_ or fs_ initcall is futile, as
there is no filesystem contents yet.
Up until commit e7cb072eb988 ("init/initramfs.c: do unpacking
asynchronously"), such calls, whether via some request_module(), a
legacy uevent "/sbin/hotplug" notification or something else, would
just fail silently with (presumably) -ENOENT from
kernel_execve(). However, that commit introduced the
wait_for_initramfs() synchronization hook which must be called from
the usermodehelper exec path right before the kernel_execve, in order
that request_module() et al done from *after* rootfs_initcall()
time (i.e. device_ and late_ initcalls) would continue to find a
populated initramfs as they used to.
Any call of wait_for_initramfs() done before the unpacking has been
scheduled (i.e. before rootfs_initcall time) must just return
immediately [and let the caller find an empty file system] in order
not to deadlock the machine. I mistakenly thought, and my limited
testing confirmed, that there were no such calls, so I added a
pr_warn_once() in wait_for_initramfs(). It turns out that one can
indeed hit request_module() as well as kobject_uevent_env() during
those early init calls, leading to a user-visible warning in the
kernel log emitted consistently for certain configurations.
We could just remove the pr_warn_once(), but I think it's better to
postpone enabling the usermodehelper framework until there is at least
some chance of finding the executable. That is also a little more
efficient in that a lot of work done in umh.c will be elided. However,
it does change the error seen by those early callers from -ENOENT to
-EBUSY, so there is a risk of a regression if any caller care about
the exact error value.
Link: https://lkml.kernel.org/r/20210728134638.329060-1-linux@rasmusvillemoes.dk
Fixes: e7cb072eb988 ("init/initramfs.c: do unpacking asynchronously")
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-07 20:00:03 -07:00
usermodehelper_enable ( ) ;
init/initramfs.c: do unpacking asynchronously
Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3.
These two patches are independent, but better-together.
The second is a rather trivial patch that simply allows the developer to
change "/sbin/modprobe" to something else - e.g. the empty string, so
that all request_module() during early boot return -ENOENT early, without
even spawning a usermode helper, needlessly synchronizing with the
initramfs unpacking.
The first patch delegates decompressing the initramfs to a worker thread,
allowing do_initcalls() in main.c to proceed to the device_ and late_
initcalls without waiting for that decompression (and populating of
rootfs) to finish. Obviously, some of those later calls may rely on the
initramfs being available, so I've added synchronization points in the
firmware loader and usermodehelper paths - there might be other places
that would need this, but so far no one has been able to think of any
places I have missed.
There's not much to win if most of the functionality needed during boot is
only available as modules. But systems with a custom-made .config and
initramfs can boot faster, partly due to utilizing more than one cpu
earlier, partly by avoiding known-futile modprobe calls (which would still
trigger synchronization with the initramfs unpacking, thus eliminating
most of the first benefit).
This patch (of 2):
Most of the boot process doesn't actually need anything from the
initramfs, until of course PID1 is to be executed. So instead of doing
the decompressing and populating of the initramfs synchronously in
populate_rootfs() itself, push that off to a worker thread.
This is primarily motivated by an embedded ppc target, where unpacking
even the rather modest sized initramfs takes 0.6 seconds, which is long
enough that the external watchdog becomes unhappy that it doesn't get
attention soon enough. By doing the initramfs decompression in a worker
thread, we get to do the device_initcalls and hence start petting the
watchdog much sooner.
Normal desktops might benefit as well. On my mostly stock Ubuntu kernel,
my initramfs is a 26M xz-compressed blob, decompressing to around 126M.
That takes almost two seconds:
[ 0.201454] Trying to unpack rootfs image as initramfs...
[ 1.976633] Freeing initrd memory: 29416K
Before this patch, these lines occur consecutively in dmesg. With this
patch, the timestamps on these two lines is roughly the same as above, but
with 172 lines inbetween - so more than one cpu has been kept busy doing
work that would otherwise only happen after the populate_rootfs()
finished.
Should one of the initcalls done after rootfs_initcall time (i.e., device_
and late_ initcalls) need something from the initramfs (say, a kernel
module or a firmware blob), it will simply wait for the initramfs
unpacking to be done before proceeding, which should in theory make this
completely safe.
But if some driver pokes around in the filesystem directly and not via one
of the official kernel interfaces (i.e. request_firmware*(),
call_usermodehelper*) that theory may not hold - also, I certainly might
have missed a spot when sprinkling wait_for_initramfs(). So there is an
escape hatch in the form of an initramfs_async= command line parameter.
Link: https://lkml.kernel.org/r/20210313212528.2956377-1-linux@rasmusvillemoes.dk
Link: https://lkml.kernel.org/r/20210313212528.2956377-2-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 18:05:42 -07:00
if ( ! initramfs_async )
wait_for_initramfs ( ) ;
2006-12-11 12:12:04 -08:00
return 0 ;
2005-04-16 15:20:36 -07:00
}
2006-12-11 12:12:04 -08:00
rootfs_initcall ( populate_rootfs ) ;