1
1
mirror of https://github.com/systemd/systemd-stable.git synced 2024-12-23 17:34:00 +03:00
Commit Graph

41050 Commits

Author SHA1 Message Date
Eric DeVolder
9b4abc69b2 pstore: Tool to archive contents of pstore
This patch introduces the systemd pstore service which will archive the
contents of the Linux persistent storage filesystem, pstore, to other storage,
thus preserving the existing information contained in the pstore, and clearing
pstore storage for future error events.

Linux provides a persistent storage file system, pstore[1], that can store
error records when the kernel dies (or reboots or powers-off). These records in
turn can be referenced to debug kernel problems (currently the kernel stuffs
the tail of the dmesg, which also contains a stack backtrace, into pstore).

The pstore file system supports a variety of backends that map onto persistent
storage, such as the ACPI ERST[2, Section 18.5 Error Serialization] and UEFI
variables[3 Appendix N Common Platform Error Record]. The pstore backends
typically offer a relatively small amount of persistent storage, e.g. 64KiB,
which can quickly fill up and thus prevent subsequent kernel crashes from
recording errors. Thus there is a need to monitor and extract the pstore
contents so that future kernel problems can also record information in the
pstore.

The pstore service is independent of the kdump service. In cloud environments
specifically, host and guest filesystems are on remote filesystems (eg. iSCSI
or NFS), thus kdump relies [implicitly and/or explicitly] upon proper operation
of networking software *and* hardware *and* infrastructure.  Thus it may not be
possible to capture a kernel coredump to a file since writes over the network
may not be possible.

The pstore backend, on the other hand, is completely local and provides a path
to store error records which will survive a reboot and aid in post-mortem
debugging.

Usage Notes:
This tool moves files from /sys/fs/pstore into /var/lib/systemd/pstore.

To enable kernel recording of error records into pstore, one must either pass
crash_kexec_post_notifiers[4] to the kernel command line or enable via 'echo Y
 > /sys/module/kernel/parameters/crash_kexec_post_notifiers'. This option
invokes the recording of errors into pstore *before* an attempt to kexec/kdump
on a kernel crash.

Optionally, to record reboots and shutdowns in the pstore, one can either pass
the printk.always_kmsg_dump[4] to the kernel command line or enable via 'echo Y >
/sys/module/printk/parameters/always_kmsg_dump'. This option enables code on the
shutdown path to record information via pstore.

This pstore service is a oneshot service. When run, the service invokes
systemd-pstore which is a tool that performs the following:
 - reads the pstore.conf configuration file
 - collects the lists of files in the pstore (eg. /sys/fs/pstore)
 - for certain file types (eg. dmesg) a handler is invoked
 - for all other files, the file is moved from pstore

 - In the case of dmesg handler, final processing occurs as such:
   - files processed in reverse lexigraphical order to faciliate
     reconstruction of original dmesg
   - the filename is examined to determine which dmesg it is a part
   - the file is appended to the reconstructed dmesg

For example, the following pstore contents:

 root@vm356:~# ls -al /sys/fs/pstore
 total 0
 drwxr-x--- 2 root root    0 May  9 09:50 .
 drwxr-xr-x 7 root root    0 May  9 09:50 ..
 -r--r--r-- 1 root root 1610 May  9 09:49 dmesg-efi-155741337601001
 -r--r--r-- 1 root root 1778 May  9 09:49 dmesg-efi-155741337602001
 -r--r--r-- 1 root root 1726 May  9 09:49 dmesg-efi-155741337603001
 -r--r--r-- 1 root root 1746 May  9 09:49 dmesg-efi-155741337604001
 -r--r--r-- 1 root root 1686 May  9 09:49 dmesg-efi-155741337605001
 -r--r--r-- 1 root root 1690 May  9 09:49 dmesg-efi-155741337606001
 -r--r--r-- 1 root root 1775 May  9 09:49 dmesg-efi-155741337607001
 -r--r--r-- 1 root root 1811 May  9 09:49 dmesg-efi-155741337608001
 -r--r--r-- 1 root root 1817 May  9 09:49 dmesg-efi-155741337609001
 -r--r--r-- 1 root root 1795 May  9 09:49 dmesg-efi-155741337710001
 -r--r--r-- 1 root root 1770 May  9 09:49 dmesg-efi-155741337711001
 -r--r--r-- 1 root root 1796 May  9 09:49 dmesg-efi-155741337712001
 -r--r--r-- 1 root root 1787 May  9 09:49 dmesg-efi-155741337713001
 -r--r--r-- 1 root root 1808 May  9 09:49 dmesg-efi-155741337714001
 -r--r--r-- 1 root root 1754 May  9 09:49 dmesg-efi-155741337715001

results in the following:

 root@vm356:~# ls -al /var/lib/systemd/pstore/155741337/
 total 92
 drwxr-xr-x 2 root root  4096 May  9 09:50 .
 drwxr-xr-x 4 root root    40 May  9 09:50 ..
 -rw-r--r-- 1 root root  1610 May  9 09:50 dmesg-efi-155741337601001
 -rw-r--r-- 1 root root  1778 May  9 09:50 dmesg-efi-155741337602001
 -rw-r--r-- 1 root root  1726 May  9 09:50 dmesg-efi-155741337603001
 -rw-r--r-- 1 root root  1746 May  9 09:50 dmesg-efi-155741337604001
 -rw-r--r-- 1 root root  1686 May  9 09:50 dmesg-efi-155741337605001
 -rw-r--r-- 1 root root  1690 May  9 09:50 dmesg-efi-155741337606001
 -rw-r--r-- 1 root root  1775 May  9 09:50 dmesg-efi-155741337607001
 -rw-r--r-- 1 root root  1811 May  9 09:50 dmesg-efi-155741337608001
 -rw-r--r-- 1 root root  1817 May  9 09:50 dmesg-efi-155741337609001
 -rw-r--r-- 1 root root  1795 May  9 09:50 dmesg-efi-155741337710001
 -rw-r--r-- 1 root root  1770 May  9 09:50 dmesg-efi-155741337711001
 -rw-r--r-- 1 root root  1796 May  9 09:50 dmesg-efi-155741337712001
 -rw-r--r-- 1 root root  1787 May  9 09:50 dmesg-efi-155741337713001
 -rw-r--r-- 1 root root  1808 May  9 09:50 dmesg-efi-155741337714001
 -rw-r--r-- 1 root root  1754 May  9 09:50 dmesg-efi-155741337715001
 -rw-r--r-- 1 root root 26754 May  9 09:50 dmesg.txt

where dmesg.txt is reconstructed from the group of related
dmesg-efi-155741337* files.

Configuration file:
The pstore.conf configuration file has four settings, described below.
 - Storage : one of "none", "external", or "journal". With "none", this
   tool leaves the contents of pstore untouched. With "external", the
   contents of the pstore are moved into the /var/lib/systemd/pstore,
   as well as logged into the journal.  With "journal", the contents of
   the pstore are recorded only in the systemd journal. The default is
   "external".
 - Unlink : is a boolean. When "true", the default, then files in the
   pstore are removed once processed. When "false", processing of the
   pstore occurs normally, but the pstore files remain.

References:
[1] "Persistent storage for a kernel's dying breath",
    March 23, 2011.
    https://lwn.net/Articles/434821/

[2] "Advanced Configuration and Power Interface Specification",
    version 6.2, May 2017.
    https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf

[3] "Unified Extensible Firmware Interface Specification",
    version 2.8, March 2019.
    https://uefi.org/sites/default/files/resources/UEFI_Spec_2_8_final.pdf

[4] "The kernel’s command-line parameters",
    https://static.lwn.net/kerneldoc/admin-guide/kernel-parameters.html
2019-07-19 21:46:07 +02:00
Zbigniew Jędrzejewski-Szmek
f7e7bb6546 Merge pull request #13070 from yuwata/network-set-route-to-dhcp-dns 2019-07-19 09:35:22 +02:00
Zbigniew Jędrzejewski-Szmek
217b7b33cc pid1: order jobs that execute processes with lower priority
We can meaningfully compare jobs for units which have cpu weight or nice set.
But non-exec units those have those set.

Starting non-exec jobs first allows us to get them out of the queue quickly,
and consider more jobs for starting.

If we have service A, and socket B, and service C which is after socket B,
and we want to start both A and C, and C has higher cpu weight, if we get
B out of the way first, we'll know that we can start both A and C, and we'll
start C first.

Also invert the comparisons using CMP() so they are always done left vs. right,
and negate when returning instead.

Follow-up for da8e178296.
2019-07-19 14:38:52 +09:00
Dan Streetman
65dd488fe1 test: convert all uses of '|| true' into '|| :'
No change in functionality; just use the shorter || :
2019-07-19 13:47:21 +09:00
Yu Watanabe
0161f0ca36
Merge pull request #13100 from 1848/neigh_ipv6
networkd: Neighbor IPv6 support for LinkLayerAddress
2019-07-19 09:48:49 +09:00
Anita Zhang
27e64442f8 docs: typo in arg name replace-irreversible -> replace-irreversibly 2019-07-19 07:17:40 +09:00
Yu Watanabe
fb2ba3305b test-network: add test for neighbor with ipv6 lladdr 2019-07-19 07:14:58 +09:00
Yu Watanabe
1647f24100 sd-netlink: update comment 2019-07-19 07:14:58 +09:00
1848
f9ab224eb8 network: Added neighbor lladdr support for IPv6 2019-07-19 07:14:58 +09:00
Zbigniew Jędrzejewski-Szmek
34d2f9204c meson: update hint in man/rules/ 2019-07-19 07:09:34 +09:00
Luca Boccassi
a637d0f9ec core: set shutdown watchdog on kexec too
At the moment the shutdown watchdog is set only when rebooting.
The set of "things that can go wrong" is not too far off when kexec'ing
and in fact we have a use case where it would be useful - moving to a
new kernel image.
2019-07-18 22:31:43 +02:00
Yu Watanabe
195a18c17d test-network: add tests for routes to DNS servers provided by DHCPv4 2019-07-19 01:56:14 +09:00
Yu Watanabe
a24e12f020 network: add DHCPv4.RoutesToDNS= setting 2019-07-19 01:49:39 +09:00
Yu Watanabe
854a1ccfc2 network: set routes to dns servers provided by DHCPv4 2019-07-19 01:44:44 +09:00
Yu Watanabe
d4c52ee5b5 network: store routes provided by DHCPv4 in Set
This re-writes d03073ddcd.
2019-07-19 01:44:44 +09:00
Yu Watanabe
01aaa3df16 network: introduce route_full_hash_ops
Will be used later.
2019-07-19 01:44:44 +09:00
Zbigniew Jędrzejewski-Szmek
deeabb45ae
Merge pull request #13097 from poettering/mount-state-fix
Scan /proc/self/mountinfo before waitid() handling
2019-07-18 17:33:20 +02:00
Zbigniew Jędrzejewski-Szmek
f4c961169c
Merge pull request #13102 from mbiebl/nologin-path
meson: make nologin path build time configurable
2019-07-18 17:17:23 +02:00
Lennart Poettering
9ddaa3e459 mount: rename update_parameters_proc_self_mount_info() → update_parameters_proc_self_mountinfo()
let's name the call like the file in /proc is actually called.
2019-07-18 17:03:11 +02:00
Lennart Poettering
bcce581d65 swap: scan /proc/swaps before processing waitid() results
Similar to the previous commit, but for /proc/swaps, where the same
logic and rationale applies.
2019-07-18 17:03:11 +02:00
Lennart Poettering
350804867d mount: rescan /proc/self/mountinfo before processing waitid() results
(The interesting bits about the what and why are in a comment in the
patch, please have a look there instead of looking here in the commit
msg).

Fixes: #10872
2019-07-18 17:03:11 +02:00
Lennart Poettering
fcd8e119c2 mount: simplify /proc/self/mountinfo handler
Our IO handler is only installed for one fd, hence there's no reason to
conditionalize on it again.

Also, split out the draining into a helper function of its own.
2019-07-18 17:03:10 +02:00
Lennart Poettering
a5ac2021da
Merge pull request #12639 from michaelolbrich/job-order
make the run queue order deterministic
2019-07-18 16:53:32 +02:00
Zbigniew Jędrzejewski-Szmek
4f0acdb366 man: add note about systemctl stop return value
Fixes #13104.

(I know a lot more could be added to that  man page. This patch only addresses that
once specific complaint.)
2019-07-18 16:20:38 +02:00
Lennart Poettering
ffc1c11938
Merge pull request #13107 from keszybz/lvalue-rvalue
Better error messages for syntax errors
2019-07-18 16:12:20 +02:00
Michael Biebl
b333c4d101 test: replace Makefile copy with a symlink for TEST-28-PERCENTJ-WANTEDBY
TEST-28-PERCENTJ-WANTEDBY/Makefile is identical to
TEST-01-BASIC/Makefile so avoid duplication and use a symlink instead.
2019-07-18 12:49:41 +02:00
Michael Biebl
6db904625d meson: make nologin path build time configurable
Some distros install nologin as /usr/sbin/nologin, others as
/sbin/nologin.
Since we can't really on merged-usr everywhere (where the path wouldn't
matter), make the path build time configurable via -Dnologin-path=.

Closes #13028
2019-07-18 12:46:35 +02:00
Zbigniew Jędrzejewski-Szmek
28f30f4051 shared/conf-parser: say "key name" not "lvalue", add dot
"lvalue" is our internal jargon. Let's try not to confuse non-programmers.
2019-07-18 11:39:40 +02:00
Zbigniew Jędrzejewski-Szmek
8be8ed8ce1 shared/conf-parser: emit a nicer warning for something like "======"
Urlich Windl wrote on the mailing list:
> I noticed that a line of "=======" in "[Service]" cases the message " Unknown lvalue '' in section 'Service'".

This now becomes:
/etc/systemd/system/eqeqeqeq.service:3: Missing key name before '=', ignoring line.
2019-07-18 11:39:38 +02:00
Zbigniew Jędrzejewski-Szmek
2d4fffb00b shared/conf-parser: be nice and ignore lines without "="
We generally don't treat syntax error as fatal, but in this case we would
completely refuse to load the file. I think we should treat the the same
as assignment outside of a section, or an unknown key name.
2019-07-18 11:39:25 +02:00
Michael Olbrich
da8e178296 job: make the run queue order deterministic
Jobs are added to the run queue in random order. This happens because most
jobs are added by iterating over the transaction or dependency hash maps.

As a result, jobs that can be executed at the same time are started in a
different order each time.
On small embedded devices this can cause a measurable jitter for the point
in time when a job starts (~100ms jitter for 10 units that are started in
random order).
This results is a similar jitter for the boot time. This is undesirable in
general and make optimizing the boot time a lot harder.
Also, jobs that should have a higher priority because the unit has a higher
CPU weight might get executed later than others.

Fix this by turning the job run_queue into a Prioq and sort by the
following criteria (use the next if the values are equal):
- CPU weight
- nice level
- unit type
- unit name

The last one is just there for deterministic sorting to avoid any jitter.
2019-07-18 10:28:39 +02:00
Michael Olbrich
fcfc7e1137 basic: reorder UnitType enum
The enum order will be used to order jobs in the job queue.
Make sure that unit types that fork aditional processes come first to
maximize parallelism.
2019-07-18 09:54:03 +02:00
Zbigniew Jędrzejewski-Szmek
31a83062fb
Merge pull request #13103 from anitazha/conditiondocs
NEWS and catalog update for ExecCondition=
2019-07-18 08:06:37 +02:00
Anita Zhang
09c73ee7fe catalog: reference ExecCondition= in unit skipped str 2019-07-17 22:43:05 -07:00
Anita Zhang
a4d5848aa2 NEWS: bullet point for ExecCondition= 2019-07-17 22:27:57 -07:00
Lennart Poettering
d611cfa748 core: never propagate reload failure to service result
Fixes: #11238
2019-07-18 10:14:02 +09:00
Lennart Poettering
ea582a0f1b
Merge pull request #13047 from niedbalski/fix-5552-pr
resolved: add new option to only cache positive answers
2019-07-17 19:27:16 +02:00
Lennart Poettering
5eeb19c600
Merge pull request #13086 from yuwata/network-dhcp6-cleanups
network: dhcp6 cleanups
2019-07-17 19:26:46 +02:00
Frantisek Sumsal
c087dc0c35
Merge pull request #13093 from keszybz/two-assert-cc-cleanups
Two assert_cc cleanups
2019-07-17 15:53:35 +00:00
Jorge Niedbalski
37d7a7d984 resolved: switch cache option to a tri-state option (systemd#5552).
Change the resolved.conf Cache option to a tri-state "no, no-negative, yes" values.

If a lookup returns SERVFAIL systemd-resolved will cache the result for 30s (See 201d995),
however, there are several use cases on which this condition is not acceptable (See systemd#5552 comments)
and the only workaround would be to disable cache entirely or flush it , which isn't optimal.

This change adds the 'no-negative' option when set it avoids putting in cache
negative answers but still works the same heuristics for positive answers.

Signed-off-by: Jorge Niedbalski <jnr@metaklass.org>
2019-07-17 10:42:53 -04:00
Yu Watanabe
6787917dfa network: update state file after dhcp6 events
E.g. DNS servers may be received from DHCPv6 server. If the link is
already in configured state, the DNS servers are not written in the
state file.
2019-07-17 23:15:15 +09:00
Yu Watanabe
693283cd58 Revert "test-network: extend sleep time"
This reverts commit 7d7bb5c861.

Still the CIs are flaky and the commit just slow down them.
2019-07-17 23:13:40 +09:00
Yu Watanabe
9fdae8d5b2 man: fix wrong section name 2019-07-17 23:13:40 +09:00
Yu Watanabe
26a65470ba network: fix use after free()
The hashmap will be accessed by client_stop().
2019-07-17 23:13:40 +09:00
Yu Watanabe
2eff7cc59c network: drop unnecessary line breaks 2019-07-17 23:13:40 +09:00
Yu Watanabe
8107f4731e network: drop fallback mechanism to assign DHCPv6 addresses with IFA_F_NOPREFIXROUTE
The flag IFA_F_NOPREFIXROUTE was introduced in kernel-3.14. But even if
the kernel does not support the flag, it should be just ignored. So, it
is not necessary to do the fallback logic. Moreover, the current logic
is not a fallback mechanism but just retrying. So, it should not work.
Let's drop that.
2019-07-17 23:13:40 +09:00
Lennart Poettering
81c07a9555
Merge pull request #13080 from keszybz/firstboot-fixes
Firstboot fixes
2019-07-17 14:43:15 +02:00
Dan Streetman
2a2aeed460 test/TEST-16: don't copy systemd-notify or lib from $BUILD_DIR
On Ubuntu CI, these don't exist because it tests installed
binaries, not just-built binaries.
2019-07-17 14:25:27 +02:00
Zbigniew Jędrzejewski-Szmek
d268ab389c Rewrite IN_SET()
This restores proper speed with asan builds with gcc 9.1.1.
Fixes #12997.

$ rpm -q gcc
gcc-9.1.1-2.fc31.x86_64

$ time ASAN_OPTIONS=strict_string_checks=1:detect_stack_use_after_return=1:check_initialization_order=1:strict_init_order=1 UBSAN_OPTIONS=print_stacktrace=1:print_summary=1:halt_on_error=1 build-rawhide-sanitize/test-conf-parser

(old) 86.99s user 20.22s system 361% cpu 29.635 total
(new)  3.05s user  0.29s system  99% cpu  3.377 total

Size is increased a bit:

$ size build/systemd
(old) 1683421	 246100	   1208	1930729	 1d75e9	build/systemd
(new) 1688237	 246100	   1208	1935545	 1d88b9	build/systemd

... but that's <0.1%, so we don't really care.
2019-07-17 14:22:53 +02:00
Lennart Poettering
76c887fdaa
Merge pull request #13092 from keszybz/coverity-fixes
Coverity fixes
2019-07-17 14:18:49 +02:00