IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Depending on the location of the original build dir, either ProtectHome=
or ProtectSystem= may get in the way when creating the gcov metadata
files.
Follow-up to:
* 02d7e73013
* 6c9efba677
Otherwise we miss quite a lot of coverage (mainly from logind,
hostnamed, networkd, and possibly others), since they can't write their
reports with `ProtectSystem=strict`.
If -Db_coverage=true is used at build time, then ARTIFACT_DIRECTORY/TEST-XX-FOO.coverage-info
files are created with code coverage data, and run-integration-test.sh also
merges them into ARTIFACT_DIRECTORY/merged.coverage-info since the coveralls.io
helpers accept only a single file.
Compared to PID1 where systemd-oomd has to be the client to PID1
because PID1 is a more privileged process than systemd-oomd, systemd-oomd
is the more privileged process compared to a user manager so we have
user managers be the client whereas systemd-oomd is now the server.
The same varlink protocol is used between user managers and systemd-oomd
to deliver ManagedOOM property updates. systemd-oomd now sets up a varlink
server that user managers connect to to send ManagedOOM property updates.
We also add extra validation to make sure that non-root senders don't
send updates for cgroups they don't own.
The integration test was extended to repeat the chill/bloat test using
a user manager instead of PID1.
The `dracut_install` is a misnomer, since the systemd integration test
suite is based on the original dracut's test suite, and not all the
references to dracut has been edited out. Let's fix that.
otherwise we might mark tests where something crashes during shutdown as
successful, as happened in one of the recent TEST-01-BASIC runs:
```
testsuite-01.service: About to execute rm -f /failed /testok
testsuite-01.service: Forked rm as 606
testsuite-01.service: Executing: rm -f /failed /testoktestsuite-01.service: Changed dead -> start-pre
Starting TEST-01-BASIC...
...
Child 606 (rm) died (code=exited, status=0/SUCCESS)
testsuite-01.service: Child 606 belongs to testsuite-01.service.
testsuite-01.service: Control process exited, code=exited, status=0/SUCCESS (success)
testsuite-01.service: Got final SIGCHLD for state start-pre.
testsuite-01.service: Passing 0 fds to service
testsuite-01.service: About to execute sh -e -x -c "systemctl --state=failed --no-legend --no-pager >/failed ; systemctl daemon-reload ; echo OK >/testok"
testsuite-01.service: Forked sh as 607
testsuite-01.service: Changed start-pre -> start
testsuite-01.service: Executing: sh -e -x -c "systemctl --state=failed --no-legend --no-pager >/failed ; systemctl daemon-reload ; echo OK >/testok"systemd-journald.service: Got notification message from PID 560 (FDSTORE=1)S
...
testsuite-01.service: Child 607 belongs to testsuite-01.service.
testsuite-01.service: Main process exited, code=exited, status=0/SUCCESS (success)
testsuite-01.service: Deactivated successfully.
testsuite-01.service: Service will not restart (restart setting)
testsuite-01.service: Changed start -> dead
testsuite-01.service: Job 207 testsuite-01.service/start finished, result=done
[ OK ] Finished TEST-01-BASIC.
...
end.service: About to execute /bin/sh -x -c "systemctl poweroff --no-block"
end.service: Forked /bin/sh as 623end.service: Executing: /bin/sh -x -c "systemctl poweroff --no-block"
...
end.service: Job 213 end.service/start finished, result=canceled
Caught <SEGV>, dumped core as pid 624.
Freezing execution.
CentOS Linux 8
Kernel 4.18.0-305.12.1.el8_4.x86_64 on an x86_64 (ttyS0)
H login: qemu-kvm: terminating on signal 15 from pid 80134 (timeout)
E: Test timed out after 600s
Spawning getter /root/systemd/build/journalctl -o export -D /var/tmp/systemd-test.0UYjAS/root/var/log/journal/ca6031c2491543fe8286c748258df8d1...
Finishing after writing 15125 entries
Spawning getter /root/systemd/build/journalctl -o export -D /var/tmp/systemd-test.0UYjAS/root/var/log/journal/remote...
Finishing after writing 0 entries
-rw-r-----. 1 root root 25165824 Aug 20 12:26 /var/tmp/systemd-test.0UYjAS/system.journal
TEST-01-BASIC RUN: Basic systemd setup [OK]
...
NO_BUILD=1 indicates that we want to test systemd from the local system and not
the one from the local build. Hence there should be no need to call
find-build-dir.sh when NO_BUID=1 especially since it's likely that the script
will fail to find a local build in this case.
This avoids find-build-dir.sh to emit 'Specify build directory with $BUILD_DIR'
message when NO_BUILD=1 and no local build can be found.
This introduces a behavior change though: systemd from the local system will
always be preferred when NO_BUILD=1 even if a local build can be found.
In some cases image names are unpredictable - some orchestrators/deployment
tools like to mangle names to suit their internal formats. In these cases,
the requirement that the extension-release file matches exactly the image
name where it's contained cannot work.
Allow falling back to loading the first regular file which name starts with
'extension-release' located in /usr/lib/extension-release.d/ and tagged with
a user.extension-release.strict extended attribute with a true value, if the
one with the expected name cannot be found.
Skip a harmless error when running the tests on a system with a significantly
older systemd version (ldd tries to resolve the unprefixed RPATH for libsystemd.so.0,
which is in this case older than the already installed libsystemd.so.0 in $initdir).
The issue is triggered by installing test dependencies in install_missing_libraries().
Spotted on CentOS 8.
```
$ ldd /var/tmp/systemd-test.nZO11F/root/lib/systemd/tests/test-sd-device-thread
/var/tmp/systemd-test.nZO11F/root/lib/systemd/tests/test-sd-device-thread: /lib64/libsystemd.so.0: version `LIBSYSTEMD_240' not found (required by /var/tmp/systemd-test.nZO11F/root/lib/systemd/tests/test-sd-device-thread)
linux-vdso64.so.1 (0x00007fffb79d0000)
libclang_rt.asan-powerpc64le.so => /usr/lib64/clang/11.0.0/lib/linux/libclang_rt.asan-powerpc64le.so (0x00007fffb6ef0000)
libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007fffb6d20000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fffb6cd0000)
libc.so.6 => /lib64/libc.so.6 (0x00007fffb6ab0000)
$ LD_LIBRARY_PATH=/var/tmp/systemd-test.nZO11F/root/lib64/ ldd /var/tmp/systemd-test.nZO11F/root/lib/systemd/tests/test-sd-device-thread
linux-vdso64.so.1 (0x00007fffaba80000)
libclang_rt.asan-powerpc64le.so => /usr/lib64/clang/11.0.0/lib/linux/libclang_rt.asan-powerpc64le.so (0x00007fffaafa0000)
libsystemd.so.0 => /var/tmp/systemd-test.nZO11F/root/lib64/libsystemd.so.0 (0x00007fffaa5f0000)
libpthread.so.0 => /var/tmp/systemd-test.nZO11F/root/lib64/libpthread.so.0 (0x00007fffaa5a0000)
libc.so.6 => /var/tmp/systemd-test.nZO11F/root/lib64/libc.so.6 (0x00007fffaa380000)
```
When `linux-headers` is installed on Arch Linux, it stores the module
source tree in the kernel module directory, which is then picked up by
`find` and we get a lot of harmless but annoying errors:
```
...
modprobe: FATAL: Module Kconfig.iosched not found in directory /lib/modules/5.13.7-arch1-1
modprobe: FATAL: Module Kconfig not found in directory /lib/modules/5.13.7-arch1-1
modprobe: FATAL: Module Kconfig not found in directory /lib/modules/5.13.7-arch1-1
modprobe: FATAL: Module dm-mpath.h not found in directory /lib/modules/5.13.7-arch1-1
modprobe: FATAL: Module dm-bio-prison-v2.h not found in directory /lib/modules/5.13.7-arch1-1
modprobe: FATAL: Module raid0.h not found in directory /lib/modules/5.13.7-arch1-1
...
```
Let's fix this by trying to install only kernel modules (*.ko files with
an optional compression).
Let's unify handling of the boolean values throughout the test-functions
code, since we use 0/1, true/false, and yes/no almost randomly in many
places, so picking the right values during CI configuration can be a real
pain.
Saving the journal for passing tests creates a huge amount of unneeded
data stored for each full test run. Add a env var to allow saving the
journal only for failed tests.
Due to a little misunderstanding the last patch doesn't work as
expected, since test_create_image() is called only for the first image
(usually TEST-01-BASIC), and all subsequent images are then (possibly)
modified with test_append_files().
Follow-up to 179ca4d2b1.
It turns out the "supporting services" were run in _all_ tests if
TEST-01-BASIC was run as the first test (which is usually the case),
since with the original condition in test_create_image() we would skip
the masking and then propagate the change to the default image used by
other tests. This has been causing multiple bogus test timeouts
(especially when the hwdb was being rebuilt in tests with short
timeouts, like TEST-52-HONORFIRSTSHUTDOWN).
Let's "fix" this by making the call to mask_supporting_services()
uncoditional and override the test_create_image() function in
TEST-01-BASIC to avoid the masking in this single case.
The three-argument match() is a GNU AWK extension, thus breaking the
compatibility with mawk (used on Ubuntu/Debian, for example). Let's
replace it with a (hopefully) more portable sed expression to drop the
inadvertently introduced gawk dependency.
Fixes: #19957
This fixes an issue introduced by the commit 954c77c251.
For some reasons, setting default ACL on $TESTDIR makes TEST-29-PORTABLE
fail. Let's drop the default ACL, and set ACL on saved results instead.
Fixes#19519.
"systemd-testsuite" gets in the way when grepping for "testsuite-*.sh".
Also, the name doesn't matter for anything, so let's just use something
very short to save space.
When editing this function in 7bf20e48bd, I couldn't
decide whether to initialize ret at the top and only reset it on success, or
whether to assign a value in each branch. In the end I did neither ;( So if the
test finished without creating any of the result files, we would echo a
message, but return "success".
But there was bigger confusion with /failed: some tests create it empty, some
don't. I think we may want to do away pre-creation of /failed completely, and
assume the test failed unless /testok is found. But I'm leaving that for later
rework. For now let's just make sure we report return success only if /testok
or /skipped is found.
Basically the same scenario as in
a33e2692e1, where `awk` exits as soon
as it finds a match, thus sending SIGPIPE to `ldd` if it's not fast
enough. That, in combination with `set -o pipefail` causes random &
unexpected fails, like:
```
No journal files were found.
-rw-r----- 1 root root 16777216 Apr 30 10:31
/var/tmp/TEST-01-BASIC_sanitizers-nspawn/system.journal
TEST-01-BASIC RUN: Basic systemd setup [OK]
systemd is not linked against the ASan DSO
gcc does this by default, for clang compile with -shared-libasan
make: *** [Makefile:2: clean-again] Error 1
make: Leaving directory '/build/test/TEST-01-BASIC'
```
Specifying the test number manually is tedious and prone to errors (as
recently proven). Since we have all the necessary data to work out the
test number, let's do it automagically.
We have to invoke the tests as superuser, and not being able to read
the journal as the invoking user is annoying. I don't think there are
any security considerations here, since the invoking user can already
put arbitrary code in the Makefile and test scripts which get executed
with root privileges.
The logic to query test state was rather complex. I don't quite grok the point
of ret=$((ret+1))… But afaics, the precise result was always ignored by the
caller anyway.
This code was partially broken, since the firmware directory was
undefined. Also, some of the parts were a dead code, since they relied
on code from the original dracut test suite.
`command -v <bin> | grep ...` can under certain conditions cause the
`command` to exit with SIGPIPE, which in combination with `set -o
pipefail` means that the tests sometimes randomly die during setup.
Let's avoid using pipes in such cases.
This breaks some existing loops which previously ignored if the piped
program exited with EC >0. Rewrite them to mitigate this (and also make
them more robust in some cases).
When trying to calculate the next firing of 'Sun *-*-* 01:00:00', we'd fall
into an infinite loop, because mktime() moves us "backwards":
Before this patch:
tm_within_bounds: good=0 2021-03-29 01:00:00 → 2021-03-29 00:00:00
tm_within_bounds: good=0 2021-03-29 01:00:00 → 2021-03-29 00:00:00
tm_within_bounds: good=0 2021-03-29 01:00:00 → 2021-03-29 00:00:00
...
We rely on mktime() normalizing the time. The man page does not say that it'll
move the time forward, but our algorithm relies on this. So let's catch this
case explicitly.
With this patch:
$ TZ=Europe/Dublin faketime 2021-03-21 build/systemd-analyze calendar --iterations=5 'Sun *-*-* 01:00:00'
Normalized form: Sun *-*-* 01:00:00
Next elapse: Sun 2021-03-21 01:00:00 GMT
(in UTC): Sun 2021-03-21 01:00:00 UTC
From now: 59min left
Iter. #2: Sun 2021-04-04 01:00:00 IST
(in UTC): Sun 2021-04-04 00:00:00 UTC
From now: 1 weeks 6 days left <---- note the 2 week jump here
Iter. #3: Sun 2021-04-11 01:00:00 IST
(in UTC): Sun 2021-04-11 00:00:00 UTC
From now: 2 weeks 6 days left
Iter. #4: Sun 2021-04-18 01:00:00 IST
(in UTC): Sun 2021-04-18 00:00:00 UTC
From now: 3 weeks 6 days left
Iter. #5: Sun 2021-04-25 01:00:00 IST
(in UTC): Sun 2021-04-25 00:00:00 UTC
From now: 1 months 4 days left
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1941335.
otherwise udev complains about the file being world-writable:
systemd-udevd[228]: Configuration file /etc/udev/rules.d/00-set-LD_PRELOAD.rules is marked world-writable. Please remove world writability permission bits. Proceeding anyway.
Fixes: systemd/systemd-centos-ci#354
This test would normally get stuck when trying to mount the verity image
due to:
systemd-udevd[299]: dm-0: '/usr/sbin/dmsetup udevflags 6293812'(err) '==371==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.'
systemd-udevd[299]: dm-0: Process '/usr/sbin/dmsetup udevflags 6293812' failed with exit code 1
...
systemd-udevd[299]: dm-0: '/usr/sbin/dmsetup udevcomplete 6293812'(err) '==372==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.'
systemd-udevd[299]: dm-0: Process '/usr/sbin/dmsetup udevcomplete 6293812' failed with exit code 1.
systemd-udevd[299]: dm-0: Command "/usr/sbin/dmsetup udevcomplete 6293812" returned 1 (error), ignoring.
so let's add a simple udev rule which sets $LD_PRELOAD for the block
subsystem.
Also, install the ASan library along with necessary dependencies into
the verity minimal image, to get rid of the annoying (yet harmless)
errors about missing library from $LD_LIBRARY.
When running integration tests under sanitizers D-Bus fails to
shutdown cleanly, causing unnecessary noise in the logs:
```
dbus-daemon[272]: ==272==LeakSanitizer has encountered a fatal error.
dbus-daemon[272]: ==272==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
dbus-daemon[272]: ==272==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
```
Since we're not "sanitizing" D-Bus anyway let's disable LSan's at_exit
check for the dbus.service to get rid of this error.
When a subshell is used ('make' or 'make all') the LOOPDEV environment
variable, which is used to store the opened loop device, is lost.
So the cleanup on trap/exit doesn't do anything, and the loop
device used to mount the test image is left around.
Avoid using a subshell to fix the issue.
The source package in the apt cache might be older than the
packaging from salsa.debian.org/systemd-team/systemd so it might not
list all the current binary packages.
This is currently the case for systemd-timesyncd, so TEST-30 fails.
Simply grep the control file rather than using apt-cache when iterating
over the packages contents.
Binaries on the latest Arch Linux use `call` instructions instead of
`callq`, which breaks the ASan detection and eventually the image
building process (due to insufficient space).
There may be situations where a cgroup should be protected from killing
or deprioritized as a candidate. In FB oomd xattrs are used to bias oomd
away from supervisor cgroups and towards worker cgroups in container
tasks. On desktops this can be used to protect important units with
unpredictable resource consumption.
The patch allows systemd-oomd to understand 2 xattrs:
"user.oomd_avoid" and "user.oomd_omit". If systemd-oomd sees these
xattrs set to 1 on a candidate cgroup (i.e. while attempting to kill something)
AND the cgroup is owned by root, it will either deprioritize the cgroup as
a candidate (avoid) or remove it completely as a candidate (omit).
Usage is restricted to root owned cgroups to prevent situations where an
unprivileged user can set their own cgroups lower in the kill priority than
another user's (and prevent them from omitting their units from
systemd-oomd killing).
Add NO_BUILD var to allow testing with no local build, by installing
local systemd files into the image.
This only works for debian-like distros currently, that use the
tools 'apt' and 'dpkg' for package management.
The $BUILD_DIR is only used in test-functions, and doesn't need to
be specified in any other scripts. Additionally, to be able to allow
the integration test suite to be run against locally installed binaries,
instead of built binaries, moving BUILD_DIR logic completely into
test-functions allows later patches to be simpler.
Building custom images for each test takes a lot of time.
Build the default one, and if the test needs incompatible changes
just copy it and extend it instead.
This reverts commit 73484ecff9.
3976f372ae moved libudev.so to be built in the
main directory, so this addition to $LD_LIBRARY_PATH is now obsolete.
After that commit, we build the following shared libraries:
build/libnss_myhostname.so.2
build/libnss_mymachines.so.2
build/libnss_resolve.so.2
build/libnss_systemd.so.2
build/libsystemd.so.0.30.0
build/libudev.so.1.7.0
build/pam_systemd.so
build/pam_systemd_home.so
build/src/boot/efi/stub.so
build/src/boot/efi/systemd_boot.so
build/src/shared/libsystemd-shared-247.so
EFI stubs don't matter, and libsystemd-shared-nnn.so is loaded through rpath,
and is doesn't need to and shouldn't be in $LD_LIBRARY_PATH. In effect, we only
ever need to add the main build directory to the search path.
Not all optional libraries might be available on developers machines,
so log and skip.
Also some pkg-config files are broken (eg: tss2 on Debian Stable) so
skip if the required variables are missing, and improve logs.
By default the test suite prefers qemu, and uses nspawn only if
a test specifically says it doesn't support qemu.
Add a variable to allow flipping the default, and run as many
tests under nspawn as possible.
Allows to split the test run in two parts. Most tests can run under
nspawn which is much faster, and they can be ran in one chunk with
TEST_NO_QEMU=1. The qemu-only tests, which are just a handful, can
be ran in another chunk with TEST_QEMU_ONLY=1.
Allows autopkgtest to be split in two parts.
The image build function greps for ExecStart lines in unit files, but some
of them (eg: systemd-firstboot) do not use a full path.
It then falls back to 'type -P' but that only works if you have the binary
installed. For optional binaries like systemd-firstboot, the installation
can then fail.
Manually check if the binary already exists in /[usr/][s]bin.
Usually on Debian ROOTLIBDIR is /lib/<arch triplet>, which is not the right place.
Use pkg-config since we define it, and then fallback to /usr/lib/systemd/user which is
the canonical location.
On both Debian&friends and Fedora dbus/dbus-broker install the user socket/service
under /usr/lib/systemd/user, not /lib/systemd/systemd/user.
I suspect the original version of the regex was written on a system,
which prints both the QEMU version and the QEMU package version in the
--version output, like Fedora:
$ /bin/qemu-system-x86_64 --version
QEMU emulator version 4.2.1 (qemu-4.2.1-1.fc32)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
However, Arch Linux prints only the QEMU version:
$ /bin/qemu-system-x86_64 --version
QEMU emulator version 5.2.0
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
This causes the awk regex to not match the version string, since there's
no whitespace after it, causing the version check to fail (as well as the
TEST-36-NUMAPOLICY) as well.
Follow-up for 43b49470d1.
Upgrading to qemu 5.2 breaks TEST-36-NUMAPOLICY like:
qemu-system-x86_64: total memory for NUMA nodes (0x0) should
equal RAM size (0x20000000)
Use the new (as in >=2014) form of memdev in test 36:
-object memory-backend-ram,id=mem0,size=512M -numa node,memdev=mem0,nodeid=0
Since some target systems are as old as qemu 1.5.3 (CentOS7) but the new
kind to specify was added in qemu 2.1 this needs to add version parsing and
add the argument only when qemu is >=5.2.
Fixes#17986.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
The systemd-user file has been moved from /etc/pam.d into /usr/lib/pam.d,
so test-functions needs to copy it from /usr/lib/pam.d instead.
This will copy it from either location.
When invoking "ldd" to find dependency libraries we already set
$LD_LIBRARY_PATH to point to our own build tree, so that our libraries
are checked, not the host libraries. This is not sufficient howeever, as
libudev is built in a subdir. Add that, too.
We have four legal cases:
1. /usr/lib/os-release exists and /etc/os-release is a symlink to it
2. both exist but /etc/os-release is not a symlink to /usr/lib/os-release
3. only /usr/lib/os-release exists
4. only /etc/os-release exists
The generic setup code in test-functions and create-busybox-image didn't handle
case 3.
The test-specific code in TEST-50 didn't handle 2 (because the general setup
code would only install /etc/os-release in the image and
grep -f /usr/lib/os-release would not work) and 4 (same reason) and would fail
in case 3 in generic setup.
Since the hwdb update from a79be2f807
the systemd-hwdb-update service started timing out under ASan when
compiled with gcc, as we started tripping over the 3 minutes timeout.
This affects only gcc runs, since the current gcc on Arch still suffers
from the detect_stack_use_after_return performance penalty[0]. Until
the fixed gcc is present in the respective repositories, let's bump
the timeout to 4 minutes, as we might not be able to upgrade right
away, due to systemd/systemd#16199.
Before the hwdb update:
[ 7958.292540] systemd[63]: systemd-hwdb-update.service: Executing: /usr/bin/time systemd-hwdb update
[ 7958.304005] systemd[1]: systemd-journald.service: Got notification message from PID 44 (FDSTORE=1)
[ 7958.314434] systemd[1]: systemd-journald.service: Added fd 3 (n/a) to fd store.
[ 8008.520082] systemd[1]: systemd-journald.service: Got notification message from PID 44 (WATCHDOG=1)
[ 8068.520151] systemd[1]: systemd-journald.service: Got notification message from PID 44 (WATCHDOG=1)
[ 8125.682843] time[63]: 84.47user 82.92system 2:47.50elapsed 99%CPU (0avgtext+0avgdata 811512maxresident)k
[ 8125.682843] time[63]: 0inputs+19680outputs (0major+25000853minor)pagefaults 0swaps
After the hwdb update:
[ 6215.491958] systemd[63]: systemd-hwdb-update.service: Executing: /usr/bin/time systemd-hwdb update
[ 6215.503380] systemd[1]: systemd-journald.service: Got notification message from PID 44 (FDSTORE=1)
[ 6215.514172] systemd[1]: systemd-journald.service: Added fd 3 (n/a) to fd store.
[ 6329.392918] systemd[1]: systemd-journald.service: Got notification message from PID 44 (WATCHDOG=1)
[ 6394.920205] time[63]: 89.48user 89.98system 2:59.55elapsed 99%CPU (0avgtext+0avgdata 812764maxresident)k
[ 6394.920205] time[63]: 0inputs+20568outputs (0major+27318354minor)pagefaults 0swaps
[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94910
Prompted by systemd/systemd#16111.
* check if /var is a mountpoint - if not, something went wrong. In case
of systemd/systemd#16111 the /failed file was created, because
systemd-cryptsetup failed, but it ended up being empty, making the result
check incorrectly pass
* forward journal messages to console - if we fail to mount /var,
journald won't flush logs to the persistent storage and we end up
empty handed and with no clue what went wrong
For example, without systemd/systemd#16111 and with this patch:
...
[FAILED] Failed to start systemd-cryptsetup@varcrypt.service.
See 'systemctl status systemd-cryptsetup@varcrypt.service' for details.
[DEPEND] Dependency failed for cryptsetup.target.
...
[ 3.882451] systemd-cryptsetup[581]: Key file /etc/varkey is world-readable. This is not a good idea!
[ 3.883946] systemd-cryptsetup[581]: WARNING: Locking directory /run/cryptsetup is missing!
[ 3.884846] systemd-cryptsetup[581]: Failed to load Bitlocker superblock on device /dev/disk/by-uuid/180ba5ef-873b-4018-9968-47c23431f71a: Invalid argument
...
[ 4.099451] sh[606]: + mountpoint /var
[ 4.100025] sh[603]: + systemctl poweroff --no-block
[ 4.101636] systemd[1]: Finished systemd-user-sessions.service.
[ 4.102598] sh[608]: /var is not a mountpoint
[FAILED] Failed to start testsuite-02.service.
Let's create new images public by default and then symlink/copy them
into the respective private directories afterwards, not the other way
around. This should fix a nasty race condition in parallel runs where
one tests attempts to copy the backing public image at the same moment
another test is already modifying it.
Support running tests in parallel by switching to copying of the
base image instead of symlinking it..
This still requires some setup steps, like running `make setup` on tests
which have unique $IMAGE_NAME beforehand (and sequentially), otherwise
they'll all try to create the same base image when started in parallel,
leading to nasty issues. However, as running the integration tests in
parallel is such an unusual use case it should be good enough, for now.
As Debian/Ubuntu use /lib/systemd instead of /usr/lib/systemd,
add systemd-journal-remote to the list of programs that test-functions
detects the correct path to, and replace its direct usage with
$SYSTEMD_JOURNAL_REMOTE
Also use $JOURNALCTL instead of journalctl.
Also minor correction in install_plymouth() to look in /lib/... as
well as /usr/lib/... and /etc/...
Remove the artifact files indicating test result (testok, failed, and
skipped) just before running the test so we always get the latest and
most relevant result instead of incorrectly consuming previous results.
Discovered in https://github.com/systemd/systemd/pull/15378#issuecomment-616801873
This doesn't really matter, since in non-/usr-merged systems plymouth
needs to be in /bin and on merged ones it doesn't matter, but it is
still prettier to insert the right path, and avoid /bin on merged
systems, since it's just a compat symlink.
Replaces: #15351
Using s-j-remote fixes the following issue: when coalescing files from multiple
inputs, simply copying all files with into the the same directory might
potentially mess things up, because a newer system.journal might overwrite an
older journal. This happens because we run multiple tests from the same image,
and need to clean out the directory after each run.
By using systemd-journal-remote, we nicely coalesce all files. This has the
advantage that if there aren't too many logs, we end up with just one journal
file.
ARTIFACT_DIRECTORY is for ubuntuautopackagetests, where the journal files are
copied to a separate directory to preserve after tests have been run. This
functionality can now be recreated by setting
ARTIFACT_DIRECTORY=$AUTOPKGTEST_ARTIFACTS.
It is more trouble than it is worth. The setup is of a loopback device
is very quick, so it's better to always create it when needed and
immediately drop afterwards.
This causes the unprivileged-nspawn-root directory to be removed
after running one test. The advantage is that we reduce the maximum
disk-space use quite a bit (47*400 MB → about 18GB).
has-overflow was a temporary hack that was removed in
844da987ef (Oct. 2016). All the makefiles
can be the same, and all the targets can be handled identically.
Before, we'd copy the test tree into nspawn-root, and run the tests from there.
This is OK, and doesn't actually take much extra time. But it uses quite a lot
of extra disk space. So let's make things a bit more efficient by running
directly from the image file.
We still run the unprivileged nspawn tests from a copy. Once the kernel
implements fs shift, we can do away with that too.
Before, we'd create a separate image for each test, in
/var/tmp/systemd-test.XXXXX/rootdisk.img. Most of the images
where very similar, except that each one had some unit files installed
specifically for the test. The installation of those custom unit files
was removed in previous commits (all the unit files are always installed).
The new approach is to only create as few distinct images as possible.
We have:
default.img: the "normal" image suitable for almost all the tests
basic.img: the same as default image but doesn't mask any services
cryptsetup.img: p2 is used for encrypted /var
badid.img: /etc/machine-id is overwritten with stuff
selinux.img: with selinux added for fun and fun
and a few others:
ls -l build/test/*img
lrwxrwxrwx 1 root root 38 Mar 21 21:23 build/test/badid.img -> /var/tmp/systemd-test.PJFFeo/badid.img
lrwxrwxrwx 1 root root 38 Mar 21 21:17 build/test/basic.img -> /var/tmp/systemd-test.na0xOI/basic.img
lrwxrwxrwx 1 root root 43 Mar 21 21:18 build/test/cryptsetup.img -> /var/tmp/systemd-test.Tzjv06/cryptsetup.img
lrwxrwxrwx 1 root root 40 Mar 21 21:19 build/test/default.img -> /var/tmp/systemd-test.EscAsS/default.img
lrwxrwxrwx 1 root root 39 Mar 21 21:22 build/test/nspawn.img -> /var/tmp/systemd-test.HSebKo/nspawn.img
lrwxrwxrwx 1 root root 40 Mar 21 21:20 build/test/selinux.img -> /var/tmp/systemd-test.daBjbx/selinux.img
lrwxrwxrwx 1 root root 39 Mar 21 21:21 build/test/test08.img -> /var/tmp/systemd-test.OgnN8Z/test08.img
I considered trying to use the same image everywhere. It would probably be
possible, but it would be very brittle. By using separate images where it is
necessary we keep various orthogonal modifications independent.
The way that images are cached is complicated by the fact that we still
want to keep them in /var/tmp. Thus, an image is created on first use and
linked to from build/test/ so it can be found by other tests.
Tests cannot be run in parallel. I think that is an acceptable limitation.
Creation of the images was probably taking more resources then the actual
tests, so we should be better off anyway.
We had an fstab for the sole purpose of remounting "/" rw. Mounting root ro
is a pointless excercise in obsolete approaches. More importantly, the nspawn
image is now the same as the qemu one.
The two timezone files are now installed in the global setup. I am not too
happy about this, but it still seems better than to create a completely
separate image just for this.
I picked the list of zone files to install by grepping through the code. This
is is a bit brittle, but installing all of them takes a while, and more
importantly, writes a lot of lines to the log.
As in 2a5fcfae02
and in 3e67e5c992
using /usr/bin/env allows bash to be looked up in PATH
rather than being hard-coded.
As with the previous changes the same arguments apply
- distributions have scripts to rewrite shebangs on installation and
they know what locations to rely on.
- For tests/compilation we should rather rely on the user to have setup
there PATH correctly.
In particular this makes testing from git easier on NixOS where do not provide
/bin/bash to improve compose-ability.
Having /etc/securetty in test containers prevents root from logging into
them:
```
Jan 31 10:15:11 systemd-testsuite login[69]: pam_securetty(login:auth): access denied: tty 'pts/0' is not secure !
Jan 31 10:15:11 systemd-testsuite login[69]: FAILED LOGIN 1 FROM pts/0 FOR root, Authentication failure
```
The test exercises that PrivateTmp=yes and ProtectHome={read-only,tmpfs}
directives work as expected when PrivateUsers=yes in a user manager.
Some code is also added to test-functions to help set up test cases that
exercise the user manager.
Meson appears to set the rpath only for some binaries it builds, but not
all. (The rules are not clear to me, but that's besides the point of
this commit).
Let's make sure if our test script operates on a binary that has no
rpath set we fall back preferably to the BUILD_DIR rather than directly
to the host.
This matters if a test uses a libsystemd symbol introduced in a version
newer than the one on the host. In that case "ldd" will not work on the
test binary if rpath is not set. With this fix that behaviour is
corrected, and "ldd" works correctly even in this case.
(Or in other words: before this fix on binaries lacking rpath we'd base
dependency info on the libraries of the host, not the buidl tree, if
they exist in both.)
We currently use the host's kernel and initramfs in our QEMU tests.
If the host is running on an encrypted LUKS partition, then the initramfs
will have a crypttab setup looking for the particular root disk it needs to
encrypt before booting into the system.
However, this disk obviously doesn't exist in our QEMU VM, so it turns out
our tests end up waiting for this device to become available, which will
never actually happen, and boot hangs for 90s until that service times out.
[*** ] A start job is running for /dev/disk/by-uuid/01234567-abcd-1234-abcd-0123456789ab (20s / 1min 30s)
In order to prevent this issue, let's pass "rd.luks=0" to disable LUKS in
the initramfs only as part of our default kernel command-line in our QEMU
tests.
This is enough to disable this behavior and prevent the timeout, while at
the same time doesn't conflict with our tests that actually check for LUKS
behavior in the systemd running under test (such as TEST-02-CRYPTSETUP).
Tested: `sudo make -C TEST-02-CRYPTSETUP/ clean setup run`
Many tests were also masking systemd-machined.service. But machined
should only start when activated, so having it not masked shouldn't be
noticable. TEST-25-IMPORT needs it.
We should never have used an unprefixed environment variable name.
All other systemd-nspawn variables have the "SYSTEMD_NSPAWN_" prefix,
and all other systemd variables have the "SYSTEMD_" prefix.
The new variable name takes precedence, but we fall back to checking the
old one. If only the old one is found, a warning is emitted.
In addition, SYSTEMD_NSPAWN_UNIFIED_HIERARCHY="" is accepted as an override
to avoid looking for the old variable name.
We have a variable with the same name ($UNIFIED_CGROUP_HIERARCHY) in tests,
which governs both systemd-nspawn and qemu behaviour. It is not renamed.
This avoids unnecessary noise in the stderr logs which dd always produces,
such as:
0+0 records in
0+0 records out
0 bytes copied, 0.000155284 s, 0.0 kB/s
Using truncate should not result in any functional change; the image will
still be created as a sparse file of the size specified.
In Ubuntu CI, we test binaries from the installed system, not from
$BUILD_DIR, so use the appropriate binary. Most of the calls to the
binaries are part of checking/processing asan-built binaries, and so
did not apply to Ubuntu CI, except for generating noise in the stderr
log like:
objdump: '/tmp/autopkgtest.83yGoI/build.fHB/src/test/TEST-01-BASIC/systemd-journald': No such file
However this also applies to the call to systemd-nspawn, which the debian
upstream test wrapper was sed-adjusting to use the installed binary
instead of the binary in $BUILD_DIR. This commit allows removing that
sed processing of the test-functions file during Ubuntu CI test.
This dir is created by create_empty_image_rootdir, as well as indirectly
by some other functions, but it should be created by import_initdir so
the newly-exported $initdir exists and can be used immediately without
relying on other functions to create it.
Only umount it during cleanup if the $TESTDIR/root dir is a mountpoint.
This avoids adding noise to the stderr log such as:
mountpoint: /var/tmp/systemd-test.waLOFT/root: No such file or directory
To make debugging much easier, especially for crashes in tests under
QEMU, let's store the entire coredump bundle in the systemd journal,
which is usually kept around by various CIs. Right now, we usually end
up with a journal, but without the coredump itself, which is pretty
useless.
without using -type f, the logs print an error such as:
E: E: modprobe: FATAL: Module asymmetric_keys not found in directory /lib/modules/4.15.0-54-generic
while this doesn't appear to cause problems, it can be extremely
distracting when trying to debug real failures.
Almost all tests were manually mounting/unmounting $TESTDIR/root
from the loopback image; this moves all that into test-functions
so the test setup functions are simplier.
Also add test_setup_cleanup() function, to cleanup what is mounted
by create_empty_image_rootdir()
In certain cases we might attempt to install a binary which is already
present in the test image, yet it's missing from the host system.
In such cases, let's check if the binary indeed exists in the image
before doing any other chcecks. If it does, immediately return with
success.
This was discovered during installation of
/usr/lib/systemd/systemd-bless-boot, which was not present in Ubuntu CI
(as the installed systemd was from the Ubuntu repositories), and the
binary itself was already in the image thanks to `ninja install`.
However, during extraction of binaries from the systemd service files,
another attempt to install this binary was made, which failed due to
`find_binary` being unable to find it.
Running integration tests with ASan is somewhat tricky to begin with, as
we need to pre-load the ASan runtime DSO for certain services (like
dbus), otherwise they won't start or behave as expected. In case of gcc
this is pretty easy, as we need the runtime DSO during compilation, so
it's already present on the host system. For clang things get more
complicated, as ASan is compiled in statically by default, thus to
enable the necessary dynamic-ish behavior one needs to compile with
-shared-libasan and then correctly set LD_PRELOAD_PATH, as the runtime
libraries are not in a standard library path.
Certain distributions (e.g. Arch Linux) require booting with initrd, as
they lack support for commonly used filesystems in the kernel (i.e. the
support is compiled in as modules)
The `mount` utility has an unexpected behavior when run with libasan,
causing false-positives during the integration testing.
For example, on Arch Linux with LD_PRELOAD pointing to libasan:
```
bash-5.0# mount -o remount,rw -v /
mount: /dev/sda1 mounted on /.
bash-5.0# echo $?
1
```
However:
```
bash-5.0# LD_PRELOAD= mount -o remount,rw -v /
mount: /dev/sda1 mounted on /.
bash-5.0# echo $?
0
```
Further investigation with strace shows a LeakSanitizer error:
```
bash-5.0# strace -s 512 mount -o remount,rw -v /
...
write(2, "==355==LeakSanitizer has encountered a fatal error.\n", 52) = -1 EBADF (Bad file descriptor)
write(2, "ReportFile::Write() can't output requested buffer!\n", 51) = -1 EBADF (Bad file descriptor)
exit_group(1) = ?
+++ exited with 1 +++
```
Let's workaround this by clearing the LD_PRELOAD variable for
systemd-remount-fs.service
We had all kinds of indentation: 2 sp, 3 sp, 4 sp, 8 sp, and mixed.
4 sp was the most common, in particular the majority of scripts under test/
used that. Let's standarize on 4 sp, because many commandlines are long and
there's a lot of nesting, and with 8sp indentation less stuff fits. 4 sp
also seems to be the default indentation, so this will make it less likely
that people will mess up if they don't load the editor config. (I think people
often use vi, and vi has no support to load project-wide configuration
automatically. We distribute a .vimrc file, but it is not loaded by default,
and even the instructions in it seem to discourage its use for security
reasons.)
Also remove the few vim config lines that were left. We should either have them
on all files, or none.
Also remove some strange stuff like '#!/bin/env bash', yikes.
By default the run_qemu() function enables KVM automatically
if it detects the /dev/kvm char device and if the machine is not
already a KVM one. Let's add a TEST_NO_KVM env variable to suppress
this detection.
Make the interactive debugging of (particularly QEMU) machines less
painful, by replacing the default vt220 TERM with linux one, and
by not shutting down the machine after running the test itself.
We'd install the service file, and then dbus-broker-launcher because it is
mentioned in ExecStart=, but not the main executable, so nothing would work.
Let's just install dbus-broker executables if found. They are small, so this
doesn't matter much, and is much easier than figuring the exact conditions
under which dbus-broker will be used instead of dbus-daemon.
Since systemd 206 the combination of systemd and mkinitcpio
causes, under certain conditions, the rootfs to be double fsck'd.
Symptoms:
```
:: performing fsck on '/dev/sda1'
systemd: clean, 3523/125488 files, 141738/501760 blocks
********************** WARNING **********************
* *
* The root device is not configured to be mounted *
* read-write! It may be fsck'd again later. *
* *
*****************************************************
<snip>
[ OK ] Started File System Check on Root Device
```
This occurs when neither 'ro' or 'rw', or only 'ro' is present
on the kernel command line. The solution is to mount the roofs
as read-write on the kernel command line, so systemd knows to not fsck
it again.
If the QEMU_SMP value has not been explicitly set, try to determine it
from the number of online CPUs using the nproc utility. If this approach
fails, fall back to the default value QEMU_SMP=1.
This change should significantly help when running integration tests
under QEMU on multicore systems.
Currently /boot/initramfs-linux.img is used as the default initrd for ArchLinux.
Although, since the kernel modules that are not necessary for the host environment are removed from
initramfs-linux.img by mkinitcpio 's autodetect hook, the kernel modules necessary for qemu may be missing.
(ata_piix, ext4, and so on in my case.)
As a result, the test environment may not be built properly and the test will be failed.
initramfs-linux-fallback.img will skip this autodetect hook, so the test will run successfully in more
environments.
Both initramfs-linux.img and initramfs-linux-fallback.img are generated by default.
rescue.service pulls in /bin/plymouth, which doesn't exist on some
distributions (e.g. Arch Linux). Let's mark it as optional, as it's not
even required by the referencing unit and causes unwanted fails in the
integration testsuite.
Fedora Rawhide renamed dbus.service to dbus-daemon.service - that
breaks tests which require working DBus (e.g. TEST-03-JOBS)
Excerpt from the dbus.spec:
The 'dbus' package is only retained for compatibility purposes. It will
eventually be removed and then replaced by 'Provides: dbus' in the
dbus-daemon package. It will then exclusively be used for other packages to
describe their dependency on a system and user bus. It does not pull in any
particular dbus *implementation*, nor any libraries. These should be pulled
in, if required, via explicit dependencies.
When parsing and installing binaries mentioned in Exec*= lines the
5ed0dcf4d5 commit added parsing logic to drop
prefixes, including handling duplicate exclamation marks. But this did not
handle arbitrary combination of multiple prefixes, ie. StartExec=+-/bin/sh was
parsed as -/bin/sh which then would fail to install.
Instead of using egrep and shell replacements, replace both with sed command
that does it all. This sed script extract a group of characters starting with a
/ up to the first space (if any) after the equals sign. This correctly handles
existing non-prefixed, prefixed, multiple-prefixed commands.
About half commands seem to repeat themself, thus sort -u cuts the list of
binaries to install about in half.
To validate change of behaviour both old and new functions were modified to
echo parsed binaries into separate files, and then diffed. The incorrect
-/bin/sh was missing in the new output.
Without this patch tests fail on default Ubuntu installs.
Nested KVM is very flaky as we learnt from our CI. Hence, let's avoid
KVM whenever we detect we are already running inside of KVM.
Maybe one day nested KVM is fixed, at which point we can turn this on
again, but for now let's simply avoid nested KVM, since reliable CI is
more important than quick CI, I guess.
And yes, avoiding KVM for our qemu runs does make things substantially
slower, but I think it's not a complete loss.
Inspired by @evverx' findings in:
https://github.com/systemd/systemd/pull/8701#issuecomment-380213302
This makes the script wait for the newly created partition to
show up before trying to put a filesystem on it, which should
prevent the tests from failing with the following error:
```
New situation:
Disklabel type: dos
Disk identifier: 0x3541a0ec
Device Boot Start End Sectors Size Id Type
/dev/loop6p1 2048 800767 798720 390M 83 Linux
/dev/loop6p2 800768 819199 18432 9M 83 Linux
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
The file /dev/loop6p1 does not exist and no size was specified.
make: *** [setup] Error 1
F: Failed to mkfs -t ext4
Makefile:4: recipe for target 'setup' failed
```
Let's be more restrictive when validating PID files and MAINPID=
messages: don't accept PIDs that make no sense, and if the configuration
source is not trusted, don't accept out-of-cgroup PIDs. A configuratin
source is considered trusted when the PID file is owned by root, or the
message was received from root.
This should lock things down a bit, in case service authors write out
PID files from unprivileged code or use NotifyAccess=all with
unprivileged code. Note that doing so was always problematic, just now
it's a bit less problematic.
When we open the PID file we'll now use the CHASE_SAFE chase_symlinks()
logic, to ensure that we won't follow an unpriviled-owned symlink to a
privileged-owned file thinking this was a valid privileged PID file,
even though it really isn't.
Fixes: #6632
This catches errors like "ninja not found", missing programs etc. early,
instead of silently ignoring them and trying to boot a broken VM.
In install_config_files(), allow some distro specific files to be absent
(such as /etc/sysconfig/init).
All test/TEST* but TEST-02-CRYPTSETUP share the same check_result_qemu()
and test_cleanup(), so move them into test_functions and only override
them in TEST-02-CRYPTSETUP.
Also provide a common test_run() which by default assumes that both QEMU
and nspawn tests are run. Particular tests which don't support either
need to explicitly opt out by setting $TEST_NO_{QEMU,NSPAWN}. Do it this
way around to avoid accidentally forgetting to opt in, and to encourage
test authors to at least always support nspawn.
Automatic rebuilding is removed: it doesn't play well with ninja, because
ninja always writes logs, and even if nothing needs to be built, it will
make the log file owned by root. So let's just remove this, and say that
the user must always do the build first.
I'm also keeping make for the tests, because ninja doesn't play well with
sudo.
Since the build directory is arbitrary, it needs to be specified, e.g.
sudo make BUILD_DIR=/home/zbyszek/src/systemd/build1 -C test/TEST-01-BASIC/
If run_qemu() exits with non-zero, this either meant that QEMU was not
available (which should be a SKIP) or that QEMU timed out if $QEMU_TIMEOUT was
set (which then should be a FAIL).
Limit the exit code of run_qemu() to QEMU availability only, and track timeouts
separately through the new $TIMED_OUT variable, which is then checked in
check_result_qemu().
Do the same for $NSPAWN_TIMEOUT and run_nspawn() so that nspawn and QEMU work
similarly.
There are many cgroups-related changes (thanks, @htejun!)
This commit will simplify testing a bit.
Use:
make run UNIFIED_CGROUP_HIERARCHY=yes to enable cgroup-v2
make run UNIFIED_CGROUP_HIERARCHY=no to enable cgroup-v1
systemd-udev generated an insane amount of log output at debug level.
It would break TEST-02-CRYPTSETUP by filling the overflowing the disk
(which seems to be a bug in itself!).
WARNING: Image format was not specified for
'/var/tmp/systemd-test.tGi3od/rootdisk.img' and probing guessed raw.
Automatically detecting the format is dangerous for raw images, write
operations on block 0 will be restricted. Specify the 'raw' format
explicitly to remove the restrictions.
Also use unsafe caching mode, we don't care about data integrity here.
Fixes:
$ cd test/TEST-07-ISSUE-1981/
$ sudo make clean setup run
...
timeout: failed to run command ‘systemd-nspawn’: No such file or directory
...
TEST RUN: https://github.com/systemd/systemd/issues/1981 [FAILED]
Makefile:10: recipe for target 'run' failed
make: *** [run] Error 1
* Use $ROOTLIBDIR/systemd always
* Don't pass $ROOTLIBDIR/systemd as the first argument:
$ cat /proc/1/cmdline
/lib/systemd/systemd/lib/systemd/systemd...