IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Let's detect & wrap binaries which are linked against systemd DSOs and
we're running under ASan, since otherwise running such binaries ends
with:
```
==633==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
```
Since we unset $LD_PRELOAD in the testsuite-* units (due to another
issue), let's store the path to the ASan DSO in another env variable, so
we can easily access it in the testsuite scripts when needed.
If rngd is included in the host initrd, QEMU guests need at least one source of
entropy otherwise rngd will refuse to start. Hence this patch enables the
virtio RNG device in QEMU guests (exposed as a HW RNG device available at
/dev/hwrng).
As a safety measure, the patch limits the data sent to the guest to 1KB per
second in order to not let the guest starve the host entropy.
We used both "qemu" and "QEMU", let's use the lower-case version everywhere
since it's also the name of the binary and the version that people are
most familiar with.
The stuff under test/ is not only for the integeration tests, but also
for various other test-related stuff, so adjust the docs a bit.
Since c6552ad we now try to collect coverage even in situations where
it's basically impossible (like in test-mount-util where the whole / is
mounted as read-only). As dealing with this is not worth the trouble,
let's ignore the missing coverage errors thrown by gcov in such cases.
Add tests for enrolling and unlocking. Various cases are tested:
- Default PCR 7 policy w/o PIN, good and bad cases (wrong PCR)
- PCR 7 + PIN policy, good and bad cases (wrong PCR, wrong PIN)
- Non-default PCR 0+7 policy w/o PIN, good and bad cases (wrong PCR 0)
v2: rename test, fix tss2 library installation, fix CI failures
v3: fix ppc64, load module
IIUC, pipefail doesn't matter for a sequence of commands joined with &&, and we
don't have any pipes. And such a failing expression also does not trigger an
exit, so the set +e/set -e were noops.
It should make it easier to figure out what exactly services do there.
For example, with SYSTEMD_LOG_LEVEL=debug userdbd (v249) prints
```
varlink-5: New incoming message: {"method":"io.systemd.UserDatabase.GetUserRecord","parameters":{}}
```
before it crashes and systemd-resolved prints
```
varlink-21: New incoming message: {"method":"io.systemd.Resolve.ResolveAddress","parameters":{"address":[127,0,0,1],"flags":0,"ifindex":1000000,"family":0}}
```
and those messages are helpful (especially when scripts causing them
aren't clever enough to keep track of random stuff they send to systemd
:-))
otherwise units using `DynamicUser=yes` won't be able to write the
coverage stats (currently affecting TEST-20-MAINPIDGAMES).
`DynamicUser=yes` implies `ProtectSystem=strict` and
`ProtectHome=read-only` and can't be overridden hence we need to
utilize `ReadWritePaths=` to work around that.
If the test logs contain lines like:
```
...systemd-resolved[735885]: profiling:/systemd-meson-build/src/shared/libsystemd-shared-250.a.p/base-filesystem.c.gcda:Cannot open
```
it means we're possibly missing some coverage since gcov can't write the stats,
usually due to the sandbox being too restrictive (e.g. ProtectSystem=yes,
ProtectHome=yes) or the $BUILD_DIR being inaccessible to non-root users.
If we're built with `-Dportable=false`, the portable profiles won't get
installed into the image. Since we need only the profile files and
nothing else, let's copy them into the image explicitly in such case.
A bind mount is added directly from private on the host to the actual
destination directory, no need for the symlinks (which cannot be created
as the bind mount happens first and creates the target as an actual directory)
Fixes https://github.com/systemd/systemd/issues/22264
Wraps nspawn to be able to use pexpect. The test logs in on the console
and runs screen. In one screen window it types in shutdown commands and
checks whether a wall message was sent to the other.
19:50:59 F: Missing a shared library required by /var/tmp/systemd-test.NIPT2q/root/lib/systemd/libsystemd-core-250.so.
19:50:59 F: Run "ldd /var/tmp/systemd-test.NIPT2q/root/lib/systemd/libsystemd-core-250.so" to find out what it is.
19:50:59 F: libsystemd-shared-250.so => not found
19:50:59 F: Cannot create a test image.
Writing a byte to test10.socket is actually the root cause of issue #19154:
depending on the timing, it's possible that PID1 closes the socket before socat
(or nc, it doesn't matter which tool is actually used) tries to write that one
byte to the socket. In this case writing to the socket returns EPIPE, which
causes socat to exit(1) and subsequently make the test fail.
Since we're only interested in connecting to the socket and triggering the rate
limit of the socket, this patch removes the parts that write the single byte to
the socket, which should remove the race for good.
Since it shouldn't matter whether the test uses socat or nc, let's switch back
to nc and hence remove the sole user of socat. The exit status of nc is however
ignored because some versions might choke when the socket is closed
unexpectedly.
In order to avoid inflating the dependency list for the core
library, use dlopen when inspecting elfs, since it's only
used in two non-core executables.
so we can run TEST-46 under sanitizers once again.
`systemd-homed` runs fsck on home directories, which reports a memory
leak we're not interested in. Let's introduce an LSan suppression file
to get around this. Since the patterns in the suppression file are
matched using basic substring match[0], they're a bit cumbersome, but
should get the work one.
[0] https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer#suppressions
Example leaks (as reported by TEST-46):
```
systemd-homed[1333]: =================================================================
systemd-homed[1333]: ==1333==ERROR: LeakSanitizer: detected memory leaks
systemd-homed[1333]: Direct leak of 24 byte(s) in 1 object(s) allocated from:
systemd-homed[1333]: #0 0x7f0c8facccd1 in calloc (/usr/lib/clang/12.0.1/lib/linux/libclang_rt.asan-x86_64.so+0xf4cd1)
systemd-homed[1333]: #1 0x558d9494ff67 (/usr/bin/fsck+0x3f67)
systemd-homed[1333]: Direct leak of 6 byte(s) in 1 object(s) allocated from:
systemd-homed[1333]: #0 0x7f0c8fa906c1 in strdup (/usr/lib/clang/12.0.1/lib/linux/libclang_rt.asan-x86_64.so+0xb86c1)
systemd-homed[1333]: #1 0x558d949518fd (/usr/bin/fsck+0x58fd)
systemd-homed[1333]: SUMMARY: AddressSanitizer: 30 byte(s) leaked in 2 allocation(s).
systemd-homed[1337]: ==1337==WARNING: Symbolizer was blocked from starting itself!
systemd-homed[1337]: =================================================================
systemd-homed[1337]: ==1337==ERROR: LeakSanitizer: detected memory leaks
systemd-homed[1337]: Direct leak of 67584 byte(s) in 1 object(s) allocated from:
systemd-homed[1337]: #0 0x7f01edb84b19 (/usr/lib/clang/12.0.1/lib/linux/libclang_rt.asan-x86_64.so+0xf4b19)
systemd-homed[1337]: #1 0x7f01e8326829 (/usr/bin/../lib/libLLVM-12.so+0xb46829)
systemd-homed[1337]: SUMMARY: AddressSanitizer: 67584 byte(s) leaked in 1 allocation(s).
```
With the suppression file:
```
systemd-homed[1339]: -----------------------------------------------------
systemd-homed[1339]: Suppressions used:
systemd-homed[1339]: count bytes template
systemd-homed[1339]: 2 30 /bin/fsck$
systemd-homed[1339]: -----------------------------------------------------
systemd-homed[1343]: ==1343==WARNING: Symbolizer was blocked from starting itself!
systemd-homed[1343]: -----------------------------------------------------
systemd-homed[1343]: Suppressions used:
systemd-homed[1343]: count bytes template
systemd-homed[1343]: 1 67584 /lib/libLLVM
systemd-homed[1343]: -----------------------------------------------------
```
Depending on the location of the original build dir, either ProtectHome=
or ProtectSystem= may get in the way when creating the gcov metadata
files.
Follow-up to:
* 02d7e73013
* 6c9efba677
Otherwise we miss quite a lot of coverage (mainly from logind,
hostnamed, networkd, and possibly others), since they can't write their
reports with `ProtectSystem=strict`.
If -Db_coverage=true is used at build time, then ARTIFACT_DIRECTORY/TEST-XX-FOO.coverage-info
files are created with code coverage data, and run-integration-test.sh also
merges them into ARTIFACT_DIRECTORY/merged.coverage-info since the coveralls.io
helpers accept only a single file.
Compared to PID1 where systemd-oomd has to be the client to PID1
because PID1 is a more privileged process than systemd-oomd, systemd-oomd
is the more privileged process compared to a user manager so we have
user managers be the client whereas systemd-oomd is now the server.
The same varlink protocol is used between user managers and systemd-oomd
to deliver ManagedOOM property updates. systemd-oomd now sets up a varlink
server that user managers connect to to send ManagedOOM property updates.
We also add extra validation to make sure that non-root senders don't
send updates for cgroups they don't own.
The integration test was extended to repeat the chill/bloat test using
a user manager instead of PID1.
The `dracut_install` is a misnomer, since the systemd integration test
suite is based on the original dracut's test suite, and not all the
references to dracut has been edited out. Let's fix that.
otherwise we might mark tests where something crashes during shutdown as
successful, as happened in one of the recent TEST-01-BASIC runs:
```
testsuite-01.service: About to execute rm -f /failed /testok
testsuite-01.service: Forked rm as 606
testsuite-01.service: Executing: rm -f /failed /testoktestsuite-01.service: Changed dead -> start-pre
Starting TEST-01-BASIC...
...
Child 606 (rm) died (code=exited, status=0/SUCCESS)
testsuite-01.service: Child 606 belongs to testsuite-01.service.
testsuite-01.service: Control process exited, code=exited, status=0/SUCCESS (success)
testsuite-01.service: Got final SIGCHLD for state start-pre.
testsuite-01.service: Passing 0 fds to service
testsuite-01.service: About to execute sh -e -x -c "systemctl --state=failed --no-legend --no-pager >/failed ; systemctl daemon-reload ; echo OK >/testok"
testsuite-01.service: Forked sh as 607
testsuite-01.service: Changed start-pre -> start
testsuite-01.service: Executing: sh -e -x -c "systemctl --state=failed --no-legend --no-pager >/failed ; systemctl daemon-reload ; echo OK >/testok"systemd-journald.service: Got notification message from PID 560 (FDSTORE=1)S
...
testsuite-01.service: Child 607 belongs to testsuite-01.service.
testsuite-01.service: Main process exited, code=exited, status=0/SUCCESS (success)
testsuite-01.service: Deactivated successfully.
testsuite-01.service: Service will not restart (restart setting)
testsuite-01.service: Changed start -> dead
testsuite-01.service: Job 207 testsuite-01.service/start finished, result=done
[ OK ] Finished TEST-01-BASIC.
...
end.service: About to execute /bin/sh -x -c "systemctl poweroff --no-block"
end.service: Forked /bin/sh as 623end.service: Executing: /bin/sh -x -c "systemctl poweroff --no-block"
...
end.service: Job 213 end.service/start finished, result=canceled
Caught <SEGV>, dumped core as pid 624.
Freezing execution.
CentOS Linux 8
Kernel 4.18.0-305.12.1.el8_4.x86_64 on an x86_64 (ttyS0)
H login: qemu-kvm: terminating on signal 15 from pid 80134 (timeout)
E: Test timed out after 600s
Spawning getter /root/systemd/build/journalctl -o export -D /var/tmp/systemd-test.0UYjAS/root/var/log/journal/ca6031c2491543fe8286c748258df8d1...
Finishing after writing 15125 entries
Spawning getter /root/systemd/build/journalctl -o export -D /var/tmp/systemd-test.0UYjAS/root/var/log/journal/remote...
Finishing after writing 0 entries
-rw-r-----. 1 root root 25165824 Aug 20 12:26 /var/tmp/systemd-test.0UYjAS/system.journal
TEST-01-BASIC RUN: Basic systemd setup [OK]
...
NO_BUILD=1 indicates that we want to test systemd from the local system and not
the one from the local build. Hence there should be no need to call
find-build-dir.sh when NO_BUID=1 especially since it's likely that the script
will fail to find a local build in this case.
This avoids find-build-dir.sh to emit 'Specify build directory with $BUILD_DIR'
message when NO_BUILD=1 and no local build can be found.
This introduces a behavior change though: systemd from the local system will
always be preferred when NO_BUILD=1 even if a local build can be found.
In some cases image names are unpredictable - some orchestrators/deployment
tools like to mangle names to suit their internal formats. In these cases,
the requirement that the extension-release file matches exactly the image
name where it's contained cannot work.
Allow falling back to loading the first regular file which name starts with
'extension-release' located in /usr/lib/extension-release.d/ and tagged with
a user.extension-release.strict extended attribute with a true value, if the
one with the expected name cannot be found.
Skip a harmless error when running the tests on a system with a significantly
older systemd version (ldd tries to resolve the unprefixed RPATH for libsystemd.so.0,
which is in this case older than the already installed libsystemd.so.0 in $initdir).
The issue is triggered by installing test dependencies in install_missing_libraries().
Spotted on CentOS 8.
```
$ ldd /var/tmp/systemd-test.nZO11F/root/lib/systemd/tests/test-sd-device-thread
/var/tmp/systemd-test.nZO11F/root/lib/systemd/tests/test-sd-device-thread: /lib64/libsystemd.so.0: version `LIBSYSTEMD_240' not found (required by /var/tmp/systemd-test.nZO11F/root/lib/systemd/tests/test-sd-device-thread)
linux-vdso64.so.1 (0x00007fffb79d0000)
libclang_rt.asan-powerpc64le.so => /usr/lib64/clang/11.0.0/lib/linux/libclang_rt.asan-powerpc64le.so (0x00007fffb6ef0000)
libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007fffb6d20000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fffb6cd0000)
libc.so.6 => /lib64/libc.so.6 (0x00007fffb6ab0000)
$ LD_LIBRARY_PATH=/var/tmp/systemd-test.nZO11F/root/lib64/ ldd /var/tmp/systemd-test.nZO11F/root/lib/systemd/tests/test-sd-device-thread
linux-vdso64.so.1 (0x00007fffaba80000)
libclang_rt.asan-powerpc64le.so => /usr/lib64/clang/11.0.0/lib/linux/libclang_rt.asan-powerpc64le.so (0x00007fffaafa0000)
libsystemd.so.0 => /var/tmp/systemd-test.nZO11F/root/lib64/libsystemd.so.0 (0x00007fffaa5f0000)
libpthread.so.0 => /var/tmp/systemd-test.nZO11F/root/lib64/libpthread.so.0 (0x00007fffaa5a0000)
libc.so.6 => /var/tmp/systemd-test.nZO11F/root/lib64/libc.so.6 (0x00007fffaa380000)
```
When `linux-headers` is installed on Arch Linux, it stores the module
source tree in the kernel module directory, which is then picked up by
`find` and we get a lot of harmless but annoying errors:
```
...
modprobe: FATAL: Module Kconfig.iosched not found in directory /lib/modules/5.13.7-arch1-1
modprobe: FATAL: Module Kconfig not found in directory /lib/modules/5.13.7-arch1-1
modprobe: FATAL: Module Kconfig not found in directory /lib/modules/5.13.7-arch1-1
modprobe: FATAL: Module dm-mpath.h not found in directory /lib/modules/5.13.7-arch1-1
modprobe: FATAL: Module dm-bio-prison-v2.h not found in directory /lib/modules/5.13.7-arch1-1
modprobe: FATAL: Module raid0.h not found in directory /lib/modules/5.13.7-arch1-1
...
```
Let's fix this by trying to install only kernel modules (*.ko files with
an optional compression).
Let's unify handling of the boolean values throughout the test-functions
code, since we use 0/1, true/false, and yes/no almost randomly in many
places, so picking the right values during CI configuration can be a real
pain.
Saving the journal for passing tests creates a huge amount of unneeded
data stored for each full test run. Add a env var to allow saving the
journal only for failed tests.
Due to a little misunderstanding the last patch doesn't work as
expected, since test_create_image() is called only for the first image
(usually TEST-01-BASIC), and all subsequent images are then (possibly)
modified with test_append_files().
Follow-up to 179ca4d2b1.
It turns out the "supporting services" were run in _all_ tests if
TEST-01-BASIC was run as the first test (which is usually the case),
since with the original condition in test_create_image() we would skip
the masking and then propagate the change to the default image used by
other tests. This has been causing multiple bogus test timeouts
(especially when the hwdb was being rebuilt in tests with short
timeouts, like TEST-52-HONORFIRSTSHUTDOWN).
Let's "fix" this by making the call to mask_supporting_services()
uncoditional and override the test_create_image() function in
TEST-01-BASIC to avoid the masking in this single case.
The three-argument match() is a GNU AWK extension, thus breaking the
compatibility with mawk (used on Ubuntu/Debian, for example). Let's
replace it with a (hopefully) more portable sed expression to drop the
inadvertently introduced gawk dependency.
Fixes: #19957
This fixes an issue introduced by the commit 954c77c251.
For some reasons, setting default ACL on $TESTDIR makes TEST-29-PORTABLE
fail. Let's drop the default ACL, and set ACL on saved results instead.
Fixes#19519.
"systemd-testsuite" gets in the way when grepping for "testsuite-*.sh".
Also, the name doesn't matter for anything, so let's just use something
very short to save space.
When editing this function in 7bf20e48bd, I couldn't
decide whether to initialize ret at the top and only reset it on success, or
whether to assign a value in each branch. In the end I did neither ;( So if the
test finished without creating any of the result files, we would echo a
message, but return "success".
But there was bigger confusion with /failed: some tests create it empty, some
don't. I think we may want to do away pre-creation of /failed completely, and
assume the test failed unless /testok is found. But I'm leaving that for later
rework. For now let's just make sure we report return success only if /testok
or /skipped is found.
Basically the same scenario as in
a33e2692e1, where `awk` exits as soon
as it finds a match, thus sending SIGPIPE to `ldd` if it's not fast
enough. That, in combination with `set -o pipefail` causes random &
unexpected fails, like:
```
No journal files were found.
-rw-r----- 1 root root 16777216 Apr 30 10:31
/var/tmp/TEST-01-BASIC_sanitizers-nspawn/system.journal
TEST-01-BASIC RUN: Basic systemd setup [OK]
systemd is not linked against the ASan DSO
gcc does this by default, for clang compile with -shared-libasan
make: *** [Makefile:2: clean-again] Error 1
make: Leaving directory '/build/test/TEST-01-BASIC'
```
Specifying the test number manually is tedious and prone to errors (as
recently proven). Since we have all the necessary data to work out the
test number, let's do it automagically.
We have to invoke the tests as superuser, and not being able to read
the journal as the invoking user is annoying. I don't think there are
any security considerations here, since the invoking user can already
put arbitrary code in the Makefile and test scripts which get executed
with root privileges.
The logic to query test state was rather complex. I don't quite grok the point
of ret=$((ret+1))… But afaics, the precise result was always ignored by the
caller anyway.
This code was partially broken, since the firmware directory was
undefined. Also, some of the parts were a dead code, since they relied
on code from the original dracut test suite.
`command -v <bin> | grep ...` can under certain conditions cause the
`command` to exit with SIGPIPE, which in combination with `set -o
pipefail` means that the tests sometimes randomly die during setup.
Let's avoid using pipes in such cases.
This breaks some existing loops which previously ignored if the piped
program exited with EC >0. Rewrite them to mitigate this (and also make
them more robust in some cases).
When trying to calculate the next firing of 'Sun *-*-* 01:00:00', we'd fall
into an infinite loop, because mktime() moves us "backwards":
Before this patch:
tm_within_bounds: good=0 2021-03-29 01:00:00 → 2021-03-29 00:00:00
tm_within_bounds: good=0 2021-03-29 01:00:00 → 2021-03-29 00:00:00
tm_within_bounds: good=0 2021-03-29 01:00:00 → 2021-03-29 00:00:00
...
We rely on mktime() normalizing the time. The man page does not say that it'll
move the time forward, but our algorithm relies on this. So let's catch this
case explicitly.
With this patch:
$ TZ=Europe/Dublin faketime 2021-03-21 build/systemd-analyze calendar --iterations=5 'Sun *-*-* 01:00:00'
Normalized form: Sun *-*-* 01:00:00
Next elapse: Sun 2021-03-21 01:00:00 GMT
(in UTC): Sun 2021-03-21 01:00:00 UTC
From now: 59min left
Iter. #2: Sun 2021-04-04 01:00:00 IST
(in UTC): Sun 2021-04-04 00:00:00 UTC
From now: 1 weeks 6 days left <---- note the 2 week jump here
Iter. #3: Sun 2021-04-11 01:00:00 IST
(in UTC): Sun 2021-04-11 00:00:00 UTC
From now: 2 weeks 6 days left
Iter. #4: Sun 2021-04-18 01:00:00 IST
(in UTC): Sun 2021-04-18 00:00:00 UTC
From now: 3 weeks 6 days left
Iter. #5: Sun 2021-04-25 01:00:00 IST
(in UTC): Sun 2021-04-25 00:00:00 UTC
From now: 1 months 4 days left
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1941335.
otherwise udev complains about the file being world-writable:
systemd-udevd[228]: Configuration file /etc/udev/rules.d/00-set-LD_PRELOAD.rules is marked world-writable. Please remove world writability permission bits. Proceeding anyway.
Fixes: systemd/systemd-centos-ci#354
This test would normally get stuck when trying to mount the verity image
due to:
systemd-udevd[299]: dm-0: '/usr/sbin/dmsetup udevflags 6293812'(err) '==371==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.'
systemd-udevd[299]: dm-0: Process '/usr/sbin/dmsetup udevflags 6293812' failed with exit code 1
...
systemd-udevd[299]: dm-0: '/usr/sbin/dmsetup udevcomplete 6293812'(err) '==372==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.'
systemd-udevd[299]: dm-0: Process '/usr/sbin/dmsetup udevcomplete 6293812' failed with exit code 1.
systemd-udevd[299]: dm-0: Command "/usr/sbin/dmsetup udevcomplete 6293812" returned 1 (error), ignoring.
so let's add a simple udev rule which sets $LD_PRELOAD for the block
subsystem.
Also, install the ASan library along with necessary dependencies into
the verity minimal image, to get rid of the annoying (yet harmless)
errors about missing library from $LD_LIBRARY.
When running integration tests under sanitizers D-Bus fails to
shutdown cleanly, causing unnecessary noise in the logs:
```
dbus-daemon[272]: ==272==LeakSanitizer has encountered a fatal error.
dbus-daemon[272]: ==272==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
dbus-daemon[272]: ==272==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
```
Since we're not "sanitizing" D-Bus anyway let's disable LSan's at_exit
check for the dbus.service to get rid of this error.
When a subshell is used ('make' or 'make all') the LOOPDEV environment
variable, which is used to store the opened loop device, is lost.
So the cleanup on trap/exit doesn't do anything, and the loop
device used to mount the test image is left around.
Avoid using a subshell to fix the issue.
The source package in the apt cache might be older than the
packaging from salsa.debian.org/systemd-team/systemd so it might not
list all the current binary packages.
This is currently the case for systemd-timesyncd, so TEST-30 fails.
Simply grep the control file rather than using apt-cache when iterating
over the packages contents.
Binaries on the latest Arch Linux use `call` instructions instead of
`callq`, which breaks the ASan detection and eventually the image
building process (due to insufficient space).
There may be situations where a cgroup should be protected from killing
or deprioritized as a candidate. In FB oomd xattrs are used to bias oomd
away from supervisor cgroups and towards worker cgroups in container
tasks. On desktops this can be used to protect important units with
unpredictable resource consumption.
The patch allows systemd-oomd to understand 2 xattrs:
"user.oomd_avoid" and "user.oomd_omit". If systemd-oomd sees these
xattrs set to 1 on a candidate cgroup (i.e. while attempting to kill something)
AND the cgroup is owned by root, it will either deprioritize the cgroup as
a candidate (avoid) or remove it completely as a candidate (omit).
Usage is restricted to root owned cgroups to prevent situations where an
unprivileged user can set their own cgroups lower in the kill priority than
another user's (and prevent them from omitting their units from
systemd-oomd killing).
Add NO_BUILD var to allow testing with no local build, by installing
local systemd files into the image.
This only works for debian-like distros currently, that use the
tools 'apt' and 'dpkg' for package management.
The $BUILD_DIR is only used in test-functions, and doesn't need to
be specified in any other scripts. Additionally, to be able to allow
the integration test suite to be run against locally installed binaries,
instead of built binaries, moving BUILD_DIR logic completely into
test-functions allows later patches to be simpler.
Building custom images for each test takes a lot of time.
Build the default one, and if the test needs incompatible changes
just copy it and extend it instead.
This reverts commit 73484ecff9.
3976f372ae moved libudev.so to be built in the
main directory, so this addition to $LD_LIBRARY_PATH is now obsolete.
After that commit, we build the following shared libraries:
build/libnss_myhostname.so.2
build/libnss_mymachines.so.2
build/libnss_resolve.so.2
build/libnss_systemd.so.2
build/libsystemd.so.0.30.0
build/libudev.so.1.7.0
build/pam_systemd.so
build/pam_systemd_home.so
build/src/boot/efi/stub.so
build/src/boot/efi/systemd_boot.so
build/src/shared/libsystemd-shared-247.so
EFI stubs don't matter, and libsystemd-shared-nnn.so is loaded through rpath,
and is doesn't need to and shouldn't be in $LD_LIBRARY_PATH. In effect, we only
ever need to add the main build directory to the search path.
Not all optional libraries might be available on developers machines,
so log and skip.
Also some pkg-config files are broken (eg: tss2 on Debian Stable) so
skip if the required variables are missing, and improve logs.
By default the test suite prefers qemu, and uses nspawn only if
a test specifically says it doesn't support qemu.
Add a variable to allow flipping the default, and run as many
tests under nspawn as possible.
Allows to split the test run in two parts. Most tests can run under
nspawn which is much faster, and they can be ran in one chunk with
TEST_NO_QEMU=1. The qemu-only tests, which are just a handful, can
be ran in another chunk with TEST_QEMU_ONLY=1.
Allows autopkgtest to be split in two parts.
The image build function greps for ExecStart lines in unit files, but some
of them (eg: systemd-firstboot) do not use a full path.
It then falls back to 'type -P' but that only works if you have the binary
installed. For optional binaries like systemd-firstboot, the installation
can then fail.
Manually check if the binary already exists in /[usr/][s]bin.
Usually on Debian ROOTLIBDIR is /lib/<arch triplet>, which is not the right place.
Use pkg-config since we define it, and then fallback to /usr/lib/systemd/user which is
the canonical location.
On both Debian&friends and Fedora dbus/dbus-broker install the user socket/service
under /usr/lib/systemd/user, not /lib/systemd/systemd/user.
I suspect the original version of the regex was written on a system,
which prints both the QEMU version and the QEMU package version in the
--version output, like Fedora:
$ /bin/qemu-system-x86_64 --version
QEMU emulator version 4.2.1 (qemu-4.2.1-1.fc32)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
However, Arch Linux prints only the QEMU version:
$ /bin/qemu-system-x86_64 --version
QEMU emulator version 5.2.0
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
This causes the awk regex to not match the version string, since there's
no whitespace after it, causing the version check to fail (as well as the
TEST-36-NUMAPOLICY) as well.
Follow-up for 43b49470d1.
Upgrading to qemu 5.2 breaks TEST-36-NUMAPOLICY like:
qemu-system-x86_64: total memory for NUMA nodes (0x0) should
equal RAM size (0x20000000)
Use the new (as in >=2014) form of memdev in test 36:
-object memory-backend-ram,id=mem0,size=512M -numa node,memdev=mem0,nodeid=0
Since some target systems are as old as qemu 1.5.3 (CentOS7) but the new
kind to specify was added in qemu 2.1 this needs to add version parsing and
add the argument only when qemu is >=5.2.
Fixes#17986.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
The systemd-user file has been moved from /etc/pam.d into /usr/lib/pam.d,
so test-functions needs to copy it from /usr/lib/pam.d instead.
This will copy it from either location.
When invoking "ldd" to find dependency libraries we already set
$LD_LIBRARY_PATH to point to our own build tree, so that our libraries
are checked, not the host libraries. This is not sufficient howeever, as
libudev is built in a subdir. Add that, too.
We have four legal cases:
1. /usr/lib/os-release exists and /etc/os-release is a symlink to it
2. both exist but /etc/os-release is not a symlink to /usr/lib/os-release
3. only /usr/lib/os-release exists
4. only /etc/os-release exists
The generic setup code in test-functions and create-busybox-image didn't handle
case 3.
The test-specific code in TEST-50 didn't handle 2 (because the general setup
code would only install /etc/os-release in the image and
grep -f /usr/lib/os-release would not work) and 4 (same reason) and would fail
in case 3 in generic setup.
Since the hwdb update from a79be2f807
the systemd-hwdb-update service started timing out under ASan when
compiled with gcc, as we started tripping over the 3 minutes timeout.
This affects only gcc runs, since the current gcc on Arch still suffers
from the detect_stack_use_after_return performance penalty[0]. Until
the fixed gcc is present in the respective repositories, let's bump
the timeout to 4 minutes, as we might not be able to upgrade right
away, due to systemd/systemd#16199.
Before the hwdb update:
[ 7958.292540] systemd[63]: systemd-hwdb-update.service: Executing: /usr/bin/time systemd-hwdb update
[ 7958.304005] systemd[1]: systemd-journald.service: Got notification message from PID 44 (FDSTORE=1)
[ 7958.314434] systemd[1]: systemd-journald.service: Added fd 3 (n/a) to fd store.
[ 8008.520082] systemd[1]: systemd-journald.service: Got notification message from PID 44 (WATCHDOG=1)
[ 8068.520151] systemd[1]: systemd-journald.service: Got notification message from PID 44 (WATCHDOG=1)
[ 8125.682843] time[63]: 84.47user 82.92system 2:47.50elapsed 99%CPU (0avgtext+0avgdata 811512maxresident)k
[ 8125.682843] time[63]: 0inputs+19680outputs (0major+25000853minor)pagefaults 0swaps
After the hwdb update:
[ 6215.491958] systemd[63]: systemd-hwdb-update.service: Executing: /usr/bin/time systemd-hwdb update
[ 6215.503380] systemd[1]: systemd-journald.service: Got notification message from PID 44 (FDSTORE=1)
[ 6215.514172] systemd[1]: systemd-journald.service: Added fd 3 (n/a) to fd store.
[ 6329.392918] systemd[1]: systemd-journald.service: Got notification message from PID 44 (WATCHDOG=1)
[ 6394.920205] time[63]: 89.48user 89.98system 2:59.55elapsed 99%CPU (0avgtext+0avgdata 812764maxresident)k
[ 6394.920205] time[63]: 0inputs+20568outputs (0major+27318354minor)pagefaults 0swaps
[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94910
Prompted by systemd/systemd#16111.
* check if /var is a mountpoint - if not, something went wrong. In case
of systemd/systemd#16111 the /failed file was created, because
systemd-cryptsetup failed, but it ended up being empty, making the result
check incorrectly pass
* forward journal messages to console - if we fail to mount /var,
journald won't flush logs to the persistent storage and we end up
empty handed and with no clue what went wrong
For example, without systemd/systemd#16111 and with this patch:
...
[FAILED] Failed to start systemd-cryptsetup@varcrypt.service.
See 'systemctl status systemd-cryptsetup@varcrypt.service' for details.
[DEPEND] Dependency failed for cryptsetup.target.
...
[ 3.882451] systemd-cryptsetup[581]: Key file /etc/varkey is world-readable. This is not a good idea!
[ 3.883946] systemd-cryptsetup[581]: WARNING: Locking directory /run/cryptsetup is missing!
[ 3.884846] systemd-cryptsetup[581]: Failed to load Bitlocker superblock on device /dev/disk/by-uuid/180ba5ef-873b-4018-9968-47c23431f71a: Invalid argument
...
[ 4.099451] sh[606]: + mountpoint /var
[ 4.100025] sh[603]: + systemctl poweroff --no-block
[ 4.101636] systemd[1]: Finished systemd-user-sessions.service.
[ 4.102598] sh[608]: /var is not a mountpoint
[FAILED] Failed to start testsuite-02.service.
Let's create new images public by default and then symlink/copy them
into the respective private directories afterwards, not the other way
around. This should fix a nasty race condition in parallel runs where
one tests attempts to copy the backing public image at the same moment
another test is already modifying it.
Support running tests in parallel by switching to copying of the
base image instead of symlinking it..
This still requires some setup steps, like running `make setup` on tests
which have unique $IMAGE_NAME beforehand (and sequentially), otherwise
they'll all try to create the same base image when started in parallel,
leading to nasty issues. However, as running the integration tests in
parallel is such an unusual use case it should be good enough, for now.
As Debian/Ubuntu use /lib/systemd instead of /usr/lib/systemd,
add systemd-journal-remote to the list of programs that test-functions
detects the correct path to, and replace its direct usage with
$SYSTEMD_JOURNAL_REMOTE
Also use $JOURNALCTL instead of journalctl.
Also minor correction in install_plymouth() to look in /lib/... as
well as /usr/lib/... and /etc/...
Remove the artifact files indicating test result (testok, failed, and
skipped) just before running the test so we always get the latest and
most relevant result instead of incorrectly consuming previous results.
Discovered in https://github.com/systemd/systemd/pull/15378#issuecomment-616801873
This doesn't really matter, since in non-/usr-merged systems plymouth
needs to be in /bin and on merged ones it doesn't matter, but it is
still prettier to insert the right path, and avoid /bin on merged
systems, since it's just a compat symlink.
Replaces: #15351
Using s-j-remote fixes the following issue: when coalescing files from multiple
inputs, simply copying all files with into the the same directory might
potentially mess things up, because a newer system.journal might overwrite an
older journal. This happens because we run multiple tests from the same image,
and need to clean out the directory after each run.
By using systemd-journal-remote, we nicely coalesce all files. This has the
advantage that if there aren't too many logs, we end up with just one journal
file.
ARTIFACT_DIRECTORY is for ubuntuautopackagetests, where the journal files are
copied to a separate directory to preserve after tests have been run. This
functionality can now be recreated by setting
ARTIFACT_DIRECTORY=$AUTOPKGTEST_ARTIFACTS.
It is more trouble than it is worth. The setup is of a loopback device
is very quick, so it's better to always create it when needed and
immediately drop afterwards.
This causes the unprivileged-nspawn-root directory to be removed
after running one test. The advantage is that we reduce the maximum
disk-space use quite a bit (47*400 MB → about 18GB).
has-overflow was a temporary hack that was removed in
844da987ef (Oct. 2016). All the makefiles
can be the same, and all the targets can be handled identically.
Before, we'd copy the test tree into nspawn-root, and run the tests from there.
This is OK, and doesn't actually take much extra time. But it uses quite a lot
of extra disk space. So let's make things a bit more efficient by running
directly from the image file.
We still run the unprivileged nspawn tests from a copy. Once the kernel
implements fs shift, we can do away with that too.
Before, we'd create a separate image for each test, in
/var/tmp/systemd-test.XXXXX/rootdisk.img. Most of the images
where very similar, except that each one had some unit files installed
specifically for the test. The installation of those custom unit files
was removed in previous commits (all the unit files are always installed).
The new approach is to only create as few distinct images as possible.
We have:
default.img: the "normal" image suitable for almost all the tests
basic.img: the same as default image but doesn't mask any services
cryptsetup.img: p2 is used for encrypted /var
badid.img: /etc/machine-id is overwritten with stuff
selinux.img: with selinux added for fun and fun
and a few others:
ls -l build/test/*img
lrwxrwxrwx 1 root root 38 Mar 21 21:23 build/test/badid.img -> /var/tmp/systemd-test.PJFFeo/badid.img
lrwxrwxrwx 1 root root 38 Mar 21 21:17 build/test/basic.img -> /var/tmp/systemd-test.na0xOI/basic.img
lrwxrwxrwx 1 root root 43 Mar 21 21:18 build/test/cryptsetup.img -> /var/tmp/systemd-test.Tzjv06/cryptsetup.img
lrwxrwxrwx 1 root root 40 Mar 21 21:19 build/test/default.img -> /var/tmp/systemd-test.EscAsS/default.img
lrwxrwxrwx 1 root root 39 Mar 21 21:22 build/test/nspawn.img -> /var/tmp/systemd-test.HSebKo/nspawn.img
lrwxrwxrwx 1 root root 40 Mar 21 21:20 build/test/selinux.img -> /var/tmp/systemd-test.daBjbx/selinux.img
lrwxrwxrwx 1 root root 39 Mar 21 21:21 build/test/test08.img -> /var/tmp/systemd-test.OgnN8Z/test08.img
I considered trying to use the same image everywhere. It would probably be
possible, but it would be very brittle. By using separate images where it is
necessary we keep various orthogonal modifications independent.
The way that images are cached is complicated by the fact that we still
want to keep them in /var/tmp. Thus, an image is created on first use and
linked to from build/test/ so it can be found by other tests.
Tests cannot be run in parallel. I think that is an acceptable limitation.
Creation of the images was probably taking more resources then the actual
tests, so we should be better off anyway.
We had an fstab for the sole purpose of remounting "/" rw. Mounting root ro
is a pointless excercise in obsolete approaches. More importantly, the nspawn
image is now the same as the qemu one.
The two timezone files are now installed in the global setup. I am not too
happy about this, but it still seems better than to create a completely
separate image just for this.
I picked the list of zone files to install by grepping through the code. This
is is a bit brittle, but installing all of them takes a while, and more
importantly, writes a lot of lines to the log.
As in 2a5fcfae02
and in 3e67e5c992
using /usr/bin/env allows bash to be looked up in PATH
rather than being hard-coded.
As with the previous changes the same arguments apply
- distributions have scripts to rewrite shebangs on installation and
they know what locations to rely on.
- For tests/compilation we should rather rely on the user to have setup
there PATH correctly.
In particular this makes testing from git easier on NixOS where do not provide
/bin/bash to improve compose-ability.
Having /etc/securetty in test containers prevents root from logging into
them:
```
Jan 31 10:15:11 systemd-testsuite login[69]: pam_securetty(login:auth): access denied: tty 'pts/0' is not secure !
Jan 31 10:15:11 systemd-testsuite login[69]: FAILED LOGIN 1 FROM pts/0 FOR root, Authentication failure
```
The test exercises that PrivateTmp=yes and ProtectHome={read-only,tmpfs}
directives work as expected when PrivateUsers=yes in a user manager.
Some code is also added to test-functions to help set up test cases that
exercise the user manager.
Meson appears to set the rpath only for some binaries it builds, but not
all. (The rules are not clear to me, but that's besides the point of
this commit).
Let's make sure if our test script operates on a binary that has no
rpath set we fall back preferably to the BUILD_DIR rather than directly
to the host.
This matters if a test uses a libsystemd symbol introduced in a version
newer than the one on the host. In that case "ldd" will not work on the
test binary if rpath is not set. With this fix that behaviour is
corrected, and "ldd" works correctly even in this case.
(Or in other words: before this fix on binaries lacking rpath we'd base
dependency info on the libraries of the host, not the buidl tree, if
they exist in both.)
We currently use the host's kernel and initramfs in our QEMU tests.
If the host is running on an encrypted LUKS partition, then the initramfs
will have a crypttab setup looking for the particular root disk it needs to
encrypt before booting into the system.
However, this disk obviously doesn't exist in our QEMU VM, so it turns out
our tests end up waiting for this device to become available, which will
never actually happen, and boot hangs for 90s until that service times out.
[*** ] A start job is running for /dev/disk/by-uuid/01234567-abcd-1234-abcd-0123456789ab (20s / 1min 30s)
In order to prevent this issue, let's pass "rd.luks=0" to disable LUKS in
the initramfs only as part of our default kernel command-line in our QEMU
tests.
This is enough to disable this behavior and prevent the timeout, while at
the same time doesn't conflict with our tests that actually check for LUKS
behavior in the systemd running under test (such as TEST-02-CRYPTSETUP).
Tested: `sudo make -C TEST-02-CRYPTSETUP/ clean setup run`
Many tests were also masking systemd-machined.service. But machined
should only start when activated, so having it not masked shouldn't be
noticable. TEST-25-IMPORT needs it.
We should never have used an unprefixed environment variable name.
All other systemd-nspawn variables have the "SYSTEMD_NSPAWN_" prefix,
and all other systemd variables have the "SYSTEMD_" prefix.
The new variable name takes precedence, but we fall back to checking the
old one. If only the old one is found, a warning is emitted.
In addition, SYSTEMD_NSPAWN_UNIFIED_HIERARCHY="" is accepted as an override
to avoid looking for the old variable name.
We have a variable with the same name ($UNIFIED_CGROUP_HIERARCHY) in tests,
which governs both systemd-nspawn and qemu behaviour. It is not renamed.
This avoids unnecessary noise in the stderr logs which dd always produces,
such as:
0+0 records in
0+0 records out
0 bytes copied, 0.000155284 s, 0.0 kB/s
Using truncate should not result in any functional change; the image will
still be created as a sparse file of the size specified.
In Ubuntu CI, we test binaries from the installed system, not from
$BUILD_DIR, so use the appropriate binary. Most of the calls to the
binaries are part of checking/processing asan-built binaries, and so
did not apply to Ubuntu CI, except for generating noise in the stderr
log like:
objdump: '/tmp/autopkgtest.83yGoI/build.fHB/src/test/TEST-01-BASIC/systemd-journald': No such file
However this also applies to the call to systemd-nspawn, which the debian
upstream test wrapper was sed-adjusting to use the installed binary
instead of the binary in $BUILD_DIR. This commit allows removing that
sed processing of the test-functions file during Ubuntu CI test.
This dir is created by create_empty_image_rootdir, as well as indirectly
by some other functions, but it should be created by import_initdir so
the newly-exported $initdir exists and can be used immediately without
relying on other functions to create it.
Only umount it during cleanup if the $TESTDIR/root dir is a mountpoint.
This avoids adding noise to the stderr log such as:
mountpoint: /var/tmp/systemd-test.waLOFT/root: No such file or directory
To make debugging much easier, especially for crashes in tests under
QEMU, let's store the entire coredump bundle in the systemd journal,
which is usually kept around by various CIs. Right now, we usually end
up with a journal, but without the coredump itself, which is pretty
useless.
without using -type f, the logs print an error such as:
E: E: modprobe: FATAL: Module asymmetric_keys not found in directory /lib/modules/4.15.0-54-generic
while this doesn't appear to cause problems, it can be extremely
distracting when trying to debug real failures.
Almost all tests were manually mounting/unmounting $TESTDIR/root
from the loopback image; this moves all that into test-functions
so the test setup functions are simplier.
Also add test_setup_cleanup() function, to cleanup what is mounted
by create_empty_image_rootdir()
In certain cases we might attempt to install a binary which is already
present in the test image, yet it's missing from the host system.
In such cases, let's check if the binary indeed exists in the image
before doing any other chcecks. If it does, immediately return with
success.
This was discovered during installation of
/usr/lib/systemd/systemd-bless-boot, which was not present in Ubuntu CI
(as the installed systemd was from the Ubuntu repositories), and the
binary itself was already in the image thanks to `ninja install`.
However, during extraction of binaries from the systemd service files,
another attempt to install this binary was made, which failed due to
`find_binary` being unable to find it.
Running integration tests with ASan is somewhat tricky to begin with, as
we need to pre-load the ASan runtime DSO for certain services (like
dbus), otherwise they won't start or behave as expected. In case of gcc
this is pretty easy, as we need the runtime DSO during compilation, so
it's already present on the host system. For clang things get more
complicated, as ASan is compiled in statically by default, thus to
enable the necessary dynamic-ish behavior one needs to compile with
-shared-libasan and then correctly set LD_PRELOAD_PATH, as the runtime
libraries are not in a standard library path.
Certain distributions (e.g. Arch Linux) require booting with initrd, as
they lack support for commonly used filesystems in the kernel (i.e. the
support is compiled in as modules)
The `mount` utility has an unexpected behavior when run with libasan,
causing false-positives during the integration testing.
For example, on Arch Linux with LD_PRELOAD pointing to libasan:
```
bash-5.0# mount -o remount,rw -v /
mount: /dev/sda1 mounted on /.
bash-5.0# echo $?
1
```
However:
```
bash-5.0# LD_PRELOAD= mount -o remount,rw -v /
mount: /dev/sda1 mounted on /.
bash-5.0# echo $?
0
```
Further investigation with strace shows a LeakSanitizer error:
```
bash-5.0# strace -s 512 mount -o remount,rw -v /
...
write(2, "==355==LeakSanitizer has encountered a fatal error.\n", 52) = -1 EBADF (Bad file descriptor)
write(2, "ReportFile::Write() can't output requested buffer!\n", 51) = -1 EBADF (Bad file descriptor)
exit_group(1) = ?
+++ exited with 1 +++
```
Let's workaround this by clearing the LD_PRELOAD variable for
systemd-remount-fs.service