1
0
mirror of https://github.com/systemd/systemd.git synced 2025-01-18 10:04:04 +03:00
Yu Watanabe 182ffb5819 TEST-71-HOSTNAME: do not start user session
The user session may trigger hostnamed, and the job of stopping
hostnamed may be cancelled, and the test may fail:
```
[ 4633.613578] TEST-71-HOSTNAME.sh[175]: + stop_hostnamed
[ 4633.613578] TEST-71-HOSTNAME.sh[175]: + systemctl stop systemd-hostnamed.service
[ 4633.664670] systemd[1]: Stopping systemd-hostnamed.service - Hostname Service...
[ 4636.022277] systemd-logind[121]: New session c2 of user root.
[ 4636.032532] systemd[1]: Created slice user-0.slice - User Slice of UID 0.
[ 4636.042675] systemd[1]: Starting user-runtime-dir@0.service - User Runtime Directory /run/user/0...
[ 4636.176140] systemd[1]: Finished user-runtime-dir@0.service - User Runtime Directory /run/user/0.
[ 4636.202951] systemd[1]: Starting user@0.service - User Manager for UID 0...
[ 4636.292204] systemd-logind[121]: New session c3 of user root.
[ 4636.300065] (systemd)[268]: pam_unix(systemd-user:session): session opened for user root(uid=0) by root(uid=0)
[ 4636.757667] systemd[268]: Queued start job for default target default.target.
[ 4636.774419] systemd[268]: Created slice app.slice - User Application Slice.
[ 4636.774579] systemd[268]: Started systemd-tmpfiles-clean.timer - Daily Cleanup of User's Temporary Directories.
[ 4636.774747] systemd[268]: Reached target paths.target - Paths.
[ 4636.776418] systemd[268]: Reached target sysinit.target - System Initialization.
[ 4636.776604] systemd[268]: Reached target timers.target - Timers.
[ 4636.784997] systemd[268]: Starting dbus.socket - D-Bus User Message Bus Socket...
[ 4636.799472] systemd[268]: Starting systemd-tmpfiles-setup.service - Create User Files and Directories...
[ 4637.027125] systemd[268]: Finished systemd-tmpfiles-setup.service - Create User Files and Directories.
[ 4637.031721] systemd[268]: Listening on dbus.socket - D-Bus User Message Bus Socket.
[ 4637.036189] systemd[268]: Reached target sockets.target - Sockets.
[ 4637.036373] systemd[268]: Reached target basic.target - Basic System.
[ 4637.036558] systemd[268]: Reached target default.target - Main User Target.
[ 4637.036646] systemd[268]: Startup finished in 702ms.
[ 4637.049075] systemd[1]: Started user@0.service - User Manager for UID 0.
[ 4637.075263] systemd[1]: Started session-c2.scope - Session c2 of User root.
[ 4637.084917] login[136]: pam_unix(login:session): session opened for user root(uid=0) by root(uid=0)
[ 4637.117348] login[136]: ROOT LOGIN ON pts/0
[ 4637.238572] systemctl[261]: Job for systemd-hostnamed.service canceled.
[ 4637.290369] systemd[1]: TEST-71-HOSTNAME.service: Main process exited, code=exited, status=1/FAILURE
```

Fixes #35643.
2024-12-20 09:36:51 +01:00
..
2023-11-29 11:04:59 +00:00
2022-10-13 17:34:11 +09:00

Integration tests

Running the integration tests with meson + mkosi

To run the integration tests with meson + mkosi, make sure you're running the latest version of mkosi. See docs/HACKING.md for more specific details. Make sure mkosi is available in $PATH when reconfiguring meson to make sure it is picked up properly.

We also need to make sure the required meson options are enabled:

$ meson setup --reconfigure build -Dremote=enabled

To make sure mkosi doesn't try to build systemd from source during the image build process, you can add the following to mkosi.local.conf:

[Build]
Environment=NO_BUILD=1

You might also want to use the PackageDirectories= or Repositories= option to provide mkosi with a directory or repository containing the systemd packages that should be installed instead. If the repository containing the systemd packages is not a builtin repository known by mkosi, you can use the SandboxTrees= option to write an extra repository definition to /etc which is used when building the image instead.

Next, we can build the integration test image with meson:

$ meson compile -C build mkosi

By default, the mkosi meson target which builds the integration test image depends on other meson targets to build various systemd tools that are used to build the image to make sure they are up-to-date. If you instead want the already installed systemd tools on the host to be used, you can run mkosi manually to build the image. To build the integration test image without meson, run the following:

$ mkosi -f

Note that by default we assume that build/ is used as the meson build directory that will be used to run the integration tests. If you want to use another directory as the meson build directory, you will have to configure the mkosi build directory (BuildDirectory=), cache directory (CacheDirectory=) and output directory (OutputDirectory=) to point to the other directory using mkosi.local.conf.

After the image has been built, the integration tests can be run with:

$ env SYSTEMD_INTEGRATION_TESTS=1 meson test -C build --no-rebuild --suite integration-tests --num-processes "$(($(nproc) / 4))"

As usual, specific tests can be run in meson by appending the name of the test which is usually the name of the directory e.g.

$ env SYSTEMD_INTEGRATION_TESTS=1 meson test -C build --no-rebuild -v TEST-01-BASIC

See meson introspect build --tests for a list of tests.

To interactively debug a failing integration test, the --interactive option (-i) for meson test can be used. Note that this requires meson v1.5.0 or newer:

$ env SYSTEMD_INTEGRATION_TESTS=1 meson test -C build --no-rebuild -i TEST-01-BASIC

Due to limitations in meson, the integration tests do not yet depend on the mkosi target, which means the mkosi target has to be manually rebuilt before running the integration tests. To rebuild the image and rerun a test, the following command can be used:

$ meson compile -C build mkosi && env SYSTEMD_INTEGRATION_TESTS=1 meson test -C build --no-rebuild -v TEST-01-BASIC

The integration tests use the same mkosi configuration that's used when you run mkosi in the systemd reposistory, so any local modifications to the mkosi configuration (e.g. in mkosi.local.conf) are automatically picked up and used by the integration tests as well.

Iterating on an integration test

To iterate on an integration test, let's first get a shell in the integration test environment by running the following:

$ meson compile -C build mkosi && env SYSTEMD_INTEGRATION_TESTS=1 TEST_SHELL=1 meson test -C build --no-rebuild -i TEST-01-BASIC

This will get us a shell in the integration test environment after booting the machine without running the integration test itself. After booting, we can verify the integration test passes by running it manually, for example with systemctl start TEST-01-BASIC.

Now you can extend the test in whatever way you like to add more coverage of existing features or to add coverage for a new feature. Once you've finished writing the logic and want to rerun the test, run the the following on the host:

$ mkosi -t none

This will rebuild the distribution packages without rebuilding the entire integration test image. Next, run the following in the integration test machine:

$ systemctl soft-reboot
$ systemctl start TEST-01-BASIC

A soft-reboot is required to make sure all the leftover state from the previous run of the test is cleaned up by soft-rebooting into the btrfs snapshot we made before running the test. After the soft-reboot, re-running the test will first install the new packages we just built, make a new snapshot and finally run the test again. You can keep running the loop of mkosi -t none, systemctl soft-reboot and systemctl start ... until the changes to the integration test are working.

If you're debugging a failing integration test (running meson test --interactive without TEST_SHELL), there's no need to run systemctl start ..., running systemctl soft-reboot on its own is sufficient to rerun the test.

Configuration variables

TEST_NO_QEMU=1: Don't run tests under qemu.

TEST_PREFER_QEMU=1: Run all tests under qemu.

TEST_NO_KVM=1: Disable qemu KVM auto-detection (may be necessary when you're trying to run the vanilla qemu and have both qemu and qemu-kvm installed)

TEST_SHELL=1: Configure the machine to be more user-friendly for interactive debugging (e.g. by setting a usable default terminal, suppressing the shutdown after the test, etc.).

TEST_MATCH_SUBTEST=subtest: If the test makes use of run_subtests use this variable to provide a POSIX extended regex to run only subtests matching the expression.

TEST_MATCH_TESTCASE=testcase: Same as $TEST_MATCH_SUBTEST but for subtests that make use of run_testcases.

TEST_SKIP: takes a space separated list of tests to skip.

TEST_SKIP_SUBTEST=subtest: takes a space separated list of subtests to skip.

TEST_SKIP_TESTCASE=testcase: takes a space separated list of testcases to skip.

Ubuntu CI

New PRs submitted to the project are run through regression tests, and one set of those is the 'autopkgtest' runs for several different architectures, called 'Ubuntu CI'. Part of that testing is to run all these tests. Sometimes these tests are temporarily deny-listed from running in the 'autopkgtest' tests while debugging a flaky test; that is done by creating a file in the test directory named 'deny-list-ubuntu-ci', for example to prevent the TEST-01-BASIC test from running in the 'autopkgtest' runs, create the file 'TEST-01-BASIC/deny-list-ubuntu-ci'.

The tests may be disabled only for specific archs, by creating a deny-list file with the arch name at the end, e.g. 'TEST-01-BASIC/deny-list-ubuntu-ci-arm64' to disable the TEST-01-BASIC test only on test runs for the 'arm64' architecture.

Note the arch naming is not from 'uname -m', it is Debian arch names: https://wiki.debian.org/ArchitectureSpecificsMemo

For PRs that fix a currently deny-listed test, the PR should include removal of the deny-list file.

In case a test fails, the full set of artifacts, including the journal of the failed run, can be downloaded from the artifacts.tar.gz archive which will be reachable in the same URL parent directory as the logs.gz that gets linked on the Github CI status.

The log URL can be derived following a simple algorithm, however the test completion timestamp is needed and it's not easy to find without access to the log itself. For example, a noble s390x job started on 2024-03-23 at 02:09:11 will be stored at the following URL:

https://autopkgtest.ubuntu.com/results/autopkgtest-noble-upstream-systemd-ci-systemd-ci/noble/s390x/s/systemd-upstream/20240323_020911_e8e88@/log.gz

Fortunately a list of URLs listing file paths for recently completed test runs is available at:

https://autopkgtest.ubuntu.com/results/autopkgtest-noble-upstream-systemd-ci-systemd-ci/

paths listed at this URL can be appended to the URL to download them. Unfortunately there are too many results and the web server cannot list them all at once. Fortunately there is a workaround: copy the last line on the page, and append it to the URL, with a '?marker=' prefix, and the web server will show the next page of results. For example:

https://autopkgtest.ubuntu.com/results/autopkgtest-noble-upstream-systemd-ci-systemd-ci/?marker=noble/amd64/s/systemd-upstream/20240616_211635_5993a@/result.tar

The 5 characters at the end of the last directory are not random, but the first 5 characters of a SHA1 hash generated based on the set of parameters given to the build plus the completion timestamp, such as:

$ echo -n 'systemd-upstream {"build-git": "https://salsa.debian.org/systemd-team/systemd.git#debian/master", "env": ["UPSTREAM_REPO=https://github.com/systemd/systemd.git", "CFLAGS=-O0", "DEB_BUILD_PROFILES=pkg.systemd.upstream noudeb", "TEST_UPSTREAM=1", "CONFFLAGS_UPSTREAM=--werror -Dslow-tests=true", "UPSTREAM_PULL_REQUEST=31444", "GITHUB_STATUSES_URL=https://api.github.com/repos/systemd/systemd/statuses/c27f600a1c47f10b22964eaedfb5e9f0d4279cd9"], "ppas": ["upstream-systemd-ci/systemd-ci"], "submit-time": "2024-02-27 17:06:27", "uuid": "02cd262f-af22-4f82-ac91-53fa5a9e7811"}' | sha1sum | cut -c1-5

To add new dependencies or new binaries to the packages used during the tests, a merge request can be sent to: https://salsa.debian.org/systemd-team/systemd targeting the 'upstream-ci' branch.

The cloud-side infrastructure, that is hooked into the Github interface, is located at:

https://git.launchpad.net/autopkgtest-cloud/

A generic description of the testing infrastructure can be found at:

https://wiki.ubuntu.com/ProposedMigration/AutopkgtestInfrastructure

In case of infrastructure issues with this CI, things might go wrong in two places:

The CI job needs a PPA in order to be accepted, and the upstream-systemd-ci/systemd-ci PPA is used. Note that this is necessary even when there are no packages to backport, but by default a PPA won't have a repository for a release if there are no packages built for it. To work around this problem, when a new empty release is needed the mark-suite-dirty tool from the https://git.launchpad.net/ubuntu-archive-tools can be used to force the PPA to publish an empty repository, for example:

$ ./mark-suite-dirty -A ppa:upstream-systemd-ci/ubuntu/systemd-ci -s noble

will create an empty 'noble' repository that can be used for 'noble' CI jobs.

For infrastructure help, reaching out to 'qa-help' via the #ubuntu-quality channel on libera.chat is an effective way to receive support in general.

Given access to the shared secret, tests can be re-run using the generic retry-github-test tool:

https://git.launchpad.net/autopkgtest-cloud/tree/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/tools/retry-github-test

A wrapper script that makes it easier to use is also available:

https://piware.de/gitweb/?p=bin.git;a=blob;f=retry-gh-systemd-Test

Manually running a part of the Ubuntu CI test suite

In some situations one may want/need to run one of the tests run by Ubuntu CI locally for debugging purposes. For this, you need a machine (or a VM) with the same Ubuntu release as is used by Ubuntu CI (Jammy ATTOW).

First of all, clone the Debian systemd repository and sync it with the code of the PR (set by the $UPSTREAM_PULL_REQUEST env variable) you'd like to debug:

$ git clone https://salsa.debian.org/systemd-team/systemd.git
$ cd systemd
$ git checkout upstream-ci
$ TEST_UPSTREAM=1 UPSTREAM_PULL_REQUEST=12345 ./debian/extra/checkout-upstream

Now install necessary build & test dependencies:

# PPA with some newer Ubuntu packages required by upstream systemd
$ add-apt-repository -y --enable-source ppa:upstream-systemd-ci/systemd-ci
$ apt build-dep -y systemd
$ apt install -y autopkgtest debhelper genisoimage git qemu-system-x86 \
                 libcurl4-openssl-dev libfdisk-dev libtss2-dev libfido2-dev \
                 libssl-dev python3-pefile

Build systemd deb packages with debug info:

$ TEST_UPSTREAM=1 DEB_BUILD_OPTIONS="nocheck nostrip noopt" dpkg-buildpackage -us -uc
$ cd ..

Prepare a testbed image for autopkgtest (tweak the release as necessary):

$ autopkgtest-buildvm-ubuntu-cloud --ram-size 1024 -v -a amd64 -r jammy

And finally run the autopkgtest itself:

$ autopkgtest -o logs *.deb systemd/ \
              --env=TEST_UPSTREAM=1 \
              --timeout-factor=3 \
              --test-name=boot-and-services \
              --shell-fail \
              -- autopkgtest-virt-qemu --cpus 4 --ram-size 2048 autopkgtest-jammy-amd64.img

where --test-name= is the name of the test you want to run/debug. The --shell-fail option will pause the execution in case the test fails and shows you the information how to connect to the testbed for further debugging.

Manually running CodeQL analysis

This is mostly useful for debugging various CodeQL quirks.

Download the CodeQL Bundle from https://github.com/github/codeql-action/releases and unpack it somewhere. From now the 'tutorial' assumes you have the codeql binary from the unpacked archive in $PATH for brevity.

Switch to the systemd repository if not already:

$ cd <systemd-repo>

Create an initial CodeQL database:

$ CCACHE_DISABLE=1 codeql database create codeqldb --language=cpp -vvv

Disabling ccache is important, otherwise you might see CodeQL complaining:

No source code was seen and extracted to /home/mrc0mmand/repos/@ci-incubator/systemd/codeqldb. This can occur if the specified build commands failed to compile or process any code.

  • Confirm that there is some source code for the specified language in the project.
  • For codebases written in Go, JavaScript, TypeScript, and Python, do not specify an explicit --command.
  • For other languages, the --command must specify a "clean" build which compiles all the source code files without reusing existing build artefacts.

If you want to run all queries systemd uses in CodeQL, run:

$ codeql database analyze codeqldb/ --format csv --output results.csv .github/codeql-custom.qls .github/codeql-queries/*.ql -vvv

Note: this will take a while.

If you're interested in a specific check, the easiest way (without hunting down the specific CodeQL query file) is to create a custom query suite. For example:

$ cat >test.qls <<EOF
- queries: .
  from: codeql/cpp-queries
- include:
    id:
        - cpp/missing-return
EOF

And then execute it in the same way as above:

$ codeql database analyze codeqldb/ --format csv --output results.csv test.qls -vvv

More about query suites here: https://codeql.github.com/docs/codeql-cli/creating-codeql-query-suites/

The results are then located in the results.csv file as a comma separated values list (obviously), which is the most human-friendly output format the CodeQL utility provides (so far).

Running Coverity locally

Note: this requires a Coverity license, as the public tool tarball doesn't contain cov-analyze and friends, so the usefulness of this guide is somewhat limited.

Debugging certain pesky Coverity defects can be painful, especially since the OSS Coverity instance has a very strict limit on how many builds we can send it per day/week, so if you have an access to a non-OSS Coverity license, knowing how to debug defects locally might come in handy.

After installing the necessary tooling we need to populate the emit DB first:

$ rm -rf build cov
$ meson setup build -Dman=false
$ cov-build --dir=./cov ninja -C build

From there it depends if you're interested in a specific defect or all of them. For the latter run:

$ cov-analyze --dir=./cov --wait-for-license

If you want to debug a specific defect, telling that to cov-analyze speeds things up a bit:

$ cov-analyze --dir=./cov --wait-for-license --disable-default --enable ASSERT_SIDE_EFFECT

The final step is getting the actual report which can be generated in multiple formats, for example:

$ cov-format-errors --dir ./cov --text-output-style multiline
$ cov-format-errors --dir=./cov --emacs-style
$ cov-format-errors --dir=./cov --html-output html-out

Which generate a text report, an emacs-compatible text report, and an HTML report respectively.

Other useful options for cov-format-error include --file <file> to filter out defects for a specific file, --checker-regex DEFECT_TYPE to filter our only a specific defect (if this wasn't done already by cov-analyze), and many others, see --help for an exhaustive list.

Code coverage

We have a daily cron job in CentOS CI which runs all unit and integration tests, collects coverage using gcov/lcov, and uploads the report to Coveralls. In order to collect the most accurate coverage information, some measures have to be taken regarding sandboxing, namely:

  • ProtectSystem= and ProtectHome= need to be turned off
  • the $BUILD_DIR with necessary .gcno files needs to be present in the image and needs to be writable by all processes

The first point is relatively easy to handle and is handled automagically by our test "framework" by creating necessary dropins.

Making the $BUILD_DIR accessible to everything is slightly more complicated. First, and foremost, the $BUILD_DIR has a POSIX ACL that makes it writable to everyone. However, this is not enough in some cases, like for services that use DynamicUser=yes, since that implies ProtectSystem=strict that can't be turned off. A solution to this is to use ReadWritePaths=$BUILD_DIR, which works for the majority of cases, but can't be turned on globally, since ReadWritePaths= creates its own mount namespace which might break some services. Hence, the ReadWritePaths=$BUILD_DIR is enabled for all services with the test- prefix (i.e. test-foo.service or test-foo-bar.service), both in the system and the user managers.

So, if you're considering writing an integration test that makes use of DynamicUser=yes, or other sandboxing stuff that implies it, please prefix the test unit (be it a static one or a transient one created via systemd-run), with test-, unless the test unit needs to be able to install mount points in the main mount namespace - in that case use IGNORE_MISSING_COVERAGE=yes in the test definition (i.e. TEST-*-NAME/test.sh), which will skip the post-test check for missing coverage for the respective test.