1
0
mirror of https://github.com/systemd/systemd.git synced 2025-01-23 02:04:32 +03:00
Daan De Meyer b85e54961c test: Various mkosi integration test improvements
- Stop using logging module since the default output formatting is
  pretty bad. Prefer print() for now.
- Log less, logging the full mkosi command line is rather verbose,
  especially when it contains multi-line dropins.
- Streamline the journalctl command we output for debugging failed
  tests.
- Don't force usage of the disk image format.
- Don't force running without unit tests.
- Don't force disabling RuntimeBuildSources.
- Update documentation to streamline the command for running a single
  test and remove sudo as it's not required anymore.
- Improve the console output by having the test unit's output logged
  to both the journal and the console.
- Disable journal console log forwarding as we have journal forwarding
  as a better alternative.
- Delete existing journal file before running test.
- Delete journal files of succeeded tests to reduce disk usage.
- Rename system_mkosi target to just mkosi
- Pass in mkosi source directory explicitly to accomodate arbitrary
  build directory locations.
- Add test interactive debugging if stdout is connected to a tty
- Stop explicitly using the 'system' image since it'll likely be
  dropped soon.
- Only forward journal if we're not running in debugging mode.
- Stop using testsuite.target and instead just add the necessary
  extras to the main testsuite unit via the credential dropin.
- Override type to idle so test output is not interleaved with
  status output.
- Don't build mkosi target by default
- Always add the mkosi target if mkosi is found
- Remove dependency of the integration tests on the mkosi target
  as otherwise the image is always built, even though we configure
  it to not be built by default.
- Move mkosi output, cache and build directory into build/ so that
  invocations from meson and regular invocations share the same
  directories.
- Various aesthetic cleanups.
2024-04-23 10:32:42 +02:00
..
2023-11-29 11:04:59 +00:00
2024-04-22 19:37:13 +02:00

The extended testsuite only works with UID=0. It consists of the subdirectories
named "test/TEST-??-*", each of which contains a description of an OS image and
a test which consists of systemd units and scripts to execute in this image.
The same image is used for execution under `systemd-nspawn` and `qemu`.

To run the extended testsuite do the following:

$ ninja -C build  # Avoid building anything as root later
$ sudo test/run-integration-tests.sh
ninja: Entering directory `/home/zbyszek/src/systemd/build'
ninja: no work to do.
--x-- Running TEST-01-BASIC --x--
+ make -C TEST-01-BASIC clean setup run
make: Entering directory '/home/zbyszek/src/systemd/test/TEST-01-BASIC'
TEST-01-BASIC CLEANUP: Basic systemd setup
TEST-01-BASIC SETUP: Basic systemd setup
...
TEST-01-BASIC RUN: Basic systemd setup [OK]
make: Leaving directory '/home/zbyszek/src/systemd/test/TEST-01-BASIC'
--x-- Result of TEST-01-BASIC: 0 --x--
--x-- Running TEST-02-CRYPTSETUP --x--
+ make -C TEST-02-CRYPTSETUP clean setup run

If one of the tests fails, then $subdir/test.log contains the log file of
the test.

To run just one of the cases:

$ sudo make -C test/TEST-01-BASIC clean setup run

To run the meson-based integration test config
enable integration tests and options for required commands with the following:

$ meson configure build -Dintegration-tests=true -Dremote=enabled -Dopenssl=enabled -Dblkid=enabled -Dtpm2=enabled

Once enabled, first build the integration test image:

$ meson compile -C build mkosi

After the image has been built, the integration tests can be run with:

$ meson test -C build/ --suite integration-tests --num-processes "$(($(nproc) / 2))"

As usual, specific tests can be run in meson by appending the name of the test
which is usually the name of the directory e.g.

$ meson test -C build/ -v TEST-01-BASIC

Due to limitations in meson, the integration tests do not yet depend on the mkosi target, which means the
mkosi target has to be manually rebuilt before running the integration tests. To rebuild the image and rerun
a test, the following command can be used:

$ meson compile -C build mkosi && meson test -C build -v TEST-01-BASIC

See `meson introspect build --tests` for a list of tests.

Specifying the build directory
==============================

If the build directory is not detected automatically, it can be specified
with BUILD_DIR=:

$ sudo BUILD_DIR=some-other-build/ test/run-integration-tests

or

$ sudo make -C test/TEST-01-BASIC BUILD_DIR=../../some-other-build/ ...

Note that in the second case, the path is relative to the test case directory.
An absolute path may also be used in both cases.

Testing installed binaries instead of built
===========================================

To run the extended testsuite using the systemd installed on the system instead
of the systemd from a build, use the NO_BUILD=1:

$ sudo NO_BUILD=1 test/run-integration-tests

Configuration variables
=======================

TEST_NO_QEMU=1
    Don't run tests under qemu

TEST_QEMU_ONLY=1
    Run only tests that require qemu

TEST_NO_NSPAWN=1
    Don't run tests under systemd-nspawn

TEST_PREFER_NSPAWN=1
    Run all tests that do not require qemu under systemd-nspawn

TEST_NO_KVM=1
    Disable qemu KVM auto-detection (may be necessary when you're trying to run the
    *vanilla* qemu and have both qemu and qemu-kvm installed)

TEST_NESTED_KVM=1
    Allow tests to run with nested KVM. By default, the testsuite disables
    nested KVM if the host machine already runs under KVM. Setting this
    variable disables such checks

QEMU_MEM=512M
    Configure amount of memory for qemu VMs (defaults to 512M)

QEMU_SMP=1
    Configure number of CPUs for qemu VMs (defaults to 1)

KERNEL_APPEND='...'
    Append additional parameters to the kernel command line

NSPAWN_ARGUMENTS='...'
    Specify additional arguments for systemd-nspawn

QEMU_TIMEOUT=infinity
    Set a timeout for tests under qemu (defaults to 1800 sec)

NSPAWN_TIMEOUT=infinity
    Set a timeout for tests under systemd-nspawn (defaults to 1800 sec)

INTERACTIVE_DEBUG=1
    Configure the machine to be more *user-friendly* for interactive debuggung
    (e.g. by setting a usable default terminal, suppressing the shutdown after
    the test, etc.)

TEST_MATCH_SUBTEST=subtest
    If the test makes use of `run_subtests` use this variable to provide
    a POSIX extended regex to run only subtests matching the expression

TEST_MATCH_TESTCASE=testcase
    Same as $TEST_MATCH_SUBTEST but for subtests that make use of `run_testcases`

The kernel and initrd can be specified with $KERNEL_BIN and $INITRD. (Fedora's
or Debian's default kernel path and initrd are used by default.)

A script will try to find your qemu binary. If you want to specify a different
one with $QEMU_BIN.

Debugging the qemu image
========================

If you want to log in the testsuite virtual machine, use INTERACTIVE_DEBUG=1
and log in as root:

$ sudo make -C test/TEST-01-BASIC INTERACTIVE_DEBUG=1 run

The root password is empty.

Ubuntu CI
=========

New PR submitted to the project are run through regression tests, and one set
of those is the 'autopkgtest' runs for several different architectures, called
'Ubuntu CI'.  Part of that testing is to run all these tests.  Sometimes these
tests are temporarily deny-listed from running in the 'autopkgtest' tests while
debugging a flaky test; that is done by creating a file in the test directory
named 'deny-list-ubuntu-ci', for example to prevent the TEST-01-BASIC test from
running in the 'autopkgtest' runs, create the file
'TEST-01-BASIC/deny-list-ubuntu-ci'.

The tests may be disabled only for specific archs, by creating a deny-list file
with the arch name at the end, e.g.
'TEST-01-BASIC/deny-list-ubuntu-ci-arm64' to disable the TEST-01-BASIC test
only on test runs for the 'arm64' architecture.

Note the arch naming is not from 'uname -m', it is Debian arch names:
https://wiki.debian.org/ArchitectureSpecificsMemo

For PRs that fix a currently deny-listed test, the PR should include removal
of the deny-list file.

In case a test fails, the full set of artifacts, including the journal of the
failed run, can be downloaded from the artifacts.tar.gz archive which will be
reachable in the same URL parent directory as the logs.gz that gets linked on
the Github CI status.

The log URL can be derived following a simple algorithm, however the test
completion timestamp is needed and it's not easy to find without access to the
log itself. For example, a noble s390x job started on 2024-03-23 at 02:09:11
will be stored at the following URL:

https://autopkgtest.ubuntu.com/results/autopkgtest-noble-upstream-systemd-ci-systemd-ci/noble/s390x/s/systemd-upstream/20240323_020911_e8e88@/log.gz

The 5 characters at the end of the last directory are not random, but the first
5 characters of a SHA1 hash generated based on the set of parameters given to
the build plus the completion timestamp, such as:

$ echo -n 'systemd-upstream {"build-git": "https://salsa.debian.org/systemd-team/systemd.git#debian/master", "env": ["UPSTREAM_REPO=https://github.com/systemd/systemd.git", "CFLAGS=-O0", "DEB_BUILD_PROFILES=pkg.systemd.upstream noudeb", "TEST_UPSTREAM=1", "CONFFLAGS_UPSTREAM=--werror -Dslow-tests=true", "UPSTREAM_PULL_REQUEST=31444", "GITHUB_STATUSES_URL=https://api.github.com/repos/systemd/systemd/statuses/c27f600a1c47f10b22964eaedfb5e9f0d4279cd9"], "ppas": ["upstream-systemd-ci/systemd-ci"], "submit-time": "2024-02-27 17:06:27", "uuid": "02cd262f-af22-4f82-ac91-53fa5a9e7811"}' | sha1sum | cut -c1-5

To add new dependencies or new binaries to the packages used during the tests,
a merge request can be sent to: https://salsa.debian.org/systemd-team/systemd
targeting the 'upstream-ci' branch.

The cloud-side infrastructure, that is hooked into the Github interface, is
located at:

https://git.launchpad.net/autopkgtest-cloud/

In case of infrastructure issues with this CI, things might go wrong in two
places:

- starting a job: this is done via a Github webhook, so check if the HTTP POST
  are failing on https://github.com/systemd/systemd/settings/hooks
- running a job: all currently running jobs are listed at
  https://autopkgtest.ubuntu.com/running#pkg-systemd-upstream in case the PR
  does not show the status for some reason
- reporting the job result: this is done on Canonical's cloud infrastructure,
  if jobs are started and running but no status is visible on the PR, then it is
  likely that reporting back is not working

The CI job needs a PPA in order to be accepted, and the upstream-systemd-ci/systemd-ci
PPA is used. Note that this is necessary even when there are no packages to backport,
but by default a PPA won't have a repository for a release if there are no packages
built for it. To work around this problem, when a new empty release is needed the
mark-suite-dirty tool from the https://git.launchpad.net/ubuntu-archive-tools can
be used to force the PPA to publish an empty repository, for example:

 $ ./mark-suite-dirty -A ppa:upstream-systemd-ci/ubuntu/systemd-ci -s noble

will create an empty 'noble' repository that can be used for 'noble' CI jobs.

For infrastructure help, reaching out to Canonical via the #ubuntu-quality channel
on libera.chat is an effective way to receive support in general.

Manually running a part of the Ubuntu CI test suite
===================================================

In some situations one may want/need to run one of the tests run by Ubuntu CI
locally for debugging purposes. For this, you need a machine (or a VM) with
the same Ubuntu release as is used by Ubuntu CI (Jammy ATTOW).

First of all, clone the Debian systemd repository and sync it with the code of
the PR (set by the $UPSTREAM_PULL_REQUEST env variable) you'd like to debug:

# git clone https://salsa.debian.org/systemd-team/systemd.git
# cd systemd
# git checkout upstream-ci
# TEST_UPSTREAM=1 UPSTREAM_PULL_REQUEST=12345 ./debian/extra/checkout-upstream

Now install necessary build & test dependencies:

## PPA with some newer Ubuntu packages required by upstream systemd
# add-apt-repository -y --enable-source ppa:upstream-systemd-ci/systemd-ci
# apt build-dep -y systemd
# apt install -y autopkgtest debhelper genisoimage git qemu-system-x86 \
                 libcurl4-openssl-dev libfdisk-dev libtss2-dev libfido2-dev \
                 libssl-dev python3-pefile

Build systemd deb packages with debug info:

# TEST_UPSTREAM=1 DEB_BUILD_OPTIONS="nocheck nostrip noopt" dpkg-buildpackage -us -uc
# cd ..

Prepare a testbed image for autopkgtest (tweak the release as necessary):

# autopkgtest-buildvm-ubuntu-cloud --ram-size 1024 -v -a amd64 -r jammy

And finally run the autopkgtest itself:

# autopkgtest -o logs *.deb systemd/ \
              --env=TEST_UPSTREAM=1 \
              --timeout-factor=3 \
              --test-name=boot-and-services \
              --shell-fail \
              -- autopkgtest-virt-qemu --cpus 4 --ram-size 2048 autopkgtest-jammy-amd64.img

where --test-name= is the name of the test you want to run/debug. The
--shell-fail option will pause the execution in case the test fails and shows
you the information how to connect to the testbed for further debugging.

Manually running CodeQL analysis
=====================================

This is mostly useful for debugging various CodeQL quirks.

Download the CodeQL Bundle from https://github.com/github/codeql-action/releases
and unpack it somewhere. From now the 'tutorial' assumes you have the `codeql`
binary from the unpacked archive in $PATH for brevity.

Switch to the systemd repository if not already:

$ cd <systemd-repo>

Create an initial CodeQL database:

$ CCACHE_DISABLE=1 codeql database create codeqldb --language=cpp -vvv

Disabling ccache is important, otherwise you might see CodeQL complaining:

No source code was seen and extracted to /home/mrc0mmand/repos/@ci-incubator/systemd/codeqldb.
This can occur if the specified build commands failed to compile or process any code.
 - Confirm that there is some source code for the specified language in the project.
 - For codebases written in Go, JavaScript, TypeScript, and Python, do not specify
   an explicit --command.
 - For other languages, the --command must specify a "clean" build which compiles
   all the source code files without reusing existing build artefacts.

If you want to run all queries systemd uses in CodeQL, run:

$ codeql database analyze codeqldb/ --format csv --output results.csv .github/codeql-custom.qls .github/codeql-queries/*.ql -vvv

Note: this will take a while.

If you're interested in a specific check, the easiest way (without hunting down
the specific CodeQL query file) is to create a custom query suite. For example:

$ cat >test.qls <<EOF
- queries: .
  from: codeql/cpp-queries
- include:
    id:
        - cpp/missing-return
EOF

And then execute it in the same way as above:

$ codeql database analyze codeqldb/ --format csv --output results.csv test.qls -vvv

More about query suites here: https://codeql.github.com/docs/codeql-cli/creating-codeql-query-suites/

The results are then located in the `results.csv` file as a comma separated
values list (obviously), which is the most human-friendly output format the
CodeQL utility provides (so far).

Running Coverity locally
========================

Note: this requires a Coverity license, as the public tool tarball (from [0])
doesn't contain cov-analyze and friends, so the usefulness of this guide is
somewhat limited.

Debugging certain pesky Coverity defects can be painful, especially since the
OSS Coverity instance has a very strict limit on how many builds we can send it
per day/week, so if you have an access to a non-OSS Coverity license, knowing
how to debug defects locally might come in handy.

After installing the necessary tooling we need to populate the emit DB first:

$ rm -rf build cov
$ meson setup build -Dman=false
$ cov-build --dir=./cov ninja -C build

From there it depends if you're interested in a specific defect or all of them.
For the latter run:

$ cov-analyze --dir=./cov --wait-for-license

If you want to debug a specific defect, telling that to cov-analyze speeds
things up a bit:

$ cov-analyze --dir=./cov --wait-for-license --disable-default --enable ASSERT_SIDE_EFFECT

The final step is getting the actual report which can be generated in multiple
formats, for example:

$ cov-format-errors --dir ./cov --text-output-style multiline
$ cov-format-errors --dir=./cov --emacs-style
$ cov-format-errors --dir=./cov --html-output html-out

Which generate a text report, an emacs-compatible text report, and an HTML
report respectively.

Other useful options for cov-format-error include --file <file> to filter out
defects for a specific file, --checker-regex DEFECT_TYPE to filter our only a
specific defect (if this wasn't done already by cov-analyze), and many others,
see --help for an exhaustive list.

[0] https://scan.coverity.com/download

Code coverage
=============

We have a daily cron job in CentOS CI which runs all unit and integration tests,
collects coverage using gcov/lcov, and uploads the report to Coveralls[0]. In
order to collect the most accurate coverage information, some measures have
to be taken regarding sandboxing, namely:

 - ProtectSystem= and ProtectHome= need to be turned off
 - the $BUILD_DIR with necessary .gcno files needs to be present in the image
   and needs to be writable by all processes

The first point is relatively easy to handle and is handled automagically by
our test "framework" by creating necessary dropins.

Making the $BUILD_DIR accessible to _everything_ is slightly more complicated.
First, and foremost, the $BUILD_DIR has a POSIX ACL that makes it writable
to everyone. However, this is not enough in some cases, like for services
that use DynamicUser=yes, since that implies ProtectSystem=strict that can't
be turned off. A solution to this is to use ReadWritePaths=$BUILD_DIR, which
works for the majority of cases, but can't be turned on globally, since
ReadWritePaths= creates its own mount namespace which might break some
services. Hence, the ReadWritePaths=$BUILD_DIR is enabled for all services
with the `test-` prefix (i.e. test-foo.service or test-foo-bar.service), both
in the system and the user managers.

So, if you're considering writing an integration test that makes use
of DynamicUser=yes, or other sandboxing stuff that implies it, please prefix
the test unit (be it a static one or a transient one created via systemd-run),
with `test-`, unless the test unit needs to be able to install mount points
in the main mount namespace - in that case use IGNORE_MISSING_COVERAGE=yes
in the test definition (i.e. TEST-*-NAME/test.sh), which will skip the post-test
check for missing coverage for the respective test.

[0] https://coveralls.io/github/systemd/systemd