mirror of https://github.com/systemd/systemd.git synced 2025-01-06 17:18:12 +03:00
Commit Graph

5515 Commits

Author SHA1 Message Date
Daan De Meyer
18bb30c3b2
core: Bind mount notify socket to /run/host/notify in sandboxed units (#35573)
To be able to run systemd in a Type=notify transient unit, the notify
socket can't be bind mounted to /run/systemd/notify as systemd in the
transient unit wants to use that as its own notify socket which
conflicts with systemd on the host.

Instead, for sandboxed units, let's bind mount the notify socket to
/run/host/notify as documented in the container interface. Since we
don't guarantee a stable location for the notify socket and insist users
use $NOTIFY_SOCKET to get its path, this is safe to do.
2024-12-13 13:48:07 +00:00
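Because the socket location is not stable, a client should resolve it only through $NOTIFY_SOCKET. A minimal sketch of such a client is shown below; the function name `sd_notify` mirrors the libsystemd call but this is an illustrative reimplementation, not the project's code.

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a notification datagram to the socket named by $NOTIFY_SOCKET.

    Returns False when $NOTIFY_SOCKET is unset (no notify-aware service
    manager is listening) and True once the datagram has been sent.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket, which is
    # addressed with a leading NUL byte instead.
    address = "\0" + path[1:] if path.startswith("@") else path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), address)
    return True
```

A service would call `sd_notify("READY=1")` once start-up has finished. Since the path is never hard-coded, the same client works whether the socket is at /run/systemd/notify or /run/host/notify.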
Luca Boccassi
ed803ee195
journalctl: make --setup-keys honor --output=json and --quiet (#35507)
Closes #35503.
Closes #35504.
2024-12-13 13:40:09 +00:00
Daan De Meyer
284dd31e9d core: Bind mount notify socket to /run/host/notify in sandboxed units
To be able to run systemd in a Type=notify transient unit, the notify
socket can't be bind mounted to /run/systemd/notify as systemd in the
transient unit wants to use that as its own notify socket which conflicts
with systemd on the host.

Instead, for sandboxed units, let's bind mount the notify socket to
/run/host/notify as documented in the container interface. Since we don't
guarantee a stable location for the notify socket and insist users use
$NOTIFY_SOCKET to get its path, this is safe to do.
2024-12-13 13:37:02 +01:00
Luca Boccassi
6dfd290031
core: Add PrivateUsers=full (#35183)
Recently, PrivateUsers=identity was added to support mapping the first
65536 UIDs/GIDs from parent to the child namespace and mapping the other
UID/GIDs to the nobody user.

However, there are use cases where users have UIDs/GIDs > 65536 and need
to do a similar identity mapping. Moreover, in some of those cases,
users want a full identity mapping from 0 -> UID_MAX.

To support this, we add PrivateUsers=full that does identity mapping for
all available UID/GIDs.

Note to differentiate ourselves from the init user namespace, we need to
set up the uid_map/gid_map like:
```
0 0 1
1 1 UINT32_MAX - 1
```

as the init user namespace uses `0 0 UINT32_MAX` and some applications
- like systemd itself - determine if it's a non-init user namespace based
on uid_map/gid_map files.

Note systemd will remove this heuristic in running_in_userns() in
version 258 (https://github.com/systemd/systemd/pull/35382) and use the
namespace inode instead. But some users may be running a container image
with older systemd < 258, so we keep this hack until version 259 for
version N-1 compatibility.

In addition to mapping the whole UID/GID space, we also set
/proc/pid/setgroups to "allow". While we usually set "deny" to avoid
security issues with dropping supplementary groups
(https://lwn.net/Articles/626665/), this ends up breaking dbus-broker
when running /sbin/init in full OS containers.

Fixes: #35168
Fixes: #35425
2024-12-13 12:25:13 +00:00
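The split mapping described for PrivateUsers=full can be illustrated with a short sketch. The helper names below (`full_identity_map`, `looks_like_init_userns`, `mapped_count`) are hypothetical, and the detection function is only a simplified stand-in for the kind of uid_map heuristic the message mentions, not systemd's actual running_in_userns() code.

```python
UINT32_MAX = 2**32 - 1


def full_identity_map() -> str:
    """Return the two-line identity map from the commit message."""
    return f"0 0 1\n1 1 {UINT32_MAX - 1}\n"


def parse_map(text: str) -> list[tuple[int, ...]]:
    """Parse uid_map/gid_map text into (inside, outside, count) tuples."""
    return [tuple(int(field) for field in line.split())
            for line in text.splitlines() if line.strip()]


def looks_like_init_userns(text: str) -> bool:
    """Simplified heuristic: a single `0 0 UINT32_MAX` entry."""
    return parse_map(text) == [(0, 0, UINT32_MAX)]


def mapped_count(text: str) -> int:
    """Total number of IDs covered by the map."""
    return sum(count for _, _, count in parse_map(text))
```

Under these assumptions, both maps cover the same number of IDs, but only the single-line form matches the heuristic, which is why splitting the range into two entries is enough to tell the namespaces apart.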
Luca Boccassi
9fdf10604b
core: fix loading verity settings for MountImages= (#35577)
The MountEntry logic was refactored to store the verity
settings, and updated for ExtensionImages=, but not for
MountImages=.

Follow-up for a1a40297db
2024-12-12 13:06:07 +00:00
Ryan Wilson
2665425176 core: Set /proc/pid/setgroups to allow for PrivateUsers=full
When trying to run dbus-broker in a systemd unit with PrivateUsers=full,
we see dbus-broker fails with EPERM at `util_audit_drop_permissions`.

The root cause is dbus-broker calls the setgroups() system call and this
is disallowed via systemd's implementation of PrivateUsers= by setting
/proc/pid/setgroups = deny. This is done to remediate potential privilege
escalation vulnerabilities in user namespaces where an attacker can remove
supplementary groups and gain access to resources where those groups are
restricted.

However, for OS-like containers, setgroups() is a pretty common API and
disabling it is not feasible. So we allow setgroups() by setting
/proc/pid/setgroups to allow in PrivateUsers=full. Note security-conscious
users can still use SystemCallFilter= to disable setgroups() if they want
to specifically prevent this system call.

Fixes: #35425
2024-12-12 11:36:10 +00:00
Yu Watanabe
9d8cb69e7f test: rename README.testsuite -> README.md 2024-12-12 12:02:19 +09:00
Luca Boccassi
c7fcb08324 test: add more coverage for extensions and verity 2024-12-12 00:58:20 +00:00
Luca Boccassi
59a83e1188 core: fix loading verity settings for MountImages=
The MountEntry logic was refactored to store the verity
settings, and updated for ExtensionImages=, but not for
MountImages=.

Follow-up for a1a40297db
2024-12-12 00:58:20 +00:00
Yu Watanabe
bfff0f5ac8
Add credential support for mount units (#34732)
Add `EXEC_SETUP_CREDENTIALS` flag to allow using credentials with mount units.
Fixes: #23535
2024-12-12 05:07:35 +09:00
Yu Watanabe
7bb1c8f2a3 journalctl: make --invocation and --list-invocations accept unit name with glob
Previously, journalctl -I -u GLOB was not supported, while
journalctl -u GLOB works fine. Let's make them consistent.
2024-12-11 16:32:22 +00:00
Yu Watanabe
e8823b5e35 journalctl: make --invocation and --list-invocations accept unit name without suffix
Fixes #35538.
2024-12-11 16:32:22 +00:00
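The two journalctl changes above concern how a unit argument is interpreted: a bare name gets a suffix, and a glob is matched against unit names. A rough sketch of that idea follows; the helper names, the suffix list, and the choice to leave globs unmodified are assumptions made for illustration, not journalctl's implementation.

```python
from fnmatch import fnmatchcase

# Assumed list of unit type suffixes, for illustration only.
UNIT_SUFFIXES = (".service", ".socket", ".target", ".mount", ".automount",
                 ".swap", ".timer", ".path", ".slice", ".scope", ".device")
GLOB_CHARS = "*?["


def mangle_unit_name(name: str, default_suffix: str = ".service") -> str:
    """Append a default suffix to a bare name; leave globs and full names alone."""
    if any(ch in name for ch in GLOB_CHARS) or name.endswith(UNIT_SUFFIXES):
        return name
    return name + default_suffix


def match_units(pattern: str, known_units: list[str]) -> list[str]:
    """Return the known units selected by a name or glob pattern."""
    pattern = mangle_unit_name(pattern)
    return [unit for unit in known_units if fnmatchcase(unit, pattern)]
```

For example, with `["sshd.service", "sshd.socket", "cron.service"]` as the known units, `match_units("cron", ...)` selects only `cron.service`, while `match_units("ssh*", ...)` selects both sshd units.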
Nick Rosbrook
59e5108fb4 test: set nsec3-salt-length=8 in knot.conf
TEST-75-RESOLVED fails on Ubuntu autopkgtest due to this warning from
knot:

 notice: config, policy 'auto_rollover_nsec3' depends on default nsec3-salt-length=8, since version 3.5 the default becomes 0

Explicitly set nsec3-salt-length=8 to silence.
2024-12-11 12:55:37 +00:00
Yu Watanabe
5c9da83004 journalctl: allow to dump generated key in json format
Closes #35503.
2024-12-11 11:18:06 +09:00
Yu Watanabe
a5b2973850 journalctl: honor --quiet with --setup-keys
Closes #35504.
2024-12-11 11:18:05 +09:00
Yu Watanabe
627d1a9ac1
core: Add ProtectHostname=private (#35447)
This PR allows an option for systemd exec units to enable UTS namespaces
but not restrict changing hostname via seccomp. Thus, units can change
hostname without affecting the host. This is useful for OS-like
containers running as units where they should have freedom to change
their container hostname if they want, but not the host's hostname.

Fixes: #30348
2024-12-11 10:17:25 +09:00
davjav
5b66f3df16 test: mount unit with credential
Verify mount unit credential file is present.
2024-12-10 20:57:20 +01:00
Ryan Wilson
219a6dbbf3 core: Fix time namespace in RestrictNamespaces=
RestrictNamespaces= would accept "time" but would not actually apply
seccomp filters e.g. systemd-run -p RestrictNamespaces=time unshare -T true
should fail but it succeeded.

This commit actually enables time namespace seccomp filtering.
2024-12-10 20:55:26 +01:00
Nils K
e76d83d100
core: improve finding OnSuccess=/OnFailure= dependent (#35468)
Previously, if one service specified the same unit as both its success
and failure handler, we bailed out of resolving the triggering unit
even though it is still unique.
2024-12-10 20:48:09 +01:00
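The fix amounts to deduplicating candidates before checking for uniqueness. A minimal sketch, assuming the handler unit is given two lists of unit names (those that reference it via OnSuccess= and those that reference it via OnFailure=); the function name and data layout are hypothetical.

```python
from typing import Iterable, Optional


def find_triggering_unit(on_success_refs: Iterable[str],
                         on_failure_refs: Iterable[str]) -> Optional[str]:
    """Return the single unit that triggers this handler, or None if ambiguous.

    A unit that lists the handler in both OnSuccess= and OnFailure= is
    counted once, so it still resolves as the unique triggering unit.
    """
    candidates = set(on_success_refs) | set(on_failure_refs)
    if len(candidates) == 1:
        return next(iter(candidates))
    return None
```

With this approach, `find_triggering_unit(["a.service"], ["a.service"])` resolves to `a.service`, whereas two different referencing units still yield no result.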
Luca Boccassi
92acb89735 Revert "test: skip TEST-13-NSPAWN.nspawn/machined, TEST-86-MULTI-PROFILE-UKI and TEST-07-PID1.private-pids.sh"
The release is done, re-enable the skipped flaky tests for main.

This reverts commit ab828def6d.
2024-12-10 19:31:18 +00:00
Luca Boccassi
97eccc4850
Chores for v257 (#35525) 2024-12-10 19:21:43 +00:00
Yu Watanabe
a33813e9e9 TEST-07-PID: wait for sleep command being executed by sd-executor
Hopefully fixes #35528.
2024-12-10 19:19:54 +00:00
Luca Boccassi
ab828def6d test: skip TEST-13-NSPAWN.nspawn/machined, TEST-86-MULTI-PROFILE-UKI and TEST-07-PID1.private-pids.sh
These new tests are flaky, so disable them temporarily, until after
the release, to avoid pushing out new flakiness to consumers. They
will be re-enabled immediately after.
2024-12-10 15:18:39 +00:00
Luca Boccassi
b8a34813b0 test: add TEST_SKIP_SUBTESTS/TEST_SKIP_TESTCASES
Inverse of the TEST_MATCH_SUBTEST/TEST_MATCH_TESTCASE variables
2024-12-10 15:18:39 +00:00
Luca Boccassi
491b9a8575 test: use mkdir -p in TEST-25-IMPORT
[   15.896174] TEST-25-IMPORT.sh[473]: + mkdir /var/tmp/scratch
[   15.902524] TEST-25-IMPORT.sh[519]: mkdir: cannot create directory ‘/var/tmp/scratch’: File exists

https://github.com/systemd/systemd/actions/runs/12248114409/job/34167155679?pr=35520
2024-12-10 13:51:53 +00:00
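The failure in the log above is the usual non-idempotent directory creation problem that `mkdir -p` avoids. The same pattern in Python, as a small sketch (the helper name `ensure_dir` is made up for this example):

```python
import os


def ensure_dir(path: str) -> str:
    """Create `path` (and any missing parents) without failing if it exists."""
    os.makedirs(path, exist_ok=True)
    return path
```

Calling `ensure_dir` twice on the same path succeeds both times, whereas a second plain `os.mkdir` on an existing directory raises `FileExistsError`, the equivalent of the "File exists" error in the log.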
Yu Watanabe
d2d006cc8c test: use systemd-asan-env environment file at more places 2024-12-10 11:01:53 +09:00
Yu Watanabe
456727b5d4 test-network: check status of networkd after everything cleared on tear down
Otherwise, if networkd has failed, e.g. the .network files that triggered
the failure will remain, and the next test case will start with the previous
.network files. So, most subsequent tests will fail.
2024-12-10 11:01:53 +09:00
Yu Watanabe
1bdb9e808f test: extract sanitizer reports from journal 2024-12-10 11:01:48 +09:00
Yu Watanabe
dbf83c6613 Revert "test: tentatively disable SELinux tests"
This reverts commit 261a3d191e.
2024-12-09 21:52:06 +01:00
Daan De Meyer
8f51cf6981 test: Set kernel loglevel to INFO when running tests unattended
This makes sure all kernel log messages are logged to the console.
This should be helpful during shutdown to detect possible issues with
journald when the logs can't be written to the journal itself anymore
but are written to kmsg.
2024-12-08 12:55:43 +01:00
Yu Watanabe
261a3d191e test: tentatively disable SELinux tests
Currently, mkosi GitHub action complains the following:
===
Could not find 'setfiles' which is required to relabel files.
===
Let's tentatively disable SELinux test.
2024-12-08 12:59:08 +09:00
Ryan Wilson
cf48bde7ae core: Add ProtectHostname=private
This allows an option for systemd exec units to enable UTS namespaces
but not restrict changing hostname via seccomp. Thus, units can change
hostname without affecting the host.

Fixes: #30348
2024-12-06 13:34:04 -08:00
Daan De Meyer
ead814a0b0 test: Remove old bash test runner
We set 257 as the deadline for removing the old bash test runner, so since
we're about to release 257, let's remove it in favor of the meson + mkosi
test runner.
2024-12-06 18:54:10 +00:00
Ryan Wilson
705cc82938 core: Add PrivateUsers=full
Recently, PrivateUsers=identity was added to support mapping the first
65536 UIDs/GIDs from parent to the child namespace and mapping the other
UID/GIDs to the nobody user.

However, there are use cases where users have UIDs/GIDs > 65536 and need
to do a similar identity mapping. Moreover, in some of those cases, users
want a full identity mapping from 0 -> UID_MAX.

Note to differentiate ourselves from the init user namespace, we need to
set up the uid_map/gid_map like:
```
0 0 1
1 1 UINT32_MAX - 1
```

as the init user namespace uses `0 0 UINT32_MAX` and some applications -
like systemd itself - determine if it's a non-init user namespace based on
uid_map/gid_map files. Note systemd will remove this heuristic in
running_in_userns() in version 258 and use the namespace inode instead. But
some users may be running a container image with older systemd < 258, so we
keep this hack until version 259.

To support this, we add PrivateUsers=full that does identity mapping for
all available UID/GIDs.

Fixes: #35168
2024-12-05 10:34:32 -08:00
Daan De Meyer
e022e73e3f test: Implement TEST_PREFER_QEMU and use it in one of the mkosi jobs
We want to make sure the integration tests that don't require qemu
can run successfully both in an nspawn container and in a qemu VM.
So let's add one more knob TEST_PREFER_QEMU=1 to run jobs that normally
require nspawn in qemu instead.

Running these tests in qemu is also possible by not running as root, but
that's very implicit, so we add a dedicated knob to make it explicit that
we want to run these in qemu instead of nspawn.
2024-12-05 16:43:11 +01:00
Daan De Meyer
900ac3a76a
ci: Implement coverage on top of mkosi (#35407) 2024-12-05 10:47:45 +01:00
Daan De Meyer
c45174f05d ci: Implement coverage on top of mkosi 2024-12-05 00:21:57 +01:00
Luca Boccassi
162760f16c
Use nicer syntax in two places in CI (#35455) 2024-12-04 13:32:28 +00:00
Daan De Meyer
e69d724aff test-execute: Make /coverage writable in DynamicUser= tests
DynamicUser=yes implies ProtectSystem=yes, so let's explicitly make
sure the coverage directory is writable in these tests.
2024-12-04 14:04:24 +01:00
Daan De Meyer
820a9373fc test: Skip TEST-38-FREEZER if coverage is enabled
The test freezes regularly when run with coverage so let's skip it
if coverage is enabled.
2024-12-04 11:12:50 +01:00
Zbigniew Jędrzejewski-Szmek
92e43e5c53 TEST-64: use more idiomatic loop syntax 2024-12-04 09:58:52 +01:00
Yu Watanabe
552e5db9ac Revert "mkosi: extend DefaultTimeoutStopSec= when running on sanitizers"
This reverts commit b75befc3c9.

Unfortunately, it does not work. Let's revert.
2024-12-04 09:13:18 +09:00
Mike Yuan
703b1b7f24 core/service: preserve RuntimeDirectory= even if oneshot service exits
Follow-up for c26948c6da

We only want to get rid of cred mount here, and RuntimeDirectory=
is documented to be retained for SERVICE_EXITED state.

Fixes #35427
2024-12-02 10:57:45 +01:00
Yu Watanabe
472e3cce6e TEST-13-NSPAWN: enable debugging logs by nspawn run by systemd-run
Otherwise, it is hard to debug issue #35209.
2024-12-01 15:40:19 +01:00
Yu Watanabe
b75befc3c9 mkosi: extend DefaultTimeoutStopSec= when running on sanitizers
Hopefully fixes #35335.
2024-11-30 04:28:24 +09:00
Yu Watanabe
6bb3771e8c TEST-67-INTEGRITY: blkid should not provide the underlying loopback block device
Fixes #35363.
2024-11-28 00:56:43 +09:00
Yu Watanabe
d5c4f1997a TEST-67-INTEGRITY: modernize test code
- make udevd generate debugging logs for loopback and DM devices,
- insert 'udevadm wait' at several places to make the device processed
  by udevd,
- cleanup generated integritysetup service before moving to next
  algorithm,
- drop unnecessary exit on command failure,
- also test data splitting mode for all algorithms.
2024-11-28 00:56:23 +09:00
Lennart Poettering
c18a102464 tests: fix access mode of root inode of throw-away container images
Otherwise the root inode will typically have what mkdtemp sets up, which
is something like 0700, which is weird and somewhat broken when trying
to look into containers from unprivileged users.
2024-11-28 00:13:27 +09:00
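The access mode issue can be reproduced outside the test suite. The sketch below creates a throw-away root directory and relaxes its mode; the helper name and the choice of 0o755 as the target mode are assumptions for illustration, since the commit message does not state which mode the tests apply.

```python
import os
import stat
import tempfile


def make_image_root(mode: int = 0o755) -> str:
    """Create a temporary root directory and set an explicit access mode.

    tempfile.mkdtemp() creates the directory accessible only to its
    owner, so the mode is adjusted afterwards (0o755 is an assumed value).
    """
    root = tempfile.mkdtemp()
    os.chmod(root, mode)
    return root


def dir_mode(path: str) -> int:
    """Return the permission bits of `path`."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

Comparing `dir_mode(tempfile.mkdtemp())` with `dir_mode(make_image_root())` shows the difference between the owner-only default and the relaxed mode.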
Luca Boccassi
0566bd9643
machine: increase timeouts in attempt to fix #35115 (#35117)
An attempt to fix https://github.com/systemd/systemd/issues/35115
2024-11-26 16:12:56 +00:00
Daan De Meyer
e3b5a0c32d test: Use env in testsuite readme
Let's make sure we use env when we're setting environment variables
to rely less on shell specifics.
2024-11-25 14:54:23 +00:00