IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Follow-up for d0a63cf041.
The command ncat may be already dead when the service manager receives
the notify message. Hence, the service cannot be found by the sender PID,
and the notify message will be ignored.
```
Dec 17 03:26:49 systemd[1]: Cannot find unit for notify message of PID 1159, ignoring.
Dec 17 03:26:49 systemd[1]: Received SIGCHLD from PID 1152 (bash).
Dec 17 03:26:49 systemd[1]: Child 1152 (bash) died (code=exited, status=0/SUCCESS)
Dec 17 03:26:49 systemd[1]: run-p1151-i1451.service: Child 1152 belongs to run-p1151-i1451.service.
Dec 17 03:26:49 systemd[1]: run-p1151-i1451.service: Main process exited, code=exited, status=0/SUCCESS (success)
Dec 17 03:26:49 systemd[1]: run-p1151-i1451.service: Failed with result 'protocol'.
Dec 17 03:26:49 systemd[1]: run-p1151-i1451.service: Service will not restart (restart setting)
Dec 17 03:26:49 systemd[1]: run-p1151-i1451.service: Changed start -> failed
```
This also drops unnecessary --pipe option and redundant check by 'env' command.
For some reasons, another session logind-test-user may be started.
===
Dec 13 07:04:16 systemd-logind[2140]: Got message type=method_call ... member=CreateSessionWithPIDFD ...
(snip)
Dec 13 07:04:16 systemd-logind[2140]: New session 15 of user logind-test-user.
Dec 13 07:04:16 systemd-logind[2140]: VT changed to 2
Dec 13 07:04:16 systemd-logind[2140]: rfkill: Found udev node /dev/rfkill for seat seat0
Dec 13 07:04:16 systemd-logind[2140]: udmabuf: Found udev node /dev/udmabuf for seat seat0
Dec 13 07:04:16 systemd-logind[2140]: Found static node /dev/snd/timer for seat seat0
Dec 13 07:04:16 systemd-logind[2140]: Found static node /dev/snd/seq for seat seat0
Dec 13 07:04:16 systemd-logind[2140]: Changing ACLs at /dev/snd/timer for seat seat0 (uid 0→4712 add)
Dec 13 07:04:16 systemd-logind[2140]: Changing ACLs at /dev/rfkill for seat seat0 (uid 0→4712 add)
Dec 13 07:04:16 systemd-logind[2140]: Changing ACLs at /dev/udmabuf for seat seat0 (uid 0→4712 add)
Dec 13 07:04:16 systemd-logind[2140]: Changing ACLs at /dev/snd/seq for seat seat0 (uid 0→4712 add)
Dec 13 07:04:16 systemd[1]: user-4712.slice: Changed dead -> active
Dec 13 07:04:16 systemd[1]: user-4712.slice: Job 5951 user-4712.slice/start finished, result=done
Dec 13 07:04:16 systemd[1]: Created slice user-4712.slice.
Dec 13 07:04:16 systemd-logind[2140]: Electing new display for user logind-test-user
Dec 13 07:04:16 systemd-logind[2140]: Choosing session 15 in preference to -
(snip)
Dec 13 07:04:16 systemd-logind[2140]: Got message type=method_call ... member=CreateSessionWithPIDFD ...
(snip)
Dec 13 07:04:16 systemd-logind[2140]: New session 16 of user logind-test-user.
Dec 13 07:04:16 systemd-logind[2140]: Electing new display for user logind-test-user
Dec 13 07:04:16 systemd-logind[2140]: Ignoring session 16
===
Let's track only session for the user with tty, which we explicitly created.
Fixes#35597.
Copy what systemd-notify does by default by setting it to the PID of the shell,
so that main process tracking works as expected. Also use test -S instead of ls
to check socket.
[ 33.980396] (sh)[1024]: run-p1022-i1322.service: Executing: sh -c "echo READY=1 | ncat --unixsock --udp \$NOTIFY_SOCKET --source /run/notify && env"
[ 34.138778] systemd[1]: run-p1022-i1322.service: Child 1024 belongs to run-p1022-i1322.service.
[ 34.138825] systemd[1]: run-p1022-i1322.service: Main process exited, code=exited, status=0/SUCCESS (success)
[ 34.139451] systemd[1]: run-p1022-i1322.service: Failed with result 'protocol'.
[ 34.139559] systemd[1]: run-p1022-i1322.service: Service will not restart (restart setting)
[ 34.139573] systemd[1]: run-p1022-i1322.service: Changed start -> failed
[ 34.139945] systemd[1]: run-p1022-i1322.service: Job 1364 run-p1022-i1322.service/start finished, result=failed
Fixes#35619
Follow-up for 18bb30c3b2
To be able to run systemd in a Type=notify transient unit, the notify
socket can't be bind mounted to /run/systemd/notify as systemd in the
transient unit wants to use that as its own notify socket which
conflicts with systemd on the host.
Instead, for sandboxed units, let's bind mount the notify socket to
/run/host/notify as documented in the container interface. Since we
don't guarantee a stable location for the notify socket and insist users
use $NOTIFY_SOCKET to get its path, this is safe to do.
To be able to run systemd in a Type=notify transient unit, the notify
socket can't be bind mounted to /run/systemd/notify as systemd in the
transient unit wants to use that as its own notify socket which conflicts
with systemd on the host.
Instead, for sandboxed units, let's bind mount the notify socket to
/run/host/notify as documented in the container interface. Since we don't
guarantee a stable location for the notify socket and insist users use
$NOTIFY_SOCKET to get its path, this is safe to do.
Recently, PrivateUsers=identity was added to support mapping the first
65536 UIDs/GIDs from parent to the child namespace and mapping the other
UID/GIDs to the nobody user.
However, there are use cases where users have UIDs/GIDs > 65536 and need
to do a similar identity mapping. Moreover, in some of those cases,
users want a full identity mapping from 0 -> UID_MAX.
To support this, we add PrivateUsers=full that does identity mapping for
all available UID/GIDs.
Note to differentiate ourselves from the init user namespace, we need to
set up the uid_map/gid_map like:
```
0 0 1
1 1 UINT32_MAX - 1
```
as the init user namedspace uses `0 0 UINT32_MAX` and some applications
- like systemd itself - determine if its a non-init user namespace based
on uid_map/gid_map files.
Note systemd will remove this heuristic in running_in_userns() in
version 258 (https://github.com/systemd/systemd/pull/35382) and uses
namespace inode. But some users may be running a container image with
older systemd < 258 so we keep this hack until version 259 for version
N-1 compatibility.
In addition to mapping the whole UID/GID space, we also set
/proc/pid/setgroups to "allow". While we usually set "deny" to avoid
security issues with dropping supplementary groups
(https://lwn.net/Articles/626665/), this ends up breaking dbus-broker
when running /sbin/init in full OS containers.
Fixes: #35168Fixes: #35425
When trying to run dbus-broker in a systemd unit with PrivateUsers=full,
we see dbus-broker fails with EPERM at `util_audit_drop_permissions`.
The root cause is dbus-broker calls the setgroups() system call and this
is disallowed via systemd's implementation of PrivateUsers= by setting
/proc/pid/setgroups = deny. This is done to remediate potential privilege
escalation vulnerabilities in user namespaces where an attacker can remove
supplementary groups and gain access to resources where those groups are
restricted.
However, for OS-like containers, setgroups() is a pretty common API and
disabling it is not feasible. So we allow setgroups() by setting
/proc/pid/setgroups to allow in PrivateUsers=full. Note security conscious
users can still use SystemCallFilter= to disable setgroups() if they want
to specifically prevent this system call.
Fixes: #35425
TEST-75-RESOLVED fails on Ubuntu autopkgtest due to this warning from
knot:
notice: config, policy 'auto_rollover_nsec3' depends on default nsec3-salt-length=8, since version 3.5 the default becomes 0
Explicitly set nsec3-salt-length=8 to silence.
This PR allows an option for systemd exec units to enable UTS namespaces
but not restrict changing hostname via seccomp. Thus, units can change
hostname without affecting the host. This is useful for OS-like
containers running as units where they should have freedom to change
their container hostname if they want, but not the host's hostname.
Fixes: #30348
RestrictNamespaces= would accept "time" but would not actually apply
seccomp filters e.g. systemd-run -p RestrictNamespaces=time unshare -T true
should fail but it succeeded.
This commit actually enables time namespace seccomp filtering.
Previously if one service specified the same unit as their
success and failure handler we bailed out of resolving the triggering unit
even though it is still unique.
These new tests are flaky, so disable them temporarily, until after
the release, to avoid pushing out new flakiness to consumers. They
will be re-enabled immediately after.
Otherwise, if networkd is failed, e.g. .network files that triggered the
failure will remain, and the next test case will start with previous
.network files. So, most subsequent test will fail.
This makes sure all kernel log messages are logged to the console.
This should be helpful during shutdown to detect possible issues with
journald when the logs can't be written to the journal itself anymore
but are written to kmsg.
Currently, mkosi GitHub action complains the following:
===
Could not find 'setfiles' which is required to relabel files.
===
Let's tentatively disable SELinux test.
This allows an option for systemd exec units to enable UTS namespaces
but not restrict changing hostname via seccomp. Thus, units can change
hostname without affecting the host.
Fixes: #30348
We put a timeline of 257 to remove the old bash test runner so since
we're about to release 257, let's remove the old bash test runner in
favor of the meson + mkosi test runner.
Recently, PrivateUsers=identity was added to support mapping the first
65536 UIDs/GIDs from parent to the child namespace and mapping the other
UID/GIDs to the nobody user.
However, there are use cases where users have UIDs/GIDs > 65536 and need
to do a similar identity mapping. Moreover, in some of those cases, users
want a full identity mapping from 0 -> UID_MAX.
Note to differentiate ourselves from the init user namespace, we need to
set up the uid_map/gid_map like:
```
0 0 1
1 1 UINT32_MAX - 1
```
as the init user namedspace uses `0 0 UINT32_MAX` and some applications -
like systemd itself - determine if its a non-init user namespace based on
uid_map/gid_map files. Note systemd will remove this heuristic in
running_in_userns() in version 258 and uses namespace inode. But some users
may be running a container image with older systemd < 258 so we keep this
hack until version 259.
To support this, we add PrivateUsers=full that does identity mapping for
all available UID/GIDs.
Fixes: #35168
We want to make sure the integration tests that don't require qemu
can run successfully both in an nspawn container and in a qemu VM.
So let's add one more knob TEST_PREFER_QEMU=1 to run jobs that normally
require nspawn in qemu instead.
Running these tests in qemu is also possible by not running as root but
that's very implicit so we add an explicit knob instead to make it explicit
that we want to run these in qemu instead of nspawn.