This normalizes how we report an empty list of boot entries in
ListBootEntries(). Our usual pattern is to return one item per method
call, but when there is none we usually return a NoSuchXYZ error. Do so
here too.
Before this we'd return a null item here, and only here.
This is a minor compat break, but given that this IPC interface is very
new and probably not used so far (we don't use it in our code at least,
and Google doesn't find any other use), I think this normalization is OK
at this point.
If a symlink is leftover, still allow cleaning it up via 'disable'. This
happens when a unit is stopped and removed, but not disabled, and a reload
has already happened. At that point, cleaning up the old symlinks becomes
impossible through the APIs, and needs to be done manually. Always allow
cleaning up symlinks, if they exist, by only erroring out if there is an
OOM.
Follow-up for f31f10a620
DynamicUser= enables PrivateTmp= implicitly to avoid files owned by reusable UIDs
leaking into the host. Change it to create a fully private tmpfs instance instead,
which ensures the same result but with less impactful semantics than
PrivateTmp=yes, which binds the mount namespace to the host's /tmp. If a user
specifies PrivateTmp= manually, leave the existing behaviour unchanged so that
backward compatibility is not broken.
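As a rough illustration (the unit content below is made up), a service like this now gets a fully private tmpfs for /tmp and /var/tmp just from DynamicUser=, while explicitly setting PrivateTmp=yes keeps the old host-linked behaviour:
```
[Service]
DynamicUser=yes
# With this change, /tmp and /var/tmp inside the service are a private tmpfs
# instance; no PrivateTmp= needed, and nothing leaks into the host's /tmp.
ExecStart=/usr/bin/my-daemon
```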
I find myself wanting to check this data with a quick command, and
browsing through /sys/ manually getting binary data sucks. Hence let's
add a nice little analysis tool.
CPUQuota= can deal with float percentages perfectly fine these days
(up to two places after the dot), so let's take that into account
when serializing the value to the transient unit file so we don't lose
precision when specifying e.g. "CPUQuota=0.5%".
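A quick sketch of what this preserves (the unit name is arbitrary):
```
# Fractional quota should now survive serialization of the transient unit:
systemd-run --unit=quota-test -p CPUQuota=0.5% sleep 60
# Reported as CPU time per second, i.e. 5ms for 0.5%:
systemctl show quota-test.service -p CPUQuotaPerSecUSec
```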
A unit with StandardOutput=journal (the default) will get its stdout/stderr sockets
disconnected when journald stops, as the file descriptors on journald's side are
not preserved (it works on restart, as the FD Store keeps them open during restarts).
Set FileDescriptorStorePreserve=yes so that the journal FDs stay open during a soft
reboot, and applications don't get broken stdout/stderr.
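A minimal sketch of the corresponding setting, as a drop-in for systemd-journald.service (the drop-in path is illustrative):
```
# /etc/systemd/system/systemd-journald.service.d/fdstore.conf
[Service]
FileDescriptorStorePreserve=yes
```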
It seems this introduced a regression in the CentOS CI:
14:25:58 FAILED TASKS:
14:25:58 -------------
14:25:58 TEST-03-JOBS
14:25:58 TEST-52-HONORFIRSTSHUTDOWN
14:25:58 TEST-63-PATH
Revert for now.
This reverts commit da3c6fc553.
This makes output a bit shorter and nicer. For us, shorter output is generally
better.
Also, drop unnecessary UINT64_C macros. The left operand is always uint64_t,
and C upcasting rules mean that it doesn't matter if the right operand is
narrower or signed, the operation is always done on the wider unsigned type.
The virtio-scsi driver is available in the KVM/cloud kernel
packages provided by distributions whereas the megasas2 driver is
not. Let's switch to virtio-scsi so we can switch back to the KVM/cloud
kernel packages.
"journalctl -u foo.service" may not work as expected, especially entries
for _TRANSPORT=stdout, for short-living services or when the service manager
generates debugging logs. Instead, SYSLOG_IDENTIFIER= should be reliable for
stdout. Let's use it.
An example case:
```
__CURSOR=s=06278e3bf011458e973c81d370a8f7a5;i=1e4dc;b=1b0258a5c78341609bf462c72d4541c3;m=308de65;t=6194c3895a13f;x=50c7e9af5b8cfc37
__REALTIME_TIMESTAMP=1716665017803071
__MONOTONIC_TIMESTAMP=50912869
_BOOT_ID=1b0258a5c78341609bf462c72d4541c3
SYSLOG_FACILITY=3
_UID=0
_GID=0
_MACHINE_ID=d3490e076ab24968bfa19a6aab26beb3
_HOSTNAME=H
_RUNTIME_SCOPE=system
_TRANSPORT=stdout
PRIORITY=6
_PID=2668
_STREAM_ID=3f9b8855636041988d003a9c63379b8a
SYSLOG_IDENTIFIER=echo
MESSAGE=foo
```
As you can see, there is no unit identifier.
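For comparison, matching on the syslog identifier from the entry above does find it:
```
# Both of these match stdout/stderr entries by identifier rather than by unit:
journalctl SYSLOG_IDENTIFIER=echo
journalctl -t echo
```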
This reverts commit 60d064d3fd.
The logged test failure was caused by a missing memory controller in the
testing cgroup. With the test fixed in the previous commit, memory
attributes are delegated as expected.
Ref: #32439
When the test is run on a distro that doesn't enable memory
accounting by default (such as openSUSE Tumbleweed), there is no guarantee that
the testing unit has memory.* cgroup attributes, and the delegation test would
fail if they are missing.
Require the memory controller explicitly inside the unit so that the test can
work in any environment.
In varlink.c we generally do not make failing callback functions fatal,
since that should be up to the app. Hence, in case of varlinkctl (where
we want failures to be fatal), make sure to propagate the error back
explicitly.
Before this change a failing call to "varlinkctl --more call …" would result in
a zero exit code. With this it will correctly exit with a non-zero exit
code.
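A sketch of the behaviour difference (the interface, method and arguments are placeholders):
```
# Before: the command printed the error but still exited 0.
# After: the error is propagated and the exit code is non-zero.
varlinkctl --more call /run/example/io.example.Service io.example.Fail '{}'
echo "exit code: $?"
```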
Recently, for slow test environments, journalctl --sync was added to the
loop in the timeout. However, journalctl --sync may be slow on such systems,
and the timeout was easily triggered during syncing.
Hopefully, reading the journal with --follow and grepping the output for an
expected line should be efficient.
Hopefully fixes #32712.
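A minimal sketch of the pattern (unit name and message are placeholders):
```
# Block (under the surrounding timeout) until the expected line appears,
# without forcing a journal sync first:
journalctl -u foo.service --follow --no-pager | grep -m 1 'expected message'
```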
On running cryptsetup, udevd detects two inotify events for the
underlying device. When the test runs on a fast enough host, the expected
symlinks based on UUID and disk label are created by the second event.
While processing a uevent for a device, udevd disables the inotify
watch for the device. If the test runs on a slow system, the second
inotify event may come while a udev worker is processing the synthesized
uevent triggered by the first inotify event. Hence, no synthesized
uevent for the second inotify event will be generated, and the expected
symlinks will never be created.
To prevent the issue, we need to lock the device while the cryptsetup
command is running.
Fixes #32913.
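A hedged sketch of what such locking could look like with udevadm lock (the device path and cryptsetup arguments are placeholders; whether the test uses udevadm lock or a raw flock is not stated here):
```
# Hold the per-device lock while cryptsetup modifies the device, so udevd
# does not react to the intermediate inotify events:
udevadm lock --device=/dev/vda1 \
    cryptsetup luksFormat --batch-mode /dev/vda1 /tmp/passphrase
```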
Otherwise, when stopping the service, the last command may not have been
started yet, and the service manager may not send the SIGTERM signal to the
last command, but instead send SIGKILL on timeout.
===
May 21 08:23:24 test19-exit-cgroup.sh[437]: + disown
May 21 08:23:24 test19-exit-cgroup.sh[438]: + sleep infinity
May 21 08:23:24 test19-exit-cgroup.sh[437]: + systemd-notify --ready
May 21 08:23:24 test19-exit-cgroup.sh[437]: + sleep infinity
May 21 08:23:24 test19-exit-cgroup.sh[441]: + systemctl stop one
May 21 08:23:24 test19-exit-cgroup.sh[443]: + sleep infinity
(snip)
May 21 08:23:24 systemd[1]: one.service: Changed running -> stop-sigterm
May 21 08:23:24 systemd[1]: Stopping one.service - /tmp/test19-exit-cgroup.sh "systemctl stop one"...
May 21 08:23:24 systemd[1]: Received SIGCHLD from PID 441 (systemctl).
May 21 08:23:24 systemd[1]: Child 437 (bash) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 437 belongs to one.service.
May 21 08:23:24 systemd[1]: one.service: Main process exited, code=killed, status=15/TERM (success)
May 21 08:23:24 systemd[1]: Child 439 (bash) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 439 belongs to one.service.
May 21 08:23:24 systemd[1]: Child 441 (systemctl) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 441 belongs to one.service.
May 21 08:23:24 systemd[1]: Child 442 (bash) died (code=killed, status=15/TERM)
May 21 08:23:24 systemd[1]: one.service: Child 442 belongs to one.service.
(snip)
May 21 08:24:54 systemd[1]: one.service: State 'stop-sigterm' timed out. Killing.
May 21 08:24:54 systemd[1]: one.service: Killing process 443 (sleep) with signal SIGKILL.
May 21 08:24:54 systemd[1]: one.service: Changed stop-sigterm -> stop-sigkill
May 21 08:24:54 systemd[1]: Received SIGCHLD from PID 443 (sleep).
May 21 08:24:54 systemd[1]: Child 443 (sleep) died (code=killed, status=9/KILL)
May 21 08:24:54 systemd[1]: one.service: Child 443 belongs to one.service.
May 21 08:24:54 systemd[1]: one.service: Control group is empty.
May 21 08:24:54 systemd[1]: one.service: Failed with result 'timeout'.
May 21 08:24:54 systemd[1]: one.service: Service restart not allowed.
May 21 08:24:54 systemd[1]: one.service: Changed stop-sigkill -> failed
May 21 08:24:54 systemd[1]: one.service: Job 738 one.service/stop finished, result=done
May 21 08:24:54 systemd[1]: Stopped one.service - /tmp/test19-exit-cgroup.sh "systemctl stop one".
May 21 08:24:54 systemd[1]: one.service: Unit entered failed state.
May 21 08:24:54 systemd[1]: one.service: Releasing resources...
===
Fixes #32947.
Fixes https://github.com/systemd/systemd/issues/32680#issuecomment-2120974685.
===
May 21 02:45:08 TEST-74-AUX-UTILS.sh[2475]: + mountpoint /tmp/tmp.eaRV7lSbX2/mnt
May 21 02:45:08 TEST-74-AUX-UTILS.sh[2476]: /tmp/tmp.eaRV7lSbX2/mnt is not a mountpoint
May 21 02:45:08 TEST-74-AUX-UTILS.sh[2449]: + systemd-mount /dev/loop0 /tmp/tmp.eaRV7lSbX2/mnt
May 21 02:45:08 systemd-mount[2477]: Failed to start transient mount unit: Unit tmp-tmp.eaRV7lSbX2-mnt.mount was already loaded or has a fragment file.
===
systemd-analyze runs the generators in a sandbox, which makes gcov
unhappy since it can't update its counters. Let's "silence" gcov in this
particular case by telling it to look for gcov note files in /tmp (where
there shouldn't be any, so gcov won't try to update any counters).
Fixes the following failure:
===
May 17 04:12:04 TEST-74-AUX-UTILS.sh[2684]: + systemd-mount --owner=testuser /dev/loop0 /tmp/tmp.DVQdo2ou53/mnt
(snip)
May 17 04:15:04 systemd[1]: dev-loop0.device: Job dev-loop0.device/start timed out.
May 17 04:15:04 systemd[1]: dev-loop0.device: Job 5812 dev-loop0.device/start finished, result=timeout
May 17 04:15:04 systemd[1]: Timed out waiting for device dev-loop0.device - /dev/loop0.
May 17 04:15:04 systemd[1]: tmp-tmp.DVQdo2ou53-mnt.mount: Job 5804 tmp-tmp.DVQdo2ou53-mnt.mount/start finished, result=dependency
May 17 04:15:04 systemd[1]: Dependency failed for tmp-tmp.DVQdo2ou53-mnt.mount - /tmp/tmp.DVQdo2ou53/mnt.
May 17 04:15:04 systemd[1]: tmp-tmp.DVQdo2ou53-mnt.mount: Job tmp-tmp.DVQdo2ou53-mnt.mount/start failed with result 'dependency'.
May 17 04:15:04 systemd[1]: systemd-fsck@dev-loop0.service: Job 5805 systemd-fsck@dev-loop0.service/start finished, result=dependency
May 17 04:15:04 systemd[1]: Dependency failed for systemd-fsck@dev-loop0.service - File System Check on /dev/loop0.
May 17 04:15:04 systemd[1]: systemd-fsck@dev-loop0.service: Job systemd-fsck@dev-loop0.service/start failed with result 'dependency'.
May 17 04:15:04 systemd[1]: dev-loop0.device: Job dev-loop0.device/start failed with result 'timeout'.
(snip)
May 17 04:15:04 systemd-mount[2856]: A dependency job for tmp-tmp.DVQdo2ou53-mnt.mount failed. See 'journalctl -xe' for details.
In mkosi, we run the test inside the VM instead of outside. To simplify
the implementation we drop the reboot part and only verify that we can
schedule and cancel shutdowns and that the wall messages are sent as
expected.
Encrypted /var is skipped because meson's limitations make per test
images not really feasible and we can't encrypt /var by default because
it slows down the image build too much.
Co-authored-by: Richard Maw <richard.maw@codethink.co.uk>
For manager test runs, the generator output paths are located in
/tmp, which means that if we mount a private /tmp for generators,
we lose all the generated units (actually the generators will just
fail because the directories don't exist, but if they did exist,
we'd still lose all the units).
Let's avoid the problem by skipping the private /tmp for manager
test runs. This also avoids any possible privilege issues with
mounting a private /tmp that might happen in this scenario.
On Debian/Ubuntu, the unit is named tgt.service instead of tgtd.service,
so let's make sure we take that into account.
On CentOS, tgtd.service is not available, so let's skip the test if we
can't find the service.
Let's remove the unneeded NotifyAccess=all and start the socket
and service in the test itself instead of via the service unit. This
makes the test unit identical to the other test units which will allow
us to autogenerate it in a later commit.
Having these named differently than the test itself mostly creates
unnecessary confusion and makes writing logic against the tests harder,
so let's rename the testsuite-xx units and scripts to just use the
test name itself.
Currently, on soft-reboot, /run/credentials/@system is unmounted
because it has DefaultDependencies=yes and as such will have
Conflicts=umount.target and Before=umount.target. Let's make sure
credential mounts survive soft-reboot by implying DefaultDependencies=no
for credential mounts.
A fixed name is too rigid; let's give users the ability to define
custom drop-in names, which at the same time also allows defining
multiple drop-ins per unit.
We use ~ as the separator because:
- ':' is not allowed in credential names
- '=' is used to separate credential from value in mkosi's --credential
argument.
- '-' is commonly used in filenames
- '@' already has meaning as the unit template specifier which might be
confusing when adding dropins for template units
Otherwise, at this stage, the interface may be in e.g. initialized or
pending state, and the drop-in file introduced by the previous command
may not have been registered in the state file for the interface.
Fixes #32685.
We already changed logs-filtering.service to sleep 2 seconds before
exiting to combat flakiness, let's do the same for the delegated
cgroup filtering payload.
Fixes #32696 (hopefully)
Let's run mkfs on the file we create instead of the loop device, and
let's use udevadm wait --settle to wait for udev to settle before
doing anything with the loop device.
Fixes #32680 (hopefully)
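Roughly the intended pattern (paths are placeholders):
```
# Run mkfs on the backing file, then attach it and wait for udev to settle:
truncate -s 64M /tmp/testimg
mkfs.ext4 -q /tmp/testimg
LOOPDEV=$(losetup --show -f /tmp/testimg)
udevadm wait --settle "$LOOPDEV"
```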
Currently test-aux-scope.service can get killed by the test before
it's had a chance to set up its signal handler. Make it Type=notify
to fix the race.
Fixes #32670 (hopefully)
For some reason this fails on ext4 with "No space left on device".
Until we figure out why, let's skip the test on ext4 (which is reported
as ext2/ext3 by stat).
This module is built in on Ubuntu, causing the test to fail. Let's
just use dummy instead. I tried replacing it with scsi_debug, but
that caused issues with modprobe complaining it could not remove
scsi_debug because it was in use.
When root authorized keys are provided by mkosi, they are not
newline-terminated, so appending a public key to the file results
in a corrupt key; just to be safe we add an empty line.
Depending on host configuration this may or may not be included (e.g.
on mkosi we get a result without an ifindex field). Let's strip it from
the resolved reply to avoid failing the test.
Other distributions may be able to install SELinux,
but they are not expected to use it.
We test for the distribution rather than whether SELinux is enabled
because it is expected to work on CentOS and Fedora,
and we want it to fail noisily.
TEST-26-SYSTEMCTL is racy as we call systemctl is-active immediately
after systemctl kill. Let's implement --wait for systemctl kill and
use it in TEST-26-SYSTEMCTL to avoid the race.
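A sketch of the difference (the service name is a placeholder):
```
# Racy: is-active may run before the signalled process has actually exited.
systemctl kill foo.service
systemctl is-active foo.service

# With the new flag, kill only returns once the signalled processes are gone:
systemctl kill --wait foo.service
systemctl is-active foo.service
```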
On openSUSE, systemd-hostnamed does not fail and is unloaded, which
causes reset-failed to fail. So let's ignore any errors from reset-failed
to make the test more robust.
The rootfs only has 64K UIDs available when booting with virtiofs,
whereas the nspawn tests want to use user namespaces, which require
more than 64K UIDs.
Let's use oneshot services as we don't need long running services
for the tests we're doing. Let's also increase the sleeps a little
as the current values weren't sufficient when running the test locally
on my machine with mkosi.
If 3 lock messages get sent when going to sleep, then we can falsely
assume we have woken up if we only check for at least two, so checking
that we have more than we did before sleeping addresses that issue.
OpenSUSE images seem to be unhappy with either how they are built
or what they are being asked to do.
The listed device-mapper failure is just one of the strange errors,
I have also seen it fail to propagate cgroup properties into new cgroups
that were previously guaranteed to exist.
This commit adds definitions to build the minimal_0 and minimal_1
images with mkosi and includes them into the system image. We also
move the building of the various app-xxx and similar images that are
extremely minimal into the tests themselves by moving the related logic
from install_verity_minimal() into a new function
install_extension_images() in util.sh. Because the mkosi /usr is
read-only, we now place the extension images in /tmp instead of
/usr/share.
Co-authored-by: Richard Maw <richard.maw@codethink.co.uk>
Co-authored-by: sam-leonard-ct <sam.leonard@codethink.co.uk>
Otherwise we lose valuable logging from systemd-executor when things
go wrong since it can only log to the journal and not to the console
in these cases.
Also: rename Handover → Handoff. I think it makes it clearer that this
is not really about handing over any resources, but that the executor is
out of the game from that point on.
Currently, a large amount of unit test output is logged directly
to the console instead of to the per-test log file, as any subprocesses
executed by a test manager will detect that stderr is not connected
to the journal and log directly to /dev/console instead.
To solve this issue, let's make sure all tests are connected directly
to the journal by running them with systemd-run. We also simplify the
entire test script by getting rid of the custom queue and replicating
it with xargs instead. By using bash's function export feature, we can
make our run_test() function available to the bash subprocess spawned
by xargs.
Once a test is finished, we read its logs from the journal and put them
in the appropriate file if needed.
When starting a container with --user, the new uid will be resolved and switched to
only in the inner child, at the end of the setup, by spawning getent. But the
credentials are set up in the outer child, long before the user is resolvable,
and the directories/files are made only readable by root and read-only, which
means they cannot be changed later and made visible to the user.
When this particular combination is specified, it is obvious the caller wants
the single-process container to be able to use credentials, so make them world
readable only in that specific case.
Fixes https://github.com/systemd/systemd/issues/31794
Enable the exec_fd logic for Type=notify* services too, and change it
to send a timestamp instead of a '1' byte. Record the timestamp in a
new ExecMainHandoverTimestamp property so that users can track accurately
when control is handed over from systemd to the service payload, so
that latency and startup performance can be trivially and accurately
tracked and attributed.
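For example (the service name is a placeholder), the new property can be read back with:
```
# Compare fork time vs. the moment control was handed off to the payload:
systemctl show foo.service -p ExecMainStartTimestamp -p ExecMainHandoverTimestamp
```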
Resolve at attach/detach/inspect time, so that the image is pinned and requires
re-attaching on update. Given that files are extracted from it, just passing
img.v/ to RootImage= is not enough to get a portable image updated.
I was bitten several times by testing things only with the --root flag, so this
commit prepares the existing test cases to run on / too. This required the test
cases to clean up after themselves, thus I have put each test case in a
separate subshell and used traps to do the cleanups.
I needed to change the hierarchy used by the test extension to /opt, because
unmounting /usr often failed with EBUSY.
Let's rework the test a bit, namely:
- condense the code a bit
- drop unnecessary braces around variables
- drop unnecessary explanations around `touch` calls
- drop/rename functions to make the code more self-explanatory
- simplify cleanup a bit
- create R/O bind mounts directly (supported since util-linux 2.27)
Previously, 'udevadm control' only checked the number of arguments.
So, if only `--timeout` was specified, it spuriously did nothing and succeeded.
This makes the command require at least one control command.
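For illustration:
```
# Used to do nothing and succeed; now rejected, as no control command is given:
udevadm control --timeout=30
# Still fine, since an actual control command is present:
udevadm control --timeout=30 --log-level=debug
```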
Unfortunately bfd30e8af6 is not enough, and the test failures that still
occasionally occur don't provide enough information to see what's
wrong. Let's rework the test a little to improve this, namely:
- redirect curl's output into a temporary file instead of piping it
directly into the "check" expression; that way we can simply dump
the temporary file when the test fails, providing potentially
crucial information. We don't want to always dump everything to
stdout, as some of the tests request an entire system journal (note
that shell redirection instead of `curl -o file` is used
intentionally, so the output file is always nuked first)
- by dropping the pipes in curl commands we can re-enable pipefail
- also, split some very long commands to multiple lines to (slightly)
improve readability
Follow-up for bfd30e8af6.
The timeout on sd-resolved's side is 5-10s (UDP or TCP), but dig's
default timeout is 5s. Let's give sd-resolved enough time to time out
before either giving up or checking whether it served stale data on dig's
side.
Resolves: #31639
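Roughly, something along these lines on dig's side (the queried name is a placeholder):
```
# Give resolved's 5-10s upstream timeout a chance to elapse before dig gives up:
dig +timeout=15 @127.0.0.53 stale.example.com
```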
I collected a couple of fails in this particular test, but without any
output they're impossible to debug. Let's make this slightly less
annoying and let curl show an error (if any) even in silent mode.
This patch uncovers that curl has been (silently) complaining about not
being able to write to the output destination, because `grep -q`
short-circuits on the first match and doesn't bother reading the rest,
so replace `grep -q` with `grep ... >/dev/null` to force grep to always
read the whole thing from curl.
Prep work for running the integration tests with meson, which requires
tests to exit with 77 to indicate they are skipped.
Note this only deals with the easy cases where whole tests are skipped. The
hard ones, where there are subtests of which only some are skipped, are left
for another PR.
With plain QEMU on a saturated AWS region we might just barely miss the
timeout window, causing unexpected test fails:
[ 688.681324] systemd-nspawn[1332]: [ OK ] Finished systemd-user-sessions.service.
[ 689.451267] systemd-nspawn[1332]: [ OK ] Started console-getty.service.
[ 689.572874] systemd-nspawn[1332]: [ OK ] Reached target getty.target.
[ 693.634609] testsuite-74.sh[1223]: + at_exit
[ 693.634609] testsuite-74.sh[1223]: + rm -fv -- /tmp/test-dump /tmp/test-usr-dump /tmp/make-dump
[ 693.838395] testsuite-74.sh[1502]: removed '/tmp/test-dump'
[ 693.838395] testsuite-74.sh[1502]: removed '/tmp/test-usr-dump'
[ 693.838395] testsuite-74.sh[1502]: removed '/tmp/make-dump'
[ 693.951114] testsuite-74.sh[670]: + echo 'Subtest /usr/lib/systemd/tests/testdata/units/testsuite-74.coredump.sh failed'
[ 693.951114] testsuite-74.sh[670]: Subtest /usr/lib/systemd/tests/testdata/units/testsuite-74.coredump.sh failed
[ 693.951114] testsuite-74.sh[670]: + return 1
[ 694.659094] systemd[1]: testsuite-74.service: Main process exited, code=exited, status=1/FAILURE
[ 694.719563] systemd[1]: testsuite-74.service: Failed with result 'exit-code'.
[ 694.882069] systemd[1]: Failed to start testsuite-74.service.
[ 695.574445] systemd[1]: Reached target testsuite.target.
[ 696.174844] systemd[1]: Starting end.service...
[ 699.509408] systemd-nspawn[1332]:
[ 699.509408] systemd-nspawn[1332]: CentOS Stream 9
[ 699.509408] systemd-nspawn[1332]: Kernel 5.14.0-432.el9.x86_64 on an x86_64 (pts/0)
[ 699.509408] systemd-nspawn[1332]:
Also, move the rest of the container setup for the user xattrs test into
the condition, since doing it without the actual test is pretty
pointless.
Same reason as for reload: reexec is disruptive and requires the
same privileges, so if somebody wants to limit reloads, they'll also
want to limit reexecs; hence use the same setting.
Previously, 'udevadm test' not only processed udev rules,
but also made several destructive changes to the system: updating the udev
database, device node permissions, devlinks, network interface
properties, and so on.
Similarly, 'udevadm test-builtin' may do something destructive,
especially via the 'keyboard', 'kmod', and 'net_setup_link' builtins.
Let's make these commands and the test executables not change device
configurations.
When listing images, they are inspected one by one, so in the case of a
portable with extensions, they were always reported as not found.
Allow a partial match when listing, so that we can find the appropriate
unit that an image belongs to, and list the correct state as attached.
Currently app_1.0.raw is refused if it contains extension-release.d/extension-release.app,
which stops one from using versioned images without using the force flag to disable
the check. Relax it so that only the actual name, and not the version, is compared, as
already happens in other places.
This fixes a race condition crash in homed that would happen in the
following sequence of events:
1. Client 1 takes a ref on the home area
2. Client 1 calls some method via dbus
3. Client 2 calls Release()
In homed, the Release() would check if a ref is still held (in this
case: yes it is) and returns an error. Except that is done through a
code-path that asserts that no operations are ongoing. In this case,
it's valid to have an ongoing operation, and so the assertion fails
causing homed to crash.
When sd-run connects to D-Bus rather than the private socket, it will
generate the transient unit name using the bus ID assigned by the D-Bus
broker/daemon. The issue is that this ID is only unique per D-Bus run:
if the broker/daemon restarts, it starts again from 1, and it's a simple
incremental counter for each client.
So if a transient unit run-u6.service starts and fails, and it is not
collected (default on failure), and the system soft-reboots, any new
transient unit might conflict as the counter will restart:
Failed to start transient service unit: Unit run-u6.service was already loaded or has a fragment file.
Get the soft-reboot counter, and if it's greater than zero, append it
to the autogenerated unit name to avoid clashes.
losetup in util-linux 2.40 started reporting lost loop devices [0] and
it has an unfortunate side-effect where it reports lost devices even in
containers, which then makes the loop device check "falsely" pass [1].
Let's just check for /dev/loop-control explicitly to "work around" this.
[0] a6ca0456cc
[1] https://github.com/util-linux/util-linux/issues/2824
Drop connections and caches and reload config from files, to allow
for low-interruption updates, and hook it up to the usual SIGHUP and
ExecReload=. Mark servers and services configured directly via D-Bus
so that they can be kept around, and only the configuration file
settings are dropped and reloaded.
Fixes https://github.com/systemd/systemd/issues/17503
Fixes https://github.com/systemd/systemd/issues/20604
This patch fixes an issue where, when not specifying at least one
`SocketBindAllow` or `SocketBindDeny` rule, the behavior of the bind syscall
filtering would be unexpected.
For example, when trying to bind to a port with only "SocketBindDeny=any"
given, the syscall would succeed:
> systemd-run -t -p "SocketBindDeny=any" nc -l 8080
Expected with this set of rules (also in accordance with the documentation)
would be an Operation not permitted error.
This behavior occurs because a default-initialized socket_bind_rule struct
matches what "any" represents. When creating the BPF list, all elements get
default-initialized, as such representing "any". It also seems necessary
to set the size of the map to at least one; as such, if no allow rule is
given, default initialization and the minimal map size cause one "any" allow rule
to be in the map, causing the behavior observed above.
This patch solves this by introducing a new "match nothing" magic stored in
the rule's address family and setting such a rule as the first one if no
rule is given, making sure that default-initialized rule structs are never
used.
Resolves #30556
Rate limiting authentication attempts in the test can cause somewhat
sporadic test failures: adding a test case might suddenly cause future
test cases to fail because of too many authentication attempts too
quickly.
We're not trying to test the rate-limiting, we're trying to test the
functionality of homed. So we effectively disable rate-limiting on all
the home areas we create.
This makes it possible to update a home record (and blob directory) of a
home area that's either completely absent (i.e. on a USB stick that's
unplugged) or just inaccessible due to lack of authentication.
This bypasses authentication (i.e. user_record_authenticate) if the
volume key was loaded from the keyring and no secret section is
provided.
This also changes Update() and Resize() to always try to load the
volume key from the keyring. This makes the secret section optional for
these methods while still letting them function (as long as the home
area is active).
Naming is always a matter of preference, and the old name would certainly work,
but I think the new one has the following advantages:
- A verb is better than a noun.
- The name is more similar to "the competition", i.e. 'sudo', 'pkexec', 'runas',
'doas', which generally include an action verb.
- The connection between 'systemd-run' and 'run0' is more obvious.
There has been no release yet with the old name, so we can rename without
caring for backwards compatibility.
A dig question with DNSSEC on will now be proxied upstream, i.e. to the
test knot server. This leads to different results, but the result isn't
that interesting since we don't want to test knot, but resolved. Hence
comment out this test.
There seems to be something wrong with the test though, as the upstream
server refused recursion, but if so it is not really suitable as an upstream
server, as resolved can only be a client to a recursive resolver.
By default socat opens a separate r/w channel for each specified address,
and terminates the connection 0.5s after receiving EOF on _either_
side. And since one side of that connection is an empty stdin, we reach
that EOF pretty quickly. Let's avoid this by using socat in
"reversed unidirectional" mode, where the first address is used only for
writing, and the second one is used only for reading.
Addresses:
- https://github.com/systemd/systemd/issues/31500
- https://github.com/systemd/systemd/issues/31493
Follow-up for 3456c89ac2.
This reverts commit 5e8ff010a1.
This broke all the URLs, we can't have that. (And actually, we probably don't
_want_ to make the change either. It's nicer to have all the pages in one
directory, so one doesn't have to figure out to which collection the page
belongs.)
The follow-up commit will refactor some code in systemd-sysext, so add some
tests to make sure that things didn't break. The tests will later be extended
with cases for new features.
This is inspired by one of our internal tests that does pretty much the
same thing. However, it is slightly more convoluted than I'd like it to
be: I really don't want to duplicate the list of our units in
another place, so we need to somehow pass the list from the meson file
to the test script. I originally envisioned this to be a part of the
unit test suite, but this doesn't work for unit files with absolute
paths to binaries, as we'd have to install the build first (maybe using
a chroot would work?).
It doesn't check man pages (since they might not be installed on the
test machine) and also skips recursive dependencies (as that would trip
over issues in files that are not under our direct control), but it
should still cover typos and such.
There are currently two units for which the check had to be disabled:
syslog.socket, as the corresponding syslog.service might not be
installed, and rc-local.service, as that's a compat API and the necessary
/etc/rc.d/rc.local file may not (and most likely won't) be present.
Follow-ups for e1634bb832.
- Allow calling the method without "name" and "type".
- Allow specifying SD_RESOLVE_NO_TXT and SD_RESOLVE_NO_ADDRESS.
- Allow providing multiple services, and fix a memory leak.
- Rearrange the return value format.
- Encode the TXT field with octescape() to make the field match the
io.systemd.Resolve.Monitor interface.
Fixes #31371.
Follow-up for 4d8b0f0f7a
After the mentioned commit, when the ExecCommand executable is missing
and the failure will be ignored by the manager, we exit with EXIT_SUCCESS on the
executor side too. This behavior however contradicts systemd.service(5), which states:
> If the executable path is prefixed with "-", an exit code of the command
> normally considered a failure (i.e. non-zero exit status or abnormal exit
> due to signal) is _recorded_, but has no further effect and is considered
> equivalent to success.
and thus makes debugging unexpected failures harder. Therefore, let's still
exit with EXIT_EXEC, but just skip the LOG_ERR level log.
I (incorrectly) assumed that --relinquish-var does everything --flush
does, including moving already existing stuff from /var/log/journal/ to
/run/log/journal/, but that's not the case. To actually do that we need
to shuffle things manually, so let's do just that.
This should make issues like #31334 easier to debug, since with this
patch we now have a coredump in the test journal as well:
~# make -C test/TEST-04-JOURNAL/ clean setup run TEST_MATCH_SUBTEST=bsod BUILD_DIR=$PWD/build TEST_NO_NSPAWN=1
...
[ 12.176089] testsuite-04.sh[712]: + echo 'Subtest /usr/lib/systemd/tests/testdata/units/testsuite-04.bsod.sh failed'
[ 12.176089] testsuite-04.sh[712]: Subtest /usr/lib/systemd/tests/testdata/units/testsuite-04.bsod.sh failed
[ 12.176089] testsuite-04.sh[712]: + return 1
[ 12.177347] systemd[1]: testsuite-04.service: Failed with result 'exit-code'.
[ 12.220580] systemd[1]: Failed to start testsuite-04.service.
Spawning getter /home/mrc0mmand/repos/@systemd/systemd/build/journalctl -o export -D /var/tmp/systemd-tests/systemd-test.Qtqmmr/root/var/log/journal...
Finishing after writing 7649 entries
TEST-04-JOURNAL: (failed; see logs)
-rw-r----- 1 root root 16777216 Feb 15 21:13 /var/tmp/systemd-tests/systemd-test.Qtqmmr/system.journal
...
~# coredumpctl --file /var/tmp/systemd-tests/systemd-test.Qtqmmr/system.journal
TIME PID UID GID SIG COREFILE EXE SIZE
Thu 2024-02-15 21:13:38 CET 812 0 0 SIGABRT journal /usr/lib/systemd/systemd-bsod -
This commit adds a new way of forwarding journal messages: forwarding
over a socket.
The socket can be any of AF_INET, AF_INET6, AF_UNIX or AF_VSOCK.
The address to connect to is retrieved from the "journald.forward_address" credential.
It can also be specified in systemd-journald's unit file with ForwardAddress=.
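A hedged sketch of the configuration side; ForwardAddress= is the option named above, but the exact address syntax shown here (an AF_VSOCK address) is an assumption:
```
# /etc/systemd/journald.conf.d/forward.conf (illustrative path)
[Journal]
# Address syntax is an assumption; adjust for AF_INET/AF_INET6/AF_UNIX/AF_VSOCK.
ForwardAddress=vsock:2:514
```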
The owneridmap bind option will map the target directory owner from inside the
container to the owner of the directory bound from the host filesystem.
This ensures that files and directories created in the container will be owned
by the directory owner of the host filesystem. All other users will remain
unmapped. Writing files as other users in the container will not be
allowed.
Resolves: #27037
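A minimal sketch (paths are placeholders):
```
# Map the in-container owner of /data to the host owner of /srv/appdata:
systemd-nspawn -D /var/lib/machines/mycontainer --bind=/srv/appdata:/data:owneridmap
```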
For the other verbs turning off JSON mode makes sense, but for "call"
not so much; after all, the contents of a method call reply are JSON which we
couldn't really show any other way.
Hence, when JSON output was not configured otherwise for "call", default
to the same as -j.
I think this was just overlooked in #13754, which removed
the restriction of Restart= on Type=oneshot services.
There's no reason to prevent RestartForceExitStatus=
now that Restart= has been allowed.
Closes #31148
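A hedged example of what is now accepted (the unit content is made up):
```
[Service]
Type=oneshot
ExecStart=/usr/local/bin/sync-job
# Previously rejected for oneshot services; now allowed, restarting the
# service only when it exits with status 75.
RestartForceExitStatus=75
```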
When we're running with sanitizers, sd-executor might pull in a
significant chunk of shared libraries on startup, which can cause a lot
of memory pressure and put us at the front when sd-oomd decides to go on
a killing spree. This is exacerbated further on Arch Linux when built
with gcc, as Arch ships unstripped gcc-libs so sd-executor pulls in over
30M of additional shared libs on startup:
~# lddtree build-san/systemd-executor
build-san/systemd-executor (interpreter => /lib64/ld-linux-x86-64.so.2)
libasan.so.8 => /usr/lib/libasan.so.8
libstdc++.so.6 => /usr/lib/libstdc++.so.6
libm.so.6 => /usr/lib/libm.so.6
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1
libsystemd-core-255.so => /root/systemd/build-san/src/core/libsystemd-core-255.so
libaudit.so.1 => /usr/lib/libaudit.so.1
libcap-ng.so.0 => /usr/lib/libcap-ng.so.0
...
libseccomp.so.2 => /usr/lib/libseccomp.so.2
libubsan.so.1 => /usr/lib/libubsan.so.1
libc.so.6 => /usr/lib/libc.so.6
~# ls -Llh /usr/lib/libasan.so.8 /usr/lib/libstdc++.so.6 /usr/lib/libubsan.so.1
-rwxr-xr-x 1 root root 9.7M Feb 2 10:36 /usr/lib/libasan.so.8
-rwxr-xr-x 1 root root 21M Feb 2 10:36 /usr/lib/libstdc++.so.6
-rwxr-xr-x 1 root root 3.2M Feb 2 10:36 /usr/lib/libubsan.so.1
Sanitized libsystemd-core.so is also quite big:
~# ls -Llh /root/systemd/build-san/src/core/libsystemd-core-255.so /usr/lib/systemd/libsystemd-core-255.so
-rwxr-xr-x 1 root root 26M Feb 8 19:04 /root/systemd/build-san/src/core/libsystemd-core-255.so
-rwxr-xr-x 1 root root 5.9M Feb 7 12:03 /usr/lib/systemd/libsystemd-core-255.so
... i.e. apply nested config (exclusions and such) when executing R and D.
This fixes a long-standing RFE. The existing logic seems to have been an
accident of implementation. After all, if somebody specifies a config with
'R /foo; x /tmp/bar', then probably the goal is to remove stuff from under /foo,
but keep /tmp/bar. If they just wanted to nuke everything, they would not specify
the second item.
This also makes R and D use O_NOATIME, i.e. the access times of the directories
that are accessed will not be changed by the cleanup.
Obviously, we'll have to add this to NEWS and such.
Looking at the whole tmpfiles.d config in Fedora, this change has no effect.
The test cases are adjusted as appropriate. I also added another test case for
'R'/'D' with a file, just to test this code path more.
Replaces #20641.
Fixes #1633.
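An illustrative tmpfiles.d snippet (paths are made up) of the behaviour this enables:
```
# 'x' exclusions are now honoured while 'R' recursively removes the tree:
R /var/tmp/myapp
x /var/tmp/myapp/state
```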
Let's allow configuring the debug tty independently of enabling/disabling
the debug shell. This allows mkosi to configure the correct tty while
leaving enabling/disabling the debug tty to the user.
This new mode copies resources provided by the client, so that they
remain available for inspect/detach even if the original images are
deleted, but symlinks the profile as that is owned by the OS, so that
updates are automatically applied.
fresh otherwise
Currently, exec_setup_credential() always rewrites all credentials
upon exec_invoke(), i.e. the invocation of each ExecCommand, and within
a single tmpfs instance. This is problematic though:
* When writing each tmp cred file, we essentially double the size
of the credential. Therefore, if one cred is bigger than half
of CREDENTIALS_TOTAL_SIZE_MAX, confusing ENOSPC occurs (see also
https://github.com/systemd/systemd/pull/24734#issuecomment-1925440546)
* A credential is a unit-wide thing and thus should not change
during the whole lifetime of the main process. However, if e.g.
an on-disk credential or SetCredential= in the unit file
changes between ExecStart= and ExecStartPost=,
the credentials are overwritten when the latter gets to run,
and the already-running main process suddenly sees
completely different creds.
So, let's try to reuse the final cred dir if the main process has started
and the tmpfs has been populated, so that the creds used are stable
across all ExecStart= and ExecStartPost= commands. We still want to retain
the ability to update creds through ExecStartPre= though, therefore
we forcibly use a fresh cred dir for those. 'Fresh' means we actually
unmount the old tmpfs first, so the first problem goes away, too.
It's easy to add. Let's do so.
This only covers record lookups, i.e. with the --type= switch.
The higher level lookups are not covered; I opted to print a
message there to use --type= instead.
I am a bit reluctant to defining a new JSON format for the high-level
lookups, hence I figured for now a helpful error is good enough, that
points people to the right use.
Fixes: #29755
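For example (the domain is a placeholder), something along these lines should now produce JSON output for a record lookup:
```
# JSON output for a record lookup via --type=:
resolvectl query --json=pretty --type=MX example.com
```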
Previously, unit_{start,stop,reload} would call the low-level cgroup
unfreeze function whenever a unit was started, stopped, or reloaded. It
did so with no error checking. This call would ultimately recurse up the
cgroup tree, and unfreeze all the parent cgroups of the unit, unless an
error occurred (in which case I have no idea what would happen...)
After the freeze/thaw rework in a previous commit, this can no longer
work. If we recursively thaw the parent cgroups of the unit, there may
be sibling units marked as PARENT_FROZEN which will no longer actually
have frozen parents. Fixing this is a lot more complicated than simply
disallowing start/stop/reload on a frozen unit.
Fixes https://github.com/systemd/systemd/issues/15849
This commit overhauls the way freeze/thaw works recursively:
First, it introduces new FreezerActions that are like the existing
FREEZE and THAW but indicate that the action was initiated by a parent
unit. We also refactored the code to pass these FreezerActions through
the whole call stack so that we can make use of them. FreezerState was
extended similarly, to be able to differentiate between a unit that's
frozen manually and a unit that's frozen because a parent is frozen.
Next, slices were changed to check recursively that all their child
units can be frozen before it attempts to freeze them. This is different
from the previous behavior, which would just check if the unit's type
supported freezing at all. This cleans up the code, and also ensures
that the behavior of slices corresponds to the unit's actual ability
to be frozen.
Next, we make it so that if you FREEZE a slice, it'll PARENT_FREEZE
all of its children. Similarly, if you THAW a slice it will PARENT_THAW
its children.
Finally, we use the new states available to us to refactor the code
that actually does the cgroup freezing. The code now looks at the unit's
existing freezer state and the action being requested, and decides what
next state is most appropriate. Then it puts the unit in that state.
For instance, a RUNNING unit with a request to PARENT_FREEZE will
put the unit into the PARENT_FREEZING state. As another example, a
FROZEN unit whose parent is also FROZEN will transition to
PARENT_FROZEN in response to a request to THAW.
Fixes https://github.com/systemd/systemd/issues/30640
Fixes https://github.com/systemd/systemd/issues/15850
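In terms of the user-visible commands (the slice name is a placeholder):
```
# Freezing a slice now parent-freezes all of its children; thawing undoes it:
systemctl freeze app.slice
systemctl thaw app.slice
```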
It never worked, but the fail was masked by missing set -e, see the
previous commit.
Also, throw env into the test container and dump the environment on
container start, to make potential failures easier to debug.
Previously, path units would remain in the running state while their
target unit is deactivating. This left a window of time where the target
unit is no longer operational (i.e. it is busy deactivating/cleaning
up/etc) but the path unit would continue to ignore inotify events. In
short: any inotify event that occurs while the target unit deactivates
would be completely lost.
With this commit, the path will go back into a waiting state when the
target unit starts deactivating. This means that any inotify event that
occurs while the target unit deactivates will queue a start job.
With newer versions of AppArmor, unprivileged user namespace creation
may be restricted by default, in which case user manager instances will
not be able to apply PrivateUsers=yes, which is implied by
PrivateTmp=yes in this systemd-run invocation.
With newer versions of AppArmor, unprivileged user namespace creation
may be restricted by default, in which case user manager instances will
not be able to apply PrivateUsers=yes (or the settings which require it).
This can be tested with the kernel.apparmor_restrict_unprivileged_userns
sysctl.
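The relevant knob can be inspected (and, for testing only, toggled) like this:
```
# Check whether AppArmor restricts unprivileged user namespace creation:
sysctl kernel.apparmor_restrict_unprivileged_userns
# For testing only; a value of 1 means the restriction is active:
sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
```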