IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
At least shim will choke on an empty relocation section when loading the
binary. Note that the binary is still considered relocatable (just with
no base relocations to apply) as we do not set the
IMAGE_FILE_RELOCS_STRIPPED DLL characteristic.
We want thawing operations to still succeed even in the presence of an
unfreezable unit type (e.g. mount) appearing under a slice after the
slice was frozen. The appearance of such units should never cause the
slice thawing operation to fail to prevent potential future repeats of
https://github.com/systemd/systemd/issues/25356.
The kernel, systemd, and many other things print their version during boot.
sd-boot and sd-stub are also important, so let's print the version if EFI_DEBUG.
(If !EFI_DEBUG, continue to be quiet.)
When updating the docs, I saw that that the text in HACKING.md was out of date.
Instead of trying to update the instructions there, make it shorter and refer
the reader to tools/debug-sd-boot.sh for details.
Recent gcc versions have started to trigger false positive
maybe-uninitialized warnings. Let's make sure we initialize
variables annotated with _cleanup_ to avoid these.
We can't dereference the variant object directly, as it might be
a magic object (which has an address on a faulting page); use
json_variant_is_sensitive() instead that handles this case.
For example, with an empty array:
==1547789==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000023 (pc 0x7fd616ca9a18 bp 0x7ffcba1dc7c0 sp 0x7ffcba1dc6d0 T0)
==1547789==The signal is caused by a READ memory access.
==1547789==Hint: address points to the zero page.
SCARINESS: 10 (null-deref)
#0 0x7fd616ca9a18 in json_variant_strv ../src/shared/json.c:2190
#1 0x408332 in oci_args ../src/nspawn/nspawn-oci.c:173
#2 0x7fd616cc09ce in json_dispatch ../src/shared/json.c:4400
#3 0x40addf in oci_process ../src/nspawn/nspawn-oci.c:428
#4 0x7fd616cc09ce in json_dispatch ../src/shared/json.c:4400
#5 0x41fef5 in oci_load ../src/nspawn/nspawn-oci.c:2187
#6 0x4061e4 in LLVMFuzzerTestOneInput ../src/nspawn/fuzz-nspawn-oci.c:23
#7 0x40691c in main ../src/fuzz/fuzz-main.c:50
#8 0x7fd61564a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
#9 0x7fd61564a5c8 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c8)
#10 0x405da4 in _start (/home/fsumsal/repos/@systemd/systemd/build-san/fuzz-nspawn-oci+0x405da4)
DEDUP_TOKEN: json_variant_strv--oci_args--json_dispatch
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ../src/shared/json.c:2190 in json_variant_strv
==1547789==ABORTING
Or with an empty string in an array:
../src/shared/json.c:2202:39: runtime error: member access within misaligned address 0x000000000007 for type 'struct JsonVariant', which requires 8 byte alignment
0x000000000007: note: pointer points here
<memory cannot be printed>
#0 0x7f35f4ca9bcf in json_variant_strv ../src/shared/json.c:2202
#1 0x408332 in oci_args ../src/nspawn/nspawn-oci.c:173
#2 0x7f35f4cc09ce in json_dispatch ../src/shared/json.c:4400
#3 0x40addf in oci_process ../src/nspawn/nspawn-oci.c:428
#4 0x7f35f4cc09ce in json_dispatch ../src/shared/json.c:4400
#5 0x41fef5 in oci_load ../src/nspawn/nspawn-oci.c:2187
#6 0x4061e4 in LLVMFuzzerTestOneInput ../src/nspawn/fuzz-nspawn-oci.c:23
#7 0x40691c in main ../src/fuzz/fuzz-main.c:50
#8 0x7f35f364a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
#9 0x7f35f364a5c8 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x275c8)
#10 0x405da4 in _start (/home/fsumsal/repos/@systemd/systemd/build-san/fuzz-nspawn-oci+0x405da4)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../src/shared/json.c:2202:39 in
Note: this happens only if json_variant_copy() in json_variant_set_source() fails.
Found by Nallocfuzz.
So here's the thing. One library we use (libselinux) is opening fds
behind our back when we initialize it and keeps it open. On the other
hand we want to automatically pick up all fds passed in to us, so that
we can distribute them to our services and close the rest. We pick them
up very early in our code, to ensure that we don't get confused by open
fds at that point. Except that libselinux insists on being initialized
even earlier. So suddenly we might take possession of libselinux' fds,
and then close them later when we decide no service wants them. Then
during shutdown we close down selinux and selinux closes its fds, but
since already closed long ago this ight close our fds instead. Hilarity
ensues.
I wish low-level software wouldn't do such things behind our back, but
well, let's make the best of it.
This changes the fd pick-up logic to only pick up fds that have
O_CLOEXEC unset. O_CLOEXEC must be unset for any fds passed in to us
over execve() after all. And for all our own fds we should set O_CLOEXEC
since we generally don't want to litter fd tables for execve(). Also,
libselinux thankfully appears to set O_CLOEXEC correctly on its fds,
hence the filter works.
Fixes: #27491
Follow-up for: #27734
It makes sense to propagate the select log level we maintain also into
glibc, so that any code that uses syslog() directly that ends up in our
processes (libraries and such) are affected by our settings the same way
as we are ourselves.
Let's make debugging udev triggering a bit easier, by generating debug
log messages whenever we trigger a device, and also when we see the
event in pid1.
We would create root account from sysusers or from firstboot, depending on
which one ran earlier. Since firstboot offers more options, in particular can
set the root password, we needed to order it earlier. This created an ugly
ordering requirement:
systemd-sysusers.service > systemd-firstboot.service > ... >
systemd-remount-fs.service > systemd-tmpfiles-setup-dev.service >
systemd-sysusers.service
We want sysusers.service to create basic users, so we can create nodes in dev,
so we can operate on block devices and such, so that we can resize and remount
things. But at the same time, systemd-firstboot.service can only work if it is
run early, before systemd-sysusers.service has created /etc/passwd. We can't
have it both ways: the units that want to have a fully writable root file
system cannot be ordered before units which are required to do file system
preparation.
Instead of trying to order firstboot very early, let's let it do its thing even
if it is started later. Instead of refusing to create to the root account if
/etc/passwd and /etc/shadow exist, actually check if the account is configured.
Now sysusers writes root account with password PASSWORD_UNPROVISIONED
("!unprovisioned"), and then firstboot checks for this, and will configure root
in this case.
This allows sysusers to be executed earlier (or accounts to be set up earlier
in another way).
This effectively reverts b825ab1a99.
Before 7cd43e34c5, it was possible to use
SYSTEMD_PROC_CMDLINE=systemd.condition-first-boot to override autodetection.
But now this doesn't work anymore, and it's useful to be able to do that for
testing.
We want to call systemd-tmpfiles-setup-dev.service to create /dev/fuse and
other device nodes so that module probing will work. But it is possible that
when we're in first boot, some users or groups need to be created by
systemd-sysusers first. But it is also possible that systemd-sysusers cannot
actually execute configuration because the root partition is not fully writable
yet. So let systemd-tmpfiles-setup-dev.service run earlier, possibly without
all users and groups in place. Since systemd-tmpfiles-setup-dev.service writes
to /dev only, it doesn't care how the root partition is mounted. In this early
run, some some nodes might be created with default permissions (i.e. not
accessible to non-root users or groups). This should be OK for the early boot
phase. Afterwards, we let systemd-tmpfiles-setup.service execute full
configuration. We will configure any files in /dev twice, but considering that
there's only a few of them and that the second run should only adjust ownership
and permissions, this should be OK. This way, we avoid the dependency loop.
If systemd-repart is running sufficiently early, /var/tmp might not be in place
yet. But if there is nothing to minimize, we won't even use it. Let's move the
check right before the first use.
systemd-repart[441]: Device '/' has no dm-crypt/dm-verity device, no need to look for…
systemd-repart[441]: Device /dev/sda opened and locked.
systemd-repart[441]: Sector size of device is 512 bytes. Using grain size of 4096.
systemd-repart[441]: Could not determine temporary directory: No such file or directory
systemd[1]: systemd-repart.service: Child 441 belongs to systemd-repart.service.
systemd[1]: systemd-repart.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: systemd-repart.service: Failed with result 'exit-code'.
Let's flat out refuse to configure machine-id on a running system with
systemd-firstboot. It wouldn't work anyway, because by the time firstboot is
started, pid1 has created /etc/machine-id, possibly with "unitialized", so
firstboot wouldn't touch the file. (If --force is specified, it works. So
let's allow that in case people want to do crazy things.)
While at it, add missing descriptions of various things that were added over
time, and group descriptions of similar options together.
As with other units, stopping of the automount requires actual work,
and without the ordering dependency systemd might not execute the stop
job before shutdown.target is reached and units ordered after that are
executed.
I have a system with /usr/lib/repart.d/50-root.conf with GrowFileSystem=yes.
The partition wouldn't be resized in the initrd, because
ConditionDirectoryNotEmpty=|/sysusr/usr/lib/repart.d was evaluated very early,
before /sysroot was mounted. There was no ordering dependency between
systemd-repart.service and sysroot.mount. (There was After=initrd-usr-fs.target,
but it seems to be only referred to by systemd-fstab-generator, which in my
case doesn't even run, because there's no fstab.)
But in fact, we neeed to run systemd-repart in the initrd only in limited
circumstances: when we need to create the root device based on config under
sysusr.mount. If there is config on the root device, it can be executed in
the host system, early during boot. Thus, let's remove the condition on
/sysroot/…. Without an ordering dependency on sysroot.mount, it was subject to
a race condition anyway. (A race condition with a low probability of "winning",
because systemd-repart.service has no dependencies, but sysroot.mount requires
a device to be detected and the mount to happen.)
The other problem was that systemd-repart.service didn't have the ordering wrt.
initrd-switch-root.target, so it was subject to the same race condition that
was fixed for other units in 7c0e2b5559. (If the
systemd-repart.service/stop job is slow, we could end up not restarting
systemd-repart.service in the host system.)
With the changes here, I see systemd-repart.service/start running twice:
in the initrd it is skipped because the conditions fail, and then in the
host system it runs normally.
Note: support for /sysroot is retained in systemd-repart code. I don't see a
strong reason to remove it, since it may still be useful to people invoking
repart in the initrd in other circumstances.
No functional change, just a cleanup to make the subsequent changes easier to
see. This is a continuation of 9810e41942
> The block is reordered and split to have:
> 1. description + documentation
> 2. (optionally) conditions
> 3. all the dependencies
The dependencies for shutdown.target are listed separately because they are the
other deps are for startup, and shutdown.target only matter much later.
Re-watching pids on cgroup v1 (needed because of unreliability of cgroup
empty notifications in containers) is handled bellow at the end of
service_sigchld_event() and depends on value main_pid_known flag.
In CentOS Stream 8 container on cgroup v1 the stop action would get stuck
indefinitely on unit like this,
$ cat /run/systemd/system/foo.service
[Service]
ExecStart=/bin/bash -c 'trap "nohup sleep 1 & exit 0" TERM; sleep infinity'
ExecStop=/bin/bash -c 'kill -s TERM $MAINPID'
TimeoutSec=0
However, upstream works "fine" because in upstream version of systemd we
actually never wait on processes killed in containers and proceed
immediately to sending SIGKILL hence re-watching of pids in the cgroup
is not necessary. But for the sake of correctness we should merge the
patch also upstream.
We ship with empty /var, so /var/log/journal does not exist, which
means journald does not do persistent logging. Let's fix that by
setting the config to explicitly enable persistent logging.
Since bash has no namespaces, let's do the second best thing and prefix
all "internal" stuff with an underscore, to minimize the chance of a name
conflict in the future.