IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This way we'll not add deps for the mount point that unmount it during
shutdown. This is similar as for /run/initramfs/ which we want to
transition into during shutdown.
This way we don't have to add "-o x-initrd.mount" to all bind mounts for
/run/nextroot anymore to make it survive the reboot, it will be implied.
Let's make sure implicitly that the target directory is a mount point,
instead of doing so manually beforehand. This allows us to drop this
step from the transition into the /run/initramfs/ dir at shutdown.
During the initrd→host transition the switch root operations so far
where towards pre-existing mount points, but there are cetrainly
usecases where it might make sense to siwtch into arbitrary
subdirectories, too.
Our shutdown binary that takes over as PID 1 when shutting down puts
great efforts into a sync() that comes with a time-out once sync'ing
process stops. If we'd add another dumb sync() here, we kinda defeat all
it is good for. Hence, let's keep the sync() in for most codepats, but
let's disable it for the final shutdown logic when we transition back
into the exitrd. After all we sync()ed more than enough here, no need to
sync() even more.
Let's replace the current boolean param with a proper flags param. With
a single flag this doesn't appear to make much sense, though it does
already make things more readable I think.
However, once we add a second flag, it starts to make more sense.
Also, while we are at it, condition the "istmp" determinaton with this
flag too, since we only need it when the flag is set.
We previously would use MS_MOVE to move the old procfs, sysfs, /dev/ and
/run to the new place in some places, and MS_BIND in others.
The logic when to use MS_MOVE and when to use MS_BIND was pretty
arbitrary so far: we'd use MS_MOVE during the initrd → host transition
and MS_BIND when transitioning from host into the exitrd during
shutdown.
Traditionally, using MS_MOVE was preferable, because we didn't bother
with unmounting the old mount hierarchy before the switch root, and thus
using MS_MOVE did some clean-up as side-effect (because the old mounts
went away this way). But since we nowadays properly umount all remaining
mount points (since 268d1244e87a35ff8dff56c92ef375ebf69d462e) when
transitioning it's pointless.
Let's just use MS_BIND always. Let's tweak it though: let's use
MS_BIND|MS_REC for the kernel API VFS, and MS_BIND without MS_REC for
/run/. The latter reflects the fact that the submounts /run/ has usually
are not so much about just accessing kernel APIs but about auxiliary
user resources. Hence let's only move the main mount over for that.
While we are at it, also set up the base filesystem *before* we move the
mounts from the old to the new root, since the base filesystem setup
logic creates various needed inodes for us, which we really should make
use of instead of creating on our own.
This adds a new mechanism for rebooting, a form of "userspace reboot"
hereby dubbed "soft-reboot". It will stop all services as in a usual
shutdown, possibly transition into a new root fs and then issue a fresh
initial transaction. The kernel is not replaced.
File descriptors can be passed over, thus opening the door for leaving
certain resources around between such reboots.
Usecase: this is an extremely quick way to reset userspace fully when
updating image based systems, without going through a full
hardware/firmware/boot loader/kernel/initrd cycle. It minimizes "grayout time"
for OS updates. (In particular when combined with kernel live patching)
The logic was borked: if we find multiple partitions of the same
designator, we should first prefer the better arch, and then prefer the
better version, and then the first found. Fix that.
Fixes: #27897
We only really care about lowering the device timeout so we get to
a shell faster when the root device doesn't appear so let's only
lower that timeout instead of lowering all default timeouts.
If we discover the root or /usr/ fs via roothash=/usrhash= we know the
file system mounted on it will be read-only, since Verity volumes are by
definition immutable. Hence, let's imply the "ro" mount option for them.
This way the "kernel: /dev/mapper/usr: Can't open blockdev" boot-time
log message goes away, reported here:
https://github.com/systemd/systemd/issues/27682
(I do wonder though why erofs even tries to open the block device as
writable, that sounds utterly pointless for a file system that carries
the fact it is read-only even in the name...)
==1==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 17 byte(s) in 1 object(s) allocated from:
#0 0x7fc096c7243b in strdup (/lib64/libasan.so.8+0x7243b)
#1 0x7fc095db3899 in bus_socket_set_transient_property ../src/core/dbus-socket.c:386
#2 0x7fc095db5140 in bus_socket_set_property ../src/core/dbus-socket.c:460
#3 0x7fc095dd20f1 in bus_unit_set_properties ../src/core/dbus-unit.c:2473
#4 0x7fc095d87d53 in transient_unit_from_message ../src/core/dbus-manager.c:1025
#5 0x7fc095d8872f in method_start_transient_unit ../src/core/dbus-manager.c:1112
#6 0x7fc0944ddf4f in method_callbacks_run ../src/libsystemd/sd-bus/bus-objects.c:406
#7 0x7fc0944e7854 in object_find_and_run ../src/libsystemd/sd-bus/bus-objects.c:1319
#8 0x7fc0944e8f03 in bus_process_object ../src/libsystemd/sd-bus/bus-objects.c:1439
#9 0x7fc09454ad78 in process_message ../src/libsystemd/sd-bus/sd-bus.c:3011
#10 0x7fc09454b302 in process_running ../src/libsystemd/sd-bus/sd-bus.c:3053
#11 0x7fc09454e158 in bus_process_internal ../src/libsystemd/sd-bus/sd-bus.c:3273
#12 0x7fc09454e2f2 in sd_bus_process ../src/libsystemd/sd-bus/sd-bus.c:3300
#13 0x7fc094551a59 in io_callback ../src/libsystemd/sd-bus/sd-bus.c:3642
#14 0x7fc094727830 in source_dispatch ../src/libsystemd/sd-event/sd-event.c:4187
#15 0x7fc094731009 in sd_event_dispatch ../src/libsystemd/sd-event/sd-event.c:4808
#16 0x7fc094732124 in sd_event_run ../src/libsystemd/sd-event/sd-event.c:4869
#17 0x7fc095f7af9f in manager_loop ../src/core/manager.c:3242
#18 0x41cc7c in invoke_main_loop ../src/core/main.c:1937
#19 0x4252e0 in main ../src/core/main.c:3072
#20 0x7fc092a4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
SUMMARY: AddressSanitizer: 17 byte(s) leaked in 1 allocation(s).
`sd_journal_print_with_location` and similar functions behave
inconsistently compared to their documentation, which says:
sd_journal_print_with_location(), sd_journal_printv_with_location(),
sd_journal_send_with_location(), sd_journal_sendv_with_location(),
and sd_journal_perror_with_location() [...] accept additional
parameters to explicitly set the source file name, function, and
line. Those arguments must contain valid journal entries including
the variable name, e.g. "CODE_FILE=src/foo.c", "CODE_LINE=666",
"CODE_FUNC=myfunc".
Calling e.g. `sd_journal_sendv_with_location` with
`CODE_FUNC=myfunction` as the value of the argument `func` results in
"CODE_FUNC" : "CODE_FUNC=myfunction"
because `sd_journal_*_with_location` implicitly prefix the argument
`func` with `CODE_FUNC=`. For example:
_public_ int sd_journal_sendv_with_location(
const char *file, const char *line,
const char *func,
const struct iovec *iov, int n) {
[...]
char *f;
[...]
niov = newa(struct iovec, n + 3);
[...]
ALLOCA_CODE_FUNC(f, func);
[...]
niov[n++] = IOVEC_MAKE_STRING(f);
return sd_journal_sendv(niov, n);
}
where `ALLOCA_CODE_FUNC` is:
#define ALLOCA_CODE_FUNC(f, func) \
do { \
size_t _fl; \
const char *_func = (func); \
char **_f = &(f); \
_fl = strlen(_func) + 1; \
*_f = newa(char, _fl + 10); \
memcpy(*_f, "CODE_FUNC=", 10); \
memcpy(*_f + 10, _func, _fl); \
} while (false)
The arguments `file` and `line` are _not_ prefixed similarly but
expected to be prefixed already with `CODE_FILE=` and `CODE_LINE=`
respectively and sent as is like the documentation describes.
That is, the argument `func` is treated differently and behaves
inconsistently compared to the arguments `file` and `line`. The behavior
seems still intentional:
_public_ int sd_journal_printv_with_location(int priority, const char *file, const char *line, const char *func, const char *format, va_list ap) {
[...]
/* func is initialized from __func__ which is not a macro, but
* a static const char[], hence cannot easily be prefixed with
* CODE_FUNC=, hence let's do it manually here. */
ALLOCA_CODE_FUNC(f, func);
[...]
}
Thus, change the documentation to match the actual behavior.
Note: `sd_journal_{print,send}` and `sd_journal_{print,send}v` work as
expected as they only pass the function name (i.e. without `CODE_FUNC=`)
to the `func` argument of the `sd_journal_*_with_location` functions
they call. For example:
#define sd_journal_print(priority, ...) sd_journal_print_with_location(priority, "CODE_FILE=" __FILE__, "CODE_LINE=" _SD_STRINGIFY(__LINE__), __func__, __VA_ARGS__)
I'm definitely a fan of precision, but in this case it's a bit too much:
$ systemd-run --unit=test --socket-property=ListenFIFO=/tmp/foo --socket-property=SocketMode=0644 true
$ systemctl cat test.socket
# /run/systemd/transient/test.socket
# This is a transient unit file, created programmatically via the systemd API. Do not edit.
[Unit]
Description=/usr/bin/true
[Socket]
ListenFIFO=/tmp/foo
SocketMode=0000000000000000000000000000000000000644
Let's check if we keep the old records after multiple systemd-pstore
invocations (i.e. simulate a scenario where we get multiple crashes and
multiple machine reboots).
generator_write_veritysetup_service_section() already escapes the
parameters internally, doing so in the caller means double escaping,
which is a bug. Fix it.
create_device() and create_disk() so far did very similar things, but
the name didn't give a hint what the difference was.
Hence let's rename them to create_special_device() and
create_veritytab_device() to make this more understandabe, as one
creates /proc/cmdline specified roothash=/usrhash= devices, and the
other one devices for items listed in /etc/veritytab.
No code changes besides renaming.
Both should have the same effect: the /dev/loop-control devices should
become available. systemd-tmpfiles-setup-dev.service creates the device
node "dry" based on modalias data, while modprobe@loop.service creates
it fully, because the module backing it is loaded properly. This should
shorten the deps chain a bit, simplify things and allows us to focus on
the stuff we actually need (i.e. the loopback infra) instead of all
entrypoints anyone might possibly need (i.e. the device nodes)
If both the data and the hash device are a regular file we might create
two sets of deps on s-t-s-d.s, which is of course redundant. Shorten the
code to only generate this once.
No change in behaviour.
Let's imply "x-initrd.attach" for "usr" and "root" volumes, so that
we do not attempt to umount them anymore during shutdown.
The names of these volumes have been mandated by the Discoverable
Partition Spec:
https://uapi-group.org/specifications/specs/discoverable_partitions_specification/#suggested-mode-of-operation
Hence it appears reasonably safe to special case these volume names.
Note that a similar logic is implemented in fstab-generator and in fact
PID 1 to treat the root mount and /usr/ mount specially too, to avoid
trying to umount it at shutdown. (This is what fstab_is_extrinsic()
checks).
This should ensure that if /usr/ or / is for some reason a LUKS medium
we won't try to detach it during runtime, which likely fails, since we
run off it.
Note this also moves an ordering dep towards umount.target under the
x-initrd.attach check, becasue that's where the crucial conflicts dep is
placed too.
We want that cryptsetup/veritysetup devices can stick around until the
very end, as well as the users of them which might depend on
blockdev@.target for the devices. Hence leave the targets around till
the very end.
Note that their runtime is managed via StopWhenUnneeded= anyway, hence
unless their are volumes that actually survive still the very end they
target units will still be stopped.
This mimics what we already have for cryptsetup services: the slice they
are placed in (they have their own slice since that's what we do by
default for instantiated services) shouldn't conflict with
shutdown.target, so that veritysetup services can stay around until the
very end (which is what we want for the root and usr verity volumes).
It's literally just a copy of the same unit we already have for
cryptsetup, just with an updated description string.
Let's use the common generator_write_veritysetup_unit_section(),
ggenerator_write_veritysetup_service_section(), generator_add_symlink()
implementation we already have at one more place.
This mostly generates the same unit, but for the first time hooks up
blockdev@dev-mapper-*.device for the device, which means things like
growfs on usr+root volumes will actually work now. (I mean, growfs
won#t, because verity devices are immutable after all, but things *like*
it that want to run between the device popping up and being mounted.)