IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Consider the following case: a user sets up a minimum rootfs for
file system maintenance work in /run/nextroot/ dir directly. When
they're done, they expect 'systemctl reboot' to perform a full reboot.
But they keep soft-rebooting back to the tmpfs root, until they
find out about $SYSTEMCTL_SKIP_AUTO_SOFT_REBOOT.
So currently, when /run/nextroot/ is a normal dir, pid1 automatically
turns it into a bind mount to soft-reboot into. This is good, but when
combined with automatic soft-reboot it has an arguably unexpected
behavior, since /run/nextroot/ can never go away in such a case.
OTOH, if /run/nextroot/ is a mountpoint in the first place, the mount
is *moved* so a second reboot would not trigger auto soft-reboot.
Let's just make things more friendly to users, and do auto soft-reboot
only if /run/nextroot/ is also a mountpoint.
Only the system manager records soft reboots, and the user session is
restarted anyway so it doesn't suffer from the ID clash issue
Follow-up for ed358516937780b524a2cfa833427da3df1bc87f
With gcc-14.0.1-0.13.fc40, when compiling with -O2, the compiler doesn't understand
that sd_bus_error_setf() always returns negative on error when <name> is provided:
[28/576] Compiling C object systemd-resolved.p/src_resolve_resolved-bus.c.o
../src/resolve/resolved-bus.c: In function ‘call_link_method’:
../src/resolve/resolved-bus.c:1763:16: warning: ‘l’ may be used uninitialized [-Wmaybe-uninitialized]
1763 | return handler(message, l, error);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
../src/resolve/resolved-bus.c:1749:15: note: ‘l’ was declared here
1749 | Link *l;
| ^
../src/resolve/resolved-bus.c: In function ‘bus_method_get_link’:
../src/resolve/resolved-bus.c:1822:13: warning: ‘l’ may be used uninitialized [-Wmaybe-uninitialized]
1822 | p = link_bus_path(l);
| ^~~~~~~~~~~~~~~~
../src/resolve/resolved-bus.c:1810:15: note: ‘l’ was declared here
1810 | Link *l;
| ^
...
Let's make the assertion a bit more explicit. With this, the warning goes away,
but I think it's more obvious to a human reader too.
When compiled with -O2, the compiler is not happy about dynamic_user_pop() and
would warn about the output variables not being set. It does have a point:
we were doing a cast from ssize_t to int, and theoretically there could be
wraparound. So let's add an explicit check that the cast to int is fine.
[540/2509] Compiling C object src/core/libsystemd-core-256.so.p/dynamic-user.c.o
../src/core/dynamic-user.c: In function ‘dynamic_user_close.isra’:
../src/core/dynamic-user.c:580:9: warning: ‘uid’ may be used uninitialized [-Wmaybe-uninitialized]
580 | unlink_uid_lock(lock_fd, uid, d->name);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/core/dynamic-user.c:560:15: note: ‘uid’ was declared here
560 | uid_t uid;
| ^~~
../src/core/dynamic-user.c: In function ‘dynamic_user_realize’:
../src/core/dynamic-user.c:476:29: warning: ‘new_uid’ may be used uninitialized [-Wmaybe-uninitialized]
476 | num = new_uid;
| ~~~~^~~~~~~~~
../src/core/dynamic-user.c:398:23: note: ‘new_uid’ was declared here
398 | uid_t new_uid;
| ^~~~~~~
This adds a small, socket-activated Varlink daemon that can delegate UID
ranges for user namespaces to clients asking for it.
The primary call is AllocateUserRange() where the user passes in an
uninitialized userns fd, which is then set up.
There are other calls that allow assigning a mount fd to a userns
allocated that way, to set up permissions for a cgroup subtree, and to
allocate a veth for such a user namespace.
Since the UID assignments are supposed to be transitive, i.e. not
permanent, care is taken to ensure that users cannot create inodes owned
by these UIDs, so that persistancy cannot be acquired. This is
implemented via a BPF-LSM module that ensures that any member of a
userns allocated that way cannot create files unless the mount it
operates on is owned by the userns itself, or is explicitly
allowelisted.
BPF LSM program with contributions from Alexei Starovoitov.
This opens the door for making the call work without privileges: if we
pass in a userns fd and DissectedImage that has mount fds then we can
acquire all information without privs.
When adding unprivileged nspawn support we don't really want a global
lock file, since we cannot even access the dir they are stored in, hence
make the concept optional.
Some minor other modernizations.
This is similar to uid_range_load_userns() but instead of reading the
uid_map off a process it reads it off a userns fd.
(Of course the kernel has no API for this right now, hence we fork off a
throw-away process which joins the user namespace, and then read off the
data from there.)
This new call takes two image policy objects and generates an
"intersection" policy, i.e. only allows what is allowed by both. Or in
other words it conceptually implements a binary AND of the policy flags.
(Except that it's a bit harder, due to normalization, and underspecified
flags).
We can use this later for mountfsd: a client can specify a policy, and
mountfsd can specify another policy, and we'll then apply only what both
allow.
Note that a policy generated like this might be invalid. For example, if
one policy says root must exist and be verity or luks protected, and the
other policy says root must be absent, then the intersection is invalid,
since one policy only allows what the other prohibits and vice versa.
We'll return a clear error code in that case (ENAVAIL). (This is because
we simply don't allow encoding such impossible policies in an
ImagePolicy structure, for good reasons.)
This new call is like varlink_peek_fd() (i.e. gets an fd out of the
connection but leaving it also in there), and combines ith with
F_DUPFD_CLOEXEC to make a copy of it.
We previously already had varlink_dup_fd() which was a duplicating
version for pushing an fd *into* the connection. To reduce confusion,
let's rename that one varlink_push_dup_fd() to make the symmetry to
valrink_push_fd() clear so that we have no:
varlink_peer_push_fd() → put fd in without dup'ing
varlink_peer_push_dup_fd() → same with F_DUPFD_CLOEXEC
varlink_peer_peek_fd() → get fd out without dup'ing
varlink_peer_peek_dup_fd() → same with F_DUPFD_CLOEXEC
Since e56a8790a0 debugging test-execute fails has been a royal PITA, since
we ditch all potentially useful output from the test units (that, for
the most part, run `sh -x ...`). Let's improve the situation a bit by
setting EXEC_OUTPUT_NULL only when running the single test case that
needs it, and inheriting stdout otherwise.
For example, with a purposefully introduced error we get this output
with this patch:
exec-personality-x86-64.service: About to execute: sh -x -c "c=\$\$(uname -m); test \"\$\$c\" = \"foo_bar\""
Serializing sd-executor-state to memfd.
...
Personality: x86-64
LockPersonality: no
SystemCallErrorNumber: kill
++ uname -m
+ c=x86_64
+ test x86_64 = foo_bar
Received SIGCHLD from PID 1520588 (sh).
Child 1520588 (sh) died (code=exited, status=1/FAILURE)
exec-personality-x86-64.service: Child 1520588 belongs to exec-personality-x86-64.service.
exec-personality-x86-64.service: Main process exited, code=exited, status=1/FAILURE
exec-personality-x86-64.service: Failed with result 'exit-code'.
...
Exit Status: 1
src/test/test-execute.c:456:test_exec_personality: exec-personality-x86-64.service: can_unshare=yes: exit status 1, expected 0
(test-execute-root) terminated by signal ABRT.
Assertion 'r >= 0' failed at src/test/test-execute.c:1433, function prepare_ns(). Aborting.
Aborted
But without it, we'd miss the most important part:
exec-personality-x86-64.service: About to execute: sh -x -c "c=\$\$(uname -m); test \"\$\$c\" = \"foo_bar\""
Serializing sd-executor-state to memfd.
...
Personality: x86-64
LockPersonality: no
SystemCallErrorNumber: kill
Received SIGCHLD from PID 1521365 (sh).
Child 1521365 (sh) died (code=exited, status=1/FAILURE)
exec-personality-x86-64.service: Child 1521365 belongs to exec-personality-x86-64.service.
exec-personality-x86-64.service: Main process exited, code=exited, status=1/FAILURE
exec-personality-x86-64.service: Failed with result 'exit-code'.
...
Exit Status: 1
src/test/test-execute.c:456:test_exec_personality: exec-personality-x86-64.service: can_unshare=yes: exit status 1, expected 0
(test-execute-root) terminated by signal ABRT.
Assertion 'r >= 0' failed at src/test/test-execute.c:1433, function prepare_ns(). Aborting.
Aborted
Currently, the memory management of service_set_main_pidref
is a bit odd. Normally we either invalidate the original
resource on caller's side after the call succeeds, or
just pass the ownership wholly. But service_set_main_pidref
take a pointer, and calls pidref_done() internally.
Let's just make it consume the passed pidref. This is more
straightforward.
On s390x both __s390__ and __s390x__ are defined, and with the original
order we'd go through the __s390__ branch and emit a warning:
[169/2118] Compiling C object src/shared/libsystemd-shared-256.a.p/base-filesystem.c.o
../src/shared/base-filesystem.c:136:11: note: ‘#pragma message: Please add an entry above specifying whether your architecture uses /lib64/, /lib32/, or no such links.’
136 | # pragma message "Please add an entry above specifying whether your architecture uses /lib64/, /lib32/, or no such links."
| ^~~~~~~