IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
When starting a container with --user, the new uid will be resolved and switched to
only in the inner child, at the end of the setup, by spawning getent. But the
credentials are set up in the outer child, long before the user is resolvable,
and the directories/files are made only readable by root and read-only, which
means they cannot be changed later and made visible to the user.
When this particular combination is specified, it is obvious the caller wants
the single-process container to be able to use credentials, so make them world
readable only in that specific case.
Fixes https://github.com/systemd/systemd/issues/31794
Enable the exec_fd logic for Type=notify* services too, and change it
to send a timestamp instead of a '1' byte. Record the timestamp in a
new ExecMainHandoverTimestamp property so that users can track accurately
when control is handed over from systemd to the service payload, so
that latency and startup performance can be trivially and accurately
tracked and attributed.
When an IO event source owns relevant fd, replacing with a new fd leaks
the previously assigned fd.
===
sd_event_add_io(event, &s, fd, ...);
sd_event_source_set_io_fd_own(s, true);
sd_event_source_set_io_fd(s, new_fd); <-- The previous fd is not closed.
sd_event_source_unref(s); <-- new_fd is closed as expected.
===
Without the change, valgrind reports the leak:
==998589==
==998589== FILE DESCRIPTORS: 4 open (3 std) at exit.
==998589== Open file descriptor 4:
==998589== at 0x4F119AB: pipe2 (in /usr/lib64/libc.so.6)
==998589== by 0x408830: test_sd_event_source_set_io_fd (test-event.c:862)
==998589== by 0x403302: run_test_table (tests.h:171)
==998589== by 0x408E31: main (test-event.c:935)
==998589==
==998589==
==998589== HEAP SUMMARY:
==998589== in use at exit: 0 bytes in 0 blocks
==998589== total heap usage: 33,305 allocs, 33,305 frees, 1,283,581 bytes allocated
==998589==
==998589== All heap blocks were freed -- no leaks are possible
==998589==
==998589== For lists of detected and suppressed errors, rerun with: -s
==998589== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Follow-ups for 74c4231ce5.
Previously, the path is obtained from the fd, but it is closed in
sd_event_loop() to unpin the filesystem.
So, let's save the path when the event source is created, and make
sd_event_source_get_inotify_path() simply read it.
We look for the root fs on the device of the booted ESP, and for the
other partitions on the device of the root fs. On EFI systems this
generally boils down to the same, but there are cases where this doesn't
hold, hence document this properly.
Fixes: #31199
This commit adds support for loading, measuring and handling a ".ucode"
UKI section. This section is functionally an initrd, intended for
microcode updates. As such it will always be passed to the kernel first.
Resolve at attach/detach/inspect time, so that the image is pinned and requires
re-attaching on update, given files are extracted from it so just passing
img.v/ to RootImage= is not enough to get a portable image updated
This reworkds --recovery-pin= from a parameter that takes a boolean to
an enum supporting one of "hide", "show", "query".
If "hide" (default behaviour) we'll generate a recovery pin
automatically, but never show it, and thus just seal it and good.
If "show" we'll generate a recovery pin automatically, but display it in
the output, so the user can write it down.
If "query" we'll ask the user for a recovery pin, and not automatically
generate any.
For compatibility the old boolean behaviour is kept.
With this you can now do "systemd-pcrlock make-policy
--recovery-pin=show" to set up the first policy, write down the recovery
PIN. Later, if the PCR prediction didn't work out one day you can then
do "systemd-pcrlock make-policy --recovery-pin=query" and enter the
recovery key and write a new policy.
Running both sd-ndisc and sd-radv should be mostly a misconfiguration,
but may not. So, let's only disable sd-ndisc by default when sd-radv is
enabled, but allow when both are explicitly requested.
This adds a small, socket-activated Varlink daemon that can delegate UID
ranges for user namespaces to clients asking for it.
The primary call is AllocateUserRange() where the user passes in an
uninitialized userns fd, which is then set up.
There are other calls that allow assigning a mount fd to a userns
allocated that way, to set up permissions for a cgroup subtree, and to
allocate a veth for such a user namespace.
Since the UID assignments are supposed to be transitive, i.e. not
permanent, care is taken to ensure that users cannot create inodes owned
by these UIDs, so that persistancy cannot be acquired. This is
implemented via a BPF-LSM module that ensures that any member of a
userns allocated that way cannot create files unless the mount it
operates on is owned by the userns itself, or is explicitly
allowelisted.
BPF LSM program with contributions from Alexei Starovoitov.
This is just good style. In this particular case, if the argument is incorrect and
the function is not tested with $NOTIFY_SOCKET set, the user could not get the
proper error until running for real.
Also, remove mention of systemd. The protocol is fully generic on purpose.
F40 will be out soon, so we can update the man page already. The example should
already work.
The cloud link was dropped in fd571c9df0, so
drop the unused variable too.
Unfortunately, sd-bus-vtable.h, sd-journal.h, and sd-id128.h
have variadic macro and inline initialization of sub-object, these are
not supported in C90. So, we need to silence some errors.
I think the example should reflect the full set of lifecycle messages,
including STOPPING=1, which tells the service manager that the service
is already terminating. This is useful for reporting this information
back to the user and to suppress repeated shutdown requests.
It's not as important as the READY=1 and RELOADING=1 messages, since we
actively wait for those from the service message if the right Type= is
set. But it's still very valuable information, easy to do, and completes
the state engine.
stale HibernateLocation EFI variable
Currently, if the HibernateLocation EFI variable exists,
but we failed to resume from it, the boot carries on
without clearing the stale variable. Therefore, the subsequent
boots would still be waiting for the device timeout,
unless the variable is purged manually.
There's no point to keep trying to resume after a successful
switch-root, because the hibernation image state
would have been invalidated by then. OTOH, we don't
want to clear the variable prematurely either,
i.e. in initrd, since if the resume device is the same
as root one, the boot won't succeed and the user might
be able to try resuming again. So, let's introduce a
unit that only runs after switch-root and clears the var.
Fixes#32021
We are saying in public that the protocl is stable and can be easily
reimplemented, so provide an example doing so in the documentation,
license as MIT-0 so that it can be copied and pasted at will.
Same reason as the reload, reexec is disruptive and it requires the
same privileges, so if somebody wants to limit reloads, they'll also
want to limit reexecs, so use the same setting.
The setting is used when /sys/power/state is set to 'mem'
(common for suspend) or /sys/power/disk is set to 'suspend'
(hybrid-sleep). We default to kernel choice here, i.e.
respect what's set through 'mem_sleep_default=' kernel
cmdline option.
Today listen file descriptors created by socket unit don't get passed to
commands in Exec{Start,Stop}{Pre,Post}= socket options.
This prevents ExecXYZ= commands from accessing the created socket FDs to do
any kind of system setup which involves the socket but is not covered by
existing socket unit options.
One concrete example is to insert a socket FD into a BPF map capable of
holding socket references, such as BPF sockmap/sockhash [1] or
reuseport_sockarray [2]. Or, similarly, send the file descriptor with
SCM_RIGHTS to another process, which has access to a BPF map for storing
sockets.
To unblock this use case, pass ListenXYZ= file descriptors to ExecXYZ=
commands as listen FDs [4]. As an exception, ExecStartPre= command does not
inherit any file descriptors because it gets invoked before the listen FDs
are created.
This new behavior can potentially break existing configurations. Commands
invoked from ExecXYZ= might not expect to inherit file descriptors through
sd_listen_fds protocol.
To prevent breakage, add a new socket unit parameter,
PassFileDescriptorsToExec=, to control whether ExecXYZ= programs inherit
listen FDs.
[1] https://docs.kernel.org/bpf/map_sockmap.html
[2] https://lore.kernel.org/r/20180808075917.3009181-1-kafai@fb.com
[3] https://man.archlinux.org/man/socket.7#SO_INCOMING_CPU
[4] https://www.freedesktop.org/software/systemd/man/latest/sd_listen_fds.html
Drop connections and caches and reload config from files, to allow
for low-interruptions updates, and hook up to the usual SIGHUP and
ExecReload=. Mark servers and services configured directly via D-Bus
so that they can be kept around, and only the configuration file
settings are dropped and reloaded.
Fixes https://github.com/systemd/systemd/issues/17503
Fixes https://github.com/systemd/systemd/issues/20604
This makes it possible to update a home record (and blob directory) of a
home area that's either completely absent (i.e. on a USB stick that's
unplugged) or just inaccessible due to lack of authentication
This bypasses authentication (i.e. user_record_authenticate) if the
volume key was loaded from the keyring and no secret section is
provided.
This also changes Update() and Resize() to always try and load the
volume key from the keyring. This makes the secret section optional for
these methods while still letting them function (as long as the home
area is active)
For CI in mkosi, I want to configure systemd to log at debug level
to the journal, but not to the console. While we already have max
level settings for journald's forwarding settings, not every log line
goes to the journal, specifically during early boot and when units
are connected directly to the console (think systemd-firstboot), so
let's extend the log level options we already have to allow specifying
a comma separated list of values and lets allow prefixing values with
the log target they apply to to make this possible.
usage:
(1) get latest revocation list for your architecture
from https://uefi.org/revocationlistfile
(2) copy the file to $ESP/loader/keys/$name/dbx.auth
Naming is always a matter of preference, and the old name would certainly work,
but I think the new one has the following advantages:
- A verb is better than a noun.
- The name more similar to "the competition", i.e. 'sudo', 'pkexec', 'runas',
'doas', which generally include an action verb.
- The connection between 'systemd-run' and 'run0' is more obvious.
There has been no release yet with the old name, so we can rename without
caring for backwards compatibility.
The specified vendor UUID is not actually a UUID. This changes it to an actual UUID.
The new value matches the ones from the systemd-boot man page and [The Boot Loader Interface](https://systemd.io/BOOT_LOADER_INTERFACE/).
This new passive target is supposed to be pulled in by SSH
implementations and should be reached when remote SSH access is
possible. The idea is that this target can be used as indicator for
other components to determine if and when SSH access is possible.
One specific usecase for this is the new sd_notify() logic in PID 1 that
sends its own supervisor notifications whenever target units are
reached. This can be used to precisely schedule SSH connections from
host to VM/container, or just to identify systems where SSH is even
available.
This was lost on refactor, and only addons had a default uki
line in the .sbat. Add it back, and differentiate between the
default for UKIs vs the default for addons, so that they can
be revoked separately. These are only defaults and users are
encouraged to provide their own.
Follow-up for a8b645dec8
Then, we can read the lease file on restart, and the DHCP server will be
able to manage previously assigned addresses.
To save leases in the state directory /var/lib/systemd/network/, this
adds systemd-networkd-dhcp-server.service, and by default
systemd-networkd does not start the DHCP server without the heler
service started.
Closes#29991.
Then, this introduces systemd-networkd-persistent-storage.service.
systemd-networkd.service is an early starting service. So, at the time
it is started, the persistent storage for the service may not be ready,
and we cannot use StateDirectory=systemd/network in
systemd-networkd.service.
The newly added systemd-networkd-persistent-storage.service creates the
state directory for networkd, and notify systemd-networkd that the
directory is usable.
This brings the handling of config for kernel-install in line with most of
systemd, i.e. we search the set of paths for the main config file, and the full
set of drop-in paths for drop-ins.
This mirrors what 07f5e35fe7 did for udev.conf.
That change worked out fine, so I hope this one will too.
The update in the man page is minimal. I think we should split out a separate
page for the config file later on.
One motivating use case is to allow a drop-in to be created for temporary
config overrides and then removed after the operation is done.
This way the man pages are installed only when the corresponding binary is
installed. The conditions in man pages and man/rules/meson.build are adjusted to
match the conditions for units in units/meson.build.
Previously, we'd only freeze user.slice in the case of s2h, because we
didn't want the user session to resume while systemd was transitioning
from suspend to hibernate.
This commit extends this freezing behavior to all sleep modes.
We also have an environment variable to disable the freezing behavior
outright. This is a necessary workaround for someone that has hooks
in /usr/lib/systemd/system-sleep/ which communicate with some
process running under user.slice, or if someone is using the proprietary
NVIDIA driver which breaks when user.slice is frozen (issue #27559)
Fixes#27559
"submitted" is already used in the description of FDNAME=.
Let's use that instead of "stored" for FDPOLL= too, to make
it more clear that it's a per-submission/per-fdset setting.
This also replaces the Fedora download example with another one from
Ubuntu, since Fedora's images these days no longer qualify as DDIs, they
have no distinctive partition type UUIDs set for multiple of their
partitions, hence the images cannot be booted. A bit sad. Let's provide
a command that just works in its place.
Allow signing with an OpenSSL engine/provider, such as PKCS11. A public key is
not enough, a full certificate is needed for PKCS11, so a new parameter is
added for that too.
It turns out it's mostly PKCS11 that supports the URI format,
and other engines just take files. For example the tpm2-tss-openssl
engine just takes a sealed private key file path as the key input,
and the engine needs to be specified separately.
Add --private-key-source=file|engine:foo|provider:bar to
manually specify how to use the private key parameter.
Follow-up for 0a8264080a
These will be used by display managers to pre-select the user's
preferred desktop environment and display server type. On homed, the
display manager will also be able to set these fields to cache the
user's last selection.