mirror of https://github.com/systemd/systemd.git synced 2025-01-27 18:04:05 +03:00

156 Commits

Author SHA1 Message Date
Lennart Poettering
f596658811 importd: allow activation in early boot, and make it socket activatable
Previously, importd was only accessible via D-Bus, which required it to
be a late boot service. Now that we have Varlink we can rearrange things
to become early-boot activated, just after the image directories are
mounted.

This will later allow us to have a generator that auto-downloads images on
boot.
2024-06-25 09:57:42 +02:00
Yu Watanabe
61628287bd journal: explicitly sync namespaced journals before stopping socket units
Otherwise, if a service unit that requests LogNamespace= is stopped before
systemd-journald@.service is started, logs generated by the service will be
lost, as systemd-journald@.socket is stopped and
systemd-journald@.service will never be started.

To prevent the issue, let's introduce another implicit dependency to
a oneshot service that explicitly synchronizes a namespaced journal file
when the log namespace is not needed anymore.

Fixes #32604.
2024-05-02 19:41:01 +02:00
Yu Watanabe
5700e755a9 units: introduce systemd-udev-load-credentials.service 2024-04-16 09:45:43 +09:00
Lennart Poettering
702a52f4b5 mountfsd: add new systemd-mountfsd component 2024-04-06 16:08:24 +02:00
Lennart Poettering
8aee931e7a nsresourced: add new daemon for granting clients user namespaces and assigning resources to them
This adds a small, socket-activated Varlink daemon that can delegate UID
ranges for user namespaces to clients asking for it.

The primary call is AllocateUserRange() where the user passes in an
uninitialized userns fd, which is then set up.

There are other calls that allow assigning a mount fd to a userns
allocated that way, to set up permissions for a cgroup subtree, and to
allocate a veth for such a user namespace.

Since the UID assignments are supposed to be transient, i.e. not
permanent, care is taken to ensure that users cannot create inodes owned
by these UIDs, so that persistency cannot be acquired. This is
implemented via a BPF-LSM module that ensures that any member of a
userns allocated that way cannot create files unless the mount it
operates on is owned by the userns itself, or is explicitly
allowlisted.

BPF LSM program with contributions from Alexei Starovoitov.
2024-04-06 16:08:24 +02:00
Mike Yuan
dfad86b838
units: introduce systemd-hibernate-clear.service that clears
stale HibernateLocation EFI variable

Currently, if the HibernateLocation EFI variable exists,
but we failed to resume from it, the boot carries on
without clearing the stale variable. Therefore, the subsequent
boots would still be waiting for the device timeout,
unless the variable is purged manually.

There's no point to keep trying to resume after a successful
switch-root, because the hibernation image state
would have been invalidated by then. OTOH, we don't
want to clear the variable prematurely either,
i.e. in initrd, since if the resume device is the same
as root one, the boot won't succeed and the user might
be able to try resuming again. So, let's introduce a
unit that only runs after switch-root and clears the var.

Fixes #32021
2024-04-03 22:07:43 +08:00
Mike Yuan
20ce9fecaa
units: sort lists in meson.build 2024-03-26 21:08:49 +08:00
Zbigniew Jędrzejewski-Szmek
c38e4e2fda
Merge pull request #29721 from poettering/systemd-project
New capsule@.service feature
2024-03-26 13:19:33 +01:00
Yu Watanabe
7b799b870f units: use relative path 2024-03-16 05:31:44 +09:00
Lennart Poettering
95be59f907 ssh-generator: introduce ssh-access.target
This new passive target is supposed to be pulled in by SSH
implementations and should be reached when remote SSH access is
possible. The idea is that this target can be used as indicator for
other components to determine if and when SSH access is possible.

One specific usecase for this is the new sd_notify() logic in PID 1 that
sends its own supervisor notifications whenever target units are
reached. This can be used to precisely schedule SSH connections from
host to VM/container, or just to identify systems where SSH is even
available.
2024-03-14 17:23:28 +01:00
Lennart Poettering
9b94ae834b units: add systemd-capsule@.service 2024-03-14 11:34:04 +01:00
Yu Watanabe
91676b6458 networkctl: introduce "persistent-storage" command
Then, this introduces systemd-networkd-persistent-storage.service.

systemd-networkd.service is an early starting service. So, at the time
it is started, the persistent storage for the service may not be ready,
and we cannot use StateDirectory=systemd/network in
systemd-networkd.service.

The newly added systemd-networkd-persistent-storage.service creates the
state directory for networkd, and notifies systemd-networkd that the
directory is usable.
2024-03-12 01:57:16 +09:00
Thomas Blume
fc5c6eccb4 units: make templates for quotaon and systemd-quotacheck service 2024-03-09 19:32:09 +00:00
Lennart Poettering
79ec39958d bootctl: add a Varlink interface
For now, just super basic functionality: return the list of boot menu
entries, and read/write the reboot to firmware flag
2024-02-14 16:15:19 +01:00
Sam Leonard
38624568d8
vmspawn: add template unit to start systemd-vmspawn -M 2024-02-13 12:31:03 +00:00
Lennart Poettering
15138e7980 pcrlock: add basic Varlink interface
This can be used to make or delete a PCR policy via Varlink. It can also
be used to query the current event log in CEL format.
2024-02-12 12:04:18 +01:00
Lennart Poettering
0a6598bb38 hostnamed: add simple Varlink API, too 2024-01-09 10:46:25 +01:00
Lennart Poettering
4e1f0037b8 units: add a tpm2.target synchronization point and small generator that pulls in
Distributions apparently only compile a subset of TPM2 drivers into the
kernel. For those not compiled in but provided as kmod we need a
synchronization point: we must wait before the first TPM2 interaction
until the driver is available and accessible.

This adds a tpm2.target unit as such a synchronization point. It's
ordered after /dev/tpmrm0, and is pulled in by a generator whenever we
detect that the kernel reported a TPM2 to exist but we have no device
for it yet.

This should solve the issue, but might create problems: if there are TPM
devices supported by firmware that we don't have Linux drivers for we'll
hang for a bit. Hence let's add a kernel cmdline switch to disable (or
alternatively force) this logic.

Fixes: #30164
2024-01-03 13:49:02 +01:00
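
The decision described in commit 4e1f0037b8 above can be sketched as a small pure function. This is only an illustration of the rule as stated in the commit message, not the generator's actual code: the function name, the parameter names, and the tri-state `override` (standing in for the kernel cmdline switch, whose name the message does not give) are all hypothetical.

```python
from typing import Optional


def wants_tpm2_target(kernel_reports_tpm2: bool,
                      device_node_present: bool,
                      override: Optional[bool] = None) -> bool:
    """Decide whether tpm2.target should be pulled in (hypothetical helper).

    override models the kernel cmdline switch from the commit message:
    True forces the wait, False disables it, None means auto-detect.
    """
    if override is not None:
        return override
    # Auto-detect: wait only if a TPM2 is reported but has no device yet.
    return kernel_reports_tpm2 and not device_node_present
```

With auto-detection, the target is only pulled in for the "TPM2 reported, driver not loaded yet" case, which is the case where waiting can actually help.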
Lennart Poettering
644f19c75c creds: add varlink API for encrypting/decrypting credentials 2023-12-21 19:19:12 +01:00
Lennart Poettering
3ccadbce33 homectl: add "firstboot" command
This extends what systemd-firstboot does and runs on first boots only
and either processes user records passed in via credentials to create,
or asks the user interactively to create one (only if no regular user
exists yet).
2023-12-18 11:10:53 +01:00
Lennart Poettering
809def1940 units: add units that put together and install a TPM2 PCR policy at boot
(This is disabled by default, for now)
2023-11-03 11:24:45 +01:00
Lennart Poettering
1761066b13 storagetm: add new systemd-storagetm component
This implements a "storage target mode", similar to what macOS has long
provided as "Target Disk Mode":

        https://en.wikipedia.org/wiki/Target_Disk_Mode

This implementation is relatively simple:

1. a new generic target "storage-target-mode.target" is added, which
   when booted into defines the target mode.

2. a small tool and service "systemd-storagetm.service" is added which
   exposes a specific device or all devices as NVMe-TCP devices over the
   network.  NVMe-TCP appears to be hot shit right now for exposing
   block devices over the network. And it's really simple to set up via
   configs, hence our code is relatively short and neat.

The idea is that systemd-storagetm.target can be extended sooner or
later, for example to expose block devices also as USB mass storage
devices and similar, in case the system has a "dual mode" USB controller
that can also work as device, not just as host. (And people could also
plug in sharing as NBD, iSCSI, whatever they want.)

How to use this? Boot into your system with a kernel cmdline of
"rd.systemd.unit=storage-target-mode.target ip=link-local", and you'll see on
screen the precise "nvme connect" command line to make the relevant
block devices available locally on some other machine. This all requires
that the target mode stuff is included in the initrd of course. And the
system will then stay in the initrd forever.

Why bother? Primarily three use-cases:

1. Debug a broken system: with very few dependencies during boot get
   access to the raw block device of a broken machine.

2. Migrate from one system to another, by dd'ing the old to the new
   directly.

3. Installing an OS remotely on some device (for example via Thunderbolt
   networking)

(And there might be more, for example the ability to boot from a
laptop's disk on another system)

Limitations:

1. There's no authentication/encryption. Hence: use this on local links
   only.

2. NVMe target mode on Linux supports r/w operation only. Ideally, we'd
   have a read-only mode, for security reasons, and default to it.

Future love:

1. We should have another mode, where we simply expose the homed LUKS
   home dirs like that.

2. Some lightweight hookup with plymouth, to display a (shortened)
   version of the info we write to the console.

To test all this, just run:

    mkosi --kernel-command-line-extra="rd.systemd.unit=storage-target-mode.target" qemu
2023-11-02 14:19:32 +01:00
Lennart Poettering
f5151fb459 sysext: make some calls available via varlink 2023-10-16 12:08:39 +02:00
Lennart Poettering
4e16d5c69e pcrextend: make pcrextend tool accessible via varlink
This is primarily supposed to be a 1st step with varlinkifying our
various command line tools, and an exercise in what this might look like
across our codebase one day. However, at AllSystemsGo! 2023 it was
requested that we provide an API to do a PCR measurement along with a
matching event log record, and this provides that.
2023-10-06 11:49:38 +02:00
Lennart Poettering
2e64cb71b9 tpm2-setup: add new early boot tool for initializing the SRK
This adds an explicit service for initializing the TPM2 SRK. This is
implicitly also done by systemd-cryptsetup, hence strictly speaking
redundant, but doing this early has the benefit that we can parallelize
this in a nicer way. This also writes a copy of the SRK public key in PEM
format to /run/ + /var/lib/, thus pinning the disk image to the TPM.
Making the SRK public key available is also useful for allowing easy offline
encryption for a specific TPM.

Sooner or later we should probably grow what this service does, the
above is just the first step. For example, the service should probably
offer the ability to reset the TPM (clear the owner hierarchy?) on a
factory reset, if such a policy is needed. And we might want to install
some default AK (?).

Fixes: #27986
Also see: #22637
2023-09-29 19:36:04 +02:00
Mike Yuan
a628d933cc
hibernate-resume: split out the logic of finding hibernate location
Before this commit, the hibernate location logic only exists in
the generator. Also, we compare device nodes (devnode_same()) and
clear EFI variable HibernateLocation in the generator too. This is
not ideal though: when the generator gets to run, udev hasn't yet
started, so effectively devnode_same() always fails. Moreover, if
the boot process is interrupted by e.g. battery-check, the hibernate
information is lost.

Therefore, let's split out the logic of finding hibernate location.
The generator only does the initial validation of system info and
enables systemd-hibernate-resume.service, and when the service
actually runs we validate everything again, which includes comparing
the device nodes and clearing the EFI variable. This should make
things more robust, plus systems that don't utilize a systemd-enabled
initrd can use the exact same logic to resume using the EFI variable.
I.e., systemd-hibernate-resume can be used standalone.
2023-09-07 20:21:16 +08:00
Yu Watanabe
c3c885a771 bsod: several cleanups
- add reference to the service unit in the man page,
- fix several indentation issues and typos,
- replace '(uint64_t) -1' with 'UINT64_MAX',
- drop unnecessary 'continue'.
2023-08-22 23:20:14 +09:00
Luca Boccassi
b24d10e35a
Merge pull request #28697 from 1awesomeJ/new_bsod
systemd-bsod: Add "--continuous" option
2023-08-18 00:20:04 +01:00
OMOJOLA JOSHUA
77d0917ea3 systemd-bsod: Add "--continuous" option 2023-08-17 13:13:54 +01:00
Yu Watanabe
bb7f485f4b units: introduce systemd-tmpfiles-setup-dev-early.service
This makes tmpfiles, sysusers, and udevd invoked in the following order:
1. systemd-tmpfiles-setup-dev-early.service
   Create device nodes gracefully, that is, create device nodes anyway
   by ignoring unknown users and groups.
2. systemd-sysusers.service
   Create users and groups, so that later invocations of tmpfiles and
   udevd can resolve necessary users and groups.
3. systemd-tmpfiles-setup-dev.service
   Adjust owners of previously created device nodes.
4. systemd-udevd.service
   Process all devices, in particular to make block devices active and
   mountable.
5. systemd-tmpfiles-setup.service
   Setup basic filesystem.

Follow-up for b42482af904ae0b94a6e4501ec595448f0ba1c06.

Fixes #28653.
Replaces #28681 and #28732.
2023-08-12 07:55:20 +09:00
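
The start order listed in commit bb7f485f4b above can be modelled as a dependency graph and checked with a topological sort. This is a minimal sketch that encodes only the five-step chain from the commit message; the real unit files express ordering through their own directives and carry further dependencies that are not modelled here. The names `after` and `start_order` are made up for the illustration.

```python
from graphlib import TopologicalSorter

# Each unit maps to the set of units that must be started before it,
# following the numbered list in the commit message (nothing else).
after = {
    "systemd-tmpfiles-setup-dev-early.service": set(),
    "systemd-sysusers.service": {"systemd-tmpfiles-setup-dev-early.service"},
    "systemd-tmpfiles-setup-dev.service": {"systemd-sysusers.service"},
    "systemd-udevd.service": {"systemd-tmpfiles-setup-dev.service"},
    "systemd-tmpfiles-setup.service": {"systemd-udevd.service"},
}


def start_order(deps):
    """Return one valid start order; predecessors always come first."""
    return list(TopologicalSorter(deps).static_order())


order = start_order(after)
for position, unit in enumerate(order, start=1):
    print(position, unit)
```

Because the graph is a single chain, only one order is valid, and it matches the numbered list in the commit message.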
Yu Watanabe
9289e093ae meson: use install_emptydir() and drop meson-make-symlink.sh
The script is mostly equivalent to 'mkdir -p' and 'ln -sfr'.
Let's replace it with install_emptydir() builtin function and
inline meson call.
2023-08-08 22:11:34 +01:00
Lennart Poettering
95dafd30da battery-check: rework unit
Let's rename the unit to systemd-battery-check.service. We usually want
to name our own unit files like our tools they wrap, in particular if
they are entirely defined by us (i.e. not just wrappers of foreign
concepts)

While we are at it, also hook this in from initrd.target, and order it
against initrd-root-device.target so that it runs before the root device
is possibly written to (i.e. mounted or fsck'ed).

This is heavily inspired by @aafeijoo-suse's PR #28208, but quite
different ;-)
2023-07-01 03:19:16 +08:00
OMOJOLA JOSHUA
e3d4148d50 PID1: detect battery level in initrd and if low refuse continuing to boot, print message and shut down. 2023-06-28 14:48:54 +01:00
Mike Yuan
760e99bb52
hibernate-resume: rework to follow the logic of sleep.c and use
main-func.h

Preparation for #27247
2023-06-23 23:57:49 +08:00
Yu Watanabe
742aebc5a7 meson: merge two similar loops for unit files
This also merges two arrays units and in_units, and uses dictionary
for declaring units.

This also fixes the condition handling: previously only two
conditions were handled and the rest were ignored.
2023-06-22 10:19:51 -06:00
Lennart Poettering
13ffc60749 pid1: add "soft-reboot" reboot method
This adds a new mechanism for rebooting, a form of "userspace reboot"
hereby dubbed "soft-reboot". It will stop all services as in a usual
shutdown, possibly transition into a new root fs and then issue a fresh
initial transaction. The kernel is not replaced.

File descriptors can be passed over, thus opening the door for leaving
certain resources around between such reboots.

Usecase: this is an extremely quick way to reset userspace fully when
updating image based systems, without going through a full
hardware/firmware/boot loader/kernel/initrd cycle. It minimizes "grayout time"
for OS updates. (In particular when combined with kernel live patching)
2023-06-02 16:49:38 +02:00
Lennart Poettering
7a2f3194ff units: set DefaultDependencies=no for veritysetup slice
This mimics what we already have for cryptsetup services: the slice they
are placed in (they have their own slice since that's what we do by
default for instantiated services) shouldn't conflict with
shutdown.target, so that veritysetup services can stay around until the
very end (which is what we want for the root and usr verity volumes).

It's literally just a copy of the same unit we already have for
cryptsetup, just with an updated description string.
2023-06-01 18:49:43 +02:00
maanyagoenka
1f839f48e0 confext: add the systemd-confext.service file 2023-04-05 21:50:04 +00:00
Jan Janssen
dfca5587cf tree-wide: Drop gnu-efi
This drops all mentions of gnu-efi and its manual build machinery. A
future commit will bring bootloader builds back. A new bootloader meson
option is now used to control whether to build sd-boot and its userspace
tooling.
2023-03-10 11:41:03 +01:00
Frantisek Sumsal
0eb635ef4b units: don't install pcrphase-related units without gnu-efi
since we don't have systemd-pcrphase built anyway, which breaks the tests:

...
I: Attempting to install /usr/lib/systemd/systemd-networkd-wait-online (based on unit file reference)
I: Attempting to install /usr/lib/systemd/systemd-network-generator (based on unit file reference)
I: Attempting to install /usr/lib/systemd/systemd-oomd (based on unit file reference)
I: Attempting to install /usr/lib/systemd/systemd-pcrphase (based on unit file reference)
W: Failed to install '/usr/lib/systemd/systemd-pcrphase'
make: *** [Makefile:4: setup] Error 1
make: Leaving directory '/root/systemd/test/TEST-01-BASIC'

Follow-up to 04959faa632272a8fc9cdac3121b2e4af721c1b6.
2023-01-17 14:30:02 +01:00
Lennart Poettering
04959faa63 generators: optionally, measure file systems at boot
If we use gpt-auto-generator, automatically measure root fs and /var.

Otherwise, add x-systemd.measure option to request this.
2023-01-17 09:42:16 +01:00
Lennart Poettering
50072ccf1b units: rework growfs units to be just a regular unit that is instantiated
The systemd-growfs@.service units are currently written in full for each
file system to grow. Which is kinda pointless given that (besides an
optional ordering dep) they contain always the same definition. Let's
fix that and add a static template for this logic, that the generator
simply instantiates (and adds an ordering dep for).

This mimics how systemd-fsck@.service is handled. Similar to the way
that for the root fs there's a special instance systemd-fsck-root.service
we also add a special instance systemd-growfs-root.service for the root
fs, since it has slightly different deps.

Fixes: #20788
See: #10014
2023-01-17 09:42:16 +01:00
Lennart Poettering
072c8f6505 units: measure /etc/machine-id into PCR 15 during early boot
We want PCR 15 to be useful for binding per-system policy to. Let's
measure the machine ID into it, to ensure that every OS we can
distinguish will get a different PCR (even if the root disk encryption
key is already measured into it).
2023-01-17 09:42:16 +01:00
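
Why an extra measurement yields a different PCR 15 per machine, as commit 072c8f6505 above argues, follows from how a PCR extend works: the new value is the hash of the old value concatenated with the digest of the measured data. The sketch below shows that operation for a SHA-256 bank. The byte strings are placeholders; the exact data systemd measures for the machine ID is not stated in the commit message and is not reproduced here.

```python
import hashlib


def pcr_extend(pcr: bytes, data: bytes) -> bytes:
    """SHA-256 PCR extend: new = H(old || H(data))."""
    digest = hashlib.sha256(data).digest()
    return hashlib.sha256(pcr + digest).digest()


# PCR 15 starts out as all zeroes at boot.
pcr15 = bytes(32)

# Placeholder for an earlier measurement (e.g. a volume key).
after_key = pcr_extend(pcr15, b"example-volume-key")

# Two systems with the same earlier measurement but different IDs.
system_a = pcr_extend(after_key, b"machine-id-A")
system_b = pcr_extend(after_key, b"machine-id-B")

print(system_a.hex())
print(system_b.hex())
```

Since extends can only be chained and never undone, two systems that share the earlier measurement still end up with distinct PCR values once a distinct machine ID is extended on top.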
Franck Bui
2aba77057e journal: give the ability to enable/disable systemd-journald-audit.socket
Before this patch the only way to prevent journald from reading the audit
messages was to mask systemd-journald-audit.socket. However, this had the main
drawback that downstream couldn't ship the socket disabled by default (besides
the fact that masking units is not supposed to be the usual way to disable
them).

Fixes #15777
2023-01-11 17:18:57 +01:00
Lennart Poettering
921fc451cb units: rename/rework systemd-boot-system-token.service → systemd-boot-random-seed.service
This renames systemd-boot-system-token.service to
systemd-boot-random-seed.service and conditions it less strictly.

Previously, the job of the service was to write a "system token" EFI
variable if it was missing. It called "bootctl --graceful random-seed"
for that. With this change we condition it more liberally: instead of
calling it only when the "system token" EFI variable isn't set, we call
it whenever a boot loader interface compatible boot loader is used. This
means, previously it was invoked on the first boot only: now it is
invoked at every boot.

This doesn't change the command that is invoked. That's because
previously already the "bootctl --graceful random-seed" did two things:
set the system token if not set yet *and* refresh the random seed in the
ESP. Previously we put the focus on the former, now we shift the focus to
the latter.

With this simple change we can replace the logic
f913c784ad4c93894fd6cb2590738113dff5a694 added, but from a service that
can run much later and doesn't keep the ESP pinned.
2023-01-04 15:18:10 +01:00
Lennart Poettering
047273e6e8 pcrphase: add two additional phases
This adds two more phases to the PCR boot phase logic: "sysinit" +
"final".

The "sysinit" one is placed between sysinit.target and basic.target.
It's good to have a milestone in this place, since this is after all
file systems/LUKS volumes are in place (which sooner or later should
result in measurements of their own) and before services are started
(where we should be able to rely on them to be complete).

This is particularly useful to make certain secrets available for
mounting secondary file systems, but making them unavailable later.

This breaks API in a way (as measurements during runtime will change),
but given that the pcrphase stuff wasn't released yet should be OK.
2022-10-17 12:09:43 +02:00
Daan De Meyer
9377e53f4f meson: Fix pcrphase unit conditions 2022-10-11 15:29:08 +02:00
Lennart Poettering
40f1856791 units: add pcrphase units 2022-09-22 16:53:34 +02:00
Zbigniew Jędrzejewski-Szmek
45bcfcb36c units/initrd-parse-etc.service: only start units that are required
This makes use of the option switch that was added in the previous commit.
We used a pretty big hammer on a relatively small nail: we would do daemon-reload
and (in principle) allow any configuration to be changed. But in fact we only
made use of this in systemd-fstab-generator. systemd-fstab-generator filters
out all mountpoints except /usr and those marked with x-initrd.mount, i.e. on
a big majority of systems it wouldn't do anything.

Also, since systemd-fstab-generator first parses /proc/cmdline, and then
initrd's /etc/fstab, and only then /sysroot/etc/fstab, configuration in the
host would only matter if the same mountpoint wasn't configured "earlier".
So the config in the host could be used for new mountpoints, but it couldn't
be used to amend configuration for existing mountpoints. And we wouldn't actually
remount anything, so mountpoints that were already mounted wouldn't be affected,
even if we did change some config.

In the new scheme, we will parse /sysroot/etc/fstab and explicitly start
sysroot-usr.mount and other units that we just wrote. In most cases (as written
above), this will actually result in no units being created or started.

If the generator is invoked on a system with /sysroot/etc/fstab present,
behaviour is not changed and we'll create units as before. This is needed so
that if daemon-reload is run later at some point, we don't "lose" those units.

There's a minor bugfix here: we honour x-initrd.mount for swaps, but we
wouldn't restart swap.target, i.e. the new swaps wouldn't necessarily be
pulled in immediately.
2022-07-23 19:02:39 +02:00
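
The filtering rule mentioned in commit 45bcfcb36c above (keep only /usr and entries marked with x-initrd.mount) can be sketched as follows. This is a simplified model of that one sentence, not the generator's implementation: the function name, the sample table, and the device names are invented for the example, and kernel command line handling is left out.

```python
def initrd_relevant(fstab_text):
    """Return (mountpoint, options) pairs the filter would keep."""
    kept = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        where, opts = fields[1], fields[3].split(",")
        if where == "/usr" or "x-initrd.mount" in opts:
            kept.append((where, opts))
    return kept


# Hypothetical /sysroot/etc/fstab contents.
SAMPLE = """\
# device  mountpoint  type  options  dump  pass
/dev/sda1 /     ext4 defaults 0 1
/dev/sda2 /usr  ext4 ro 0 2
/dev/sda3 /var  ext4 defaults,x-initrd.mount 0 2
/dev/sda4 /home ext4 defaults 0 2
"""

kept = initrd_relevant(SAMPLE)
for where, opts in kept:
    print(where, ",".join(opts))
```

On a typical fstab without a separate /usr or any x-initrd.mount entries the result is empty, which matches the commit message's point that on most systems nothing needs to be created or started.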
Franck Bui
278e815bfa logind: don't delay login for root even if systemd-user-sessions.service is not activated yet
If for any reason something goes wrong during the boot process (most likely due
to a network issue), system admins should be allowed to log in to the system to
debug the problem. However due to the login session barrier enforced by
systemd-user-sessions.service for all users, logins for root will be delayed
until a (dbus) timeout expires. Besides being confusing, it's not a nice user
experience to wait for an indefinite period of time (no message is shown), and
this also suggests that something went wrong in the background.

The reason of this delay is due to the fact that all units involved in the
creation of a user session are ordered after systemd-user-sessions.service,
which is subject to network issues. If root needs to log in at that time,
logind is requested to create a new session (via pam_systemd), which ultimately
ends up waiting for systemd-user-sessions.service to be activated. This has the
bad side effect to block login for root until the dbus call done by pam_systemd
times out and the PAM stack proceeds anyways.

To solve this problem, this patch orders the session scope units and the user
instances only after systemd-user-sessions.service for unprivileged users only.
2022-07-12 22:54:39 +01:00