mirror of https://github.com/systemd/systemd-stable.git synced 2024-12-23 17:34:00 +03:00
Commit Graph

50710 Commits

Author SHA1 Message Date
Lennart Poettering
2671fbefce units: fix repart conditions to run if definitions exist in /sysroot + /sysusr
The systemd-repart code was already smart enough to look for definitions
there, but the unit file conditions made that pointless. Let's fix that.
2021-04-21 12:23:31 +01:00
Yu Watanabe
ea846e45c1 doc: fix typo 2021-04-21 09:57:30 +02:00
Lennart Poettering
5efbd0bf89
Merge pull request #19371 from poettering/repart-initrd-usr-only
two /sysusr/ changes for repart, split out of #19234
2021-04-20 23:46:17 +02:00
Lennart Poettering
0aa714778a
Merge pull request #19372 from poettering/repart-initrd-usr-begin
fstab-generator: mount.usr= handling changes, split out of #19234
2021-04-20 23:44:49 +02:00
Lennart Poettering
ac02dccabc
Merge pull request #19368 from poettering/loop-seqnum
loop-util: let's try harder to avoid loopback block device recycle issues
2021-04-20 23:43:57 +02:00
Lennart Poettering
3464514457 man: document new initrd-usr-fs.target 2021-04-20 19:11:07 +02:00
Lennart Poettering
632b551ca2 units: change order of settings to match order in other similar unit 2021-04-20 19:11:07 +02:00
Lennart Poettering
8f47e32a3e repart: use /sysusr/ as --root= default in initrd, if mounted 2021-04-20 18:53:15 +02:00
Lennart Poettering
a73b2ad041 repart: try harder to find OS prefix
This teaches repart to look for the root block device both as the
backing for /sysroot and for /sysusr/usr.

The latter is a new addition, and starts making more sense with the next
commit. It's about supporting systems that are shipped with only a /usr/
fs, but where a root fs is allocated and formatted on first boot via
systemd-repart (or a similar tool). In this case it's useful to be able
to mount the ultimate /usr/ early on without mounting the root fs
right away (simply because the rootfs might not exist yet, and we need
the repart data encoded in /usr/ to actually format it). Hence, instead
of requiring that we mount /sysroot/ first and /sysroot/usr/ second as
we did so far, let's rearrange things slightly:

1. We mount the /usr/ file system we discover to /sysusr/usr/
2. We mount the root file system we discover to /sysroot/
3. Once both are established we bind mount /sysusr/usr/ to /sysroot/usr/

And that's it. The first two steps can happen in either order, and we can
access /usr/ with or without a rootfs being around.

This commit implements nothing of the above. Instead, it teaches
systemd-repart to check both /sysroot/ and /sysusr/ for repart drop-ins,
and use the first of these hierarchies it finds populated. This way
systemd-repart can be spawned once /usr is mounted and it will work
correctly without root fs having to exist, or we can invoke it when the
root fs is already mounted, where it also will work correctly.
2021-04-20 18:53:15 +02:00
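The lookup described in a73b2ad041 (check both hierarchies, use the first one that is populated) can be sketched roughly as follows. This is an illustration only, not the code from the commit: the helper names are hypothetical, and it assumes definitions are `*.conf` files under `usr/lib/repart.d/` below each candidate root.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional

# Assumption: definitions are *.conf files under usr/lib/repart.d/ below each
# candidate root; the real tool consults more directories than this.
DEFINITIONS_SUBDIR = Path("usr/lib/repart.d")


def find_populated_root(roots: Iterable[Path]) -> Optional[Path]:
    """Return the first root that contains at least one drop-in, else None."""
    for root in roots:
        definitions = root / DEFINITIONS_SUBDIR
        if definitions.is_dir() and any(definitions.glob("*.conf")):
            return root
    return None


def demo() -> Optional[str]:
    """Build a throwaway tree where only sysusr/ carries a drop-in."""
    with tempfile.TemporaryDirectory() as tmp:
        sysroot = Path(tmp, "sysroot")
        sysusr = Path(tmp, "sysusr")
        sysroot.mkdir()
        (sysusr / DEFINITIONS_SUBDIR).mkdir(parents=True)
        (sysusr / DEFINITIONS_SUBDIR / "50-root.conf").write_text(
            "[Partition]\nType=root\n"
        )
        found = find_populated_root([sysroot, sysusr])
        return found.name if found else None
```

In an initrd the candidate list would presumably be `/sysroot` followed by `/sysusr`, matching the order the commit message names them in.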
Lennart Poettering
fa138f5e26 fstab-generator: properly order generated mount units before "post" target units
Let's make sure that our mount units are properly ordered before the
"post" target unit even if DefaultDependencies= is used on the target
unit.
2021-04-20 18:26:17 +02:00
Lennart Poettering
e19ae92af6 fstab-generator: extend logging a bit 2021-04-20 18:26:17 +02:00
Lennart Poettering
29a24ab28e fstab-generator: if usr= is specified, mount it to /sysusr/usr/ first
This changes the fstab-generator to handle mounting of /usr/ a bit
differently than before. Instead of immediately mounting the fs to
/sysroot/usr/ we'll first mount it to /sysusr/usr/ and then add a
separate bind mount that mounts it from /sysusr/usr/ to /sysroot/usr/.

This way we can access /usr via the /sysusr/ hierarchy independently of
the root fs, without waiting for the root fs to be mounted. This is useful for
invoking systemd-repart while a root fs doesn't exist yet and for
creating it, with partition data read from the /usr/ hierarchy.

This introduces a new generic target initrd-usr-fs.target that may be
used to generically order services against /sysusr/ to become available.
2021-04-20 18:26:17 +02:00
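The mount sequence that 29a24ab28e describes can be written down as a small plan builder. This is a sketch under assumptions, not what the generator emits: the function name and the tuple layout are hypothetical, and real mount units carry many more settings.

```python
from typing import List, Optional, Tuple

# (source, target, kind) where kind is "mount" or "bind"; a hypothetical layout.
MountStep = Tuple[str, str, str]


def initrd_mount_plan(
    usr_device: Optional[str], root_device: Optional[str]
) -> List[MountStep]:
    """List the mounts needed for whichever file systems were discovered."""
    plan: List[MountStep] = []
    if usr_device:
        # /usr/ goes to /sysusr/usr/ first, independent of the root fs.
        plan.append((usr_device, "/sysusr/usr", "mount"))
    if root_device:
        plan.append((root_device, "/sysroot", "mount"))
    if usr_device and root_device:
        # Only once both exist is /usr/ bind mounted into the root tree.
        plan.append(("/sysusr/usr", "/sysroot/usr", "bind"))
    return plan
```

With only a /usr/ device known, the plan contains a single step, which is the situation in which systemd-repart can run before a root fs exists.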
Lennart Poettering
6e1454b4b9 ci: drop test/TEST-50-DISSECT/deny-list-ubuntu-ci
Let's see if this makes the test stable on the CI.
2021-04-20 17:21:22 +02:00
Lennart Poettering
4a62257d68 dissect: ignore udev database entries from before the loopback attachment
This tries to shorten the race of device reuse a bit more: let's ignore
udev database entries that are older than the time where we started to
use a loopback device.

This doesn't fix the whole loopback device raciness mess, but it makes
the race window a bit shorter.
2021-04-20 17:20:38 +02:00
Lennart Poettering
8ede1e86b2 loop-util: track CLOCK_MONOTONIC timestamp immediately before attaching a loopback device
This is similar to the preceding work to store the uevent seqnum, but
this stores the CLOCK_MONOTONIC timestamp.

Why? This allows us to validate udev database entries, to determine if they
were created *after* we attached the device.

The uevent seqnum logic allows us to validate uevents, and the timestamp
database entries, hence together we should be able to validate both
sources of truth for us.

(note that this is all racy, just a bit less racy, since we cannot
atomically attach loopback devices and get the timestamp for it, the
same way we can't get the uevent seqnum. Thus it shortens the race
window, but doesn't close it).
2021-04-20 17:20:38 +02:00
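The timestamp check from 8ede1e86b2 and 4a62257d68 amounts to taking a monotonic timestamp just before attaching and comparing database entries against it. A minimal sketch, assuming hypothetical helper names and microsecond units; on Linux, Python's `time.monotonic_ns()` reads CLOCK_MONOTONIC.

```python
import time


def monotonic_usec() -> int:
    """Current monotonic time in microseconds."""
    return time.monotonic_ns() // 1000


def entry_is_current(entry_initialized_usec: int, attach_usec: int) -> bool:
    """True if a database entry is not older than the attach boundary.

    Entries initialized before the boundary are assumed to belong to an
    earlier use of the same loopback device and should be ignored.
    """
    return entry_initialized_usec >= attach_usec


def demo() -> bool:
    attach_usec = monotonic_usec()  # taken immediately before attaching
    stale_entry_usec = attach_usec - 1  # stands in for an older database entry
    return entry_is_current(stale_entry_usec, attach_usec)
```

As the commit message notes, the timestamp and the attachment cannot be taken atomically, so a check like this only narrows the race window.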
Lennart Poettering
8626b43be4 sd-device: add API to query from when a udev database entry is
We already store a CLOCK_MONOTONIC timestamp for each device appearance,
let's make this queryable.

This is useful to determine whether a udev device database entry is from
a current appearance of the device or a previous one, by comparing it
with appropriately taken timestamps.
2021-04-20 17:14:10 +02:00
Lennart Poettering
75dc190d39 dissect: ignore old uevents when waiting for loopback partition scan
Let's drop all monitor uevents that were enqueued before we actually
started setting up the device.

This doesn't fix the race, but it makes the race window smaller: since
we cannot determine the uevent seqnum and the loopback attachment
atomically, there's a tiny window where uevents might be generated by
the device which we mistake for being associated with our use of the
loopback device.
2021-04-20 17:14:10 +02:00
Lennart Poettering
31c75fcc41 loop-util: read kernel's uevent seqnum right before attaching a loopback device
Later, this will allow us to ignore uevents from earlier attachments a
bit better, as we can compare uevent seqnums with this boundary. It's
not a full fix for the race though, since we cannot atomically determine
the uevent and attach the device, but it at least shortens the window a
bit.
2021-04-20 17:13:56 +02:00
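The seqnum boundary from 31c75fcc41 and its use in 75dc190d39 can be sketched like this. The kernel exposes the current value in `/sys/kernel/uevent_seqnum`; everything else here (function names, the event tuple layout, and the choice to drop events at or below the boundary) is an assumption made for illustration.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

UEVENT_SEQNUM_PATH = Path("/sys/kernel/uevent_seqnum")

# (seqnum, action); a hypothetical stand-in for a monitor event.
Uevent = Tuple[int, str]


def read_uevent_seqnum(path: Path = UEVENT_SEQNUM_PATH) -> int:
    """Read the kernel's current uevent sequence number."""
    return int(path.read_text().strip())


def drop_stale_uevents(events: Iterable[Uevent], seqnum_not_before: int) -> List[Uevent]:
    """Keep only events generated after the boundary was read."""
    return [event for event in events if event[0] > seqnum_not_before]
```

The boundary would be read right before attaching the loopback device, then passed to the filter while waiting for the partition scan.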
Lennart Poettering
79e8393a6a loop-util: initialize .devno in loop_device_open() too 2021-04-20 17:12:39 +02:00
Lennart Poettering
b0dbffd868 loop-util: port to random_u64_range()
Doesn't matter, but it's a bit easier to read I'd claim.
2021-04-20 17:12:39 +02:00
Lennart Poettering
38bd449f96 loop-util: make loop_device_make() return fd in all code paths
Previously, loop_device_make() would return the device fd in one success
code path, but not the other (where we'd just return 0).
loop_device_open() returns it in all cases.

Hence, let's clean this up, and make sure in all success code paths of
both functions we return it (even though it strictly speaking is
redundant, since we return it in LoopDevice anyway, and currently no one
actually relies on this).
2021-04-20 17:12:39 +02:00
Lennart Poettering
02ef01ade3 sd-device: use right clock when comparing initialization usec
We actually use CLOCK_MONOTONIC for the timestamp, hence when
comparing/subtracting it from the current time, also use
CLOCK_MONOTONIC.
2021-04-20 17:12:39 +02:00
Lennart Poettering
a156eb89c8 sd-device: use right type for usec_initialized 2021-04-20 17:11:21 +02:00
Lennart Poettering
ee7561d014 update TODO 2021-04-20 16:32:24 +02:00
Yegor Alexeyev
c95df5879e relay role implementation 2021-04-20 15:11:53 +02:00
Yu Watanabe
d5bfddf037 man: fix typo
Follow-up for e73309c532.
2021-04-20 11:41:05 +01:00
Miroslav Suchý
0084d4f6b5 document DefaultOOMPolicy
The `man systemd.service` page says:
   Defaults to the setting DefaultOOMPolicy= in systemd-system.conf(5) is set to
but there is no such line in this config.
This is the default value I extracted from
   systemctl show --property=DefaultOOMPolicy
2021-04-20 10:40:42 +02:00
Yu Watanabe
66205cb3f5 wifi-util: do not set zero errno to log_debug_errno() 2021-04-20 10:39:50 +02:00
Frantisek Sumsal
3f161ba9bc test: make the test entrypoint scripts shellcheck-compliant 2021-04-20 10:26:43 +02:00
Lennart Poettering
4d686e6b0b mount-util: make umount_and_rmdir_and_freep() cleanup handler deal with NULL 2021-04-20 10:23:30 +02:00
Lennart Poettering
fd2f6f7248
Merge pull request #19096 from poettering/repart-features
repart: four new features: CopyBlocks=auto + --image= + ReadOnly=/Flags= + MakeDirectories=
2021-04-20 10:20:22 +02:00
Peter Hutterer
7a4afd3a15 shell-completion: use base.lst, not xorg.lst
Since 2005 xorg.lst has been the legacy symlink to the real file base.lst.
2021-04-20 10:19:41 +02:00
Luca Boccassi
ba81458350
Merge pull request #19356 from zxzax/sd-login-typos
Fix some typos in sd-login header, docs
2021-04-19 22:26:36 +01:00
Lennart Poettering
7cc3966693 update TODO 2021-04-19 23:19:52 +02:00
Lennart Poettering
5a3b86404a test: add test for new repart features 2021-04-19 23:19:52 +02:00
Lennart Poettering
b620bf332f dissect: ext4 and loopback files are unimpressed by read-only access
Even if we set up a loopback device read-only and mount it read-only
this means nothing, ext4 will still write through to the backing storage
file.

Yes, I lost 6h debugging time on this.

Apparently, we have to specify "norecovery" when mounting such file
systems, to force them into truly read-only mode. Let's do so.
2021-04-19 23:16:02 +02:00
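The fix in b620bf332f boils down to adding "norecovery" when a file system that would otherwise still write to its backing storage is mounted read-only. A minimal sketch of that decision, with a hypothetical function name and covering only ext4, the file system the commit message names:

```python
from typing import List


def mount_options(fstype: str, read_only: bool) -> List[str]:
    """Build mount options, forcing a truly read-only mount for ext4."""
    options: List[str] = []
    if read_only:
        options.append("ro")
        if fstype == "ext4":
            # "ro" alone does not stop ext4 from writing to the backing file.
            options.append("norecovery")
    return options
```

The resulting list would be joined with commas before being passed to mount(8) or mount(2).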
Lennart Poettering
e73309c532 repart: add new ReadOnly= and Flags= settings for repart dropins
Let's make the GPT partition flags configurable when creating new
partitions. This is primarily useful for the read-only flag (which we
want to set for verity enabled partitions).

This adds two settings for this: Flags= and ReadOnly=, which strictly
speaking are redundant. The main reason to have both is that usually the
ReadOnly= setting is the one one wants to control, and it's more generic.
Moreover we might later on introduce inheriting of flags from CopyBlocks=
partitions, where one might want to control most flags as is except for
the RO flag and similar, hence let's keep them separate.
2021-04-19 23:16:02 +02:00
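How Flags= and ReadOnly= from e73309c532 might combine is sketched below. The precedence shown (an explicit ReadOnly= overrides the corresponding bit in Flags=) is an assumption, not taken from the commit, and the function name is hypothetical. The read-only flag is bit 60 of the GPT partition attributes in the Discoverable Partitions Specification.

```python
from typing import Optional

GPT_FLAG_READ_ONLY = 1 << 60  # bit 60 of the 64-bit GPT attribute field


def effective_flags(flags: int, read_only: Optional[bool]) -> int:
    """Combine a raw flags value with an optional read-only override.

    Assumed semantics: None leaves the flags untouched, True sets the
    read-only bit, False clears it.
    """
    if read_only is None:
        return flags
    if read_only:
        return flags | GPT_FLAG_READ_ONLY
    return flags & ~GPT_FLAG_READ_ONLY
```

Keeping the override separate is what lets one control the RO bit while leaving all other flags as they are, which is the scenario the commit message anticipates for CopyBlocks= partitions.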
Lennart Poettering
5c08da586f repart: add CopyBlocks=auto support
When using systemd-repart as an installer that replicates the install
medium on another medium it is useful to reference the root
partition/usr partition or verity data that is currently booted, in
particular in A/B scenarios where we have two copies and want to
reference the one we currently use. Let's add a CopyBlocks=auto for this
case: for a partition that uses that we'll copy a suitable partition
from the host.

CopyBlocks=auto finds the partition to copy like this: based on the
configured partition type uuid we determine the usual mount point (i.e.
for the /usr partition type we determine /usr/, and so on). We then
figure out the block device behind that path, through dm-verity and
dm-crypt if necessary. Finally, we compare the partition type uuid of
the partition found that way with the one we are supposed to fill and
only use it if it matches (the latter is primarily important on
dm-verity setups where a volume is likely backed by two partitions and
we need to find the right one).

This is particularly fun to use in conjunction with --image= (where
we'll restrict the device search to the specified device, for security
reasons), as this allows "duplicating" an image like this:

    # systemd-repart --image=source.raw --empty=create --size=auto target.raw

If the right repart data is embedded into "source.raw" this will be able
to create and initialize a partition table on target.raw that carries
all needed partitions, and will stream the source's file systems onto it
as configured.
2021-04-19 23:16:02 +02:00
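The CopyBlocks=auto search in 5c08da586f (partition type to mount point, mount point to block device, down through device-mapper layers, then a type check) can be modelled on plain dictionaries. Everything below is hypothetical scaffolding: short type names replace the real partition type UUIDs, and the three lookup tables stand in for what would be queried from the running system.

```python
from typing import Dict, List, Optional

# Hypothetical short names stand in for the real partition type UUIDs.
MOUNT_POINT_FOR_TYPE: Dict[str, str] = {
    "root": "/",
    "usr": "/usr/",
    "usr-verity": "/usr/",
}


def find_copy_source(
    wanted_type: str,
    mounted_device: Dict[str, str],
    backing_devices: Dict[str, List[str]],
    partition_type: Dict[str, str],
) -> Optional[str]:
    """Find the partition behind the usual mount point for wanted_type."""
    mount_point = MOUNT_POINT_FOR_TYPE.get(wanted_type)
    if mount_point is None:
        return None
    top = mounted_device.get(mount_point)
    if top is None:
        return None
    pending, seen = [top], set()
    while pending:
        device = pending.pop()
        if device in seen:
            continue
        seen.add(device)
        lower = backing_devices.get(device, [])
        if lower:
            # A dm-verity or dm-crypt volume: keep walking to what backs it.
            pending.extend(lower)
        elif partition_type.get(device) == wanted_type:
            # Only accept a partition whose type matches the one to fill.
            return device
    return None
```

The final type comparison is what picks the right partition when a dm-verity volume is backed by two of them, one for the data and one for the verity hashes.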
Lennart Poettering
e81acfd251 gpt: add some simple helpers for categorizing GPT partition types 2021-04-19 23:16:02 +02:00
Lennart Poettering
f3859d5f55 loop-util: store device major/minor in LoopDevice object
Let's store this away. It's useful when matching up mounts (i.e.  struct
stat's .st_dev field) with loopback devices.
2021-04-19 23:16:02 +02:00
Lennart Poettering
d83d804863 repart: add high-level setting for creating dirs in formatted file systems
So far we already had the CopyFiles= option in systemd-repart drop-in
files, as a mechanism for populating freshly formatted file systems with
files and directories. This adds MakeDirectories= in similar style, and
creates simple directories as listed. The option is of course entirely
redundant, since the same can be done with CopyFiles= simply by copying
in a directory. It's kinda nice to encode the dirs to create directly in
the drop-in files however, instead of providing a directory subtree to
copy in somewhere, to make the files more self-contained, since often
just creating dirs is entirely sufficient.

The main usecase for this are GPT OS images that carry only a /usr/
tree, and for which a root file system is only formatted on first boot
via repart.  Without any additional CopyFiles=/MakeDirectories=
configuration these root file systems are entirely empty of course
initially. To mount in the /usr/ tree, a directory inode for /usr/ to
mount over needs to be created.  systemd-nspawn will do so automatically
when booting up the image, as will the initrd during boot. However, this
requires the image to be writable, which is OK for nspawn and
initrd-based boots, but there are plenty tools where read-only operation
is desirable after repart ran, before the image was booted for the first
time. Specifically, "systemd-dissect" opens the image read-only to
inspect its contents, and this will only work if /usr/ can be properly
mounted. Moreover systemd-dissect --mount --read-only won't succeed
either if the fs is read-only.

Via MakeDirectories= we now provide a way that ensures that the image
can be mounted/inspected in a fully read-only way immediately after
systemd-repart completed. Specifically, let's consider a GPT disk image
shipping with a file usr/lib/repart.d/50-root.conf:

       [Partition]
       Type=root
       Format=btrfs
       MakeDirectories=/usr
       MakeDirectories=/efi

With this in place systemd-repart will create a root partition when run,
and add /usr and /efi into it as directory inodes. This ensures that the
whole image can then be mounted truly read-only and /usr and /efi can be
overmounted by the /usr partition and the ESP.
2021-04-19 23:16:02 +02:00
Lennart Poettering
78eee6ce4d repart: use free_and_strdup_warn() where appropriate 2021-04-19 23:16:02 +02:00
Lennart Poettering
be9ce0188e repart: deal with empty partition label sensibly
libfdisk appears to return NULL when encountering an empty partition
label, let's handle this sanely, and treat NULL and "" for the current
label as the same, but for the new label as distinct: there NULL means
nothing is set, and "" means an actual empty label.
2021-04-19 23:16:02 +02:00
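The label rule from be9ce0188e (NULL and "" are equivalent for the current label, but distinct for the new one) is easy to get backwards, so here is a small sketch. The function name is hypothetical, and Python's `None` stands in for NULL.

```python
from typing import Optional


def label_needs_update(current: Optional[str], new: Optional[str]) -> bool:
    """Decide whether a partition label has to be rewritten.

    current: label reported for the existing partition; None and "" both
             mean the label is empty.
    new:     configured label; None means nothing is set, "" means an
             explicitly empty label.
    """
    if new is None:
        return False  # nothing configured, leave the label alone
    return (current or "") != new
```

Under this reading, an explicitly empty new label clears an existing one, while an unset new label never touches it.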
Lennart Poettering
22163eb51b repart: handle DISCARD failing with EBUSY gracefully 2021-04-19 23:16:02 +02:00
Lennart Poettering
55d380144a repart: add one more overflow check 2021-04-19 23:16:02 +02:00
Lennart Poettering
d17db7b2bf repart: when we can't fit in all partitions explain how large the image would have to be 2021-04-19 23:16:02 +02:00
Lennart Poettering
252d626711 repart: add --image= switch
This is similar to the --image= switch in the other tools, like
systemd-sysusers or systemd-tmpfiles, i.e. it applies the configuration
from the image to the image.

This is particularly useful for downloading a minimized GPT image, and
then extending it to the desired size via:

   # systemd-repart --image=foo.image --size=5G
2021-04-19 23:16:02 +02:00
Lennart Poettering
8e5f3cecdf repart: slightly improve error message if partition is not on dm-crypt/dm-verity 2021-04-19 23:16:02 +02:00
Lennart Poettering
0efb3f83da repart: move NOP destructors into shared code 2021-04-19 23:16:02 +02:00
Lennart Poettering
ef9c184d3d dissect: split read-only flag into two
Let's have one flag to request that when dissecting an image the
loopback device is made read-only, and another one to request that when
it is mounted to make it read-only. Previously both concepts were always
done read-only together.

(Of course, making the loopback device read-only but mounting it
read-write doesn't make too much sense, but the kernel should catch that
for us, no need to make restrictions from our side there)

Use-case for this: in systemd-repart we'd like to operate on images for
adding partitions. Thus we'd like to have the loopback device writable,
but if we read repart.d/ snippets from it, we want to do that read-only.
2021-04-19 23:16:02 +02:00
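The split described in ef9c184d3d can be pictured as two independent bits plus a combined value for the old behaviour. The names below are illustrative and not necessarily the identifiers used in the source tree.

```python
from enum import Flag


class DissectFlags(Flag):
    """Hypothetical flag set modelling the two read-only concepts."""

    NONE = 0
    DEVICE_READ_ONLY = 1  # set up the loopback device read-only
    MOUNT_READ_ONLY = 2   # mount the file systems read-only
    READ_ONLY = 3         # both together, the previous behaviour


def loop_open_mode(flags: DissectFlags) -> str:
    """Open mode to use for the loopback device."""
    return "ro" if flags & DissectFlags.DEVICE_READ_ONLY else "rw"


def mount_is_read_only(flags: DissectFlags) -> bool:
    """Whether file systems from the image should be mounted read-only."""
    return bool(flags & DissectFlags.MOUNT_READ_ONLY)
```

The systemd-repart use-case from the commit message corresponds to passing only `MOUNT_READ_ONLY`: the device stays writable for adding partitions, while the repart.d/ snippets are read from a read-only mount.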