diff --git a/TODO b/TODO index 94a585769b7..2080eb6b571 100644 --- a/TODO +++ b/TODO @@ -45,6 +45,13 @@ Features: in a graceful way, so that updated /usr trees automatically propagate into updated boot loaders on reboot. +* sysext: optionally, if the merged trees allow it use bind mounts instead of + overlayfs + +* nspawn: add support for sysext extensions, too. i.e. a new --extension= + switch that takes one or more arguments, and applies the extensions already + during startup. + * add "systemd-analyze debug" + AttachDebugger= in unit files: The former specifies a command to execute; the latter specifies that an already running "systemd-analyze debug" instance shall be contacted and execution paused diff --git a/docs/ENVIRONMENT.md b/docs/ENVIRONMENT.md index 47cdff317bf..cfe7784f9ef 100644 --- a/docs/ENVIRONMENT.md +++ b/docs/ENVIRONMENT.md @@ -267,3 +267,15 @@ systemd-firstboot and localectl: * `SYSTEMD_LIST_NON_UTF8_LOCALES=1` – if set non-UTF-8 locales are listed among the installed ones. By default non-UTF-8 locales are suppressed from the selection, since we are living in the 21st century. + +systemd-sysext: + +* `SYSTEMD_SYSEXT_HIERARCHIES` – if set to a colon-separated list of absolute + paths this variable may be used to override which hierarchies to manage with + `systemd-sysext`. By default only `/usr/` and `/opt/` are managed. With this + environment variable this list may be changed, in order to add or remove + directories from this list. This should only reference "real" file systems + and directories that only contain "real" file systems as submounts — do not + specify API file systems such as `/proc/` or `/sys/` here, or hierarchies + that have them as submounts. In particular, do not specify the root directory + `/` here. 
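As an illustration of the `SYSTEMD_SYSEXT_HIERARCHIES` rules documented in the hunk above, here is a minimal, standalone C sketch of how such a colon-separated list could be sanity-checked. It is not the code systemd uses: `hierarchy_is_acceptable()` and `hierarchy_list_is_acceptable()` are hypothetical helpers. They only look at the path strings (no normalization of `..`, no inspection of submounts), and the set of rejected API file systems is limited to the two the documentation names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of systemd: checks one list entry of
 * length 'len' starting at 'path' against the documented constraints. */
static bool hierarchy_is_acceptable(const char *path, size_t len) {
        /* Assumption: only the API file systems named in the docs are rejected. */
        static const char *const api[] = { "/proc", "/sys" };

        assert(path);

        /* Entries must be absolute paths. */
        if (len == 0 || path[0] != '/')
                return false;

        /* Treat "/usr/" and "/usr" alike by ignoring trailing slashes. */
        while (len > 1 && path[len - 1] == '/')
                len--;

        /* Only "/" is left: the root directory must not be specified. */
        if (len == 1)
                return false;

        /* Reject API file systems and anything below them. */
        for (size_t i = 0; i < sizeof(api) / sizeof(api[0]); i++) {
                size_t k = strlen(api[i]);

                if (len >= k && strncmp(path, api[i], k) == 0 &&
                    (len == k || path[k] == '/'))
                        return false;
        }

        return true;
}

/* Hypothetical helper: walks a colon-separated list and returns true only
 * if every entry is acceptable. An empty list or empty entry is rejected. */
static bool hierarchy_list_is_acceptable(const char *list) {
        const char *p = list;

        assert(list);

        for (;;) {
                size_t n = strcspn(p, ":");   /* length of the current entry */

                if (!hierarchy_is_acceptable(p, n))
                        return false;
                if (p[n] == '\0')
                        return true;

                p += n + 1;                   /* skip the entry and the ':' */
        }
}
```

With these helpers, `/usr:/opt` and `/usr/:/opt/` would pass, while `/`, `usr` and `/usr:/proc/` would be rejected. Whether a hierarchy has API file systems as submounts cannot be determined from the string alone, so that part of the documented rule is left to the administrator.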
diff --git a/man/os-release.xml b/man/os-release.xml index 674180679b5..5a0cfd2887c 100644 --- a/man/os-release.xml +++ b/man/os-release.xml @@ -317,6 +317,17 @@ + + SYSEXT_LEVEL= + + A lower-case string (mostly numeric, no spaces or other characters outside of 0–9, + a–z, ".", "_" and "-") identifying the operating system extensions support level, to indicate which + extension images are supported (See: + systemd-sysext8). + Example: SYSEXT_LEVEL=2 or + SYSEXT_LEVEL=15.14. + + If you are reading this file from C code or a shell script diff --git a/man/rules/meson.build b/man/rules/meson.build index b8cb96ac22f..38d58307fe0 100644 --- a/man/rules/meson.build +++ b/man/rules/meson.build @@ -954,6 +954,7 @@ manpages = [ 'systemd-suspend-then-hibernate.service'], ''], ['systemd-sysctl.service', '8', ['systemd-sysctl'], ''], + ['systemd-sysext', '8', ['systemd-sysext.service'], ''], ['systemd-system-update-generator', '8', [], ''], ['systemd-system.conf', '5', diff --git a/man/systemd-sysext.xml b/man/systemd-sysext.xml new file mode 100644 index 00000000000..6bda5f4fc65 --- /dev/null +++ b/man/systemd-sysext.xml @@ -0,0 +1,239 @@ + + + + + + + + systemd-sysext + systemd + + + + systemd-sysext + 8 + + + + systemd-sysext + systemd-sysext.service + Activates System Extension Images + + + + + systemd-sysext + OPTIONS + + + systemd-sysext.service + + + + + Description + + systemd-sysext activates/deactivates system extension images. System extension + images may – dynamically at runtime — extend the /usr/ and + /opt/ directory hierarchies with additional files. This is particularly useful on + immutable system images where a /usr/ and/or /opt/ hierarchy + residing on a read-only file system shall be extended temporarily at runtime without making any + persistent modifications. + + System extension images should contain files and directories similar in fashion to regular + operating system tree. 
When one or more system extension images are activated, their + /usr/ and /opt/ hierarchies are combined via + overlayfs with the same hierarchies of the host OS, and the host + /usr/ and /opt/ are overmounted with it ("merging"). When they are + deactivated, the mount point is disassembled, again revealing the unmodified original host version of + the hierarchy ("unmerging"). Merging thus makes the extension's resources suddenly appear below the + /usr/ and /opt/ hierarchies as if they were included in the + base OS image itself. Unmerging makes them disappear again, leaving in place only the files that were + shipped with the base OS image itself. + + Files and directories contained in the extension images outside of the /usr/ + and /opt/ hierarchies are not merged, and hence have no effect + when included in a system extension image. In particular, files in the /etc/ and + /var/ hierarchies included in a system extension image will not appear in + the respective hierarchies after activation. + + System extension images are strictly read-only, and the host /usr/ and + /opt/ hierarchies become read-only too while they are activated. + + System extensions are supposed to be purely additive, i.e. they are supposed to include only files + that do not exist in the underlying basic OS image. The underlying mechanism (overlayfs) also + allows removing files, but it is recommended not to make use of this. + + System extension images may be provided in the following formats: + + + Plain directories or btrfs subvolumes containing the OS tree + Disk images with a GPT disk label, following the Discoverable Partition Specification + Disk images lacking a partition table, with a naked Linux file system (e.g. squashfs or ext4) + + + These image formats are the same ones that + systemd-nspawn1 + supports via its / switches and those that the + service manager supports via /. Similar to + them they may optionally carry Verity authentication information. 
+ + System extensions are automatically looked for in the directories + /etc/extensions/, /run/extensions/, + /var/lib/extensions/, /usr/lib/extensions/ and + /usr/local/lib/extensions/. The first two listed directories are not suitable for + carrying large binary images, but are still useful for carrying symlinks to them. The primary place + for installing system extensions is /var/lib/extensions/. Any directories found in + these search directories are considered directory based extension images; any files with the + .raw suffix are considered disk image based extension images. + + During boot OS extension images are activated automatically, if the + systemd-sysext.service is enabled. Note that this service runs only after the + underlying file systems where system extensions are searched are mounted. This means they are not + suitable for shipping resources that are processed by subsystems running in earliest boot. Specifically, + OS extension images are not suitable for shipping system services or + systemd-sysusers8 + definitions. See Portable Services for a simple + mechanism for shipping system services in disk images, in a similar fashion to OS extensions. Note the + different isolation of these two mechanisms: while system extensions directly extend the underlying OS + image with additional files that appear much as if they were shipped in the OS image + itself and thus imply no security isolation, portable services imply service level sandboxing in one way + or another. The systemd-sysext.service service is guaranteed to finish start-up + before basic.target is reached; i.e. at the time regular services initialize (those + which do not use DefaultDependencies=no), the files and directories system extensions + provide are available in /usr/ and /opt/ and may be + accessed. + + Note that there is no concept of enabling/disabling installed system extension images: all + installed extension images are automatically activated at boot. 
+ + A simple mechanism for version compatibility is enforced: a system extension image must carry a + /usr/lib/extension-release.d/extension-release.$name + file, whose name suffix must match the image name and whose contents are compared with the host os-release + file: the contained ID= fields have to match, as well as the + SYSEXT_LEVEL= field (if defined). If the latter is not defined, the + VERSION_ID= field has to match instead. System extensions should not ship a + /usr/lib/os-release file (as that would be merged into the host + /usr/ tree, overriding the host OS version data, which is not desirable). The + extension-release file follows the same format and semantics, and carries the same + content, as the os-release file of the OS, but it describes the resources carried + in the extension image. + + + + Uses + + The primary use case for system extension images is immutable environments where debugging and development + tools shall optionally be made available, but not included in the immutable base OS image itself + (e.g. strace and gdb shall be an optionally installable + addition in order to make debugging/development easier). System extension images should not be + misunderstood as a generic software packaging framework, as no dependency scheme is available: system + extensions should carry all files they need themselves, except for those already shipped in the + underlying host system image. Typically, system extension images are built at the same time as the base + OS image, within the same build system. + + Another use case for the system extension concept is temporarily overriding OS supplied resources + with newer ones, for example to install a locally compiled development version of some low-level + component over the immutable OS image without doing a full OS rebuild or modifying the nominally + immutable image. (e.g. 
"install" a locally built package with DESTDIR=/var/lib/extensions/mytest + make install && systemd-sysext --refresh, making it available in + /usr/ as if it was installed in the OS image itself.) This case works regardless if + the underlying host /usr/ is managed as immutable disk image or is a traditional + package manager controlled (i.e. writable) tree. + + + + Commands + + The following command switches are understood: + + + + + + Merges all currently installed system extension images into + /usr/ and /opt/, by overmounting these hierarchies with an + overlayfs file system combining the underlying hierarchies with those included in + the extension images. This command will fail if the hierarchies are already merged. + + + + + + Unmerges all currently installed system extension images from + /usr/ and /opt/, by unmounting the + overlayfs file systems created by + prior. + + + + + + A combination of and : if already + mounted the existing overlayfs instance is unmounted temporarily, and then + replaced by a new version. This command is useful after installing/removing system extension images, + in order to update the overlayfs file system accordingly. If no system extensions + are installed when this command is executed, the equivalent of is + executed, without establishing any new overlayfs instance. Note that currently + there's a brief moment where neither the old nor the new overlayfs file system is + mounted. This implies that all resources supplied by a system extension will briefly disappear — even + if it exists continuously during the refresh operation. + + + + + + + A brief list of installed extension images is shown. + + + + + + + When invoked without any command switches, the current merge status is shown, separately for both + /usr/ and /opt/. + + + + Options + + + + + + Operate relative to the specified root directory, i.e. 
establish the + overlayfs mount not on the top-level host /usr/ and + /opt/ hierarchies, but below some specified root directory. + + + + + + Generate JSON output, instead of human readable tabular output. Takes one of + short, pretty or off in order to control the + output style, or explicitly disabling JSON output. + + + + + + + + Exit status + + On success, 0 is returned. + + + + See Also + + systemd1, + systemd-nspawn1 + + + + diff --git a/meson.build b/meson.build index c12b399b5f2..14b919d0c9f 100644 --- a/meson.build +++ b/meson.build @@ -1502,6 +1502,7 @@ foreach term : ['analyze', 'nss-myhostname', 'nss-systemd', 'portabled', + 'sysext', 'pstore', 'quotacheck', 'randomseed', @@ -1745,6 +1746,7 @@ subdir('src/portable') subdir('src/pstore') subdir('src/resolve') subdir('src/shutdown') +subdir('src/sysext') subdir('src/systemctl') subdir('src/timedate') subdir('src/timesync') @@ -2202,6 +2204,17 @@ if conf.get('ENABLE_PORTABLED') == 1 install_dir : rootbindir) endif +if conf.get('ENABLE_SYSEXT') == 1 + public_programs += executable( + 'systemd-sysext', + systemd_sysext_sources, + include_directories : includes, + link_with : [libshared], + install_rpath : rootlibexecdir, + install : true, + install_dir : rootlibexecdir) +endif + if conf.get('ENABLE_USERDB') == 1 executable( 'systemd-userwork', @@ -2390,8 +2403,7 @@ if conf.get('HAVE_LIBCRYPTSETUP') == 1 libopenssl, libp11kit], install_rpath : rootlibexecdir, - install : true, - install_dir : bindir) + install : true) endif if conf.get('HAVE_SYSV_COMPAT') == 1 @@ -3735,6 +3747,7 @@ foreach tuple : [ ['logind'], ['machined'], ['portabled'], + ['sysext'], ['userdb'], ['homed'], ['importd'], diff --git a/meson_options.txt b/meson_options.txt index 1707f64c177..a4214730299 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -111,6 +111,8 @@ option('machined', type : 'boolean', description : 'install the systemd-machined stack') option('portabled', type : 'boolean', description : 'install the 
systemd-portabled stack') +option('sysext', type : 'boolean', + description : 'install the systemd-sysext stack') option('userdb', type : 'boolean', description : 'install the systemd-userdbd stack') option('homed', type : 'combo', choices : ['auto', 'true', 'false'], diff --git a/src/import/export.c b/src/import/export.c index b8507330ac0..9c44fcf35cc 100644 --- a/src/import/export.c +++ b/src/import/export.c @@ -66,7 +66,7 @@ static int export_tar(int argc, char *argv[], void *userdata) { int r, fd; if (hostname_is_valid(argv[1], 0)) { - r = image_find(IMAGE_MACHINE, argv[1], &image); + r = image_find(IMAGE_MACHINE, argv[1], NULL, &image); if (r == -ENOENT) return log_error_errno(r, "Machine image %s not found.", argv[1]); if (r < 0) @@ -142,7 +142,7 @@ static int export_raw(int argc, char *argv[], void *userdata) { int r, fd; if (hostname_is_valid(argv[1], 0)) { - r = image_find(IMAGE_MACHINE, argv[1], &image); + r = image_find(IMAGE_MACHINE, argv[1], NULL, &image); if (r == -ENOENT) return log_error_errno(r, "Machine image %s not found.", argv[1]); if (r < 0) diff --git a/src/import/import-fs.c b/src/import/import-fs.c index a22eef82558..a36ab24fb8e 100644 --- a/src/import/import-fs.c +++ b/src/import/import-fs.c @@ -132,7 +132,7 @@ static int import_fs(int argc, char *argv[], void *userdata) { local); if (!arg_force) { - r = image_find(IMAGE_MACHINE, local, NULL); + r = image_find(IMAGE_MACHINE, local, NULL, NULL); if (r < 0) { if (r != -ENOENT) return log_error_errno(r, "Failed to check whether image '%s' exists: %m", local); diff --git a/src/import/import.c b/src/import/import.c index 9ea8e7f16dd..3fd99d11603 100644 --- a/src/import/import.c +++ b/src/import/import.c @@ -70,7 +70,7 @@ static int import_tar(int argc, char *argv[], void *userdata) { local); if (!arg_force) { - r = image_find(IMAGE_MACHINE, local, NULL); + r = image_find(IMAGE_MACHINE, local, NULL, NULL); if (r < 0) { if (r != -ENOENT) return log_error_errno(r, "Failed to check whether image 
'%s' exists: %m", local); @@ -165,7 +165,7 @@ static int import_raw(int argc, char *argv[], void *userdata) { local); if (!arg_force) { - r = image_find(IMAGE_MACHINE, local, NULL); + r = image_find(IMAGE_MACHINE, local, NULL, NULL); if (r < 0) { if (r != -ENOENT) return log_error_errno(r, "Failed to check whether image '%s' exists: %m", local); diff --git a/src/import/pull.c b/src/import/pull.c index a4cec9448e2..e80d8abe6f5 100644 --- a/src/import/pull.c +++ b/src/import/pull.c @@ -78,7 +78,7 @@ static int pull_tar(int argc, char *argv[], void *userdata) { local); if (!arg_force) { - r = image_find(IMAGE_MACHINE, local, NULL); + r = image_find(IMAGE_MACHINE, local, NULL, NULL); if (r < 0) { if (r != -ENOENT) return log_error_errno(r, "Failed to check whether image '%s' exists: %m", local); @@ -164,7 +164,7 @@ static int pull_raw(int argc, char *argv[], void *userdata) { local); if (!arg_force) { - r = image_find(IMAGE_MACHINE, local, NULL); + r = image_find(IMAGE_MACHINE, local, NULL, NULL); if (r < 0) { if (r != -ENOENT) return log_error_errno(r, "Failed to check whether image '%s' exists: %m", local); diff --git a/src/machine/image-dbus.c b/src/machine/image-dbus.c index c157aaf33cb..4c4f900527a 100644 --- a/src/machine/image-dbus.c +++ b/src/machine/image-dbus.c @@ -408,7 +408,7 @@ static int image_object_find(sd_bus *bus, const char *path, const char *interfac if (r < 0) return r; - r = image_find(IMAGE_MACHINE, e, &image); + r = image_find(IMAGE_MACHINE, e, NULL, &image); if (r == -ENOENT) return 0; if (r < 0) @@ -452,7 +452,7 @@ static int image_node_enumerator(sd_bus *bus, const char *path, void *userdata, if (!images) return -ENOMEM; - r = image_discover(IMAGE_MACHINE, images); + r = image_discover(IMAGE_MACHINE, NULL, images); if (r < 0) return r; diff --git a/src/machine/machined-dbus.c b/src/machine/machined-dbus.c index 781686c1048..a65f9b6a8e5 100644 --- a/src/machine/machined-dbus.c +++ b/src/machine/machined-dbus.c @@ -124,7 +124,7 @@ static int 
method_get_image(sd_bus_message *message, void *userdata, sd_bus_erro if (r < 0) return r; - r = image_find(IMAGE_MACHINE, name, NULL); + r = image_find(IMAGE_MACHINE, name, NULL, NULL); if (r == -ENOENT) return sd_bus_error_setf(error, BUS_ERROR_NO_SUCH_IMAGE, "No image '%s' known", name); if (r < 0) @@ -480,7 +480,7 @@ static int method_list_images(sd_bus_message *message, void *userdata, sd_bus_er if (!images) return -ENOMEM; - r = image_discover(IMAGE_MACHINE, images); + r = image_discover(IMAGE_MACHINE, NULL, images); if (r < 0) return r; @@ -562,7 +562,7 @@ static int redirect_method_to_image(sd_bus_message *message, Manager *m, sd_bus_ if (!image_name_is_valid(name)) return sd_bus_error_setf(error, SD_BUS_ERROR_INVALID_ARGS, "Image name '%s' is invalid.", name); - r = image_find(IMAGE_MACHINE, name, &i); + r = image_find(IMAGE_MACHINE, name, NULL, &i); if (r == -ENOENT) return sd_bus_error_setf(error, BUS_ERROR_NO_SUCH_IMAGE, "No image '%s' known", name); if (r < 0) @@ -755,7 +755,7 @@ static int method_clean_pool(sd_bus_message *message, void *userdata, sd_bus_err goto child_fail; } - r = image_discover(IMAGE_MACHINE, images); + r = image_discover(IMAGE_MACHINE, NULL, images); if (r < 0) goto child_fail; diff --git a/src/nspawn/nspawn.c b/src/nspawn/nspawn.c index e68f0cebf0f..75cefe84142 100644 --- a/src/nspawn/nspawn.c +++ b/src/nspawn/nspawn.c @@ -2940,7 +2940,7 @@ static int determine_names(void) { if (arg_machine) { _cleanup_(image_unrefp) Image *i = NULL; - r = image_find(IMAGE_MACHINE, arg_machine, &i); + r = image_find(IMAGE_MACHINE, arg_machine, NULL, &i); if (r == -ENOENT) return log_error_errno(r, "No image for machine '%s'.", arg_machine); if (r < 0) diff --git a/src/portable/portable.c b/src/portable/portable.c index a96a944ad1a..d74e498d596 100644 --- a/src/portable/portable.c +++ b/src/portable/portable.c @@ -495,7 +495,7 @@ int portable_extract( assert(name_or_path); - r = image_find_harder(IMAGE_PORTABLE, name_or_path, &image); + r = 
image_find_harder(IMAGE_PORTABLE, name_or_path, NULL, &image); if (r < 0) return r; @@ -953,7 +953,7 @@ static int install_image_symlink( /* If the image is outside of the image search also link it into it, so that it can be found with short image * names and is listed among the images. */ - if (image_in_search_path(IMAGE_PORTABLE, image_path)) + if (image_in_search_path(IMAGE_PORTABLE, NULL, image_path)) return 0; r = image_symlink(image_path, flags, &sl); @@ -987,7 +987,7 @@ int portable_attach( assert(name_or_path); - r = image_find_harder(IMAGE_PORTABLE, name_or_path, &image); + r = image_find_harder(IMAGE_PORTABLE, name_or_path, NULL, &image); if (r < 0) return r; @@ -1193,7 +1193,7 @@ int portable_detach( return log_debug_errno(r, "Failed to add unit name '%s' to set: %m", de->d_name); if (path_is_absolute(marker) && - !image_in_search_path(IMAGE_PORTABLE, marker)) { + !image_in_search_path(IMAGE_PORTABLE, NULL, marker)) { r = set_ensure_consume(&markers, &path_hash_ops_free, TAKE_PTR(marker)); if (r < 0) diff --git a/src/portable/portabled-image-bus.c b/src/portable/portabled-image-bus.c index eb0786e4bb0..76b6ddebde3 100644 --- a/src/portable/portabled-image-bus.c +++ b/src/portable/portabled-image-bus.c @@ -606,7 +606,7 @@ int bus_image_acquire( if (image_name_is_valid(name_or_path)) { /* If it's a short name, let's search for it */ - r = image_find(IMAGE_PORTABLE, name_or_path, &loaded); + r = image_find(IMAGE_PORTABLE, name_or_path, NULL, &loaded); if (r == -ENOENT) return sd_bus_error_setf(error, BUS_ERROR_NO_SUCH_PORTABLE_IMAGE, "No image '%s' found.", name_or_path); diff --git a/src/portable/portabled-image.c b/src/portable/portabled-image.c index b025c205490..40548fb6556 100644 --- a/src/portable/portabled-image.c +++ b/src/portable/portabled-image.c @@ -92,7 +92,7 @@ int manager_image_cache_discover(Manager *m, Hashmap *images, sd_bus_error *erro /* A wrapper around image_discover() (for finding images in search path) and 
portable_discover_attached() (for * finding attached images). */ - r = image_discover(IMAGE_PORTABLE, images); + r = image_discover(IMAGE_PORTABLE, NULL, images); if (r < 0) return r; diff --git a/src/shared/machine-image.c b/src/shared/machine-image.c index df288bc0e18..d2b726efc4c 100644 --- a/src/shared/machine-image.c +++ b/src/shared/machine-image.c @@ -42,18 +42,24 @@ #include "xattr-util.h" static const char* const image_search_path[_IMAGE_CLASS_MAX] = { - [IMAGE_MACHINE] = "/etc/machines\0" /* only place symlinks here */ - "/run/machines\0" /* and here too */ - "/var/lib/machines\0" /* the main place for images */ - "/var/lib/container\0" /* legacy */ - "/usr/local/lib/machines\0" - "/usr/lib/machines\0", + [IMAGE_MACHINE] = "/etc/machines\0" /* only place symlinks here */ + "/run/machines\0" /* and here too */ + "/var/lib/machines\0" /* the main place for images */ + "/var/lib/container\0" /* legacy */ + "/usr/local/lib/machines\0" + "/usr/lib/machines\0", - [IMAGE_PORTABLE] = "/etc/portables\0" /* only place symlinks here */ - "/run/portables\0" /* and here too */ - "/var/lib/portables\0" /* the main place for images */ - "/usr/local/lib/portables\0" - "/usr/lib/portables\0", + [IMAGE_PORTABLE] = "/etc/portables\0" /* only place symlinks here */ + "/run/portables\0" /* and here too */ + "/var/lib/portables\0" /* the main place for images */ + "/usr/local/lib/portables\0" + "/usr/lib/portables\0", + + [IMAGE_EXTENSION] = "/etc/extensions\0" /* only place symlinks here */ + "/run/extensions\0" /* and here too */ + "/var/lib/extensions\0" /* the main place for images */ + "/usr/local/lib/extensions\0" + "/usr/lib/extensions\0", }; static Image *image_free(Image *i) { @@ -415,7 +421,11 @@ static int image_make( return -EMEDIUMTYPE; } -int image_find(ImageClass class, const char *name, Image **ret) { +int image_find(ImageClass class, + const char *name, + const char *root, + Image **ret) { + const char *path; int r; @@ -428,20 +438,22 @@ int 
image_find(ImageClass class, const char *name, Image **ret) { return -ENOENT; NULSTR_FOREACH(path, image_search_path[class]) { + _cleanup_free_ char *resolved = NULL; _cleanup_closedir_ DIR *d = NULL; struct stat st; + int flags; - d = opendir(path); - if (!d) { - if (errno == ENOENT) - continue; + r = chase_symlinks_and_opendir(path, root, CHASE_PREFIX_ROOT, &resolved, &d); + if (r == -ENOENT) + continue; + if (r < 0) + return r; - return -errno; - } - - /* As mentioned above, we follow symlinks on this fstatat(), because we want to permit people to - * symlink block devices into the search path */ - if (fstatat(dirfd(d), name, &st, 0) < 0) { + /* As mentioned above, we follow symlinks on this fstatat(), because we want to permit people + * to symlink block devices into the search path. (For now, we disable that when operating + * relative to some root directory.) */ + flags = root ? AT_SYMLINK_NOFOLLOW : 0; + if (fstatat(dirfd(d), name, &st, flags) < 0) { _cleanup_free_ char *raw = NULL; if (errno != ENOENT) @@ -451,8 +463,7 @@ int image_find(ImageClass class, const char *name, Image **ret) { if (!raw) return -ENOMEM; - if (fstatat(dirfd(d), raw, &st, 0) < 0) { - + if (fstatat(dirfd(d), raw, &st, flags) < 0) { if (errno == ENOENT) continue; @@ -462,13 +473,13 @@ int image_find(ImageClass class, const char *name, Image **ret) { if (!S_ISREG(st.st_mode)) continue; - r = image_make(name, dirfd(d), path, raw, &st, ret); + r = image_make(name, dirfd(d), resolved, raw, &st, ret); } else { if (!S_ISDIR(st.st_mode) && !S_ISBLK(st.st_mode)) continue; - r = image_make(name, dirfd(d), path, name, &st, ret); + r = image_make(name, dirfd(d), resolved, name, &st, ret); } if (IN_SET(r, -ENOENT, -EMEDIUMTYPE)) continue; @@ -482,7 +493,7 @@ int image_find(ImageClass class, const char *name, Image **ret) { } if (class == IMAGE_MACHINE && streq(name, ".host")) { - r = image_make(".host", AT_FDCWD, NULL, "/", NULL, ret); + r = image_make(".host", AT_FDCWD, NULL, empty_to_root(root), 
NULL, ret); if (r < 0) return r; @@ -507,14 +518,18 @@ int image_from_path(const char *path, Image **ret) { return image_make(NULL, AT_FDCWD, NULL, path, NULL, ret); } -int image_find_harder(ImageClass class, const char *name_or_path, Image **ret) { +int image_find_harder(ImageClass class, const char *name_or_path, const char *root, Image **ret) { if (image_name_is_valid(name_or_path)) - return image_find(class, name_or_path, ret); + return image_find(class, name_or_path, root, ret); return image_from_path(name_or_path, ret); } -int image_discover(ImageClass class, Hashmap *h) { +int image_discover( + ImageClass class, + const char *root, + Hashmap *h) { + const char *path; int r; @@ -523,29 +538,30 @@ int image_discover(ImageClass class, Hashmap *h) { assert(h); NULSTR_FOREACH(path, image_search_path[class]) { + _cleanup_free_ char *resolved = NULL; _cleanup_closedir_ DIR *d = NULL; struct dirent *de; - d = opendir(path); - if (!d) { - if (errno == ENOENT) - continue; - - return -errno; - } + r = chase_symlinks_and_opendir(path, root, CHASE_PREFIX_ROOT, &resolved, &d); + if (r == -ENOENT) + continue; + if (r < 0) + return r; FOREACH_DIRENT_ALL(de, d, return -errno) { _cleanup_(image_unrefp) Image *image = NULL; _cleanup_free_ char *truncated = NULL; const char *pretty; struct stat st; + int flags; if (dot_or_dot_dot(de->d_name)) continue; - /* As mentioned above, we follow symlinks on this fstatat(), because we want to permit people - * to symlink block devices into the search path */ - if (fstatat(dirfd(d), de->d_name, &st, 0) < 0) { + /* As mentioned above, we follow symlinks on this fstatat(), because we want to + * permit people to symlink block devices into the search path. */ + flags = root ? 
AT_SYMLINK_NOFOLLOW : 0; + if (fstatat(dirfd(d), de->d_name, &st, flags) < 0) { if (errno == ENOENT) continue; @@ -575,7 +591,7 @@ int image_discover(ImageClass class, Hashmap *h) { if (hashmap_contains(h, pretty)) continue; - r = image_make(pretty, dirfd(d), path, de->d_name, &st, &image); + r = image_make(pretty, dirfd(d), resolved, de->d_name, &st, &image); if (IN_SET(r, -ENOENT, -EMEDIUMTYPE)) continue; if (r < 0) @@ -594,7 +610,7 @@ int image_discover(ImageClass class, Hashmap *h) { if (class == IMAGE_MACHINE && !hashmap_contains(h, ".host")) { _cleanup_(image_unrefp) Image *image = NULL; - r = image_make(".host", AT_FDCWD, NULL, "/", NULL, &image); + r = image_make(".host", AT_FDCWD, NULL, empty_to_root(root), NULL, &image); if (r < 0) return r; @@ -737,7 +753,7 @@ int image_rename(Image *i, const char *new_name) { if (r < 0) return r; - r = image_find(IMAGE_MACHINE, new_name, NULL); + r = image_find(IMAGE_MACHINE, new_name, NULL, NULL); if (r >= 0) return -EEXIST; if (r != -ENOENT) @@ -850,7 +866,7 @@ int image_clone(Image *i, const char *new_name, bool read_only) { if (r < 0) return r; - r = image_find(IMAGE_MACHINE, new_name, NULL); + r = image_find(IMAGE_MACHINE, new_name, NULL, NULL); if (r >= 0) return -EEXIST; if (r != -ENOENT) @@ -1242,16 +1258,27 @@ bool image_name_is_valid(const char *s) { return true; } -bool image_in_search_path(ImageClass class, const char *image) { +bool image_in_search_path( + ImageClass class, + const char *root, + const char *image) { + const char *path; assert(image); NULSTR_FOREACH(path, image_search_path[class]) { - const char *p; + const char *p, *q; size_t k; - p = path_startswith(image, path); + if (!empty_or_root(root)) { + q = path_startswith(path, root); + if (!q) + continue; + } else + q = path; + + p = path_startswith(q, path); if (!p) continue; diff --git a/src/shared/machine-image.h b/src/shared/machine-image.h index 95a8f5cfbd4..c568fff751b 100644 --- a/src/shared/machine-image.h +++ b/src/shared/machine-image.h 
@@ -16,6 +16,7 @@ typedef enum ImageClass { IMAGE_MACHINE, IMAGE_PORTABLE, + IMAGE_EXTENSION, _IMAGE_CLASS_MAX, _IMAGE_CLASS_INVALID = -1 } ImageClass; @@ -61,10 +62,10 @@ Image *image_ref(Image *i); DEFINE_TRIVIAL_CLEANUP_FUNC(Image*, image_unref); -int image_find(ImageClass class, const char *name, Image **ret); +int image_find(ImageClass class, const char *name, const char *root, Image **ret); int image_from_path(const char *path, Image **ret); -int image_find_harder(ImageClass class, const char *name_or_path, Image **ret); -int image_discover(ImageClass class, Hashmap *map); +int image_find_harder(ImageClass class, const char *name_or_path, const char *root, Image **ret); +int image_discover(ImageClass class, const char *root, Hashmap *map); int image_remove(Image *i); int image_rename(Image *i, const char *new_name); @@ -83,7 +84,7 @@ int image_set_limit(Image *i, uint64_t referenced_max); int image_read_metadata(Image *i); -bool image_in_search_path(ImageClass class, const char *image); +bool image_in_search_path(ImageClass class, const char *root, const char *image); static inline bool IMAGE_IS_HIDDEN(const struct Image *i) { assert(i); diff --git a/src/shared/os-util.c b/src/shared/os-util.c index 3b7e4958464..d1cf41283b1 100644 --- a/src/shared/os-util.c +++ b/src/shared/os-util.c @@ -6,6 +6,7 @@ #include "fileio.h" #include "fs-util.h" #include "macro.h" +#include "machine-image.h" #include "os-util.h" #include "string-util.h" #include "strv.h" @@ -31,17 +32,31 @@ int path_is_os_tree(const char *path) { return 1; } -int open_os_release(const char *root, char **ret_path, int *ret_fd) { +int open_extension_release(const char *root, const char *extension, char **ret_path, int *ret_fd) { _cleanup_free_ char *q = NULL; - const char *p; int r, fd; - FOREACH_STRING(p, "/etc/os-release", "/usr/lib/os-release") { - r = chase_symlinks(p, root, CHASE_PREFIX_ROOT, - ret_path ? &q : NULL, - ret_fd ? 
&fd : NULL); - if (r != -ENOENT) - break; + if (extension) { + const char *extension_full_path; + + if (!image_name_is_valid(extension)) + return log_debug_errno(SYNTHETIC_ERRNO(EINVAL), + "The extension name %s is invalid.", extension); + + extension_full_path = strjoina("/usr/lib/extension-release.d/extension-release.", extension); + r = chase_symlinks(extension_full_path, root, CHASE_PREFIX_ROOT, + ret_path ? &q : NULL, + ret_fd ? &fd : NULL); + } else { + const char *p; + + FOREACH_STRING(p, "/etc/os-release", "/usr/lib/os-release") { + r = chase_symlinks(p, root, CHASE_PREFIX_ROOT, + ret_path ? &q : NULL, + ret_fd ? &fd : NULL); + if (r != -ENOENT) + break; + } } if (r < 0) return r; @@ -64,16 +79,16 @@ int open_os_release(const char *root, char **ret_path, int *ret_fd) { return 0; } -int fopen_os_release(const char *root, char **ret_path, FILE **ret_file) { +int fopen_extension_release(const char *root, const char *extension, char **ret_path, FILE **ret_file) { _cleanup_free_ char *p = NULL; _cleanup_close_ int fd = -1; FILE *f; int r; if (!ret_file) - return open_os_release(root, ret_path, NULL); + return open_extension_release(root, extension, ret_path, NULL); - r = open_os_release(root, ret_path ? &p : NULL, &fd); + r = open_extension_release(root, extension, ret_path ? &p : NULL, &fd); if (r < 0) return r; @@ -89,18 +104,35 @@ int fopen_os_release(const char *root, char **ret_path, FILE **ret_file) { return 0; } -int parse_os_release(const char *root, ...) { +static int parse_release_internal(const char *root, const char *extension, va_list ap) { _cleanup_fclose_ FILE *f = NULL; _cleanup_free_ char *p = NULL; - va_list ap; int r; - r = fopen_os_release(root, &p, &f); + r = fopen_extension_release(root, extension, &p, &f); if (r < 0) return r; + return parse_env_filev(f, p, ap); +} + +int parse_extension_release(const char *root, const char *extension, ...) 
{ + va_list ap; + int r; + + va_start(ap, extension); + r = parse_release_internal(root, extension, ap); + va_end(ap); + + return r; +} + +int parse_os_release(const char *root, ...) { + va_list ap; + int r; + va_start(ap, root); - r = parse_env_filev(f, p, ap); + r = parse_release_internal(root, NULL, ap); va_end(ap); return r; diff --git a/src/shared/os-util.h b/src/shared/os-util.h index 1d9b0b146b3..5b724eb7ac1 100644 --- a/src/shared/os-util.h +++ b/src/shared/os-util.h @@ -5,9 +5,19 @@ int path_is_os_tree(const char *path); -int open_os_release(const char *root, char **ret_path, int *ret_fd); -int fopen_os_release(const char *root, char **ret_path, FILE **ret_file); +/* The *_extension_release flavours will look for /usr/lib/extension-release.d/extension-release.NAME + * in accordance with the OS extension specification, rather than for /usr/lib/os-release or /etc/os-release. */ +int open_extension_release(const char *root, const char *extension, char **ret_path, int *ret_fd); +static inline int open_os_release(const char *root, char **ret_path, int *ret_fd) { + return open_extension_release(root, NULL, ret_path, ret_fd); +} +int fopen_extension_release(const char *root, const char *extension, char **ret_path, FILE **ret_file); +static inline int fopen_os_release(const char *root, char **ret_path, FILE **ret_file) { + return fopen_extension_release(root, NULL, ret_path, ret_file); +} + +int parse_extension_release(const char *root, const char *extension, ...) _sentinel_; int parse_os_release(const char *root, ...) 
_sentinel_; int load_os_release_pairs(const char *root, char ***ret); int load_os_release_pairs_with_prefix(const char *root, const char *prefix, char ***ret); diff --git a/src/sysext/meson.build b/src/sysext/meson.build new file mode 100644 index 00000000000..1517df414e8 --- /dev/null +++ b/src/sysext/meson.build @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: LGPL-2.1-or-later + +systemd_sysext_sources = files(''' + sysext.c +'''.split()) diff --git a/src/sysext/sysext.c b/src/sysext/sysext.c new file mode 100644 index 00000000000..8a8cd7535ed --- /dev/null +++ b/src/sysext/sysext.c @@ -0,0 +1,1018 @@ +/* SPDX-License-Identifier: LGPL-2.1-or-later */ + +#include +#include +#include +#include + +#include "capability-util.h" +#include "dissect-image.h" +#include "escape.h" +#include "fd-util.h" +#include "fileio.h" +#include "format-table.h" +#include "fs-util.h" +#include "hashmap.h" +#include "log.h" +#include "machine-image.h" +#include "main-func.h" +#include "missing_magic.h" +#include "mkdir.h" +#include "mount-util.h" +#include "mountpoint-util.h" +#include "os-util.h" +#include "pager.h" +#include "parse-util.h" +#include "pretty-print.h" +#include "process-util.h" +#include "sort-util.h" +#include "stat-util.h" +#include "terminal-util.h" +#include "user-util.h" + +static enum { + ACTION_STATUS, + ACTION_MERGE, + ACTION_UNMERGE, + ACTION_REFRESH, + ACTION_LIST, +} arg_action = ACTION_STATUS; +static char **arg_hierarchies = NULL; /* "/usr" + "/opt" by default */ +static char *arg_root = NULL; +static JsonFormatFlags arg_json_format_flags = JSON_FORMAT_OFF; +static PagerFlags arg_pager_flags = 0; + +STATIC_DESTRUCTOR_REGISTER(arg_hierarchies, strv_freep); +STATIC_DESTRUCTOR_REGISTER(arg_root, freep); + +static int is_our_mount_point(const char *p) { + _cleanup_free_ char *buf = NULL, *f = NULL; + struct stat st; + dev_t dev; + int r; + + r = path_is_mount_point(p, NULL, 0); + if (r == -ENOENT) { + log_debug_errno(r, "Hierarchy '%s' doesn't exist.", p); + return 
false; + } + if (r < 0) + return log_error_errno(r, "Failed to determine whether '%s' is a mount point: %m", p); + if (r == 0) { + log_debug("Hierarchy '%s' is not a mount point, skipping.", p); + return false; + } + + /* So we know now that it's a mount point. Now let's check if it's one of ours, so that we don't + * accidentally unmount the user's own /usr/ but just the mounts we established ourselves. We do this + * check by looking into the metadata directory we place in merged mounts: if the file + * .systemd-sysext/dev contains the major/minor device pair of the mount we have a good reason to + * believe this is one of our mounts. This thorough check has the benefit that we aren't easily + * confused if people tar up one of our merged trees and untar them elsewhere where we might mistake + * them for a live sysext tree. */ + + f = path_join(p, ".systemd-sysext/dev"); + if (!f) + return log_oom(); + + r = read_one_line_file(f, &buf); + if (r == -ENOENT) { + log_debug("Hierarchy '%s' does not carry a .systemd-sysext/dev file, not a sysext merged tree.", p); + return false; + } + if (r < 0) + return log_error_errno(r, "Failed to determine whether hierarchy '%s' contains '.systemd-sysext/dev': %m", p); + + r = parse_dev(buf, &dev); + if (r < 0) + return log_error_errno(r, "Failed to parse device major/minor stored in '.systemd-sysext/dev' file on '%s': %m", p); + + if (lstat(p, &st) < 0) + return log_error_errno(errno, "Failed to stat %s: %m", p); + + if (st.st_dev != dev) { + log_debug("Hierarchy '%s' reports a different device major/minor than what we are seeing, assuming offline copy.", p); + return false; + } + + return true; +} + +static int unmerge_hierarchy(const char *p) { + int r; + + for (;;) { + /* We only unmount /usr/ if it is a mount point and really one of ours, in order not to break + * systems where /usr/ is a mount point of its own already. 
*/ + + r = is_our_mount_point(p); + if (r < 0) + return r; + if (r == 0) + break; + + r = umount_verbose(LOG_ERR, p, MNT_DETACH|UMOUNT_NOFOLLOW); + if (r < 0) + return log_error_errno(r, "Failed to unmount file system '%s': %m", p); + + log_info("Unmerged '%s'.", p); + } + + return 0; +} + +static int unmerge(void) { + int r, ret = 0; + char **p; + + STRV_FOREACH(p, arg_hierarchies) { + _cleanup_free_ char *resolved = NULL; + + r = chase_symlinks(*p, arg_root, CHASE_PREFIX_ROOT, &resolved, NULL); + if (r == -ENOENT) { + log_debug_errno(r, "Hierarchy '%s%s' does not exist, ignoring.", strempty(arg_root), *p); + continue; + } + if (r < 0) { + log_error_errno(r, "Failed to resolve path to hierarchy '%s%s': %m", strempty(arg_root), *p); + if (ret == 0) + ret = r; + + continue; + } + + r = unmerge_hierarchy(resolved); + if (r < 0 && ret == 0) + ret = r; + } + + return ret; +} + +static int status(void) { + _cleanup_(table_unrefp) Table *t = NULL; + int r, ret = 0; + char **p; + + t = table_new("hierarchy", "extensions", "since"); + if (!t) + return log_oom(); + + (void) table_set_empty_string(t, "-"); + + STRV_FOREACH(p, arg_hierarchies) { + _cleanup_free_ char *resolved = NULL, *f = NULL, *buf = NULL; + _cleanup_strv_free_ char **l = NULL; + struct stat st; + + r = chase_symlinks(*p, arg_root, CHASE_PREFIX_ROOT, &resolved, NULL); + if (r == -ENOENT) { + log_debug_errno(r, "Hierarchy '%s%s' does not exist, ignoring.", strempty(arg_root), *p); + continue; + } + if (r < 0) { + log_error_errno(r, "Failed to resolve path to hierarchy '%s%s': %m", strempty(arg_root), *p); + goto inner_fail; + } + + r = is_our_mount_point(resolved); + if (r < 0) + goto inner_fail; + if (r == 0) { + r = table_add_many( + t, + TABLE_PATH, *p, + TABLE_STRING, "none", + TABLE_SET_COLOR, ansi_grey(), + TABLE_EMPTY); + if (r < 0) + return table_log_add_error(r); + + continue; + } + + f = path_join(*p, ".systemd-sysext/extensions"); + if (!f) + return log_oom(); + + r = read_full_file(f, &buf, 
NULL); + if (r < 0) + return log_error_errno(r, "Failed to open '%s': %m", f); + + l = strv_split_newlines(buf); + if (!l) + return log_oom(); + + if (stat(*p, &st) < 0) + return log_error_errno(errno, "Failed to stat() '%s': %m", *p); + + r = table_add_many( + t, + TABLE_PATH, *p, + TABLE_STRV, l, + TABLE_TIMESTAMP, timespec_load(&st.st_mtim)); + if (r < 0) + return table_log_add_error(r); + + continue; + + inner_fail: + if (ret == 0) + ret = r; + } + + (void) table_set_sort(t, (size_t) 0, (size_t) -1); + + if (arg_json_format_flags & (JSON_FORMAT_OFF|JSON_FORMAT_PRETTY|JSON_FORMAT_PRETTY_AUTO)) + (void) pager_open(arg_pager_flags); + + r = table_print_json(t, stdout, arg_json_format_flags); + if (r < 0) + return table_log_print_error(r); + + return ret; +} + +static int mount_overlayfs( + const char *where, + char **layers) { + + _cleanup_free_ char *options = NULL; + bool separator = false; + char **l; + int r; + + assert(where); + + options = strdup("lowerdir="); + if (!options) + return log_oom(); + + STRV_FOREACH(l, layers) { + _cleanup_free_ char *escaped = NULL; + + escaped = shell_escape(*l, ",:"); + if (!escaped) + return log_oom(); + + if (!strextend(&options, separator ? ":" : "", escaped)) + return log_oom(); + + separator = true; + } + + /* Now mount the actual overlayfs */ + r = mount_nofollow_verbose(LOG_ERR, "sysext", where, "overlay", MS_RDONLY, options); + if (r < 0) + return r; + + return 0; +} + +static int merge_hierarchy( + const char *hierarchy, + char **extensions, + char **paths, + const char *meta_path, + const char *overlay_path) { + + _cleanup_free_ char *resolved_hierarchy = NULL, *f = NULL, *buf = NULL; + _cleanup_strv_free_ char **layers = NULL; + struct stat st; + char **p; + int r; + + assert(hierarchy); + assert(meta_path); + assert(overlay_path); + + /* Resolve the path of the host's version of the hierarchy, i.e. what we want to use as lowest layer + * in the overlayfs stack. 
*/ + r = chase_symlinks(hierarchy, arg_root, CHASE_PREFIX_ROOT, &resolved_hierarchy, NULL); + if (r == -ENOENT) + log_debug_errno(r, "Hierarchy '%s' on host doesn't exist, not merging.", hierarchy); + else if (r < 0) + return log_error_errno(r, "Failed to resolve host hierarchy '%s': %m", hierarchy); + else { + r = dir_is_empty(resolved_hierarchy); + if (r < 0) + return log_error_errno(r, "Failed to check if host hierarchy '%s' is empty: %m", resolved_hierarchy); + if (r > 0) { + log_debug("Host hierarchy '%s' is empty, not merging.", resolved_hierarchy); + resolved_hierarchy = mfree(resolved_hierarchy); + } + } + + /* Let's generate a metadata file that lists all extensions we took into account for this + * hierarchy. We include this in the final fs, to make things nicely discoverable and + * recognizable. */ + f = path_join(meta_path, ".systemd-sysext/extensions"); + if (!f) + return log_oom(); + + buf = strv_join(extensions, "\n"); + if (!buf) + return log_oom(); + + r = write_string_file(f, buf, WRITE_STRING_FILE_CREATE|WRITE_STRING_FILE_MKDIR_0755); + if (r < 0) + return log_error_errno(r, "Failed to write extension meta file '%s': %m", f); + + /* Put the meta path (i.e. 
our synthesized stuff) at the top of the layer stack */ + layers = strv_new(meta_path); + if (!layers) + return log_oom(); + + /* Put the extensions in the middle */ + STRV_FOREACH(p, paths) { + _cleanup_free_ char *resolved = NULL; + + r = chase_symlinks(hierarchy, *p, CHASE_PREFIX_ROOT, &resolved, NULL); + if (r == -ENOENT) { + log_debug_errno(r, "Hierarchy '%s' in extension '%s' doesn't exist, not merging.", hierarchy, *p); + continue; + } + if (r < 0) + return log_error_errno(r, "Failed to resolve hierarchy '%s' in extension '%s': %m", hierarchy, *p); + + r = dir_is_empty(resolved); + if (r < 0) + return log_error_errno(r, "Failed to check if hierarchy '%s' in extension '%s' is empty: %m", resolved, *p); + if (r > 0) { + log_debug("Hierarchy '%s' in extension '%s' is empty, not merging.", hierarchy, *p); + continue; + } + + r = strv_consume(&layers, TAKE_PTR(resolved)); + if (r < 0) + return log_oom(); + } + + if (!layers[1]) /* No extension with files in this hierarchy? Then don't do anything. */ + return 0; + + if (resolved_hierarchy) { + /* Add the host hierarchy as last (lowest) layer in the stack */ + r = strv_consume(&layers, TAKE_PTR(resolved_hierarchy)); + if (r < 0) + return log_oom(); + } + + r = mkdir_p(overlay_path, 0700); + if (r < 0) + return log_error_errno(r, "Failed to make directory '%s': %m", overlay_path); + + r = mount_overlayfs(overlay_path, layers); + if (r < 0) + return r; + + /* The overlayfs superblock is read-only. Let's also mark the bind mount read-only. Extra turbo safety 😎 */ + r = bind_remount_recursive(overlay_path, MS_RDONLY, MS_RDONLY, NULL); + if (r < 0) + return log_error_errno(r, "Failed to make bind mount '%s' read-only: %m", overlay_path); + + /* Now we have mounted the new file system. Let's now figure out its .st_dev field, and make that + * available in the metadata directory. 
This is useful to detect whether the metadata dir actually + * belongs to the fs it is found on: if .st_dev of the top-level mount matches it, it's pretty likely + * we are looking at a live sysext tree, and not an unpacked tar or so of one. */ + if (stat(overlay_path, &st) < 0) + return log_error_errno(errno, "Failed to stat mount '%s': %m", overlay_path); + + free(f); + f = path_join(meta_path, ".systemd-sysext/dev"); + if (!f) + return log_oom(); + + r = write_string_filef(f, WRITE_STRING_FILE_CREATE, "%u:%u", major(st.st_dev), minor(st.st_dev)); + if (r < 0) + return log_error_errno(r, "Failed to write '%s': %m", f); + + /* Make sure the top-level dir has an mtime marking the point we established the merge */ + if (utimensat(AT_FDCWD, meta_path, NULL, AT_SYMLINK_NOFOLLOW) < 0) + return log_error_errno(errno, "Failed to fix mtime of '%s': %m", meta_path); + + return 1; +} + +static int strverscmpp(char *const* a, char *const* b) { + /* usable in qsort() for sorting a string array with strverscmp() */ + return strverscmp(*a, *b); +} + +static int merge_subprocess(Hashmap *images, const char *workspace) { + _cleanup_free_ char *host_os_release_id = NULL, *host_os_release_version_id = NULL, *host_os_release_sysext_level = NULL, + *buf = NULL; + _cleanup_strv_free_ char **extensions = NULL, **paths = NULL; + size_t n_extensions = 0; + unsigned n_ignored = 0; + Image *img; + char **h; + int r; + + /* Mark the whole of /run as MS_SLAVE, so that we can mount stuff below it that doesn't show up on + * the host otherwise. */ + r = mount_nofollow_verbose(LOG_ERR, NULL, "/run", NULL, MS_SLAVE|MS_REC, NULL); + if (r < 0) + return log_error_errno(r, "Failed to remount /run/ MS_SLAVE: %m"); + + /* Let's create the workspace if it's missing */ + r = mkdir_p(workspace, 0700); + if (r < 0) + return log_error_errno(r, "Failed to create /run/systemd/sysext: %m"); + + /* Let's mount a tmpfs to our workspace. 
This way we don't need to clean up the inodes we mount over, + * but let the kernel do that entirely automatically, once our namespace dies. Note that this file + * system won't be visible to anyone but us, since we opened our own namespace and then made the + * /run/ hierarchy (which our workspace is contained in) MS_SLAVE, see above. */ + r = mount_nofollow_verbose(LOG_ERR, "sysext", workspace, "tmpfs", 0, "mode=0700"); + if (r < 0) + return r; + + /* Acquire host OS release info, so that we can compare it with the extension's data */ + r = parse_os_release( + arg_root, + "ID", &host_os_release_id, + "VERSION_ID", &host_os_release_version_id, + "SYSEXT_LEVEL", &host_os_release_sysext_level, + NULL); + if (r < 0) + return log_error_errno(r, "Failed to acquire 'os-release' data of OS tree '%s': %m", empty_to_root(arg_root)); + + /* Let's now mount all images */ + HASHMAP_FOREACH(img, images) { + _cleanup_free_ char *p = NULL, + *extension_release_id = NULL, *extension_release_version_id = NULL, *extension_release_sysext_level = NULL; + + p = path_join(workspace, "extensions", img->name); + if (!p) + return log_oom(); + + r = mkdir_p(p, 0700); + if (r < 0) + return log_error_errno(r, "Failed to create %s: %m", p); + + switch (img->type) { + case IMAGE_DIRECTORY: + case IMAGE_SUBVOLUME: + r = mount_nofollow_verbose(LOG_ERR, img->path, p, NULL, MS_BIND, NULL); + if (r < 0) + return r; + + /* Make this a read-only bind mount */ + r = bind_remount_recursive(p, MS_RDONLY, MS_RDONLY, NULL); + if (r < 0) + return log_error_errno(r, "Failed to make bind mount '%s' read-only: %m", p); + + break; + + case IMAGE_RAW: + case IMAGE_BLOCK: { + _cleanup_(dissected_image_unrefp) DissectedImage *m = NULL; + _cleanup_(loop_device_unrefp) LoopDevice *d = NULL; + _cleanup_(decrypted_image_unrefp) DecryptedImage *di = NULL; + _cleanup_(verity_settings_done) VeritySettings verity_settings = VERITY_SETTINGS_DEFAULT; + DissectImageFlags flags = 
DISSECT_IMAGE_READ_ONLY|DISSECT_IMAGE_REQUIRE_ROOT|DISSECT_IMAGE_MOUNT_ROOT_ONLY; + + r = verity_settings_load(&verity_settings, img->path, NULL, NULL); + if (r < 0) + return log_error_errno(r, "Failed to read verity artifacts for %s: %m", img->path); + + if (verity_settings.data_path) + flags |= DISSECT_IMAGE_NO_PARTITION_TABLE; + + r = loop_device_make_by_path(img->path, O_RDONLY, 0, &d); + if (r < 0) + return log_error_errno(r, "Failed to set up loopback device: %m"); + + r = dissect_image_and_warn( + d->fd, + img->path, + &verity_settings, + NULL, + flags, + &m); + if (r < 0) + return r; + + r = dissected_image_decrypt_interactively( + m, NULL, + &verity_settings, + flags, + &di); + if (r < 0) + return r; + + r = dissected_image_mount_and_warn( + m, + p, + UID_INVALID, + flags); + if (r < 0) + return r; + + if (di) { + r = decrypted_image_relinquish(di); + if (r < 0) + return log_error_errno(r, "Failed to relinquish DM devices: %m"); + } + + loop_device_relinquish(d); + break; + } + default: + assert_not_reached("Unsupported image type"); + } + + /* Insist that extension images do not overwrite the underlying OS release file (it's fine if + * they place one in /etc/os-release, i.e. where things don't matter, as they aren't + * merged.) 
*/ + r = chase_symlinks("/usr/lib/os-release", p, CHASE_PREFIX_ROOT, NULL, NULL); + if (r < 0) { + if (r != -ENOENT) + return log_error_errno(r, "Failed to determine whether /usr/lib/os-release exists in the extension image: %m"); + } else + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), + "Extension image contains /usr/lib/os-release file, which is not allowed (it may carry /etc/os-release), refusing."); + + /* Now that we can look into the extension image, let's see if the OS version is compatible */ + r = parse_extension_release( + p, + img->name, + "ID", &extension_release_id, + "VERSION_ID", &extension_release_version_id, + "SYSEXT_LEVEL", &extension_release_sysext_level, + NULL); + if (r == -ENOENT) { + log_notice_errno(r, "Extension '%s' carries no extension-release data, ignoring extension.", img->name); + n_ignored++; + continue; + } else if (r < 0) + return log_error_errno(r, "Failed to acquire 'os-release' data of extension '%s': %m", img->name); + else { + if (!streq_ptr(host_os_release_id, extension_release_id)) { + log_notice("Extension '%s' is for OS '%s', but running on '%s', ignoring extension.", + img->name, strna(extension_release_id), strna(host_os_release_id)); + n_ignored++; + continue; + } + + /* If the extension has a sysext API level declared, then it must match the host API level. 
Otherwise, compare OS version as a whole */ + if (extension_release_sysext_level) { + if (!streq_ptr(host_os_release_sysext_level, extension_release_sysext_level)) { + log_notice("Extension '%s' is for sysext API level '%s', but running on sysext API level '%s', ignoring extension.", + img->name, extension_release_sysext_level, strna(host_os_release_sysext_level)); + n_ignored++; + continue; + } + } else { + if (!streq_ptr(host_os_release_version_id, extension_release_version_id)) { + log_notice("Extension '%s' is for OS version '%s', but running on OS version '%s', ignoring extension.", + img->name, extension_release_version_id, strna(host_os_release_version_id)); + n_ignored++; + continue; + } + } + + log_debug("Version info of extension '%s' matches host.", img->name); + } + + /* Noice! This one is an extension we want. */ + r = strv_extend(&extensions, img->name); + if (r < 0) + return log_oom(); + + n_extensions ++; + } + + /* Nothing left? Then shortcut things */ + if (n_extensions == 0) { + if (n_ignored > 0) + log_info("No suitable extensions found (%u ignored due to incompatible version).", n_ignored); + else + log_info("No extensions found."); + return 0; + } + + /* Order by version sort (i.e. libc strverscmp()) */ + typesafe_qsort(extensions, n_extensions, strverscmpp); + + buf = strv_join(extensions, "', '"); + if (!buf) + return log_oom(); + + log_info("Using extensions '%s'.", buf); + + /* Build table of extension paths (in reverse order) */ + paths = new0(char*, n_extensions + 1); + if (!paths) + return log_oom(); + + for (size_t k = 0; k < n_extensions; k++) { + _cleanup_free_ char *p = NULL; + + assert_se(img = hashmap_get(images, extensions[n_extensions - 1 - k])); + + p = path_join(workspace, "extensions", img->name); + if (!p) + return log_oom(); + + paths[k] = TAKE_PTR(p); + } + + /* Let's now unmerge the status quo ante, since to build the new overlayfs we need a reference to the + * underlying fs. 
*/ + STRV_FOREACH(h, arg_hierarchies) { + _cleanup_free_ char *resolved = NULL; + + r = chase_symlinks(*h, arg_root, CHASE_PREFIX_ROOT|CHASE_NONEXISTENT, &resolved, NULL); + if (r < 0) + return log_error_errno(r, "Failed to resolve hierarchy '%s%s': %m", strempty(arg_root), *h); + + r = unmerge_hierarchy(resolved); + if (r < 0) + return r; + } + + /* Create overlayfs mounts for all hierarchies */ + STRV_FOREACH(h, arg_hierarchies) { + _cleanup_free_ char *meta_path = NULL, *overlay_path = NULL; + + meta_path = path_join(workspace, "meta", *h); /* The place where to store metadata about this instance */ + if (!meta_path) + return log_oom(); + + overlay_path = path_join(workspace, "overlay", *h); /* The resulting overlayfs instance */ + if (!overlay_path) + return log_oom(); + + r = merge_hierarchy(*h, extensions, paths, meta_path, overlay_path); + if (r < 0) + return r; + } + + /* And move them all into place. This is where things appear in the host namespace */ + STRV_FOREACH(h, arg_hierarchies) { + _cleanup_free_ char *p = NULL, *resolved = NULL; + + p = path_join(workspace, "overlay", *h); + if (!p) + return log_oom(); + + if (laccess(p, F_OK) < 0) { + if (errno != ENOENT) + return log_error_errno(errno, "Failed to check if '%s' exists: %m", p); + + /* Hierarchy apparently was empty in all extensions, and wasn't mounted, ignoring. 
*/ + continue; + } + + r = chase_symlinks(*h, arg_root, CHASE_PREFIX_ROOT|CHASE_NONEXISTENT, &resolved, NULL); + if (r < 0) + return log_error_errno(r, "Failed to resolve hierarchy '%s%s': %m", strempty(arg_root), *h); + + r = mkdir_p(resolved, 0755); + if (r < 0) + return log_error_errno(r, "Failed to create hierarchy mount point '%s': %m", resolved); + + r = mount_nofollow_verbose(LOG_ERR, p, resolved, NULL, MS_BIND, NULL); + if (r < 0) + return r; + + log_info("Merged extensions into '%s'.", resolved); + } + + return 1; +} + +static int merge(Hashmap *images) { + pid_t pid; + int r; + + r = safe_fork("(sd-sysext)", FORK_DEATHSIG|FORK_LOG|FORK_NEW_MOUNTNS, &pid); + if (r < 0) + return log_error_errno(r, "Failed to fork off child: %m"); + if (r == 0) { + /* Child with its own mount namespace */ + + r = merge_subprocess(images, "/run/systemd/sysext"); + if (r < 0) + _exit(EXIT_FAILURE); + + /* Our namespace ceases to exist here, also implicitly detaching all temporary mounts we + * created below /run. Nice! */ + + _exit(r > 0 ? EXIT_SUCCESS : 123); /* 123 means: didn't find any extensions */ + } + + r = wait_for_terminate_and_check("(sd-sysext)", pid, WAIT_LOG_ABNORMAL); + if (r < 0) + return r; + + return r != 123; /* exit code 123 means: didn't do anything */ +} + +static int help(void) { + _cleanup_free_ char *link = NULL; + int r; + + r = terminal_urlify_man("systemd-sysext", "8", &link); + if (r < 0) + return log_oom(); + + printf("%1$s [OPTIONS...] 
[DEVICE]\n" + "\n%5$sMerge extension images into /usr/ and /opt/ hierarchies.%6$s\n" + "\n%3$sCommands:%4$s\n" + " -h --help Show this help\n" + " --version Show package version\n" + " -m --merge Merge extensions into /usr/ and /opt/\n" + " -u --unmerge Unmerge extensions from /usr/ and /opt/\n" + " -R --refresh Unmerge/merge extensions again\n" + " -l --list List all OS images\n" + "\n%3$sOptions:%4$s\n" + " --no-pager Do not pipe output into a pager\n" + " --root=PATH Operate relative to root path\n" + " --json=pretty|short|off\n" + " Generate JSON output\n" + "\nSee the %2$s for details.\n" + , program_invocation_short_name + , link + , ansi_underline(), ansi_normal() + , ansi_highlight(), ansi_normal() + ); + + return 0; +} + +static int parse_argv(int argc, char *argv[]) { + + enum { + ARG_VERSION = 0x100, + ARG_NO_PAGER, + ARG_MERGE, + ARG_UNMERGE, + ARG_REFRESH, + ARG_LIST, + ARG_ROOT, + ARG_JSON, + }; + + static const struct option options[] = { + { "help", no_argument, NULL, 'h' }, + { "version", no_argument, NULL, ARG_VERSION }, + { "no-pager", no_argument, NULL, ARG_NO_PAGER }, + { "root", required_argument, NULL, ARG_ROOT }, + { "merge", no_argument, NULL, 'm' }, + { "unmerge", no_argument, NULL, 'u' }, + { "refresh", no_argument, NULL, 'R' }, + { "list", no_argument, NULL, 'l' }, + { "json", required_argument, NULL, ARG_JSON }, + {} + }; + + int c, r; + + assert(argc >= 0); + assert(argv); + + while ((c = getopt_long(argc, argv, "hmuRl", options, NULL)) >= 0) + + switch (c) { + + case 'h': + return help(); + + case ARG_VERSION: + return version(); + + case ARG_NO_PAGER: + arg_pager_flags |= PAGER_DISABLE; + break; + + case 'm': + arg_action = ACTION_MERGE; + break; + + case 'u': + arg_action = ACTION_UNMERGE; + break; + + case 'R': + arg_action = ACTION_REFRESH; + break; + + case 'l': + arg_action = ACTION_LIST; + break; + + case ARG_ROOT: + r = parse_path_argument_and_warn(optarg, false, &arg_root); + if (r < 0) + return r; + break; + + case ARG_JSON: 
+ r = json_parse_cmdline_parameter_and_warn(optarg, &arg_json_format_flags); + if (r <= 0) + return r; + + break; + + case '?': + return -EINVAL; + + default: + assert_not_reached("Unhandled option"); + } + + if (argc - optind > 0) + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), + "Unexpected argument."); + + return 1; +} + +static int parse_env(void) { + _cleanup_strv_free_ char **l = NULL; + const char *e; + char **p; + int r; + + e = secure_getenv("SYSTEMD_SYSEXT_HIERARCHIES"); + if (!e) + return 0; + + /* For debugging purposes it might make sense to do this for other hierarchies than /usr/ and + * /opt/, but let's make that a hacker/debugging feature, i.e. env var instead of cmdline + * switch. */ + + r = strv_split_full(&l, e, ":", EXTRACT_DONT_COALESCE_SEPARATORS); + if (r < 0) + return log_error_errno(r, "Failed to parse $SYSTEMD_SYSEXT_HIERARCHIES: %m"); + + STRV_FOREACH(p, l) { + if (!path_is_absolute(*p)) + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), + "Hierarchy path '%s' is not absolute, refusing.", *p); + + if (!path_is_normalized(*p)) + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), + "Hierarchy path '%s' is not normalized, refusing.", *p); + + if (path_equal(*p, "/")) + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), + "Hierarchy path '%s' is the root fs, refusing.", *p); + } + + if (strv_isempty(l)) + return log_error_errno(SYNTHETIC_ERRNO(EINVAL), + "No hierarchies specified, refusing."); + + strv_free_and_replace(arg_hierarchies, l); + return 0; +} + +static int run(int argc, char *argv[]) { + _cleanup_(hashmap_freep) Hashmap *images = NULL; + int r; + + log_show_color(true); + log_parse_environment(); + log_open(); + + r = parse_argv(argc, argv); + if (r <= 0) + return r; + + r = parse_env(); + if (r < 0) + return r; + + if (!arg_hierarchies) { + arg_hierarchies = strv_new("/usr", "/opt"); + if (!arg_hierarchies) + return log_oom(); + } + + /* Given that things deep down in the child process will fail, let's catch the no-privilege issue + 
* early on */ + if (!IN_SET(arg_action, ACTION_STATUS, ACTION_LIST) && !have_effective_cap(CAP_SYS_ADMIN)) + return log_error_errno(SYNTHETIC_ERRNO(EPERM), "Need to be privileged."); + + if (arg_action == ACTION_STATUS) + return status(); + + if (arg_action == ACTION_UNMERGE) + return unmerge(); + + images = hashmap_new(&image_hash_ops); + if (!images) + return log_oom(); + + r = image_discover(IMAGE_EXTENSION, arg_root, images); + if (r < 0) + return log_error_errno(r, "Failed to discover extension images: %m"); + + switch (arg_action) { + + case ACTION_LIST: { + _cleanup_(table_unrefp) Table *t = NULL; + Image *img; + + if ((arg_json_format_flags & JSON_FORMAT_OFF) && hashmap_isempty(images)) { + log_info("No OS extensions found."); + return 0; + } + + t = table_new("name", "type", "path", "time"); + if (!t) + return log_oom(); + + HASHMAP_FOREACH(img, images) { + r = table_add_many( + t, + TABLE_STRING, img->name, + TABLE_STRING, image_type_to_string(img->type), + TABLE_PATH, img->path, + TABLE_TIMESTAMP, img->mtime != 0 ? img->mtime : img->crtime); + if (r < 0) + return table_log_add_error(r); + } + + (void) table_set_sort(t, (size_t) 0, (size_t) -1); + + if (arg_json_format_flags & (JSON_FORMAT_OFF|JSON_FORMAT_PRETTY|JSON_FORMAT_PRETTY_AUTO)) + (void) pager_open(arg_pager_flags); + + r = table_print_json(t, stdout, arg_json_format_flags); + if (r < 0) + return table_log_print_error(r); + + r = 0; + break; + } + + case ACTION_MERGE: { + char **p; + + /* In merge mode fail if things are already merged. (In --refresh mode below we'll unmerge if + * we find things are already merged...) 
*/ + STRV_FOREACH(p, arg_hierarchies) { + _cleanup_free_ char *resolved = NULL; + + r = chase_symlinks(*p, arg_root, CHASE_PREFIX_ROOT, &resolved, NULL); + if (r == -ENOENT) { + log_debug_errno(r, "Hierarchy '%s%s' does not exist, ignoring.", strempty(arg_root), *p); + continue; + } + if (r < 0) + return log_error_errno(r, "Failed to resolve path to hierarchy '%s%s': %m", strempty(arg_root), *p); + + r = is_our_mount_point(resolved); + if (r < 0) + return r; + if (r > 0) + return log_error_errno(SYNTHETIC_ERRNO(EBUSY), + "Hierarchy '%s' is already merged.", *p); + } + + r = merge(images); + break; + } + + case ACTION_REFRESH: + r = merge(images); /* Returns > 0 if it did something, i.e. a new overlayfs is mounted + * now. When it does so it implicitly unmounts any overlayfs placed there + * before. Returns == 0 if it did nothing, i.e. no extension images + * found. In this case the old overlayfs remains in place if there was + * one. */ + if (r < 0) + return r; + if (r == 0) /* No images found? Then unmerge. The goal of --refresh is after all that, once + * it has been called, there's a guarantee that the merge status matches the + * installed extensions. */ + r = unmerge(); + + /* Net result here is that: + * + * 1. If an overlayfs was mounted before and no extensions exist anymore, we'll have unmerged + * things. + * + * 2. If an overlayfs was mounted before, and there are still extensions installed, we'll + * have unmerged and then merged things again. + * + * 3. If an overlayfs so far wasn't mounted, and there are extensions installed, we'll have + * it mounted now. + * + * 4. If there was no overlayfs mount so far, and no extensions installed, we implement a + * NOP. 
+ */ + break; + + default: + assert_not_reached("Unexpected action"); + } + + return r; +} + +DEFINE_MAIN_FUNCTION(run); diff --git a/test/test-functions b/test/test-functions index 7012e8a8f3d..3df6db7ac08 100644 --- a/test/test-functions +++ b/test/test-functions @@ -933,7 +933,7 @@ install_execs() { # some {rc,halt}.local scripts and programs are okay to not exist, the rest should # also, plymouth is pulled in by rescue.service, but even there the exit code # is ignored; as it's not present on some distros, don't fail if it doesn't exist - dinfo "Attempting to install $i" + dinfo "Attempting to install $i (based on unit file reference)" inst $i || [ "${i%.local}" != "$i" ] || [ "${i%systemd-update-done}" != "$i" ] || [ "${i##*/}" == "plymouth" ] done ) diff --git a/units/meson.build b/units/meson.build index 7b18f1bfea0..bdfb2e52ce2 100644 --- a/units/meson.build +++ b/units/meson.build @@ -211,6 +211,7 @@ in_units = [ ['systemd-oomd.service', 'ENABLE_OOMD'], ['systemd-portabled.service', 'ENABLE_PORTABLED', 'dbus-org.freedesktop.portable1.service'], + ['systemd-sysext.service', 'ENABLE_SYSEXT'], ['systemd-userdbd.service', 'ENABLE_USERDB'], ['systemd-homed.service', 'ENABLE_HOMED'], ['systemd-quotacheck.service', 'ENABLE_QUOTACHECK'], diff --git a/units/systemd-sysext.service.in b/units/systemd-sysext.service.in new file mode 100644 index 00000000000..aee30cc4b58 --- /dev/null +++ b/units/systemd-sysext.service.in @@ -0,0 +1,31 @@ +# SPDX-License-Identifier: LGPL-2.1-or-later +# +# This file is part of systemd. +# +# systemd is free software; you can redistribute it and/or modify it +# under the terms of the GNU Lesser General Public License as published by +# the Free Software Foundation; either version 2.1 of the License, or +# (at your option) any later version. 
+ +[Unit] +Description=Merge System Extension Images into /usr/ and /opt/ +Documentation=man:systemd-sysext.service(8) +DefaultDependencies=no +Conflicts=shutdown.target +After=local-fs.target +Before=sysinit.target shutdown.target systemd-tmpfiles.service +ConditionCapability=CAP_SYS_ADMIN +ConditionDirectoryNotEmpty=|/etc/extensions +ConditionDirectoryNotEmpty=|/run/extensions +ConditionDirectoryNotEmpty=|/var/lib/extensions +ConditionDirectoryNotEmpty=|/usr/local/lib/extensions +ConditionDirectoryNotEmpty=|/usr/lib/extensions + +[Service] +Type=oneshot +RemainAfterExit=yes +ExecStart=@rootlibexecdir@/systemd-sysext --merge +ExecStop=@rootlibexecdir@/systemd-sysext --unmerge + +[Install] +WantedBy=sysinit.target