mirror of
https://github.com/systemd/systemd.git
synced 2024-12-21 13:34:21 +03:00
docs/PORTABLE_SERVICES: format text
parent
6c46f0e23c
commit
ca219b008e
@ -19,32 +19,32 @@ two specific features of container management:
|
||||
The primary tool for interacting with Portable Services is `portablectl`,
|
||||
and they are managed by the `systemd-portabled` service.
|
||||
|
||||
Portable services don't bring anything inherently new to the table. All they do
|
||||
is put together known concepts to cover a specific set of use-cases in a
|
||||
Portable services don't bring anything inherently new to the table.
|
||||
All they do is put together known concepts to cover a specific set of use-cases in a
|
||||
slightly nicer way.
|
||||
|
||||
## So, what *is* a "Portable Service"?
|
||||
|
||||
A portable service is ultimately just an OS tree, either inside of a directory,
|
||||
or inside a raw disk image containing a Linux file system. This tree is called
|
||||
the "image". It can be "attached" or "detached" from the system. When
|
||||
"attached", specific systemd units from the image are made available on the
|
||||
host system, then behaving pretty much exactly like locally installed system
|
||||
services. When "detached", these units are removed again from the host, leaving
|
||||
or inside a raw disk image containing a Linux file system.
|
||||
This tree is called the "image". It can be "attached" or "detached" from the system.
|
||||
When "attached", specific systemd units from the image are made available on the
|
||||
host system, then behaving pretty much exactly like locally installed system services.
|
||||
When "detached", these units are removed again from the host, leaving
|
||||
no artifacts around (except maybe messages they might have logged).
|
||||
|
||||
The OS tree/image can be created with any tool of your choice. For example, you
|
||||
can use `dnf --installroot=` if you like, or `debootstrap`, the image format is
|
||||
entirely generic, and doesn't have to carry any specific metadata beyond what
|
||||
distribution images carry anyway. Or to say this differently: the image format
|
||||
doesn't define any new metadata as unit files and OS tree directories or disk
|
||||
images are already sufficient, and pretty universally available these days. One
|
||||
particularly nice tool for creating suitable images is
|
||||
[mkosi](https://github.com/systemd/mkosi), but many other existing tools will
|
||||
do too.
|
||||
The OS tree/image can be created with any tool of your choice.
|
||||
For example, you can use `dnf --installroot=` if you like, or `debootstrap`, the image format is
|
||||
entirely generic, and doesn't have to carry any specific metadata beyond what distribution images carry anyway.
|
||||
Or to say this differently:
|
||||
The image format doesn't define any new metadata as unit files and OS tree directories or disk
|
||||
images are already sufficient, and pretty universally available these days.
|
||||
One particularly nice tool for creating suitable images is
|
||||
[mkosi](https://github.com/systemd/mkosi),
|
||||
but many other existing tools will do too.
|
||||
|
||||
Portable services may also be constructed from layers, similarly to container
|
||||
environments. See [Extension Images](#extension-images) below.
|
||||
Portable services may also be constructed from layers, similarly to container environments.
|
||||
See [Extension Images](#extension-images) below.
|
||||
|
||||
If you so will, "Portable Services" are a nicer way to manage chroot()
|
||||
environments, with better security, tooling and behavior.
|
||||
@ -55,26 +55,21 @@ environments, with better security, tooling and behavior.
|
||||
systemd-nspawn/LXC-type OS containers, for Docker/rkt-like micro service
|
||||
containers, and even certain 'lightweight' VM runtimes.
|
||||
|
||||
"Portable services" do not provide a fully isolated environment to the payload,
|
||||
like containers mostly intend to. Instead, they are more like regular system
|
||||
services, can be controlled with the same tools, are exposed the same way in
|
||||
all infrastructure, and so on. The main difference is that they use a different
|
||||
root directory than the rest of the system. Hence, the intent is not to run
|
||||
code in a different, isolated environment from the host — like most containers
|
||||
would — but to run it in the same environment, but with stricter access
|
||||
controls on what the service can see and do.
|
||||
"Portable services" do not provide a fully isolated environment to the payload, like containers mostly intend to.
|
||||
Instead, they are more like regular system services, can be controlled with the same tools, are exposed the same way in all infrastructure, and so on.
|
||||
The main difference is that they use a different root directory than the rest of the system.
|
||||
Hence, the intent is not to run code in a different, isolated environment from the host — like most containers would — but to run it in the same environment, but with stricter access controls on what the service can see and do.
|
||||
|
||||
One point of differentiation: since programs running as "portable services" are
|
||||
pretty much regular system services, they won't run as PID 1 (like they would
|
||||
under Docker), but as normal processes. A corollary of that is that they aren't
|
||||
supposed to manage anything in their own environment (such as the network) as
|
||||
the execution environment is mostly shared with the rest of the system.
|
||||
under Docker), but as normal processes.
|
||||
|
||||
The primary focus of "portable services" is to extend the host system
|
||||
with encapsulated extensions that provide almost full integration with the rest
|
||||
of the system, though possibly restricted by security knobs. This focus
|
||||
includes system extensions otherwise sometimes called "super-privileged
|
||||
containers".
|
||||
A corollary of that is that they aren't supposed to manage anything in their own environment (such as the network) as the execution environment is mostly shared with the rest of the system.
|
||||
|
||||
The primary use-case of "portable services" is to extend the host system
|
||||
with encapsulated extensions that provide almost full integration with the rest
|
||||
of the system, though possibly restricted by security knobs.
|
||||
This focus includes system extensions otherwise sometimes called "super-privileged containers".
|
||||
|
||||
Note that portable services are only available for system services, not for
|
||||
user services (i.e. the functionality cannot be used for the stuff
|
||||
@ -103,15 +98,15 @@ This command does the following:
|
||||
`foobar-*.{service|socket|target|timer|path}`,
|
||||
`foobar@.{service|socket|target|timer|path}` as well as
|
||||
`foobar.*.{service|socket|target|timer|path}` and
|
||||
`foobar.{service|socket|target|timer|path}` are copied out. These unit files
|
||||
are placed in `/etc/systemd/system.attached/` (which is part of the normal
|
||||
unit file search path of PID 1, and thus loaded exactly like regular unit
|
||||
files). Within the images the unit files are looked for at the usual
|
||||
locations, i.e. in `/usr/lib/systemd/system/` and `/etc/systemd/system/` and
|
||||
so on, relative to the image's root.
|
||||
`foobar.{service|socket|target|timer|path}`
|
||||
are copied out.
|
||||
These unit files are placed in `/etc/systemd/system.attached/`
|
||||
(which is part of the normal unit file search path of PID 1, and thus loaded exactly like regular unit
|
||||
files).
|
||||
Within the images the unit files are looked for at the usual locations, i.e. in `/usr/lib/systemd/system/` and `/etc/systemd/system/` and so on, relative to the image's root.
|
||||
|
||||
3. For each such unit file a drop-in file is created. Let's say
|
||||
`foobar-waldo.service` was one of the unit files copied to
|
||||
3. For each such unit file a drop-in file is created.
|
||||
Let's say `foobar-waldo.service` was one of the unit files copied to
|
||||
`/etc/systemd/system.attached/`, then a drop-in file
|
||||
`/etc/systemd/system.attached/foobar-waldo.service.d/20-portable.conf` is
|
||||
created, containing a few lines of additional configuration:
|
||||
@ -123,31 +118,30 @@ This command does the following:
|
||||
LogExtraFields=PORTABLE=foobar
|
||||
```
|
||||
|
||||
4. For each such unit a "profile" drop-in is linked in. This "profile" drop-in
|
||||
generally contains security options that lock down the service. By default
|
||||
the `default` profile is used, which provides a medium level of security.
|
||||
4. For each such unit a "profile" drop-in is linked in.
|
||||
This "profile" drop-in generally contains security options that lock down the service.
|
||||
By default the `default` profile is used, which provides a medium level of security.
|
||||
There's also `trusted`, which runs the service with no restrictions, i.e. in
|
||||
the host file system root and with full privileges. The `strict` profile
|
||||
comes with the toughest security restrictions. Finally, `nonetwork` is like
|
||||
`default` but without network access. Users may define their own profiles
|
||||
too (or modify the existing ones).
|
||||
the host file system root and with full privileges.
|
||||
The `strict` profile comes with the toughest security restrictions.
|
||||
Finally, `nonetwork` is like `default` but without network access.
|
||||
Users may define their own profiles too (or modify the existing ones).
|
||||
|
||||
And that's already it.
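
The unit selection in step 2 and the drop-in path in step 3 can be sketched in Python as follows. This is only an illustration of the four name patterns and the drop-in location quoted in this document. The function names `unit_matches_prefix` and `dropin_path` are hypothetical, and the actual matching logic in `systemd-portabled` may differ in details.

```python
from fnmatch import fnmatchcase

# Unit types listed in the patterns above.
UNIT_SUFFIXES = ("service", "socket", "target", "timer", "path")


def unit_matches_prefix(unit_name: str, prefix: str) -> bool:
    """Return True if unit_name fits one of the documented patterns for prefix.

    Assumes the prefix contains no glob metacharacters.
    """
    for suffix in UNIT_SUFFIXES:
        patterns = (
            f"{prefix}-*.{suffix}",   # e.g. foobar-waldo.service
            f"{prefix}@.{suffix}",    # template unit, e.g. foobar@.service
            f"{prefix}.*.{suffix}",   # e.g. foobar.waldo.service
            f"{prefix}.{suffix}",     # e.g. foobar.service
        )
        if any(fnmatchcase(unit_name, pattern) for pattern in patterns):
            return True
    return False


def dropin_path(unit_name: str) -> str:
    """Build the drop-in path described in step 3 for a copied unit file."""
    return f"/etc/systemd/system.attached/{unit_name}.d/20-portable.conf"


if __name__ == "__main__":
    for name in ("foobar-waldo.service", "foobar.socket", "foobarx.service"):
        print(name, unit_matches_prefix(name, "foobar"))
    print(dropin_path("foobar-waldo.service"))
```

Note that a name such as `foobarx.service` is not selected, since the prefix must be followed by `-`, `@`, `.` or the unit suffix itself.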
|
||||
|
||||
Note that the images need to stay around (and in the same location) as long as the
|
||||
portable service is attached. If an image is moved, the `RootImage=` line
|
||||
written to the unit drop-in would point to a non-existent path, and break
|
||||
access to the image.
|
||||
portable service is attached.
|
||||
If an image is moved, the `RootImage=` line written to the unit drop-in would point to a non-existent path, and break access to the image.
|
||||
|
||||
The `portablectl detach` command executes the reverse operation: it looks for
|
||||
the drop-ins and the unit files associated with the image, and removes them.
|
||||
The `portablectl detach` command executes the reverse operation:
|
||||
it looks for the drop-ins and the unit files associated with the image, and removes them.
|
||||
|
||||
Note that `portablectl attach` won't enable or start any of the units it copies
|
||||
out by default, but the `--enable` and `--now` parameters are available as shortcuts.
|
||||
The same is true for the opposite `detach` operation.
|
||||
|
||||
The `portablectl reattach` command combines a `detach` with an `attach`. It is
|
||||
useful in case an image gets upgraded, as it allows performing a `restart`
|
||||
The `portablectl reattach` command combines a `detach` with an `attach`.
|
||||
It is useful in case an image gets upgraded, as it allows performing a `restart`
|
||||
operation on the units instead of `stop` plus `start`, thus providing lower
|
||||
downtime and avoiding losing runtime state associated with the unit such as the
|
||||
file descriptor store.
|
||||
@ -155,13 +149,12 @@ file descriptor store.
|
||||
## Requirements on Images
|
||||
|
||||
Note that portable services don't introduce any new image format, but most OS
|
||||
images should just work the way they are. Specifically, the following
|
||||
requirements are made for an image that can be attached/detached with
|
||||
`portablectl`.
|
||||
images should just work the way they are.
|
||||
Specifically, the following requirements are made for an image that can be attached/detached with `portablectl`.
|
||||
|
||||
1. It must contain an executable that shall be invoked, along with all its
|
||||
dependencies. Any binary code needs to be compiled for an architecture
|
||||
compatible with the host.
|
||||
dependencies.
|
||||
Any binary code needs to be compiled for an architecture compatible with the host.
|
||||
|
||||
2. The image must either be a plain sub-directory (or btrfs subvolume)
|
||||
containing the binaries and their dependencies in a classic Linux OS tree, or
|
||||
@ -172,10 +165,9 @@ requirements are made for an image that can be attached/detached with
|
||||
[Discoverable Partitions Specification](https://uapi-group.org/specifications/specs/discoverable_partitions_specification).
|
||||
|
||||
3. The image must at least contain one matching unit file, with the right name
|
||||
prefix and suffix (see above). The unit file is searched in the usual paths,
|
||||
i.e. primarily /etc/systemd/system/ and /usr/lib/systemd/system/ within the
|
||||
image. (The implementation will check a couple of other paths too, but it's
|
||||
recommended to use these two paths.)
|
||||
prefix and suffix (see above).
|
||||
The unit file is searched in the usual paths, i.e. primarily /etc/systemd/system/ and /usr/lib/systemd/system/ within the image.
|
||||
(The implementation will check a couple of other paths too, but it's recommended to use these two paths.)
|
||||
|
||||
4. The image must contain an os-release file, either in `/etc/os-release` or
|
||||
`/usr/lib/os-release`. The file should follow the standard format.
|
||||
@ -187,17 +179,17 @@ requirements are made for an image that can be attached/detached with
|
||||
`/tmp/`, `/var/tmp/` that can be mounted over with the corresponding version
|
||||
from the host.
|
||||
|
||||
7. The OS might require other files or directories to be in place. For example,
|
||||
if the image is built based on glibc, the dynamic loader needs to be
|
||||
7. The OS might require other files or directories to be in place.
|
||||
For example, if the image is built based on glibc, the dynamic loader needs to be
|
||||
available in `/lib/ld-linux.so.2` or `/lib64/ld-linux-x86-64.so.2` (or
|
||||
similar, depending on architecture), and if the distribution implements a
|
||||
merged `/usr/` tree, this means `/lib` and/or `/lib64` need to be symlinks
|
||||
to their respective counterparts below `/usr/`. For details see your
|
||||
distribution's documentation.
|
||||
to their respective counterparts below `/usr/`.
|
||||
For details see your distribution's documentation.
|
||||
|
||||
Note that images created by tools such as `debootstrap`, `dnf --installroot=`
|
||||
or `mkosi` generally satisfy all of the above. If you wonder what the most
|
||||
minimal image would be that complies with the requirements above, it could
|
||||
or `mkosi` generally satisfy all of the above.
|
||||
If you wonder what the most minimal image would be that complies with the requirements above, it could
|
||||
consist of this:
|
||||
|
||||
```
|
||||
@ -216,10 +208,9 @@ consist of this:
|
||||
|
||||
And that's it.
|
||||
|
||||
Note that qualifying images do not have to contain an init system of their
|
||||
own. If they do, it's fine, it will be ignored by the portable service logic,
|
||||
but they generally don't have to, and it might make sense to avoid any, to keep
|
||||
images minimal.
|
||||
Note that qualifying images do not have to contain an init system of their own.
|
||||
If they do, it's fine, it will be ignored by the portable service logic,
|
||||
but they generally don't have to, and it might make sense to avoid any, to keep images minimal.
|
||||
|
||||
If the image is writable, and some of the files or directories that are
|
||||
overmounted from the host do not exist yet they will be automatically created.
|
||||
@ -227,8 +218,8 @@ On read-only, immutable images (e.g. `erofs` or `squashfs` images) all files
|
||||
and directories to over-mount must exist already.
|
||||
|
||||
Note that as no new image format or metadata is defined, it's very
|
||||
straightforward to define images that can be made use of in a number of
|
||||
different ways. For example, by using `mkosi -b` you can trivially build a
|
||||
straightforward to define images that can be made use of in a number of different ways.
|
||||
For example, by using `mkosi -b` you can trivially build a
|
||||
single, unified image that:
|
||||
|
||||
1. Can be attached as portable service, to run any container services natively
|
||||
@ -242,35 +233,33 @@ single, unified image that:
|
||||
|
||||
4. Can be booted directly on bare-metal systems.
|
||||
|
||||
Of course, to facilitate 2, 3 and 4 you need to include an init system in the
|
||||
image. To facilitate 3 and 4 you also need to include a boot loader in the
|
||||
image. As mentioned, `mkosi -b` takes care of all of that for you, but any
|
||||
other image generator should work too.
|
||||
Of course, to facilitate 2, 3 and 4 you need to include an init system in the image.
|
||||
To facilitate 3 and 4 you also need to include a boot loader in the
|
||||
image.
|
||||
As mentioned, `mkosi -b` takes care of all of that for you, but any other image generator should work too.
|
||||
|
||||
The
|
||||
[os-release(5)](https://www.freedesktop.org/software/systemd/man/os-release.html)
|
||||
file may optionally be extended with a `PORTABLE_PREFIXES=` field listing all
|
||||
supported portable service prefixes for the image (see above). This is useful
|
||||
for informational purposes (as it allows recognizing portable service images
|
||||
supported portable service prefixes for the image (see above).
|
||||
This is useful for informational purposes (as it allows recognizing portable service images
|
||||
from their contents as such), but is also useful to protect the image from
|
||||
being used under a wrong name and prefix. This is particularly relevant if the
|
||||
images are cryptographically authenticated (via Verity or a similar mechanism)
|
||||
as this way the (not necessarily authenticated) image file name can be
|
||||
validated against the (authenticated) image contents. If the field is not
|
||||
specified the image will work fine, but is not necessarily recognizable as
|
||||
portable service image, and any set of units included in the image may be
|
||||
attached, there are no restrictions enforced.
|
||||
being used under a wrong name and prefix.
|
||||
This is particularly relevant if the images are cryptographically authenticated (via Verity or a similar mechanism) as this way the (not necessarily authenticated) image file name can be
|
||||
validated against the (authenticated) image contents.
|
||||
If the field is not specified the image will work fine, but is not necessarily recognizable as
|
||||
portable service image, and any set of units included in the image may be attached, there are no restrictions enforced.
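
A minimal Python sketch of the check described above, under stated assumptions: `PORTABLE_PREFIXES=` is read as a space-separated list (as in os-release(5)), and a prefix is accepted only if it appears in that list verbatim. The helper name `prefix_allowed` is hypothetical, and the exact matching rule used by `systemd-portabled` may be more lenient or stricter than this.

```python
import shlex
from typing import List, Optional


def read_portable_prefixes(os_release_text: str) -> Optional[List[str]]:
    """Return the PORTABLE_PREFIXES= entries, or None if the field is absent."""
    for line in os_release_text.splitlines():
        line = line.strip()
        if line.startswith("PORTABLE_PREFIXES="):
            raw = line.partition("=")[2]
            # os-release values may be shell-quoted; unquote, then split on spaces.
            return " ".join(shlex.split(raw)).split()
    return None


def prefix_allowed(prefix: str, os_release_text: str) -> bool:
    """Assumed rule: without the field nothing is enforced; otherwise the
    prefix must be listed exactly."""
    prefixes = read_portable_prefixes(os_release_text)
    if prefixes is None:
        return True  # field not specified: no restrictions enforced
    return prefix in prefixes


if __name__ == "__main__":
    sample = 'ID=debian\nPORTABLE_PREFIXES="foobar waldo"\n'
    print(prefix_allowed("foobar", sample))
    print(prefix_allowed("evil", sample))
```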
|
||||
|
||||
## Extension Images
|
||||
|
||||
Portable services can be delivered as one or multiple images that extend the base
|
||||
image, and are combined with OverlayFS at runtime, when they are attached. This
|
||||
enables a workflow that splits the base 'runtime' from the daemon, so that multiple
|
||||
image, and are combined with OverlayFS at runtime, when they are attached.
|
||||
This enables a workflow that splits the base 'runtime' from the daemon, so that multiple
|
||||
portable services can share the same 'runtime' image (libraries, tools) without
|
||||
having to include everything each time, with the layering happening only at runtime.
|
||||
The `--extension` parameter of `portablectl` can be used to specify as many upper
|
||||
layers as desired. On top of the requirements listed in the previous section, the
|
||||
following must also be observed:
|
||||
layers as desired.
|
||||
On top of the requirements listed in the previous section, the following must also be observed:
|
||||
|
||||
1. The base/OS image must contain an `os-release` file, either in `/etc/os-release`
|
||||
or `/usr/lib/os-release`, in the standard format.
|
||||
@ -296,25 +285,25 @@ following must be also be observed:
|
||||
|
||||
## Execution Environment
|
||||
|
||||
Note that the code in portable service images is run exactly like regular
|
||||
services. Hence there's no new execution environment to consider. And, unlike
|
||||
Docker would do it, as these are regular system services they aren't run as PID
|
||||
Note that the code in portable service images is run exactly like regular services.
|
||||
Hence there's no new execution environment to consider.
|
||||
And, unlike Docker would do it, as these are regular system services they aren't run as PID
|
||||
1 either, but with regular PID values.
|
||||
|
||||
## Access to host resources
|
||||
|
||||
If services shipped with this mechanism shall be able to access host resources
|
||||
(such as files or AF_UNIX sockets for IPC), use the normal `BindPaths=` and
|
||||
`BindReadOnlyPaths=` settings in unit files to mount them in. In fact, the
|
||||
`default` profile mentioned above makes use of this to ensure
|
||||
`BindReadOnlyPaths=` settings in unit files to mount them in.
|
||||
In fact, the `default` profile mentioned above makes use of this to ensure
|
||||
`/etc/resolv.conf`, the D-Bus system bus socket or write access to the logging
|
||||
subsystem are available to the service.
|
||||
|
||||
## Instantiation
|
||||
|
||||
Sometimes it makes sense to instantiate the same set of services multiple
|
||||
times. The portable service concept does not introduce a new logic for this. It
|
||||
is recommended to use the regular systemd unit templating for this, i.e. to
|
||||
Sometimes it makes sense to instantiate the same set of services multiple times.
|
||||
The portable service concept does not introduce a new logic for this.
|
||||
It is recommended to use the regular systemd unit templating for this, i.e. to
|
||||
include template units such as `foobar@.service`, so that instantiation is as
|
||||
simple as:
|
||||
|
||||
@ -330,11 +319,10 @@ units shipped with the OS itself as for attached portable services.
|
||||
|
||||
## Immutable images with local data
|
||||
|
||||
It's a good idea to keep portable service images read-only during normal
|
||||
operation. In fact, all but the `trusted` profile will default to this kind of
|
||||
behaviour, by setting the `ProtectSystem=strict` option. In this case writable
|
||||
service data may be placed on the host file system. Use `StateDirectory=` in
|
||||
the unit files to enable such behaviour and add a local data directory to the
|
||||
It's a good idea to keep portable service images read-only during normal operation.
|
||||
In fact, all but the `trusted` profile will default to this kind of behaviour, by setting the `ProtectSystem=strict` option.
|
||||
In this case writable service data may be placed on the host file system.
|
||||
Use `StateDirectory=` in the unit files to enable such behaviour and add a local data directory to the
|
||||
services copied onto the host.
|
||||
|
||||
## Logging
|
||||
@ -342,24 +330,19 @@ services copied onto the host.
|
||||
Several fields are automatically added to log messages generated by a portable
|
||||
service (or about a portable service, e.g.: start/stop logs from systemd).
|
||||
The `PORTABLE=` field will refer to the name of the portable image where the unit
|
||||
was loaded from. In case extensions are used, additionally there will be a
|
||||
`PORTABLE_ROOT=` field, referring to the name of image used as the base layer
|
||||
(i.e.: `RootImage=` or `RootDirectory=`), and one `PORTABLE_EXTENSION=` field per
|
||||
was loaded from. In case extensions are used, additionally there will be a `PORTABLE_ROOT=` field, referring to the name of image used as the base layer (i.e.: `RootImage=` or `RootDirectory=`), and one `PORTABLE_EXTENSION=` field per
|
||||
each extension image used.
|
||||
|
||||
The `os-release` file from the portable image will be parsed and added as structured
|
||||
metadata to the journal log entries. The parsed fields will be the first ID field which
|
||||
is set from the set of `IMAGE_ID` and `ID` in this order of preference, and the first
|
||||
version field which is set from a set of `IMAGE_VERSION`, `VERSION_ID`, and `BUILD_ID`
|
||||
in this order of preference. The ID and version, if any, are concatenated with an
|
||||
underscore (`_`) as separator. If only either one is found, it will be used by itself.
|
||||
The `os-release` file from the portable image will be parsed and added as structured metadata to the journal log entries.
|
||||
The parsed fields will be the first ID field which is set from the set of `IMAGE_ID` and `ID` in this order of preference, and the first version field which is set from a set of `IMAGE_VERSION`, `VERSION_ID`, and `BUILD_ID` in this order of preference.
|
||||
The ID and version, if any, are concatenated with an underscore (`_`) as separator.
|
||||
If only either one is found, it will be used by itself.
|
||||
The field will be named `PORTABLE_NAME_AND_VERSION=`.
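
The selection rule above can be sketched in Python as follows. This is an illustration only, not the systemd implementation: the function names `parse_os_release` and `portable_name_and_version` are hypothetical, and the os-release parsing is simplified to plain `KEY=value` lines with optional shell-style quoting.

```python
import shlex
from typing import Dict, Optional


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse os-release content into a dict (simplified parser)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        fields[key] = " ".join(shlex.split(raw))  # strip shell-style quotes
    return fields


def first_set(fields: Dict[str, str], keys) -> Optional[str]:
    """Return the first non-empty value among keys, in order of preference."""
    return next((fields[k] for k in keys if fields.get(k)), None)


def portable_name_and_version(fields: Dict[str, str]) -> Optional[str]:
    """Combine ID and version as described for PORTABLE_NAME_AND_VERSION=."""
    image_id = first_set(fields, ("IMAGE_ID", "ID"))
    version = first_set(fields, ("IMAGE_VERSION", "VERSION_ID", "BUILD_ID"))
    parts = [p for p in (image_id, version) if p]
    # Joined with an underscore; a single value is used by itself.
    return "_".join(parts) if parts else None


if __name__ == "__main__":
    sample = 'ID=debian\nVERSION_ID="12"\n'
    print(portable_name_and_version(parse_os_release(sample)))
```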
|
||||
|
||||
In case extensions are used, the same fields in the same order, but prefixed by
|
||||
`SYSEXT_`/`CONFEXT_`, are parsed from each `extension-release` file, and are appended
|
||||
to the journal as log entries, using `PORTABLE_EXTENSION_NAME_AND_VERSION=` as the
|
||||
field name. The base layer's field will be named `PORTABLE_ROOT_NAME_AND_VERSION=`
|
||||
instead of `PORTABLE_NAME_AND_VERSION=` in this case.
|
||||
to the journal as log entries, using `PORTABLE_EXTENSION_NAME_AND_VERSION=` as the field name.
|
||||
The base layer's field will be named `PORTABLE_ROOT_NAME_AND_VERSION=` instead of `PORTABLE_NAME_AND_VERSION=` in this case.
|
||||
|
||||
For example, a portable service `app0` using two extensions `app0.raw` and
|
||||
`app1.raw` (with `SYSEXT_ID=app`, and `SYSEXT_VERSION_ID=` `0` and `1` in their
|
||||
|