/* SPDX-License-Identifier: LGPL-2.1-or-later */
#pragma once

#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

#include "cgroup-util.h"
#include "volatile-util.h"

typedef enum MountSettingsMask {
        MOUNT_FATAL              = 1 << 0,  /* if set, a mount error is considered fatal */
        MOUNT_USE_USERNS         = 1 << 1,  /* if set, mounts are patched considering uid/gid shifts in a user namespace */
        MOUNT_IN_USERNS          = 1 << 2,  /* if set, the mount is executed in the inner child, otherwise in the outer child */
        MOUNT_APPLY_APIVFS_RO    = 1 << 3,  /* if set, /proc/sys and /sys will be mounted read-only, otherwise read-write */
        MOUNT_APPLY_APIVFS_NETNS = 1 << 4,  /* if set, /proc/sys/net will be mounted read-write.
                                               Works only if MOUNT_APPLY_APIVFS_RO is also set. */
        /* nspawn: add support for executing OCI runtime bundles with nspawn (2018-04-25)
         *
         * This is a pretty large patch, and adds support for OCI runtime bundles to nspawn. A new switch
         * --oci-bundle= is added that takes a path to an OCI bundle. The JSON file included therein is read
         * similarly to a .nspawn settings file, however with a different feature set.
         *
         * Implementation-wise this mostly extends the pre-existing Settings object to carry additional
         * properties for OCI. However, OCI supports some concepts .nspawn files did not support yet, which
         * this patch also adds:
         *
         *  1. Support for "masking" files and directories. This functionality is now also available via the
         *     new --inaccessible= cmdline switch, and Inaccessible= in .nspawn files.
         *
         *  2. Support for mounting arbitrary file systems. (Not exposed through the nspawn cmdline nor
         *     .nspawn files, because probably not a good idea.)
         *
         *  3. Ability to configure the console settings for a container. This functionality is now also
         *     available on the nspawn cmdline in the new --console= switch (not added to .nspawn for now, as
         *     it is something specific to the invocation really, not a property of the container).
         *
         *  4. Console width/height configuration. Not exposed through .nspawn/cmdline, but this may be
         *     controlled through $COLUMNS and $LINES like in most other UNIX tools.
         *
         *  5. UID/GID configuration by raw numbers. (Not exposed in .nspawn and on the cmdline, since
         *     containers likely have different user tables, and the existing --user= switch appears to be
         *     the better option.)
         *
         *  6. OCI hook commands (not exposed in .nspawn/cmdline, as very specific to OCI).
         *
         *  7. Creation of additional device nodes in /dev. Most likely not a good idea, hence not exposed in
         *     .nspawn/cmdline. There's already --bind= to achieve the same, which is the better alternative.
         *
         *  8. Explicit syscall filters. This is not a good idea, due to the skewed arch support, hence not
         *     exposed through .nspawn/cmdline.
         *
         *  9. Configuration of some sysctls on a whitelist. Questionable, not supported in .nspawn/cmdline
         *     for now.
         *
         * 10. Configuration of all 5 types of capabilities. Not a useful concept, since the kernel will
         *     reduce the caps on execve() anyway, hence not exposed through .nspawn/cmdline.
         *
         * Note that this only implements the OCI runtime logic itself. It does not provide a runc-compatible
         * command line tool. This is left for a later PR. Only with that in place can tools such as
         * "buildah" use the OCI support in nspawn as a drop-in replacement.
         *
         * Currently still missing is OCI hook support, but it's already parsed and everything, and should be
         * easy to add. Other than that, OCI is implemented pretty comprehensively.
         *
         * There's a list of incompatibilities in the nspawn-oci.c file. In a later PR I'd like to convert
         * this into proper markdown and add it to the documentation directory. */
        MOUNT_APPLY_TMPFS_TMP    = 1 << 5,  /* if set, /tmp will be mounted as tmpfs */
        MOUNT_ROOT_ONLY          = 1 << 6,  /* if set, only root mounts are mounted */
        MOUNT_NON_ROOT_ONLY      = 1 << 7,  /* if set, only non-root mounts are mounted */
        MOUNT_MKDIR              = 1 << 8,  /* if set, make directory to mount over first */
        MOUNT_TOUCH              = 1 << 9,  /* if set, touch file to mount over first */
        MOUNT_PREFIX_ROOT        = 1 << 10, /* if set, prefix the source path with the container's root directory */
        MOUNT_FOLLOW_SYMLINKS    = 1 << 11, /* if set, we'll follow symlinks for the mount target */
} MountSettingsMask;
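
/* Illustrative sketch only, not part of this header. The values above are single bits, so they are
meant to be OR-ed into one mask and tested with a bitwise AND. Every name prefixed with sketch_ or
SKETCH_ is hypothetical and only mirrors a few values of the enum so that the example compiles on its
own. The second helper encodes the documented rule that MOUNT_APPLY_APIVFS_NETNS only has an effect
together with MOUNT_APPLY_APIVFS_RO.

```c
#include <assert.h>
#include <stdbool.h>

// Local stand-ins for three of the real flags (same bit positions as in the enum).
typedef enum SketchMountMask {
        SKETCH_MOUNT_FATAL              = 1 << 0,
        SKETCH_MOUNT_APPLY_APIVFS_RO    = 1 << 3,
        SKETCH_MOUNT_APPLY_APIVFS_NETNS = 1 << 4,
} SketchMountMask;

// Hypothetical helper: true only if every bit of "wanted" is set in "mask".
static bool sketch_mask_has_all(unsigned mask, unsigned wanted) {
        return (mask & wanted) == wanted;
}

// Hypothetical helper: the NETNS flag is only honoured when the RO flag is set as well.
static bool sketch_netns_rw_effective(unsigned mask) {
        return sketch_mask_has_all(mask, SKETCH_MOUNT_APPLY_APIVFS_RO | SKETCH_MOUNT_APPLY_APIVFS_NETNS);
}
```

A caller would typically build a mask such as SKETCH_MOUNT_FATAL | SKETCH_MOUNT_APPLY_APIVFS_RO once
and hand it down unchanged to the functions that take a mask argument. */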

typedef enum CustomMountType {
        CUSTOM_MOUNT_BIND,
        CUSTOM_MOUNT_TMPFS,
        CUSTOM_MOUNT_OVERLAY,
        CUSTOM_MOUNT_INACCESSIBLE,
        CUSTOM_MOUNT_ARBITRARY,
        _CUSTOM_MOUNT_TYPE_MAX,
        _CUSTOM_MOUNT_TYPE_INVALID = -EINVAL,
} CustomMountType;
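
/* Illustrative sketch only, not part of this header. Giving the invalid marker the value -EINVAL lets
a lookup function return either a valid enum value (zero or positive) or a negative errno-style code
through the same return type. The enum, the name table and the function below are hypothetical
stand-ins, and the strings are made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

// Reduced stand-in for the enum above: valid values first, then the count, then the negative marker.
typedef enum SketchMountType {
        SKETCH_MOUNT_BIND,
        SKETCH_MOUNT_TMPFS,
        SKETCH_MOUNT_OVERLAY,
        SKETCH_MOUNT_TYPE_MAX,
        SKETCH_MOUNT_TYPE_INVALID = -EINVAL,
} SketchMountType;

// Hypothetical lookup: returns the matching type, or the negative marker if nothing matches.
static SketchMountType sketch_mount_type_from_string(const char *s) {
        static const char *const names[SKETCH_MOUNT_TYPE_MAX] = {
                [SKETCH_MOUNT_BIND]    = "bind",
                [SKETCH_MOUNT_TMPFS]   = "tmpfs",
                [SKETCH_MOUNT_OVERLAY] = "overlay",
        };

        if (!s)
                return SKETCH_MOUNT_TYPE_INVALID;

        for (int i = 0; i < SKETCH_MOUNT_TYPE_MAX; i++)
                if (strcmp(s, names[i]) == 0)
                        return (SketchMountType) i;

        return SKETCH_MOUNT_TYPE_INVALID;
}
```

With this layout a caller can test the result with a plain "< 0" check and propagate it as an error
code without translating it first. */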

typedef struct CustomMount {
        CustomMountType type;
        bool read_only;
        char *source; /* for overlayfs this is the upper directory */
        char *destination;
        char *options;
        char *work_dir;
        char **lower;
        char *rm_rf_tmpdir;
        char *type_argument; /* only for CUSTOM_MOUNT_ARBITRARY */
        bool graceful;
        bool in_userns;
} CustomMount;

CustomMount* custom_mount_add(CustomMount **l, size_t *n, CustomMountType t);
void custom_mount_free_all(CustomMount *l, size_t n);
int custom_mount_prepare_all(const char *dest, CustomMount *l, size_t n);

int bind_mount_parse(CustomMount **l, size_t *n, const char *s, bool read_only);
int tmpfs_mount_parse(CustomMount **l, size_t *n, const char *s);
int overlay_mount_parse(CustomMount **l, size_t *n, const char *s, bool read_only);
int inaccessible_mount_parse(CustomMount **l, size_t *n, const char *s);

/* nspawn: Simplify tmpfs_patch_options() usage, and trickle that up (2017-06-13)
 *
 * One of the things that tmpfs_patch_options() does is take an (optional) UID and insert
 * "uid=${UID},gid=${UID}" into the options string. So we need a uid_t argument, and a way of telling
 * whether we should use it. Fortunately, that is built into the uid_t type by having UID_INVALID as a
 * possible value.
 *
 * So this is really a feature that requires one argument. Yet, it is somehow taking 4! That is absurd.
 * Simplify it to only take one argument, and have that trickle all the way up to mount_all()'s usage.
 *
 * Now, in many of the uses, the argument becomes
 *
 *         uid_shift == 0 ? UID_INVALID : uid_shift
 *
 * because it used to treat uid_shift=0 as invalid unless the patch_ids flag was also set. This keeps
 * the behavior the same. Note that in all cases where it is invoked, if !use_userns (sometimes called
 * !userns), then uid_shift is 0; we don't have to add any checks for that.
 *
 * That said, I'm pretty sure that "uid=0" and not setting "uid=" are the same, but Christian Brauner
 * seemed to not think so when implementing the cgns support.
 * https://github.com/systemd/systemd/pull/3589 */
int mount_all(const char *dest, MountSettingsMask mount_settings, uid_t uid_shift, const char *selinux_apifs_context);
int mount_sysfs(const char *dest, MountSettingsMask mount_settings);

int mount_custom(const char *dest, CustomMount *mounts, size_t n, uid_t uid_shift, uid_t uid_range, const char *selinux_apifs_context, MountSettingsMask mount_settings);
bool has_custom_root_mount(const CustomMount *mounts, size_t n);

int setup_volatile_mode(const char *directory, VolatileMode mode, uid_t uid_shift, const char *selinux_apifs_context);

int pivot_root_parse(char **pivot_root_new, char **pivot_root_old, const char *s);
int setup_pivot_root(const char *directory, const char *pivot_root_new, const char *pivot_root_old);

int tmpfs_patch_options(const char *options, uid_t uid_shift, const char *selinux_apifs_context, char **ret);
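
/* Illustrative sketch only, not the real tmpfs_patch_options(). It shows the single-argument idea from
the commit note on UID handling: one uid_t carries both the value and the "do not use it" case, and a
valid value is appended to the options as "uid=...,gid=...". All sketch_ and SKETCH_ names are
hypothetical, the definition of the invalid value is an assumption, and the SELinux context handling
of the real function is left out entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

// Stand-in for UID_INVALID, assumed here to be the all-ones uid_t value.
#define SKETCH_UID_INVALID ((uid_t) -1)

// The mapping quoted in the commit note: a shift of 0 is treated as "no UID to insert".
static uid_t sketch_uid_or_invalid(uid_t uid_shift) {
        return uid_shift == 0 ? SKETCH_UID_INVALID : uid_shift;
}

// Hypothetical helper: stores a newly allocated option string in *ret, or NULL if there is
// nothing to set. Returns 0 on success and -1 if the allocation fails.
static int sketch_patch_options(const char *options, uid_t uid, char **ret) {
        size_t len = options ? strlen(options) : 0;
        char *buf;

        assert(ret);

        if (uid == SKETCH_UID_INVALID && len == 0) {
                *ret = NULL;
                return 0;
        }

        // Room for the old options, ",uid=", ",gid=", two decimal numbers and the NUL byte.
        buf = malloc(len + 64);
        if (!buf)
                return -1;

        if (uid == SKETCH_UID_INVALID)
                memcpy(buf, options, len + 1); // nothing to add, plain copy including the NUL byte
        else
                snprintf(buf, len + 64, "%s%suid=%lu,gid=%lu",
                         len > 0 ? options : "", len > 0 ? "," : "",
                         (unsigned long) uid, (unsigned long) uid);

        *ret = buf;
        return 0;
}
```

The caller owns the returned string and releases it with free(). Folding the flag into the value keeps
the signature small, at the cost of reserving one uid_t value that can never be used as a real UID. */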

int pin_fully_visible_fs(void);
int wipe_fully_visible_fs(int mntns_fd);