From 327daea5e918d79335f3929240b3c126747f3e5d Mon Sep 17 00:00:00 2001
From: Lennart Poettering
Date: Mon, 25 Nov 2024 14:51:32 +0100
Subject: [PATCH] man: document new nspawn functionality around unpriv support

---
 man/systemd-nspawn.xml | 42 ++++++++++++++++++++++++++++++------------
 1 file changed, 30 insertions(+), 12 deletions(-)

diff --git a/man/systemd-nspawn.xml b/man/systemd-nspawn.xml
index 7bdb55d5d5f..f83f6be0acf 100644
--- a/man/systemd-nspawn.xml
+++ b/man/systemd-nspawn.xml
@@ -841,6 +841,12 @@
 host and container UIDs/GIDs are chosen identically it does provide process capability isolation, but may be useful if proper user namespacing with distinct UID maps is not possible. This option is not secure and must not be used to run untrusted code.
+
+ If the parameter is managed, user namespacing is employed with
+ in managed mode, i.e. allocation of a UID range is delegated to
+ systemd-nsresourced.service(8). This
+ mode is selected by default if invoked unprivileged, but can also be requested explicitly when
+ privileged. In this mode a 64K UID range is automatically picked.
 It is recommended to assign at least 65536 UIDs/GIDs to each container, so that the usable
@@ -852,18 +858,23 @@
 When user namespaces are used, the GID range assigned to each container is always chosen identical to the UID range.
- In most cases, using is the recommended option as user
- namespacing is required for security, and this option massively enhances container security while
+ In most cases, (or when privileged
+ , too) is the recommended option as user
+ namespacing is advised for security, and this option massively enhances container security while
 operating fully automatically in most cases. Note that the picked UID/GID range is not written to /etc/passwd or /etc/group. In fact, the allocation of the range is not stored persistently,
- except in the file ownership of the files and directories of the container.
+ except possibly in the file ownership of the files and directories of the container, see
+ .
- Note that when user namespacing is used file ownership on disk reflects this, and all of the container's
- files and directories are owned by the container's effective user and group IDs. This means that copying files
- from and to the container image requires correction of the numeric UID/GID values, according to the UID/GID
- shift applied.
+ Note that when user namespacing is used without UID mapping (see below) file ownership on disk
+ reflects this, and all of the container's files and directories are owned by the container's
+ effective user and group IDs. This means that copying files from and to the container image requires
+ correction of the numeric UID/GID values, according to the UID/GID shift applied.
+
+ Note that for fully unprivileged operation in managed mode, any directory
+ image should be owned by the foreign UID range.
@@ -875,8 +886,10 @@
 chosen with , see above. Takes one of off (to leave the image as is), chown (to recursively chown() the container's directory tree as needed), map (in order to use transparent ID mapping
- mounts) or auto for automatically using map where available and
- chown where not.
+ mounts from UID 0 to the target UID range), foreign (the same, but from the
+ foreign UID range base) or auto for automatically using map or
+ foreign, where available and applicable and chown where
+ not.
 If chown is selected, all files and directories in the container's directory tree will be adjusted so that they are owned by the appropriate UIDs/GIDs selected for the container
@@ -884,14 +897,19 @@
 directory tree of the container. Besides actual file ownership, file ACLs are adjusted as well.
- Typically map is the best choice, since it transparently maps UIDs/GIDs in
- memory as needed without modifying the image, and without requiring an expensive recursive adjustment
- operation. However, it is not available for all file systems, currently.
+ Typically foreign or map is the best choice, since it
+ transparently maps UIDs/GIDs in memory as needed without modifying the image, and without requiring
+ an expensive recursive adjustment operation. However, it is not available for all file systems,
+ currently.
 The option is implied if is used. This option has no effect if user namespacing is not used.
+
+ systemd-dissect(1)'s
+ switch may be used to shift UID/GID ownership from or to the 0, foreign or
+ specific container UID/GID base outside of any systemd-nspawn invocation.
+
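
Reviewer note (not part of the patch): the patched text says that copying files into or out of an image requires "correction of the numeric UID/GID values, according to the UID/GID shift applied". The Python sketch below illustrates only that arithmetic. It is not how systemd-nspawn or systemd-dissect implement the shift. The helper name `shift_id`, the example base 524288 and the fixed 65536-wide range are assumptions made for illustration.

```python
# Illustrative only: translate an on-disk UID/GID from one range base to another.
# The names and the example base value are hypothetical, not taken from systemd.

UID_RANGE_SIZE = 65536  # the 64K range per container mentioned in the man page text


def shift_id(on_disk_id: int, old_base: int, new_base: int,
             size: int = UID_RANGE_SIZE) -> int:
    """Return the ID that on_disk_id corresponds to after moving the range base.

    The offset inside the range is preserved; only the base changes.
    """
    offset = on_disk_id - old_base
    if not 0 <= offset < size:
        raise ValueError(
            f"ID {on_disk_id} lies outside the {size}-wide range starting at {old_base}"
        )
    return new_base + offset
```

For example, with a hypothetical container base of 524288, `shift_id(0, 0, 524288)` maps an image's root-owned files to the container's range, and `shift_id(524288 + 1000, 524288, 0)` maps a shifted file back to UID 1000 in a base-0 image. IDs outside the source range are rejected rather than silently shifted.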