mirror of https://github.com/systemd/systemd-stable.git (synced 2025-01-22 22:03:43 +03:00)
nspawn: add fallback to normal copy/reflink when we cannot btrfs snapshot
Given that other file systems (notably: xfs) support reflinks these days, let's extend the file system snapshotting logic to fall back to plain copies or reflinks when full btrfs subvolume snapshots are not available.

This essentially makes "systemd-nspawn --ephemeral" and "systemd-nspawn --template=" available on non-btrfs file systems. Of course, both operations will still be slower on non-btrfs than on btrfs (simply because reflinking each file individually in a directory tree is still slower than doing this in one step for a whole subvolume), but it's probably good enough for many cases, and we should provide the users with the tools; they have to figure out what's good for them.

Note that "machinectl clone" already had a fallback like this in place; this patch generalizes it and adds similar support to our other cases.
parent c67b008273
commit 17cbb288fa
@@ -599,8 +599,8 @@
 <listitem><para>Clones a container or VM image. The arguments specify the name of the image to clone and the
 name of the newly cloned image. Note that plain directory container images are cloned into btrfs subvolume
 images with this command, if the underlying file system supports this. Note that cloning a container or VM
-image is optimized for btrfs file systems, and might not be efficient on others, due to file system
-limitations.</para>
+image is optimized for file systems that support copy-on-write, and might not be efficient on others, due to
+file system limitations.</para>

 <para>Note that this command leaves host name, machine ID and
 all other settings that could identify the instance
@@ -910,7 +910,7 @@
 <filename>/var/lib/machines/</filename> to make them available for
 control with <command>machinectl</command>.</para>

-<para>Note that many image operations are only supported,
+<para>Note that some image operations are only supported,
 efficient or atomic on btrfs file systems. Due to this, if the
 <command>pull-tar</command>, <command>pull-raw</command>,
 <command>import-tar</command>, <command>import-raw</command> and
@@ -181,25 +181,15 @@
 <varlistentry>
 <term><option>--template=</option></term>

-<listitem><para>Directory or <literal>btrfs</literal>
-subvolume to use as template for the container's root
-directory. If this is specified and the container's root
-directory (as configured by <option>--directory=</option>)
-does not yet exist it is created as <literal>btrfs</literal>
-subvolume and populated from this template tree. Ideally, the
-specified template path refers to the root of a
-<literal>btrfs</literal> subvolume, in which case a simple
-copy-on-write snapshot is taken, and populating the root
-directory is instant. If the specified template path does not
-refer to the root of a <literal>btrfs</literal> subvolume (or
-not even to a <literal>btrfs</literal> file system at all),
-the tree is copied, which can be substantially more
-time-consuming. Note that if this option is used the
-container's root directory (in contrast to the template
-directory!) must be located on a <literal>btrfs</literal> file
-system, so that the <literal>btrfs</literal> subvolume may be
-created. May not be specified together with
-<option>--image=</option> or
+<listitem><para>Directory or <literal>btrfs</literal> subvolume to use as template for the container's root
+directory. If this is specified and the container's root directory (as configured by
+<option>--directory=</option>) does not yet exist it is created as <literal>btrfs</literal> snapshot (if
+supported) or plain directory (otherwise) and populated from this template tree. Ideally, the specified
+template path refers to the root of a <literal>btrfs</literal> subvolume, in which case a simple copy-on-write
+snapshot is taken, and populating the root directory is instant. If the specified template path does not refer
+to the root of a <literal>btrfs</literal> subvolume (or not even to a <literal>btrfs</literal> file system at
+all), the tree is copied (though possibly in a copy-on-write scheme — if the file system supports that), which
+can be substantially more time-consuming. May not be specified together with <option>--image=</option> or
 <option>--ephemeral</option>.</para>

 <para>Note that this switch leaves host name, machine ID and
@@ -1052,14 +1042,12 @@
 </example>

 <example>
-<title>Boot into an ephemeral <literal>btrfs</literal> snapshot of the host system</title>
+<title>Boot into an ephemeral snapshot of the host system</title>

 <programlisting># systemd-nspawn -D / -xb</programlisting>

-<para>This runs a copy of the host system in a
-<literal>btrfs</literal> snapshot which is removed immediately
-when the container exits. All file system changes made during
-runtime will be lost on shutdown, hence.</para>
+<para>This runs a copy of the host system in a snapshot which is removed immediately when the container
+exits. All file system changes made during runtime will be lost on shutdown, hence.</para>
 </example>

 <example>
@@ -20,6 +20,7 @@
 #include <errno.h>
 #include <fcntl.h>
 #include <inttypes.h>
+#include <linux/fs.h>
 #include <linux/loop.h>
 #include <stddef.h>
 #include <stdio.h>
@@ -38,6 +39,7 @@
 #include "alloc-util.h"
 #include "btrfs-ctree.h"
 #include "btrfs-util.h"
+#include "chattr-util.h"
 #include "copy.h"
 #include "fd-util.h"
 #include "fileio.h"
@@ -45,6 +47,7 @@
 #include "macro.h"
 #include "missing.h"
 #include "path-util.h"
+#include "rm-rf.h"
 #include "selinux-util.h"
 #include "smack-util.h"
 #include "sparse-endian.h"
@@ -1718,28 +1721,46 @@ int btrfs_subvol_snapshot_fd(int old_fd, const char *new_path, BtrfsSnapshotFlag
         if (r < 0)
                 return r;
         if (r == 0) {
+                bool plain_directory = false;
+
+                /* If the source isn't a proper subvolume, fail unless fallback is requested */
                 if (!(flags & BTRFS_SNAPSHOT_FALLBACK_COPY))
                         return -EISDIR;

                 r = btrfs_subvol_make(new_path);
-                if (r < 0)
+                if (r == -ENOTTY && (flags & BTRFS_SNAPSHOT_FALLBACK_DIRECTORY)) {
+                        /* If the destination doesn't support subvolumes, then use a plain directory, if that's requested. */
+                        if (mkdir(new_path, 0755) < 0)
+                                return r;
+
+                        plain_directory = true;
+                } else if (r < 0)
                         return r;

                 r = copy_directory_fd(old_fd, new_path, true);
-                if (r < 0) {
-                        (void) btrfs_subvol_remove(new_path, BTRFS_REMOVE_QUOTA);
-                        return r;
-                }
+                if (r < 0)
+                        goto fallback_fail;

                 if (flags & BTRFS_SNAPSHOT_READ_ONLY) {
-                        r = btrfs_subvol_set_read_only(new_path, true);
-                        if (r < 0) {
-                                (void) btrfs_subvol_remove(new_path, BTRFS_REMOVE_QUOTA);
-                                return r;
+
+                        if (plain_directory) {
+                                /* Plain directories have no recursive read-only flag, but something pretty close to
+                                 * it: the IMMUTABLE bit. Let's use this here, if this is requested. */
+
+                                if (flags & BTRFS_SNAPSHOT_FALLBACK_IMMUTABLE)
+                                        (void) chattr_path(new_path, FS_IMMUTABLE_FL, FS_IMMUTABLE_FL);
+                        } else {
+                                r = btrfs_subvol_set_read_only(new_path, true);
+                                if (r < 0)
+                                        goto fallback_fail;
                         }
                 }

                 return 0;
+
+        fallback_fail:
+                (void) rm_rf(new_path, REMOVE_ROOT|REMOVE_PHYSICAL|REMOVE_SUBVOLUME);
+                return r;
         }

         r = extract_subvolume_name(new_path, &subvolume);
@@ -45,10 +45,12 @@ typedef struct BtrfsQuotaInfo {
 } BtrfsQuotaInfo;

 typedef enum BtrfsSnapshotFlags {
-        BTRFS_SNAPSHOT_FALLBACK_COPY = 1,
+        BTRFS_SNAPSHOT_FALLBACK_COPY = 1, /* If the source isn't a subvolume, reflink everything */
         BTRFS_SNAPSHOT_READ_ONLY = 2,
         BTRFS_SNAPSHOT_RECURSIVE = 4,
         BTRFS_SNAPSHOT_QUOTA = 8,
+        BTRFS_SNAPSHOT_FALLBACK_DIRECTORY = 16, /* If the destination doesn't support subvolumes, reflink/copy instead */
+        BTRFS_SNAPSHOT_FALLBACK_IMMUTABLE = 32, /* When we can't create a subvolume, use the FS_IMMUTABLE attribute for indicating read-only */
 } BtrfsSnapshotFlags;

 typedef enum BtrfsRemoveFlags {
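How the fallback flags in the enum above interact can be modelled as a small pure function. The following is a sketch under stated assumptions: `SnapshotPlan` and `snapshot_plan()` are hypothetical names that do not exist in systemd, and the two boolean inputs stand in for the result of the subvolume check on the source and for `btrfs_subvol_make()` failing with `-ENOTTY` on the destination.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Flag values as listed in the enum in this diff */
typedef enum BtrfsSnapshotFlags {
        BTRFS_SNAPSHOT_FALLBACK_COPY      = 1,
        BTRFS_SNAPSHOT_READ_ONLY          = 2,
        BTRFS_SNAPSHOT_RECURSIVE          = 4,
        BTRFS_SNAPSHOT_QUOTA              = 8,
        BTRFS_SNAPSHOT_FALLBACK_DIRECTORY = 16,
        BTRFS_SNAPSHOT_FALLBACK_IMMUTABLE = 32,
} BtrfsSnapshotFlags;

static_assert((BTRFS_SNAPSHOT_FALLBACK_COPY & BTRFS_SNAPSHOT_FALLBACK_DIRECTORY) == 0,
              "fallback flags must be distinct bits");

/* Hypothetical outcome type, for illustration only */
typedef enum SnapshotPlan {
        PLAN_SNAPSHOT,        /* source is a subvolume: regular snapshot path */
        PLAN_SUBVOL_COPY,     /* new subvolume, tree copied/reflinked into it */
        PLAN_DIRECTORY_COPY,  /* plain directory, tree copied/reflinked into it */
} SnapshotPlan;

/* Returns a SnapshotPlan (>= 0) or a negative errno, mirroring the error
 * paths visible in the diff: -EISDIR and -ENOTTY. */
int snapshot_plan(bool source_is_subvol, bool dest_supports_subvol, int flags) {
        if (source_is_subvol)
                return PLAN_SNAPSHOT;

        /* Not a subvolume: only continue if the caller asked for a copy fallback */
        if (!(flags & BTRFS_SNAPSHOT_FALLBACK_COPY))
                return -EISDIR;

        if (dest_supports_subvol)
                return PLAN_SUBVOL_COPY;

        /* Destination cannot hold subvolumes: plain directory only on request */
        if (flags & BTRFS_SNAPSHOT_FALLBACK_DIRECTORY)
                return PLAN_DIRECTORY_COPY;

        return -ENOTTY;
}

/* The IMMUTABLE bit is only a substitute for the read-only subvolume flag,
 * so it applies solely to the plain-directory outcome. */
bool plan_wants_immutable(int plan, int flags) {
        return plan == PLAN_DIRECTORY_COPY &&
               (flags & BTRFS_SNAPSHOT_READ_ONLY) &&
               (flags & BTRFS_SNAPSHOT_FALLBACK_IMMUTABLE);
}
```

In this model a caller that passes only `BTRFS_SNAPSHOT_FALLBACK_COPY` still fails on a non-btrfs destination, while adding `BTRFS_SNAPSHOT_FALLBACK_DIRECTORY` makes the same call succeed with a plain directory.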
@@ -144,12 +144,12 @@ int pull_make_local_copy(const char *final, const char *image_root, const char *
         if (force_local)
                 (void) rm_rf(p, REMOVE_ROOT|REMOVE_PHYSICAL|REMOVE_SUBVOLUME);

-        r = btrfs_subvol_snapshot(final, p, BTRFS_SNAPSHOT_QUOTA);
-        if (r == -ENOTTY) {
-                r = copy_tree(final, p, false);
-                if (r < 0)
-                        return log_error_errno(r, "Failed to copy image: %m");
-        } else if (r < 0)
+        r = btrfs_subvol_snapshot(final, p,
+                                  BTRFS_SNAPSHOT_QUOTA|
+                                  BTRFS_SNAPSHOT_FALLBACK_COPY|
+                                  BTRFS_SNAPSHOT_FALLBACK_DIRECTORY|
+                                  BTRFS_SNAPSHOT_RECURSIVE);
+        if (r < 0)
                 return log_error_errno(r, "Failed to create local image: %m");

         log_info("Created new local image '%s'.", local);
@@ -4070,7 +4070,7 @@ int main(int argc, char *argv[]) {
         _cleanup_fdset_free_ FDSet *fds = NULL;
         int r, n_fd_passed, loop_nr = -1, ret = EXIT_SUCCESS;
         char veth_name[IFNAMSIZ] = "";
-        bool secondary = false, remove_subvol = false, remove_image = false;
+        bool secondary = false, remove_directory = false, remove_image = false;
         pid_t pid = 0;
         union in_addr_union exposed = {};
         _cleanup_release_lock_file_ LockFile tree_global_lock = LOCK_FILE_INIT, tree_local_lock = LOCK_FILE_INIT;
@@ -4152,7 +4152,12 @@ int main(int argc, char *argv[]) {
                                 goto finish;
                         }

-                        r = btrfs_subvol_snapshot(arg_directory, np, (arg_read_only ? BTRFS_SNAPSHOT_READ_ONLY : 0) | BTRFS_SNAPSHOT_FALLBACK_COPY | BTRFS_SNAPSHOT_RECURSIVE | BTRFS_SNAPSHOT_QUOTA);
+                        r = btrfs_subvol_snapshot(arg_directory, np,
+                                                  (arg_read_only ? BTRFS_SNAPSHOT_READ_ONLY : 0) |
+                                                  BTRFS_SNAPSHOT_FALLBACK_COPY |
+                                                  BTRFS_SNAPSHOT_FALLBACK_DIRECTORY |
+                                                  BTRFS_SNAPSHOT_RECURSIVE |
+                                                  BTRFS_SNAPSHOT_QUOTA);
                         if (r < 0) {
                                 log_error_errno(r, "Failed to create snapshot %s from %s: %m", np, arg_directory);
                                 goto finish;
@@ -4162,7 +4167,7 @@ int main(int argc, char *argv[]) {
                         arg_directory = np;
                         np = NULL;

-                        remove_subvol = true;
+                        remove_directory = true;

                 } else {
                         r = image_path_lock(arg_directory, (arg_read_only ? LOCK_SH : LOCK_EX) | LOCK_NB, &tree_global_lock, &tree_local_lock);
@@ -4176,7 +4181,13 @@ int main(int argc, char *argv[]) {
                         }

                         if (arg_template) {
-                                r = btrfs_subvol_snapshot(arg_template, arg_directory, (arg_read_only ? BTRFS_SNAPSHOT_READ_ONLY : 0) | BTRFS_SNAPSHOT_FALLBACK_COPY | BTRFS_SNAPSHOT_RECURSIVE | BTRFS_SNAPSHOT_QUOTA);
+                                r = btrfs_subvol_snapshot(arg_template, arg_directory,
+                                                          (arg_read_only ? BTRFS_SNAPSHOT_READ_ONLY : 0) |
+                                                          BTRFS_SNAPSHOT_FALLBACK_COPY |
+                                                          BTRFS_SNAPSHOT_FALLBACK_DIRECTORY |
+                                                          BTRFS_SNAPSHOT_FALLBACK_IMMUTABLE |
+                                                          BTRFS_SNAPSHOT_RECURSIVE |
+                                                          BTRFS_SNAPSHOT_QUOTA);
                                 if (r == -EEXIST) {
                                         if (!arg_quiet)
                                                 log_info("Directory %s already exists, not populating from template %s.", arg_directory, arg_template);
@@ -4359,12 +4370,12 @@ finish:

         loop_remove(loop_nr, &image_fd);

-        if (remove_subvol && arg_directory) {
+        if (remove_directory && arg_directory) {
                 int k;

-                k = btrfs_subvol_remove(arg_directory, BTRFS_REMOVE_RECURSIVE|BTRFS_REMOVE_QUOTA);
+                k = rm_rf(arg_directory, REMOVE_ROOT|REMOVE_PHYSICAL|REMOVE_SUBVOLUME);
                 if (k < 0)
-                        log_warning_errno(k, "Cannot remove subvolume '%s', ignoring: %m", arg_directory);
+                        log_warning_errno(k, "Cannot remove '%s', ignoring: %m", arg_directory);
         }

         if (remove_image && arg_image) {
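Because the ephemeral tree may now be a plain directory rather than a subvolume, the cleanup in the hunk above switches from `btrfs_subvol_remove()` to the generic `rm_rf()`. For the plain-directory case only, a recursive removal can be sketched with POSIX `nftw()` as below. `remove_tree()` is a hypothetical helper, it is not how systemd's `rm_rf()` is implemented, and unlike `REMOVE_SUBVOLUME` it makes no attempt to delete btrfs subvolumes.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* some libcs only declare nftw() with this set */
#endif

#include <assert.h>
#include <errno.h>
#include <ftw.h>
#include <stdio.h>      /* remove() */
#include <sys/stat.h>

/* nftw() callback: delete one entry, report the errno value on failure */
static int remove_entry(const char *path, const struct stat *sb, int typeflag, struct FTW *ftwbuf) {
        (void) sb;
        (void) typeflag;
        (void) ftwbuf;

        return remove(path) < 0 ? errno : 0;
}

/* Hypothetical helper (illustration only): remove `path` and everything
 * below it. Returns 0 on success or a negative errno. */
int remove_tree(const char *path) {
        int r;

        assert(path);

        /* FTW_DEPTH visits children before their directory, so directories
         * are empty when removed; FTW_PHYS avoids following symlinks. */
        r = nftw(path, remove_entry, 16, FTW_DEPTH | FTW_PHYS);
        if (r < 0)
                return -errno;   /* nftw() itself failed */

        return -r;               /* 0, or the errno reported by remove_entry() */
}
```

As in the diff, a caller would probably treat a failure here as a warning only, since the container has already exited by the time the tree is removed.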
@@ -609,14 +609,14 @@ int image_clone(Image *i, const char *new_name, bool read_only) {

                 new_path = strjoina("/var/lib/machines/", new_name);

-                r = btrfs_subvol_snapshot(i->path, new_path, (read_only ? BTRFS_SNAPSHOT_READ_ONLY : 0) | BTRFS_SNAPSHOT_FALLBACK_COPY | BTRFS_SNAPSHOT_RECURSIVE | BTRFS_SNAPSHOT_QUOTA);
-                if (r == -EOPNOTSUPP) {
-                        /* No btrfs snapshots supported, create a normal directory then. */
-
-                        r = copy_directory(i->path, new_path, false);
-                        if (r >= 0)
-                                (void) chattr_path(new_path, read_only ? FS_IMMUTABLE_FL : 0, FS_IMMUTABLE_FL);
-                } else if (r >= 0)
+                r = btrfs_subvol_snapshot(i->path, new_path,
+                                          (read_only ? BTRFS_SNAPSHOT_READ_ONLY : 0) |
+                                          BTRFS_SNAPSHOT_FALLBACK_COPY |
+                                          BTRFS_SNAPSHOT_FALLBACK_DIRECTORY |
+                                          BTRFS_SNAPSHOT_FALLBACK_IMMUTABLE |
+                                          BTRFS_SNAPSHOT_RECURSIVE |
+                                          BTRFS_SNAPSHOT_QUOTA);
+                if (r >= 0)
                         /* Enable "subtree" quotas for the copy, if we didn't copy any quota from the source. */
                         (void) btrfs_subvol_auto_qgroup(new_path, 0, true);
