virt-v2v/input/parse_vmx.mli
Richard W.M. Jones 255722cbf3 v2v: Modular virt-v2v
Split virt-v2v into several cooperating helper programs.  Use disk
image pipelines on both the input and output sides even when accessing
local files.  Expose the NBD sockets.  Use nbdcopy for the copy step.
Some features have been removed and we intend to add those back later
(see TODO file).

For the original plan to split virt-v2v, see:
https://listman.redhat.com/archives/libguestfs/2020-November/msg00022.html

Thanks: Ming Xie, Tingting Zheng, Nir Soffer, Eric Blake, Martin Kletzander

This change is made up of many separate commits done during
development.  The history of those commit messages is preserved below,
but the individual commits do not make too much sense so they have
been squashed into a single large change.

v2v: Move library-ish parts of virt-v2v into lib/ subdirectory

In preparation for splitting virt-v2v, move the library-ish parts of
the code that we wish to reuse in the new helpers into the lib/
subdirectory.

This is neutral code refactoring.

lib: Define format for metadata

In a previous iteration of the virt-v2v split I proposed using an open
format for metadata such as XML, and actually implemented much of it.

However to keep this change simple, and because no one except us is
supposed to be generating or consuming this metadata, this commit
replaces the open format with a simple OCaml serialization of an
opaque version string + the struct (eg. Types.sources).  The opaque
version string is there to ensure binary compatibility between the
helpers and to discourage people from trying to write or consume the
metadata.

Note: The metadata is not ABI and will change arbitrarily between
releases.  If you need to write or consume the metadata it's best to
talk to us about what you're trying to do.
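
As a rough illustration of the "opaque version string + struct" idea, here is a minimal sketch. It assumes the standard library's output_value/input_value marshalling is the kind of serialization meant. The names version_tag, write_metadata and read_metadata are hypothetical and are not the virt-v2v API.

```ocaml
(* Hypothetical names throughout; this is a sketch, not virt-v2v code. *)
let version_tag = "example-metadata-v1"

(* Write the opaque version string first, then the marshalled value. *)
let write_metadata filename v =
  let chan = open_out_bin filename in
  Fun.protect ~finally:(fun () -> close_out chan) (fun () ->
    output_value chan version_tag;
    output_value chan v)

(* Read the version string back and refuse to continue on a mismatch.
   The caller must annotate the expected type: marshalling is not
   type-safe, which is one reason the format is private to the helpers. *)
let read_metadata filename =
  let chan = open_in_bin filename in
  Fun.protect ~finally:(fun () -> close_in chan) (fun () ->
    let tag : string = input_value chan in
    if tag <> version_tag then
      failwith "metadata was written by an incompatible version";
    input_value chan)
```

A reader built with a different version_tag fails early instead of unmarshalling a struct with a different layout.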

inputs: Create helper-v2v-input-disk

As part of splitting up virt-v2v create input helpers.  This is the
first and simplest input helper which implements the “virt-v2v -i disk”
functionality, ie. being able to drive virt-v2v from a local disk file
without any metadata.

For further details on the virt-v2v split, refer to this plan:
https://listman.redhat.com/archives/libguestfs/2020-November/msg00022.html

outputs: Create helper-v2v-output-disk

This is the simplest possible output helper.  It creates the output
disks (really: processes and sockets).  Note this does not yet create
the final libvirt XML.  This will be added in a later commit.

convert: Create helper-v2v-convert

This commit moves the conversion code into a separate helper program
(helper-v2v-convert) which performs the conversion on the input disks.
The input disks are actually COW overlays over the source disks so
that nothing is changed on the source.

This step creates metadata files: guestcaps, inspect, target_buses and
target_firmware corresponding to the internal data structures.  These
will be consumed by the output finalization step.

v2v: Get rid of Modules_list

This functionality will be replaced in the new virt-v2v.

v2v: Rearrange sources into input/ and output/ directories

Rearrange sources for incomplete input and output drivers into the new
directories.

lib: Remove unused input and output objects

These objects are no longer required after creating the modular input
and output helpers.

lib/nbdkit.ml: Add LANG=C for all nbdkit instances

In old virt-v2v this was added through the Nbdkit_sources module to
all instances of nbdkit.  Add it unconditionally through Nbdkit module
to get the same effect.

outputs: Create helper-v2v-output-null

This handles -o null conversions.

v2v: Add new virt-v2v command line parser and program

In the newly modular virt-v2v, this program is responsible for
handling compatibility with the old virt-v2v command line.  It will
continue to be the main way that people use virt-v2v for the
foreseeable future.  This program starts the helper programs and
handles multiplexing of virt-v2v command line parameters to the right
helper.

docs, tests: Adjust --no-copy documentation and tests

Since copying and creating the output are now handled in separate
programs, --no-copy will usually create the output disks (but empty).
Adjust documentation and tests accordingly.

It's probably better to remove this option.

inputs: Create helper-v2v-input-libvirt

This handles "-i libvirtxml" (input from libvirt XML file), and all
"-i libvirt" cases which are not handled by more specific code
(ie. not vcenter-https, not vddk, not xen-ssh).

outputs: Finish finalization code for helper-v2v-output-disk

Create the final libvirt XML.

lib, tests: Don't print unused field in source_disk, fix test.

lib: Remove unused fields in s_disks struct

The fields s_qemu_uri and s_format were no longer used, remove them.

inputs: Create helper-v2v-input-ova

This handles parsing OVA files (-i ova).

outputs: Create helper-v2v-output-glance

This implements -o glance conversions to OpenStack Glance.

outputs: Create helper-v2v-output-json

Implements -o json mode.

outputs: Create helper-v2v-output-qemu

Implements -o qemu mode.

tests/test-v2v-bad-networks-and-bridges.sh: Fix test

This test depended on the specifics of parameter parsing and errors.
Adjust the test so it works with modular virt-v2v.

inputs: Combine all input helpers into one program.

This reduces the duplication of code from the previous plan.  There is
now a single helper, and it uses a "hidden" -im parameter (passed by
virt-v2v) to select the input mode, eg:

  helper-v2v-input -im libvirtxml v2vdir xmlfile

outputs: Combine all output helpers into one program.

This reduces duplication of code.  There is now a single helper, and
it uses a "hidden" -om parameter (passed by virt-v2v) to select the
output mode, eg:

  helper-v2v-output -om disk setup v2vdir -os /storage

outputs: Implement -o libvirt

inputs: Implement input from vcenter over HTTPS

This implements virt-v2v -i libvirt when we detect that the libvirt
URI points to a VMware server over HTTPS (without using VDDK).

inputs: Implement input from VMware using VDDK

This implements -i libvirt -it vddk.

inputs: Implement input from VMware via VMX

This implements -i vmx.

v2v: Fix -io ? and -oo ?

inputs: Implement input from Xen over SSH

input: Refactor input helper

Now that we have moved all the input-side code from old virt-v2v,
refactor and generally clean up.

output: Refactor output helper

General refactoring and clean up to improve the quality of the code in
the output helper.

outputs: Implement -o openstack

outputs: Implement -o rhv and -o vdsm

outputs: Implement -o rhv-upload

v2v: Run helpers with --program-name=virt-v2v

This means the helpers will use "virt-v2v" instead of "helper-v2v-..."
in error messages and similar, hopefully reducing confusion.

convert, output: Improve consistency of error messages

Don't use "prog" usually since it is added by the error function.
However occasionally when there's an internal error with virt-v2v
using the wrong arguments to the helper then we can use prog to
display the actual helper having problems.

inputs, outputs: Add cmdline abstract type

Convenient way to pass the multiple command line options as a single
parameter to functions.  This is simple refactoring to make the next
change possible.

inputs, outputs: Give an error for invalid option combinations

virt-v2v 1.4x was fussy about reporting errors for options which were
not applicable in certain input or output modes.  Replicate that as
much as possible here.  Old virt-v2v checked output modes more
thoroughly than input modes, and I have stuck with copying that
behaviour.

This also corrects an error in -o libvirt: In old virt-v2v the output
pool defaulted to "default" rather than giving an error.

inputs, outputs: Choose qemu-nbd PID file name based on socket

Previously we attempted to choose the PID file name randomly.
Although this should never conflict, I saw one case where qemu-nbd
failed to start up, printing only:

  qemu-nbd: Cannot lock pid file: Resource temporarily unavailable

My reading of the code is this could be caused by the PID file already
being locked.

Anyway there is a better way to choose PID file names: simply extend
the already unique socket name with ".pid".
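
A minimal sketch of that naming rule, assuming the socket path is already unique. The function name pidfile_of_socket is hypothetical.

```ocaml
(* Hypothetical helper: derive the PID file name from the socket name,
   so the PID file is unique whenever the socket is. *)
let pidfile_of_socket socket = socket ^ ".pid"
```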

v2v: Don't print double error messages

If running helper-v2v-* programs, assume that if these exit on error
then they have already printed an error message.  Therefore the main
virt-v2v program does not need to print another error message.

v2v: Set permissions and SELinux labels on all sockets

When running virt-v2v as non-root (the recommended way) this all
worked fine before.

However a problem arises when running virt-v2v as root.  Libvirt will
run qemu as a non-root user, so we need to set permissions
appropriately (ironically making everything a bit less secure).

Also set SELinux labels if we detect SELinux is being used.

Reported-by: Tingting Zheng

output: Explicitly shut down the NBD handle

This avoids a warning from qemu-nbd:

qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to socket: Broken pipe

For more information about the warning, see:

https://lists.nongnu.org/archive/html/qemu-block/2021-07/msg00703.html

lib/nbdkit: Always set both socket and file labels when using SELinux

We always set the file permissions to 0777 so we might as well always
set the SELinux labels when we detect that we are using SELinux.  This
avoids complexity elsewhere in virt-v2v.

inputs, outputs: Label all qemu-nbd sockets when using SELinux

Abstract qemu-nbd into a data type

Add a new module QemuNBD which contains the common code for running
qemu-nbd.  Replace existing code in the input and output helpers with
this module.

v2v: In verbose mode, dump nbdinfo about each NBD socket

This could help with debugging, especially understanding if nbdcopy
can use multi-conn.

input: Use the cache filter (if available) with slow plugins

This adds the cache filter to the chain of filters for slow plugins
(curl, ssh, vddk).

There is a potential further enhancement here: using conditional
cache-on-read=/path.  However that requires a very new nbdkit and
further changes elsewhere in virt-v2v.

input/nbdkit: Refactor these modules

These modules were made from old virt-v2v by splitting up the old
Nbdkit_sources module, but otherwise the code was virtually
unmodified.  This refactoring eliminates code duplication and dead
code left over from the split.

Although this is mostly refactoring, I also got rid of the ability to
use nbdkit-vddk-plugin < 1.17.10, which required the awkward use of
LD_LIBRARY_PATH.

convert: Do not use qemu block layer copyonread

Before this change:

[   0.0] Opening the source
[ 145.6] Inspecting the source
[ 988.4] Checking for sufficient free disk space in the guest
[ 988.4] Converting Fedora 28 (Server Edition) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[3892.1] Mapping filesystem data to avoid copying unused and blank areas
[4125.9] Closing the overlay
[4126.6] Assigning disks to buses
[4126.6] Checking if the guest needs BIOS or UEFI to boot
[4126.6] Creating output metadata
[4132.8] Copying disk 1/1
█ 100% [****************************************]
[4205.1] Creating output metadata
[4205.1] Finishing off

After this change:

[   0.0] Opening the source
[   8.4] Inspecting the source
[  14.1] Checking for sufficient free disk space in the guest
[  14.1] Converting Fedora 28 (Server Edition) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  83.5] Mapping filesystem data to avoid copying unused and blank areas
[  87.2] Closing the overlay
[  87.9] Assigning disks to buses
[  87.9] Checking if the guest needs BIOS or UEFI to boot
[  87.9] Creating output metadata
[  94.0] Copying disk 1/1
█ 100% [****************************************]
[ 165.7] Creating output metadata
[ 165.7] Finishing off

We are now faster than virt-v2v 1.45:

[   0.0] Opening the source -i libvirt [...]
[   1.4] Creating an overlay to protect the source from being modified
[   4.8] Opening the overlay
[  17.2] Inspecting the overlay
[  23.7] Checking for sufficient free disk space in the guest
[  23.7] Converting Fedora 28 (Server Edition) to run on KVM
virt-v2v: This guest has virtio drivers installed.
[ 110.0] Mapping filesystem data to avoid copying unused and blank areas
[ 124.5] Closing the overlay
[ 125.1] Assigning disks to buses
[ 125.1] Checking if the guest needs BIOS or UEFI to boot
[ 125.1] Initializing the target -o null
[ 125.2] Copying disk 1/1 to qemu URI json:{ "file.driver": "null-co", "file.size": "1E" } (raw)
    (100.00/100%)
[ 764.6] Creating output metadata
[ 764.6] Finishing off

Thanks: Peter Krempa

v2v: Write dir/convert and dir/copy files

During the conversion and copying phases, write files literally called
"convert" and "copy" into the v2v directory.  Helpers can use these to
make decisions based on the phase of virt-v2v.  In particular we will
use the presence of the "convert" file to determine if we need to
enable copy-on-read.
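
A helper could test for the marker file roughly as follows. This is a sketch only: the function names are hypothetical, and the only detail taken from the description above is that a file literally called "convert" exists in the v2v directory during the conversion phase.

```ocaml
(* Hypothetical helper: true while the "convert" marker file exists
   in the v2v directory [dir]. *)
let in_convert_phase dir =
  Sys.file_exists (Filename.concat dir "convert")

(* Copy-on-read is only wanted during conversion, per the text above. *)
let copy_on_read_wanted dir = in_convert_phase dir
```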

input: Implement nbdkit-cow-filter cow-on-read (copy on read)

This has considerable performance benefits during the conversion step.

See also:
https://listman.redhat.com/archives/libguestfs/2021-July/msg00054.html

Add list of requirements to the README

todo: Put some items left over from modularization on the backlog

input: -i disk: Always detect input format

If the input format is raw, prefer nbdkit.

v2v: Minor refactoring of the code that runs nbdcopy

convert: Remove bogus "Creating output metadata" message

Left over from virt-v2v 1.45
2021-09-07 11:24:03 +01:00


(* virt-v2v
* Copyright (C) 2017 Red Hat Inc.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*)
(** A simple parser for VMware [.vmx] files. *)
type t
val parse_file : string -> t
(** [parse_file filename] parses a VMX file. *)
val parse_string : string -> t
(** [parse_string s] parses VMX from a string. *)
val get_string : t -> string list -> string option
(** Find a key and return it as a string. If not present, returns [None].
Note that if [namespace.present = "FALSE"] is found in the file
then all keys in [namespace] and below it are ignored. This
applies to all [get_*] functions. *)
val get_int64 : t -> string list -> int64 option
(** Find a key and return it as an [int64].
If not present, returns [None].
Raises [Failure _] if the key is present but was not parseable
as an integer. *)
val get_int : t -> string list -> int option
(** Find a key and return it as an [int].
If not present, returns [None].
Raises [Failure _] if the key is present but was not parseable
as an integer. *)
val get_bool : t -> string list -> bool option
(** Find a key and return it as a boolean.
You cannot return [namespace.present = "FALSE"] booleans this way.
They are processed by the parser and the namespace and anything
below it are removed from the tree.
Raises [Failure _] if the key is present but was not parseable
as a boolean. *)
val namespace_present : t -> string list -> bool
(** Returns true iff the namespace ({b note:} not key) is present. *)
val select_namespaces : (string list -> bool) -> t -> t
(** Filter the VMX file, selecting exactly namespaces (and their
keys) matching the predicate. The predicate is a function which
is called on each {i namespace} path ({b note:} not on
namespace + key paths). If the predicate matches a
namespace, then all sub-namespaces under that namespace are
selected implicitly. *)
val map : (string list -> string option -> 'a) -> t -> 'a list
(** Map all the entries in the VMX file into a list using the
map function. The map function takes two arguments. The
first is the path to the namespace or key, and the second
is the key value (or [None] if the path refers to a namespace). *)
val equal : t -> t -> bool
(** Compare two VMX files for equality. This is mainly used for
testing the parser. *)
val empty : t
(** An empty VMX file. *)
val print : out_channel -> int -> t -> unit
(** [print chan indent vmx] prints the VMX file [vmx] to the output
channel [chan]. [indent] is the indentation applied to each line of
output. *)
val to_string : int -> t -> string
(** Same as {!print} but it creates a printable (multiline) string. *)
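
The lookup semantics documented above (keys addressed by a path of strings, and namespaces hidden by [present = "FALSE"]) can be illustrated with a toy model. This is not the real implementation: the type toy, the flat association list, the line syntax accepted by parse_line and the case-sensitive key matching are all assumptions made for this sketch, and the real parser may differ on each of them.

```ocaml
(* Toy model only: a flat list from key paths to values. The real
   [Parse_vmx.t] is abstract and is not reproduced here. *)
type toy = (string list * string) list

(* Parse one line of the form  a.b.c = "value".  Returns None for
   anything else (blank lines, '#' comments, unquoted values). *)
let parse_line line =
  match String.index_opt line '=' with
  | None -> None
  | Some i ->
    let key = String.trim (String.sub line 0 i) in
    let value =
      String.trim (String.sub line (i+1) (String.length line - i - 1)) in
    let n = String.length value in
    if key = "" || key.[0] = '#' || n < 2
       || value.[0] <> '"' || value.[n-1] <> '"' then None
    else Some (String.split_on_char '.' key, String.sub value 1 (n-2))

let parse_string s : toy =
  String.split_on_char '\n' s |> List.filter_map parse_line

(* All proper prefixes of a path, i.e. its enclosing namespaces. *)
let rec prefixes = function
  | [] -> []
  | x :: xs -> [] :: List.map (fun p -> x :: p) (prefixes xs)

(* A path is hidden if any enclosing namespace has present = "FALSE". *)
let hidden (t : toy) path =
  List.exists
    (fun ns ->
       match List.assoc_opt (ns @ ["present"]) t with
       | Some v -> String.uppercase_ascii v = "FALSE"
       | None -> false)
    (prefixes path)

let get_string t path =
  if hidden t path then None else List.assoc_opt path t

let get_bool t path =
  match get_string t path with
  | None -> None
  | Some v ->
    match String.lowercase_ascii v with
    | "true" -> Some true
    | "false" -> Some false
    | _ -> failwith ("not a boolean: " ^ v)
```

In this toy, looking up ["ide1:0"; "fileName"] returns None when the input contains ide1:0.present = "FALSE", and so does looking up ["ide1:0"; "present"] itself, which mirrors the note on [get_bool] above.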