64 Commits

Author SHA1 Message Date
Richard W.M. Jones
902cf78966 -i libvirt: Parse UEFI secureboot flag from libvirt XML
Link: https://libvirt.org/kbase/secureboot.html
Related: https://issues.redhat.com/browse/RHEL-67836
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
2024-11-18 14:26:50 +00:00
Richard W.M. Jones
35f96251fd -i vmx: Parse uefi.secureBoot.enabled from vmx file
Fixes: https://issues.redhat.com/browse/RHEL-67007
Reported-by: Ming Xie
2024-11-15 14:50:25 +00:00
Richard W.M. Jones
d46ceba1bd lib, input: Model UEFI secureboot property in metadata
This models a UEFI NVRAM variable which controls whether or not EFI
binary signatures are verified, which is an initial step in the long
chain of operations commonly known as "secure boot".  I added plenty
of documentation explaining what this really means as it is
non-obvious.

Thanks: Daniel Berrange, Andrea Bolognani
2024-11-15 14:50:25 +00:00
Richard W.M. Jones
69b4e83935 -o qemu: Replace hard-coded UEFI paths
Update the qemu shell script to simply find the UEFI paths as
required.

Remove lib/uefi.ml:find_uefi_firmware as this function is no longer
needed.

Remove common/mlv2v/ everywhere.  This contained a list of UEFI code
and NVRAM files which is no longer used.

Update common submodule.  This pulls in:

  Richard W.M. Jones (5):
      mlcustomize/customize_run.ml: Move 'in' to new line
      mlstdutils/guestfs_config: Define host_os
      mlcustomize, mltools: Check guest OS is compatible before allowing --run
      Remove mlv2v/ subdirectory
      qemuopts: Add ability to add raw, unquoted output to qemu scripts
      qemuopts: Fix missing break statement
2024-11-15 14:15:57 +00:00
Richard W.M. Jones
9cb7606904 -i libvirt: Trim whitespace around name
In -i libvirt / -i libvirtxml we didn't trim whitespace
around the name, so:

  <name> foo </name>

would set the input name to the literal string " foo ".
2024-11-15 14:01:03 +00:00
Richard W.M. Jones
de88b16622 v2v: Allow printing the checksum
Use <disk> ... <checksum method="sha256" fail="print" />

This doesn't check the checksum, it just computes and prints it.
2024-11-09 17:18:27 +00:00
Richard W.M. Jones
14f35e84f2 -i libvirtxml: Implement disk checksumming
It is not possible to specify an optional <checksum> field for each
disk.  This gives the expected checksum (to be supplied by some higher
level management tool), and virt-v2v will verify that the checksum of
the actual disk presented matches the expected checksum.

 <domain type='kvm'>
   ...
   <devices>
     <disk type='file' device='disk'>
       <driver name='qemu' type='raw'/>
       <source file='/path/to/disk/image'/>
       <target dev='hda' bus='ide'/>
       <checksum method='md5' fail='warn'>
         a07aeb7de93e9b60d9155294cf332508
       </checksum>
     </disk>

 $ virt-v2v-in-place -i libvirtxml file.xml
 ...
 [   0.0] Checking md5 checksum of disk 1/2
 warning: bad checksum for disk 1/2
 Expected checksum: a07aeb7de93e9b60d9155294cf332508
 Actual checksum: e83009eb529a958875008103bb248017
2024-11-09 15:40:41 +00:00
Richard W.M. Jones
7a82e130c2 lib: Only get nbdkit config and version once
The configuration (--dump-config) and version of nbdkit won't change
while we are running, so we only need to get it once.  We also don't
need the config parameter to Nbdkit.version so drop that.
2024-07-30 15:57:57 +01:00
Richard W.M. Jones
3d26e48af3 -i ova: Ignore dot-underscore-files in OVA files
I received an OVA created by a mac which contained various files
prefixed by "._" that contain some sort of extra information.  Ignore
those files when decoding OVAs:

$ tar tvf win22ktest.tar 2>/dev/null
drwxr-xr-x markd/staff       0 2024-07-23 18:23 win22test/
-rw-r--r-- markd/staff     619 2024-07-23 15:16 win22test/._win22test.mf
-rw-r--r-- markd/staff     271 2024-07-23 15:16 win22test/win22test.mf
-rw-r--r-- markd/staff     623 2024-07-23 15:16 win22test/._win22test-1.vmdk
-rw-r--r-- markd/staff 8348649984 2024-07-23 15:16 win22test/win22test-1.vmdk
-rw-r--r-- markd/staff        624 2024-07-23 15:00 win22test/._win22test-2.nvram
-rw-r--r-- markd/staff     270840 2024-07-23 15:00 win22test/win22test-2.nvram
-rw-r--r-- markd/staff        620 2024-07-23 15:00 win22test/._win22test.ovf
-rw-r--r-- markd/staff      12052 2024-07-23 15:00 win22test/win22test.ovf
2024-07-25 08:55:59 +01:00
Bella Khizgiyaev
bf33133c05
-i ova: Use the detected firmware type when absent rather than default (#53)
Signed-off-by: Bella Khizgiyaev <bkhizgiy@redhat.com>
2024-04-11 14:25:35 +01:00
Richard W.M. Jones
970d7123c2 input/ssh: Use nbdinfo --can connect (instead of --size)
nbdinfo --size prints the size on stdout, causing it to appear in
virt-v2v output.  Using --can connect instead is silent.

Note that nbdinfo --can connect was added in libnbd 1.9.2 (Jul 2021)
and our previous minimum version of libnbd was 1.9.3 so we're OK.
However since neither of these was a stable version I also updated the
minimum libnbd requirement to 1.10 (Sep 2021), and added a proper
check in ./configure

Fixes: commit fb72e059863a60503b6011b8590c25c3a010a58f
2024-01-18 17:25:25 +00:00
Richard W.M. Jones
4f0758a95a input/ssh: Rearrange parameters specifying ssh server 2024-01-17 14:20:44 +00:00
Richard W.M. Jones
a99d9f2afe input/nbdkit_ssh: Make password parameter optional
Instead of storing an explicit NoPassword case, make the password
parameter optional and encode NoPassword as None.  This is simple
refactoring.
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
fb72e05986 virt-v2v: -i vmx: Replace external ssh/scp with nbdkit-ssh-plugin
If you use a -i vmx ssh filename containing '*' then it will expand
the glob at the remote side.  New scp is weird and silently creates a
directory on the local side.  For example suppose there's a remote
file literally called "/tmp/test*" (ie. the '*' is part of the
filename) and you thought you could copy that to local using:

  scp 'remote:/tmp/test*' '/tmp/test*'

scp treats the first parameter (only) as a wildcard and if there are
multiple files matching the wildcard /tmp/test*, will create a local
directory literally called "/tmp/test*/" and put the files into it.
Who expected any of that?

You might think that double quoting (as we used to use) might work,
but that breaks with spaces and quotes.  I guess scp is using
different code paths internally for glob versus everything else.

The only way to really make this work is to stop using scp entirely
and just use sftp directly.  The easiest way to use sftp is to use
nbdkit-ssh-plugin.  We already depend on nbdkit-ssh-plugin, nbdcopy
and nbdinfo for other parts of virt-v2v, so might as well use the
whole lot here.

One advantage of this change is that now the -ip (input password)
parameter actually works in -i vmx -it ssh mode.

Other approaches that would have been possible:

 - Use the OCaml NBD library instead of nbdcopy/nbdinfo

 - Use 'nbdkit -U - ssh --run ...'

 - Direct binding to libssh.

See also commit e2af12ba69c4463bb73d30db63290a887cdd41eb
("input: -i vmx: Remove support for openssh scp < 8.8")

See also commit 22c5b98ab78c734b478c26e14ee62e2a065aaa0c
("-it ssh: Double quote ssh command which tests remote file exists")

See also https://unix.stackexchange.com/a/587710

Reported-by: Ming Xie
Fixes: https://issues.redhat.com/browse/RHEL-21365
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
72c9734050 input/nbdkit_ssh: Make retry filter optional
This filter retries a request on failure.  Add a flag to make this
filter optional, but default to using it if available (so there's no
change to default behaviour).

The reason we want to make this optional is that the next commit will
want to use nbdkit-ssh-plugin to probe to see if a remote file is
present, and we don't want to retry that probe on failure.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
40730615d2 virt-v2v: -i vmx: Remove dependency of ssh.ml on Xml.uri
Xml.uri is a convenient way to pass the multiple ssh fields to
virt-v2v and is still used internally by the -i vmx code.  However
don't leak this awkward implementation into the new ssh.ml module.

Like Nbdkit_ssh, we'll only deal with the explicit (password, port,
server, user, path) fields separately.

Note after this refactoring:

 - The new Ssh module interface looks broadly similar to the
   Nbdkit_ssh interface.

 - vmx_source_of_arg assertions are no longer required inside the new
   Ssh module, which was a mild layering violation before.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
60e72acb34 virt-v2v: -i vmx: Add the input password to vmx_source
Since we use the input password in various places in the VMX module,
store the input password in vmx_source.  This neutral refactoring
makes later changes simpler.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
a0b22af287 virt-v2v: -i vmx: Simplify scp wrapper
The existing 'scp_from_remote_to_temporary' wrapper around scp was
pretty weird (I think from much patching without refactoring).
Simplify it so it no longer generates the output filename, and rename
it accordingly to 'download_file' (as we will soon remove the need for
and dependency on 'scp').

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
cddd07669d virt-v2v: -i vmx: Refactor ssh/scp code into a new module
This is a straight refactor of the existing code that handles ssh/scp
into a new module.  In this commit I just copy the code around without
doing any cleanup; cleanup will follow in subsequent commits.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
af68f253d1 virt-v2v: -i vmx: Remove scp -T option
This reverts the following commit:

  commit d265639c2ab31418cfdbdedd0cc3e68cf290d834
  Author: Richard W.M. Jones <rjones@redhat.com>
  Date:   Thu Jul 25 14:52:42 2019 +0100

    v2v: -i vmx: Use scp -T option if available to unbreak scp (RHBZ#1733168).

See also the referenced bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1733168

My rationale for removing this option is that since we now require
OpenSSH 8.8 we must be using sftp for file transfer so we no longer
need to defeat the check for correct expansion of wildcards.  That
check was only relevant for OpenSSH <= 8.7 using the old scp protocol.

Reverts: commit d265639c2ab31418cfdbdedd0cc3e68cf290d834
2024-01-11 15:35:43 +00:00
Richard W.M. Jones
6f2536bd9f input/parse_vmx.ml: Reject VMDK files
Instead of the confusing warnings printed before, it now prints
an error indicating the incorrect input format:

$ virt-v2v -i vmx -it ssh ssh://root@xxx/vmfs/volumes/esx8.0-function/Auto-esx8.0-rhell9.3-efi-with-empty-cdrom/Auto-esx8.0-rhell9.3-efi-with-empty-cdrom.vmdk -ip /tmp/passwd -o null
virt-v2v: error: input file is a VMDK (disk image), but we are expecting a
VMX (VMware metadata)

Reported-by: Ming Xie
Fixes: https://issues.redhat.com/browse/RHEL-19564
2023-12-19 12:57:42 +00:00
Richard W.M. Jones
22c5b98ab7 -it ssh: Double quote ssh command which tests remote file exists
Double quoting was removed in
commit e2af12ba69c4463bb73d30db63290a887cdd41eb ("input: -i vmx:
Remove support for openssh scp < 8.8", Nov 2021).  However it should
only have been removed from scp commands, not for this ssh command
where it is still required.

See: https://github.com/libguestfs/virt-v2v/issues/35
Thanks: Laszlo Ersek for diagnosis and suggesting the fix
Reported-by: Bill Sanders
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2023-10-02 15:17:43 +01:00
Richard W.M. Jones
97d8c28b7e builder: Use new Curl.create API that requires url parameter
Update common submodule, picking up this fix:

  commit 06721c00e2582daf1b3bedc4fbfb699b9b788a41
  Author: Richard W.M. Jones <rjones@redhat.com>
  Date:   Thu Jun 29 13:10:24 2023 +0100

    mltools: curl: Split out the url parameter

This requires a small change to input/vCenter.ml
2023-06-29 13:23:05 +01:00
Richard W.M. Jones
17c6d52447 input: -i ova: Less strict parsing of lines from the manifest file
eBlocker (https://eblocker.org) provides an OVA which contains this
manifest (*.mf) file:

SHA256 (eBlockerVM-2.9.1-amd64.ovf) = 7b25ea03c3b9eb143c7f6d1e8f76c3717e5b928f007e571cd96132bf6a15a30d
SHA256 (eBlocker VM-disk001.vmdk) = 646d1ec6ddf5a6d6604dbec421624ca2cea6ee5b57bf80ae635bf8f4f0bfea45

Note that the manifest is not conforming with the OVFS specification
which says the line format should be:

file_digest = algorithm "(" file_name ")" "=" sp digest nl

Space (sp) is not permitted after the algorithm or before the = sign.

However let's be tolerant of the extra whitespace.
2023-04-03 11:39:01 +01:00
Richard W.M. Jones
8ad152afc4 Rework Std_utils.Option so it works like the OCaml stdlib module
OCaml 4.08 introduces a stdlib Option module which looks a bit like
ours but has a number of differences.  In particular our functions
Option.may and Option.default have no corresponding functions in
stdlib, although there are close enough equivalents.

This change was automated using this command:

$ perl -pi.bak \
  -e 's/Option.may/Option.iter/g; s/Option.default /Option.value ~default:/g' \
  `git ls-files`

Update common module to include:

  commit cffa077323fafcdfcf78e230c022afa891a6b3ff
  Author: Richard W.M. Jones <rjones@redhat.com>
  Date:   Mon Feb 20 12:11:51 2023 +0000

    mlstdutils: Rework the Option module to be compatible with stdlib

  commit 007d0506c538db0a43fec7e9986a95ecdcd48b56
  Author: Richard W.M. Jones <rjones@redhat.com>
  Date:   Mon Feb 20 12:18:29 2023 +0000

    mltools: Replace Option.may with Option.iter
2023-02-20 12:21:47 +00:00
Richard W.M. Jones
d2b01e487f Split long lines in messages
This commit splits up any long lines found in errors, warnings or
other messages.  OCaml ignores whitespace following "\<CR>" within a
string, eg:

  "long string \
   more stuff"

is parsed as:

  "long string more stuff"

Thanks: Laszlo Ersek, for working out the OCaml syntax for this
2023-01-20 10:29:24 +00:00
Andrey Drobyshev
7361431311 parse_libvirt_xml: look for manual firmware in "/domain/os/loader/@type"
According to [1], there're different ways to specify which firmware is
to be used by a libvirt-driven VM.  Namely, there's an automatic
firmware selection, e.g.:

    ...
    <os firmware='(bios|efi)'>
    ...

and a manual one, e.g.:

    ...
    <os>
        <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
        ...
    </os>
    ...

with the latter being a way to specify UEFI firmware.  So let's add this
search path as well when parsing source VM's libvirt xml.

[1] https://libvirt.org/formatdomain.html#bios-bootloader

Co-authored-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Originally-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <20221220150133.403120-1-andrey.drobyshev@virtuozzo.com>
2022-12-21 10:10:29 +01:00
Richard W.M. Jones
2eb6441264 common: Adapt to renamed function On_exit.rmdir -> On_exit.rm_rf
This function was renamed to make it clearer what it does (and that
it's potentially dangerous).  The functionality is unchanged.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2022-07-15 08:57:52 +01:00
Richard W.M. Jones
3c4505c12a build: Remove bundled OCaml bindings for libvirt
The ocaml-libvirt project (https://gitlab.com/libvirt/libvirt-ocaml)
provides bindings for libvirt.  For historical reasons we bundled this
as it was throught ocaml-libvirt wasn't widespread on distros.  In
fact Fedora and Debian derivatives all have this.  Arch does not
(yet), but we can fix that.

It said in the README file in that directory, "before virt-v2v 1.42 is
released we hope to have unbundled this".  That didn't happen, but
let's remove it now.
2022-04-28 15:42:26 +01:00
Laszlo Ersek
e7539dc6f6 input: -i vmx: Add support for SATA hard disks
See also:

- virt-v2v commit 75872bf282d7 ("input: -i vmx: Add support for NVMe
  devices", 2022-04-08),

- libvirt commit 2214fe90442c ("vmx: start parsing SATA disks",
  2020-10-14).

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1883802
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20220419152415.13356-3-lersek@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2022-04-21 16:06:43 +02:00
Laszlo Ersek
ae5b44639b "-i vmx -it ssh": fix non-unique "s_disk_id"
The "s_disk_id" field is supposed to be unique across all fixed disks of
the source domain.

With "-i vmx -it ssh" however, we restart assigning "s_disk_id" from zero
near the end of the "find_hdds" function.

The problem is that "find_hdds" is separately called from
"find_scsi_disks", "find_nvme_disks", and "find_ide_disks". Thus, if we
have at least two hard disks in the source on different controller types,
we'll output two disks with a zero-value "s_disk_id".

"input/input_vmx.ml" starts up nbdkit with "in%d" socket names that are
based on list position, not "s_disk_id", so the nbdkit invocations are
correct. However, in "convert/convert.ml", the "server" argument for
"g#add_drive_opts" is based on "s_disk_id"; therefore, one of the disks
will be multiply referenced, and the others will be ignored.

Assign "s_disk_id" in "find_disks" [input/parse_domain_from_vmx.ml] once
all the disks have been found and concatenated into the common list.

Note that the global sorting order remains well-defined. Both before and
after this patch, the sorting occurs *within* "find_hdds", but that's OK:
the concatenation order between the type-specific disk-lists is
well-defined (albeit arbitrary) in "find_disks".

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1883802
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20220419152415.13356-2-lersek@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2022-04-21 16:06:33 +02:00
Laszlo Ersek
e0b08ee021 input_vmx: cleanly reject guests with snapshots when using "-it ssh"
For traversing a backing chain of VMDK descriptor files over ssh, two
things are necessary:

- qemu-nbd with the ssh block driver, rather than nbdkit-ssh-plugin,

- a remote SSH URL (for qemu-nbd) without a query string appended, as
  qemu-nbd cannot update the last pathname component (for tracking the
  relative pathnames of VMDK descriptor files) if a query string is
  appended.

Before commit 7a6f6113a25f ("v2v: -i vmx -it ssh: Replace qemu block ssh
driver with nbdkit-ssh-plugin.", 2019-10-08), we passed the
"?host_key_check=no" query string in the URL to qemu-nbd, so we can't just
return to that, for accessing snapshotted guests with vmx+ssh.

But, we shouldn't return to qemu-nbd for vmx+ssh even without a query
string, as that would undo the other benefit(s) of commit 7a6f6113a25f.

Instead, clearly document that snapshotted guests are not supported over
vmx+ssh, and cleanly reject this situation in the code as well. Recommend
the two alternative transports that allow the user to convert such guests.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1774386
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20220409061252.6551-1-lersek@redhat.com>
[lersek@redhat.com: replace B<-...> with I<-...> in NOTES]
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2022-04-09 09:07:29 +02:00
Richard W.M. Jones
75872bf282 input: -i vmx: Add support for NVMe devices
We model NVMe devices in the source hypervisor.

We currently assume that no one is using the namespaces feature of
NVMe, ie. that each source device will appear in a Linux guest as
/dev/nvme0n1, /dev/nvme1n1, etc.  We could fix this if it is a
problem, but it requires adjusting the current assumption for
removable devices that slots are simple integers.

The devices are mapped to virtio-blk, so in the target the device name
has to change from /dev/nvme0 to /dev/vda (etc.)

Reported-by: Ming Xie
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2070530
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2022-04-08 16:24:38 +01:00
Richard W.M. Jones
db00a9a9aa input: -i ova: Handle OVAs which contain user/group names with spaces
If importing an OVA that has user/group names with spaces then the
plain tar -tRvf command would print them without any quoting.  Our
simple strategy of splitting on spaces resulted in an "extra field"
being parsed.  This is an example from a real OVA (note "Domain Users"
is the group name):

$ tar --quoting-style=literal -tRvf protect_appliance.ova
block 0: -rw-r--r-- eraautobuilds/Domain Users 33508 2021-11-04 17:48 PROTECT_Appliance.ovf

Luckily this is fairly simple to fix.  We don't care about the
original user/group name, and using --numeric-owner causes tar to
print the UID/GID instead:

$ tar --quoting-style=literal --numeric-owner -tRvf protect_appliance.ova
block 0: -rw-r--r-- 1074101/1049089 33508 2021-11-04 17:48 PROTECT_Appliance.ovf

I also added --quoting-style=literal to deal with possible future
cases where the filename contains spaces.  Because we use
nsplit ~max:8 these should now be handled correctly too, although I
didn't test this.

Reported-by: Jiří Sléžka
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2069768
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2022-04-04 13:03:41 +01:00
Laszlo Ersek
9788b06765 nbdkit, qemuNBD: run_unix: formally require externally provided socket
At this point, virt-v2v never relies on the Unix domain sockets created
inside the "run_unix" implementations. Simplify the code by removing this
option.

Consequently, the internally created temporary directory only holds the
NBD server's PID file, and never its UNIX domain socket. Therefore:

(1) we no longer need the libguestfs socket dir to be our temp dir,

(2) we need not change the file mode bits on the temp dir,

(3) we can rename "tmpdir" to the more specific "piddir".

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2066773
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20220323104330.9667-1-lersek@redhat.com>
Acked-by: Richard W.M. Jones <rjones@redhat.com>
2022-03-23 13:39:12 +01:00
Richard W.M. Jones
6413024848 input: Add options.read_only flag
This flag can be used for certain inputs to control whether we place a
protective overlay over the source.  It is only supported for local
file type inputs (-i disk and -i libvirt).  For other input types it
cannot work:

 * -i ova: These cannot be updated in-place because the OVA embeds
   checksums.

 * Any input on VMware or Xen: There's no point doing an in-place
   update for a guest which his still sitting on a foreign hypervisor.

 * -i vmx: We cannot reliably update a disk in VMDK format using qemu
   tools.

This flag is always set to true, so there is no effect in the current
code.  It allows us to implement a second front end for in-place
support.
2022-03-10 15:11:53 +00:00
Richard W.M. Jones
407c6e5cae vddk: Use blocksize filter to split very large requests
We know that VDDK has problems handling very large single requests.
It may cause out of memory errors on the server unless a relatively
obscure server-side configuration change is made.  With modular
virt-v2v we won't necessarily have fine control over who is connecting
to the input socket and what sized requests they may make, eg if we
delegate copying to a third party.

nbdkit-blocksize-filter offers a simple solution.  It will split
requests larger than a certain size, ensuring that whatever NBD
requests come in, we shouldn't cause problems on the VMware server.
I chose 2M as the maximum request size, but this could be fine-tuned
later.

For a longer explanation, see:
6f4746af9a

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2039255
Acked-by: Laszlo Ersek <lersek@redhat.com>
Reported-by: Ming Xie <mxie@redhat.com>
Tested-by: Ming Xie <mxie@redhat.com>
2022-02-15 15:27:58 +00:00
Richard W.M. Jones
cee676fb70 input: Refactor code in the input modules
Cleans up things left over from modularising virt-v2v and aims to make
the code more straightforward to follow.

In particular (just because of the incremental steps that
modularization went through) the input modules all looked like:

  let foo_source dir options args =
    ...
  module Foo = struct
    let setup dir options args = foo_source dir options args
  end

(and the same for other methods).  One of the refactorings is to
simply place the code directly inside the module functions.

I also moved around the module methods to reflect more closely the
order in which the code is actually called by virt-v2v.

The change looks large, but that's because of whitespace changes
because of reindenting the code.  There are no code changes here and
the refactoring should be completely neutral.
2022-02-11 13:06:07 +00:00
Richard W.M. Jones
8abc07a858 input: Require -ip password for vCenter over HTTPS
As far as I've ever seen, you always need a password to access the
/folder directory of a VMware server.  If the password isn't specified
to virt-v2v using the -ip option then it used to try to ask for it
interactively when virt-v2v started.  Actually it asked for the
password several times -- at least twice.

After we added the cookie-script feature (commit 2b9a11743b "v2v:
vcenter: Implement cookie scripts.") it tries to ask interactively at
random points during the conversion, which is obviously bad from a UI
point of view but also because the password is requested without a
prompt looks like a hang.

We could solve this by prompting for a password.  But virt-v2v is not
primarily an interactive tool and it's an easier fix is to require the
caller to use the -ip passwordfile option (in this particular mode).

Reported-by: Xinyu Li
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1960087
2022-02-01 10:02:49 +00:00
Richard W.M. Jones
20019b5cad input: libvirt: Share a single connection to the source NBD server
When using virt-p2v from RHEL 7, it starts a very old qemu-nbd server
(probably 1.5.3) which required the --shared parameter to enable
sharing even in read-only mode.  Since it doesn't pass this parameter
only a single connection at a time is allowed, and further connections
will deadlock.  Note that later versions of qemu-nbd changed this so
that read-only connections permit sharing automatically.

In modular virt-v2v we now use nbdkit-nbd-plugin to proxy the
connection to virt-p2v / qemu-nbd.  When you connect to this multiple
times, as virt-v2v does, it will make multiple connections to the
backend qemu-nbd.  This will cause a deadlock.

We can use the nbdkit-nbd-plugin shared=true flag to enable the plugin
to share a single connection to the backend between multiple nbdkit
clients.

Reported-by: Tingting Zheng
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2044911
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2022-01-26 12:56:06 +00:00
Richard W.M. Jones
924aa8b70a Restore message about setting up the input and output
Old virt-v2v would print a summary of the input and output options
before connecting to the input/output, looking something like this:

  [   0.2] Opening the source -i libvirt -ic [etc]

This gave reassurance that virt-v2v was doing something in the case
where the source was slow or unreachable.  In particular if you use
-i libvirt with a vCenter URL, and the URL is wrong, libvirt hangs for
a few minutes without printing anything.

Modular virt-v2v rearranged things so the connecting phase was silent,
which meant that in the case above virt-v2v appeared to hang for a few
minutes printing nothing at all.

This change adds to_string functions to all the input and output
methods and uses them to print a message like:

  [   0.0] Setting up the source: -i libvirt -ic [etc]

The hang still happens, but at least it's now clear where it's hanging.

Note the old "Opening the source" message now refers to libguestfs
connecting to the NBD source disk pipeline.

Typical full output looks like this:

  $ virt-v2v -i disk /var/tmp/fedora-35.img -o disk -os /var/tmp/out
  [   0.0] Setting up the source: -i disk /var/tmp/fedora-35.img
  [   1.1] Opening the source
  [   5.9] Inspecting the source
  [  11.5] Checking for sufficient free disk space in the guest
  [  11.5] Converting Fedora Linux 35 (Thirty Five) to run on KVM
  virt-v2v: warning: /files/boot/grub2/device.map/hd0 references unknown
  device "vda".  You may have to fix this entry manually after conversion.
  virt-v2v: This guest has virtio drivers installed.
  [  57.4] Mapping filesystem data to avoid copying unused and blank areas
  [  61.0] Closing the overlay
  [  61.7] Assigning disks to buses
  [  61.7] Checking if the guest needs BIOS or UEFI to boot
  [  61.7] Setting up the destination: -o disk -os /var/tmp/out
  [  62.8] Copying disk 1/1
  █ 100% [****************************************]
  [  81.7] Creating output metadata
  [  81.7] Finishing off

Reported-by: Xiaodai Wang
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2041886
Fixes: commit 255722cbf39afc0b012e2ac00d16fa6ba2f8c21f
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2022-01-19 15:26:28 +00:00
Richard W.M. Jones
f3180e3c0e input: xen: Fix assertion error when importing from remote block device
We never supported this because OpenSSH sftp server does not know how
to open and get the size of a block device.  Before modular virt-v2v,
we would detect this situation and print an error:

  virt-v2v: error: guest disk sda appears to be zero bytes in size.

  There could be several reasons for this:

  Check that the guest doesn't really have a zero-sized disk.  virt-v2v
  cannot convert such a guest.

  If you are converting a guest from an ssh source and the guest has a disk
  on a block device (eg. on a host partition or host LVM LV), then
  conversions of this type are not supported.  See the virt-v2v-input-xen(1)
  manual for a workaround.

This error was lost in the conversion to modularity, but in any case
the proper way to detect this is in the input_xen driver itself.

In addition to this, when we removed the old virt-v2v-copy-to-local
tool, the text referring to the workaround in the manual became
meaningless, so I added the (ugly, manual) workaround that you have to
do back into the manual, minus the bit about the tool that we removed.

In future we could consider other ways to convert a remote block
device over SSH transparently.  (Sending a "dd" command, maybe?)

I also checked what happens when we try to convert an empty disk, and
it fails during inspection with a reasonable error message, so we
don't need to do anything to restore the above error:

  $ virt-v2v -i disk /var/tmp/empty -o null
  [   1.1] Opening the source
  [   7.0] Inspecting the source
  virt-v2v: error: inspection could not detect the source guest (or physical
  machine).

  Assuming that you are running virt-v2v/virt-p2v on a source which is
  supported (and not, for example, a blank disk), then this should not
  happen.

  No root device found in this operating system image.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2041852
Reported-by: Xiaodai Wang
2022-01-18 12:48:56 +00:00
Richard W.M. Jones
bdd8412164 input, output: Use Tools_utils.open_guestfs instead of Guestfs.create
The common function open_guestfs passes verbose and trace flags from
virt-v2v to the libguestfs handle, thus improving debugging.

Also update the common submodule to get:

    mltools: Error out if the libguestfs environment is bad

    Currently setting bogus environment variables causes libguestfs to
    print errors but continue running, eg:

    $ LIBGUESTFS_BACKEND=bogus virt-v2v --version
    libguestfs: error: invalid backend: bogus
    virt-v2v 1.45.96fedora=36,release=1.fc36

    If environment variables are bogus we really ought to exit early
    instead.  After this change:

    $ LIBGUESTFS_BACKEND=bogus virt-v2v --version
    Fatal error: exception Guestfs.Error("invalid backend: bogus")

    Thanks: Xiaodai Wang
2022-01-17 15:53:46 +00:00
Richard W.M. Jones
702a511b7f input/nbdkit_curl.ml: Fix typo in commented code
Fixes: commit 255722cbf39afc0b012e2ac00d16fa6ba2f8c21f
2021-12-18 13:22:48 +00:00
Richard W.M. Jones
24fdb088b1 input, output: Use Option.may for some Nbdkit calls
Option.may (Nbdkit.add_arg cmd "port") port;

is completely equivalent to:

  (match port with
   | Some s -> Nbdkit.add_arg cmd "port" s
   | None -> ());

Updates: commit d50966c2a480bda033f6e63bb797f86c13d576bd
2021-12-18 12:02:52 +00:00
Richard W.M. Jones
e3e1000226 input: Remove nbdkit file fadvise=sequential option in a couple more places
The result of commit ac59d3b231 ("output: Don't use nbdkit-file-plugin
cache=none when writing") and the later partial revert in
commit 5d5383b912 is that we deliberately dropped fadvise=sequential
when using nbdkit-file-plugin, while keeping cache=none.

This flag was giving an incorrect hint to the kernel.  It was still
being used in a couple of places in virt-v2v, this commit drops those
too for the same reason.
2021-12-11 17:35:26 +00:00
Richard W.M. Jones
ff84a00fff lib: Make QemuNBD mini-library handle mutable
Similar change to the previous commit.
2021-12-11 17:22:07 +00:00
Richard W.M. Jones
d50966c2a4 lib: Make Nbdkit mini-library handle mutable
Previously the Nbdkit.cmd handle was immutable, which is nice from a
purity point of view but made for a lot of boilerplate.  If we make it
mutable we can make the code a little smaller and easier to read and
write.

There is no significant change here and this is mostly refactoring.
However I also cleaned up the way environment variables are done
(LANG=C was duplicated before), and also the handling of the verbose
flag.
2021-12-11 17:09:56 +00:00
Richard W.M. Jones
8ffde4f4c2 input: Drop all use of nbdkit-readahead-filter
Previously we used nbdkit-readahead-filter with input modes that used
curl, ssh and vddk nbdkit plugins.  We quickly dropped it for curl and
vddk because it caused poor performance and failures with VMware
servers (bugs https://bugzilla.redhat.com/show_bug.cgi?id=1848862
https://bugzilla.redhat.com/show_bug.cgi?id=1832805).

This drops it for ssh too (eg for vmx or xen-ssh input modes).

The readahead filter in nbdkit is naively written and implementing a
replacement has proven difficult.  In any case it is not needed.  When
doing the conversion, readahead is already implemented in the
libguestfs appliance kernel.  When doing the copy, nbdcopy is already
able to keep sufficient requests in flight.
2021-12-11 16:26:50 +00:00
Richard W.M. Jones
351d61f768 input: -it vddk: Reduce cow-block-size to 4K
NB: Requires the following fix to nbdkit:
c6fe9cb5b4

In commit 17358abd ("inputs: vddk: Use cow-block-size=1M") we started
to use a larger cow filter block size than the default.  This is based
on the theory that a larger block size causes larger, more efficient
read and extents requests to VDDK.  Some data were shown in the commit
message to show this was more efficient at the time.

However further testing shows it's not clear cut.  The trimming step
("Mapping filesystem data") is considerably slower with a larger block
size.  This is because the cow filter does a read-modify-write cycle
on the unaligned head and tail of the request (most trims and small
and/or not aligned to large block sizes), and the read part of this
cycle goes through VDDK which is slow.  The overall results are
clearly now better with a small block size, as shown below, even
though the copy step is slower, possibly because less data has been
read through the cow filter and cached during conversion.

Note that nbdkit-cow-filter coalesces operations so the original
rationale from that commit seems dubious.

Before (note time spent in "Mapping" step):

[   2.8] Opening the source
[  12.6] Inspecting the source
[  18.7] Checking for sufficient free disk space in the guest
[  18.7] Converting Windows 10 Enterprise to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  36.1] Mapping filesystem data to avoid copying unused and blank areas
[  77.0] Closing the overlay
[  77.9] Assigning disks to buses
[  77.9] Checking if the guest needs BIOS or UEFI to boot
[  77.9] Copying disk 1/1
█ 100% [****************************************]
[ 332.3] Creating output metadata
[ 332.3] Finishing off

After:

[   2.8] Opening the source
[   7.9] Inspecting the source
[  12.8] Checking for sufficient free disk space in the guest
[  12.8] Converting Windows 10 Enterprise to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  26.6] Mapping filesystem data to avoid copying unused and blank areas
[  28.0] Closing the overlay
[  28.7] Assigning disks to buses
[  28.7] Checking if the guest needs BIOS or UEFI to boot
[  28.7] Copying disk 1/1
█ 100% [****************************************]
[ 297.4] Creating output metadata
[ 297.4] Finishing off

Fixes: commit 17358abdcd44e3ff2a10fb16b3b626f438f2fc7a
2021-12-07 08:34:58 +00:00