55 Commits

Author SHA1 Message Date
Bella Khizgiyaev
bf33133c05
-i ova: Use the detected firmware type when absent rather than default (#53)
Signed-off-by: Bella Khizgiyaev <bkhizgiy@redhat.com>
2024-04-11 14:25:35 +01:00
Richard W.M. Jones
970d7123c2 input/ssh: Use nbdinfo --can connect (instead of --size)
nbdinfo --size prints the size on stdout, causing it to appear in
virt-v2v output.  Using --can connect instead is silent.

Note that nbdinfo --can connect was added in libnbd 1.9.2 (Jul 2021)
and our previous minimum version of libnbd was 1.9.3 so we're OK.
However since neither of these was a stable version I also updated the
minimum libnbd requirement to 1.10 (Sep 2021), and added a proper
check in ./configure

Fixes: commit fb72e059863a60503b6011b8590c25c3a010a58f
2024-01-18 17:25:25 +00:00
Richard W.M. Jones
4f0758a95a input/ssh: Rearrange parameters specifying ssh server 2024-01-17 14:20:44 +00:00
Richard W.M. Jones
a99d9f2afe input/nbdkit_ssh: Make password parameter optional
Instead of storing an explicit NoPassword case, make the password
parameter optional and encode NoPassword as None.  This is simple
refactoring.
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
fb72e05986 virt-v2v: -i vmx: Replace external ssh/scp with nbdkit-ssh-plugin
If you use a -i vmx ssh filename containing '*' then it will expand
the glob at the remote side.  New scp is weird and silently creates a
directory on the local side.  For example suppose there's a remote
file literally called "/tmp/test*" (ie. the '*' is part of the
filename) and you thought you could copy that to local using:

  scp 'remote:/tmp/test*' '/tmp/test*'

scp treats the first parameter (only) as a wildcard and if there are
multiple files matching the wildcard /tmp/test*, will create a local
directory literally called "/tmp/test*/" and put the files into it.
Who expected any of that?

You might think that double quoting (as we used to use) might work,
but that breaks with spaces and quotes.  I guess scp is using
different code paths internally for glob versus everything else.

The only way to really make this work is to stop using scp entirely
and just use sftp directly.  The easiest way to use sftp is to use
nbdkit-ssh-plugin.  We already depend on nbdkit-ssh-plugin, nbdcopy
and nbdinfo for other parts of virt-v2v, so might as well use the
whole lot here.

One advantage of this change is that now the -ip (input password)
parameter actually works in -i vmx -it ssh mode.

Other approaches that would have been possible:

 - Use the OCaml NBD library instead of nbdcopy/nbdinfo

 - Use 'nbdkit -U - ssh --run ...'

 - Direct binding to libssh.

See also commit e2af12ba69c4463bb73d30db63290a887cdd41eb
("input: -i vmx: Remove support for openssh scp < 8.8")

See also commit 22c5b98ab78c734b478c26e14ee62e2a065aaa0c
("-it ssh: Double quote ssh command which tests remote file exists")

See also https://unix.stackexchange.com/a/587710

Reported-by: Ming Xie
Fixes: https://issues.redhat.com/browse/RHEL-21365
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
72c9734050 input/nbdkit_ssh: Make retry filter optional
This filter retries a request on failure.  Add a flag to make this
filter optional, but default to using it if available (so there's no
change to default behaviour).

The reason we want to make this optional is that the next commit will
want to use nbdkit-ssh-plugin to probe to see if a remote file is
present, and we don't want to retry that probe on failure.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
40730615d2 virt-v2v: -i vmx: Remove dependency of ssh.ml on Xml.uri
Xml.uri is a convenient way to pass the multiple ssh fields to
virt-v2v and is still used internally by the -i vmx code.  However
don't leak this awkward implementation into the new ssh.ml module.

Like Nbdkit_ssh, we'll only deal with the explicit (password, port,
server, user, path) fields separately.

Note after this refactoring:

 - The new Ssh module interface looks broadly similar to the
   Nbdkit_ssh interface.

 - vmx_source_of_arg assertions are no longer required inside the new
   Ssh module, which was a mild layering violation before.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
60e72acb34 virt-v2v: -i vmx: Add the input password to vmx_source
Since we use the input password in various places in the VMX module,
store the input password in vmx_source.  This neutral refactoring
makes later changes simpler.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
a0b22af287 virt-v2v: -i vmx: Simplify scp wrapper
The existing 'scp_from_remote_to_temporary' wrapper around scp was
pretty weird (I think from much patching without refactoring).
Simplify it so it no longer generates the output filename, and rename
it accordingly to 'download_file' (as we will soon remove the need for
and dependency on 'scp').

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
cddd07669d virt-v2v: -i vmx: Refactor ssh/scp code into a new module
This is a straight refactor of the existing code that handles ssh/scp
into a new module.  In this commit I just copy the code around without
doing any cleanup; cleanup will follow in subsequent commits.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2024-01-17 14:20:44 +00:00
Richard W.M. Jones
af68f253d1 virt-v2v: -i vmx: Remove scp -T option
This reverts the following commit:

  commit d265639c2ab31418cfdbdedd0cc3e68cf290d834
  Author: Richard W.M. Jones <rjones@redhat.com>
  Date:   Thu Jul 25 14:52:42 2019 +0100

    v2v: -i vmx: Use scp -T option if available to unbreak scp (RHBZ#1733168).

See also the referenced bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1733168

My rationale for removing this option is that since we now require
OpenSSH 8.8 we must be using sftp for file transfer so we no longer
need to defeat the check for correct expansion of wildcards.  That
check was only relevant for OpenSSH <= 8.7 using the old scp protocol.

Reverts: commit d265639c2ab31418cfdbdedd0cc3e68cf290d834
2024-01-11 15:35:43 +00:00
Richard W.M. Jones
6f2536bd9f input/parse_vmx.ml: Reject VMDK files
Instead of the confusing warnings printed before, it now prints
an error indicating the incorrect input format:

$ virt-v2v -i vmx -it ssh ssh://root@xxx/vmfs/volumes/esx8.0-function/Auto-esx8.0-rhell9.3-efi-with-empty-cdrom/Auto-esx8.0-rhell9.3-efi-with-empty-cdrom.vmdk -ip /tmp/passwd -o null
virt-v2v: error: input file is a VMDK (disk image), but we are expecting a
VMX (VMware metadata)

Reported-by: Ming Xie
Fixes: https://issues.redhat.com/browse/RHEL-19564
2023-12-19 12:57:42 +00:00
Richard W.M. Jones
22c5b98ab7 -it ssh: Double quote ssh command which tests remote file exists
Double quoting was removed in
commit e2af12ba69c4463bb73d30db63290a887cdd41eb ("input: -i vmx:
Remove support for openssh scp < 8.8", Nov 2021).  However it should
only have been removed from scp commands, not for this ssh command
where it is still required.

See: https://github.com/libguestfs/virt-v2v/issues/35
Thanks: Laszlo Ersek for diagnosis and suggesting the fix
Reported-by: Bill Sanders
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2023-10-02 15:17:43 +01:00
Richard W.M. Jones
97d8c28b7e builder: Use new Curl.create API that requires url parameter
Update common submodule, picking up this fix:

  commit 06721c00e2582daf1b3bedc4fbfb699b9b788a41
  Author: Richard W.M. Jones <rjones@redhat.com>
  Date:   Thu Jun 29 13:10:24 2023 +0100

    mltools: curl: Split out the url parameter

This requires a small change to input/vCenter.ml
2023-06-29 13:23:05 +01:00
Richard W.M. Jones
17c6d52447 input: -i ova: Less strict parsing of lines from the manifest file
eBlocker (https://eblocker.org) provides an OVA which contains this
manifest (*.mf) file:

SHA256 (eBlockerVM-2.9.1-amd64.ovf) = 7b25ea03c3b9eb143c7f6d1e8f76c3717e5b928f007e571cd96132bf6a15a30d
SHA256 (eBlocker VM-disk001.vmdk) = 646d1ec6ddf5a6d6604dbec421624ca2cea6ee5b57bf80ae635bf8f4f0bfea45

Note that the manifest is not conforming with the OVFS specification
which says the line format should be:

file_digest = algorithm "(" file_name ")" "=" sp digest nl

Space (sp) is not permitted after the algorithm or before the = sign.

However let's be tolerant of the extra whitespace.
2023-04-03 11:39:01 +01:00
Richard W.M. Jones
8ad152afc4 Rework Std_utils.Option so it works like the OCaml stdlib module
OCaml 4.08 introduces a stdlib Option module which looks a bit like
ours but has a number of differences.  In particular our functions
Option.may and Option.default have no corresponding functions in
stdlib, although there are close enough equivalents.

This change was automated using this command:

$ perl -pi.bak \
  -e 's/Option.may/Option.iter/g; s/Option.default /Option.value ~default:/g' \
  `git ls-files`

Update common module to include:

  commit cffa077323fafcdfcf78e230c022afa891a6b3ff
  Author: Richard W.M. Jones <rjones@redhat.com>
  Date:   Mon Feb 20 12:11:51 2023 +0000

    mlstdutils: Rework the Option module to be compatible with stdlib

  commit 007d0506c538db0a43fec7e9986a95ecdcd48b56
  Author: Richard W.M. Jones <rjones@redhat.com>
  Date:   Mon Feb 20 12:18:29 2023 +0000

    mltools: Replace Option.may with Option.iter
2023-02-20 12:21:47 +00:00
Richard W.M. Jones
d2b01e487f Split long lines in messages
This commit splits up any long lines found in errors, warnings or
other messages.  OCaml ignores whitespace following "\<CR>" within a
string, eg:

  "long string \
   more stuff"

is parsed as:

  "long string more stuff"

Thanks: Laszlo Ersek, for working out the OCaml syntax for this
2023-01-20 10:29:24 +00:00
Andrey Drobyshev
7361431311 parse_libvirt_xml: look for manual firmware in "/domain/os/loader/@type"
According to [1], there're different ways to specify which firmware is
to be used by a libvirt-driven VM.  Namely, there's an automatic
firmware selection, e.g.:

    ...
    <os firmware='(bios|efi)'>
    ...

and a manual one, e.g.:

    ...
    <os>
        <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
        ...
    </os>
    ...

with the latter being a way to specify UEFI firmware.  So let's add this
search path as well when parsing source VM's libvirt xml.

[1] https://libvirt.org/formatdomain.html#bios-bootloader

Co-authored-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Originally-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <20221220150133.403120-1-andrey.drobyshev@virtuozzo.com>
2022-12-21 10:10:29 +01:00
Richard W.M. Jones
2eb6441264 common: Adapt to renamed function On_exit.rmdir -> On_exit.rm_rf
This function was renamed to make it clearer what it does (and that
it's potentially dangerous).  The functionality is unchanged.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2022-07-15 08:57:52 +01:00
Richard W.M. Jones
3c4505c12a build: Remove bundled OCaml bindings for libvirt
The ocaml-libvirt project (https://gitlab.com/libvirt/libvirt-ocaml)
provides bindings for libvirt.  For historical reasons we bundled this
as it was throught ocaml-libvirt wasn't widespread on distros.  In
fact Fedora and Debian derivatives all have this.  Arch does not
(yet), but we can fix that.

It said in the README file in that directory, "before virt-v2v 1.42 is
released we hope to have unbundled this".  That didn't happen, but
let's remove it now.
2022-04-28 15:42:26 +01:00
Laszlo Ersek
e7539dc6f6 input: -i vmx: Add support for SATA hard disks
See also:

- virt-v2v commit 75872bf282d7 ("input: -i vmx: Add support for NVMe
  devices", 2022-04-08),

- libvirt commit 2214fe90442c ("vmx: start parsing SATA disks",
  2020-10-14).

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1883802
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20220419152415.13356-3-lersek@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2022-04-21 16:06:43 +02:00
Laszlo Ersek
ae5b44639b "-i vmx -it ssh": fix non-unique "s_disk_id"
The "s_disk_id" field is supposed to be unique across all fixed disks of
the source domain.

With "-i vmx -it ssh" however, we restart assigning "s_disk_id" from zero
near the end of the "find_hdds" function.

The problem is that "find_hdds" is separately called from
"find_scsi_disks", "find_nvme_disks", and "find_ide_disks". Thus, if we
have at least two hard disks in the source on different controller types,
we'll output two disks with a zero-value "s_disk_id".

"input/input_vmx.ml" starts up nbdkit with "in%d" socket names that are
based on list position, not "s_disk_id", so the nbdkit invocations are
correct. However, in "convert/convert.ml", the "server" argument for
"g#add_drive_opts" is based on "s_disk_id"; therefore, one of the disks
will be multiply referenced, and the others will be ignored.

Assign "s_disk_id" in "find_disks" [input/parse_domain_from_vmx.ml] once
all the disks have been found and concatenated into the common list.

Note that the global sorting order remains well-defined. Both before and
after this patch, the sorting occurs *within* "find_hdds", but that's OK:
the concatenation order between the type-specific disk-lists is
well-defined (albeit arbitrary) in "find_disks".

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1883802
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20220419152415.13356-2-lersek@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2022-04-21 16:06:33 +02:00
Laszlo Ersek
e0b08ee021 input_vmx: cleanly reject guests with snapshots when using "-it ssh"
For traversing a backing chain of VMDK descriptor files over ssh, two
things are necessary:

- qemu-nbd with the ssh block driver, rather than nbdkit-ssh-plugin,

- a remote SSH URL (for qemu-nbd) without a query string appended, as
  qemu-nbd cannot update the last pathname component (for tracking the
  relative pathnames of VMDK descriptor files) if a query string is
  appended.

Before commit 7a6f6113a25f ("v2v: -i vmx -it ssh: Replace qemu block ssh
driver with nbdkit-ssh-plugin.", 2019-10-08), we passed the
"?host_key_check=no" query string in the URL to qemu-nbd, so we can't just
return to that, for accessing snapshotted guests with vmx+ssh.

But, we shouldn't return to qemu-nbd for vmx+ssh even without a query
string, as that would undo the other benefit(s) of commit 7a6f6113a25f.

Instead, clearly document that snapshotted guests are not supported over
vmx+ssh, and cleanly reject this situation in the code as well. Recommend
the two alternative transports that allow the user to convert such guests.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1774386
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20220409061252.6551-1-lersek@redhat.com>
[lersek@redhat.com: replace B<-...> with I<-...> in NOTES]
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2022-04-09 09:07:29 +02:00
Richard W.M. Jones
75872bf282 input: -i vmx: Add support for NVMe devices
We model NVMe devices in the source hypervisor.

We currently assume that no one is using the namespaces feature of
NVMe, ie. that each source device will appear in a Linux guest as
/dev/nvme0n1, /dev/nvme1n1, etc.  We could fix this if it is a
problem, but it requires adjusting the current assumption for
removable devices that slots are simple integers.

The devices are mapped to virtio-blk, so in the target the device name
has to change from /dev/nvme0 to /dev/vda (etc.)

Reported-by: Ming Xie
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2070530
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2022-04-08 16:24:38 +01:00
Richard W.M. Jones
db00a9a9aa input: -i ova: Handle OVAs which contain user/group names with spaces
If importing an OVA that has user/group names with spaces then the
plain tar -tRvf command would print them without any quoting.  Our
simple strategy of splitting on spaces resulted in an "extra field"
being parsed.  This is an example from a real OVA (note "Domain Users"
is the group name):

$ tar --quoting-style=literal -tRvf protect_appliance.ova
block 0: -rw-r--r-- eraautobuilds/Domain Users 33508 2021-11-04 17:48 PROTECT_Appliance.ovf

Luckily this is fairly simple to fix.  We don't care about the
original user/group name, and using --numeric-owner causes tar to
print the UID/GID instead:

$ tar --quoting-style=literal --numeric-owner -tRvf protect_appliance.ova
block 0: -rw-r--r-- 1074101/1049089 33508 2021-11-04 17:48 PROTECT_Appliance.ovf

I also added --quoting-style=literal to deal with possible future
cases where the filename contains spaces.  Because we use
nsplit ~max:8 these should now be handled correctly too, although I
didn't test this.

Reported-by: Jiří Sléžka
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2069768
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2022-04-04 13:03:41 +01:00
Laszlo Ersek
9788b06765 nbdkit, qemuNBD: run_unix: formally require externally provided socket
At this point, virt-v2v never relies on the Unix domain sockets created
inside the "run_unix" implementations. Simplify the code by removing this
option.

Consequently, the internally created temporary directory only holds the
NBD server's PID file, and never its UNIX domain socket. Therefore:

(1) we no longer need the libguestfs socket dir to be our temp dir,

(2) we need not change the file mode bits on the temp dir,

(3) we can rename "tmpdir" to the more specific "piddir".

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2066773
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20220323104330.9667-1-lersek@redhat.com>
Acked-by: Richard W.M. Jones <rjones@redhat.com>
2022-03-23 13:39:12 +01:00
Richard W.M. Jones
6413024848 input: Add options.read_only flag
This flag can be used for certain inputs to control whether we place a
protective overlay over the source.  It is only supported for local
file type inputs (-i disk and -i libvirt).  For other input types it
cannot work:

 * -i ova: These cannot be updated in-place because the OVA embeds
   checksums.

 * Any input on VMware or Xen: There's no point doing an in-place
   update for a guest which his still sitting on a foreign hypervisor.

 * -i vmx: We cannot reliably update a disk in VMDK format using qemu
   tools.

This flag is always set to true, so there is no effect in the current
code.  It allows us to implement a second front end for in-place
support.
2022-03-10 15:11:53 +00:00
Richard W.M. Jones
407c6e5cae vddk: Use blocksize filter to split very large requests
We know that VDDK has problems handling very large single requests.
It may cause out of memory errors on the server unless a relatively
obscure server-side configuration change is made.  With modular
virt-v2v we won't necessarily have fine control over who is connecting
to the input socket and what sized requests they may make, eg if we
delegate copying to a third party.

nbdkit-blocksize-filter offers a simple solution.  It will split
requests larger than a certain size, ensuring that whatever NBD
requests come in, we shouldn't cause problems on the VMware server.
I chose 2M as the maximum request size, but this could be fine-tuned
later.

For a longer explanation, see:
6f4746af9a

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2039255
Acked-by: Laszlo Ersek <lersek@redhat.com>
Reported-by: Ming Xie <mxie@redhat.com>
Tested-by: Ming Xie <mxie@redhat.com>
2022-02-15 15:27:58 +00:00
Richard W.M. Jones
cee676fb70 input: Refactor code in the input modules
Cleans up things left over from modularising virt-v2v and aims to make
the code more straightforward to follow.

In particular (just because of the incremental steps that
modularization went through) the input modules all looked like:

  let foo_source dir options args =
    ...
  module Foo = struct
    let setup dir options args = foo_source dir options args
  end

(and the same for other methods).  One of the refactorings is to
simply place the code directly inside the module functions.

I also moved around the module methods to reflect more closely the
order in which the code is actually called by virt-v2v.

The change looks large, but that's because of whitespace changes
because of reindenting the code.  There are no code changes here and
the refactoring should be completely neutral.
2022-02-11 13:06:07 +00:00
Richard W.M. Jones
8abc07a858 input: Require -ip password for vCenter over HTTPS
As far as I've ever seen, you always need a password to access the
/folder directory of a VMware server.  If the password isn't specified
to virt-v2v using the -ip option then it used to try to ask for it
interactively when virt-v2v started.  Actually it asked for the
password several times -- at least twice.

After we added the cookie-script feature (commit 2b9a11743b "v2v:
vcenter: Implement cookie scripts.") it tries to ask interactively at
random points during the conversion, which is obviously bad from a UI
point of view but also because the password is requested without a
prompt looks like a hang.

We could solve this by prompting for a password.  But virt-v2v is not
primarily an interactive tool and it's an easier fix is to require the
caller to use the -ip passwordfile option (in this particular mode).

Reported-by: Xinyu Li
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1960087
2022-02-01 10:02:49 +00:00
Richard W.M. Jones
20019b5cad input: libvirt: Share a single connection to the source NBD server
When using virt-p2v from RHEL 7, it starts a very old qemu-nbd server
(probably 1.5.3) which required the --shared parameter to enable
sharing even in read-only mode.  Since it doesn't pass this parameter
only a single connection at a time is allowed, and further connections
will deadlock.  Note that later versions of qemu-nbd changed this so
that read-only connections permit sharing automatically.

In modular virt-v2v we now use nbdkit-nbd-plugin to proxy the
connection to virt-p2v / qemu-nbd.  When you connect to this multiple
times, as virt-v2v does, it will make multiple connections to the
backend qemu-nbd.  This will cause a deadlock.

We can use the nbdkit-nbd-plugin shared=true flag to enable the plugin
to share a single connection to the backend between multiple nbdkit
clients.

Reported-by: Tingting Zheng
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2044911
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2022-01-26 12:56:06 +00:00
Richard W.M. Jones
924aa8b70a Restore message about setting up the input and output
Old virt-v2v would print a summary of the input and output options
before connecting to the input/output, looking something like this:

  [   0.2] Opening the source -i libvirt -ic [etc]

This gave reassurance that virt-v2v was doing something in the case
where the source was slow or unreachable.  In particular if you use
-i libvirt with a vCenter URL, and the URL is wrong, libvirt hangs for
a few minutes without printing anything.

Modular virt-v2v rearranged things so the connecting phase was silent,
which meant that in the case above virt-v2v appeared to hang for a few
minutes printing nothing at all.

This change adds to_string functions to all the input and output
methods and uses them to print a message like:

  [   0.0] Setting up the source: -i libvirt -ic [etc]

The hang still happens, but at least it's now clear where it's hanging.

Note the old "Opening the source" message now refers to libguestfs
connecting to the NBD source disk pipeline.

Typical full output looks like this:

  $ virt-v2v -i disk /var/tmp/fedora-35.img -o disk -os /var/tmp/out
  [   0.0] Setting up the source: -i disk /var/tmp/fedora-35.img
  [   1.1] Opening the source
  [   5.9] Inspecting the source
  [  11.5] Checking for sufficient free disk space in the guest
  [  11.5] Converting Fedora Linux 35 (Thirty Five) to run on KVM
  virt-v2v: warning: /files/boot/grub2/device.map/hd0 references unknown
  device "vda".  You may have to fix this entry manually after conversion.
  virt-v2v: This guest has virtio drivers installed.
  [  57.4] Mapping filesystem data to avoid copying unused and blank areas
  [  61.0] Closing the overlay
  [  61.7] Assigning disks to buses
  [  61.7] Checking if the guest needs BIOS or UEFI to boot
  [  61.7] Setting up the destination: -o disk -os /var/tmp/out
  [  62.8] Copying disk 1/1
  █ 100% [****************************************]
  [  81.7] Creating output metadata
  [  81.7] Finishing off

Reported-by: Xiaodai Wang
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2041886
Fixes: commit 255722cbf39afc0b012e2ac00d16fa6ba2f8c21f
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2022-01-19 15:26:28 +00:00
Richard W.M. Jones
f3180e3c0e input: xen: Fix assertion error when importing from remote block device
We never supported this because OpenSSH sftp server does not know how
to open and get the size of a block device.  Before modular virt-v2v,
we would detect this situation and print an error:

  virt-v2v: error: guest disk sda appears to be zero bytes in size.

  There could be several reasons for this:

  Check that the guest doesn't really have a zero-sized disk.  virt-v2v
  cannot convert such a guest.

  If you are converting a guest from an ssh source and the guest has a disk
  on a block device (eg. on a host partition or host LVM LV), then
  conversions of this type are not supported.  See the virt-v2v-input-xen(1)
  manual for a workaround.

This error was lost in the conversion to modularity, but in any case
the proper way to detect this is in the input_xen driver itself.

In addition to this, when we removed the old virt-v2v-copy-to-local
tool, the text referring to the workaround in the manual became
meaningless, so I added the (ugly, manual) workaround that you have to
do back into the manual, minus the bit about the tool that we removed.

In future we could consider other ways to convert a remote block
device over SSH transparently.  (Sending a "dd" command, maybe?)

I also checked what happens when we try to convert an empty disk, and
it fails during inspection with a reasonable error message, so we
don't need to do anything to restore the above error:

  $ virt-v2v -i disk /var/tmp/empty -o null
  [   1.1] Opening the source
  [   7.0] Inspecting the source
  virt-v2v: error: inspection could not detect the source guest (or physical
  machine).

  Assuming that you are running virt-v2v/virt-p2v on a source which is
  supported (and not, for example, a blank disk), then this should not
  happen.

  No root device found in this operating system image.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2041852
Reported-by: Xiaodai Wang
2022-01-18 12:48:56 +00:00
Richard W.M. Jones
bdd8412164 input, output: Use Tools_utils.open_guestfs instead of Guestfs.create
The common function open_guestfs passes verbose and trace flags from
virt-v2v to the libguestfs handle, thus improving debugging.

Also update the common submodule to get:

    mltools: Error out if the libguestfs environment is bad

    Currently setting bogus environment variables causes libguestfs to
    print errors but continue running, eg:

    $ LIBGUESTFS_BACKEND=bogus virt-v2v --version
    libguestfs: error: invalid backend: bogus
    virt-v2v 1.45.96fedora=36,release=1.fc36

    If environment variables are bogus we really ought to exit early
    instead.  After this change:

    $ LIBGUESTFS_BACKEND=bogus virt-v2v --version
    Fatal error: exception Guestfs.Error("invalid backend: bogus")

    Thanks: Xiaodai Wang
2022-01-17 15:53:46 +00:00
Richard W.M. Jones
702a511b7f input/nbdkit_curl.ml: Fix typo in commented code
Fixes: commit 255722cbf39afc0b012e2ac00d16fa6ba2f8c21f
2021-12-18 13:22:48 +00:00
Richard W.M. Jones
24fdb088b1 input, output: Use Option.may for some Nbdkit calls
Option.may (Nbdkit.add_arg cmd "port") port;

is completely equivalent to:

  (match port with
   | Some s -> Nbdkit.add_arg cmd "port" s
   | None -> ());

Updates: commit d50966c2a480bda033f6e63bb797f86c13d576bd
2021-12-18 12:02:52 +00:00
Richard W.M. Jones
e3e1000226 input: Remove nbdkit file fadvise=sequential option in a couple more places
The result of commit ac59d3b231 ("output: Don't use nbdkit-file-plugin
cache=none when writing") and the later partial revert in
commit 5d5383b912 is that we deliberately dropped fadvise=sequential
when using nbdkit-file-plugin, while keeping cache=none.

This flag was giving an incorrect hint to the kernel.  It was still
being used in a couple of places in virt-v2v, this commit drops those
too for the same reason.
2021-12-11 17:35:26 +00:00
Richard W.M. Jones
ff84a00fff lib: Make QemuNBD mini-library handle mutable
Similar change to the previous commit.
2021-12-11 17:22:07 +00:00
Richard W.M. Jones
d50966c2a4 lib: Make Nbdkit mini-library handle mutable
Previously the Nbdkit.cmd handle was immutable, which is nice from a
purity point of view but made for a lot of boilerplate.  If we make it
mutable we can make the code a little smaller and easier to read and
write.

There is no significant change here and this is mostly refactoring.
However I also cleaned up the way environment variables are done
(LANG=C was duplicated before), and also the handling of the verbose
flag.
2021-12-11 17:09:56 +00:00
Richard W.M. Jones
8ffde4f4c2 input: Drop all use of nbdkit-readahead-filter
Previously we used nbdkit-readahead-filter with input modes that used
curl, ssh and vddk nbdkit plugins.  We quickly dropped it for curl and
vddk because it caused poor performance and failures with VMware
servers (bugs https://bugzilla.redhat.com/show_bug.cgi?id=1848862
https://bugzilla.redhat.com/show_bug.cgi?id=1832805).

This drops it for ssh too (eg for vmx or xen-ssh input modes).

The readahead filter in nbdkit is naively written and implementing a
replacement has proven difficult.  In any case it is not needed.  When
doing the conversion, readahead is already implemented in the
libguestfs appliance kernel.  When doing the copy, nbdcopy is already
able to keep sufficient requests in flight.
2021-12-11 16:26:50 +00:00
Richard W.M. Jones
351d61f768 input: -it vddk: Reduce cow-block-size to 4K
NB: Requires the following fix to nbdkit:
c6fe9cb5b4

In commit 17358abd ("inputs: vddk: Use cow-block-size=1M") we started
to use a larger cow filter block size than the default.  This is based
on the theory that a larger block size causes larger, more efficient
read and extents requests to VDDK.  Some data were shown in the commit
message to show this was more efficient at the time.

However further testing shows it's not clear cut.  The trimming step
("Mapping filesystem data") is considerably slower with a larger block
size.  This is because the cow filter does a read-modify-write cycle
on the unaligned head and tail of the request (most trims and small
and/or not aligned to large block sizes), and the read part of this
cycle goes through VDDK which is slow.  The overall results are
clearly now better with a small block size, as shown below, even
though the copy step is slower, possibly because less data has been
read through the cow filter and cached during conversion.

Note that nbdkit-cow-filter coalesces operations so the original
rationale from that commit seems dubious.

Before (note time spent in "Mapping" step):

[   2.8] Opening the source
[  12.6] Inspecting the source
[  18.7] Checking for sufficient free disk space in the guest
[  18.7] Converting Windows 10 Enterprise to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  36.1] Mapping filesystem data to avoid copying unused and blank areas
[  77.0] Closing the overlay
[  77.9] Assigning disks to buses
[  77.9] Checking if the guest needs BIOS or UEFI to boot
[  77.9] Copying disk 1/1
█ 100% [****************************************]
[ 332.3] Creating output metadata
[ 332.3] Finishing off

After:

[   2.8] Opening the source
[   7.9] Inspecting the source
[  12.8] Checking for sufficient free disk space in the guest
[  12.8] Converting Windows 10 Enterprise to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  26.6] Mapping filesystem data to avoid copying unused and blank areas
[  28.0] Closing the overlay
[  28.7] Assigning disks to buses
[  28.7] Checking if the guest needs BIOS or UEFI to boot
[  28.7] Copying disk 1/1
█ 100% [****************************************]
[ 297.4] Creating output metadata
[ 297.4] Finishing off

Fixes: commit 17358abdcd44e3ff2a10fb16b3b626f438f2fc7a
2021-12-07 08:34:58 +00:00
Richard W.M. Jones
b12c6ba7a2 lib: Remove Nbdkit.set_exportname option
This might have been necessary with qemu, but now we're using nbdcopy
it is not needed.  Remove it completely.
2021-12-07 08:34:58 +00:00
Richard W.M. Jones
bb0e698360 input: Disable multi-conn in VDDK input mode
The cow filter unconditionally enables multi-conn (because it is
safe).  However this causes an unintended consequence with the VDDK
plugin.  Multiple VDDK handles are opened (one per multi-conn
connection), and for some reason, possibly internal locking, they
conflict with each other.  This manifests itself as API calls taking
between 2 and 7 times longer to serve (especially QueryAllocatedBlocks
which seems to slow down most).

Avoid this by adding nbdkit-multi-conn-filter with
multi-conn-mode=disable on top which disables multi-conn
advertisement.

Virt-v2v -ic esx://... -it vddk -o null, before this change:

[   2.8] Opening the source
[  11.4] Inspecting the source
[  16.2] Checking for sufficient free disk space in the guest
[  16.2] Converting Windows 10 Enterprise to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  25.5] Mapping filesystem data to avoid copying unused and blank areas
[  67.3] Closing the overlay
[  68.0] Assigning disks to buses
[  68.0] Checking if the guest needs BIOS or UEFI to boot
[  68.0] Copying disk 1/1
█ 100% [****************************************]
[ 416.4] Creating output metadata
[ 416.4] Finishing off

After this change:

[   2.8] Opening the source
[  12.6] Inspecting the source
[  18.7] Checking for sufficient free disk space in the guest
[  18.7] Converting Windows 10 Enterprise to run on KVM
virt-v2v: This guest has virtio drivers installed.
[  36.1] Mapping filesystem data to avoid copying unused and blank areas
[  77.0] Closing the overlay
[  77.9] Assigning disks to buses
[  77.9] Checking if the guest needs BIOS or UEFI to boot
[  77.9] Copying disk 1/1
█ 100% [****************************************]
[ 332.3] Creating output metadata
[ 332.3] Finishing off
2021-12-06 22:03:20 +00:00
Richard W.M. Jones
7691f8e853 input: Split up the very large input/input.ml file
This file was contenated from the various input modes when I
modularized virt-v2v.  It was done this way essentially for
convenience.  However it wasn't a great idea because:

 - Not very easy to see which file is responsible for each mode.
   ie. it's a lot more obvious that input_disk.ml might be responsible
   for -i disk.

 - It didn't sufficiently isolate each mode.  Each mode is logically
   separate and does not need to be in the same top level file or
   module.

 - This single file "open"'d just about every namespace which is a
   symptom of poor encapsulation.

In addition the various *_source and *_servers functions which were
called one after another are combined into a single function.  (Their
separation was an unusual side effect of earlier modularization, not a
necessary feature).

This is just code movement, there is no change to functionality.
2021-12-03 16:38:50 +00:00
Richard W.M. Jones
9efea1fee1 input, output: Remove explicit module cleanup functions
We can use the new [On_exit] module to do this much more cleanly and
reliably.  Remove the need for explicit cleanup.

The existing functions were in any case somewhat broken with regard to
signal handling so a further change was made to the common submodule
to fix that.
2021-12-03 14:26:42 +00:00
Richard W.M. Jones
118bf64bb4 Use new libguestfs-common On_exit module
This is a simple refactoring, replacing uses of unlink_on_exit and
rmdir_on_exit with On_exit.unlink and On_exit.rmdir.
2021-12-03 12:27:30 +00:00
Richard W.M. Jones
724ecb5e88 input: Turn helper into an OCaml module
In a future commit I will break up the large input/input.ml file,
which is now much simpler to do.  However this commit sticks to the
modularization only.

Note the first class module syntax is a somewhat obscure and new OCaml
feature.  For more on this topic:
https://dev.realworldocaml.org/first-class-modules.html
2021-12-02 10:14:40 +00:00
Laszlo Ersek
dbbdd4c20a lib/types: remove the "source.s_video" field
Since commit 255722cbf39a ("v2v: Modular virt-v2v", 2021-09-07), the only
use of the "source.s_video" field has been its logging via
"string_of_source", for the "--print-source" debugging option. Given that
the source video controller is available in the source domain description
anyway (that's where "source.s_video" is parsed from), simplify the debug
log code by removing "source.s_video".

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1961107
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20211202094637.8020-17-lersek@redhat.com>
2021-12-02 11:05:12 +01:00
Laszlo Ersek
4d847539aa lib/types: replace "source_video" type with plain "string"
The "source_video" union type is basically a string, with a distinguished
constructor for the "cirrus" device model ("Source_Cirrus"). However, this
distinction has been useless since commit 255722cbf39a ("v2v: Modular
virt-v2v", 2021-09-07); now "Source_Cirrus" isn't treated specially.
Replace the "source_video" type with just "string".

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1961107
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20211202094637.8020-16-lersek@redhat.com>
2021-12-02 11:05:12 +01:00
Richard W.M. Jones
e2af12ba69 input: -i vmx: Remove support for openssh scp < 8.8
Older openssh scp (< 8.8) had a strange double-quoting syntax for
paths because it passed the literal parameter to the remote side where
it was interpreted again by the remote shell.  In openssh 8.8 scp was
changed to use the sftp protocol instead.

Since we now require openssh >= 8.8, the -T option will always be
present so we don't need to test for it.

There are a few reasons why this change does not attempt to support
the older scp:

 * Distros like to keep up with OpenSSH for security reasons.

 * The older double quoting format was crazy, and probably insecure.
   Replacing it with a sane alternative is something we should
   encourage.

 * Import from -i vmx is fairly niche, and there are workable
   alternatives (VDDK or HTTPS) in case anyone is affected by this.

 * In fact it _does_ support the older version, at least in the case
   where the filename doesn't contain spaces or quoted characters.

Note that we require sftp in nbdkit-ssh-plugin, so we don't need to
worry about the remote server not supporting it because conversion
would fail anyway.

Reported-by: Ming Xie
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2027673
2021-11-30 13:47:52 +00:00