When rebooting a VM from PVE (via CLI/API), the reboot code is called
under a guest lock, which creates a reboot request, shuts down the VM
and then calls the regular cleanup code, which includes the mdev
cleanup.
In parallel, the qmeventd observes that the VM process has gone, and
starts 'qm cleanup', which (among other tasks) also starts the VM
again if a reboot from the PVE side is pending.
The qmeventd synchronizes this through a lock on the guest, with a
default timeout of 10 seconds.
Since we currently also always wait 10 seconds for the NVIDIA driver
to clean up the mdev, this creates a race condition for the cleanup
lock. IOW, when the call to `qm cleanup` starts before we started to
sleep for 10 seconds, it will not be able to acquire its lock and will
not start the VM again.
To avoid the race condition in practice, do two things:
* increase the timeout in `qm cleanup` to 60 seconds.
Technically this still might run into a timeout, as we can configure
up to 16 mediated devices with each delaying 10 seconds in the worst
case, but realistically most users won't configure more than two or
three of them, if even that.
* change the hard-coded `sleep 10` to a loop sleeping for 1 second
each before checking the state again. This shortens the timeout when
the NVIDIA driver did not require the full 10s to finish the
clean-up.
Further, add a bit of logging, so one can properly see in the task log
what is happening at which point in time.
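The polling approach described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual qemu-server code (which is Perl); the function name `wait_for_cleanup` and the `is_done` callback are hypothetical stand-ins for whatever check the real code performs on the mdev state.

```python
import time


def wait_for_cleanup(is_done, max_wait=10, interval=1, sleep=time.sleep, log=print):
    """Poll `is_done` once per `interval` seconds, for at most `max_wait` seconds.

    `is_done` is a hypothetical callback standing in for the real mdev
    state check. Returns True as soon as the cleanup is observed, or the
    result of one final check once the time budget is used up.
    """
    waited = 0
    while waited < max_wait:
        if is_done():
            log(f"cleanup finished after {waited}s")
            return True
        sleep(interval)
        waited += interval
    done = is_done()
    if not done:
        log(f"cleanup still pending after {max_wait}s")
    return done
```

Compared to a fixed `sleep 10`, the loop returns as soon as the check succeeds, so the guest lock is held for a shorter time in the common case. In the worst case from the message above, 16 devices each using the full 10 seconds add up to 160 seconds, which is still above the 60 second lock timeout.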
Fixes: 49c51a60 (pci: workaround nvidia driver issue on mdev cleanup)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Reviewed-by: Mira Limbeck <m.limbeck@proxmox.com>
[ TL: change warn to print, reword commit message ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Make the post-if check for the target not already running more
prominent by using a full if block.
Also comment on why we ignore the error here. While the commit
changing that explained it well, this is one of the things that might
be better off with an in-code comment (as doing the deactivation is
described as important here, so one might wonder why the code
continues if that fails).
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
When a template with disks on LVM is cloned to another node, the
volumes are first activated, then cloned and deactivated again after
cloning.
However, if clones of this template are now created in parallel to
other nodes, it can happen that one of the tasks can no longer
deactivate the logical volume because it is still in use. The reason
for this is that we use a shared lock.
Since the failed deactivation does not necessarily have consequences,
we downgrade the error to a warning, which means that the clone tasks
will continue to be completed successfully.
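The error-to-warning downgrade can be illustrated with a small Python sketch. All names here (`clone_volume`, the `activate`, `clone` and `deactivate` callbacks) are hypothetical; the real change lives in the Perl clone code path and may look different.

```python
def clone_volume(activate, clone, deactivate, warn=print):
    """Activate, clone, then try to deactivate a volume.

    A failing deactivation (for example because a parallel clone task
    still uses the logical volume) is reported as a warning only, so the
    clone itself still counts as successful.
    """
    activate()
    result = clone()
    try:
        deactivate()
    except Exception as err:  # downgrade: warn instead of failing the task
        warn(f"could not deactivate volume: {err}")
    return result
```

Errors from `activate` or `clone` still propagate in this sketch; only the deactivation failure is tolerated.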
Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
Tested-by: Friedrich Weber <f.weber@proxmox.com>
by fixing the SCSI feature compatibility check helper. The helper is
also called for disks using import-from, so it has to use the extended
schema when parsing the drive.
Fixes: d1feab4a ("fix #4957: add vendor and product information passthrough for SCSI-Disks")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
PVE::Storage::path() neither activates the storage of the passed-in volume, nor
does it ensure that the returned value is actually a file or block device, so
this actually fixes two issues. PVE::Storage::abs_filesystem_path() actually
takes care of both, while still calling path() under the hood (since $volid
here is always a proper volid, unless we change the cicustom schema at some
point in the future).
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
During migration, the volume names may change if the name is already in
use at the target location. We therefore want to save the original names
so that we can deactivate the original volumes afterwards.
Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
adds vendor and product information for SCSI devices to the json schema
and checks in the VM create/update API call if it is possible to add
these to QEMU as a device option
Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
[FE: add missing space to exception message
use config option for exception e.g. scsi0 rather than 'product'
style fixes]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
since we always determine the deviceid, passing in a possibly wrong value makes
no sense and could actually re-introduce bugs.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
The QMP command needs to be issued for the device where the disk is
currently attached, not for the device where the disk was attached at
the time the snapshot was taken.
Fixes the following scenario with a disk image for which
do_snapshots_with_qemu() is true (i.e. qcow2 or RBD+krbd=0):
1. Take snapshot while disk image is attached to a given bus+ID.
2. Detach disk image.
3. Attach disk image to a different bus+ID.
4. Remove snapshot.
Previously, this would result in an error like:
> blockdev-snapshot-delete-internal-sync' failed - Cannot find device=drive-scsi1 nor node_name=drive-scsi1
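To illustrate the idea of looking up the device by the current attachment, here is a minimal Python sketch. The config dictionaries, the volume ID and the helper name `find_attached_device` are made up for the example, and the drive string parsing is deliberately simplified; the actual fix is in the Perl code.

```python
def find_attached_device(current_config, volid):
    """Return the QEMU device name for `volid` based on the CURRENT config.

    Assumes a simplified config where each drive value starts with the
    volume ID, optionally followed by comma-separated options. Entries
    under `unusedN` keys are not attached, so they are skipped.
    """
    for key, value in current_config.items():
        if key.startswith("unused"):
            continue
        if value.split(",", 1)[0] == volid:
            return f"drive-{key}"
    return None


# Hypothetical example: the snapshot was taken while the disk was on scsi0,
# but the disk has since been re-attached as scsi1.
snapshot_config = {"scsi0": "local:100/vm-100-disk-0.qcow2,size=4G"}
current_config = {"scsi1": "local:100/vm-100-disk-0.qcow2,size=4G"}
device = find_attached_device(current_config, "local:100/vm-100-disk-0.qcow2")
```

Looking the volume up in `snapshot_config` instead would yield a device name based on `scsi0`, which no longer exists in the running VM.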
While the $running parameter for volume_snapshot_delete() is planned
to be removed on the next storage plugin APIAGE reset, it currently
causes an immediate return in Storage/Plugin.pm. So passing a truthy
value would prevent removing a snapshot from an unused qcow2 disk that
was still used at the time the snapshot was taken. Thus, and because
some exotic third party plugin might be using it for whatever reason,
it's necessary to keep passing the same value as before.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Encapsulation of the functionality for determining the scsi device type
in a new function for reusability in QemuServer/Drive.pm
Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
Currently, volume activation, PCI reservation and resetting systemd
scope happen in between, so the 5 second expiretime used for port
reservation is not always enough.
It's possible to defer telling QEMU where it should listen for
migration and do so after it has been started via QMP. Therefore, the
port reservation can be moved very close to the actual usage.
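Why the ordering matters can be shown with a toy model in Python. Everything here is hypothetical (the fake clock, the function names, the port number); only the 5 second expiry time is taken from the message above.

```python
EXPIRE_SECONDS = 5  # expiry time of a port reservation, as stated above


class FakeClock:
    """Deterministic clock so the example does not need to really sleep."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


def reserve_port(clock, port=60000):
    # Hypothetical reservation record; the port number is arbitrary.
    return {"port": port, "expires_at": clock.now + EXPIRE_SECONDS}


def still_reserved(reservation, clock):
    return clock.now < reservation["expires_at"]


def start_reserving_early(clock, prep_seconds):
    """Old ordering: reserve first, then do the slow preparation steps."""
    reservation = reserve_port(clock)
    clock.advance(prep_seconds)  # volume activation, PCI reservation, ...
    return still_reserved(reservation, clock)


def start_reserving_late(clock, prep_seconds):
    """New ordering: do the slow steps first, reserve right before use."""
    clock.advance(prep_seconds)
    reservation = reserve_port(clock)
    return still_reserved(reservation, clock)
```

With a preparation phase longer than 5 seconds, the early reservation has already expired by the time the port is needed, while the late one is still valid.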
Mentioned here for completeness and can still be done as an additional
change later if desired: next_migrate_port could be modified to
optionally return the open socket and it should be possible to pass
the file descriptor directly to QEMU, but that would require accepting
the connection beforehand on the Perl side (otherwise it leads to
ENOTCONN 107). While it would avoid any races, it's not the most elegant
approach and the change at hand should be enough in all practical situations.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Acked-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Tested-by: Hannes Duerr <h.duerr@proxmox.com>
It is not yet supported for QEMU's vdagent device which is used for
the VNC clipboard.
The migration precondition API call will now treat the VNC clipboard
as a local resource. Thus the GUI blocks migration and shows:
"Can't migrate VM with local resources: clipboard=vnc"
QemuMigrate's prepare function will also abort live migration early
when using the VNC clipboard.
Signed-off-by: Markus Frank <m.frank@proxmox.com>
[FE: adapt commit message a bit]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
We want to notify the guest of the change, so it can resubmit a DHCP
request, send a gratuitous ARP, ...
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
add one test case for a spice display and one for std
Signed-off-by: Markus Frank <m.frank@proxmox.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
This can be used by noVNC to check if a clipboard is available.
Signed-off-by: Markus Frank <m.frank@proxmox.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
add option to use the qemu vdagent implementation to enable the VNC
clipboard. When enabled with SPICE the spice-vdagent gets replaced
with the QEMU implementation.
This patch does not solve #1406, but does allow copy and paste with a
running X-session, when spice-vdagent is installed on the guest.
Signed-off-by: Markus Frank <m.frank@proxmox.com>
Reviewed-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Stefan Lendl <s.lendl@proxmox.com>
[ TL: extend subject and use more specific build-dir glob ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
While there already is a warning from QEMU proper, that one is not
visible as a task warning and it's not straightforward to make it be
one, because QEMU is started inside a run_fork(). It's also more
future-proof to have the detection explicit on our side and the
documentation can be referenced.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
by remembering the 'forcemachine' parameter that's passed along when
starting the target instance.
In preparation to introduce a call to get_current_qemu_machine after
starting a VM to check for machine version deprecation.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
by adding a comment and grouping the code better. See the PVE QEMU
patch "PVE: Allow version code in machine type" for reference. The way
the code was written previously made it look like a bug where
$pve_version might be overwritten multiple times.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
This can seemingly take a bit longer than expected, and waiting a bit
longer is better than erroring out on migration.
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
The ha-manager accessed rather internal details before that version,
and the memory property changing to a format-string with sub
properties in 7f8c808 ("add memory parser") breaks that access. So
ensure the installed ha-manager is using the newer
get_derived_property method to access that information cleanly.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The vCPUs are passed as devices with a specific id only when CPU
hot-plug is enabled at cold start.
So, we can't enable/disable allow-hotplug online, as the vCPU hotplug
API would then throw errors for not finding the core id.
Not enforcing this could also lead to migration failure, as the QEMU
command line for the target VM could be made different than the one it
was actually running with, causing a crash of the target as Fiona
observed [0].
[0]: https://lists.proxmox.com/pipermail/pve-devel/2023-October/059434.html
Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
[ TL: Reflowed & expanded commit message ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>