IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
There are some features/improvements/bug fixes I've either
contributed or reviewed/merged. Document them for upcoming
release.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
If SGX memory model is configured for domain then we need to
allow QEMU access some additional files:
1) /dev/sgx_vepc needs to be RW
2) /dev/sgx_provision needs to be RO
We already do this in SELinux driver but not in AppArmor.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/751
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When virCPUx86UpdateLive checks whether a feature was added to a CPU
model after the model was already released (vmx-* features in most Intel
models), the following assert could be logged by glib:
g_strv_contains: assertion 'strv != NULL' failed
While most of our CPU models have a non-empty list of added feature, new
models added in 2024 and versioned variants of older models have
addedFeatures == NULL.
Fixes: e622970c87
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Only libvirtd uses virtd_t/virt_exec_t context, modular daemons use
their specific context each.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
To make sure both error and warning messages printed by virsh are
properly marked for translation.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The prohibit_newline_at_end_of_diagnostic syntax check is confused when
another unrelated translatable message with a newline is too close to
the function it is supposed to check. Refactoring the code to make the
two strings further apart seems like the easiest solution.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Having to put a newline at the end of each debug message in virsh has
always felt strange.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
While using host CPU definition from capabilities XML is allowed for
historical reasons, it will likely provide incorrect results and should
be avoided.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This new function can be used for printing warnings about suboptimal
usage.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The code is moved into a newly introduced generic vshPrintStderr and
vshError changed into a tiny wrapper.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The same message was formatted both in vshOutputLogFile and in vshDebug
and vshError functions. This patch refactor vshOutputLogFile and its
callers to only format each message once.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Using host CPU definition with hypervisor-cpu-baseline is possible, but
it provide incorrect results and thus it should not be documented the
same way we describe the correct usage. Also using host-model CPU from
domain capabilities was not described clearly enough.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Using host CPU definition with hypervisor-cpu-compare is possible, but
it provide incorrect results and thus it should not be documented the
same way we describe the correct usage. Also using host-model CPU from
domain capabilities was not described clearly enough.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
I first noticed a problem when I added a <memoryBacking> element at an
unusual (but still correct) place in the domain XML, and validation
failed. Then I tried adding that element in several different places
and it failed in many, but not all of them.
(NB: from here on, I will use '' for the names of attributes in the
domain XML, <> for elements in the domain XML, and "" for the names of
grammar rule definitions in the RNG file, and "<>" for the names of
elements in the RNG file's own XML. Confused yet? If so, please tell
me a better way - everything I know about RNG I've picked up
informally by looking at examples in already existing RNG files)
Starting from the top level of the grammar for <domain>
("domaincontents" in domaincommon.rng), I noticed that
1) the "<attribute>" for the 'id' attribute of <domain> is defined
inside an "<interleave>" down in the definition of "ids" (which is
referenced from "domaincontents") (I'm not familiar with the
nomenclature - does that make it a "sub-grammer", "child-grammar",
???)
2) although the definition of "ids", had all of its
"<attribute>"s/"<element>"s inside an "<interleave>",
"domaincontents" already had the reference to "ids" inside an
"<interleave>", so there were nested "<interleave>"s.
It's not clear to me how an "<attribute>" or "<interleave>" inside
another "<interleave>" is supposed to behave, but they both seemed a
bit suspicious.
I tried all of the below modifications:
1) moving the grammar for the 'id' attribute out of the "<interleave>"
but still inside "ids"
2) moving the grammer for the 'id' attribute directly into
"domaincontents" (and outside of its "interleave"
3) removing the "<interleave>" that was inside "ids"
4) (2) + (3)
5) move the entire grammar rule "ids" up directly in place of <ref
name="ids"> in "domaincontents".
6) (5), but with the grammar for the 'id' attribute moved outside of
the "<interleave>"
(6) was the only change that allowed all of the following (using
modifications to the subelements of <domain> in
net-vhostuser-passt.xml as example):
a) a <memoryBacking> element in between *any* two existing elements
b) moving <name> in between any two elements
c) oddly, in addition to the problem with putting <memoryBacking> in
odd places, I also found that the original RNG did not allow the
<clock> element to be placed in between <on_poweroff> and
<on_reboot>, but once I'd made the change in (6), this was no
longer problematic. Why should this have any effect? No idea, but
it works :-/
(NB: there are many other cases of referencing "sub-grammar" from
inside an "<interleave>", and they all seem to work just fine;
possibly in this case it was problematic because the sub-grammar a)
also contained an "<interleave>", b) had an "<attribute>" at its
toplevel, or c) had multiple "<element>"s.)
(inexplicably (to me) at one point during my experimentation, I tried
reordering the references to "clock", "resources", "features", and
"events", and that *also* made it legal to put a <clock> element in
between the <on_*> elements:-O)
Since I was no longer able to reproduce the error described in (c)
once I had made mod (6) (move all of "ids" directly into
"domaincontent", I decided it was pointless for me to spend any more
time randomly poking and just add that to the new test case for that
in case some other random change to the RNG causes it to start failing
again.
(I thought of writing a test program that would try all possible
orderings of the subelements of <domain>, but since doing that for
even 10 subelements would mean testing > 3.2 million different XML
documents, I decided we could continue in this adhoc manner, just
adding a single new test case if/when a new validation failure is
found.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
As is often the case with macros (especially those that resolve to
multiple statements), it isn't technically necessary to end any of the
invocations of the DO_TEST_*() macros with a semicolon (as evidenced
by the lines changed in this path). Having does make some
auto-indenters (e.g. cc-mode in emacs) more likely to do the right
thing, though, and it also looks nicer if all the lines are similar.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The documentation states:
``iothread``
Supported for controller type ``scsi`` using model ``virtio-scsi`` for
``address`` types ``pci`` and ``ccw`` :since:`since 1.3.5 (QEMU 2.4)`. The
The code itself didn't validate if iothread is specified for any other
controller type.
Add test case showing the issue on one example.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The schema definition will be reused when adding iothread<->virtqueue
mapping for 'virtio-scsi'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function reports libvirt errors so stick with the usual '0' and '-1'
return values.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Extract the code to 'qemuDomainValidateIothreadMapping'. It will be
reused to validate the mapping for 'virtio-scsi' iothreads.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Prepare for reuse of the code for 'virtio-scsi' controller.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The code will be also needed for 'virtio-scsi' controller definitions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The code will be also needed for 'virtio-scsi' controller definitions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The iothread mapping will be also possible for 'virtio-scsi' controllers
so rename the corresponding structs to a generic name.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When starting a VM with a vhost-user interface in server mode qemu will
wait for the incoming connection without running CPUs. This isn't really
documented in our XML. Additionally when hotplugging the same interface
the above will not happen.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Note that 'block' backed NVRAM may need to use 'qcow2' images to work
properly and that populating from template may not support format
conversion.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The presence of a return value made it seem that it's expected to fail
on errors which is not the case. The function is designed to skip
anything it can't fill and not fail when fetching individual stats.
Convert the workers to void to make it clear that it's expected not
to fail.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The bulk domain stats API is meant to collect as much data as possible
without erroring out.
If fetching of the dirty rate stats fails just skip outputting them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The bulk domain stats API is meant to collect as much data as possible
without erroring out.
If fetching of the memory bandwidth stats fails just skip outputting them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The bulk domain stats API is meant to collect as much data as possible
without erroring out. Ignore errors from 'qemuDomainGetIOThreadsMon()'
and skip the data if an error happens.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The bulk domain stats API is meant to collect as much data as possible
without erroring out. Skip the perf stats if we can't fetch them instead
of erroring out.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function didn't comply with libvirt's error reporting scheme as it
reported libvirt errors only sometimes. As callers may want to ignore
errors convert it to returning -errno on failure instead.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The bulk domain stats API is meant to collect as much data as possible
without erroring out.
If fetching of the cache stats fails just skip outputting them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use g_autofree for the temporary variables, remove error checks for
virBitmapFormat and simplify formatting of multiple attributes.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Decrease scope of temporary variables so that they don't have to be
autofreed and VIR_FREE()d at the same time.
Remove unneeded checks and temporary variables.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
NULL can't be returned; don't mention it in the docs.
Avoid extra cofusing variable when returning copy of empty string.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'qemuDomainStorageAlias' always returns non-NULL pointer if it gets a
non-NULL string on input. Remove unneeded checks from callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
These files pollute the stderr output when the sc_trailing_blank
syntax check fails.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
A function was changed from having no arguments to having a single
argument, but the entire body of the function was #ifdefed out for
windows builds, leaving that new argument unused. Surprisingly this
didn't cause the build to fail, but I happened to notice it flit by
during an rpm build.
Fixes: 785cd56e58
Signed-off-by: Laine Stump <laine@redhat.com>
In virFileIsSharedFSOverride() we compare a path against a list
of overrides looking for a match.
All overrides are canonicalized ahead of time though, so e.g.
/var/run/foo will be turned into /run/foo due to /var/run being
a symlink on modern Linux systems. But the path we're trying to
match with the overrides doesn't get the same treatment, so in
this scenario the comparison will always fail.
Canonicalizing the path as well solves the issue.
Resolves: https://issues.redhat.com/browse/RHEL-79165
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Almost everything is already there (in the section for using passt
with type='user'), so we just need to point to that from the
type='vhostuser' section (and vice versa), and add a bit of glue.
Also updated a few related details that have changed (e.g. default
model type for vhostuser is now 'virtio', and source type/mode are now
optional), and changed "vhost-user interface" to "vhost-user
connection" because the interface is a virtio interface, and
vhost-user is being used to connect that interface to the outside.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This can/should also be done for a traditional vhost-user interface
(ie not backend type='passt') but that will be a separate change.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
<interface type='vhostuser'><backend type='passt'/> needs to run the
passt command just as is done for interface type='user', but then add
vhostuser bits to the qemu commandline/monitor command.
There are some changes to the parsing/validation along with changes to
the vhostuser codepath do do the extra stuff for passt. I tried
keeping them separated into different patches, but then the unit test
failed in a strange way deep down in the bowels of the commandline
generation, so this patch both 1) makes the final changes to
parsing/formatting and 2) adds passt stuff at appropriate places for
vhostuser (as well as making a couple of things *not* happen when the
passt backend is chosen). The result is that you can now have:
<interface type='vhostuser'>
<backend type='passt'/>
...
</interface>
Then as long as you also have the following as a subelement of
<domain>:
<memoryBacking>
<access mode='shared'/>
</memoryBacking>
your passt interfaces will benefit from the greatly improved
efficiency of a vhost-user data path, and all without requiring
special privileges or capabilities *anywhere* (i.e. it works for
unprivileged libvirt (qemu:///session) as well as privileged libvirt).
Resolves: https://issues.redhat.com/browse/RHEL-69455
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When passt is used with vhostuser, the vhostuser code that builds the
qemu commandline will need to have the same socket path that is given
to the passt command, so this patch makes it visible outside of
qemu_passt.c.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuProcessPrepareDomain()'s comments say that it should be the only
place to change the "live XML" of a domain (i.e. the public parts of
the virDomainDef object that is shown in the domain's status
XML), and that seems like a reasonable idea (although there aren't
many users of it to date).
qemuProcessPrepareDomainNetwork() is called by the aforementioned
qemuProcessPrepareDomain() - this patch changes the "if (type ==
HOSTDEV)" in that function to a "switch(type)" so it's simpler to add
DomainDef modifications for various other types of virDomainNetDef,
and also so that anyone who adds a new interface type is forced to
look at the code and decide if anything needs to be done here for the
new type.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For some reason, when vhostuser interface support was added in 2014,
the parser required that the XML for the <interface> have a <source>
element with type, mode, and path, all 3 also required. This in spite
of the fact that 'unix' is the only possible valid setting for type,
and 95% of the time the mode is set to 'client' (as I understand from
comments in the code, normally a guest will use mode='client' to
connect to an existing socket that is precreated (by OVS?), and the
only use for mode='server' is for test setups where one guest is setup
with a listening vhostuser socket (i.e. 'server') and another guest
connects to that socket (i.e. 'client')). (or maybe one guest connects
to OVS in server mode, and all the others connect in client mode, not
sure - I don't claim to be an expert on vhost-user.)
So from the point of view of existing vhost-user functionality, it
seems reasonable to make 'type' and 'mode' optional, and by default
fill in the vhostuser part of the NetDef as if they were 'unix' and
'client'.
In theory, the <source> element itself is also not *directly* required
after this patch, however, the path attribute of <source> *is*
required (for now), so effectively the <source> element is still
required.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Since vhostuser is only used/supported by the QEMU driver, and all the
rest of the vhostuser-specific validation is done in QEMU's
validation, lets move the final check (to see if they've tried to
enable auto-reconnect when this interface is on the server side of the
vhostuser socket) to the QEMU validate.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Both vdpa and vhostuser require that the guest device be virtio, and
for interface type='vdpa', we already set <model type='virtio'/> if it
is unspecified in the input XML, so let's be just as courteous for
interface type='vhostuser'.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Both vhostuser and vdpa interface types must use the virtio model in
the guest (because part of the functionality is implemented in the
guest virtio driver). Due to ["because that's the way it happened"]
this has been validated for vhostuser in the hypervisor-agnostic
validate function, but for vdpa it has been done in the QEMU-specific
validate. Since these interface models are only supported by QEMU
anyway, validate for both of them in the QEMU validation function.
Take advantage of this change to switch to using
virDomainNetIsVirtioModel(net) instead of "net->model ==
VIR_DOMAIN_NET_MODEL_VIRTIO" (the former also matches
...VIRTIO_TRANSITIONAL and ...VIRTIO_NON_TRANSITIONAL, so is more
correct).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Because all the checks for VIR_DOMAIN_NET_TYPE_VDPA were inside an
else-if clause that was immediately followed by another else-if clause
that forbid setting guestIP.ips or guestIP.routes, we've been allowing
users to set guestIP.* for vdpa interfaces (but then not doing
validation of the attributes that should have been done if we *did*
support setting IPs for vdpa (but we don't anyway, so 🤷.)
This can be fixed by turning the vdpa else-if clause into a top-level
if - this way vdpa interfaces will hit the "else if
(net->guestIP.nips)" clause and reject guest-side IP address setting.
Also, since there are currently *no* interface types for QEMU that
support adding guest-side routes, we put that check by itself (I think
it may be possible to set some guest routes for passt interfaces, but
we don't do that)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We haven't checked for memalloc failure in many years, and that was
the only reason this function would have ever failed.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When a daemon (like libvirtd, virtqemud, etc.) is started as an
unprivileged user (which is exactly how KubeVirt does it), then
it tries to register on both session and system DBus-es so that
it can shut itself down (e.g. when system is powering off or user
logs out). It's worth noting that this is just opportunistic and
if no DBus is available then no error is reported.
Or at least that's what we thought. Because the way our
virGDBusGetSessionBus() and virGDBusGetSystemBus() are written an
error is actually reported every time the daemon starts.
Use virGDBusHasSessionBus() and virGDBusHasSystemBus() to check
if corresponding bus is available.
Resolves: https://issues.redhat.com/browse/RHEL-79088
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
This is just like virGDBusHasSystemBus() except it checks for the
session bus instead of the system one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
This allows a user specified delay between autostart of each VM, giving
parity with the equivalent feature of libvirt-guests.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This delay can reduce the CPU/IO load storm when autostarting many
guests.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This eliminates some duplicated code patterns aross drivers.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
There's a common pattern for autostart of iterating over VMs, acquiring
a lock and ref count, then checking the autostart & is-active flags.
Wrap this all up into a helper method.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Switch to the 'notify-reload' service type and send notifications to
systemd when reloading configuration.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We have a way to notify systemd when we're done starting the daemon.
Systemd supports many more notifications, however, and many of them
are quite relevant to our needs:
https://www.freedesktop.org/software/systemd/man/latest/sd_notify.html
This renames the existing notification API to better reflect its
semantics, and adds new APIs for reporting
* Initiation of config file reload
* Initiation of daemon shutdown process
* Adhoc progress status messages
* Request to extend service shutdown timeout
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
A connection object is not required because autostarted domains are
never marked for autodestroy.
The comment about needing a connection for the network driver is
obsolete since we can auto-open a connection on demand.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This allows for passinga NULL connection object in cases where
domain autodestroy is not required.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
On incoming migration qemu doesn't activate the block graph nodes right
away. This is to properly facilitate locking of the images.
The block nodes are normally re-activated when starting the CPUs after
migration, but in cases (e.g. when a paused VM was migrated) when the VM
is left paused the block nodes are not re-activated by qemu.
This means that blockjobs which would want to write to an existing
backing chain member would fail. Generally read-only jobs would succeed
with older qemu's but this was not intended.
Instead with new qemu you'll always get an error if attempting to access
a inactive node:
error: internal error: unable to execute QEMU command 'blockdev-mirror': Inactive 'libvirt-1-storage' can't be a backing child of active '#block052'
This is the case for explicit blockjobs (virsh blockcopy) but also for
non shared-storage migration (virsh migrate --copy-storage-all).
Since qemu now provides 'blockdev-set-active' QMP command which can
on-demand re-activate the nodes we can re-activate them in similar cases
as when we'd be starting vCPUs if the VM weren't left paused.
The only exception is on the source in case of a failed post-copy
migration as the VM already ran on destination so it won't ever run on
the source even when recovered.
Resolves: https://issues.redhat.com/browse/RHEL-78398
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The command will be used to re-activate block nodes after migration when
we're leaving the VM paused so that blockjobs can be used.
As the 'node-name' field is optional the 'qemumonitorjsontest' case
tests both variants.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The flag signals presence of the 'blockdev-set-active' QMP command.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Notable changes:
- 'blockdev-set-active' QMP command and the corresponding 'active'
flag for instantiating blockdev backends added
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The query language allows querying whether a member is optional by using
the '*' "operator" but the dumper script didn't output those query
strings.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
'qemuDomainSupportsCheckpointsBlockjobs()' should really be used only
with active VMs based on the scope of interlocking it does.
This means that the inactive snapshot code path needs to do the
interlocking based on what's supported:
- external snapshot support was not implemented yet
(bitmaps need to be propagated to the new overlay image)
- internal snapshot support can be deferred to qemu
Move the check inside qemuSnapshotPrepare() which has knowledge about
the snapshot type and implement an explicit check for the inactive case.
See: https://gitlab.com/libvirt/libvirt/-/issues/739
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The commit in question made an incorrect change that resulted in getting
O_RDONLY FD instead of O_RDWR preventing any writes to happen with the
following error:
virQEMUSaveDataWrite:176 : failed to write header to domain save file '/path/to/save.img': Bad file descriptor
Pass 'bypass_cache' as proper bool as the original code did.
Fixes: 517248e239
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This replaces Fedora 39 with Fedora 41, updates the FreeBSD
Cirrus CI image names, and tweaks some package names
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When processing the PCI devices we can only read the configs for each of
them if running as privileged. That information is saved in the driver
state as a boolean introduced in commit 643c74abff. However since
that version it is only written to once during nodeStateInitialize() and
only read from that point (apart from some commits around v3.9.0 release
when it was not even set, but that was fixed before v3.10.0). And it is
only read once, just to store that boolean in a temporary variable which
is also used in only one condition.
Rewrite this without locking and save few lines of code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add reporting an internal error when the string to type conversion of
devtype fails as this indicates a serious problem since devtype was used
to get into this method during the udev event handling.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The call to 'qemuBlockStorageSourceNeedsFormatLayer()' bases the
decision also on the state of the passed FD, so we must initialize the
passthrough data via 'qemuDomainPrepareStorageSourceFDs()' before the
aforementioned call.
In the test change it's visible that we didn't add the necessary 'raw'
driver which allows the 'protocol' blockdev to be opened in 'rw' mode so
that qemu picks the proper file descriptior while keeping the device
read-only.
Resolves: https://issues.redhat.com/browse/RHEL-37519
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Add few examples of fd groups with the 'writable' flag set, when passing
a single FD. Notably as a top level image of a readonly disk (even when
that doesn't make much sense) and also as a base image of a chain.
Note that this documents a status quo of a bug fixed in upcoming patch.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Pass also the 'writable' state to the fake passed FDs so that we can
test it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Recently, I was part of a discussion where it was suspected that
libvirt does not pick up correct FW for SEV-SNP guests. Update
our test to demonstrate it does.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Similarly to commit 1af45804 we should be safer by waiting for the whole
sysfs tree is created for the device.
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Check if the persistent reservations manager daemon is still needed
after a disk (sub)-chain was dropped after a blockjob.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Export qemuHotplugAttachManagedPR/qemuHotplugRemoveManagedPR for reuse
in blockjob code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Consider also the destination of a block-copy job.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add data based on 'v9.2.0-1537-gd922088eb4'
Notable changes:
- '10.0' machine types added
- 'hub' chardev backend added
- 'cpr' migrate channel added
- 'nsamples' field for 'dbus' audio backend now reported
- 'ClearwaterForest-v1' cpu model added
- 'SierraForest-v2' cpu model added
- 'ivshmem-flat' device added
- new qom objects:
- 'virtio-mem-system-reset'
- 'vmclock'
- default value of 'rombar' changed from 1 to -1 for all devices
- 'intel-iommu' device:
- default value of 'aw-bit' changed from '39' to '48'
- 'fs1gp' boolean added
- 'x-flts' boolean added
- 'virtio-balloon-pci'/'virtio-mem-pci':
- 'ioeventfd' added
- 'vectors' added
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Update the data after the release.
Notable changes:
- the 6.2 machine types became deprecated
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Attempting to take an internal snapshot of a freshly defined VM with
qcow2 backed NVRAM results in failure as the NVRAM image doesn't get
populated until the VM is started for the first time.
Fix this by invoking qemuPrepareNVRAM() when qcow2 nvram is defined.
Resolves: https://issues.redhat.com/browse/RHEL-73315
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Export qemuPrepareNVRAM so that it doesn't require the VM object. The
snapshot code needs in the corner case of creating a snapshot of a
freshly defined VM ensure that the nvram image exists in order to
snapshot it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The variable holds the amount of disks to roll back the snapshot for.
The value must be set before the code jumps to the 'rollback:' label so
the best situation is to not initialize it and let the compiler catch
errors rather than initialize the unsigned variable to -1 and let it
crash.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
clang complains:
../../../libvirt/src/node_device/node_device_udev.c:1408:82: error: result of comparison of unsigned enum expression < 0 is always false [-Werror,-Wtautological-unsigned-enum-zero-compare]
1408 | if ((data->ccwgroup_dev.type = virNodeDevCCWGroupCapTypeFromString(devtype)) < 0)
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~
1 error generated.
Fix it by adding a temporary int variable to facilitate the check before
assigning to the unsigned enum value.
Fixes: 985cb9c32a
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Add the group membership information to a CCW device. Allow to filter
CCW devices based on a group membership.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The fields are in nanoseconds, not microseconds. Also fixes the
description of `vcpu.<num>.wait`, as it does not actually represent the
time waiting on I/O.
Signed-off-by: Fabricio Duarte <fabricio.duarte.jr@gmail.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
In case when the hypervisor does report the reason for the I/O error as
an unstable string to display to users we can add a @reason possibility
for the I/O error event noting that the error is available.
Add 'message' as a reason enumeration value and document it
to instruct users to look at the logs or virDomainGetMessages().
The resulting event looks like:
event 'io-error' for domain 'cd': /dev/mapper/errdev0 (virtio-disk0) report due to message
Users then can look at the virDomainGetMessages() API:
I/O error: disk='vda', index='1', path='/dev/mapper/errdev0', timestamp='2025-01-28 15:47:52.776+0000', message='Input/output error'
Or in the VM log file:
2025-01-28 15:47:52.776+0000: IO error device='virtio-disk0' node-name='libvirt-1-storage' reason='Input/output error'
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Emphasise that it's an enumeration and convert the possibilities to a
list of values with explanation.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Report any stored I/O error messages reported by the hypervisor when
reporting messages of a domain. As the I/O error may be already stale we
report also the timestamp when it was recorded.
Example message:
I/O error: disk='vda', index='1', path='/dev/mapper/errdev0', timestamp='2025-01-28 15:47:52.776+0000', message='Input/output error'
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Simplify the function especially by rewriting it using GPtrArray to
construct the string list, especially for the upcoming case when the
number of added elements will not be known beforehand and when
hypervisor specific data will be added.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The two VIR_DOMAIN_MESSAGE_* flags were not listed in the virCheckFlags
check in 'libxl' but were present in 'test' and 'qemu' driver impls.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add a log entry to the VM log file for every time we receive an IO error
event from qemu. The log entry is as follows:
2025-01-24 16:03:28.928+0000: IO error device='virtio-disk0' node-name='libvirt-1-storage' reason='other: Input/output error'
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Record the last I/O error reason and timestamp which happened with the
corresponding virStorageSource struct.
This will later allow querying the last error e.g. via the
virDomainGetMessages() API.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Hypervisors may report a I/O error message (unstable; for human use)
to libvirt. In order to store it with the appropriate virStorageSource
so that it can be later queried we need to add fields to
virStorageSource to store the timestamp and message.
The message is deliberately not copied via virStorageSourceCopy.
The messages are also not serialized to the status XML as losing them on
a daemon restart as they're likely to be stale anyways.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
QEMU commit v9.1.0-1065-ge67b7aef7c added 'qom-path' as an optional
field for the BLOCK_IO_ERROR event. Extract and propagate it as an
alternative to lookup via 'node-name' and 'device' (alias).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When qemu reports a node name for an I/O error we should prefer the
lookup by node name instead as it gives us the path to the specific
image which caused the error instead of the top level image path.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Leave the interpretation of the event to 'qemuProcessHandleIOError()'
which will create it's own variant of the messages for the user-facing
libvirt events. qemuMonitorJSONHandleIOError() will pass through the raw
data it got from qemu.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Prefix the helper variables used to supply data to the event by
'event'. Declare them with the default value of an empty string rather
than doing it later.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The field is named 'device' in the event so unify our naming.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
BLOCK_IO_ERROR's 'device' field is an empty string in case when it isn't
applicable as it was originally mandatory in the qemu API docs.
Move the logic that convert's empty string back to NULL from
'qemuProcessHandleIOError()' to 'qemuMonitorJSONHandleIOError()'
This also fixes a hypothetical NULL-dereference if qemu would indeed
report an IO error without the 'device' field present.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This is similar to emuxmlconfdata/memory-hotplug-virtio-mem-pci-s390x.xml
except the explicit placement of virtio-mem onto a PCI bus is removed.
This results in virtio-mem being placed onto CCW "bus" this demonstrating
previous commits working as expected.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
After previous commits, we can allow virtio-mem to live on CCW
channel.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
There are basically two differences between virtio-mem-ccw and
virtio-mem-pci. s390 doesn't allow mixing different page sizes
and there's no NUMA support in QEMU.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
This capability tracks whether QEMU supports virtio-mem-ccw
device. Introduced in QEMU commit v9.2.0-492-gaa910c20ec only
upcoming release of QEMU supports the device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
As of v9.2.0-1413-gd77ae821e8 QEMU supports virtio-mem-pci on
s390 too. Let's add a test case for that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Both, virtio-mem and virtio-pmem devices follow traditional QEMU
naming convention: their suffix determines what bus they live on.
For instance, virtio-mem-pci, virtio-mem-ccw, virtio-pmem-pci.
We already have a function that constructs device name following
this convention: qemuBuildVirtioDevGetConfigDev().
While there's no virtio-pmem-ccw device yet, the function can
still be used.
Another advantage of using the function is - it'll be easier in
future when we want to configure various virtio aspects of memory
devices (like ats, iommu_platform, etc.).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
In some cases, we might automatically add a NUMA node. But this
doesn't work for s390 really, because in its commit
v2.12.0-rc0~41^2~6 QEMU forbade specifying NUMA nodes for s390.
Suppress automatic adding of NUMA node on our side.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
In Fedora >= 42, support for user/group account creation based on
sysusers files has been enabled in RPM. Manually running useradd/
groupadd is thus obsolete.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We previously added a sysusers file, but missed the 'virtlogin' group.
This group is used to make the virt-login-shell binary setgid, so we
shoudl be registering that too. It must be done in a separate sysusers
file, however, since it is packaged separately from the daemons.
Fixes: a2c3e390f7
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Now that only supported version of VirtualBox is 7.0.x the code
that supports older versions can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
If initialization of VBOX fails inside of _pfnInitialize an
negative value is returned to signal an error condition to a
caller but no error message is printed out. Reporting an error
may shed more light into why VBOX failed to initialize.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When attempting to restore a saved image, the check for a valid save image
format does not occur until the qemu process is about to be executed. Move
the check earlier in the restore process, along with the other checks that
verify a valid save image header.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Split the reading of libvirt's save image metadata from the opening
of the fd that will be passed to QEMU. This allows improved error
handling and provides more flexibility users of qemu_saveimage.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
qemuDomainObjRestore is the only caller of qemuSaveImageOpen that
requests an unlink of a corrupted save image. Provide a function to
check for a corrupt image and move unlinking it to qemuDomainObjRestore.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We previously added a sysusers file, but missed the 'libvirt' group.
This group is referenced in the polkit rules, so we should be
registering that too. It must be done in a separate sysusers file,
however, since it is common to all daemons.
Fixes: a2c3e390f7
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Ever since its introduction, g_string_replace() has received
various bugfies and improvements, e.g.:
0a8c7e57a g_string_replace: Don't replace empty string more than once per location
b13777841 g_string_replace: Document behaviour of zero-length match pattern
e8517e777 remove quadratic behavior in g_string_replace
c9e48947e gstring: Fix a heap buffer overflow in the new g_string_replace() code
to name a few. Sync our implementation with the one from current
main branch of glib. Some code style adjustments have been made
to match our coding style.
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
qemuDomainDiskByName() can return a NULL pointer on failure.
But this returned value in qemuSnapshotDeleteValidate is not checked.It will make libvirtd crash.
Signed-off-by: kaihuan <jungleman759@gmail.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Provide a proper user facing error when attempting to query block
I/O throttling settings for an empty drive. Without this patch, a less
meaningful internal error produced by qemuMonitorJSONBlockIoThrottleInfo
would be propagated to the user.
Signed-off-by: Fabian Leditzky <fabian@ldsoft.dev>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Since we supported 'product' parameter for SCSI, just expanded existing
solution makes IDE/SATA parameter works too. QEMU requires parameter 'model'
in case of IDE/SATA (instead of 'product'), so the process of making JSON
object is slightly modified. Length of the 'product' parameter is
different in SCSI (16 chars) and ATA/SATA (40 chars).
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/697
Signed-off-by: Adam Julis <ajulis@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When snapshot is created with disk-only flag it is always external
snapshot without memory state. Historically when there was not support
to revert external snapshots this produced error message.
error: Failed to revert snapshot s1
error: internal error: Invalid target domain state 'disk-snapshot'. Refusing snapshot reversion
Now we can simply consider this as reverting to offline snapshot as the
possible damage to file system is already done at the point of snapshot
creation.
Resolves: https://issues.redhat.com/browse/RHEL-21549
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This directory might not exist on systems not supporting old SystemV interfaces.
Signed-off-by: Bronek Kozicki <brok@incorrekt.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
By removing the unnecessary spaces, the behavior is aligned with
`virsh list --all --name` and `virsh net-list --all --name`.
Without this change, one can't do something like the following easily:
`virsh pool-list --all --name | xargs -I {} virsh pool-start \"{}\"`
as no pool `"foo "` (with all the spaces) actually exist.
Although the removed comment states that the additional spaces were kept
to maintain backwards compatibility, the commit [0] and the old behavior
are from 2010 when libvirt was at version 0.8.1. For the sake of sanity,
the behavior should be aligned with other parts of the CLI.
[0] 415b14903e
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The VM private data will be used in a sub-sequent patch. To minimize
churn, refactor the function before changing the logic.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Per our supported platforms the minimum available versions are:
CentOS Stream 9: 2.68.4
Debian 11: 2.66.8
Fedora 39: 2.78.6
openSUSE Leap 15.6: 2.78.6
Ubuntu 22.04: 2.72.4
FreeBSD ports: 2.80.5
macOS homebrew: 2.82.4
macOS macports: 2.78.4
Bump to 2.66 which is limited by Debian 11. While ideally we'd bump to
2.68 which would give us 'g_strv_builder' and friends 2.66 is enough for
g_ptr_array_steal() which can be used to emulate the former with almost
no extra code.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Since the empty file with a .base64 value wasn't recognized during the loading
process (starting of libvirtd), attempting to get a value for the UUID resulted
in an undefined error. This patch resolves the issue by checking the size of
the file and ensuring that the stored value is as expected (NULL).
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Adam Julis <ajulis@redhat.com>
There's a call to read() in the file but corresponding include of
unistd.h is missing causing a build failure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When a migration with non-shared storage is started with
VIR_MIGRATE_PARAM_BANDWIDTH set, it will be applied to both memory
migration and each block job started for storage migration. Once the
migration is running virDomainMigrateSetMaxSpeed may be used to change
the bandwidth used by memory migration, but there was no way of changing
storage migration speed. Let's allow virDomainBlockJobSetSpeed during
migration to enable the missing functionality.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The new VIR_MIGRATE_PARAM_BANDWIDTH_AVAIL_SWITCHOVER parameter can be
used to override the estimated bandwidth that can be used for
transferring guest memory and device state once virtual CPUs are
stopped.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that meson ensures these directories always exist, we can
move them to the daemon-common package where they belong.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
All loadable modules should depend on the daemon-common package.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently the directories that are searched for each possible
kind of loadable module are created as a side effect of
installing the corresponding module, which means that their
availability depends on the exact list of features that have
been enabled.
Create them explicitly ahead of time instead, ensuring
consistency.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Implement domainInterfaceAddresses for the Cloud Hypervisor driver.
Support VIR_DOMAIN_INTERFACE_ADDRESSES_SRC_LEASE and
VIR_DOMAIN_INTERFACE_ADDRESSES_SRC_ARP sources. Implementation is
similar to other drivers.
Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com>
Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Implement `virCHProcessEvent` that maps event string to corresponding
event type and take appropriate actions. As part of this, handle the
shutdown event by correctly updating the domain state. This change also
facilitates the handling of other VM lifecycle events, such as booting,
rebooting, pause, resume, etc.
Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use a FIFO(named pipe) for --event-monitor option in CH. Introduce a new
thread, `virCHEventHandlerLoop`, to continuously monitor and handle
events from cloud-hypervisor.
Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Co-authored-by: Vineeth Pillai <viremana@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The `--event-monitor` option in cloud-hypervisor outputs events to a
specified file. This file can then be used to monitor VM lifecycle,
other vmm events and trigger appropriate actions.
Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Co-authored-by: Vineeth Pillai <viremana@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Most of my historical libvirt contributions are PowerPC related but at
this moment I'm working with RISC-V enablement (mostly on the QEMU
side).
Feel free to reach out for RISC-V related matters w.r.t libvirt and
QEMU-KVM support.
Suggested-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The 'aia' feature is added as a machine type option for the 'virt'
RISC-V machine, e.g. "-machine virt,aia=<val>".
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
This feature is implemented as a string that can range from "none",
"aplic" and "aplic-imsic".
If the feature isn't present in the domain XML the hypervisor default
will be used. For QEMU, at least up to 9.2, the default is "none".
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
AIA (Advanced Interrupt Architecture) support was introduced in QEMU 7.0
for the 'virt' machine type. It allows the guest to choose from a more
modern interrupt model than the default (CLINT - Core Logical Interrupt
Controller).
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The correct compiler define to detect the RISC-V architecture is __riscv.
Fixes: b902cfece0 ("virsysinfo: Try reading DMI table")
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Let us introduce the xml and reply files for QEMU 10.0.0 on s390x.
Notable changes:
- new s390-ccw-virtio-10.0 machine type
- old machine types (2.4 - 2.8) dropped
- new CPU models
- New devices:
- virtio-mem-ccw
- chardev now supports qemu-vdagent
Signed-off-by: Shalini Chellathurai Saroja <shalini@linux.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Suggested-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When a migration is canceled very late once virtual CPUs are already
stopped, QEMU will automatically resume them. If this happens after we
exited a waiting loop in qemuMigrationSrcWaitForCompletion, but before a
loop that tries to make sure CPUs are stopped by waiting for the
appropriate event, we may end up waiting forever because the CPUs are
running (they were resumed by migrate_cancel), but the STOP event is
already gone.
This is possible because we enter monitor for fetching migration
statistics at which point other APIs can be processed and migration may
change its state. We should recheck the state when we get back from the
monitor code.
https://issues.redhat.com/browse/RHEL-52493
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Inactive domain XML can be wildly different to the live XML. For
instance, it can have VSOCK CID of that from another (running)
domain. Since domain status is not checked for, attempting to ssh
into an inactive domain may in fact result in opening a
connection to a different live domain that listens on said CID
currently.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/737
Resolves: https://issues.redhat.com/browse/RHEL-75577
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
JSON parser isn't called when reading empty files so `jerr` will be used
uninitialized in the original code. Empty files appear when a network
has no dhcp clients.
This patch checks for such files and skip them.
Fixes: a8d828c88b
Signed-off-by: Jiang XueQian <jiangxueqian@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This matches the cpu_family() check in `meson.build`
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Commit f8558a87ac de-modularized the
'storage-file' backend for local files, and thus now the only
possibility to have the directory is when compiling with gluster.
This breaks RPM builds when building without gluster as the backend
directory no longer exists in such case. Move the stanza requiring the
directory under the gluster driver declarations.
Fixes: f8558a87ac
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
In its upstream commit [1], qemu dropped s390-2.7 machine type,
then in commit [2] the s390-2.8 machine type was dropped. But as
Thomas Huth pointed out, any machine type that's older than 6
years is subject to removal [3]. This means, any machine type
older than 4.1 is going to be removed eventually.
We have two test cases that assumes existence of 2.7 machine type.
While they could be switched to 4.1 machine type, we also have
another test case that already check 4.2 machine type.
Therefore, just drop the 2.7 ones.
1: 3199c7ee76
2: 66924fe369
3: ce80c4fa6f
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
The 'storage_file' infrastructure serves as an abstraction on top of
file-looking storage technologies. Apart from local file it currently
implements also a backend for 'gluster'.
Historically it was all modularized and the local file module was
usually packaged with the 'core' part of the storage driver. Now with
split daemons one can install e.g. 'virqemud' without the storage driver
core which contains the 'fs' backend module. Since the qemu driver uses
the storage file backends to e.g. create storage for snapshots and
backups this allows users to create a deployment where some things will
not work properly.
As the 'fs' backend doesn't use any code that wouldn't be linked
directly anyways there's no point in actually shipping it as a module.
Let's compile it in so that all deployments can use it.
To achieve that, compile the source directly into the
'virt_storage_file_lib' static library and remove the loading code. Also
adjust the spec file.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add an example image formatted by:
qemu-img create -f qcow2 -o data_file=nbd+unix:///datafile?socket=/tmp/nbd,data_file_raw=true /tmp/nbddatastore.qcow2 10M -u
serving as an example when qemu records an empty string as the
'data_file' field.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In certain buggy conditions qemu can create an image which has empty
string stored as 'data_file'. While probing libvirt would consider the
empty string as a relative file name and construct the path using the
path of the parent image stripping the last component and appending the
empty string. This results into attempting to using a directory as an
image and thus the following error when attempting to start VM with such
an image:
error: unsupported configuration: storage type 'dir' requires use of storage format 'fat'
Reject empty strings passed in as 'data_file'.
Note that we do not have the same problem with 'backing store' as an
empty string there is interpreted as no backing file both by qemu and
libvirt.
Resolves: https://issues.redhat.com/browse/RHEL-70627
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The assigned string is 17 chars long once the trailing nul is taken
into account. This triggers a warning with GCC 15
src/util/virsystemd.c: In function ‘virSystemdEscapeName’:
src/util/virsystemd.c:59:38: error: initializer-string for array of ‘char’ is too long [-Werror=unterminated-string-initialization]
59 | static const char hextable[16] = "0123456789abcdef";
| ^~~~~~~~~~~~~~~~~~
Switch to a dynamically sized array as used in all the other places
we have a hextable array.
See also: https://gcc.gnu.org/PR115185
Reported-by: Yaakov Selkowitz <yselkowi@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The list of CPU features we probe from various MSR grew significantly
over time and the CPU map currently mentions 11 distinct MSR indexes.
But the code for directly probing host CPU features was still reading
only the original 0x10a index. Thus the CPU model in host capabilities
was missing a lot of features.
Instead of specifying a static list of indexes to read (which we would
forget to update in the future), let's just read all indexes found in
the CPU map.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When an I/O error happens (causing a domain to be paused) during live
migration which is later cancelled by a user, trying to resume the
domain doesn't make sense.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
None of the callers really care about the return value so we can drop it
and simplify the code a bit.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When migration fails in Perform phase, we call Finish on the destination
host with cancelled=1 and get the error from there and report it to the
user. This works well if the error on the destination caused the
migration to fail. But in other cases the main error may reported by the
source and the destination would just be complaining about broken
migration stream.
In other words, we don't really know which error caused the migration to
fail and we have no way of detecting that. So instead of choosing one
error, this patch will combine the error messages from both sides of
migration into a single message and report it to the user. The result
would be, for example:
operation failed: migration failed. Message from the source host:
operation failed: job 'migration out' failed: Certificate does not
match the hostname ble.bla. Message from the destination host:
operation failed: job 'migration in' failed: load of migration
failed: Invalid argument
And yes, this is ugly, but I wasn't able to come up with a better way of
fixing this issue.
https://issues.redhat.com/browse/RHEL-58933
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are some features/improvements/bug fixes I've either
contributed or reviewed/merged. Document them for upcoming
release.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The source_root() method is deprecated in 0.56.0 and we're
recommended to use project_source_root() instead.
This is similar to commit v8.9.0-rc1~70 but somehow, the old
method sneaked in.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The postcopy-recover migration state in QEMU means a connection for the
migration stream was established. Depending on the schedulers on both
hosts a relative timing of the corresponding MIGRATION event on the
source host and the destination host may differ. Specifically it's
possible that the source sees postcopy-recover while the destination is
still in postcopy-paused.
Currently the Perform phase on the source host ends when we get
postcopy-recover event and the Finish phase on the destination host is
called. If this is fast enough we can still see postcopy-paused state
when the Finish phase starts waiting for migration to complete. This is
interpreted as a failure and reported back to the caller. Even though
the recovery may actually start just a few moments later.
To avoid this race we now don't consider post-copy migration active in
postcopy-recover state and keep waiting for postcopy-active event (in
the success path). Thus the Finish phase is entered only after the
migration switches to postcopy-active. In this state QEMU guarantees the
destination already switched at least to postcopy-recover and we won't
be confused be seeing an old postcopy-failed state.
https://issues.redhat.com/browse/RHEL-73085
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com
The generated org.libvirt.api.policy.in file was recently added to the
POTFILES list as it contains translatable messages.
It is only generated when WITH_POLKIT && WITH_LIBVIRTD is satisfied
though, resulting in the 'po_check' syntax rule failing if either of
those conditions are not met.
It is harmless to unconditionally generate this file, as a separate
rule takes care of of installing it, and the latter remains under
the build conditions.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since we previously only supported vlan tagging for interfaces
connected to an OVS bridge [*], the code in qemuChangeNet() (used by
the update-device API) assumed an interface with modified vlan config
was on an OVS bridge, and would call the OVS-specific
virNetDevOpenvswitchUpdateVlan().
Now that we support vlan tagging for interfaces connected to a
standard Linux host bridge, we must check the type of connection and
only call the OVS function when connected to an OVS bridge *both
before and after the update*. Otherwise we just set the flag to
re-connect to the bridge, which has the side effect of redoing the
vlan setup.
([*] or an SRIOV VF assigned using VFIO, but we don't support *any
runtime changes to that type of netdev so it's irrelevant here.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Update domain XML and network XML documentation to describe how
standard linux bridges support the VLAN configuration.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Laine Stump <laine@redhat.com>
When we are deleting external snapshot that is not active we only need
to delete overlay disk image of the parent snapshot. This works
correctly even if parent snapshot is external and active as it will have
another overlay created when user reverted to that snapshot.
In case the parent snapshot is internal there are no overlay disk images
created as everything is stored internally within the disk image. In
this case we would delete the actual disk image storing internal
snapshots and most likely the original disk image as well resulting in
data loss once the VM is shutoff.
Fixes: https://gitlab.com/libvirt/libvirt/-/issues/734
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The host-model CPU mode was described as similar to copying the host CPU
definition from capabilities, which has not been the case for ages. The
host-model definition from domain capabilities is used instead.
Only the first sentence changed, but it required reformatting
essentially the whole paragraph so I used this as an opportunity to
reformat it a little bit more and split the long paragraph into several
smaller ones for better readability.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When VIR_INHIBITOR_WHAT_NONE is passed to virInhibitorNew, it is
an indication that daemon shutdown should be inhibited, but no
OS level inhibitors acquired. This is done by the virtnetworkd
daemon, for example, to prevent shutdown while running virtual
machines are present, without blocking / delaying OS shutdown.
Unfortunately the code forgot to skip the DBus call in this case,
resulting in errors being logged.
Reviewed-by: Laine Stump <laine@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When debugging it is useful to know what signals are being received and
metadata related to them. Log this data before calling the signal
handling callbacks.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
GPU vendors are moving away from using mdev to create virtual GPUs
towards using SRIOV VFs that are vGPUs. In both cases, once created
the vGPUs are assigned to guests via <hostdev> (i.e. VFIO device
assignment), and inside the guest the devices look identical, but mdev
vGPUs are located by QEMU/VFIO using a uuid, while VF vGPUs are
located with a PCI address. So although we generally require the
device on the source host to exactly match the device on the
destination host, in the case of mdev-created vGPU vs. VF vGPU
migration *can* potentially work, except that libvirt has a hard-coded
check that prevents us from even trying.
This patch loosens up that check so that we will allow attempts to
migrate a guest from a source host that has mdev-created vGPUs to a
destination host that has VF vGPUs (and vice versa). The expectation
is that if this doesn't actually work then QEMU will fail and generate
an error that we can report.
Signed-off-by: Laine Stump <laine@redhat.com>
Tested-by: Zhiyi Guo <zhguo@redhat.com>
Reviewed-by: Zhiyi Guo <zhguo@redhat.com>
Adjust domain and network validation to permit vlan configuration on
standard linux bridges.
Update calls to virNetDevBridgeAddPort to pass the vlan configuration.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Laine Stump <laine@redhat.com>
Add virNetDevBridgeSetupVlans function to configure a bridge
interface using the passed virNetDevVlan struct.
Add virVlan parameter to the Linux version of virNetDevBridgeAddPort
and call virNetDevBridgeSetupVlans to set up the required vlan
configuration.
Update callers of virNetDevBridgeAddPort to pass NULL for now.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Laine Stump <laine@redhat.com>
Enable capability to add and remove vlan filters for a standard
linux bridge using netlink.
New function virNetlinkBridgeVlanFilterSet can be used to add or
remove a vlan filter to a given bridge interface.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Laine Stump <laine@redhat.com>
There is a common misconception when writing AppArmor policy that
[0-9]* applies * to the [0-9] class, but that's not the case. For this
example, [0-9]* matches a single digit followed by any number of
characters except for /
Create a UUID variable that uses the following format 8-4-4-4-12.
Signed-off-by: Georgia Garcia <georgia.garcia@canonical.com>
Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Most of the impl for the 'daemon-set-timeout' command was ordered under
the heading for the 'daemon-log-filters' command.
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The inhibitor constant values were off-by-1, so when converted into
string format, we picked the wrong names
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
In commit dfa0e11 the last direct usage of devmapper for storage_disk was
removed. There is one stale include remaining, which is unused even longer
since df1011ca. Remove the include and change meson.build so we can use
storage_disk without devmapper.
I'm running it right now with a stripped-down config on a small arm64
router with openwrt.
Signed-off-by: Stefan Hellermann <stefan@the2masters.de>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Commit 247357cc29 added support for direct and extended modes for
tlbflush, but forgot to do the formatting as well.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Add a nested buffer for whatever sub-elements a particular
hyperv feature might have.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The apparmor driver probe function checks for an active profile matching
the full path of the running daemon binary. If not found, it checks for
a profile named "libvirtd". This works fine when the running daemon is the
old monolithic libvirtd, but fails with modular daemons.
Remove the check for a hardcoded "libvirtd" profile and replace with the
basename of the running daemon binary.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The 'description' and 'message' fields in polkit policy files should be
translated into the user's chosen language. xgettext is told to search
in both and source and build dirs by meson.
Unfortunately a bug in xgettext means that when it searches for built
files in XML format, it'll trigger a warning message due to failure to
load the generated file from the source dir:
xgettext: cannot read ..snip../libvirt/src/access/org.libvirt.api.policy: failed to load external entity "..snip../libvirt/src/access/org.libvirt.api.policy"
This is harmless since it then goes on to try the build dir and
succeeds, but will pollute the output of 'ninja libvirt-pot'
Related: https://gitlab.com/libvirt/libvirt/-/merge_requests/387
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
xgettext / msgfmt have generic support for extracting / merging strings
in XML files, however, they need to be told something about the schema
to know which fields are translatable. This is done by providing 'its'
rules. Usually the 'its' rules would be shipped in a -devel package of
the app which owns the schema definition, but polkit does not do this.
Thus libvirt (and other apps) must ship their own local 'its' rules for
polkit.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When the vTPM source path is specified, such as:
<source type=".." path="/my/tpm"/>
Do not delete the parent directory, but only the given file/dir.
Fixes: commit f1304cc566 ("qemu_tpm: handle file/block storage source")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Commit bb5e26749f ("qemu: explicit swtpm state locking") attempted to
lock the state, but only for swtpm-setup. The capability
"tpmstate-opt-lock" is actually only exposed by swtpm.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In its upstream commit [1] openwsman dropped 'facility' variable
which is documented as:
* all processes that use the libu must define a "facility" variable somewhere
* to satisfy this external linkage reference.
*
* Such variable will be used as the syslog(3) facility argument.
Well, prior to that commit, openwsman itself declared the
variable (and set it to LOG_DAEMON). Now it's up to us.
Yeah, the variable naming is terrible and also I we are not using
libu directly, but apparently libwsman.so requires it anyway:
$ objdump -T /usr/lib64/libwsman.so | grep facility
0000000000000000 D *UND* 0000000000000000 Base facility
1: d72c51f21b
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Allows to load firmware in the qemu-efi-loongarch64 directory
Allows the binary qemu-system-loongarch64 to be run
This makes it possible to run loongarch64 VMs when AppArmor
is enabled
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
They require special handling since they are dependent on the basic
tlbflush feature itself and therefore are not handled automatically as
part of virDomainHyperv enum, just like the stimer-direct feature.
Resolves: https://issues.redhat.com/browse/RHEL-7122
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Similarly to stimer-direct these are subelements of <tlbflush/> in the
domain XML.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Log curl responses from cloud-hypervisor process during Boot request, using
domain's logContext.
Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Move the definitions of curl_data and curl_callback to be used
within virCHMonitorPutNoContent.
Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Use domainLogContext to enable logging for ch domain process during create
and restore steps.
Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
While doing so, also drop QEMU specific arguments from
domainLogContextNew() and replace them with hypervisor agnostic
ones.
Signed-off-by: Praveen K Paladugu <praveenkpaladugu@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Libxml2 has awful error reporting behaviour when reading files. When
we fail to load a file from the test driver we see:
$ virsh -c test:///wibble.xml
I/O warning : failed to load external entity "/wibble.xml"
error: failed to connect to the hypervisor
error: XML error: failed to parse xml document '/wibble.xml'
where the I/O warning line is something printed by libxml2 itself,
which also lacks any useful detail.
Switching to our own file reading code we can massively improve
things:
$ ./build/tools/virsh -c test:///wibble.xml
error: failed to connect to the hypervisor
error: Failed to open file '/wibble.xml': No such file or directory
Using 10 MB as an upper limit on XML file size ought to be sufficient
for any XML files libvirt is reading.
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The libxml2 error handling gets the filename from a libxml2 struct, but
it is better to not assume libxml2 knows the filename being parsed, as
we might have simply provided it a pre-loaded string.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently given an input of '<dom\n' we emit an error:
error: Failed to define domain from tests/qemuxmlconfdata/broken-xml-invalid.xml
error: at line 2: Couldn't find end of Start Tag dom line 1
(null)
^
With this fix we emit:
error: Failed to define domain from tests/qemuxmlconfdata/broken-xml-invalid.xml
error: at line 2: Couldn't find end of Start Tag dom line 1
<dom
----^
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The virNetDaemon code now only concerns itself with preventing auto
shutdown of the local daemon. Logind is now handled by the new
virInhibitor object, for QEMU, LXC and LibXL. This fixes two notable
bugs
* Running virtual networks would prevent system shutdown
* Loaded ephemeral secrets would prevent system shutdown
Fixes 9e3cc0ff5e
Fixes 37800af9a4
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This initial conversion of the drivers switches them over to use
the virInhibitor APIs in local daemon only mode. Communication to
logind is still handled by the virNetDaemon class logic.
This mostly just replaces upto 3 fields in the driver state
with a single new virInhibitor object, but otherwise should not
change functionality besides replacing atomics with mutex protected
APIs.
The exception is the LXC driver which has been trying to inhibit
shutdown shutdown but silently failing to, since nothing ever
remembered to set the 'inhibitCallback' pointer in the driver
state struct.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The system inhibitor locks are currently handled by code in the
virNetDaemon class. The driver code invokes a callback provided
by the daemon when it wants to start or end inhibition.
When the first inhibition is started, the daemon will call out
to logind to apply it system wide.
This has many flaws
* A single message is registered with logind regardless of
what driver holds the inhibition
* An inhibition of daemon shutdown can't be acquired
without also inhibiting system shutdown
* Config of the inhibitions cannot be tailored by the
driver
The new virInhibitor object addresses these:
* The object directly manages an inhibition with logind
privately to the driver, enabling custom messages to
be set.
* It is possible to acquire an inhibition locally to the
daemon without forwarding it to logind.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The exit-on-error=false argument of migrate-incoming tells the QEMU
process to keep running when incoming migration fails, which helps us in
two ways:
1. When migration enters Finish phase to cleanup the process, the domain
might not even exist on the destination (because it has already been
cleaned up by EOF monitor callback) and we would get rather unhelpful
"operation failed: domain is no longer running" error message.
2. We can get the error that caused incoming migration to fail directly
from QEMU via query-migrate QMP command.
https://issues.redhat.com/browse/RHEL-7041
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The function is only used during incoming migration in the beginning of
Finish phase to detect if QEMU already died but EOF handler haven't had
a chance to do its job yet. It calls query-status QMP command, but
ignores the result. By calling query-migrate instead we can achieve the
same functionality if QEMU is dead and even get meaningful error from
"error-desc" in case the incoming migration failed and QEMU is still
running.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The exit-on-error argument (added in QEMU 9.1.0) can be used to tell
QEMU not to exit when incoming migration fails so that the error can be
retrieved via QMP. This patch adds a new capability bit indicating
support for the new argument.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
As one of its arguments, the
virQEMUCapsProbeFullDeprecatedProperties() gets a pointer to
GStrv (a string list), which it may eventually replace. It's
single caller (virQEMUCapsProbeQMPHostCPU()) passes a string list
indeed. Now, when replacing one string list with another plain
g_free() is not enough as we need to free individual strings too.
==13573== 34 bytes in 8 blocks are definitely lost in loss record 271 of 576
==13573== at 0x4844878: malloc (vg_replace_malloc.c:446)
==13573== by 0x51789D1: g_malloc (in /usr/lib64/libglib-2.0.so.0.7800.6)
==13573== by 0x5193E82: g_strdup (in /usr/lib64/libglib-2.0.so.0.7800.6)
==13573== by 0x4997F73: g_strdup_inline (gstrfuncs.h:321)
==13573== by 0x4997F73: virJSONValueArrayToStringList (virjson.c:1296)
==13573== by 0x5027CF7: qemuMonitorJSONParseCPUModelExpansion (qemu_monitor_json.c:5139)
==13573== by 0x50281C9: qemuMonitorJSONGetCPUModelExpansion (qemu_monitor_json.c:5245)
==13573== by 0x501044F: qemuMonitorGetCPUModelExpansion (qemu_monitor.c:3261)
==13573== by 0x4F190D0: virQEMUCapsProbeQMPHostCPU (qemu_capabilities.c:3227)
==13573== by 0x4F2145E: virQEMUCapsInitQMPMonitor (qemu_capabilities.c:5758)
==13573== by 0x10FFF8: testQemuCaps (qemucapabilitiestest.c:111)
==13573== by 0x110B53: virTestRun (testutils.c:143)
==13573== by 0x11063E: doCapsTest (qemucapabilitiestest.c:200)
Fixes: 51c098347d
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
In my previous commit v10.10.0-48-g2d222ecf6e I've made us enable
I/O APIC when there is an IOMMU with EIM. This works well. What
does not work is case when there's just an IOMMU without EIM but
with 256+ vCPUS. Problem is that post parsing happens in two
stages: general domain post parse (where
qemuDomainDefEnableDefaultFeatures() is called) and then per
device post parse (where qemuDomainIOMMUDefPostParse() is
called). Now, in aforementioned case it is the device post parse
phase where EIM is enabled but the code that would enable
VIR_DOMAIN_FEATURE_IOAPIC has already run.
To resolve this, make the domain post parse callback "foresee"
the future enabling of EIM so that it can turn on I/O APIC
beforehand.
Resolves: https://issues.redhat.com/browse/RHEL-65844
Fixes: 2d222ecf6e
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
An RPM must own any directories its creates, unless it can guarantee a
dependancy has ownership. Two packages owning the same directory is fine
if permissions are consistent.
We don't require augeas as a dep in most packages, so we must own the
augeas lens directories. Likewise for systemtap tapset dirs.
Our own cpu map dir also needs ownership.
A few files are re-sorted, so that the files are listed immediately
adjacent to the %dir that contains them.
https://bugzilla.redhat.com/show_bug.cgi?id=2280979
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Add a new a attribute, deprecated_features='on|off' to the <cpu>
element. This is used to toggle features flagged as deprecated on the
CPU model on or off. When this attribute is paired with 'on',
deprecated features will not be filtered. When paired with 'off', any
CPU features that are flagged as deprecated will be listed under the
CPU model with the 'disable' policy.
Example:
<cpu mode='host-model' check='partial' deprecated_features='off'/>
The absence of this attribute is equivalent to the 'on' option.
The deprecated features that will populate the domain XML are the same
features that result in the virsh domcapabilities command with the
--disable-deprecated-features argument present.
It is recommended to define a domain XML with this attribute set to
'off' to ensure migration to machines that may outright drop these
features in the future.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Add a new flag, --disable-deprecated-features, to the domcapabilities
command. This will modify the output to show the 'host-model' CPU
with features flagged as deprecated paired with the 'disable' policy.
virsh domcapabilities --disable-deprecated-features
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
If flag VIR_CONNECT_GET_DOMAIN_CAPABILITIES_DISABLE_DEPRECATED_FEATURES
is passed to qemuConnectGetDomainCapabilities, then the domain's CPU
model features will be updated to set any deprecated features to the
'disabled' policy.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Introduce domain flag used to filter deprecated features from the
domain's CPU model.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Add QEMU_CAPS_QUERY_CPU_MODEL_EXPANSION_DEPRECATED_PROPS for detecting
if query-cpu-model-expansion can report deprecated CPU model properties.
QEMU introduced this capability in 9.1 release. Add flag and deprecated
features to the capabilities test data for QEMU 9.1 and 9.2 replies/XML
since it can now be accounted for.
When probing for the host CPU, perform a full CPU model expansion to
retrieve the list of features deprecated across the entire architecture.
The list and count are stored in the host's CPU model info within the
QEMU capabilities. Other info resulting from this query (e.g. model
name, etc) is ignored.
The new capabilities flag is used to fence off the extra query for
architectures/QEMU binaries that do not report deprecated CPU model
features.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
query-cpu-model-expansion may report an array of deprecated properties.
This array is optional, and may not be supported for a particular
architecture or reported for a particular CPU model. If the output is
present, then capture it and store in a qemuMonitorCPUModelInfo struct
for later use.
The deprecated features will be retained in qemuCaps->kvm->hostCPU.info
and will be stored in the capabilities cache file under the <hostCPU>
element using the following format:
<deprecatedFeatures>
<property name='bpb'/>
<property name='csske'/>
<property name='cte'/>
<property name='te'/>
</deprecatedFeatures>
At this time the data is only queried, parsed, and cached. The data
will be utilized in a subsequent patch.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Refactor the CPU Model parsing functions within
qemuMonitorJSONGetCPUModelExpansion. The new functions,
qemuMonitorJSONParseCPUModelExpansionData and
qemuMonitorJSONParseCPUModelExpansion invoke the functions they
replace and leave room for a subsequent patch to handle parsing the
(optional) deprecated_props field resulting from the command.
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This is a follow up of my previous commits. If the number of
vCPUs exceeds some arbitrary value (255) then QEMU requires IOMMU
with EIM and intremap enabled. But in turn, intremap IOMMU
requires split I/O APIC (per virDomainDefIOMMUValidate()). Since
after my previous commits (e.g. v10.10.0-rc1~183) IOMMU is added
automagically, the I/O APIC can be also enabled automagically.
Relates to: https://issues.redhat.com/browse/RHEL-65844
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function return value is invariant since 18f3771, so change
its type and remove all dependent checks.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This function return value is invariant since 18f3771, so change
its type and remove all dependent checks.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This function return value is invariant since 18f3771, so change
its type and remove all dependent checks.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This function return value is invariant since 18f3771, so change
its type and remove all dependent checks.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
For the full history behind this patch, look at the following:
https://issues.redhat.com/browse/RHEL-7036
commit v10.7.0-101-ga37bd2a15b
commit v10.8.0-rc2-8-gbcd5ae4e73
Summary: original problem was unexpected failure of update-device when
the user hadn't changed anything other than online status of the guest
NIC (which should always be allowed).
The first commit "fixed" this by avoiding the allocation of a new
ActualNetDef (i.e. creating a new networkport) for *all* network
device updates (because that was inappropriately changing which
ethernet physdev should be used for a macvtap connection, which by
design can't be handled in an update-device).
But this commit caused a regression for update-device of bridge-based
network devices (because some the updates of certain attributes *do*
require the ActualNetDef be re-allocated), so...
The 2nd commit narrowed the list of network types that get the "don't
allocate new ActualNetDef" treatment (so that only interfaces
connected to a network that uses a pool of ethernet VFs *being used in
passthrough mode* qualify).
But then it was pointed out that this re-broke simple updates of
devices that used a direct/macvtap network in "bridge" mode (because
it's possible to list multiple physdevs to use for bridge mode, in
which case the network driver attempts to "load balance" (and so a new
allocation might have a different ethernet physdev which, again, can't
be supported in a device-update).
So this (single line of code) patch *widens* the list of network types
that don't allocate a new ActualNetDef to also include the other
direct (macvtap) modes, e.g. bridge, private, etc.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
These functions return value is invariant since VIR_EXPAND_N check
removal in 7d2fd6e, so change its type and remove all dependent checks.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This function return value is invariant since VIR_EXPAND_N check
removal in 7d2fd6e, so change its type and remove all dependent checks.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This function return value is invariant since VIR_EXPAND_N check
removal in 7d2fd6e, so change its type and remove all dependent checks.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This function return value is invariant since VIR_EXPAND_N check
removal in 7d2fd6e, so change its type and remove all dependent checks.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This function return value is invariant since VIR_EXPAND_N check
removal in 7d2fd6e, so change its type and remove all dependent checks.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This function return value is invariant since VIR_EXPAND_N check
removal in 7d2fd6e, so change its type and remove all dependent checks.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Reported-by: Pavel Nekrasov <p.nekrasov@fobos-nt.ru>
Signed-off-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
QEMU supports -v1 variant of any CPU model even though the list of
versions is not defined (i.e., even if { .version = 1 } item is
missing).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When adding new CPU models to CPU map it's easy (and very common) to
forget to add the new files to meson.build. We already update index.xml
with the new models so updating meson.build too makes sense.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When starting a migration with --timeout, we create a thread to call the
migration API and in parallel setup a timer for the timeout. The
description of --timeout says: "run action specified by --timeout-*
option (suspend by default) if live migration exceeds timeout", which is
not really the way this feature was implemented. Before live migration
starts we first need to contact the source to get the domain definition
and send it to the destination where a new QEMU process has to be
started. This can take some (unpredictably long) time while the timeout
timer is already running. If a very short timeout is set (which doesn't
really make sense, but it's allowed), we may even end up taking the
timeout action before the actual migration had a chance to start.
With this patch the timeout is started only after we get non-zero
dataTotal from virDomainGetJobInfo, which means the migration (of either
storage or memory) really started.
https://issues.redhat.com/browse/RHEL-41264
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It may happen that, for instance after daemon restart, that one
thread is still in qemuProcessReconnect(), i.e. filling in
runtime information by talking to QEMU on monitor. If another
thread then tries to format domain XML (which is currently
guarded by plain mutex on virDomainObj) it'll produce incomplete
and misleading information (e.g. current size of virtio-mem).
This happens because the reconnecting thread talks to QEMU on
monitor and thus unlocks the domain object frequently allowing
the XML formatting thread to acquire the mutex meanwhile.
Resolves: https://issues.redhat.com/browse/RHEL-71042
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Debian has packaged EDK II for 64-bit RISC-V in directory
/usr/share/qemu-efi-riscv64/.
For usage with libvirt update the AppArmor helper.
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
We don't need the full git package, git-core is sufficient and a smaller
build root install.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The Linux MADV_DONTDUMP flag was added to Linux kernels > 3.3,
back in 2012, and the dump-guest-core flag was added to QEMU
> 1.0 at the same time.
IOW, on Linux we have long been able to assume that QEMU core
dumps will exclude guest memory, unless the user has overridden
the host level defaults in the domain XML.
It is desirable to permit QEMU core dumps out of the box to make
it easier for users to report crashes to their OS vendor without
having to reconfigure and restart libvirt daemons and their
running guests.
While there is a risk that an admin may have set 'dump_guest_core'
to true, while leaving 'max_core' to 0, on balance the benefits
of easier troubleshooting outweigh the risk of changing the
defaults to permit core dumps.
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since iproute2 v6.12.0, the command "ip link set lo netns -1" can
no longer be used to check for netns support, as it now validates
PIDs are not less than zero.
Since every kernel we care about has the support, just remove the
check.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
When external swtpm support was added back in 9.0.0, I omitted
the update of the XML docs.
Add it now, especially since the 'emulator' backend can now
also use the <source> element.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
There are some features/improvements/bug fixes I've either
contributed or reviewed/merged. Document them for upcoming
release.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Due to a bug in the optimization to avoid testing symlinked tests
multiple times all tests were skipped.
In commit f997fcca71 I made an attempt to optimize the
tests by avoiding testing symlinks. This optimization was buggy as I've
passed the 'd_name' field of 'struct dirent' which is just the filename
to 'g_lstat()'. 'g_lstat()' obviously always failed with ENOENT. As the
logic checked only for successful return of 'g_lstat()' the optimizatio
was a dud.
Now in 4d8ebbfee8 the 'g_lstat()' call was replaced by
'virFileIsLink()' checking all non-zero values. This meant that if
'virFileIsLink()' failed the test was skipped. Now since a bad argument
was passed this failed always and thus was always skipped making
'virschematest' useless.
Fix it by passing the full path of the test and also explicitly check
for '1' return value instead of any non-zero.
Fixes: f997fcca71
Fixes: 4d8ebbfee8
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Due to 'virschematest' being broken commit a52cd504b3
introduced a new element to the domain caps but didn't add schema for
it.
Fixes: a52cd504b3
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Both the 'user' and 'group' attribute are optional so <identity> can
be empty. Allow it to be omitted completely. The parser and qemu code
can handle that.
The schema was introduced in 943871f971
and in d018c8dc9e an offending test was
added.
Fixes: 943871f971
Fixes: d018c8dc9e
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The <dataStore> volumes have their own 'id' so we need to be able to
look them up for the given image chain.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
As the source for the data file is a completely separate
virStorageSource including it's own index we need to match it
explicitly, so that code such as storage threshold events work properly
and separately for the data file.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Extract the matching of the node name of a single virStorage source so
that the logic can be reused in the upcoming patch.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
For the reason outlined in previous commit qemu doesn't do this
automatically. Handle it manually after the snapshot.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
In contrast to normal backing chain members where qemu does honour the
'auto-read-only' property the 'data-file' nodes are not automatically
reopened by qemu. Libvirt now has the infrastructure to reopen them
explicitly so use it for all transitions of the 'commit' block job.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Add an exception for image formats not supporting backing images so that
they can be reopened RW/RO without the need for adding a terminating
virStorageSource as they simply can't have a backing image.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Refactors done in 24b667eeed (and also 9ec0e28e87)
broke the expected handling of the update of 'readonly' flag of a
virStorage. The source is actually set to the proper state but rolled
back to the previous state as the 'cleanup' label should have been
'error' and thus not reached on success.
Additionally some of the code paths violate the statement in the comment
after updating 'readonly' that only 'goto error' must be used.
Fixes: 24b667eeed
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Log the node name and current and expected state to simplify debugging.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
This is similar to one of my previous commits (v10.7.0-rc1~22)
which introduced a check that <bandwidth/> values fit into
certain limits. My original commit validated values when parsing
<bandwidth/> XML, but completely missed the case when values are
set over virDomainSetInterfaceParameters() API.
Solution is simple - just perform validation after bandwidth
structure is reconstructed from arguments passed to the API.
Resolves: https://issues.redhat.com/browse/RHEL-65372
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
virCPUCompareUnusable can be called with blockers == NULL in case the
CPU model itself is usable (i.e., QEMU reports an empty list of
blockers), but the CPU definition contains some additional features
which have to be checked.
Fixes: v10.8.0-129-g5f8abbb7d0
Reported-by: Han Han <hhan@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Tested-by: Han Han <hhan@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Please see the commit log for commit v10.9.0-rc1-1-g42ab0148dd for the
history and explanation of the problem that this patch is fixing.
A shorter explanation is that when a guest is connected to a libvirt
virtual network using a virtio-net adapter with in-kernel "vhost-net"
packet processing enabled, it will fail to acquire an IP address from
a DHCP seever running on the host.
In commit v10.9.0-rc1-1-g42ab0148dd we tried fixing this by *zeroing
out* the checksums of these packets with an nftables rule (nftables
can't recompute the checksum, but it can set it to 0) . This
*appeared* to work initially, but it turned out that zeroing the
checksum ends up breaking dhcp packets on *non* virtio/vhost-net guest
interfaces. That attempt was reverted in commit v10.9.0-rc2.
Fortunately, there is an existing way to recompute the checksum of a
packet as it leaves an interface - the "tc" (traffic control) utility
that libvirt already uses for bandwidth management. This patch uses a
tc filter rule to match dhcp response packets on the bridge and
recompute their checksum.
The filter rule must be attached to a tc qdisc, which may also have a
filter attached for bandwidth management (in the <bandwidth> element
of the network config). Not only must we add the qdisc only once
(which was already handled by the patch two prior to this one), but
also the filter rule for checksum fixing and the filter rule for
bandwidth management must be different priorities so they don't clash;
this is solved by adding the checksum-fix filter with "priority 2",
while the bandwidth management filter remains "priority 1" (both will
always be evaluated anyway, it's just a matter of which is evaluated
first).
So far this method has worked with every different guest we could
throw at it, including several that failed with the previous method.
Fixes: b89c4991da
Reported-by: Rich Jones <rjones@redhat.com>
Reported-by: Andrea Bolognani <abologna@redhat.com>
Fix-Suggested-by: Eric Garver <egarver@redhat.com>
Fix-Suggested-by: Phil Sutter <psutter@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If the layer of a virFirewallCmd is "tc", then the "tc" utility will
be executed using the arguments that had been added to the
virFirewallCmd
tc layer doesn't support auto-rollback command creation (any rollback
needs to be added manually with virFirewallAddRollbackCmd()), and also
tc layer isn't supported by the iptables backend (it would have been
straightforward to add, but the iptables backend doesn't need it, and
I didn't want to take the chance of causing a regression in that
code for no good reason).
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There will soon be two separate users of tc on virtual networks, and
both will use the "qdisc root handle 1: htb" to add tx filters. One or the
other could get the first chance to add the qdisc, and then if at a
later time the other decides to use it, we need to prevent the 2nd
user from attempting to re-add the qdisc (because that just generates
an error).
We do this by running "tc qdisc show dev $bridge handle 1:" then
checking if the output of that command contains both "qdisc" and " 1:
".[*] If it does then the qdisc has already been added. If not then we
need to add it now.
[*]As of this writing, the output more exactly starts with "qdisc
htb 1: root", but our comparison is made purposefully generous to
increase the chances that it will continue to work properly if tc
modifies the format of its output.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virNetDevBandwidthSet() adds a queue discipline (qdisc) for each
interface that it will need to add tc transmit filters to, and the
filters are then attached to the qdisc.
There are other circumstances where some other function will need to
add tc transmit filters to an interface (in particular an upcoming
patch to the network driver nftables backend that will use a tc tx
filter to fix the checksum of dhcp packets), so that function will
also need a qdisc for the tx filter. To assure both always use exactly
the same qdisc, this patch puts the command that adds the tx filter
qdisc into a separate helper function that can (and will) be called
from either place
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virNetDevBandwidthSet() always clears all existing qdiscs and their
subordinate filters before adding all the new qdiscs/filters. This is
normally exactly what we want, but there is one case (the network
driver) where the Qdisc added by virNetDevBandwidthSet() may already
be in use by the nftables backend (which will add a rule to fix the
checksum of dhcp packets); in that case, we *don't* want
virNetDevBandwidthSet() to clear out the qdisc that was already added
for nftables, and none of the bandwidth filters have been added yet,
so there already aren't any "old" filters that need to be removed
either - it is safe to just skip virNetDevBandwidthClear() in this
case.
To allow the network driver to set bandwidth without first clearing
it, this patch adds the flag VIR_NETDEV_BANDWIDTH_SET_CLEAR_ALL to the
virNetDevBandwidthSetFlags enum, and recognizes it in
virNetDevBandwidthSet() - if the flag is set, then
virNetDevBandwidth() will call virNetDevBandwidthClear() just as it
always has. But if the flag isn't set it *won't* call
virNetDevBandwidthClear().
As suggested above, VIR_NETDEV_BANDWIDTH_SET_CLEAR_ALL is set for all
calls to virNetdevBandwidthSet() except for two places in the network
driver.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Having two bools in the arg list is on the borderline of being
confusing to anyone trying to read the code, but we're about to add a
3rd. This patch replaces the two bools with a single flags argument
which will instead have one or more bits from virNetDevBandwidthFlags
set.
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Some models are just aliases to other models. Make this relation
available to users via domain capabilities.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Record a fact a specific CPU model was derived from another one. The
original model is also marked as an alias of the new one in case it did
not change any properties of the original CPU.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The signatures in the CPU map are used for matching physical CPUs and
thus we need to cover all possible real world variants we know about.
When adding a new version of an existing CPU model, we should copy the
signature(s) of the existing model rather than replacing it with the
signature that QEMU uses.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add all newly generated CPU models to the appropriate section of
index.xml.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We already visually group the included models using comments. This patch
introduces a new <group name='...'> element for doing it properly in a
machine friendly way.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
XMLs parse/format round trip using lxml results in an XML document that
almost exactly matches the original (including comments).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We don't really need or want the extra info to be included in the CPU
model definitions in git, it's mostly useful for verifying the output of
the script. Let's store it in a separate file rather than in a comment
block of the CPU model definition itself.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Each CPU model with -v* suffix is defined as a standalone model copying
all attributes of the previous version. CPU model versions with an alias
are handled differently. The full definition is used for the alias and
the versioned model is created as an identical copy of the alias.
To avoid breaking migration compatibility of host-model CPUs all
versioned models are marked with <decode guest='off'/> so that they are
ignored when selecting candidates for host-model. It's not ideal but not
doing so would break almost all host-model CPUs as the new versioned CPU
models have all vmx-* features included since their introduction while
existing CPU models were updated later. This meas existing models would
be accompanied with a long list of vmx-* features to properly describe a
host CPU while the newly added CPU models would have those features
enabled implicitly and their list of features would be significantly
shorter. Thus the new models would always be better candidates for
host-model than the existing models.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
While the script for synchronizing CPU features expects a path to QEMU
source tree, this CPU model script insisted on getting a full patch to
cpu.c file, even though it could easily deduce it from the path to QEMU
source tree.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We don't change definitions of CPU models which were already included in
a libvirt release to maintain migration compatibility. Thus the script
can just skip existing models and save us from having to drop the
changes it would do to them.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When removing features unknown to QEMU (they have a different name or
are completely missing as they are not configurable by a user) I should
not have removed them from the list of features unknown to QEMU in the
script for synchronizing QEMU features to the CPU map.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When a CPU model is defined based on another model, we were completely
ignoring features marked as added to or removed from the original model
after it was released. For added features this is the right thing to do
as it will promote them to become normal features included in the new
model. But features marked as removed would become included in the new
model as well. We need to explicitly remove them as if they were never
included in the model.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Document which fields are inherited when a CPU model is based on another
model.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add two test images showing the use of 'data_file' and 'data_file_raw'
(although the latter is not detected by libvirt) so that we can see that
the qcow2 metadata parser and backing chain populators work correctly.
The example files were created by:
qemu-img create -f qcow2 -o data_file=raw,data_file_raw=true,preallocation=off datafile.qcow2 1k
qemu-img create -f qcow2 -o data_file=rawpreallocation=off -F qcow2 -b datafile.qcow2 qcow2datafile-datafile.qcow2
Note that 'data_file_raw' is mutually exclusive with backing images.
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Add the block infrastructure for detecting and landling the data file
for images and starting qemu with the configuration.
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In qcow2 header data file is represented by incompitible feature bit
and its path is saved to header extension table.
Thus, we implement here the logic similar to backing file probing.
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Introduce parsing and formatting of <dataStore> element. The <dataStore
represents a different storage volume meant for storing the actual
blocks of guest-visible data. The original disk source is then just a
metadata storage for any advanced features.
This currently works only for 'qcow2' images.
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The 'data-file' is a qcow2 feature which allows storing the actual data
outside of the qcow2 image.
Signed-off-by: Nikolai Barybin <nikolai.barybin@virtuozzo.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The previous example will cause the error like:
error: Options --file and --base64 are mutually exclusive
Reported-by: Yanqiu Zhang <yanqzhan@redhat.com>
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Virtio-serial-pci device is hot pluggable, loosen the restriction
and allow user to hot plug it.
Signed-off-by: shenjiatong <yshxxsjt715@163.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Update to v9.2.0-rc0-42-g3428a3894c
Apart from the changes below there are changes to CPU features reported
by qemu, some of which were reported multiple times previously which no
longer happens.
Notable changes:
- 'reconnect-ms' added and 'reconnect' deprecated for 'stream' variant
of 'netdev-add' backend
- 'BLOCK_IO_ERROR' event removed 'qom-path' parameter
- 'GraniteRapids-v2-x86_64-cpu' added
- 'sm3' hashing algorithm for 'luks' added
- 'acpi-generic-port' object added
- deprecated field 'loaded' of 'secret'/'secret_keyring'/'tls-creds*'
removed
- 'sh4eb' target added
- 'query-migrationthreads' command deprecated
- 'busnr' and 'x-pcie-ext-tag' attributes added for
'ICH9-LPC'/'PIIX4_PM'/'VGA'/'mch'/'pcie-root-port'/'qxl'/'vfio-pci'/
'virtio-*'/'vmware-svga'
devices
- 'stale-tm' property added for 'intel-iommu' device
Experimental features:
- 'device-sync-config' command added
As the addition of the 'reconnect-ms' property of the 'stream' network
backend happened along with deprecation of the 'reconnect' field which
was already in use by libvirt this patch also captures the change to the
new format.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'reconnect' field of 'stream' network backend type is about to be
deprecated so libvirt will need to start using 'reconnect-ms'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'stream' type for 'netdev-add' recently added support for
'reconnect-ms' which supersedes 'reconnect' (now deprecated). Add a
capability which will allow us to switch to the new property.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Historically the QMP schema lookup queries were grouped by the first
component of the query (which was also sorted), but not fully sorted.
This deteriorated over time. Re-group the query strings now that some
were added at the bottom.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When a VM is being migrated to a destination host it can be made
persistent on the destination by using '--persistent'. That may not
work as intended if '--xml' is used as well as that allows overriding
certain aspects of the VM xml, but does not involve the persistent
definition. In most cases users will need to supply also
'--persistent-xml' with the same set of modification.
Modify the man page to clarify the above so that users don't end up with
broken VM after migrating and restarting it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When a VM is being migrated to a destination host it can be made
persistent on the destination by using VIR_MIGRATE_PERSIST_DEST. That
may not work as intended if VIR_MIGRATE_PARAM_DEST_XML or the 'xmlin'
parameter is used as that allows overriding certain aspects of the VM
xml, but does not involve the persistent definition.
In most cases users will need to supply also VIR_MIGRATE_PARAM_PERSIST_XML
with the same set of modification.
Modify the man page to clarify the above so that users don't end up with
broken VM after migrating and restarting it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The original intention was to improve the behaviour of the
VIR_MIGRATE_PERSIST_DEST flag which makes the VM persistent after
migration on the destination when used with VIR_MIGRATE_PARAM_DEST_XML.
While it worked as intended with p2p migration where the migration is
driven from the virtqemud instance on the source of the migration, which
can distinguish between the user-provided input XML and the one fetched
from the source of the migration, it's not easily possible to achieve
the same behaviour with normal migration driven from the client library.
The approach also still had corner cases (originally deemed worth
changing) such as if the persistent definition was modified it would be
overwritten.
As there is no clear fix which would improve both styles of migrations
with no corner cases revert the change.
Upcoming commits will modify the documentation to add warning about the
use of VIR_MIGRATE_PERSIST_DEST with VIR_MIGRATE_PARAM_DEST_XML/xmlin
without using VIR_MIGRATE_PARAM_PERSIST_XML instead of a code fix.
This reverts commit 6a38559092.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
When virtio-(non-)transitional models were introduced, the
documentation was updated to include them; at the same time,
language was introduced indicating that using the existing
virtio model is no longer recommended.
This is unnecessarily harsh, and has resulted in people
incorrectly believing (through no fault of their own) that the
virtio model has been deprecated.
In reality, it's perfectly fine to use the virtio model as the
stress-free option that, while often not producing the ideal
PCI topology, will generally get the job done and work reliably
across libvirt versions and machine types.
Tweak the documentation so that it hopefully carries the
desired message across.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Some VMware guests have a boolean uefi.secureBoot.enabled. If found,
and it's set to "TRUE", and if it's a UEFI guest, then add this clause
into the domain XML:
<os firmware='efi'>
<firmware>
<feature enabled='yes' name='enrolled-keys'/>
<feature enabled='yes' name='secure-boot'/>
</firmware>
</os>
This approximates the meaning of this VMware flag.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Fixes: https://issues.redhat.com/browse/RHEL-67836
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The '-loadvm' commandline parameter has exactly the same semantics as
the HMP 'loadvm' command. This includes the selection of which block
device is considered to contain the 'vmstate' section.
Since libvirt recently switched to the new QMP commands which allow a
free selection of where the 'vmstate' is placed, snapshot reversion will
no longer work if libvirt's algorithm disagrees with qemu's. This is the
case when the VM has UEFI NVRAM image, in qcow2 format, present.
To solve this we'll use the QMP counterpart 'snapshot-load' to load the
snapshot instead of using '-loadvm'. We'll do this before resuming
processors after startup of qemu and thus the behaviour is identical to
what we had before.
The logic for selecting the images now checks both the snapshot metadata
and the VM definition. In case images not covered by the snapshot
definition do have the snapshot it's included in the reversion, but it's
fatal if the snapshot is not present in a disk covered in snapshot
metadata.
The vmstate is selected based on where it's present as libvirt doesn't
store this information.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Refactor the parts of qemuBlockGetNamedNodeData which fetch the names of
internal snapshots present in the on-disk state of QCOW2 images to also
extract the presence of the 'vmstate' section.
This requires conversion of the snapshot list to a hash table as we
always know the name of the snapshot that we're looking for, and the
hash table allows also storing of additional data which we'll use to
store the presence of the 'vmstate'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The internal snapshot code will use the 'snapshot-load' command so we
need to add the corresponding job type.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Libvirt currently loads snapshots via the '-loadvm' commandline option
but that uses the same logic as the 'loadvm' text monitor command used
to pick the disk image with the 'vmstate' section. Since libvirt now
implements our own logic to pick the 'vmstate' device it can happen that
we pick a different than qemu and thus qemu would fail to load the
snapshot. This happens currently on VMs with UEFI firmware with NVRAM
image in qcow2 format.
To fix this libvirt will need to use the 'snapshot-load' QMP command
instead of relying on '-savevm'.
Implement the monitor bits for 'snapshot-load'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The live VM snapshot code already does handle the NVRAM image when it's
in use, so we should also handle it when modifying/creating the
snapshots via qemu-img when inactive.
Add the handling to qemuSnapshotForEachQcow2 which is used for all
inactive operations.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Refactor the function to avoid recursive call to rollback and simplify
calling parameters.
To achieve that most of the fatal checks are extracted into a dedicated
loop that runs before modifying the disk state thus removing the need to
rollback altoghether. Since rollback is still necessary when creation of
the snapshot fails half-way through the rollback is extracted to handle
only that scenario.
Additionally callers would only pass the old 'try_all' argument as true
on all non-creation ("-c") modes. This means that we can infer it from
the operation instead of passing it as an extra argument.
This refactor will also make it much simpler to implement handling of
the NVRAM pflash backing file (in case it's qcow2) for internal
snapshots.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The functions are exclusively used in the snapshot module. Move and
rename them:
qemuDomainSnapshotForEachQcow2Raw -> qemuSnapshotForEachQcow2Internal
qemuDomainSnapshotForEachQcow2 -> qemuSnapshotForEachQcow2
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that it's unused except for the recursive call it can be dropped
from all of the call tree.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The 'virCommand' helpers already look up the full path to the binary in
PATH if it's not specified. This means that the qemu driver doesn't have
to lookup and store the path to 'qemu-img' in the conf object but rather
can be cleaned up to use this new infrastructure.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Get the JSON profile that the swtpm instance was created with from the
output of 'swtpm socket --tpm2 --print-info 0x20 --tpmstate ...'. Get the
name of the profile from the JSON and set it in the current and persistent
emulator descriptions as 'name' attribute and have the persistent
description stored with this update. The user should avoid setting this
'name' attribute since it is meant to be read-only. The following is
an example of how the XML could look like:
<profile source='local:restricted' name='custom:restricted'/>
If the user provided no profile node, and therefore swtpm_setup picked its
default profile, the XML may now shows the 'name' attribute with the name
of the profile. This makes the 'source' attribute now optional.
<profile name='default-v1'/>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Factor-out code related to adding the --tpmstate option to the swtpm
command line into its own function.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Run swtpm_setup with the --profile-name option if the user provided the
name of a profile. swtpm_setup will try to load the profile from
directories with local profiles and distro profiles and if no profile
by this name with appended '.json' suffix could be found there, it will
fall back to try to use an internal profile with the given name.
Also set the --profile-remove-disabled option if the user provided a value
in the remove_disabled attribute in the profile XML node.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add documentation for the TPM backend profile node and point the reader to
further documentation about TPM profiles available in the swtpm man page.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Extend the parser and XML builder with support for the profile parameter
and its remove_disabled attribute.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Extend the schema for the TPM emulator profile node. Require that the
profile the user provides is described in a 'source' attribute. An optional
remove_disabled attribute is also supported for swtpm to automatically
remove algorithms from the 'custom' profile if they are disabled by FIPS
mode on the host.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
To avoid passing TPM emulator parameters around individually, move them
into a structure and pass around the structure.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
While sending API requests that don't need any body, explicitly set
CURLOPT_INFILESIZE to 0.
Without this option, curl sends a chunked request with `Expect: 100-continue`
header. The client, in this case curl, expects a response from the server,
ch in this case, to respond within a timeout period.
If guest definition has a PCI passthrough device configuration,
cloud-hypervisor process cannot respond within above mentioned timeout.
Even if cloud-hypervisor responds after the timeout, curl cannot read
the response. Because of this, virsh request to create a guest, hangs. This
only happens while using "mshv" hypervisor.
By setting CURLOPT_INFILESIZE to O, curl drops the Expect header and
sychronously waits for server to respond.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Move HostdevHostSupportsPassthroughVFIO method to hypervisor to be
shared between qemu and ch drivers.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Move HostdevNeedsVFIO method to hypervisor to be reused between qemu
and ch drivers.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virtiofs 1.11 contains support for migration so update the 'Note' which
states that migration is not supported.
Additionally mention that VM snapshots don't save state of the files
shared via virtiofs so reverting is not a good idea.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In case when a management application will require to store the nvram in
a block device instead of a file libvirt needs to be able to set up the
block device.
This patch introduces support for setting up the block device by using
'qemu-img convert' to produce a qcow2-formatted block device.
The use of 'qcow2' is made mandatory as the UEFI firmware requires that
the NVRAM image has the exact expected size, which is almost impossible
with block devices. 'qcow2' also allows libvirt to detect wheher the
block device is formatted allowing file-like semantics.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The setup of nvram will later be extended to also support block-device
backed nvram, so extract the file-backed nvram setup steps from
'qemuPrepareNVRAM' into 'qemuPrepareNVRAMFile'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The nvram image can have any supported format and there's no technical
requirement of them having the same format. In fact the actual nvram
image doesn't necessarily need to have the same format as the template
if the user is willing to format it themselves (as libvirt is not going
to convert it).
Remove the nonsensical check and adjust tests. The test case required
swapping around the format in order to work properly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Basing the selection on the format of the actual NVRAM image makes no
sense as user may format the image themselves.
Additionally it doesn't make much sense to even limit the firmware
selection based on the nvram template itself. As format of the template
is given and firmware images don't really provide any choice.
Remove the limitation so that autoselection can pick a template
regardless of the selected format or template format.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Refuse situations where the user configures a different format for a
file-backed nvram than the template file has.
At this point it's still required that the NVRAM and firmware share
format, but that is going to be relaxed, thus we need to refuse
configurations that the code can't handle.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The code historically skipped the 'format' field for 'raw' images as we
didn't output it when no format support was present. Stop misleading and
output the format also for 'raw' images.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
As the 'format' field is meant to carry the format of the nvram image we
should output it even when the image is 'raw'.
Currently this is not a problem but later patches will allow mismatch
between the nvram format and loader format (as nothing really
technically requires them to be the same and this then could become
problem).
Modify the condition and update tests.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Currently the qemu firmware code weirdly depends on the 'format' field
of the nvram image itself to do the auto-selection process as well as
then uses it to declare the actual type to qemu.
As it's not technically required that the template and the on disk image
share the type introduce a 'templateFormat' field which will split off
from the shared purpose of the type and will be used for the selection
and instantiation process, while 'format' will be left for the actual
type of the on disk image.
This patch introduces the field, adds XML infrastructure as well as
plumbs it to the firmware bits.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The NVRAM template file may be autoselected same as the loader/firmware
image. Add a hint that this can occur and also that it doesn't
necessarily need to be from the 'qemu.conf' configured files.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Restructure the code to assign first (as this is simpler to refactor in
the future) and avoid mixing implicit value checks with explicit ones by
checking for _NONE.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The qemu driver does support qcow2 images for the firmware and nvram
pflash devices, but we do not do the full backing chain setup for them
as we don't expect that those images would actually have a backing
store. We don't tell that to qemu though which theoretically can lead to
qemu probing the backing store from the image itself. We don't want that
for now.
Deny qemu probing the backing store by installing a "terminator" empty
virStorageSource as 'backingStore' for pflash and nvram.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'qemuFirmwareEnsureNVRAM' which fills the NVRAM configuration bits which
may be missing was basing its decision to do something based on whether
the 'path' field was set. This is insufficient if remote storage is to
be considered.
Use 'virStorageSourceIsEmpty()' instead as that properly considers
remote filesystems and explain why the source is unref'd when the
function decides to rewrite the config.
The 'firmware-auto-efi-format-nvram-qcow2-network-nbd' is modified to
omit filling the 'path' field, which without this fix would result in
the nvram to be reset to a local file.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virFileRewrite()' which is used to setup the NVRAM image if it doesn't
exist or when it is requested by the user forcibly replaces the
destination file by the file it creates. For block devices this
overwrites the device node file or the symlink pointing to the device
node by a regular file instead of formatting it.
As this not only makes the VM fail to start but also breaks user's /dev/
filesystem forbid it for now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The rule catches incorrect attempts to use internal references,
but doesn't guide the developer hitting a failure towards the
not exactly obvious acceptable alternatives.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
When using recent Fedora and RHEL versions, the manual setup that
is otherwise necessary to enable the module can be replaced with
executing a single command.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The page contains some confusing information, especially around
limitations that supposedly only affect one of the two variants,
and goes into what is arguably an unnecessary amount of detail
when it comes to its inner workings.
We can make the page a lot shorter and snappier without
affecting its usefulness, so let's do just that.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Problem with qemu_domain.c is that it's constantly growing. But
there are few options for improvement. For instance, validation
functions were moved out and now live in qemu_validate.c. We can
do the same for PostParse functions, though since PostParse may
modify domain definition, some functions need to be exported from
qemu_domain.c.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When qemuDomainDeleteDevice() gets "DeviceNotFound" error it is a
special case as we're trying to remove a device which does not exists
any more. Such occasion is indicated by the return value -2.
Callers of the aforementioned function ought to base their behaviour on
the return value. However not all callers take as much care for the
return value as one could realistically anticipate.
Follow the usual direction of removing possible backend object (in case
of character devices), remove the device from its XML without waiting
for the device removal from QEMU (since it is already not there) and
basically follow the same algorithm as there is when the device was
removed, skipping over the wait for the device removal.
The overall return value also needs to be adjusted since
qemuDomainDeleteDevice() does not set an error on the -2 return value
and would otherwise trigger an unknown error being reported to the user
or management application.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When moving function and/or renaming them sometimes corresponding
change to corresponding header file is not done. This leaves us
with functions that are declared in header files, but nowhere
implemented. Drop such declarations.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function the comment is referring to is
virNetServerClientPrivNew() not virNetServerClintPrivNew(). The
latter doesn't even exist.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
As requested on the libvirt users list I am adding this mention to the
apps page.
Reported-by: Erik Huelsmann <ehuels@gmail.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This switches to newer freebsd 14.1 and implements the new RUN_PIPELINE
behaviour introduced by Daniel.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
When removing a socket in virCHMonitorClose() fails, a warning is
printed. But it doesn't contain errno nor g_strerror() which may
shed more light into why removing of the socket failed.
Oh, and since virCHMonitorClose() is registered as autoptr
cleanup for virCHMonitor() it may happen that virCHMonitorClose()
is called with mon->socketpath allocated but file not existing
yet (see virCHMonitorNew()). Thus ignore ENOENT and do not print
warning in that case - the file doesn't exist anyways.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virCHMonitorClose() is meant to be called when monitor to
cloud-hypervisor process closes. It removes the socket and frees
string containing path to the socket.
In general, there is a problem with the following pattern:
if (var) {
do_something();
g_free(var);
}
because if the pattern executes twice the variable is freed
twice. That's why we have VIR_FREE() macro. Well, replace plain
g_free() with g_clear_pointer(). Mind you, this is NOT a
destructor where clearing pointers is needless.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If QEMU supports multi boot device make use of it instead of using the
single boot device machine parameter.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Let us introduce the xml and reply files for QEMU 9.2.0 on s390x.
A QEMU at commit v9.1.0-1348-g11b8920ed2 was used to generate this data.
Signed-off-by: Shalini Chellathurai Saroja <shalini@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Add capability QEMU_CAPS_VIRTIO_CCW_DEVICE_LOADPARM to detect multi boot
device support in QEMU by checking the virtio-blk-ccw device property
existence of loadparm.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Let me preface this with stating the obvious: documentation on
QoS in OVS is very sparse. This is all based on my observation
and OVS codebase analysis.
For the following QoS setting:
<bandwidth>
<inbound average="512" peak="1024" burst="32"/>
</bandwidth>
the following QoS setting is generated into OVS (NB, our XML
values are in KiB/s, OVS has them in bits/s):
# ovs-vsctl list qos
_uuid : a087226b-2da6-4575-ad4c-bf570cb812a9
external_ids : {ifname=vnet1, vm-id="7714e6b5-4885-4140-bc59-2f77cc99b3b5"}
other_config : {burst="262144", max-rate="8192000", min-rate="4096000"}
queues : {0=655bf3a7-e530-4516-9caf-ec9555dfbd4c}
type : linux-htb
from which the following topology is generated:
# for i in qdisc class; do tc -s -d -g $i show dev vnet1; done
qdisc htb 1: root refcnt 2 r2q 10 default 0x1 direct_packets_stat 0 ver 3.17 direct_qlen 1000
Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
+---(1:fffe) htb rate 8192Kbit ceil 8192Kbit linklayer ethernet burst 1499b/1mpu 60b cburst 1499b/1mpu 60b level 7
| Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
| backlog 0b 0p requeues 0
|
+---(1:1) htb prio 0 quantum 51200 rate 4096Kbit ceil 8192Kbit linklayer ethernet burst 32Kb/1mpu 60b cburst 32Kb/1mpu 60b level 0
Sent 2186 bytes 16 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Long story short, the default class (1:) for an OVS interface has
average and peak set exactly as requested. But since it's nested
under another class (1:fffe), it can borrow unused bandwidth. And
the parent is set to have rate = ceil = peak from our XML. From
[1]: htb_tc_install() calls htb_parse_qdisc_details__() which
sets: 'hc->min_rate = hc->max_rate;' and then calls
htb_setup_class_(..., tc_make_handle(1, 0xfffe), tc_make_handle(1, 0), &hc);
to set up the top parent class.
In other words - the interface is set up to so that it can always
consume 'peak' bandwidth and there is no way for us to set it up
differently. It's too late to deny setting 'peak' different to
'average' at XML validation phase so do the next best thing -
throw a warning, just like we do in case <bandwidth/> is set for
an unsupported <interface/> type.
1: https://github.com/openvswitch/ovs/blob/main/lib/netdev-linux.c#L5039
Resolves: https://issues.redhat.com/browse/RHEL-53963
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
If a Q35 domain has huge number of vCPUS (over 255, currently), then
it needs IOMMU with Extended Interrupt Mode enabled (see check in
qemuValidateDomainVCpuTopology()).
Well, we already add some devices and to other tricks when
parsing new domain XML. Might as well add IOMMU device if above
condition is met.
Resolves: https://issues.redhat.com/browse/RHEL-65844
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If a Q35 domain has huge number of vCPUS (over 255, currently), then
it needs IOMMU with Extended Interrupt Mode enabled (see check in
qemuValidateDomainVCpuTopology()).
Well, we already add some devices and to other tricks when
parsing new domain XML. Might as well turn the EIM on for IOMMU
device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It only errors out when presented with a non-array, but we do check
it everywhere else.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In virJSONValueFromJsonC, the return value of virJSONValueFromJsonC
was not checked in one case.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In the rare case where int and long long are not the same size,
the multiplication of an int variable and an int constant might
overflow. Cast the constant to long long to avoid this.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: baa4edfb79
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
With upcoming v0.10 swtpm (commit
aa483aeb6d),
file locking with "lock" option is now supported and reflected in
"tpmstate-opt-lock" capability.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
When swtpm reports "nvram-backend-dir", it can accepts a single file or
block device where TPM state will be stored. --tpmstate must be
backend-uri=file://<path>.
Teach the storage to use custom directory or file source location.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Mechanically replace existing 'storagepath' with 'source_path', as the
following patches introduce <source path='..'> configuration.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Domain capabilities include information about support for various
devices and models.
Panic devices are not included in the output which means that management
applications need to include the logic for choosing the right device
model or request a default model and try defining such a domain.
Add reporting of panic device models into the domain capabilities based
on the logic in qemuValidateDomainDefPanic() and also report whether
panic devices are supported based on whether at least one model is
supported. That way consumers of the domain capability XML can
differentiate between libvirt not reporting the panic device models or
no model being supported.
Resolves: https://issues.redhat.com/browse/RHEL-65187
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The recent attempt to fix the attributes used wrong mode for some
directories used by the QEMU driver. Only dbus and swtpm directories use
770, all other directories are created with 755.
Fixes: 961fb8944d
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Add an error message for the rare case if json_tokener_new
fails (allocation failure) and guard any use of json_tokener_free
where tok might be NULL (this was possible in libvirt-nss
when the json file could not be opened).
https://gitlab.com/libvirt/libvirt/-/issues/581
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reported-by: Simon Pilkington
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
The support for the 'sgio' attribute for SCSI-backed devices was dropped
as there wasn't really ever any upstream support for it.
The docs do state that support for this depends on the hypervisor
itself, but we can be more clear that there is no hypervisor which does
support it.
There is also a suggestion to use 'sgio' instead of 'rawio' as being
more "secure" but since it no longer works drop this suggestion.
Resolves: https://issues.redhat.com/browse/RHEL-65268
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Completely remove use of g_malloc (without zeroing of the allocated
memory) and forbid further use.
Replace use of g_malloc0 in cases where the variable holding the pointer
has proper type.
In all of the above cases we can use g_new0 instead.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The error message from 'json-c' was passed along without any libvirt
string which makes it hard to find in the source and isn't exactly clear
when present in logs:
libvirtd[843]: internal error : invalid utf-8 string
Prefix the message with 'failed to parse JSON'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We should include maximum physical address size in the CPU definition
created by virConnectBaselineHypervisorCPU only if we know the value for
all input CPUs. Otherwise we would create a CPU definition that is not
usable on all hosts from which we gathered the CPU info.
https://issues.redhat.com/browse/RHEL-24850
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Directories which we dynamically create in %{_rundir} with non-default
attributes (i.e., the owner differs from root:root and/or mode is not
755) fail RPM verification. We should properly declare the expected
ownership and mode in the specfile.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This reverts commit 42ab0148dd.
This patch was supposed to fix the checksum of dhcp response packets
by setting it to 0 (because having a non-0 but incorrect checksum was
causing the packets to be droppe on FreeBSD guests).
Early testing was positive, but after the patch was pushed upstream
and more people could test it, it turned out that while it fixed the
dhcp checksum problem for virtio-net interfaces on FreeBSD and
OpenBSD, it also *broke* dhcp checksums for the e1000 emulated NIC on
*all* guests (but not e1000e).
So we're reverting this fix and looking for something more universal
to be included in the next release.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The docs for submitting a patch describe using your "Legal Name" with
the Signed-off-by line.
In recent times, there's been a general push back[1] against the notion
that use of Signed-off-by in a project automatically requires / implies
the use of legal ("real") names and greater awareness of the downsides.
Full discussion of the problems of such policies is beyond the scope of
this commit message, but at a high level they are liable to marginalize,
disadvantage, and potentially result in harm, to contributors.
TL;DR: there are compelling reasons for a person to choose distinct
identities in different contexts & a decision to override that choice
should not be taken lightly.
A number of key projects have responded to the issues raised by making
it clear that a contributor is free to determine the identity used in
SoB lines:
* Linux has clarified[2] that they merely expect use of the
contributor's "known identity", removing the previous explicit
rejection of pseudonyms.
* CNCF has clarified[3] that the real name is simply the identity
the contributor chooses to use in the context of the community
and does not have to be a legal name, nor birth name, nor appear
on any government ID.
Since we have no intention of ever routinely checking any form of ID
documents for contributors[4], realistically we have no way of knowing
anything about the name they are using, except through chance, or
through the contributor volunteering the information. IOW, we almost
certainly already have people using pseudonyms for contributions.
This proposes to accept that reality and eliminate unnecessary friction,
by following Linux & the CNCF in merely asking that a contributors'
commonly known identity, of their choosing, be used with the SoB line.
[1] Raised in many contexts at many times, but a decent overall summary
can be read at https://drewdevault.com/2023/10/31/On-real-names.html
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d4563201f33a022fc0353033d9dfeb1606a88330
[3] https://github.com/cncf/foundation/blob/659fd32c86dc/dco-guidelines.md
[4] Excluding the rare GPG key signing parties for regular maintainers
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Many years ago (April 2010), soon after "vhost" in-kernel packet
processing was added to the virtio-net driver, people running RHEL5
virtual machines with a virtio-net interface connected via a libvirt
virtual network noticed that when vhost packet processing was enabled,
their VMs could no longer get an IP address via DHCP - the guest was
ignoring the DHCP response packets sent by the host.
(I've been informed by danpb that the same issue had been encountered,
and "fixed" even earlier than that, in 2006, with Xen as the
hypervisor.)
The "gory details" of the 2010 discussion are chronicled here:
https://lists.isc.org/pipermail/dhcp-hackers/2010-April/001835.html
but basically it was because packet checksums weren't being fully
computed on the host side (because QEMU on the host and the NIC driver
in the guest had agreed between themselves to turn off checksums
because they were unnecessary due to the "link" between the two being
entirely in local memory rather than an error-prone physical cable),
but
1) a partial checksum was being put into the packets at some point by
"someone"
2) the "don't use checksums" info was known by the guest kernel, which
would properly ignore the "bad" checksum), and
3) the packets were being read by the dhclient application on the
guest side with a "raw" socket (thus bypassing the guest kernel UDP
processing that would have known the checksum was irrelevant and
ignore it)),
The "fix" for this ended up being two-tiered:
1) The ISC DHCP package (which contains the aforementioned dhclient
program) made a fix to their dhclient code which caused it to accept
packets anyway even if they didn't have a proper checksum (NB: that's
not a full explanation, and possibly not accurate). This remedied the
problem for guests with an updated dhclient. Here is the code with the
fix to ISC DHCP:
https://github.com/isc-projects/dhcp/blob/master/common/packet.c#L365
This eliminated the issue for any new/updated guests that had the
fixed dhclient, but it didn't solve the problem for existing/old guest
images that didn't/couldn't get their dhclient updated. This brings us
to:
2) iptables added a new "CHECKSUM" target and "--checksum-fill"
action:
http://patchwork.ozlabs.org/patch/58525/
and libvirt added an iptables rule for each virtual network to match
DHCP response packets and perform --checksum-fill. This way by the
time dhclient on the guest read the raw packet, the checksum would be
corrected, and the packet would be accepted. This was pushed upstream
in libvirt commit v0.8.2-142-gfd5b15ff1a.
The word at the time from those more knowledgeable than me was that
the bad checksum problem was really specific to ISC's dhclient running
on Linux, and so once their fix was in use everywhere dhclient was
used, bad checksums would be a thing of the past and the
--checksum-fill iptables rules would no longer be needed (but would
otherwise be harmless if they were still there).
(Plot twist: the dhclient code in fix (1) above apparently is on a
Linux-only code path - this is very important later!)
Based on this information (and also due to the opinion that fixing it
by having iptables modify the packet checksum was really the wrong way
to permanently fix things, i.e. an "ugly hack"), the nftables
developers made the decision to not implement an equivalent to
--checksum-fill in nftables. As a result, when I wrote the nftables
firewall backend for libvirt virtual networks earlier this year, it
didn't add in any rule to "fix" broken UDP checksums (since there was
apparently no equivalent in nftables and, after all, that was fixed
somewhere else 14 years ago, right???)
But last week, when Rich Jones was doing routine testing using a Fedora
40 host (the first Fedora release to use the nftables backend of libvirt's
network driver by default) and a FreeBSD guest, for "some strange
reason", the FreeBSD guest was unable to get an IP address from DHCP!!
https://www.spinics.net/linux/fedora/libvirt-users/msg14356.html
A few quick tests proved that it was the same old "bad checksum"
problem from 2010 come back to haunt us - it wasn't a Linux-only issue
after all.
Phil Sutter and Eric Garver (nftables people) pointed out that, while
nftables doesn't have an action that will *compute* the checksum of a
packet, it *does* have an action that will set the checksum to 0, and
suggested we try adding a "zero the checksum" rule for dhcp response
packets to our nftables ruleset. (Why? Because a checksum value of 0
in a IPv4 UDP packet is defined by RFC768 to mean "no checksum
generated", implying "checksum not needed"). It turns out that this
works - dhclient properly recognizes that a 0 checksum means "don't
bother with the checksum", and accepts the packet as valid.
So to once again fix this timeless bug, this patch adds such a
checksum zeroing rule to the nftables rules setup for each virtual
network.
This has been verified (on a Fedora 40 host) to fix DHCP with FreeBSD
and OpenBSD guests, while not breaking it for Fedora or Windows (10)
guests.
Fixes: b89c4991da
Reported-by: Rich Jones <rjones@redhat.com>
Fix-Suggested-by: Eric Garver <egarver@redhat.com>
Fix-Suggested-by: Phil Sutter <psutter@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-10-25 12:00:52 -04:00
956 changed files with 269439 additions and 62169 deletions
relaxed Relax constraints on timers on, off :since:`1.0.0 (QEMU 2.0)`
vapic Enable virtual APIC on, off :since:`1.1.0 (QEMU 2.0)`
spinlocks Enable spinlock support on, off; retries - at least 4095 :since:`1.1.0 (QEMU 2.0)`
@@ -2078,13 +2100,13 @@ are:
vendor_id Set hypervisor vendor id on, off; value - string, up to 12 characters :since:`1.3.3 (QEMU 2.5)`
frequencies Expose frequency MSRs on, off :since:`4.7.0 (QEMU 2.12)`
reenlightenment Enable re-enlightenment notification on migration on, off :since:`4.7.0 (QEMU 3.0)`
tlbflush Enable PV TLB flush support on, off:since:`4.7.0 (QEMU 3.0)`
tlbflush Enable PV TLB flush support on, off; direct - on,off; extended - on,off :since:`4.7.0 (QEMU 3.0), direct and extended modes 11.0.0 (QEMU 7.1.0)`
ipi Enable PV IPI support on, off :since:`4.10.0 (QEMU 3.1)`
evmcs Enable Enlightened VMCS on, off :since:`4.10.0 (QEMU 3.1)`
avic Enable use Hyper-V SynIC with hardware APICv/AVIC on, off :since:`8.10.0 (QEMU 6.2)`
emsr_bitmap Avoid unnecessary updates to L2 MSR Bitmap upon vmexits. on, off :since:`10.7.0 (QEMU 7.1)`
xmm_input Enable XMM Fast Hypercall Input on, off :since:`10.7.0 (QEMU 7.1)`
File diff suppressed because it is too large
Load Diff
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.