by also expecting the ".scope" part and trying the next entry if it is
not present instead of immediately failing.
It's still unexpected to encounter such entries, so keep the log line.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
On a hybrid cgroup system, the /proc/<PID>/cgroup file looks like
> 13:pids:/qemu.slice/110.scope
> 12:perf_event:/
> 11:devices:/qemu.slice
> 10:misc:/
> 9:hugetlb:/
> 8:freezer:/
> 7:cpu,cpuacct:/qemu.slice/110.scope
> 6:memory:/qemu.slice/110.scope
> 5:rdma:/
> 4:cpuset:/
> 3:blkio:/qemu.slice
> 2:net_cls,net_prio:/
> 1:name=systemd:/qemu.slice/110.scope
> 0::/qemu.slice/110.scope
but the order doesn't seem to be deterministic, so it can happen that
an entry like '11:devices:/qemu.slice' is the first to match the
'/qemu.slice' part, which previously made the code expect to find the
VMID.
To improve detection, as a first step, match the trailing slash too.
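The matching described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the function name vmid_from_cgroup and its "return 0 when nothing matches" convention are assumptions made for the example. It matches '/qemu.slice/' including the trailing slash and only accepts an entry when digits followed by ".scope" come next.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: extract the VMID from the contents of
 * /proc/<PID>/cgroup. Returns 0 if no usable entry is found.
 */
static unsigned long vmid_from_cgroup(const char *buf)
{
    static const char prefix[] = "/qemu.slice/";
    const char *p = buf;

    /* The trailing slash makes entries like '11:devices:/qemu.slice' not match. */
    while ((p = strstr(p, prefix)) != NULL) {
        p += sizeof(prefix) - 1;

        if (!isdigit((unsigned char)*p))
            continue; /* no VMID here, try the next entry */

        char *end = NULL;
        unsigned long vmid = strtoul(p, &end, 10);

        /* Only accept "<digits>.scope"; otherwise keep looking. */
        if (vmid != 0 && strncmp(end, ".scope", 6) == 0)
            return vmid;
    }
    return 0;
}
```

With the hybrid cgroup listing above, any entry order yields 110, because the bare '/qemu.slice' lines are never considered.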
Reported in the community forum:
https://forum.proxmox.com/threads/129320/post-571654
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
This is the single remaining user of the id argument. The id argument
is a Proxmox-specific extension to QEMU, which we'd like to drop to
reduce our differences with upstream QEMU.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
this is functionally the same, but sending SIGTERM has the ugly side
effect of printing the following to the log:
> QEMU[<pid>]: kvm: terminating on signal 15 from pid <pid> (/usr/sbin/qmeventd)
while sending a QMP quit command does not.
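For illustration, sending the QMP quit command over an already negotiated QMP connection could look like the sketch below. The helper name send_qmp_quit and the newline terminator are assumptions for this example; only the 'quit' command itself is part of QMP.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* The QMP 'quit' command as JSON; the trailing newline is optional. */
static const char QMP_QUIT[] = "{\"execute\":\"quit\"}\n";

/*
 * Hypothetical helper: write the complete command to a connected QMP
 * file descriptor. Returns 0 on success, -1 on error (errno is set).
 */
static int send_qmp_quit(int fd)
{
    size_t len = strlen(QMP_QUIT);
    size_t off = 0;

    while (off < len) {
        ssize_t n = write(fd, QMP_QUIT + off, len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted, retry */
            return -1;
        }
        off += (size_t)n;
    }
    return 0;
}
```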
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Currently, the 'forced_cleanup' (sending SIGKILL to the QEMU process)
is intended to be triggered 5 seconds after sending the initial shutdown
signal (SIGTERM), which is sadly not enough for some setups.
Accidentally, it could be triggered earlier than 5 seconds, if a
SIGALRM triggers in the timespan directly before setting it again.
Also, this approach means that, depending on when machines are shut
down, their forced cleanup may happen after 5 seconds or at any time
after that, if new VMs are shut off in the meantime.
Improve this situation by reworking the way we deal with this cleanup.
We save the pidfd and the time (including the timeout) in the Client,
and pass a timeout of 10 seconds to 'epoll_wait', after which a
forced_cleanup is triggered.
Remove entries from the forced_cleanup list when that entry is killed,
or when the normal cleanup took place.
To improve the shutdown behaviour, increase the default timeout to 60
seconds, which should be enough, but add a commandline toggle where
users can set it to a different value.
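The bookkeeping described above might be sketched as follows. The struct layout, the name take_expired and the plain array are assumptions for this example and not the actual data structures. The idea is that after every 'epoll_wait' return (event or timeout), entries whose deadline has passed are taken off the list so the caller can send SIGKILL to them.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical entry: one pending forced cleanup. */
struct cleanup_entry {
    int pidfd;        /* pidfd of the QEMU process */
    time_t deadline;  /* time of the shutdown request plus the timeout */
};

/*
 * Move the pidfds of all entries with deadline <= now into 'out' and
 * compact the remaining entries in place. 'out' must have room for
 * *count items. Returns the number of expired entries.
 */
static size_t take_expired(struct cleanup_entry *list, size_t *count,
                           time_t now, int *out)
{
    size_t kept = 0, taken = 0;

    for (size_t i = 0; i < *count; i++) {
        if (list[i].deadline <= now)
            out[taken++] = list[i].pidfd;  /* due: caller sends SIGKILL */
        else
            list[kept++] = list[i];        /* not due yet: keep waiting */
    }
    *count = kept;
    return taken;
}
```

Because each entry carries its own deadline, shutting down another VM in the meantime no longer shifts the cleanup time of earlier ones.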
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
In most circumstances a pidfd gets closed automatically once the child
dies, and that *should* be guaranteed by us sending SIGKILL - however,
it seems that sometimes that doesn't happen, leading to leaked file
descriptors[0].
Also add a small note to verbose mode showing when the late cleanup
actually happens, which helped during debugging.
[0] https://forum.proxmox.com/threads/cannot-shutdown-vm.83911/
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
If one tried to use -v in a systemd service, stdout would not be line
buffered under systemd and no output would appear (until the buffer
was full).
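One common way to get per-line output in this situation is to request line buffering explicitly before anything is written to stdout. The sketch below uses the standard setvbuf call; the wrapper name is an assumption for this example, and the commit may solve it differently.

```c
#include <stdio.h>

/*
 * Hypothetical helper: make stdout line buffered even when it is not
 * connected to a terminal. Call it before the first write to stdout.
 * Returns 0 on success, nonzero on failure.
 */
static int enable_line_buffered_stdout(void)
{
    return setvbuf(stdout, NULL, _IOLBF, 0);
}
```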
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
'alarm' is used to schedule an additional cleanup round 5 seconds after
sending SIGTERM via terminate_client. This then sends SIGKILL via a
pidfd (if supported by the kernel) or directly via kill, making sure
that the QEMU process is *really* dead and won't be left behind in an
undetermined state.
This shouldn't be an issue under normal circumstances, but can help
avoid dead processes lying around if QEMU hangs after SIGTERM.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
We take care of killing QEMU processes when a guest shuts down manually.
QEMU will not exit itself, if started with -no-shutdown, but it will
still emit a "SHUTDOWN" event, which we await and then send SIGTERM.
This additionally allows us to handle backups in such situations. A
vzdump instance will connect to our socket and identify itself as such
in the handshake, sending along a VMID which will be marked as backing
up until the file handle is closed.
When a SHUTDOWN event is received while the VM is backing up, we do not
kill the VM. And when the vzdump handle is closed, we check if the
guest has started up since, and only if it's determined to still be
turned off, we then finally kill QEMU.
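The decision logic from the two paragraphs above can be condensed into the following sketch. The enum and function names are assumptions for this example; the actual code tracks this state in the Client struct.

```c
#include <stdbool.h>

/* Hypothetical result type for the two decision points. */
enum shutdown_action {
    ACTION_TERMINATE, /* kill QEMU now */
    ACTION_DEFER,     /* backup running: wait for vzdump to disconnect */
    ACTION_NONE       /* guest is running again: leave it alone */
};

/* Called when QEMU emits a SHUTDOWN event. */
static enum shutdown_action on_shutdown_event(bool backing_up)
{
    return backing_up ? ACTION_DEFER : ACTION_TERMINATE;
}

/* Called when the vzdump handle for a VM is closed. */
static enum shutdown_action on_backup_closed(bool guest_running)
{
    return guest_running ? ACTION_NONE : ACTION_TERMINATE;
}
```

Whether the guest is running again would be determined with 'query-status' before on_backup_closed is evaluated.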
We cannot wait for QEMU directly to finish the backup (i.e. with
query-backup), as that would kill the VM too fast for vzdump to send the
last 'query-backup' to mark the backup as successful and done.
For handling 'query-status' messages sent to QEMU, a state-machine-esque
protocol is implemented into the Client struct (ClientState). This is
necessary, since QMP works asynchronously, and results arrive on the
same channel as events and even the handshake.
For referencing QEMU Clients from vzdump messages, they are kept in a
hash table. This requires linking against glib.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
It's really not nice if so many files (source code, meta-files, …)
linger around in the top-level directory.
Also, clean up the build a bit, i.e., use LDFLAGS, as dpkg-buildpackage
can set some LDFLAGS, so it'd be nice if both CFLAGS and LDFLAGS have
the same (related) ones.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>