mirror of git://git.proxmox.com/git/pve-guest-common.git synced 2024-12-27 03:21:36 +03:00
74 Commits

Author SHA1 Message Date
Wolfgang Link
d869a19c9e Cleanup for stateless jobs.
If a VM configuration has been manually moved or recovered by HA,
there is no status on this new node.
In this case, the replication snapshots still exist on the remote side.
It must be possible to remove a job without status,
otherwise, a new replication job on the same remote node will fail
and the disks will have to be manually removed.
When searching through the sorted_volumes generated from the VMID.conf,
we can be sure that every disk will be removed in the event
of a complete job removal on the remote side.

In the end, remote_prepare_local_job calls prepare on the remote side.
2018-05-09 15:10:30 +02:00
Dietmar Maurer
edd61f2b3a Replication.pm: code cleanup 2018-04-16 10:52:24 +02:00
Dietmar Maurer
c1797f7a4d PVE/Replication.pm: fix error message 2018-04-16 10:48:49 +02:00
Wolfgang Link
ce22af0895 fix #1694: make failure of snapshot removal non-fatal
In certain high-load scenarios ANY ZFS operation can block,
including registering an (async) destroy.
Since ZFS operations are implemented via ioctl's,
killing the user space process
does not affect the waiting kernel thread processing the ioctl.

Once "zfs destroy" has been called, killing it does not say anything
about whether the destroy operation will be aborted or not.
Since running into a timeout effectively means killing it,
we don't know whether the snapshot exists afterwards or not.
We also don't know how long it takes for ZFS to catch up on pending ioctls.

Given the above problem, we must not die when deleting a no
longer needed snapshot fails (under a timeout) after an otherwise
successful replication. Since we retry on the next run anyway, this is
not problematic.

The snapshot deletion error will be logged in the replication log
and the syslog/journal.
2018-04-16 10:40:48 +02:00
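The reasoning in this commit message (a timed-out destroy leaves the snapshot in an unknown state, so the failure is logged and retried later rather than treated as fatal) can be illustrated with a minimal sketch. This is not the PVE implementation, which is written in Perl; it is written in Python here, and the function name `remove_snapshot_nonfatal`, the logger name, and the command passed in are all hypothetical.

```python
import logging
import subprocess

# Hypothetical logger name; the real code logs to the replication log and syslog.
log = logging.getLogger("replication-sketch")


def remove_snapshot_nonfatal(cmd, timeout):
    """Run a snapshot-removal command without letting a failure abort the caller.

    Returns True if the command exited successfully, False otherwise.
    """
    try:
        subprocess.run(
            cmd,
            check=True,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError) as err:
        # After a timeout we cannot tell whether the snapshot still exists,
        # so we only log the problem and rely on the next run to retry.
        log.warning("snapshot removal failed, will retry on next run: %s", err)
        return False
```

A caller would invoke this after an otherwise successful replication and carry on regardless of the return value, since the cleanup is attempted again on the next run.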
Thomas Lamprecht
c8a71da5ce vzdump: add common log sub-method
Add a general log method here which supports passing on the "log to
syslog too" functionality and makes it clearer what each
parameter of logerr and loginfo means.

Further, we can now also log with a 'warn' level, which can be
useful to notify a backup user of a possible problem which isn't an
error per se, but may need the user's attention.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2017-12-15 12:11:22 +01:00
Thomas Lamprecht
8044127d7d vzdump: allow all defined log levels
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2017-12-15 12:11:22 +01:00
Wolfgang Link
ac02a68e07 Remove noerr from replication.
We will handle these errors in the API and decide what to do.

Reviewed-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Acked-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2017-12-13 14:51:34 +01:00
Wolfgang Bumiller
81228d280f replication: purge states: verify the vmlist
Instead of clearing out the local state if the last
cfs_update failed.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
2017-10-17 14:00:39 +02:00
Wolfgang Link
aa0d516fc5 Add logfunc in storage_migration.
This will redirect export and import output to the correct log, instead of sending it to the syslog.
2017-10-16 15:00:57 +02:00
Thomas Lamprecht
047ee481a6 VZDump/Plugin: avoid cyclic dependency
pve-guest-common is above qemu-server, pve-container and thus also
pve-manager in the package hierarchy.
The latter hosts PVE::VZDump, so using it here adds a cyclic
dependency between pve-manager and pve-guest-common.

Move the log method to the base plugin class and inline the
run_command function directly into the plugin's cmd method.

pve-manager's PVE::VZDump may then use this plugin's static log
function instead of its own copy.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2017-09-21 09:48:08 +02:00
Thomas Lamprecht
71dd5d907b AbstractMigrate: remove unused obsolete variables 2017-09-20 12:39:11 +02:00
Thomas Lamprecht
ee966a3f7a AbstractMigrate: do not overwrite global signal handlers
Perl's 'local' must either be used in front of each $SIG{...}
assignment or the variables must be put in a list, else it affects
only the first variable and the rest are *not* in local context.

This may cause weird behaviour where daemons seemingly do not get
terminating signals delivered correctly and thus may not shut down
gracefully anymore.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2017-09-07 10:32:08 +02:00
Alwin Antreich
4c3144eaa6 Fix #1480: die if snapshot name is not found before set_lock is used
Signed-off-by: Alwin Antreich <a.antreich@proxmox.com>
2017-09-01 09:06:24 +02:00
Wolfgang Bumiller
d91bac5053 replication: we must call storage_migrate with with_snapshots true 2017-07-03 11:58:41 +02:00
Thomas Lamprecht
23ca78cd25 replication job_status: add get_disabled parameter
allows the API/frontend to get the disabled jobs more easily

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2017-06-29 10:42:19 +02:00
Dietmar Maurer
b3ed460ed0 Revert "Add guest type at find_local_replication_job"
This reverts commit 914b6647a4.

No longer required.
2017-06-29 07:27:16 +02:00
Dietmar Maurer
6358ffe1cb PVE::Replication - do not use $jobcfg->{vmtype} 2017-06-29 07:26:51 +02:00
Wolfgang Link
914b6647a4 Add guest type at find_local_replication_job
We need this at migration time.
2017-06-28 14:29:45 +02:00
Dietmar Maurer
40bcf6526b fix previous commit 2017-06-28 12:05:18 +02:00
Dietmar Maurer
22ce136731 replication: improve schedule_job_now
do not modify anything if there is no state
2017-06-28 12:01:50 +02:00
Wolfgang Bumiller
b90dc712c5 replication: add schedule_job_now helper 2017-06-28 11:54:11 +02:00
Wolfgang Bumiller
621b955fb8 replication: sort time stamps numerically 2017-06-28 09:52:17 +02:00
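The commit message does not say how the time stamps were sorted before, so the following is only an illustration of the general pitfall that numeric sorting avoids: epoch values compared as strings sort by character, not by value. The sample values are made up, and the sketch is in Python rather than the project's Perl.

```python
# Made-up epoch time stamps stored as strings, with differing digit counts.
timestamps = ["1498636800", "999999999", "1498640400"]

# Plain string comparison orders by character, so "9..." lands after "1...".
lexical = sorted(timestamps)

# Comparing by integer value gives the chronological order.
numeric = sorted(timestamps, key=int)

print(lexical)  # ['1498636800', '1498640400', '999999999']
print(numeric)  # ['999999999', '1498636800', '1498640400']
```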
Dietmar Maurer
1b82f17117 replication: pass $noerr to run_replication_nolock 2017-06-28 07:54:11 +02:00
Wolfgang Link
14849765e5 Add new function delete_guest_states. 2017-06-27 12:51:44 +02:00
Wolfgang Bumiller
fd844180a7 replication: don't sync to offline targets on error states
There's no point in trying to replicate to a target node
which is offline. Note that if we're not already in an
error state we do still give it a try in order for this to
get logged as an error at least once.
2017-06-27 12:13:24 +02:00
Wolfgang Bumiller
3385399339 replication: keep retrying every 30 minutes in error state
Otherwise we never get out of it.
2017-06-27 12:13:24 +02:00
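Taken together, the two commits above describe a small decision rule: a job in an error state is retried every 30 minutes, but not while its target is offline, and a job that is not yet in an error state is still attempted so that the failure gets logged once. A minimal sketch of that rule follows, in Python rather than the project's Perl. The function and parameter names are hypothetical, and it assumes the caller has already decided that a healthy job is due according to its normal schedule.

```python
# 30 minutes, as stated in the commit subject.
RETRY_INTERVAL = 30 * 60


def should_attempt_sync(in_error_state, target_online, last_try, now):
    """Decide whether a replication run should be attempted at time `now`.

    `last_try` and `now` are epoch seconds. For jobs that are not in an
    error state, the caller is assumed to have checked the regular schedule.
    """
    if in_error_state:
        # No point in retrying against a target that is known to be offline.
        if not target_online:
            return False
        # Otherwise keep retrying every RETRY_INTERVAL seconds, so the job
        # can eventually leave the error state.
        return now >= last_try + RETRY_INTERVAL
    # Not in an error state yet: try even if the target looks offline,
    # so the failure is logged as an error at least once.
    return True
```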
Dietmar Maurer
92a243e986 PVE::ReplicationState - cleanup job state on job removal
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
2017-06-27 11:53:28 +02:00
Dietmar Maurer
44972014b2 PVE::ReplicationState::purge_old_states - new helper 2017-06-27 10:15:01 +02:00
Dietmar Maurer
2c508173ea PVE::ReplicationState::write_job_state - allow to remove state completely 2017-06-27 08:13:36 +02:00
Dietmar Maurer
5e93f430f8 PVE/Replication.pm: also log when we thaw the filesystem 2017-06-23 13:18:08 +02:00
Dominik Csapak
93c3695b05 change replica log timestamp to a human readable format
modeled after the timestamps in vm migration

this does not impact the regression tests, because they
overwrite 'get_log_time' anyway

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
2017-06-22 10:17:15 +02:00
Dietmar Maurer
1c9607105a PVE::AbstractMigrate - new helpers to handle replication jobs 2017-06-21 12:24:57 +02:00
Dietmar Maurer
210a5f7970 PVE::ReplicationState::extract_vmid_tranfer_state - new helper
moved from PVE::QemuMigrate
2017-06-21 12:24:06 +02:00
Dietmar Maurer
18c369255d PVE::ReplicationConfig::switch_replication_job_target - new helper
moved from PVE::QemuMigrate
2017-06-21 11:43:24 +02:00
Dietmar Maurer
c64fb36899 PVE/ReplicationConfig.pm: store job id inside job config
To simplify code.
2017-06-20 13:19:53 +02:00
Dietmar Maurer
3ec43aafc8 PVE::Replication::run_replication - add verbose parameter
used for regression tests
2017-06-20 08:54:01 +02:00
Dietmar Maurer
5899ebbd2d PVE::Replication::run_replication - return replicated $volumes 2017-06-20 08:51:08 +02:00
Dietmar Maurer
c17dcb3eb3 PVE::ReplicationState - new helpers record_job_start/record_job_end 2017-06-20 07:13:00 +02:00
Dietmar Maurer
e4f6301672 PVE::Replication::find_common_replication_snapshot - new helper
This is just a cleanup (simply factor out code from replicate()).
2017-06-20 07:13:00 +02:00
Dietmar Maurer
637b7acd2e PVE::ReplicationConfig::find_local_replication_job - new helper 2017-06-20 07:13:00 +02:00
Wolfgang Bumiller
c475e16d11 replication: replicate_volume: rate can be undefined
as it is optional, in which case we want to pass undef to
storage_migrate
2017-06-19 09:58:08 +02:00
Dietmar Maurer
7862a35c71 replicate_volume: implement rate limit and insecure 2017-06-14 08:44:13 +02:00
Dietmar Maurer
c324e90764 call get_replicatable_volumes with $vmid parameter 2017-06-13 09:17:50 +02:00
Dietmar Maurer
54b79ff5e2 get_replicatable_volumes: add $vmid parameter 2017-06-13 09:00:24 +02:00
Dietmar Maurer
14f17b497a PVE/AbstractConfig.pm - include missing classes 2017-06-13 06:13:39 +02:00
Dietmar Maurer
55222f3747 PVE/ReplicationState.pm: implement write_vmid_job_states
Update all job states related to a specific $vmid
2017-06-12 11:33:05 +02:00
Wolfgang Link
683a30e062 Make rollback compatible with storage replica.
If we roll back, we have to clean up all local replication snapshots.
2017-06-12 10:54:33 +02:00
Dietmar Maurer
d7f305c0bc PVE::Replication - pass $cleanup parameter to get_replicatable_volumes 2017-06-12 09:05:50 +02:00
Dietmar Maurer
3f85b14d89 PVE::AbstractConfig - add prototype for get_replicatable_volumes 2017-06-12 09:03:14 +02:00
Dietmar Maurer
b499eccbf0 PVE::Replication::prepare - allow to pass undefined $jobid
And remove all replication snapshots in that case. This is useful
for snapshot rollback.
2017-06-12 07:55:28 +02:00