mirror of git://git.proxmox.com/git/pve-guest-common.git synced 2025-01-25 06:03:45 +03:00

248 Commits

Author SHA1 Message Date
Fabian Ebner
ff574bf8d2 replication: update last_sync before removing old replication snapshots
If pvesr was terminated after finishing with the new sync and after
removing old replication snapshots, but before it could write the new
state, the next replication would fail. It would wrongly interpret the
actual last replication snapshot as stale, remove it, and (if no other
snapshots are present) attempt a full sync, which would fail.

Reported in the community forum [0], this was brought to light by the
new pvescheduler before it learned graceful reload.

It's not possible to simply preserve a last remaining snapshot in
prepare(), because prepare() is also used for valid removals. Instead,
update last_sync early enough. Stale snapshots will still be removed
on the next run if there are any.
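
The ordering described above can be sketched as follows. This is a minimal illustration in Python, not the actual pvesr code: `run_replication`, `stale_snapshots`, and the `state` dictionary are hypothetical names, and the `interrupted` flag only simulates a termination between the two steps.

```python
def run_replication(state, snapshots, new_snap, interrupted=False):
    """Hypothetical sketch: record last_sync before cleaning up old snapshots."""
    snapshots.append(new_snap)      # the new sync finished successfully
    state["last_sync"] = new_snap   # write the state early
    if interrupted:
        return                      # simulated termination before cleanup
    # Only now drop the older replication snapshots.
    snapshots[:] = [s for s in snapshots if s == new_snap]


def stale_snapshots(state, snapshots):
    """Snapshots a later run would treat as stale and remove."""
    return [s for s in snapshots if s != state["last_sync"]]
```

With this order, an interruption leaves the newest snapshot recorded as `last_sync`, so the next run only sees the older ones as stale.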

[0]: https://forum.proxmox.com/threads/100154

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-29 10:50:36 +01:00
Fabian Grünbichler
7d604b5bbd bump version to 4.0-3
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-11-09 13:17:37 +01:00
Fabian Ebner
2511f525f5 config: snapshot delete check: use volume_snapshot_info
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:35:38 +01:00
Fabian Ebner
b20bf9bf7d replication: find common snapshot: use additional information
which is now available from the storage back-end.

The benefits are:

1. Ability to detect different snapshots even if they have the same
name. Rather hard to reach, but for example with:
Snapshots A -> B -> C -> __replicationXYZ
Remove B, rollback to C (causes __replicationXYZ to be removed),
create a new snapshot B. Previously, B was selected as replication
base, but it didn't match on source and target. Now, C is correctly
selected.
2. Smaller delta in some cases by not preferring replication snapshots
over config snapshots, but using the most recent possible one from
both types of snapshots.
3. Less code complexity for snapshot selection.

If the remote side is old, it's not possible to detect mismatch of
distinct snapshots with the same name, but the timestamps from the
local side can still be used.

Still limit to our snapshots (config and replication), because we
don't have guarantees for other snapshots (could be deleted in the
meantime, name might not fit import/export regex, etc.).
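
A rough sketch of the selection idea, written in Python for illustration only. The function name `find_common_base` and the `id`/`timestamp` keys are assumptions and do not reflect the real data returned by the storage back-end.

```python
def find_common_base(local, remote):
    """Pick the most recent snapshot present on both sides.

    local/remote map snapshot name -> {"id": ..., "timestamp": ...}.
    An old remote side is modelled as one that reports no "id".
    """
    candidates = []
    for name, info in local.items():
        other = remote.get(name)
        if other is None:
            continue
        remote_id = other.get("id")
        # Same name but a different identifier means a distinct snapshot.
        if remote_id is not None and remote_id != info["id"]:
            continue
        candidates.append(name)
    # Timestamps from the local side decide which candidate is most recent.
    return max(candidates, key=lambda n: local[n]["timestamp"], default=None)


# Scenario from the message: B was removed and recreated on the source.
local = {
    "A": {"id": "a1", "timestamp": 1},
    "C": {"id": "c1", "timestamp": 3},
    "B": {"id": "b2", "timestamp": 5},
}
remote = {
    "A": {"id": "a1"},
    "B": {"id": "b1"},
    "C": {"id": "c1"},
}
```

Under these assumptions, `find_common_base(local, remote)` selects `C`, while a remote side without identifiers would still lead to `B` being picked.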

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:35:34 +01:00
Fabian Ebner
3200c404a9 replication: prepare: return additional information about snapshots
This is backwards compatible, because existing users of prepare() only
rely on the elements to evaluate to true or be defined.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:35:34 +01:00
Fabian Ebner
84fc20aa37 replication: refactor finding most recent common replication snapshot
By using a single loop instead. Should make the code more readable,
but also more efficient.

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:35:34 +01:00
Fabian Ebner
602ca77cdb fix #3111: config: snapshot delete: check if replication still needs it
and abort if it does and --force is not specified.

After rollback, the rollback snapshot might still be needed as the
base for incremental replication, because rollback removes (blocking)
replication snapshots.

It's not enough to limit the check to the most recent snapshot,
because new snapshots might've been created between rollback and
remove.

It's not enough to limit the check to snapshots without a parent (i.e.
in case of ZFS, the oldest), because some volumes might've been added
only after that, meaning the oldest snapshot is not an incremental
replication base for them.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:34:14 +01:00
Fabian Ebner
8d1cd44345 partially fix #3111: replication: be less picky when selecting incremental base
After rollback, it might be necessary to start the replication from an
earlier, possibly non-replication, snapshot, because the replication
snapshot might have been removed from the source node. Previously,
replication could only recover in case the current parent snapshot was
already replicated.

To get into the bad situation (with no replication happening between
the steps):
1. have existing replication
2. take new snapshot
3. rollback to that snapshot
In case the partial fix to only remove blocking replication snapshots
for rollback was already applied, an additional step is necessary to
get into the bad situation:
4. take a second new snapshot

Since non-replication snapshots are now also included, where no
timestamp is readily available, it is necessary to filter them out
when probing for replication snapshots.

If no common replication snapshot is present, iterate backwards
through the config snapshots.

The changes are backwards compatible:

If one side is running the old code, and the other the new code,
the fact that one of the two prepare() calls does not return the
new additional snapshot candidates, means that no new match is
possible. Since the new prepare() returns a superset, no previously
possible match is now impossible.

The branch with @desc_sorted_snap is now taken more often, but
it can still only be taken when the volume exists on the remote side
(and has snapshots). In such cases, it is safe to die if no
incremental base snapshot can be found, because a full sync would not
be possible.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:34:00 +01:00
Fabian Ebner
c05dc937d4 replication: pass guest config to find_common_replication_snapshot
in preparation to iterate over all config snapshots when necessary.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:34:00 +01:00
Fabian Ebner
fbbeb87225 replication: remove unused variable and style fixes
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:34:00 +01:00
Fabian Ebner
45c0b7554c partially fix #3111: further improve removing replication snapshots
by using the new $blockers parameter. No longer remove all replication
snapshots from affected volumes unconditionally, but check first if
all blocking snapshots are replication snapshots. If they are, remove
them and proceed with rollback. If they are not, die without removing
any.

For backwards compatibility, it's still necessary to remove all
replication snapshots if $blockers is not available.
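
The decision described above could look roughly like this. It is a Python sketch under stated assumptions: the `__replicate_` name prefix and both function names are illustrative, and `blockers=None` stands in for a storage layer that does not report blockers.

```python
REPLICATION_PREFIX = "__replicate_"  # assumed naming scheme, for illustration


def is_replication_snapshot(name):
    return name.startswith(REPLICATION_PREFIX)


def snapshots_to_remove(blockers, volume_snapshots):
    """Return the snapshots to remove before a rollback, or raise."""
    if blockers is None:
        # Backwards-compatible path: no blocker information available.
        return [s for s in volume_snapshots if is_replication_snapshot(s)]
    if all(is_replication_snapshot(s) for s in blockers):
        return list(blockers)
    # At least one blocker is not ours: abort without removing anything.
    raise RuntimeError("rollback blocked by non-replication snapshots")
```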

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:34:00 +01:00
Fabian Ebner
a9bc9b3c89 config: rollback: factor out helper for removing replication snapshots
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:34:00 +01:00
Fabian Ebner
2dfe62927b partially fix #3111: snapshot rollback: improve removing replication snapshots
Get the replicatable volumes from the snapshot config rather than the
current config. And filter those volumes further to those that will
actually be rolled back.

Previously, a volume that only had replication snapshots (e.g. because
it was added after the snapshot was taken, or the vmstate volume)
would lose them. Then, on the next replication run, such a volume
would lead to an error, because replication tried to do a full sync,
but the target volume still exists.

This is not a complete fix. It is still possible to run into problems:
- by removing the last (non-replication) snapshots after a rollback
  before replication can run once.
- by creating a snapshot and making a rollback before replication can
  run once.

The list of volumes is not required to be sorted for prepare(), but it
is sorted by how foreach_volume() iterates now, so not random.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-11-08 10:34:00 +01:00
Fabian Grünbichler
239fe671c3 build: switch upload target to bullseye
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-06-09 11:39:15 +02:00
Fabian Grünbichler
523e947366 bump version to 4.0-2
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2021-06-09 10:07:42 +02:00
Fabian Ebner
60796d5fbb vzdump: defaults: keep all backups by default for 7.0
and switch to using prune-backups instead of maxfiles.

Storages created via the web UI defaulted to keeping all backups already, switch
to this safer default here as well.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-06-08 14:57:31 +02:00
Fabian Ebner
a58366f460 vzdump: remove deprecated size parameter
It was deprecated for a long time (before it got moved to guest-common) already,
and there also was a deprecation warning when passed as a CLI option.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-06-08 14:57:31 +02:00
Thomas Lamprecht
71066e627e bump version to 4.0-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-05-12 13:08:53 +02:00
Thomas Lamprecht
dfcc0de52d d/control: increase compat level to 12
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-05-12 13:03:34 +02:00
Thomas Lamprecht
873b9de294 d/control: update meta information
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-05-12 13:03:09 +02:00
Thomas Lamprecht
960c85be38 buildsys: split packaging and source build-systems
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-05-09 20:10:14 +02:00
Fabian Ebner
1c527dfe62 mention prune behavior for the remove parameter
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-03-05 21:24:39 +01:00
Thomas Lamprecht
de1ae1652c bump version to 3.1-5
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-02-19 16:32:26 +01:00
Fabian Ebner
17b5185b77 vzdump: mailto: use email-or-username-list format
because it is a more complete pattern. Also, 'mailto' was a '-list' format in
PVE 6.2 and earlier, so this also fixes whitespace-related backwards
compatibility. In particular, this fixes creating a backup job in the GUI
without setting an address, which passes along ''.

For example,
> vzdump 153 --mailto " ,,,admin@proxmox.com;;; developer@proxmox.com , ; "
was valid and worked in PVE 6.2.
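
To illustrate the kind of lenient list parsing meant here, the following Python sketch splits on commas, semicolons, and whitespace and drops empty entries. The function name `split_mailto` and the exact separator set are assumptions, not the actual format implementation.

```python
import re


def split_mailto(value):
    """Split a loosely formatted address list into clean entries."""
    return [part for part in re.split(r"[,;\s]+", value) if part]


addresses = split_mailto(" ,,,admin@proxmox.com;;; developer@proxmox.com , ; ")
normalized = ",".join(addresses)  # comma-separated form without stray whitespace
```

An empty string, as passed along by a backup job created without an address, yields an empty list in this sketch.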

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-02-19 16:29:58 +01:00
Fabian Ebner
7a9b527f54 vzdump: command line: make sure mailto is comma-separated
In addition to relying on shellquote(), it's still nice to avoid printing out
unnecessary whitespace, especially newlines.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-02-19 16:29:58 +01:00
Fabian Ebner
9e542a4f90 vzdump: command line: refactor handling prune-backups
to re-use a line here and with the next patch.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-02-19 16:29:58 +01:00
Fabian Ebner
533d6e503a vzdump: correctly handle prune-backups option in commandline and cron config
Previously only the hash reference was printed instead of the property string.
It's also necessary to parse the property string when reading the cron config.
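
As an illustration of the difference, the sketch below serializes a structured option into a `key=value` property string and parses it back. It is written in Python with hypothetical helper names; the `keep-last` and `keep-daily` keys are used as examples only.

```python
def format_property_string(options):
    """Serialize a mapping as 'key=value,key=value' with sorted keys."""
    return ",".join(f"{key}={value}" for key, value in sorted(options.items()))


def parse_property_string(text):
    """Parse 'key=value,key=value' back into a mapping of integers."""
    result = {}
    for item in text.split(","):
        key, value = item.split("=", 1)
        result[key] = int(value)
    return result


prune = {"keep-last": 3, "keep-daily": 7}
as_text = format_property_string(prune)
```

Printing the mapping directly would emit its in-memory representation rather than a value that can be read back from a command line or cron entry.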

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-01-26 18:48:27 +01:00
Fabian Ebner
0111ceb9fd vzdump: add explicit reminder for breaking schema change for PVE 7.0
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2021-01-26 18:45:16 +01:00
Thomas Lamprecht
9d8f36e1ce bump version to 3.1-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-12-15 15:52:58 +01:00
Fabian Ebner
5da50f8cf5 vzdump: update exclude-path description
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-24 16:30:10 +01:00
Fabian Ebner
f6c5ba3cdb job_status: simplify fixup of jobs for stolen guests
by using switch_replication_job_target_nolock.

If a job is scheduled for removal and the guest was
stolen, it still makes sense to correct the job entry,
which didn't happen previously.

AFAICT, this was the only user of swap_source_target_nolock.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-06 15:10:25 +01:00
Fabian Ebner
0c3550c014 create nolock variant for switch_replication_job_target
so that it can be used within job_status

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-06 15:09:51 +01:00
Fabian Ebner
158c90bf9a also update sources in switch_replication_job_target
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-06 15:07:04 +01:00
Fabian Ebner
6364fd6385 clarify what the source property is used for in a replication job
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-06 15:06:52 +01:00
Fabian Ebner
db9eb287d4 job_status: read only after acquiring the lock
to get the current replication config and have the VM list
and state object as recent as possible.

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-11-06 15:05:47 +01:00
Thomas Lamprecht
60aeee5fb1 print snapshot tree: reduce indentation
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-10-24 16:57:38 +02:00
Oguz Bektas
295a6359db vzdump: use regex check for 'mailto'
add a new string format to allow local usernames like 'root' but also
limit which characters can be used in the 'mailto' address.

Co-Authored-by: Fabian Gruenbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
2020-09-03 10:02:32 +02:00
Thomas Lamprecht
74c0a4cf7e bump version to 3.1-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-08-24 10:12:34 +02:00
Thomas Lamprecht
0eec698ff5 move config: add comment about node owning configs
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-08-24 09:45:30 +02:00
Fabian Ebner
659061237c Add move_config_to_node method
allows mocking it when testing and saves a few lines of duplication
between the migration modules.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-08-24 09:42:50 +02:00
Fabian Ebner
0d9520c415 Add prune-backups option to vzdump parameters
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
2020-08-20 17:39:15 +02:00
Fabian Grünbichler
9bdcd0325d bump version to 3.1-2
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-08-05 12:15:00 +02:00
Fabian Grünbichler
301b375bab unbreak config_with_pending_array
which led to current and pending/delete values being returned
separately, and being misinterpreted by the web interface (and probably
other clients as well).

Fixes: daf8fca57a34417365c873ed91f3a52bf0002a4f

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2020-08-05 12:12:44 +02:00
Thomas Lamprecht
fb53d1087e bump version to 3.1-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-07-13 09:08:57 +02:00
Thomas Lamprecht
07c587d84a followup: add comment to avoid same mistake again
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-07-11 18:47:53 +02:00
Stoiko Ivanov
dd59a7cac0 fix #2834: skip refs in config_with_pending_array
With the refactoring of config_with_pending_array in
daf8fca57a34417365c873ed91f3a52bf0002a4f a few sanity checks on parsed configs
were dropped.

One case where a config value should be skipped, instead of parsed and added
is when the value is not scalar. This is the case for the raw lxc keys
(e.g. lxc.init.cmd, lxc.apparmor.profile) - which get added as array to the
'lxc' key.

This patch reintroduces the skipping of non-scalar values, when parsing the
config but not for the pending values.
From a short look through the commit history the sanity checks were in place
since 2014 (introduced in qemu-server for handling pending configuration
changes), and their removal did not cause any other regressions.
To my knowledge only the raw lxc config keys are parsed into a non-scalar
value.

Tested by adding a 'lxc.init.cmd' key to a container config.
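
The skipping of non-scalar values can be sketched as below. This is a simplified Python illustration; `scalar_config_items` is a hypothetical helper, and the nested list stands in for how raw lxc keys are collected under the 'lxc' key.

```python
def scalar_config_items(conf):
    """Return key/value entries, skipping values that are not scalars."""
    items = []
    for key in sorted(conf):
        value = conf[key]
        if isinstance(value, (list, dict)):
            continue  # e.g. the raw lxc keys gathered under 'lxc'
        items.append({"key": key, "value": value})
    return items


conf = {
    "hostname": "ct100",
    "memory": 512,
    "lxc": [["lxc.init.cmd", "/sbin/init"]],
}
```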

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
2020-07-11 18:45:26 +02:00
Thomas Lamprecht
981e497b79 bump version to 3.0-11
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-07-07 18:51:35 +02:00
Aaron Lauterer
a8878e5ef7 Adapt description of get_backup_volumes
as `data` was a bit too generic we now use `volume_config` in the actual
implementations. Thus we should adapt the description as well.

Tab spacing for the other keys has been adapted for easier readability.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
2020-06-24 11:14:19 +02:00
Thomas Lamprecht
daf8fca57a refactor config_with_pending_array
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-05-20 17:08:09 +02:00
Thomas Lamprecht
43c899e407 fix config_with_pending_array for falsy current values
one could have a config with:
> acpi: 0

and a pending deletion for that to restore the default 1 value.

The config_with_pending_array method then pushed the key twice, once
in the loop iterating the config itself correctly and once in the
pending delete hash, which is normally only for those options not yet
referenced in the config at all. Here the check was on "truthiness"
not definedness, fix that.
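
The distinction can be illustrated with a small Python sketch, where `None` plays the role of an undefined value. The function `merge_pending` and its return shape are hypothetical simplifications of the real method.

```python
def merge_pending(conf, pending_delete):
    """List config entries plus pending deletions, one entry per key."""
    result = []
    for key in sorted(conf):
        item = {"key": key, "value": conf[key]}
        if key in pending_delete:
            item["delete"] = 1
        result.append(item)
    for key in sorted(pending_delete):
        # Check for presence, not truthiness: a value of 0 is still set.
        if conf.get(key) is not None:
            continue
        result.append({"key": key, "delete": 1})
    return result


entries = merge_pending({"acpi": 0}, {"acpi"})
```

A truthiness check such as `if conf.get(key):` would treat `acpi: 0` as missing and add the key a second time.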

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2020-05-20 17:04:40 +02:00