mirror of git://git.proxmox.com/git/pve-ha-manager.git synced 2025-01-20 18:03:53 +03:00

730 Commits

Author SHA1 Message Date
Fiona Ebner
f74f8ffb24 manager: set resource scheduler mode upon init
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-18 13:25:21 +01:00
Fiona Ebner
7c142d6822 env: datacenter config: include crs (cluster-resource-scheduling) setting
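For reference, a cluster-resource-scheduling entry in datacenter.cfg
could look roughly like this (the exact key/value syntax shown here is
an assumption for illustration, not part of this commit):

    crs: ha=static

and the manager side could then pick the scheduler mode from the
parsed settings along the lines of (hypothetical sketch):

    my $datacenter_cfg = $haenv->get_datacenter_settings();
    my $mode = $datacenter_cfg->{crs}->{ha} // 'basic'; # fall back to basic scheduling
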
Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-18 13:25:21 +01:00
Fiona Ebner
749d8161be env: rename get_ha_settings to get_datacenter_settings
The method will be extended to include other HA-relevant settings from
datacenter.cfg.

Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-18 13:25:21 +01:00
Fiona Ebner
48f2144b27 usage: add Usage::Static plugin
for calculating node usage of services based upon static CPU and
memory configuration as well as scoring the nodes with that
information to decide where to start a new or recovered service.

For getting the service stats, it's necessary to also consider the
migration target (if present), because the configuration file might
have already moved.

It's necessary to update the cluster filesystem upon stealing the
service to be able to always read the moved config right away when
adding the usage.
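
As a rough sketch of the accounting idea (not the actual
implementation; the helper get_service_static_stats() and the cpu/mem
key names are made up for illustration):

    sub add_service_usage_to_node {
        my ($self, $nodename, $sid, $service_node, $migration_target) = @_;

        # read static CPU/memory from the service's VM/CT config; during a
        # migration the config may already live on the target node, so the
        # optional $migration_target is considered when locating it
        my $stats = get_service_static_stats($sid, $service_node, $migration_target);

        $self->{nodes}->{$nodename}->{cpu} += $stats->{cpu};
        $self->{nodes}->{$nodename}->{mem} += $stats->{mem};
    }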

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-18 13:25:21 +01:00
Fiona Ebner
5d724d4dd9 manager: online node usage: switch to Usage::Basic plugin
no functional change is intended.

One test needs adaptation too, because it created its own version of
$online_node_usage.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-18 13:25:21 +01:00
Fiona Ebner
b259857688 manager: select service node: add $sid to parameters
In preparation for scheduling based on static information, where the
scoring of nodes depends on information from the service's
VM/CT configuration file (and the $sid is required to query that).

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-18 13:25:21 +01:00
Fiona Ebner
c8c6e462fc add Usage base plugin and Usage::Basic plugin
in preparation for also supporting static resource scheduling via
another such Usage plugin.

The interface is designed in anticipation of the Usage::Static plugin;
the Usage::Basic plugin doesn't require all parameters.

In Usage::Static, the $haenv will be necessary for logging and getting
the static node stats. add_service_usage_to_node() and
score_nodes_to_start_service() take the $sid and service node; the
former also takes the optional migration target (during a migration
it's not clear whether the config file has already been moved or not)
to be able to get the static service stats.
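
A minimal sketch of how a Basic plugin implementing this interface
could look (method bodies and the add_node() helper are illustrative
assumptions; only the two method names above are from this commit):

    package PVE::HA::Usage::Basic;

    use strict;
    use warnings;

    sub new {
        my ($class, $haenv) = @_;
        return bless { nodes => {} }, $class;
    }

    sub add_node {
        my ($self, $nodename) = @_;
        $self->{nodes}->{$nodename} = 0;
    }

    # basic mode just counts services per node, so most parameters are unused
    sub add_service_usage_to_node {
        my ($self, $nodename, $sid, $service_node, $migration_target) = @_;
        $self->{nodes}->{$nodename}++;
    }

    # return a per-node score; the caller picks the node with the lowest count
    sub score_nodes_to_start_service {
        my ($self, $sid, $service_node) = @_;
        return { %{ $self->{nodes} } };
    }

    1;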

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-18 13:25:21 +01:00
Fiona Ebner
eea0c60923 resources: add get_static_stats() method
to be used for static resource scheduling.

In the container's vmstatus(), the 'cores' option takes precedence over
the 'cpulimit' one, but it felt more accurate to prefer 'cpulimit'
here.
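
The container variant could roughly derive its values like this (a
sketch under the assumptions stated in the comments, not the verbatim
patch):

    sub get_static_stats {
        my ($class, $haenv, $id, $service_node) = @_;

        # assumes the CT config can be loaded via PVE::LXC::Config
        my $conf = PVE::LXC::Config->load_config($id, $service_node);

        return {
            # unlike vmstatus(), prefer 'cpulimit' over 'cores' here
            maxcpu => $conf->{cpulimit} || $conf->{cores} || 0,
            # assumed default and byte conversion, for illustration only
            maxmem => ($conf->{memory} // 512) * 1024 * 1024,
        };
    }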

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-18 13:25:21 +01:00
Fiona Ebner
5db695c3f3 env: add get_static_node_stats() method
to be used for static resource scheduling. In the simulation
environment, the information can be added in hardware_status.
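
In the simulator that could look like the following hardware_status
entry (the "cpus"/"memory" key names and byte units are assumptions
for illustration):

    { "node1": { "power": "on", "network": "on", "cpus": 16, "memory": 34359738368 } }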

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
2022-11-18 13:25:21 +01:00
Thomas Lamprecht
0869c306ba fixup variable name typo
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-22 12:39:27 +02:00
Thomas Lamprecht
a3ffb0b3d4 manager: add top level comment section to explain common variables
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-22 12:15:55 +02:00
Thomas Lamprecht
bc64c08e37 d/lintian-overrides: update for newer lintian
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-22 10:06:47 +02:00
Thomas Lamprecht
2a1638b77b bump version to 3.4.0
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-22 09:22:52 +02:00
Thomas Lamprecht
6f818da13f manager: online node usage: factor out possible target and future-proof
only count up the target selection if that node is already in the
online node usage list, to avoid that an offline node is considered
online just because it's the target of some command
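
In essence the guard boils down to something like this (a simplified
sketch, assuming $online_node_usage is a hash of per-node service
counts and $target holds the command's target node):

    # only account for the target if it is already tracked as online;
    # otherwise a command targeting an offline node would make it look online
    if (defined($target) && defined($online_node_usage->{$target})) {
        $online_node_usage->{$target}++;
    }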

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-22 09:12:38 +02:00
Thomas Lamprecht
8c80973d40 test: update pre-existing policy tests for fixed balancing spread
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-22 08:49:41 +02:00
Thomas Lamprecht
1280368d31 fix variable name typo
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-22 07:25:02 +02:00
Thomas Lamprecht
066fd01670 fix spreading out services if source node isn't operational but otherwise ok
as is the case when going into maintenance mode

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-21 18:14:33 +02:00
Thomas Lamprecht
6756e14aed tests: add shutdown policy scenario with multiple guests to spread out
currently wrong as online_node_usage doesn't consider counting the
target node if the source node isn't considered online (=
operational) anymore

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-07-21 18:09:42 +02:00
Thomas Lamprecht
c00c44818a bump version to 3.3-4
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-04-27 14:02:22 +02:00
Fabian Grünbichler
ad6456997e lrm: fix getting stuck on restart
run_workers is responsible for updating the state after workers have
exited. if the current LRM state is 'active', but a shutdown_request was
issued in 'restart' mode (like on package upgrades), this call is the
only one made in the LRM work() loop.

skipping it if there are active services means the following sequence of
events effectively keeps the LRM from restarting or making any progress:

- start HA migration on node A
- reload LRM on node A while migration is still running

even once the migration is finished, the service count is still >= 1
since the LRM never calls run_workers (directly or via
manage_resources), so the service having been migrated is never noticed.

maintenance mode (i.e., rebooting the node with shutdown policy migrate)
does call manage_resources and thus run_workers, and will proceed once
the last worker has exited.
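
Conceptually, the fix boils down to also reaping workers in that code
path, roughly like this (simplified sketch with assumed variable
names, not the literal diff):

    if ($shutdown_request && $shutdown_mode eq 'restart') {
        # previously this path returned early while $service_count >= 1, so
        # finished migration workers were never collected and the count
        # never dropped; always collect exited workers here
        $self->run_workers();
    }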

reported by a user:

https://forum.proxmox.com/threads/lrm-hangs-when-updating-while-migration-is-running.108628

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
2022-04-27 13:57:37 +02:00
Thomas Lamprecht
fe3781e8ab buildsys: track and upload debug package
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-20 18:08:27 +01:00
Thomas Lamprecht
c15a8b803e bump version to 3.3-3
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-20 18:05:37 +01:00
Thomas Lamprecht
eef4f86338 lrm: increase run_worker loop-time partition
every LRM round is scheduled to run for 10s but we spend only half
of that actively trying to run workers (within the max_worker limit).

Raise that to 80% duty cycle.
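
In other words, of the 10s LRM round we now reserve roughly 8s instead
of 5s for actively running workers (illustrative numbers and variable
names only):

    my $round_time = 10;                      # seconds per LRM round
    my $max_worker_time = $round_time * 0.8;  # 8s duty cycle, was ~50% before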

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-20 16:17:28 +01:00
Thomas Lamprecht
65c1fbac99 lrm: avoid job starvation on huge workloads
If a setup has a lot of VMs we may run into the time limit from the
run_worker loop before processing all workers. This can easily happen
if an admin did not increase the default max_workers setting, but even
with a bigger max_worker setting one can run into it.

That combined with the fact that we sorted just by the $sid
alpha-numerically means that CTs were preferred over VMs (C comes
before V) and additionally lower VMIDs were preferred too.

That means that a set of SIDs had a lower chance of ever actually
getting run, which is naturally not ideal at all.
Improve on that behavior by adding a counter to the queued worker and
preferring those that have a higher one, i.e., those that spent more
time waiting to get actively run.
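
The resulting selection order could then look roughly like this (the
`start_tries` field name is made up for illustration):

    # prefer workers that waited longest (highest try counter) and only fall
    # back to the $sid for a stable order among equally-starved workers
    my @ordered_sids = sort {
        $queue->{$b}->{start_tries} <=> $queue->{$a}->{start_tries}
            || $a cmp $b
    } keys %$queue;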

Note, due to the way the stop state is enforced, i.e., always
enqueued as a new worker, its start-try counter will be reset every
round and it will thus have a lower priority compared to other request
states. We probably want to differentiate between a stop request
issued while the service was in another state just before and a stop
that is merely re-requested for a service that has already been
stopped for a while.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-20 16:14:03 +01:00
Thomas Lamprecht
b538340c9d lrm: code/style cleanups
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-20 14:40:27 +01:00
Thomas Lamprecht
f613e426ce lrm: run worker: avoid an indentation level
best viewed with the `-w` flag to ignore whitespace change itself

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-20 13:42:15 +01:00
Thomas Lamprecht
a25a516ac6 lrm: log actual error if fork fails
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-20 13:39:35 +01:00
Thomas Lamprecht
2deff1ae35 manager: refactor fence processing and rework fence-but-no-service log
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-20 13:31:04 +01:00
Thomas Lamprecht
0179818f48 d/changelog: s/nodes/services/
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-20 10:10:27 +01:00
Thomas Lamprecht
ccf328a833 bump version to 3.3-2
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-19 14:30:19 +01:00
Fabian Ebner
7dc927033f manage: handle edge case where a node gets stuck in 'fence' state
If all services in 'fence' state are gone from a node (e.g. by
removing the services) before fence_node() was successful, a node
would get stuck in the 'fence' state. Avoid this by calling
fence_node() if the node is in 'fence' state, regardless of service
state.
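
Schematically the change makes the node loop look something like this
(helper names are assumptions; the real manage() code differs):

    for my $node (sort keys %{$ns->{status}}) {
        next if $ns->get_node_state($node) ne 'fence';
        # trigger fencing based on the node state alone, even if no service
        # in 'fence' state references this node anymore
        $ns->fence_node($node);
    }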

Reported in the community forum:
https://forum.proxmox.com/threads/ha-migration-stuck-is-doing-nothing.94469/

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
[ T: track test change of new test ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-19 13:50:47 +01:00
Thomas Lamprecht
30fc7ceedb lrm: also check CRM node-status for determining fence-request
This fixes point 2. of commit 3addeeb - avoiding that an LRM goes
active as long as the CRM still has it in (pending) `fence` state,
which can happen after a watchdog reset + fast boot. This avoids that
we interfere with the CRM acquiring the lock, which is all the more
important once a future commit gets added that ensures a node isn't
stuck in `fence` state if there's no service configured (anymore) due
to the admin manually removing them during fencing.
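
The check now effectively considers both sources, roughly like this
(sketch only - the helper and field names are assumptions):

    sub is_fence_requested {
        my ($self) = @_;

        # local view: any of this node's services still in 'fence' state?
        return 1 if $self->{has_services_in_fence_state};

        # CRM view: the manager may still track this node as 'fence', e.g.
        # after a watchdog reset followed by a fast reboot
        my $crm_node_state = $self->get_manager_node_state($self->{nodename});
        return defined($crm_node_state) && $crm_node_state eq 'fence';
    }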

We explicitly fix the startup first to better show how it works in
the test framework, but as the test/sim hardware can now delay the
CRM while keeping the LRM running, the second test (i.e.,
test-service-command9) should still trigger after the next commit if
this one were reverted or otherwise broken.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-19 13:48:57 +01:00
Thomas Lamprecht
303490d8f1 lrm: factor out fence-request check into own helper
we'll extend that a bit in a future commit

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-19 13:48:57 +01:00
Thomas Lamprecht
ca2e547a76 test: cover case where all services get removed from in-progress fenced node
this test's log shows two issues we'll fix in later commits

1. If a node gets fenced and an admin removes all services before the
   fencing completes, the manager will ignore that node's state and
   thus never make the "fence" -> "unknown" transition required by
   the state machine

2. If a node is marked as "fence" in the manager's node status, but
   has no service, its LRM's check for "pending fence request"
   returns a false negative and the node starts trying to acquire its
   LRM work lock. This can even succeed in practice, e.g. the events:
    1. Node A gets fenced (whyever that is), CRM is working on
       acquiring its lock while Node A reboots
    2. Admin is present and removes all services of Node A from HA
    2. Node A booted up fast again, LRM is already starting before
       CRM could ever get the lock (<< 2 minutes)
    3. Service located on Node A gets added to HA (again)
    4. LRM of Node A will actively try to get the lock as it has no
       service in fence state and is (currently) not checking the
       manager's node state, so is ignorant of the not yet processed
       fence -> unknown transition
    (note: the above uses 2. twice as the order of those points doesn't matter)

    As a result the CRM may never get to acquire the lock of Node A's
    LRM, and thus cannot finish the fence -> unknown transition,
    resulting in user confusion and possible weird effects.

In the current log one can observe 1. by the missing fence tries of
the master, and 2. can be observed by the LRM acquiring the lock while
still being in "fence" state from the master's POV.

We use two tests so that point 2. is better covered later on

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-19 13:48:21 +01:00
Thomas Lamprecht
1b21e7e651 sim: implement skip-round command for crm/lrm
This allows simulating situations where some asymmetry in service
type scheduling is required, e.g., if the master should not pick up
LRM changes just yet - something that can happen quite often in the
real world as scheduling is not predictable, especially across
different hosts.

The implementation is pretty simple for now; that also means we just
do not care about watchdog updates for the skipped service, so one is
limited to skipping two 20s rounds at most before self-fencing kicks
in.

This can be made more advanced once required.
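
Usage in a test/sim command list might then look roughly like this
(the exact command syntax is an assumption derived from the
description above):

    skip-round crm
    skip-round lrm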

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-19 11:19:34 +01:00
Thomas Lamprecht
214b70f45a sim: test hw: small code cleanups and whitespace fixes
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-19 11:19:34 +01:00
Thomas Lamprecht
0e13a6c123 sim: service add command: allow to override state
Until now we had at most one extra param, so let's get all remaining
params in an array and use that; the fallback stays the same.
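
For example (hypothetical syntax following the description; the
previous fallback state stays the default when the extra param is
omitted):

    service vm:105 add node2            # fallback request state
    service vm:106 add node2 stopped    # explicitly override the state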

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-19 11:19:34 +01:00
Thomas Lamprecht
1323ef6ec5 sim: add service: set type/name in config
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-19 11:19:34 +01:00
Thomas Lamprecht
fe19c9b412 test/sim: also log delay commands
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-19 11:19:34 +01:00
Thomas Lamprecht
a0a7d11ed6 sim/hardware: sort and split use statements
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-17 15:57:43 +01:00
Thomas Lamprecht
4ee32601b9 lrm: fix comment typos
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-17 15:57:43 +01:00
Thomas Lamprecht
0dcb6597aa crm: code/style cleanup
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-17 12:28:22 +01:00
Thomas Lamprecht
8a25bf2969 d/postinst: fix restarting LRM/CRM when triggered
We wrongly dropped the semi-manual postinst in favor of a fully
auto-generated one, but we always need to generate the trigger
actions ourselves - it cannot work otherwise.

Fix 3166752 ("postinst: use auto generated postinst")
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-17 11:30:49 +01:00
Thomas Lamprecht
b7fb934810 d/lintian: update repeated-trigger override
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2022-01-17 11:30:08 +01:00
Thomas Lamprecht
a31c6fe591 lrm: fix log call on wrong module
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-10-07 15:19:30 +02:00
Thomas Lamprecht
a2d12984b5 bump version to 3.3-1
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-07-02 20:08:12 +02:00
Thomas Lamprecht
719883e9a5 recovery: allow disabling an in-recovery service
Mostly for convenience for the admin, to avoid the need for removing
it completely, which is always frowned upon by most users.

Follows the same logic and safety criteria as the transition to
`stopped` on getting into the `disabled` state in the
`next_state_error`.

As we previously had a rather immediate transition from recovery ->
error (not anymore), this is actually restoring a previous feature and
does not add new implications or the like.

Still, add a test which also covers that the recovery state does not
allow things like stop or migrate to happen.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-07-02 20:08:12 +02:00
Thomas Lamprecht
6104d9e76e tests: cover request-state changes and crm-cmds for in-recovery services
Add a test which covers that the recovery state does not allow
things like stop or migrate to happen.

Also add one for disabling at the end; this is currently blocked too
but will change in the next patch, as it can be a safe way out for
the admin to reset the service without removing it.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-07-02 20:08:12 +02:00
Thomas Lamprecht
feea391367 recompute_online_node_usage: show state on internal error
makes debugging easier, also throw in some code cleanup

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-07-02 20:08:12 +02:00
Thomas Lamprecht
90a247552c fix #3415: never switch in error state on recovery, try harder
With the new 'recovery' state introduced in the previous commit we
get a clean transition, and thus an actual difference, between
to-be-fenced and fenced.

Use that to avoid going into the error state when we did not find any
possible new node we could recover the service to.
That can happen if the user uses the HA manager for local services,
which is an OK use-case as long as the service is restricted to a
group with only that node. But previously we could never recover such
services if their node failed, as they always got put into the
"error" dummy/final state.
But that's just artificially limiting ourselves to get a false sense
of safety.

Nobody touches the service while it's in the recovery state, neither
the LRM nor anything else (as any normal API call just gets routed to
the HA stack anyway), so there's just no chance that we get a bad
double-start of the same service, with resource access collisions and
all the bad stuff that could happen (and note, this will in practice
only matter for restricted services, which normally only use local
resources, so here it wouldn't even matter if it wasn't safe
already - but it is, double time!).

So, the usual transition guarantees still hold:
* only the current master does transitions
* there needs to be an OK quorate partition to have a master

And, for getting into recovery the following holds:
* the old node's lock was acquired by the master, which means it was
  (self-)fenced -> resource not running

So as "recovery" is a no-op state we got only into once the nodes was
fenced we can continue recovery, i.e., try to find a new node for t
the failed services.
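
Schematically, the recovery handling now keeps retrying instead of
bailing out into 'error' (a simplified sketch, not the actual
state-machine code; argument lists and helper names are abbreviated
or assumed):

    my $recovery_node = select_service_node($groups, $online_node_usage, $cd, $sd->{node});

    if ($recovery_node) {
        $haenv->log('info', "recover service '$sid' to node '$recovery_node'");
        $sd->{node} = $recovery_node;                  # simplified bookkeeping
        change_service_state($self, $sid, 'started');  # assumed helper
    } else {
        # no eligible node (e.g. a restricted group whose only node failed):
        # stay in 'recovery' and retry next round instead of going to 'error'
        $haenv->log('err', "recovery policy for service '$sid' failed, no recovery node found - retrying");
    }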

Tests:
* adapt the existing recovery test output to match the endless retry
  for finding a new node (vs. the previous "go into error immediately"
  behavior)
* add a test where the node comes up eventually, so that we also cover
  the recovery to the same node it was on previous to the failure
* add a test with a non-empty start-state, where the restricted failed
  node is online again. This ensures that the service won't get
  started until the HA manager actively recovered it, even if it's
  staying on that node.

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
2021-07-02 20:08:12 +02:00