mirror of
https://github.com/samba-team/samba.git
synced 2024-12-23 17:34:34 +03:00
recoverd: Stabilise the recovery master role
On rare occasions a node that has been inactive will trigger an election when it becomes active again. If that node has been up for the longest then it will win the election and the recovery master role will spuriously move.

While a node remains inactive we reset the priority time to discourage it from winning elections. The priority time will now reflect roughly how long the node has been active rather than how long it has been up. That means the most stable node is more likely to win elections.

Having a stable recovery master means that disabling takeover runs while reloading IPs is more likely to succeed. It also improves the chances of being able to cache information in the recovery master, for example between takeover runs.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit f0f48f22f45e4c82eba2582efae307e25385de81)
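To illustrate why resetting the priority time matters, here is a minimal sketch of an election comparison in which the earliest priority time wins. It is not the recovery daemon's election code: `struct election_candidate`, `tv_compare()` and `election_win()` are hypothetical names, and the rule set (inactive nodes lose, then earliest priority time, then lowest node number) is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical, simplified view of what a node advertises in an election. */
struct election_candidate {
	uint32_t pnn;                 /* node number, used only as a tie-break */
	bool inactive;                /* node is stopped or banned */
	struct timeval priority_time; /* start of the current active period */
};

/* Return <0, 0 or >0 as a is earlier than, equal to or later than b. */
static int tv_compare(const struct timeval *a, const struct timeval *b)
{
	if (a->tv_sec != b->tv_sec) {
		return (a->tv_sec < b->tv_sec) ? -1 : 1;
	}
	if (a->tv_usec != b->tv_usec) {
		return (a->tv_usec < b->tv_usec) ? -1 : 1;
	}
	return 0;
}

/* True if "self" should win against "other" under the sketch's rules. */
static bool election_win(const struct election_candidate *self,
			 const struct election_candidate *other)
{
	int cmp;

	assert(self != NULL && other != NULL);

	/* Assumption: an inactive node never wins. */
	if (self->inactive) {
		return false;
	}
	if (other->inactive) {
		return true;
	}

	/* The earlier priority time (the longest active node) wins. */
	cmp = tv_compare(&self->priority_time, &other->priority_time);
	if (cmp != 0) {
		return cmp < 0;
	}

	/* Arbitrary tie-break for the sketch: the lower node number wins. */
	return self->pnn < other->pnn;
}
```

With this rule, a node whose priority time was set at boot beats every node that started later, however unstable it has been since. Moving its priority time forward while it is inactive makes it lose to a node that has stayed active.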
parent 630196423a
commit 30a50c6e1e
@@ -3442,6 +3442,14 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
 	   also frozen and that the recmode is set to active.
 	*/
 	if (rec->node_flags & (NODE_FLAGS_STOPPED | NODE_FLAGS_BANNED)) {
+		/* If this node has become inactive then we want to
+		 * reduce the chances of it taking over the recovery
+		 * master role when it becomes active again.  This
+		 * helps to stabilise the recovery master role so that
+		 * it stays on the most stable node.
+		 */
+		rec->priority_time = timeval_current();
+
 		ret = ctdb_ctrl_getrecmode(ctdb, mem_ctx, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, &ctdb->recovery_mode);
 		if (ret != 0) {
 			DEBUG(DEBUG_ERR,(__location__ " Failed to read recmode from local node\n"));
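The effect of the hunk above can be sketched in isolation: each monitoring pass that finds the node stopped or banned moves the priority time forward, so the time elapsed since the priority time approximates how long the node has been continuously active. The names below (`struct sketch_rec`, `sketch_monitor_pass()`, `sketch_active_seconds()`) and the flag values are hypothetical, and plain integer seconds stand in for `struct timeval`.

```c
#include <stdint.h>

/* Hypothetical flag bits; the values are arbitrary for this sketch. */
#define SKETCH_FLAG_BANNED  0x1u
#define SKETCH_FLAG_STOPPED 0x2u

/* Hypothetical stand-in for the recovery daemon's per-node state. */
struct sketch_rec {
	uint32_t node_flags;
	int64_t priority_time; /* seconds on an assumed monotonic clock */
};

/* Called once per monitoring pass with the current time. */
static void sketch_monitor_pass(struct sketch_rec *rec, int64_t now)
{
	if (rec->node_flags & (SKETCH_FLAG_STOPPED | SKETCH_FLAG_BANNED)) {
		/* Inactive: keep pushing the priority time forward. */
		rec->priority_time = now;
	}
}

/* How long the node has been active, as seen by an election. */
static int64_t sketch_active_seconds(const struct sketch_rec *rec, int64_t now)
{
	return now - rec->priority_time;
}
```

A node that boots at time 0, is banned from time 30 to time 40 and is checked again at time 50 would report 10 seconds of activity here, not 50 seconds of uptime.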