samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-03 01:18:10 +03:00


3322 lines

80 KiB

C


start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`/*`
			`ctdb recovery daemon`

			`Copyright (C) Ronnie Sahlberg 2007`

ctdb is GPL not LGPL (This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960) 2007-05-31 07:50:53 +04:00			`This program is free software; you can redistribute it and/or modify`
			`it under the terms of the GNU General Public License as published by`
update lib/replace from samba4 (This used to be ctdb commit f0555484105668c01c21f56322992e752e831109) 2007-07-10 09:29:31 +04:00			`the Free Software Foundation; either version 3 of the License, or`
ctdb is GPL not LGPL (This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960) 2007-05-31 07:50:53 +04:00			`(at your option) any later version.`

			`This program is distributed in the hope that it will be useful,`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`but WITHOUT ANY WARRANTY; without even the implied warranty of`
ctdb is GPL not LGPL (This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960) 2007-05-31 07:50:53 +04:00			`MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the`
			`GNU General Public License for more details.`

			`You should have received a copy of the GNU General Public License`
update lib/replace from samba4 (This used to be ctdb commit f0555484105668c01c21f56322992e752e831109) 2007-07-10 09:29:31 +04:00			`along with this program; if not, see <http://www.gnu.org/licenses/>.`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`*/`

ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:46 +03:00			`#include "replace.h"`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`#include "system/filesys.h"`
better timeout handling for calls, controls and traverses (This used to be ctdb commit 63346a6c59d4821b4c443939b5d88db8cd20f5fe) 2007-05-10 08:06:48 +04:00			`#include "system/time.h"`
let each node verify that they have a correct assignment of public ip addresses (i.e. htey hold those they should hold and they dont hold any of those they shouldnt hold) if an inconsistency is found, mark the local node as recovery mode active and wait for the recovery master to trigger a full blown recovery (This used to be ctdb commit 55a5bfc8244c5b9cdda3f11992f384f00566b5dc) 2007-09-14 04:16:36 +04:00			`#include "system/network.h"`
when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b) 2007-10-22 06:34:08 +04:00			`#include "system/wait.h"`
ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:46 +03:00
			`#include <popt.h>`
			`#include <talloc.h>`
			`#include <tevent.h>`
			`#include <tdb.h>`

ctdb-util: Rename db_wrap to tdb_wrap and make it a build subsystem This makes it consistent with Samba, to ease transition. Update unit test code to link to with tdb_wrap instead of including db_wrap.c. There are some potential whitespace fixes in this commit that have been ignored. CTDB's lib/tdb_wrap will be deleted after the transition to Samba's lib/tdb_wrap, so there's no point polishing it too much. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-08-15 09:46:33 +04:00			`#include "lib/tdb_wrap/tdb_wrap.h"`
ctdb-recoverd: Change include of dlinklist.h to contain directory This makes it consistent with the rest of the code and avoids problems when some variant of lib/util isn't in the include path. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-08-15 10:18:05 +04:00			`#include "lib/util/dlinklist.h"`
ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:46 +03:00			`#include "lib/util/debug.h"`
			`#include "lib/util/samba_util.h"`
ctdb-common: Drop CTDB's copy of sys_read() and sys_write() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Nov 29 11:22:40 CET 2016 on sn-devel-144 2016-11-29 04:55:06 +03:00			`#include "lib/util/sys_rw.h"`
ctdb: Use prctl_set_comment from lib/util Signed-off-by: Christof Schmitt <cs@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-09-24 02:10:59 +03:00			`#include "lib/util/util_process.h"`
ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:46 +03:00
			`#include "ctdb_private.h"`
			`#include "ctdb_client.h"`

ctdb-recoverd: Process leader broadcasts Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:07:26 +03:00			`#include "protocol/protocol_basic.h"`

ctdb-common: Rename system utility files system_socket.[ch] will contain all the raw socket code and other functions that use ctdb_sock_addr. system.[ch] will contain other platform dependent functions. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-06-28 13:15:37 +03:00			`#include "common/system_socket.h"`
ctdb-daemon: Separate prototypes for common client/server functions This groups function prototypes for common client/server functions in common/common.h and removes them from ctdb_private.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-23 06:17:34 +03:00			`#include "common/common.h"`
ctdb-server: Replace ctdb_logging.h with common/logging.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> 2015-11-11 07:41:10 +03:00			`#include "common/logging.h"`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
ctdb-conf: Move all conf files to new conf/ subdirectory Leave common/conf.[ch] where they are to make this commit comprehensible. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Guenther Deschner <gd@samba.org> Reviewed-by: Anoop C S <anoopcs@samba.org> 2019-08-19 05:06:40 +03:00			`#include "conf/ctdb_config.h"`
ctdb-config: Switch tunable DisableIPFailover to a config option Use the "failover:disabled" option instead. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13589 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-08-21 06:41:22 +03:00
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`#include "ctdb_cluster_mutex.h"`
added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`/* List of SRVID requests that need to be processed */`
			`struct srvid_list {`
			`struct srvid_list *next, *prev;`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message *request;`
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`};`

			`struct srvid_requests {`
			`struct srvid_list *requests;`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`};`

recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`static void srvid_request_reply(struct ctdb_context *ctdb,`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message *request,`
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`TDB_DATA result)`
			`{`
			`/* Someone that sent srvid==0 does not want a reply */`
			`if (request->srvid == 0) {`
			`talloc_free(request);`
			`return;`
			`}`

			`if (ctdb_client_send_message(ctdb, request->pnn, request->srvid,`
			`result) == 0) {`
			`DEBUG(DEBUG_INFO,("Sent SRVID reply to %u:%llu\n",`
			`(unsigned)request->pnn,`
			`(unsigned long long)request->srvid));`
			`} else {`
			`DEBUG(DEBUG_ERR,("Failed to send SRVID reply to %u:%llu\n",`
			`(unsigned)request->pnn,`
			`(unsigned long long)request->srvid));`
			`}`

			`talloc_free(request);`
			`}`

			`static void srvid_requests_reply(struct ctdb_context *ctdb,`
			`struct srvid_requests **requests,`
			`TDB_DATA result)`
			`{`
			`struct srvid_list *r;`

ctdb-recoverd: Add early return in srvid_requests_reply() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 08:56:09 +03:00			`if (*requests == NULL) {`
			`return;`
			`}`

recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`for (r = (*requests)->requests; r != NULL; r = r->next) {`
			`srvid_request_reply(ctdb, r->request, result);`
			`}`

			`/* Free the list structure... */`
			`TALLOC_FREE(*requests);`
			`}`

			`static void srvid_request_add(struct ctdb_context *ctdb,`
			`struct srvid_requests **requests,`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message *request)`
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`{`
			`struct srvid_list *t;`
			`int32_t ret;`
			`TDB_DATA result;`

			`if (*requests == NULL) {`
			`*requests = talloc_zero(ctdb, struct srvid_requests);`
			`if (*requests == NULL) {`
			`goto nomem;`
			`}`
			`}`

			`t = talloc_zero(*requests, struct srvid_list);`
			`if (t == NULL) {`
			`/* If requests was just allocated above then free it */`
			`if ((*requests)->requests == NULL) {`
			`TALLOC_FREE(*requests);`
			`}`
			`goto nomem;`
			`}`

ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`t->request = (struct ctdb_srvid_message *)talloc_steal(t, request);`
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`DLIST_ADD((*requests)->requests, t);`

			`return;`

			`nomem:`
			`/* Failed to add the request to the list. Send a fail. */`
			`DEBUG(DEBUG_ERR, (__location__`
			`" Out of memory, failed to queue SRVID request\n"));`
			`ret = -ENOMEM;`
			`result.dsize = sizeof(ret);`
			`result.dptr = (uint8_t *)&ret;`
			`srvid_request_reply(ctdb, request, result);`
			`}`

ctdb-recoverd: Add a new abstraction ctdb_op_disable() This can be used to disable and re-enable an operation, and do all the relevant sanity checking. Most of this is from existing functions disable_takeover_runs_handler(), clear_takeover_runs_disable() and reenable_takeover_runs(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:50:38 +03:00			`/* An abstraction to allow an operation (takeover runs, recoveries,`
			`* ...) to be disabled for a given timeout */`
			`struct ctdb_op_state {`
			`struct tevent_timer *timer;`
			`bool in_progress;`
			`const char *name;`
			`};`

			`static struct ctdb_op_state *ctdb_op_init(TALLOC_CTX *mem_ctx, const char *name)`
			`{`
			`struct ctdb_op_state *state = talloc_zero(mem_ctx, struct ctdb_op_state);`

			`if (state != NULL) {`
			`state->in_progress = false;`
			`state->name = name;`
			`}`

			`return state;`
			`}`

			`static bool ctdb_op_is_disabled(struct ctdb_op_state *state)`
			`{`
			`return state->timer != NULL;`
			`}`

			`static bool ctdb_op_begin(struct ctdb_op_state *state)`
			`{`
			`if (ctdb_op_is_disabled(state)) {`
			`DEBUG(DEBUG_NOTICE,`
			`("Unable to begin - %s are disabled\n", state->name));`
			`return false;`
			`}`

			`state->in_progress = true;`
			`return true;`
			`}`

			`static bool ctdb_op_end(struct ctdb_op_state *state)`
			`{`
			`return state->in_progress = false;`
			`}`

			`static bool ctdb_op_is_in_progress(struct ctdb_op_state *state)`
			`{`
			`return state->in_progress;`
			`}`

			`static void ctdb_op_enable(struct ctdb_op_state *state)`
			`{`
			`TALLOC_FREE(state->timer);`
			`}`

ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_op_timeout_handler(struct tevent_context *ev,`
			`struct tevent_timer *te,`
ctdb-recoverd: Add a new abstraction ctdb_op_disable() This can be used to disable and re-enable an operation, and do all the relevant sanity checking. Most of this is from existing functions disable_takeover_runs_handler(), clear_takeover_runs_disable() and reenable_takeover_runs(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:50:38 +03:00			`struct timeval yt, void *p)`
			`{`
			`struct ctdb_op_state *state =`
			`talloc_get_type(p, struct ctdb_op_state);`

			`DEBUG(DEBUG_NOTICE,("Reenabling %s after timeout\n", state->name));`
			`ctdb_op_enable(state);`
			`}`

			`static int ctdb_op_disable(struct ctdb_op_state *state,`
			`struct tevent_context *ev,`
			`uint32_t timeout)`
			`{`
			`if (timeout == 0) {`
			`DEBUG(DEBUG_NOTICE,("Reenabling %s\n", state->name));`
			`ctdb_op_enable(state);`
			`return 0;`
			`}`

			`if (state->in_progress) {`
			`DEBUG(DEBUG_ERR,`
			`("Unable to disable %s - in progress\n", state->name));`
			`return -EAGAIN;`
			`}`

			`DEBUG(DEBUG_NOTICE,("Disabling %s for %u seconds\n",`
			`state->name, timeout));`

			`/* Clear any old timers */`
			`talloc_free(state->timer);`

			`/* Arrange for the timeout to occur */`
			`state->timer = tevent_add_timer(ev, state,`
			`timeval_current_ofs(timeout, 0),`
			`ctdb_op_timeout_handler, state);`
			`if (state->timer == NULL) {`
			`DEBUG(DEBUG_ERR,(__location__ " Unable to setup timer\n"));`
			`return -ENOMEM;`
			`}`

			`return 0;`
			`}`

new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`struct ctdb_banning_state {`
ctdb-recoverd: Add pnn field to banning state structure This structure is now standalone, so indexing by PNN can be avoided via a subsequent commit. Index by culprit here to make this commit simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 05:15:03 +03:00			`uint32_t pnn;`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`uint32_t count;`
			`struct timeval last_reported_time;`
			`};`

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`struct ctdb_cluster_lock_handle;`
ctdb-recoverd: Store recovery lock handle ... not just cluster mutex handle. This makes the recovery lock handle long-lived and with allow the releasing code to cancel an in-progress attempt to take the recovery lock. The cluster mutex handle is now allocated off the recovery lock handle. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 05:39:32 +03:00
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`private state of recovery daemon`
			`*/`
			`struct ctdb_recoverd {`
			`struct ctdb_context *ctdb;`
ctdb-recoverd: Rename recmaster field to leader Recovery master is being renamed to leader. This follows clustering best practice (e.g. RAFT). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 08:22:33 +03:00			`uint32_t leader;`
ctdb-recoverd: Send leader broadcasts These are triggered on 1 second timer, but are only sent if the node is the current leader and there is no election underway. If this node can not be the leader then ensure it releases the recovery lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:16:44 +03:00			`struct tevent_timer *leader_broadcast_te;`
ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particular important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-17 06:42:47 +03:00			`struct tevent_timer *leader_broadcast_timeout_te;`
ctdb-recoverd: Add PNN to recovery daemon context This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? The intention is to always use rec->pnn when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-09 02:33:17 +03:00			`uint32_t pnn;`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`uint32_t last_culprit_node;`
ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`struct ctdb_banning_state *banning_state;`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap;`
use a priority time for the election data, not just the vnn (This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d) 2007-06-07 12:37:27 +04:00			`struct timeval priority_time;`
prevent recursion in the calling of ctdb_takeover_run (This used to be ctdb commit 0fbdeb7c91b965d9bc5ecc7b24e31070378d8f1d) 2007-09-13 08:08:18 +04:00			`bool need_takeover_run;`
- merge from ronnie - add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring (This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b) 2007-09-14 03:49:12 +04:00			`bool need_recovery;`
- catch ESTALE in the recovery lock by trying a read() - priortise nodes that are unbanned and healthy in the election (This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f) 2007-10-05 07:28:21 +04:00			`uint32_t node_flags;`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`struct tevent_timer *send_election_te;`
ctdb-recoverd: Add an explicit flag for election in progress An alternate election method will be added that doesn't use the election timeout, so this provides a common way for recognising when an election is in progress. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 12:27:10 +03:00			`bool election_in_progress;`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`struct tevent_timer *election_timeout;`
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`struct srvid_requests *reallocate_requests;`
ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`struct ctdb_op_state *takeover_run;`
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`struct ctdb_op_state *recovery;`
ctdb-daemon: Rename struct ctdb_control_get_ifaces to ctdb_iface_list_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 11:43:48 +03:00			`struct ctdb_iface_list_old *ifaces;`
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`uint32_t *force_rebalance_nodes;`
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`struct ctdb_node_capabilities *caps;`
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00			`bool frozen_on_inactive;`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`struct ctdb_cluster_lock_handle *cluster_lock_handle;`
ctdb-recoverd: Record helper PID in recovery daemon context Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-09-30 14:15:56 +03:00			`pid_t helper_pid;`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`};`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within the timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
make recovery daemon values tunable (This used to be ctdb commit ec29dbf2f5110428df8b97801443ba7addf61353) 2007-06-04 14:22:44 +04:00			`#define CONTROL_TIMEOUT() timeval_current_ofs(ctdb->tunable.recover_timeout, 0)`
added health monitoring logic to ctdb, so a node loses its public IP address if one of the subsystem event scripts reports a problem (This used to be ctdb commit c7a089256d86cec21097453bce5acbccee87413f) 2007-06-06 04:25:46 +04:00			`#define MONITOR_TIMEOUT() timeval_current_ofs(ctdb->tunable.recover_interval, 0)`
raise the control timeout in recovery (This used to be ctdb commit 43424ff66daf28c202c12982f20a9f662b6fb125) 2007-05-24 07:49:27 +04:00
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_restart_recd(struct tevent_context *ev,`
			`struct tevent_timer *te, struct timeval t,`
			`void *private_data);`
convert much of the recovery logic to be async and parallel across all nodes (This used to be ctdb commit 8b72a02bf1045d8befb342a4111ca1316889262e) 2008-01-05 01:35:43 +03:00
ctdb-recoverd: Factor out and use function this_node_is_leader() Make the code self-documenting. This preempts an upcoming change to terminology but doing it now saves a lot of churn. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:37:39 +03:00			`static bool this_node_is_leader(struct ctdb_recoverd *rec)`
			`{`
ctdb-recoverd: Rename recmaster field to leader Recovery master is being renamed to leader. This follows clustering best practice (e.g. RAFT). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 08:22:33 +03:00			`return rec->leader == rec->pnn;`
ctdb-recoverd: Factor out and use function this_node_is_leader() Make the code self-documenting. This preempts an upcoming change to terminology but doing it now saves a lot of churn. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:37:39 +03:00			`}`

ctdb-recoverd: Add and use function this_node_can_be_leader() This makes the code self-documenting. In ctdb_election_data() there is a slight behaviour change. An inactive node will now try to lose an election. This case should not happen because: * An inactive node can't win an election round and then send a reply. * Any inactive node should never start an election. There are currently places where this happens and they will be fixed later. There is an instance where this could be used in validate_recovery_master() but this involves a more serious logic change. Overhaul this function later. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-14 02:57:03 +03:00			`static bool this_node_can_be_leader(struct ctdb_recoverd *rec)`
			`{`
			`return (rec->node_flags & NODE_FLAGS_INACTIVE) == 0 &&`
			`(rec->ctdb->capabilities & CTDB_CAP_RECMASTER) != 0;`
			`}`

ctdb-recoverd: Add function node_flags() and use it in elections Indexing a node map by PNN is suboptimal. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 10:57:53 +03:00			`static bool node_flags(struct ctdb_recoverd *rec, uint32_t pnn, uint32_t *flags)`
			`{`
			`size_t i;`

			`for (i = 0; i < rec->nodemap->num; i++) {`
			`struct ctdb_node_and_flags *node = &rec->nodemap->nodes[i];`
			`if (node->pnn == pnn) {`
			`if (flags != NULL) {`
			`*flags = node->flags;`
			`}`
			`return true;`
			`}`
			`}`

			`return false;`
			`}`
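The lookup above searches the node map for a matching PNN instead of using the PNN as an array index. The sketch below shows the same search in a self-contained form. The `example_node` and `example_node_map` types are hypothetical stand-ins, and the fixed-size array is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the node map types */
struct example_node {
	uint32_t pnn;
	uint32_t flags;
};

struct example_node_map {
	uint32_t num;
	struct example_node nodes[4];
};

/*
 * Search the map for pnn.  On success, optionally copy the node's
 * flags to *flags and return true.  Return false if pnn is unknown.
 */
static bool example_node_flags(const struct example_node_map *map,
			       uint32_t pnn,
			       uint32_t *flags)
{
	size_t i;

	assert(map != NULL);

	for (i = 0; i < map->num; i++) {
		const struct example_node *node = &map->nodes[i];

		if (node->pnn == pnn) {
			if (flags != NULL) {
				*flags = node->flags;
			}
			return true;
		}
	}

	return false;
}
```

Passing `NULL` for `flags` turns the function into a pure existence check, which is how the culprit code further down uses the real function.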

added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00			`/*`
			`ban a node for a period of time`
			`*/`
ctdb-recoverd: Simplify arguments to ctdb_ban_node() ban_time argument is always ctdb->tunable.recovery_ban_period, so build this in and make the calling code more readable. ctdb_ban_node() already logs how long a node is banned for, so don't repeatedly log this. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 02:31:56 +03:00			`static void ctdb_ban_node(struct ctdb_recoverd *rec, uint32_t pnn)`
added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00			`{`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`int ret;`
added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-recoverd: Simplify arguments to ctdb_ban_node() ban_time argument is always ctdb->tunable.recovery_ban_period, so build this in and make the calling code more readable. ctdb_ban_node() already logs how long a node is banned for, so don't repeatedly log this. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 02:31:56 +03:00			`uint32_t ban_time = ctdb->tunable.recovery_ban_period;`
ctdb-daemon: Rename struct ctdb_ban_time to ctdb_ban_state Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:18:33 +03:00			`struct ctdb_ban_state bantime;`

change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`if (!ctdb_validate_pnn(ctdb, pnn)) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Bad pnn %u in ctdb_ban_node\n", pnn));`
handle CTDB_CURRENT_NODE in ban commands (This used to be ctdb commit fefb53f1d22c5458a1e107f8352818aee87983de) 2007-06-07 10:48:31 +04:00			`return;`
			`}`

recoverd: Print banning message only after verifying pnn Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 4be8dff3a4451192f838497b4747273685959bed) 2013-06-24 08:18:58 +04:00			`DEBUG(DEBUG_NOTICE,("Banning node %u for %u seconds\n", pnn, ban_time));`

new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`bantime.pnn = pnn;`
			`bantime.time = ban_time;`
add log output for when ctdb_ban_node() and ctdb_unban_node() are called when these functions are called to ban or unban a node make sure we update the CTDB_NODE_BANNED flag in rec->node_flags since this field and flag are checked during the election process (This used to be ctdb commit 740c632ae96a2d34327d1b575780aaf079d93f4f) 2007-11-23 04:36:14 +03:00
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ret = ctdb_ctrl_set_ban(ctdb, CONTROL_TIMEOUT(), pnn, &bantime);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR,(__location__ " Failed to ban node %u\n", pnn));`
rework banning/unbanning nodes ctdb_recoverd.c Always handle banning/unbanning locally on the node that is being banned/unbanned instead of on the recovery master. This means that if a ban request comes in to the recovery master for a remote node, we pass the request on to the remote node instead of setting up the ban and ban timeouts locally. ctdb.c send ban/unban requests to the node being banned/unbanned instead of to the recmaster (This used to be ctdb commit 880dd9f5fd0b91e450da93e195cc5c62cb1dcd6e) 2007-12-03 07:45:53 +03:00			`return;`
use a priority time for the election data, not just the vnn (This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d) 2007-06-07 12:37:27 +04:00			`}`

added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00			`}`

add async versions of the freeze node control and freeze all nodes in parallel (This used to be ctdb commit f34e89f54d9f4380e76eb1b5b2385a4d8500b505) 2007-08-27 04:31:22 +04:00			`enum monitor_result { MONITOR_OK, MONITOR_RECOVERY_NEEDED, MONITOR_ELECTION_NEEDED, MONITOR_FAILED};`


add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00			`/*`
			`remember the trouble maker, adding count banning credits`
			`*/`
ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`static void ctdb_set_culprit_count(struct ctdb_recoverd *rec,`
			`uint32_t culprit,`
			`uint32_t count)`
add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00			`{`
ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`struct ctdb_context *ctdb = talloc_get_type_abort(`
			`rec->ctdb, struct ctdb_context);`
			`struct ctdb_banning_state *ban_state = NULL;`
			`size_t len;`
			`bool ok;`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00
ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`ok = node_flags(rec, culprit, NULL);`
			`if (!ok) {`
			`DBG_WARNING("Unknown culprit node %"PRIu32"\n", culprit);`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`return;`
			`}`

recoverd: Do not set banning credits on a node if current node is inactive If the current node is banned or stopped, then it should not assign banning credits to other nodes since the current node will not have up-to-date flags of other nodes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 38304f88e0c634e97d4687c25adef975f71537b8) 2013-06-28 08:10:47 +04:00			`/* If we are banned or stopped, do not set other nodes as culprits */`
			`if (rec->node_flags & NODE_FLAGS_INACTIVE) {`
ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`D_WARNING("This node is INACTIVE, cannot set culprit node %"PRIu32"\n",`
			`culprit);`
recoverd: Do not set banning credits on a node if current node is inactive If the current node is banned or stopped, then it should not assign banning credits to other nodes since the current node will not have up-to-date flags of other nodes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 38304f88e0c634e97d4687c25adef975f71537b8) 2013-06-28 08:10:47 +04:00			`return;`
			`}`

ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`if (rec->banning_state == NULL) {`
			`len = 0;`
			`} else {`
			`size_t i;`

			`len = talloc_array_length(rec->banning_state);`

			`for (i = 0; i < len; i++) {`
			`if (rec->banning_state[i].pnn == culprit) {`
			`ban_state = &rec->banning_state[i];`
			`break;`
			`}`
			`}`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`}`
ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00
			`/* Not found, so extend (or allocate new) array */`
			`if (ban_state == NULL) {`
			`struct ctdb_banning_state *t;`

			`len += 1;`
			`/*`
			`* talloc_realloc() handles the corner case where`
			`* rec->banning_state is NULL`
			`*/`
			`t = talloc_realloc(rec,`
			`rec->banning_state,`
			`struct ctdb_banning_state,`
			`len);`
			`if (t == NULL) {`
ctdb: Add missing newlines to logging messages Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org> 2023-07-31 07:07:36 +03:00			`DBG_WARNING("Memory allocation error\n");`
ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`return;`
			`}`
			`rec->banning_state = t;`

			`/* New element is always at the end - initialise it... */`
			`ban_state = &rec->banning_state[len - 1];`
			`*ban_state = (struct ctdb_banning_state) {`
			`.pnn = culprit,`
			`.count = 0,`
			`};`
			`} else if (ban_state->count > 0 &&`
			`timeval_elapsed(&ban_state->last_reported_time) >`
			`ctdb->tunable.recovery_grace_period) {`
			`/*`
			`* Forgive old transgressions beyond the tunable time-limit`
			`*/`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ban_state->count = 0;`
add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00			`}`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00
			`ban_state->count += count;`
			`ban_state->last_reported_time = timeval_current();`
			`rec->last_culprit_node = culprit;`
add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00			`}`
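The function above keeps a growable array of per-node ban counters, forgives counts that are older than a grace period, and then adds the new credits. The following sketch reproduces that bookkeeping under stated assumptions: it uses plain `realloc()` instead of talloc, takes the clock as a caller-supplied number of seconds so the behaviour is deterministic, and all `example_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-node counter */
struct example_ban_state {
	uint32_t pnn;
	uint32_t count;
	double last_reported;	/* seconds, from a caller-supplied clock */
};

struct example_ban_table {
	struct example_ban_state *entries;
	size_t len;
};

/* Add count credits to culprit.  Returns 0 on success, -1 on failure. */
static int example_add_credits(struct example_ban_table *t,
			       uint32_t culprit,
			       uint32_t count,
			       double now,
			       double grace_period)
{
	struct example_ban_state *s = NULL;
	size_t i;

	assert(t != NULL);

	for (i = 0; i < t->len; i++) {
		if (t->entries[i].pnn == culprit) {
			s = &t->entries[i];
			break;
		}
	}

	if (s == NULL) {
		/* Not found: realloc(NULL, n) behaves like malloc(n) */
		struct example_ban_state *tmp =
			realloc(t->entries, (t->len + 1) * sizeof(*tmp));
		if (tmp == NULL) {
			return -1;
		}
		t->entries = tmp;
		s = &t->entries[t->len];
		t->len += 1;
		*s = (struct example_ban_state) { .pnn = culprit };
	} else if (s->count > 0 && now - s->last_reported > grace_period) {
		/* Forgive credits older than the grace period */
		s->count = 0;
	}

	s->count += count;
	s->last_reported = now;
	return 0;
}
```

Assigning the `realloc()` result to a temporary first keeps the existing array reachable if the allocation fails, which mirrors the use of the temporary `t` in the function above.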

ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`static void ban_counts_reset(struct ctdb_recoverd *rec)`
			`{`
			`D_NOTICE("Resetting ban count to 0 for all nodes\n");`
			`TALLOC_FREE(rec->banning_state);`
			`}`

If we can not pull a database from a node during recovery, mark this node as a "culprit" so that it will eventually become banned. (This used to be ctdb commit 69dc3bf60b86d8df6dc5c7c6ebf303e847fb2ba9) 2009-04-24 07:58:32 +04:00			`/*`
			`remember the trouble maker with a single banning credit`
			`*/`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`static void ctdb_set_culprit(struct ctdb_recoverd *rec, uint32_t culprit)`
If we can not pull a database from a node during recovery, mark this node as a "culprit" so that it will eventually become banned. (This used to be ctdb commit 69dc3bf60b86d8df6dc5c7c6ebf303e847fb2ba9) 2009-04-24 07:58:32 +04:00			`{`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit_count(rec, culprit, 1);`
If we can not pull a database from a node during recovery, mark this node as a "culprit" so that it will eventually become banned. (This used to be ctdb commit 69dc3bf60b86d8df6dc5c7c6ebf303e847fb2ba9) 2009-04-24 07:58:32 +04:00			`}`
add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`/*`
ctdb-recmaster: Update capabilities before calling first election Capabilities are used when computing an election result so having them up-to-date seems like a good idea. Also update several instances of an ambiguous comment. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-27 07:09:33 +03:00			`Retrieve capabilities from all connected nodes`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`*/`
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`static int update_capabilities(struct ctdb_recoverd *rec,`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap)`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`{`
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`uint32_t *capp;`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`TALLOC_CTX *tmp_ctx;`
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`struct ctdb_node_capabilities *caps;`
			`struct ctdb_context *ctdb = rec->ctdb;`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`tmp_ctx = talloc_new(rec);`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`CTDB_NO_MEMORY(ctdb, tmp_ctx);`

ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`caps = ctdb_get_capabilities(ctdb, tmp_ctx,`
			`CONTROL_TIMEOUT(), nodemap);`

			`if (caps == NULL) {`
			`DEBUG(DEBUG_ERR,`
			`(__location__ " Failed to get node capabilities\n"));`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`talloc_free(tmp_ctx);`
			`return -1;`
			`}`

ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`capp = ctdb_get_node_capabilities(caps, rec->pnn);`
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`if (capp == NULL) {`
			`DEBUG(DEBUG_ERR,`
			`(__location__`
			`" Capabilities don't include current node.\n"));`
			`talloc_free(tmp_ctx);`
			`return -1;`
			`}`
			`ctdb->capabilities = *capp;`

			`TALLOC_FREE(rec->caps);`
			`rec->caps = talloc_steal(rec, caps);`

Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`talloc_free(tmp_ctx);`
			`return 0;`
			`}`

implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`change recovery mode on all nodes`
			`*/`
ctdb-recoverd: Do not freeze databases for election If election occurs during SMB activity, then trying to freeze all the databases can cause samba/ctdb deadlock which parallel database recovery is trying to avoid. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-06 03:52:06 +03:00			`static int set_recovery_mode(struct ctdb_context *ctdb,`
			`struct ctdb_recoverd *rec,`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap,`
ctdb-recoverd: Drop code to freeze databases from set_recovery_mode() This function is called only once from force_election() and does not require freezing of databases. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-09-13 08:45:54 +03:00			`uint32_t rec_mode)`
break out the setting/clearing of recovery mode into a dedicated helper function (This used to be ctdb commit dba4e4f8aa4f2fde1e9f8d93bdf3a33f7de8ce18) 2007-05-06 03:53:12 +04:00			`{`
new simpler and much faster recovery code based on tdb transactions (This used to be ctdb commit 9ef2268a1674b01f60c58fed72af8ac982fe77a3) 2008-01-06 04:38:01 +03:00			`TDB_DATA data;`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`uint32_t *nodes;`
			`TALLOC_CTX *tmp_ctx;`

			`tmp_ctx = talloc_new(ctdb);`
			`CTDB_NO_MEMORY(ctdb, tmp_ctx);`

add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00			`nodes = list_of_active_nodes(ctdb, nodemap, tmp_ctx, true);`
ctdb-recoverd: Set recovery mode before freezing databases Setting recovery mode to active is the only correct way to inform recovery daemon to run database recovery. Only freezing databases without setting recovery mode should not trigger database recovery, as this mechanism is used in tool to implement wipedb/restoredb commands. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2014-05-06 08:24:52 +04:00
			`data.dsize = sizeof(uint32_t);`
			`data.dptr = (unsigned char *)&rec_mode;`

			`if (ctdb_client_async_control(ctdb, CTDB_CONTROL_SET_RECMODE,`
			`nodes, 0,`
			`CONTROL_TIMEOUT(),`
			`false, data,`
			`NULL, NULL,`
			`NULL) != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode. Recovery failed.\n"));`
			`talloc_free(tmp_ctx);`
			`return -1;`
			`}`

merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`talloc_free(tmp_ctx);`
break out the setting/clearing of recovery mode into a dedicated helper function (This used to be ctdb commit dba4e4f8aa4f2fde1e9f8d93bdf3a33f7de8ce18) 2007-05-06 03:53:12 +04:00			`return 0;`
			`}`

ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c This file is the only user of this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:37:57 +03:00			`/*`
ctdb-recoverd: Flatten update_flags_on_all_nodes() The logic currently in ctdb_ctrl_modflags() will be optimised so that it no longer matches the pattern for a control function. So, remove this function and squash its functionality into the only caller. Although there are some superficial changes, the behaviour is unchanged. Flattening the 2 functions produces some seriously weird logic for setting the new flags, to the point where using ctdb_ctrl_modflags() for this purpose now looks very strange. The weirdness will be cleaned up in a subsequent commit. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-28 03:46:17 +03:00			`* Update flags on all connected nodes`
ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c This file is the only user of this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:37:57 +03:00			`*/`
ctdb-recoverd: Flatten update_flags_on_all_nodes() The logic currently in ctdb_ctrl_modflags() will be optimised so that it no longer matches the pattern for a control function. So, remove this function and squash its functionality into the only caller. Although there are some superficial changes, the behaviour is unchanged. Flattening the 2 functions produces some seriously weird logic for setting the new flags, to the point where using ctdb_ctrl_modflags() for this purpose now looks very strange. The weirdness will be cleaned up in a subsequent commit. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-28 03:46:17 +03:00			`static int update_flags_on_all_nodes(struct ctdb_recoverd *rec,`
			`uint32_t pnn,`
			`uint32_t flags)`
ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c This file is the only user of this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:37:57 +03:00			`{`
ctdb-recoverd: Flatten update_flags_on_all_nodes() The logic currently in ctdb_ctrl_modflags() will be optimised so that it no longer matches the pattern for a control function. So, remove this function and squash its functionality into the only caller. Although there are some superficial changes, the behaviour is unchanged. Flattening the 2 functions produces some seriously weird logic for setting the new flags, to the point where using ctdb_ctrl_modflags() for this purpose now looks very strange. The weirdness will be cleaned up in a subsequent commit. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-28 03:46:17 +03:00			`struct ctdb_context *ctdb = rec->ctdb;`
			`struct timeval timeout = CONTROL_TIMEOUT();`
ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c This file is the only user of this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:37:57 +03:00			`TDB_DATA data;`
			`struct ctdb_node_map_old *nodemap=NULL;`
			`struct ctdb_node_flag_change c;`
			`TALLOC_CTX *tmp_ctx = talloc_new(ctdb);`
			`uint32_t *nodes;`
ctdb-recoverd: Correctly find nodemap entry for pnn Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 07:22:15 +03:00			`uint32_t i;`
ctdb-recoverd: Flatten update_flags_on_all_nodes() The logic currently in ctdb_ctrl_modflags() will be optimised so that it no longer matches the pattern for a control function. So, remove this function and squash its functionality into the only caller. Although there are some superficial changes, the behaviour is unchanged. Flattening the 2 functions produces some seriously weird logic for setting the new flags, to the point where using ctdb_ctrl_modflags() for this purpose now looks very strange. The weirdness will be cleaned up in a subsequent commit. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-28 03:46:17 +03:00			`int ret;`
ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c This file is the only user of this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:37:57 +03:00
ctdb-recoverd: Do not retrieve nodemap from recovery master It is already in rec->nodemap. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:49:05 +03:00			`nodemap = rec->nodemap;`
ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c This file is the only user of this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:37:57 +03:00
ctdb-recoverd: Correctly find nodemap entry for pnn Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 07:22:15 +03:00			`for (i = 0; i < nodemap->num; i++) {`
			`if (pnn == nodemap->nodes[i].pnn) {`
			`break;`
			`}`
			`}`
			`if (i >= nodemap->num) {`
ctdb-recoverd: Do not retrieve nodemap from recovery master It is already in rec->nodemap. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:49:05 +03:00			`DBG_ERR("Nodemap does not contain node %u\n", pnn);`
ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c This file is the only user of this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:37:57 +03:00			`talloc_free(tmp_ctx);`
			`return -1;`
			`}`

ctdb-recoverd: Flatten update_flags_on_all_nodes() The logic currently in ctdb_ctrl_modflags() will be optimised so that it no longer matches the pattern for a control function. So, remove this function and squash its functionality into the only caller. Although there are some superficial changes, the behaviour is unchanged. Flattening the 2 functions produces some seriously weird logic for setting the new flags, to the point where using ctdb_ctrl_modflags() for this purpose now looks very strange. The weirdness will be cleaned up in a subsequent commit. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-28 03:46:17 +03:00			`c.pnn = pnn;`
ctdb-recoverd: Correctly find nodemap entry for pnn Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 07:22:15 +03:00			`c.old_flags = nodemap->nodes[i].flags;`
ctdb-recoverd: Simplify calculation of new flags Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri Jul 24 06:03:23 UTC 2020 on sn-devel-184 2020-07-14 07:29:09 +03:00			`c.new_flags = flags;`
ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c This file is the only user of this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:37:57 +03:00
			`data.dsize = sizeof(c);`
			`data.dptr = (unsigned char *)&c;`

			`/* send the flags update to all connected nodes */`
			`nodes = list_of_connected_nodes(ctdb, nodemap, tmp_ctx, true);`

ctdb-recoverd: Flatten update_flags_on_all_nodes() The logic currently in ctdb_ctrl_modflags() will be optimised so that it no longer matches the pattern for a control function. So, remove this function and squash its functionality into the only caller. Although there are some superficial changes, the behaviour is unchanged. Flattening the 2 functions produces some seriously weird logic for setting the new flags, to the point where using ctdb_ctrl_modflags() for this purpose now looks very strange. The weirdness will be cleaned up in a subsequent commit. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-28 03:46:17 +03:00			`ret = ctdb_client_async_control(ctdb,`
			`CTDB_CONTROL_MODIFY_FLAGS,`
			`nodes,`
			`0,`
			`timeout,`
			`false,`
			`data,`
			`NULL,`
			`NULL,`
			`NULL);`
			`if (ret != 0) {`
			`DBG_ERR("Unable to update flags on remote nodes\n");`
ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c This file is the only user of this function. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:37:57 +03:00			`talloc_free(tmp_ctx);`
			`return -1;`
			`}`

			`talloc_free(tmp_ctx);`
			`return 0;`
			`}`
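
The lookup at the top of `update_flags_on_all_nodes()` searches the nodemap for the entry whose `pnn` matches, rather than using the PNN as an array index, and treats running off the end of the array as "not found". A minimal standalone sketch of that pattern follows. The names `node_entry` and `find_node_index` are hypothetical and are not part of ctdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one nodemap entry; not the ctdb structure. */
struct node_entry {
	uint32_t pnn;
	uint32_t flags;
};

/*
 * Return the index of the entry with the given PNN, or -1 if the
 * array does not contain it. The PNN is compared against each entry
 * because a PNN need not equal its position in the array.
 */
int find_node_index(const struct node_entry *nodes, uint32_t num,
		    uint32_t pnn)
{
	uint32_t i;

	assert(nodes != NULL || num == 0);

	for (i = 0; i < num; i++) {
		if (nodes[i].pnn == pnn) {
			return (int)i;
		}
	}

	return -1;
}
```

A caller would check for `-1` before reading `nodes[i].flags`, which mirrors the `i >= nodemap->num` check above.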

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`static bool _cluster_lock_lock(struct ctdb_recoverd *rec);`
			`static bool cluster_lock_held(struct ctdb_recoverd *rec);`
ctdb-recoverd: Factor out function cluster_lock_take() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-20 05:30:58 +03:00
ctdb-recoverd: Add and use function cluster_lock_enabled() Now all references to ctdb->recovery_lock are encapsulated in the cluster lock code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:43:10 +03:00			`static bool cluster_lock_enabled(struct ctdb_recoverd *rec)`
			`{`
			`return rec->ctdb->recovery_lock != NULL;`
			`}`

ctdb-recoverd: Factor out function cluster_lock_take() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-20 05:30:58 +03:00			`static bool cluster_lock_take(struct ctdb_recoverd *rec)`
			`{`
			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`bool have_lock;`
ctdb-recoverd: Factor out function cluster_lock_take() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-20 05:30:58 +03:00
ctdb-recoverd: Add and use function cluster_lock_enabled() Now all references to ctdb->recovery_lock are encapsulated in the cluster lock code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:43:10 +03:00			`if (!cluster_lock_enabled(rec)) {`
ctdb-recoverd: Factor out function cluster_lock_take() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-20 05:30:58 +03:00			`return true;`
			`}`

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`if (cluster_lock_held(rec)) {`
			`D_NOTICE("Already holding cluster lock\n");`
ctdb-recoverd: Factor out function cluster_lock_take() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-20 05:30:58 +03:00			`return true;`
			`}`

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`D_NOTICE("Attempting to take cluster lock (%s)\n", ctdb->recovery_lock);`
			`have_lock = _cluster_lock_lock(rec);`
			`if (!have_lock) {`
ctdb-recoverd: Factor out function cluster_lock_take() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-20 05:30:58 +03:00			`return false;`
			`}`

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`D_NOTICE("Cluster lock taken successfully\n");`
ctdb-recoverd: Factor out function cluster_lock_take() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-20 05:30:58 +03:00			`return true;`
			`}`

add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`/*`
			`called when ctdb_wait_timeout should finish`
			`*/`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_wait_handler(struct tevent_context *ev,`
			`struct tevent_timer *te,`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`struct timeval yt, void *p)`
			`{`
			`uint32_t *timed_out = (uint32_t *)p;`
			`(*timed_out) = 1;`
			`}`

			`/*`
			`wait for a given number of seconds`
			`*/`
speed startup: don't wait a full recovery interval if we've already waited We currently sleep for one second, whether or not we've already slept. Change this to sleep for the remainder of the second, if at all. Seconds between ctdbd first log message and node healthy: BEFORE: 18.09 AFTER: 17.08 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fde760b5f39c77172308a583da4c2443b71541c9) 2010-06-22 17:20:35 +04:00			`static void ctdb_wait_timeout(struct ctdb_context *ctdb, double secs)`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`{`
			`uint32_t timed_out = 0;`
ctdb-recoverd: CID 1509028 - Use of 32-bit time_t (Y2K38_SAFETY) usecs is going to be passed as a uint32_t. There is no need to calculate it as a time_t. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-10-12 01:05:25 +03:00			`uint32_t usecs = (secs - (uint32_t)secs) * 1000000;`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_add_timer(ctdb->ev, ctdb, timeval_current_ofs(secs, usecs),`
			`ctdb_wait_handler, &timed_out);`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`while (!timed_out) {`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_loop_once(ctdb->ev);`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`}`
			`}`
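
`ctdb_wait_timeout()` takes its delay as a `double` and passes whole seconds and microseconds separately to `timeval_current_ofs()`. The split can be sketched in isolation as below. The `split_time` structure and `split_secs()` helper are hypothetical, written only to illustrate the arithmetic, and assume a non-negative delay that fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of ctdb. */
struct split_time {
	uint32_t secs;
	uint32_t usecs;
};

/*
 * Split a delay in seconds into whole seconds plus microseconds.
 * The cast truncates toward zero, so the difference is the
 * fractional part, which is then scaled to microseconds.
 */
struct split_time split_secs(double secs)
{
	struct split_time t;

	assert(secs >= 0.0);

	t.secs = (uint32_t)secs;
	t.usecs = (uint32_t)((secs - t.secs) * 1000000);

	return t;
}
```

For example, a delay of `1.5` splits into 1 second and 500000 microseconds.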

ctdb-recoverd: Send leader broadcasts These are triggered on 1 second timer, but are only sent if the node is the current leader and there is no election underway. If this node can not be the leader then ensure it releases the recovery lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:16:44 +03:00			`/*`
			`* Broadcast cluster leader`
			`*/`

			`static int leader_broadcast_send(struct ctdb_recoverd *rec, uint32_t pnn)`
			`{`
			`struct ctdb_context *ctdb = rec->ctdb;`
			`TDB_DATA data;`
			`int ret;`

			`data.dptr = (uint8_t *)&pnn;`
			`data.dsize = sizeof(pnn);`

			`ret = ctdb_client_send_message(ctdb,`
			`CTDB_BROADCAST_CONNECTED,`
			`CTDB_SRVID_LEADER,`
			`data);`
			`return ret;`
			`}`

			`static int leader_broadcast_loop(struct ctdb_recoverd *rec);`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`static void cluster_lock_release(struct ctdb_recoverd *rec);`
ctdb-recoverd: Send leader broadcasts These are triggered on 1 second timer, but are only sent if the node is the current leader and there is no election underway. If this node can not be the leader then ensure it releases the recovery lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:16:44 +03:00
ctdb:server: Fix code spelling Best reviewed with: `git show --word-diff` Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> 2023-03-22 11:36:23 +03:00			`/* This runs continuously but only sends the broadcast when leader */`
ctdb-recoverd: Send leader broadcasts These are triggered on 1 second timer, but are only sent if the node is the current leader and there is no election underway. If this node can not be the leader then ensure it releases the recovery lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:16:44 +03:00			`static void leader_broadcast_loop_handler(struct tevent_context *ev,`
			`struct tevent_timer *te,`
			`struct timeval current_time,`
			`void *private_data)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type_abort(`
			`private_data, struct ctdb_recoverd);`
			`int ret;`

			`if (!this_node_can_be_leader(rec)) {`
			`if (this_node_is_leader(rec)) {`
			`rec->leader = CTDB_UNKNOWN_PNN;`
			`}`
ctdb-recoverd: Add and use function cluster_lock_enabled() Now all references to ctdb->recovery_lock are encapsulated in the cluster lock code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:43:10 +03:00			`if (cluster_lock_enabled(rec) && cluster_lock_held(rec)) {`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`cluster_lock_release(rec);`
ctdb-recoverd: Send leader broadcasts These are triggered on 1 second timer, but are only sent if the node is the current leader and there is no election underway. If this node can not be the leader then ensure it releases the recovery lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:16:44 +03:00			`}`
			`goto done;`
			`}`

			`if (!this_node_is_leader(rec)) {`
			`goto done;`
			`}`

			`if (rec->election_in_progress) {`
			`goto done;`
			`}`

			`ret = leader_broadcast_send(rec, rec->leader);`
			`if (ret != 0) {`
			`DBG_WARNING("Failed to send leader broadcast\n");`
			`}`

			`done:`
			`ret = leader_broadcast_loop(rec);`
			`if (ret != 0) {`
			`D_WARNING("Failed to set up leader broadcast\n");`
			`}`
			`}`
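
The early exits in `leader_broadcast_loop_handler()` amount to a three-part condition for sending the broadcast. The sketch below restates only that condition as a pure function. The name `should_send_leader_broadcast` and its boolean parameters are hypothetical stand-ins for `this_node_can_be_leader()`, `this_node_is_leader()` and `rec->election_in_progress`. It leaves out the side effects of the real handler, such as releasing the cluster lock and re-arming the timer.

```c
#include <stdbool.h>

/*
 * Hypothetical restatement of the handler's checks: a broadcast is
 * sent only when this node is eligible to lead, is the current
 * leader, and no election is under way.
 */
bool should_send_leader_broadcast(bool can_be_leader,
				  bool is_leader,
				  bool election_in_progress)
{
	if (!can_be_leader) {
		return false;
	}

	if (!is_leader) {
		return false;
	}

	if (election_in_progress) {
		return false;
	}

	return true;
}
```

In the real handler the timer is re-armed in every case, so the loop keeps running even on nodes that never send.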

			`static int leader_broadcast_loop(struct ctdb_recoverd *rec)`
			`{`
			`struct ctdb_context *ctdb = rec->ctdb;`

			`TALLOC_FREE(rec->leader_broadcast_te);`
			`rec->leader_broadcast_te =`
			`tevent_add_timer(ctdb->ev,`
			`rec,`
			`timeval_current_ofs(1, 0),`
			`leader_broadcast_loop_handler,`
			`rec);`
			`if (rec->leader_broadcast_te == NULL) {`
			`return ENOMEM;`
			`}`

			`return 0;`
			`}`

			`static bool leader_broadcast_loop_active(struct ctdb_recoverd *rec)`
			`{`
			`return rec->leader_broadcast_te != NULL;`
			`}`

make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`/*`
			`called when an election times out (ends)`
			`*/`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_election_timeout(struct tevent_context *ev,`
			`struct tevent_timer *te,`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`struct timeval t, void *p)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type(p, struct ctdb_recoverd);`
ctdb-recoverd: Take cluster lock when election completes It is no longer just a recovery lock but is always held by the cluster leader. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-20 07:13:58 +03:00			`bool ok;`

ctdb-recoverd: Add an explicit flag for election in progress An alternate election method will be added that doesn't use the election timeout, so this provides a common way for recognising when an election is in progress. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 12:27:10 +03:00			`rec->election_in_progress = false;`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`rec->election_timeout = NULL;`
speed startup: with --sloppy-start, cut initial election timeout to 1/2 second. Seconds between ctdbd first log message and node healthy: BEFORE: 4.03 AFTER: 2.02 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 8f17731dea4287d4f9b21dc58c1bdf26c8a0e628) 2010-06-22 17:25:20 +04:00			`fast_start = false;`
When we create new election data to send during elections, we must re-read the node flags from the main daemon to catch when the STOPPED flag is changed. (This used to be ctdb commit ca4982c40d81db528fe915d5ecc01fcf7df0b522) 2009-07-17 05:37:03 +04:00
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`D_WARNING("Election period ended, leader=%u\n", rec->leader);`
ctdb-recoverd: Take cluster lock when election completes It is no longer just a recovery lock but is always held by the cluster leader. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-20 07:13:58 +03:00
			`if (!this_node_is_leader(rec)) {`
			`return;`
			`}`

			`ok = cluster_lock_take(rec);`
			`if (!ok) {`
			`D_ERR("Unable to get cluster lock, banning node\n");`
			`ctdb_ban_node(rec, rec->pnn);`
			`}`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`}`


			`/*`
			`wait for an election to finish. It finishes election_timeout seconds after`
			`the last election packet is received`
			`*/`
			`static void ctdb_wait_election(struct ctdb_recoverd *rec)`
			`{`
			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-recoverd: Add an explicit flag for election in progress An alternate election method will be added that doesn't use the election timeout, so this provides a common way for recognising when an election is in progress. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 12:27:10 +03:00			`while (rec->election_in_progress) {`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_loop_once(ctdb->ev);`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`}`
			`}`

sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`/*`
ctdb-recoverd: Change update_local_flags() to use already retrieved nodemaps BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 12:35:55 +03:00			`* Update local flags from all remote connected nodes and push out`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`* flags changes to all nodes. This is only run by the leader.`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`*/`
ctdb-recoverd: Rename update_local_flags() -> update_flags() This also updates remote flags so the name is misleading. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-24 02:21:37 +03:00			`static int update_flags(struct ctdb_recoverd *rec,`
			`struct ctdb_node_map_old *nodemap,`
			`struct ctdb_node_map_old **remote_nodemaps)`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`{`
ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-23 01:43:58 +03:00			`unsigned int j;`
dont manipulate ctdb->monitoring_mode directly from the SET_MON_MODE control, instead call ctdb_start/stop_monitoring() ctdb_stop_monitoring() dont allocate a new monitoring context, leave it NULL. Also set the monitoring_mode in this function so that ctdb_stop/start_monitoring() and ->monitoring_mode are kept in sync. Add a debug message to log that we have stopped monitoring. ctdb_start_monitoring() check whether monitoring is already active and make the function idempotent. Create the monitoring context when monitoring is started. Update ->monitoring_mode once the monitoring has been started. Add a debug message to log that we have started monitoring. When we temporarily stop monitoring while running an event script, restart monitoring after the event script wrapper returns instead of in the event script callback. Let monitoring_mode start out as DISABLED and let it be enabled once we call ctdb_start_monitoring. dont check for MONITORING_DISABLED in check_fore_dead_nodes(). If monitoring is disabled, this event handler will not be called. (This used to be ctdb commit 3a93ae8bdcffb1adbd6243844f3058fc742f76aa) 2007-11-30 00:44:34 +03:00			`struct ctdb_context *ctdb = rec->ctdb;`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`TALLOC_CTX *mem_ctx = talloc_new(ctdb);`

ctdb-recoverd: Change update_local_flags() to use already retrieved nodemaps BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 12:35:55 +03:00			`/* Check flags from remote nodes */`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`for (j=0; j<nodemap->num; j++) {`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *remote_nodemap=NULL;`
ctdb-recoverd: Introduce some local variables to improve readability Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-06-15 00:19:26 +03:00			`uint32_t local_flags = nodemap->nodes[j].flags;`
ctdb-recoverd: Add a helper variable Improves readability and simplifies subsequent changes. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-07-11 13:40:10 +03:00			`uint32_t remote_pnn = nodemap->nodes[j].pnn;`
ctdb-recoverd: Introduce some local variables to improve readability Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-06-15 00:19:26 +03:00			`uint32_t remote_flags;`
ctdb-recoverd: Push flags for a node if any remote node disagrees This will usually happen if flags on the node in question change, so keeping the code simple and pushing to all nodes won't hurt. When all nodes come up there might be differences in connected nodes, causing such "fix ups". Receiving nodes will ignore no-op pushes. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-07-11 15:17:08 +03:00			`unsigned int i;`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`int ret;`

ctdb-recoverd: Introduce some local variables to improve readability Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-06-15 00:19:26 +03:00			`if (local_flags & NODE_FLAGS_DISCONNECTED) {`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`continue;`
			`}`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`if (remote_pnn == rec->pnn) {`
ctdb-recoverd: Push flags for a node if any remote node disagrees This will usually happen if flags on the node in question change, so keeping the code simple and pushing to all nodes won't hurt. When all nodes come up there might be differences in connected nodes, causing such "fix ups". Receiving nodes will ignore no-op pushes. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-07-11 15:17:08 +03:00			`/*`
			`* No remote nodemap for this node since this`
			`* is the local nodemap. However, still need`
			`* to check this against the remote nodes and`
			`* push it if they are out-of-date.`
			`*/`
			`goto compare_remotes;`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`}`

ctdb-recoverd: Change update_local_flags() to use already retrieved nodemaps BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 12:35:55 +03:00			`remote_nodemap = remote_nodemaps[j];`
ctdb-recoverd: Introduce some local variables to improve readability Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-06-15 00:19:26 +03:00			`remote_flags = remote_nodemap->nodes[j].flags;`

			`if (local_flags != remote_flags) {`
			`/*`
			`* Update the local copy of the flags in the`
			`* recovery daemon.`
			`*/`
			`D_NOTICE("Remote node %u had flags 0x%x, "`
			`"local had 0x%x - updating local\n",`
ctdb-recoverd: Add a helper variable Improves readability and simplifies subsequent changes. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-07-11 13:40:10 +03:00			`remote_pnn,`
ctdb-recoverd: Introduce some local variables to improve readability Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-06-15 00:19:26 +03:00			`remote_flags,`
			`local_flags);`
			`nodemap->nodes[j].flags = remote_flags;`
ctdb-recoverd: Update the local node map before pushing out flags The resulting code structure looks a little weird. However, there is another condition that requires the flags to be pushed that will be inserted before the continue statement in a subsequent commit.. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-07-11 14:28:43 +03:00			`local_flags = remote_flags;`
			`goto push;`
			`}`

ctdb-recoverd: Push flags for a node if any remote node disagrees This will usually happen if flags on the node in question change, so keeping the code simple and pushing to all nodes won't hurt. When all nodes come up there might be differences in connected nodes, causing such "fix ups". Receiving nodes will ignore no-op pushes. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-07-11 15:17:08 +03:00			`compare_remotes:`
			`for (i = 0; i < nodemap->num; i++) {`
			`if (i == j) {`
			`continue;`
			`}`
			`if (nodemap->nodes[i].flags & NODE_FLAGS_DISCONNECTED) {`
			`continue;`
			`}`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`if (nodemap->nodes[i].pnn == rec->pnn) {`
ctdb-recoverd: Push flags for a node if any remote node disagrees This will usually happen if flags on the node in question change, so keeping the code simple and pushing to all nodes won't hurt. When all nodes come up there might be differences in connected nodes, causing such "fix ups". Receiving nodes will ignore no-op pushes. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-07-11 15:17:08 +03:00			`continue;`
			`}`

			`remote_nodemap = remote_nodemaps[i];`
			`remote_flags = remote_nodemap->nodes[j].flags;`

			`if (local_flags != remote_flags) {`
			`goto push;`
			`}`
			`}`

ctdb-recoverd: Update the local node map before pushing out flags The resulting code structure looks a little weird. However, there is another condition that requires the flags to be pushed that will be inserted before the continue statement in a subsequent commit.. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-07-11 14:28:43 +03:00			`continue;`

			`push:`
			`D_NOTICE("Pushing updated flags for node %u (0x%x)\n",`
			`remote_pnn,`
			`local_flags);`
			`ret = update_flags_on_all_nodes(rec, remote_pnn, local_flags);`
			`if (ret != 0) {`
			`DBG_ERR("Unable to update flags on remote nodes\n");`
			`talloc_free(mem_ctx);`
			`return -1;`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`}`
			`}`
			`talloc_free(mem_ctx);`
ctdb-recoverd: Simplify return values when updating local flags Change this to return just 0 or -1. It isn't monitoring anything. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-27 14:47:08 +03:00			`return 0;`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`}`
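
The decision made for each node in the loop above can be read as a small pure function. The sketch below is an illustration only, not the daemon's code: `views[i][j]` stands in for `remote_nodemaps[i]->nodes[j].flags`, `views[self]` plays the role of the local nodemap, and every `sketch_`/`SKETCH_` name is hypothetical. It assumes, as the listing does, that array index and node position coincide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_NODES 3
/* Hypothetical stand-in for NODE_FLAGS_DISCONNECTED */
#define SKETCH_FLAG_DISCONNECTED 0x1u

/*
 * views[i][j] is node i's opinion of node j's flags. Returns true if
 * the flags for node j should be pushed to the cluster. When node j
 * disagrees with the local view of itself, the local view is updated
 * first, because a node is treated as authoritative for its own flags.
 */
static bool sketch_needs_push(uint32_t views[SKETCH_NUM_NODES][SKETCH_NUM_NODES],
			      unsigned int self,
			      unsigned int j)
{
	unsigned int i;

	assert(self < SKETCH_NUM_NODES && j < SKETCH_NUM_NODES);

	if (views[self][j] & SKETCH_FLAG_DISCONNECTED) {
		/* No remote view is available for a disconnected node */
		return false;
	}

	if (j != self && views[j][j] != views[self][j]) {
		/* Adopt node j's own flags locally, then push them */
		views[self][j] = views[j][j];
		return true;
	}

	/* Local view is current: check whether any other node lags behind */
	for (i = 0; i < SKETCH_NUM_NODES; i++) {
		if (i == j || i == self) {
			continue;
		}
		if (views[self][i] & SKETCH_FLAG_DISCONNECTED) {
			continue;
		}
		if (views[i][j] != views[self][j]) {
			return true;
		}
	}

	return false;
}
```

Pushing to every node when a single remote disagrees keeps the logic simple; per the commit message above, receiving nodes ignore no-op pushes.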


ctdbd: Fix a typo Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> 2015-10-12 17:52:49 +03:00			`/* Create a new random generation id.`
create a define to represent the 'invalid' generation id we used in two places. create a new helper function to generate new generation id values that know about the invalid id and avoids generating it. update the ctdb status tool to know about the invalid generation id and print the string INVALID instead (This used to be ctdb commit 4fbcd189543cb8a92227fdcd3d158472e558ccda) 2007-08-22 06:38:31 +04:00			`The generation id cannot be the INVALID_GENERATION id`
			`*/`
			`static uint32_t new_generation(void)`
			`{`
			`uint32_t generation;`

			`while (1) {`
			`generation = random();`

			`if (generation != INVALID_GENERATION) {`
			`break;`
			`}`
			`}`

			`return generation;`
			`}`
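
`new_generation()` is a small instance of rejection sampling: draw a value, and draw again if it equals the reserved sentinel. The sketch below shows the same idea with the random source passed in as a function pointer so the retry path can be exercised deterministically. The sentinel value and all `sketch_` names are assumptions made for illustration; they are not taken from the CTDB headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel value, standing in for INVALID_GENERATION */
#define SKETCH_INVALID_GENERATION 1u

/* Same shape as POSIX random(): no arguments, returns long */
typedef long (*sketch_rng_fn)(void);

/* Draw from rng until the result is not the reserved sentinel */
static uint32_t sketch_new_generation(sketch_rng_fn rng)
{
	uint32_t generation;

	assert(rng != NULL);

	do {
		generation = (uint32_t)rng();
	} while (generation == SKETCH_INVALID_GENERATION);

	return generation;
}

/* Deterministic stand-in RNG: yields the sentinel once, then 42 */
static long sketch_fake_rng(void)
{
	static int calls = 0;

	return (calls++ == 0) ? (long)SKETCH_INVALID_GENERATION : 42L;
}
```

With a real generator in place of `sketch_fake_rng`, the loop almost always exits on the first draw, since only one value out of the whole range is rejected.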
we are the culprit if we can't get the reclock (This used to be ctdb commit 1d320e113c6134ff6822b985a47131d8204af35a) 2007-10-05 06:01:40 +04:00
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`static bool cluster_lock_held(struct ctdb_recoverd *rec)`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`{`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`return (rec->cluster_lock_handle != NULL);`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`struct ctdb_cluster_lock_handle {`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`bool done;`
ctdb-recoverd: Don't expose internal cluster mutex status Just expose whether the lock was taken. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-31 11:37:30 +03:00			`bool locked;`
ctdb-recoverd: Simplify reclock handler Do the interesting work outside the handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-01 10:32:42 +03:00			`double latency;`
ctdb-recoverd: Store recovery lock handle ... not just cluster mutex handle. This makes the recovery lock handle long-lived and with allow the releasing code to cancel an in-progress attempt to take the recovery lock. The cluster mutex handle is now allocated off the recovery lock handle. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 05:39:32 +03:00			`struct ctdb_cluster_mutex_handle *h;`
ctdb-recoverd: Make recoverd context available in recovery lock handle BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-01-10 05:24:34 +03:00			`struct ctdb_recoverd *rec;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`};`

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`static void take_cluster_lock_handler(char status,`
			`double latency,`
			`void *private_data)`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`{`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`struct ctdb_cluster_lock_handle *s =`
			`(struct ctdb_cluster_lock_handle *) private_data;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00
ctdb-recoverd: Free cluster mutex handler on failure to take lock If nested events occur while the file descriptor handler is still active then chaos can ensue. For example, if a node is banned and the lock is explicitly cancelled (e.g. due to election loss) then double-talloc-free()s abound. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-01-21 08:28:28 +03:00			`s->locked = (status == '0');`

			`/*`
			`* If unsuccessful then ensure the process has exited and that`
			`* the file descriptor event handler has been cancelled`
			`*/`
			`if (! s->locked) {`
			`TALLOC_FREE(s->h);`
			`}`

ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`switch (status) {`
			`case '0':`
ctdb-recoverd: Simplify reclock handler Do the interesting work outside the handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-01 10:32:42 +03:00			`s->latency = latency;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`break;`

			`case '1':`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`D_ERR("Unable to take cluster lock - contention\n");`
ctdb-recoverd: Clean up logging on failure to take recovery lock Add an explicit case for a timeout and clean up the other messages. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-01-21 08:36:13 +03:00			`break;`

			`case '2':`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`D_ERR("Unable to take cluster lock - timeout\n");`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`break;`

			`default:`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`D_ERR("Unable to take cluster lock - unknown error\n");`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`

			`s->done = true;`
			`}`
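
The handler above interprets a one-character status: `'0'` means the lock was taken, `'1'` means contention, `'2'` means timeout, and anything else is treated as an unknown error. A minimal sketch of that mapping as a standalone function follows. The struct and function names are hypothetical, and the reason strings only echo the log messages in the listing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: whether the lock is held, and why not */
struct sketch_lock_result {
	bool locked;
	const char *reason;
};

/* Map the single-character status to a result */
static struct sketch_lock_result sketch_classify_status(char status)
{
	struct sketch_lock_result r = {
		.locked = (status == '0'),
		.reason = NULL,
	};

	switch (status) {
	case '0':
		r.reason = "locked";
		break;
	case '1':
		r.reason = "contention";
		break;
	case '2':
		r.reason = "timeout";
		break;
	default:
		r.reason = "unknown error";
		break;
	}

	return r;
}
```

Only `'0'` counts as success, so any status the caller does not recognise is handled as a failure to take the lock.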

ctdb-recoverd: Simplify arguments to some election functions The pnn and nodemap arguments to force_election() and send_election_request() are always effectively rec->pnn and rec->nodemap, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:27:01 +03:00			`static void force_election(struct ctdb_recoverd *rec);`
ctdb-recoverd: Add handler for lost recovery lock If the process holding the recovery lock terminates unexpectedly then the recovery daemon needs to know that the lock is no longer held. While here, rename hold_reclock_handler() to take_reclock_handler() so there is a clear difference between the two handler names. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-29 00:25:05 +03:00
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`static void lost_cluster_lock_handler(void *private_data)`
ctdb-recoverd: Add handler for lost recovery lock If the process holding the recovery lock terminates unexpectedly then the recovery daemon needs to know that the lock is no longer held. While here, rename hold_reclock_handler() to take_reclock_handler() so there is a clear difference between the two handler names. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-29 00:25:05 +03:00			`{`
			`struct ctdb_recoverd *rec = talloc_get_type_abort(`
			`private_data, struct ctdb_recoverd);`

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`D_ERR("Cluster lock helper terminated\n");`
			`TALLOC_FREE(rec->cluster_lock_handle);`
ctdb-recoverd: Call an election when the recovery lock is lost The lock may have been lost due to a failure in the underlying locking mechanism. This could be due to quorum loss or similar. It is best to call an election to confirm that this node should still be master. At worst, the node will reelect itself, fail to take the lock and then ban itself. This is a suitable outcome for a node that has been partitioned from others in the cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-11-08 07:49:30 +03:00
ctdb-recoverd: Only start election if node can be leader Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-01-07 03:27:06 +03:00			`if (this_node_can_be_leader(rec)) {`
			`force_election(rec);`
			`}`
ctdb-recoverd: Add handler for lost recovery lock If the process holding the recovery lock terminates unexpectedly then the recovery daemon needs to know that the lock is no longer held. While here, rename hold_reclock_handler() to take_reclock_handler() so there is a clear difference between the two handler names. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-29 00:25:05 +03:00			`}`

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`static bool _cluster_lock_lock(struct ctdb_recoverd *rec)`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`{`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`struct ctdb_cluster_mutex_handle *h;`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`struct ctdb_cluster_lock_handle *s;`
ctdb-recoverd: Use talloc() to allocate recovery lock handle At the moment this is still local and is freed after the mutex is successfully taken. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 04:43:44 +03:00
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`s = talloc_zero(rec, struct ctdb_cluster_lock_handle);`
ctdb-recoverd: Use talloc() to allocate recovery lock handle At the moment this is still local and is freed after the mutex is successfully taken. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 04:43:44 +03:00			`if (s == NULL) {`
			`DBG_ERR("Memory allocation error\n");`
			`return false;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`

ctdb-recoverd: Make recoverd context available in recovery lock handle BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-01-10 05:24:34 +03:00			`s->rec = rec;`

ctdb-recoverd: Store recovery lock handle ... not just cluster mutex handle. This makes the recovery lock handle long-lived and with allow the releasing code to cancel an in-progress attempt to take the recovery lock. The cluster mutex handle is now allocated off the recovery lock handle. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 05:39:32 +03:00			`h = ctdb_cluster_mutex(s,`
ctdb-recoverd: Use talloc() to allocate recovery lock handle At the moment this is still local and is freed after the mutex is successfully taken. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 04:43:44 +03:00			`ctdb,`
			`ctdb->recovery_lock,`
ctdb-recoverd: Time out attempt to take recovery lock after 120s Currently this will wait forever. It really needs a timeout in case the cluster filesystem (or other lock mechanism) is completely wedged. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-02-22 07:09:33 +03:00			`120,`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`take_cluster_lock_handler,`
ctdb-recoverd: Use talloc() to allocate recovery lock handle At the moment this is still local and is freed after the mutex is successfully taken. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 04:43:44 +03:00			`s,`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`lost_cluster_lock_handler,`
ctdb-recoverd: Use talloc() to allocate recovery lock handle At the moment this is still local and is freed after the mutex is successfully taken. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 04:43:44 +03:00			`rec);`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`if (h == NULL) {`
ctdb-recoverd: Use talloc() to allocate recovery lock handle At the moment this is still local and is freed after the mutex is successfully taken. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 04:43:44 +03:00			`talloc_free(s);`
ctdb-recoverd: Fix buggy function return on memory allocation failure Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-01 08:56:42 +03:00			`return false;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`rec->cluster_lock_handle = s;`
ctdb-recoverd: Set recovery lock handle at start of attempt This allows the attempt to be cancelled if an election is lost and an unlock is done before the attempt is completed. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Tue Sep 18 02:18:30 CEST 2018 on sn-devel-144 2018-09-03 06:30:57 +03:00			`s->h = h;`

ctdb-recoverd: Use talloc() to allocate recovery lock handle At the moment this is still local and is freed after the mutex is successfully taken. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 04:43:44 +03:00			`while (! s->done) {`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`tevent_loop_once(ctdb->ev);`
			`}`

ctdb-recoverd: Use talloc() to allocate recovery lock handle At the moment this is still local and is freed after the mutex is successfully taken. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 04:43:44 +03:00			`if (! s->locked) {`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`TALLOC_FREE(rec->cluster_lock_handle);`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`return false;`
			`}`

ctdb-recoverd: Use talloc() to allocate recovery lock handle At the moment this is still local and is freed after the mutex is successfully taken. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 04:43:44 +03:00			`ctdb_ctrl_report_recd_lock_latency(ctdb,`
			`CONTROL_TIMEOUT(),`
			`s->latency);`

ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`return true;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`
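
`_cluster_lock_lock()` turns an asynchronous attempt into a blocking call: it starts the attempt, then runs the event loop one step at a time until the completion handler sets `done`. The sketch below reproduces that shape with a toy event loop instead of tevent. Everything named `sketch_` is hypothetical; the real code drives `tevent_loop_once()` and a cluster mutex helper process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* State shared between the caller and the completion callback */
struct sketch_attempt {
	bool done;
	bool locked;
};

/* Toy event loop: delivers one reply after a number of idle steps */
struct sketch_loop {
	unsigned int ticks_until_reply;
	char reply;
	struct sketch_attempt *pending;
};

/* Completion callback: record the outcome and mark the attempt done */
static void sketch_on_reply(char status, struct sketch_attempt *a)
{
	a->locked = (status == '0');
	a->done = true;
}

/* Process one "event": deliver the reply once the countdown expires */
static void sketch_loop_once(struct sketch_loop *loop)
{
	if (loop->pending == NULL) {
		return;
	}
	if (loop->ticks_until_reply > 0) {
		loop->ticks_until_reply--;
		return;
	}
	sketch_on_reply(loop->reply, loop->pending);
	loop->pending = NULL;
}

/* Start the attempt, then spin the loop until the callback has run */
static bool sketch_take_lock(struct sketch_loop *loop)
{
	struct sketch_attempt a = { .done = false, .locked = false };

	loop->pending = &a;
	while (!a.done) {
		sketch_loop_once(loop);
	}

	return a.locked;
}
```

Because other handlers can run while the loop spins, the real code stores the handle in `rec->cluster_lock_handle` before waiting, which is what lets `cluster_lock_release()` cancel an attempt that is still in progress.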

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`static void cluster_lock_release(struct ctdb_recoverd *rec)`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`{`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`if (rec->cluster_lock_handle == NULL) {`
ctdb-recoverd: Return early when the recovery lock is not held This makes upcoming changes simpler. Update to modern debug macro while touching relevant line. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-11 08:05:19 +03:00			`return;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`
ctdb-recoverd: Return early when the recovery lock is not held This makes upcoming changes simpler. Update to modern debug macro while touching relevant line. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-11 08:05:19 +03:00
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`if (! rec->cluster_lock_handle->done) {`
ctdb-recoverd: Handle cancellation when releasing recovery lock If the recovery lock is in the process of being taken then free the cluster mutex handle but leave the recovery lock handle in place. This allows ctdb_recovery_lock() to fail. Note that this isn't yet live because rec->recovery_lock_handle is still only set at the completion of the attempt to take the lock. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 06:01:19 +03:00			`/*`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`* Taking of cluster lock still in progress. Free`
ctdb-recoverd: Handle cancellation when releasing recovery lock If the recovery lock is in the process of being taken then free the cluster mutex handle but leave the recovery lock handle in place. This allows ctdb_recovery_lock() to fail. Note that this isn't yet live because rec->recovery_lock_handle is still only set at the completion of the attempt to take the lock. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 06:01:19 +03:00			`* the cluster mutex handle to release it but leave`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`* the cluster lock handle in place to allow taking`
ctdb-recoverd: Handle cancellation when releasing recovery lock If the recovery lock is in the process of being taken then free the cluster mutex handle but leave the recovery lock handle in place. This allows ctdb_recovery_lock() to fail. Note that this isn't yet live because rec->recovery_lock_handle is still only set at the completion of the attempt to take the lock. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 06:01:19 +03:00			`* of the lock to fail.`
			`*/`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`D_NOTICE("Cancelling cluster lock\n");`
			`TALLOC_FREE(rec->cluster_lock_handle->h);`
			`rec->cluster_lock_handle->done = true;`
			`rec->cluster_lock_handle->locked = false;`
ctdb-recoverd: Handle cancellation when releasing recovery lock If the recovery lock is in the process of being taken then free the cluster mutex handle but leave the recovery lock handle in place. This allows ctdb_recovery_lock() to fail. Note that this isn't yet live because rec->recovery_lock_handle is still only set at the completion of the attempt to take the lock. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-03 06:01:19 +03:00			`return;`
			`}`

ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`D_NOTICE("Releasing cluster lock\n");`
			`TALLOC_FREE(rec->cluster_lock_handle);`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`

recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00			`static void ban_misbehaving_nodes(struct ctdb_recoverd *rec, bool *self_ban)`
recoverd: Refactor code to ban misbehaving nodes Since we have nodemap information, there is no need to hardcode the limit of 20. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe) 2013-06-28 08:31:02 +04:00			`{`
ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`size_t len = talloc_array_length(rec->banning_state);`
			`size_t i;`

recoverd: Refactor code to ban misbehaving nodes Since we have nodemap information, there is no need to hardcode the limit of 20. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe) 2013-06-28 08:31:02 +04:00
recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00			`*self_ban = false;`
ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`for (i = 0; i < len; i++) {`
			`struct ctdb_banning_state *ban_state = &rec->banning_state[i];`

			`if (ban_state->count < 2 * rec->nodemap->num) {`
recoverd: Refactor code to ban misbehaving nodes Since we have nodemap information, there is no need to hardcode the limit of 20. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe) 2013-06-28 08:31:02 +04:00			`continue;`
			`}`

ctdb-recoverd: Simplify arguments to ctdb_ban_node() ban_time argument is always ctdb->tunable.recovery_ban_period, so build this in and make the calling code more readable. ctdb_ban_node() already logs how long a node is banned for, so don't repeatedly log this. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 02:31:56 +03:00			`D_NOTICE("Node %u reached %u banning credits\n",`
ctdb-recoverd: Add pnn field to banning state structure This structure is now standalone, so indexing by PNN can be avoided via a subsequent commit. Index by culprit here to make this commit simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 05:15:03 +03:00			`ban_state->pnn,`
ctdb-recoverd: Simplify arguments to ctdb_ban_node() ban_time argument is always ctdb->tunable.recovery_ban_period, so build this in and make the calling code more readable. ctdb_ban_node() already logs how long a node is banned for, so don't repeatedly log this. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 02:31:56 +03:00			`ban_state->count);`
ctdb-recoverd: Add pnn field to banning state structure This structure is now standalone, so indexing by PNN can be avoided via a subsequent commit. Index by culprit here to make this commit simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 05:15:03 +03:00			`ctdb_ban_node(rec, ban_state->pnn);`
recoverd: Refactor code to ban misbehaving nodes Since we have nodemap information, there is no need to hardcode the limit of 20. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe) 2013-06-28 08:31:02 +04:00			`ban_state->count = 0;`
recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00
			`/* Banning ourself? */`
ctdb-recoverd: Add pnn field to banning state structure This structure is now standalone, so indexing by PNN can be avoided via a subsequent commit. Index by culprit here to make this commit simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 05:15:03 +03:00			`if (ban_state->pnn == rec->pnn) {`
recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00			`*self_ban = true;`
			`}`
recoverd: Refactor code to ban misbehaving nodes Since we have nodemap information, there is no need to hardcode the limit of 20. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe) 2013-06-28 08:31:02 +04:00			`}`
			`}`

ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`struct helper_state {`
			`int fd[2];`
			`pid_t pid;`
			`int result;`
			`bool done;`
			`};`

			`static void helper_handler(struct tevent_context *ev,`
			`struct tevent_fd *fde,`
			`uint16_t flags, void *private_data)`
			`{`
			`struct helper_state *state = talloc_get_type_abort(`
			`private_data, struct helper_state);`
			`int ret;`

			`ret = sys_read(state->fd[0], &state->result, sizeof(state->result));`
			`if (ret != sizeof(state->result)) {`
			`state->result = EPIPE;`
			`}`

			`state->done = true;`
			`}`

			`static int helper_run(struct ctdb_recoverd *rec, TALLOC_CTX *mem_ctx,`
			`const char *prog, const char *arg, const char *type)`
			`{`
			`struct helper_state *state;`
			`struct tevent_fd *fde;`
			`const char **args;`
			`int nargs, ret;`

			`state = talloc_zero(mem_ctx, struct helper_state);`
			`if (state == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " memory error\n"));`
			`return -1;`
			`}`

			`state->pid = -1;`

			`ret = pipe(state->fd);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR,`
			`("Failed to create pipe for %s helper\n", type));`
			`goto fail;`
			`}`

			`set_close_on_exec(state->fd[0]);`

			`nargs = 4;`
			`args = talloc_array(state, const char *, nargs);`
			`if (args == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " memory error\n"));`
			`goto fail;`
			`}`

			`args[0] = talloc_asprintf(args, "%d", state->fd[1]);`
			`if (args[0] == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " memory error\n"));`
			`goto fail;`
			`}`
			`args[1] = rec->ctdb->daemon.name;`
			`args[2] = arg;`
			`args[3] = NULL;`

			`if (args[2] == NULL) {`
			`nargs = 3;`
			`}`

			`state->pid = ctdb_vfork_exec(state, rec->ctdb, prog, nargs, args);`
			`if (state->pid == -1) {`
			`DEBUG(DEBUG_ERR,`
			`("Failed to create child for %s helper\n", type));`
			`goto fail;`
			`}`

			`close(state->fd[1]);`
			`state->fd[1] = -1;`

ctdb-recoverd: Record helper PID in recovery daemon context Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-09-30 14:15:56 +03:00			`rec->helper_pid = state->pid;`
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`state->done = false;`

ctdb-recoverd: Fix memory leak state is always freed before exiting this function, so allocate fde off it instead of long-lived ctdb context. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13943 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-11 07:24:24 +03:00			`fde = tevent_add_fd(rec->ctdb->ev, state, state->fd[0],`
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`TEVENT_FD_READ, helper_handler, state);`
			`if (fde == NULL) {`
			`goto fail;`
			`}`
			`tevent_fd_set_auto_close(fde);`

			`while (!state->done) {`
			`tevent_loop_once(rec->ctdb->ev);`
ctdb-recoverd: Abort recovery/takeover if recmaster changes Recovery and takeover are run via helper from recovery daemon. While the helpers are running, it's possible for the current node to lose election. If that happens, abort the currently running recovery/takeover helper. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-09-08 04:24:27 +03:00
ctdb-recoverd: Use this_node_is_leader() in an extra context This is arguably clearer. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-09 03:47:54 +03:00			`if (!this_node_is_leader(rec)) {`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`D_ERR("Leader changed to %u, aborting %s\n",`
			`rec->leader,`
			`type);`
ctdb-recoverd: Abort recovery/takeover if recmaster changes Recovery and takeover are run via helper from recovery daemon. While the helpers are running, it's possible for the current node to lose election. If that happens, abort the currently running recovery/takeover helper. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-09-08 04:24:27 +03:00			`state->result = 1;`
			`break;`
			`}`
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`}`

			`close(state->fd[0]);`
			`state->fd[0] = -1;`

			`if (state->result != 0) {`
			`goto fail;`
			`}`

ctdb-recoverd: Record helper PID in recovery daemon context Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-09-30 14:15:56 +03:00			`rec->helper_pid = -1;`
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`ctdb_kill(rec->ctdb, state->pid, SIGKILL);`
			`talloc_free(state);`
			`return 0;`

			`fail:`
			`if (state->fd[0] != -1) {`
			`close(state->fd[0]);`
			`}`
			`if (state->fd[1] != -1) {`
			`close(state->fd[1]);`
			`}`
ctdb-recoverd: Record helper PID in recovery daemon context Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-09-30 14:15:56 +03:00			`rec->helper_pid = -1;`
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`if (state->pid != -1) {`
			`ctdb_kill(rec->ctdb, state->pid, SIGKILL);`
			`}`
			`talloc_free(state);`
			`return -1;`
			`}`
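
The result hand-off used by `helper_handler()` and `helper_run()` above can be illustrated in isolation. This is a minimal sketch, not CTDB code: the parent reads exactly one `int` from the read end of a pipe and treats anything shorter (including EOF when the helper exits without reporting) as `EPIPE`. The function names `report_result()` and `collect_result()` are hypothetical, and the assumption that the helper writes a single raw `int` to the descriptor it receives as its first argument is inferred from the read side shown above rather than from the helper sources.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper side: report an integer status on the write end
 * of the pipe. Returns 0 if the whole value was written, -1 otherwise. */
int report_result(int write_fd, int result)
{
	ssize_t n;

	assert(write_fd >= 0);
	n = write(write_fd, &result, sizeof(result));
	return (n == (ssize_t)sizeof(result)) ? 0 : -1;
}

/* Hypothetical parent side, mirroring the read in helper_handler():
 * a short read or EOF is mapped to EPIPE, otherwise the helper's own
 * status is returned unchanged. */
int collect_result(int read_fd)
{
	int result = 0;
	ssize_t n;

	assert(read_fd >= 0);
	n = read(read_fd, &result, sizeof(result));
	if (n != (ssize_t)sizeof(result)) {
		return EPIPE;
	}
	return result;
}
```

In this sketch a helper that dies early closes its copy of the write end, so the parent's read returns 0 bytes and the failure surfaces as a non-zero result, which is the case `helper_run()` sends to its `fail` label.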


ctdb-recoverd: Integrate takeover helper Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 08:21:39 +03:00			`static int ctdb_takeover(struct ctdb_recoverd *rec,`
			`uint32_t *force_rebalance_nodes)`
			`{`
			`static char prog[PATH_MAX+1] = "";`
			`char *arg;`
ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-23 01:43:58 +03:00			`unsigned int i;`
			`int ret;`
ctdb-recoverd: Integrate takeover helper Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 08:21:39 +03:00
			`if (!ctdb_set_helper("takeover_helper", prog, sizeof(prog),`
			`"CTDB_TAKEOVER_HELPER", CTDB_HELPER_BINDIR,`
			`"ctdb_takeover_helper")) {`
			`ctdb_die(rec->ctdb, "Unable to set takeover helper\n");`
			`}`

			`arg = NULL;`
			`for (i = 0; i < talloc_array_length(force_rebalance_nodes); i++) {`
			`uint32_t pnn = force_rebalance_nodes[i];`
			`if (arg == NULL) {`
			`arg = talloc_asprintf(rec, "%u", pnn);`
			`} else {`
			`arg = talloc_asprintf_append(arg, ",%u", pnn);`
			`}`
			`if (arg == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " memory error\n"));`
			`return -1;`
			`}`
			`}`

ctdb-config: Switch tunable DisableIPFailover to a config option Use the "failover:disabled" option instead. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13589 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-08-21 06:41:22 +03:00			`if (ctdb_config.failover_disabled) {`
ctdb-daemon: Pass DisableIPFailover tunable via environment variable Preparation for obsoleting this tunable. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13589 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-08-21 02:36:00 +03:00			`ret = setenv("CTDB_DISABLE_IP_FAILOVER", "1", 1);`
			`if (ret != 0) {`
			`D_ERR("Failed to set CTDB_DISABLE_IP_FAILOVER variable\n");`
			`return -1;`
			`}`
			`}`

ctdb-recoverd: Integrate takeover helper Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 08:21:39 +03:00			`return helper_run(rec, rec, prog, arg, "takeover");`
			`}`
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`static bool do_takeover_run(struct ctdb_recoverd *rec,`
ctdb-takeover: Recovery daemon no longer passes fail callback Banning is now handled by the takeover code sending banning credit messages. This commit makes a change in behaviour quite obvious. Takeover runs were initiated from several locations in the code but banning was only done from one of these locations. Now banning can be done from any failed takeover run. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 08:35:08 +03:00			`struct ctdb_node_map_old *nodemap)`
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`{`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`uint32_t *nodes = NULL;`
ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`struct ctdb_disable_message dtr;`
recoverd: Move disabling of IP checks into do_takeover_run() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 48b603fbf16311daa47b01e7a33d477ed51da56d) 2013-09-03 05:21:09 +04:00			`TDB_DATA data;`
ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-23 01:43:58 +03:00			`size_t i;`
recoverd: Be careful about freeing the list of IP rebalance target nodes It can change during a takeover run. If it does then don't free it. There are potentially fancier solutions (e.g. check what PNNs are new to the list) to this issue but this is the simplest. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e81589b7084c661adf617e166cc2c25b4939f841) 2013-09-06 05:23:07 +04:00			`uint32_t *rebalance_nodes = rec->force_rebalance_nodes;`
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`int ret;`
			`bool ok;`

recoverd: Improve logging for takeover runs Takeover runs are currently silent when they succeed. However, they are important, so log something by default. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b39aa2e401fbb581207d986bac93778e9c01acdc) 2013-09-18 11:06:16 +04:00			`DEBUG(DEBUG_NOTICE, ("Takeover run starting\n"));`

ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`if (ctdb_op_is_in_progress(rec->takeover_run)) {`
recoverd: do_takeover_run() should mark when a takeover run is in progress Nested takeover runs should never happens so they should fail. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8ed29c60c0a7dd29f2a6efdf694d38e94281e1c4) 2013-09-03 05:20:01 +04:00			`DEBUG(DEBUG_ERR, (__location__`
			`" takeover run already in progress \n"));`
			`ok = false;`
			`goto done;`
			`}`

ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`if (!ctdb_op_begin(rec->takeover_run)) {`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`ok = false;`
			`goto done;`
recoverd: Move disabling of IP checks into do_takeover_run() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 48b603fbf16311daa47b01e7a33d477ed51da56d) 2013-09-03 05:21:09 +04:00			`}`

recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`/* Disable IP checks (takeover runs, really) on other nodes`
			`* while doing this takeover run. This will stop those other`
			`* nodes from triggering takeover runs when they think they should`
			`* be hosting an IP but it isn't yet on an interface. Don't`
			`* wait for replies since a failure here might cause some`
			`* noise in the logs but will not actually cause a problem.`
			`*/`
ctdb-recoverd: Fix some uninitialised memory issues The first element of these structures is a 32-bit PNN. On 64-bit systems this field can be followed by 32-bits of padding. When the structures are copied this can cause uninitialised memory to be copied. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> 2016-01-11 09:23:12 +03:00			`ZERO_STRUCT(dtr);`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`dtr.srvid = 0; /* No reply */`
			`dtr.pnn = -1;`

			`data.dptr = (uint8_t*)&dtr;`
			`data.dsize = sizeof(dtr);`

			`nodes = list_of_connected_nodes(rec->ctdb, nodemap, rec, false);`

Revert "recoverd: Disable takeover runs on other nodes for 5 minutes" 5 minutes is too long to leave the cluster in limbo if the recovery daemon dies during a takeover run, even though this is quite unlikely. We need a new recover master to be able to do takeover runs fairly quickly. This reverts commit 71080676bb4acbd0d9b595a30cf7fe6dddbf426f. (This used to be ctdb commit 3e41170c78fc7a2bf526129c9b7db3739b61c6bf) 2013-10-24 04:13:16 +04:00			`/* Disable for 60 seconds. This can be a tunable later if`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`* necessary.`
			`*/`
ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`dtr.timeout = 60;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`for (i = 0; i < talloc_array_length(nodes); i++) {`
			`if (ctdb_client_send_message(rec->ctdb, nodes[i],`
			`CTDB_SRVID_DISABLE_TAKEOVER_RUNS,`
			`data) != 0) {`
			`DEBUG(DEBUG_INFO,("Failed to disable takeover runs\n"));`
			`}`
			`}`
recoverd: do_takeover_run() should mark when a takeover run is in progress Nested takeover runs should never happens so they should fail. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8ed29c60c0a7dd29f2a6efdf694d38e94281e1c4) 2013-09-03 05:20:01 +04:00
ctdb-recoverd: Integrate takeover helper Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 08:21:39 +03:00			`ret = ctdb_takeover(rec, rec->force_rebalance_nodes);`
recoverd: Move disabling of IP checks into do_takeover_run() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 48b603fbf16311daa47b01e7a33d477ed51da56d) 2013-09-03 05:21:09 +04:00
ctdb:server: Fix code spelling Best reviewed with: `git show --word-diff` Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> 2023-03-22 11:36:23 +03:00			`/* Re-enable takeover runs and IP checks on other nodes */`
ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`dtr.timeout = 0;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`for (i = 0; i < talloc_array_length(nodes); i++) {`
			`if (ctdb_client_send_message(rec->ctdb, nodes[i],`
			`CTDB_SRVID_DISABLE_TAKEOVER_RUNS,`
			`data) != 0) {`
Fix various spelling errors Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Fri Nov 6 13:43:45 CET 2015 on sn-devel-104 2015-07-27 00:02:57 +03:00			`DEBUG(DEBUG_INFO,("Failed to re-enable takeover runs\n"));`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`}`
recoverd: Move disabling of IP checks into do_takeover_run() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 48b603fbf16311daa47b01e7a33d477ed51da56d) 2013-09-03 05:21:09 +04:00			`}`

recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`if (ret != 0) {`
recoverd: Improve logging for takeover runs Takeover runs are currently silent when they succeed. However, they are important, so log something by default. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b39aa2e401fbb581207d986bac93778e9c01acdc) 2013-09-18 11:06:16 +04:00			`DEBUG(DEBUG_ERR, ("ctdb_takeover_run() failed\n"));`
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`ok = false;`
			`goto done;`
			`}`

			`ok = true;`
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`/* Takeover run was successful so clear force rebalance targets */`
recoverd: Be careful about freeing the list of IP rebalance target nodes It can change during a takeover run. If it does then don't free it. There are potentially fancier solutions (e.g. check what PNNs are new to the list) to this issue but this is the simplest. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e81589b7084c661adf617e166cc2c25b4939f841) 2013-09-06 05:23:07 +04:00			`if (rebalance_nodes == rec->force_rebalance_nodes) {`
			`TALLOC_FREE(rec->force_rebalance_nodes);`
			`} else {`
			`DEBUG(DEBUG_WARNING,`
			`("Rebalance target nodes changed during takeover run - not clearing\n"));`
			`}`
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`done:`
			`rec->need_takeover_run = !ok;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`talloc_free(nodes);`
ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`ctdb_op_end(rec->takeover_run);`
recoverd: Improve logging for takeover runs Takeover runs are currently silent when they succeed. However, they are important, so log something by default. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b39aa2e401fbb581207d986bac93778e9c01acdc) 2013-09-18 11:06:16 +04:00
			`DEBUG(DEBUG_NOTICE, ("Takeover run %s\n", ok ? "completed successfully" : "unsuccessful"));`
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`return ok;`
			`}`
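
The pointer comparison at the end of do_takeover_run() guards against the rebalance target list being replaced while the takeover run was in progress. Below is a minimal sketch of that snapshot-and-compare idea. It uses plain malloc()/free() instead of talloc, and the names sketch_rec and sketch_clear_if_unchanged are hypothetical, not part of ctdb.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for the recovery daemon context */
struct sketch_rec {
	uint32_t *force_rebalance_nodes; /* may be replaced during a run */
};

/*
 * Free the rebalance list only if it is still the list that was
 * snapshotted before the run started.  Returns true if the list was
 * cleared, false if it changed in the meantime and was left alone.
 */
static bool sketch_clear_if_unchanged(struct sketch_rec *rec,
				      const uint32_t *snapshot)
{
	if (rec->force_rebalance_nodes != snapshot) {
		/* New targets arrived during the run: keep them */
		return false;
	}

	free(rec->force_rebalance_nodes);
	rec->force_rebalance_nodes = NULL;
	return true;
}
```

The caller takes the snapshot before starting the long-running operation and passes it back in afterwards, so targets queued during the run survive until the next run.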

ctdb-recoverd: Add code for parallel database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:22:38 +03:00			`static int db_recovery_parallel(struct ctdb_recoverd *rec, TALLOC_CTX *mem_ctx)`
			`{`
			`static char prog[PATH_MAX+1] = "";`
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`const char *arg;`
ctdb-recoverd: Add code for parallel database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:22:38 +03:00
			`if (!ctdb_set_helper("recovery_helper", prog, sizeof(prog),`
			`"CTDB_RECOVERY_HELPER", CTDB_HELPER_BINDIR,`
			`"ctdb_recovery_helper")) {`
			`ctdb_die(rec->ctdb, "Unable to set recovery helper\n");`
			`}`

ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`arg = talloc_asprintf(mem_ctx, "%u", new_generation());`
			`if (arg == NULL) {`
ctdb-recoverd: Add code for parallel database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:22:38 +03:00			`DEBUG(DEBUG_ERR, (__location__ " memory error\n"));`
			`return -1;`
			`}`

ctdb-recovery: Create recovery databases in state dir This matches the behaviour during serial database recovery. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Thu Feb 11 08:01:14 CET 2016 on sn-devel-144 2016-02-11 06:32:34 +03:00			`setenv("CTDB_DBDIR_STATE", rec->ctdb->db_directory_state, 1);`

ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`return helper_run(rec, mem_ctx, prog, arg, "recovery");`
ctdb-recoverd: Add code for parallel database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:22:38 +03:00			`}`
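
db_recovery_parallel() resolves the helper binary from an environment variable name, a default directory and a file name. The body of ctdb_set_helper() is not shown here, so the following is only a sketch of one plausible resolution rule, assuming an environment override wins over the built-in default. The names sketch_helper_path and sketch_set_helper are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Assumed rule: a non-empty override (for example the value of an
 * environment variable) is used verbatim, otherwise the path is built
 * as dir + "/" + file.  Returns false if the result does not fit.
 */
static bool sketch_helper_path(char *buf, size_t size, const char *override,
			       const char *dir, const char *file)
{
	int n;

	if (override != NULL && override[0] != '\0') {
		n = snprintf(buf, size, "%s", override);
	} else {
		n = snprintf(buf, size, "%s/%s", dir, file);
	}

	return n >= 0 && (size_t)n < size;
}

/* Convenience wrapper that reads the override from the environment */
static bool sketch_set_helper(char *buf, size_t size, const char *envvar,
			      const char *dir, const char *file)
{
	return sketch_helper_path(buf, size, getenv(envvar), dir, file);
}
```

Keeping the override as an explicit parameter makes the rule easy to exercise without touching the process environment.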

ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`/*`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`* Main recovery function, only run by leader`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`*/`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`static int do_recovery(struct ctdb_recoverd *rec, TALLOC_CTX *mem_ctx)`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`{`
			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`struct ctdb_node_map_old *nodemap = rec->nodemap;`
ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-23 01:43:58 +03:00			`unsigned int i;`
			`int ret;`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`bool self_ban;`

ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`DBG_NOTICE("Starting do_recovery\n");`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`/* Check if the current node is still the leader. It's possible that`
			`* re-election has changed the leader.`
ctdb-recoverd: Always check for recmaster before doing recovery Recovery daemon checks if it is the recovery master before performing certain checks. During those checks it's possible that re-election can change the recmaster. In such a case, the recovery daemon should never do a database recovery. This is not complete fix since the recovery master can still change while the recovery is going on. The correct fix is to abort recovery if the recovery master changes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Oct 7 17:55:05 CEST 2015 on sn-devel-104 2015-10-06 09:31:41 +03:00			`*/`
ctdb-recoverd: Factor out and use function this_node_is_leader() Make the code self-documenting. This preempts an upcoming change to terminology but doing it now saves a lot of churn. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:37:39 +03:00			`if (!this_node_is_leader(rec)) {`
ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`D_NOTICE("Leader changed to %" PRIu32 ", aborting recovery\n",`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`rec->leader);`
ctdb-recoverd: Always check for recmaster before doing recovery Recovery daemon checks if it is the recovery master before performing certain checks. During those checks it's possible that re-election can change the recmaster. In such a case, the recovery daemon should never do a database recovery. This is not complete fix since the recovery master can still change while the recovery is going on. The correct fix is to abort recovery if the recovery master changes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Oct 7 17:55:05 CEST 2015 on sn-devel-104 2015-10-06 09:31:41 +03:00			`return -1;`
			`}`

ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`/* if recovery fails, force it again */`
			`rec->need_recovery = true;`

			`if (!ctdb_op_begin(rec->recovery)) {`
			`return -1;`
			`}`

ctdb-recoverd: Add an explicit flag for election in progress An alternate election method will be added that doesn't use the election timeout, so this provides a common way for recognising when an election is in progress. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 12:27:10 +03:00			`if (rec->election_in_progress) {`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`/* an election is in progress */`
ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`DBG_ERR("do_recovery called while election in progress - try "`
			`"again later\n");`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`goto fail;`
			`}`

			`ban_misbehaving_nodes(rec, &self_ban);`
			`if (self_ban) {`
ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`DBG_NOTICE("This node was banned, aborting recovery\n");`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`goto fail;`
			`}`

ctdb-recoverd: No longer take cluster lock during recovery Confirm instead that it is already held. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-04 10:45:51 +03:00			`if (cluster_lock_enabled(rec) && !cluster_lock_held(rec)) {`
			`/* Leader can change in ban_misbehaving_nodes() */`
			`if (!this_node_is_leader(rec)) {`
ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`D_NOTICE("Leader changed to %" PRIu32`
			`", aborting recovery\n",`
ctdb-recoverd: No longer take cluster lock during recovery Confirm instead that it is already held. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-04 10:45:51 +03:00			`rec->leader);`
			`rec->need_recovery = false;`
ctdb-recoverd: Factor out function cluster_lock_take() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-09-20 05:30:58 +03:00			`goto fail;`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`}`
ctdb-recoverd: No longer take cluster lock during recovery Confirm instead that it is already held. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-04 10:45:51 +03:00
			`D_ERR("Cluster lock not held - abort recovery, ban node\n");`
			`ctdb_ban_node(rec, rec->pnn);`
			`goto fail;`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`}`

ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`DBG_NOTICE("Recovery initiated due to problem with node %" PRIu32 "\n",`
			`rec->last_culprit_node);`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00
ctdb-recmaster: Update capabilities before calling first election Capabilities are used when computing an election result so having them up-to-date seems like a good idea. Also update several instances of an ambiguous comment. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-27 07:09:33 +03:00			`/* Retrieve capabilities from all connected nodes */`
ctdb-recoverd: Update capabilities before the database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:07:37 +03:00			`ret = update_capabilities(rec, nodemap);`
			`if (ret!=0) {`
ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`DBG_ERR("Unable to update node capabilities.\n");`
ctdb-recoverd: Update capabilities before the database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:07:37 +03:00			`return -1;`
			`}`

ctdb-recoverd: Update flags on all nodes before database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 10:10:15 +03:00			`/*`
			`update all nodes to have the same flags that we have`
			`*/`
			`for (i=0;i<nodemap->num;i++) {`
			`if (nodemap->nodes[i].flags & NODE_FLAGS_DISCONNECTED) {`
			`continue;`
			`}`

ctdb-recoverd: Change update_flags_on_all_nodes() to take rec argument This makes fields such as recmaster and nodemap easily available if required. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:45:15 +03:00			`ret = update_flags_on_all_nodes(rec,`
ctdb-recoverd: Improve a call to update_flags_on_all_nodes() This should take a PNN, not an array index. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 07:43:04 +03:00			`nodemap->nodes[i].pnn,`
ctdb-recoverd: Drop unused nodemap argument from update_flags_on_all_nodes() An unused argument needlessly extends the length of function calls. A subsequent change will allow rec->nodemap to be used if necessary. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 12:25:07 +03:00			`nodemap->nodes[i].flags);`
ctdb-recoverd: Update flags on all nodes before database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 10:10:15 +03:00			`if (ret != 0) {`
			`if (nodemap->nodes[i].flags & NODE_FLAGS_INACTIVE) {`
ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`DBG_WARNING("Unable to update flags on "`
			`"inactive node %u\n",`
			`i);`
ctdb-recoverd: Update flags on all nodes before database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 10:10:15 +03:00			`} else {`
ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`DBG_ERR("Unable to update flags on all nodes "`
			`"for node %u\n",`
			`i);`
ctdb-recoverd: Update flags on all nodes before database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 10:10:15 +03:00			`return -1;`
			`}`
			`}`
			`}`

ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`DBG_NOTICE("Recovery - updated flags\n");`
ctdb-recoverd: Update flags on all nodes before database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 10:10:15 +03:00
ctdb-recovery: Remove serial database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-07-19 09:06:37 +03:00			`ret = db_recovery_parallel(rec, mem_ctx);`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`if (ret != 0) {`
			`goto fail;`
			`}`

ctdb-takeover: Recovery daemon no longer passes fail callback Banning is now handled by the takeover code sending banning credit messages. This commit makes a change in behaviour quite obvious. Takeover runs were initiated from several locations in the code but banning was only done from one of these locations. Now banning can be done from any failed takeover run. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 08:35:08 +03:00			`do_takeover_run(rec, nodemap);`
read the current debuglevel in each loop in the recovery daemon so that we pick up when they change in the parent daemon (This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf) 2008-02-18 11:38:04 +03:00
send a message to clients when an IP has been released (This used to be ctdb commit 8b7ab0b00253462593d368052c2cb10a385b4e63) 2007-05-25 18:05:30 +04:00			`/* send a message to all clients telling them that the cluster`
			`has been reconfigured */`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`ret = ctdb_client_send_message(ctdb, CTDB_BROADCAST_CONNECTED,`
			`CTDB_SRVID_RECONFIGURE, tdb_null);`
			`if (ret != 0) {`
ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`DBG_ERR("Failed to send reconfigure message\n");`
ctdb-recoverd: Use a goto for do_recovery() failures This will allow extra things to be done on failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:32:08 +03:00			`goto fail;`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`}`
recovery daemon this program is a client to the local ctdb daemon every second it pulls all vnnmap and nodemaps from all nodes that are available and checks if a recovery is required a recovery is required if : * all nodes do NOT have an identical vnnmap and generation * all nodes do NOT have an identical nodemap * there are active nodes that are NOT in the nodemap * there are nodes in the nodemap that are NOT active During recovery, the recovery tool will also make sure that all nodes know about and have created all databases. (This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26) 2007-05-04 09:21:40 +04:00
ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`DBG_NOTICE("Recovery complete\n");`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00
- merge from ronnie - add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring (This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b) 2007-09-14 03:49:12 +04:00			`rec->need_recovery = false;`
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`ctdb_op_end(rec->recovery);`
- merge from ronnie - add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring (This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b) 2007-09-14 03:49:12 +04:00
ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 06:30:04 +03:00			`/*`
			`* Completed a full recovery so forgive any past transgressions`
			`*/`
			`ban_counts_reset(rec);`
with the new banning logic with one struct for each node we no longer "forget" the other culprits as often as we used to do, which means that things like "ctdb recover" can now actually lead to a node becomming banned if we perform too many recoveries too frequently. change this to provide absolution to all nodes once they have participated in a recovery session. (This used to be ctdb commit f66d17fb2e81a35d5adb3754e1cc902f76b4590a) 2009-09-25 07:14:53 +04:00
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`/* We just finished a recovery successfully.`
			`We now wait for rerecovery_timeout before we allow`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`another recovery to take place.`
			`*/`
ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224 2024-02-29 18:11:16 +03:00			`D_NOTICE("Just finished a recovery. New recoveries will now be "`
			`"suppressed for the rerecovery timeout (%" PRIu32`
			`" seconds)\n",`
			`ctdb->tunable.rerecovery_timeout);`
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`ctdb_op_disable(rec->recovery, ctdb->ev,`
			`ctdb->tunable.rerecovery_timeout);`
recovery daemon this program is a client to the local ctdb daemon every second it pulls all vnnmap and nodemaps from all nodes that are available and checks if a recovery is required a recovery is required if : * all nodes do NOT have an identical vnnmap and generation * all nodes do NOT have an identical nodemap * there are active nodes that are NOT in the nodemap * there are nodes in the nodemap that are NOT active During recovery, the recovery tool will also make sure that all nodes know about and have created all databases. (This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26) 2007-05-04 09:21:40 +04:00			`return 0;`
ctdb-recoverd: Use a goto for do_recovery() failures This will allow extra things to be done on failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:32:08 +03:00
			`fail:`
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`ctdb_op_end(rec->recovery);`
ctdb-recoverd: Use a goto for do_recovery() failures This will allow extra things to be done on failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:32:08 +03:00			`return -1;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`
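
do_recovery() brackets its work with ctdb_op_begin() and ctdb_op_end(), and after a success it calls ctdb_op_disable() so that new recoveries are suppressed for the rerecovery timeout. The ctdb_op_* implementations are not shown here, so this sketch only models the assumed behaviour: begin fails while the operation is disabled, end clears the in-progress flag, and disable sets a deadline. All sketch_* names are hypothetical, and an explicit clock value replaces the event loop timer.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical model of the per-operation state */
struct sketch_op {
	bool in_progress;
	time_t disabled_until; /* operation refused while now < this */
};

static bool sketch_op_begin(struct sketch_op *op, time_t now)
{
	if (now < op->disabled_until) {
		return false;
	}
	op->in_progress = true;
	return true;
}

static void sketch_op_end(struct sketch_op *op)
{
	op->in_progress = false;
}

static void sketch_op_disable(struct sketch_op *op, time_t now,
			      unsigned int timeout)
{
	op->disabled_until = now + (time_t)timeout;
}

/*
 * Same shape as do_recovery(): one failure label that always ends the
 * operation, and a suppression window only after a successful run.
 */
static int sketch_do_recovery(struct sketch_op *op, time_t now,
			      bool work_ok, unsigned int timeout)
{
	if (!sketch_op_begin(op, now)) {
		return -1;
	}

	if (!work_ok) {
		goto fail;
	}

	sketch_op_end(op);
	sketch_op_disable(op, now, timeout);
	return 0;

fail:
	sketch_op_end(op);
	return -1;
}
```

A failed run leaves the operation enabled so it can be retried immediately, while a successful run blocks further attempts until the timeout has elapsed.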
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
add a test in the function that checks whether the cluster needs recovery or not that all active nodes are in normal mode. If we discover that some node is still in recoverymode it may indicate that a previous recovery ended prematurely and thus we should start a new recovery (This used to be ctdb commit c15517872e6c98c8c425a8d47d2b348ecb0620b0) 2007-05-06 22:41:12 +04:00
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`/*`
			`elections are won by first checking the number of connected nodes, then`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`the priority time, then the pnn`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`*/`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`struct election_message {`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`uint32_t num_connected;`
use a priority time for the election data, not just the vnn (This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d) 2007-06-07 12:37:27 +04:00			`struct timeval priority_time;`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`uint32_t pnn;`
- catch ESTALE in the recovery lock by trying a read() - priortise nodes that are unbanned and healthy in the election (This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f) 2007-10-05 07:28:21 +04:00			`uint32_t node_flags;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`};`
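
The comment above struct election_message gives the order of the tie-breakers: number of connected nodes, then priority time, then pnn. A minimal sketch of such a lexicographic comparison follows. The direction of each comparison (more connections wins, earlier priority time wins, higher pnn wins) is an assumption made for illustration, as are the names election_sketch, sketch_timeval_cmp and sketch_election_win. The real election code may also take node flags and capabilities into account.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical mirror of the fields used for tie-breaking */
struct election_sketch {
	uint32_t num_connected;
	struct timeval priority_time;
	uint32_t pnn;
};

/* Returns <0, 0 or >0 as a is earlier than, equal to or later than b */
static int sketch_timeval_cmp(const struct timeval *a,
			      const struct timeval *b)
{
	if (a->tv_sec != b->tv_sec) {
		return (a->tv_sec < b->tv_sec) ? -1 : 1;
	}
	if (a->tv_usec != b->tv_usec) {
		return (a->tv_usec < b->tv_usec) ? -1 : 1;
	}
	return 0;
}

/* Returns true if "mine" beats "theirs" */
static bool sketch_election_win(const struct election_sketch *mine,
				const struct election_sketch *theirs)
{
	int cmp;

	/* 1. Assumed: the better connected node wins */
	if (mine->num_connected != theirs->num_connected) {
		return mine->num_connected > theirs->num_connected;
	}

	/* 2. Assumed: the earlier priority time wins */
	cmp = sketch_timeval_cmp(&mine->priority_time,
				 &theirs->priority_time);
	if (cmp != 0) {
		return cmp < 0;
	}

	/* 3. Assumed: the higher pnn wins as the final tie-break */
	return mine->pnn > theirs->pnn;
}
```

Each later criterion is consulted only when all earlier ones are equal, so two distinct nodes can never both win against each other.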

choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`/*`
			`form this node's election data`
			`*/`
			`static void ctdb_election_data(struct ctdb_recoverd *rec, struct election_message *em)`
			`{`
ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-23 01:43:58 +03:00			`unsigned int i;`
			`int ret;`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap;`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-recoverd: Add function node_flags() and use it in elections Indexing a node map by PNN is suboptimal. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 10:57:53 +03:00			`bool ok;`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00
			`ZERO_STRUCTP(em);`

ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`em->pnn = rec->pnn;`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`em->priority_time = rec->priority_time;`

			`ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, rec, &nodemap);`
			`if (ret != 0) {`
recoverd: Improve an error message in the election code Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 275ed9ebe287e39d891888c13810c70f347af8ac) 2013-10-30 04:32:28 +04:00			`DEBUG(DEBUG_ERR,(__location__ " unable to get node map\n"));`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`return;`
			`}`

ctdb-recoverd: Add function node_flags() and use it in elections Indexing a node map by PNN is suboptimal. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-29 10:57:53 +03:00			`ok = node_flags(rec, rec->pnn, &rec->node_flags);`
			`if (!ok) {`
			`DBG_ERR("Unable to get node flags for this node\n");`
			`return;`
			`}`
When we create new election data to send during elections, we must re-read the node flags from the main daemon to catch when the STOPPED flag is changed. (This used to be ctdb commit ca4982c40d81db528fe915d5ecc01fcf7df0b522) 2009-07-17 05:37:03 +04:00			`em->node_flags = rec->node_flags;`

choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`for (i=0;i<nodemap->num;i++) {`
			`if (!(nodemap->nodes[i].flags & NODE_FLAGS_DISCONNECTED)) {`
			`em->num_connected++;`
			`}`
			`}`
make sure we lose all elections for recmaster role if we do not have the recmaster capability. (unless there are no other node at all available with this capability) (This used to be ctdb commit 8556e9dc897c6b9b9be0b52f391effb1f72fbd80) 2008-05-06 07:56:56 +04:00
ctdb-recoverd: Add and use function this_node_can_be_leader() This makes the code self-documenting. In ctdb_election_data() there is a slight behaviour change. An inactive node will now try to lose an election. This case should not happen because: * An inactive node can't win an election round and then send a reply. * Any inactive node should never start an election. There are currently places where this happens and they will be fixed later. There is an instance where this could be used in validate_recovery_master() but this involves a more serious logic change. Overhaul this function later. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-14 02:57:03 +03:00			`if (!this_node_can_be_leader(rec)) {`
			`/* Try to lose... */`
make sure we lose all elections for recmaster role if we do not have the recmaster capability. (unless there are no other node at all available with this capability) (This used to be ctdb commit 8556e9dc897c6b9b9be0b52f391effb1f72fbd80) 2008-05-06 07:56:56 +04:00			`em->num_connected = 0;`
			`em->priority_time = timeval_current();`
			`}`

choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`talloc_free(nodemap);`
			`}`

			`/*`
			`see if the given election data wins`
			`*/`
			`static bool ctdb_election_win(struct ctdb_recoverd *rec, struct election_message *em)`
			`{`
			`struct election_message myem;`
- catch ESTALE in the recovery lock by trying a read() - priortise nodes that are unbanned and healthy in the election (This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f) 2007-10-05 07:28:21 +04:00			`int cmp = 0;`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00
			`ctdb_election_data(rec, &myem);`

ctdb-recoverd: Add and use function this_node_can_be_leader() This makes the code self-documenting. In ctdb_election_data() there is a slight behaviour change. An inactive node will now try to lose an election. This case should not happen because: * An inactive node can't win an election round and then send a reply. * Any inactive node should never start an election. There are currently places where this happens and they will be fixed later. There is an instance where this could be used in validate_recovery_master() but this involves a more serious logic change. Overhaul this function later. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-14 02:57:03 +03:00			`if (!this_node_can_be_leader(rec)) {`
stopped nodes can not win a recmaster election stopped nodes must yield the recmaster role (This used to be ctdb commit b75ac1185481060ab71bd743e1e48d333d716eba) 2009-07-09 08:44:03 +04:00			`return false;`
recoverd: eliminate some trailing spaces from ctdb_election_win() Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit df30c0a05ed908fc2a997c56ff5484736b23b70f) 2013-06-21 16:06:22 +04:00			`}`
stopped nodes can not win a recmaster election stopped nodes must yield the recmaster role (This used to be ctdb commit b75ac1185481060ab71bd743e1e48d333d716eba) 2009-07-09 08:44:03 +04:00
ctdb-recoverd: Simplify some stopped/banned checks to inactive checks Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-17 09:10:20 +03:00			`/* Automatically win if other node is banned or stopped */`
			`if (em->node_flags & NODE_FLAGS_INACTIVE) {`
stopped nodes can not win a recmaster election stopped nodes must yield the recmaster role (This used to be ctdb commit b75ac1185481060ab71bd743e1e48d333d716eba) 2009-07-09 08:44:03 +04:00			`return true;`
			`}`

choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`/* then the longest running node */`
			`if (cmp == 0) {`
later times are a lower priority, not a higher priority (This used to be ctdb commit e96424e7d366df29767c4eeaccdcc0cc975cb8ae) 2007-06-07 13:21:55 +04:00			`cmp = timeval_compare(&em->priority_time, &myem.priority_time);`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`}`

			`if (cmp == 0) {`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`cmp = (int)myem.pnn - (int)em->pnn;`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`}`

			`return cmp > 0;`
			`}`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
			`/*`
			`send out an election request`
			`*/`
ctdb-recoverd: Simplify arguments to some election functions The pnn and nodemap arguments to force_election() and send_election_request() are always effectively rec->pnn and rec->nodemap, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:27:01 +03:00			`static int send_election_request(struct ctdb_recoverd *rec)`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`{`
			`TDB_DATA election_data;`
			`struct election_message emsg;`
			`uint64_t srvid;`
use a priority time for the election data, not just the vnn (This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d) 2007-06-07 12:37:27 +04:00			`struct ctdb_context *ctdb = rec->ctdb;`
simplify election handling make sure we read and update the flags from all remote nodes before we reach the first codepath that can call do_recovery() since during do_recovery() we need to know what the flags are. (This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f) 2007-10-11 00:16:36 +04:00
ctdb-include: Use new protocol definitions This gets rid of the duplicate definitions from ctdb_protocol.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:51:52 +03:00			`srvid = CTDB_SRVID_ELECTION;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`ctdb_election_data(rec, &emsg);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
			`election_data.dsize = sizeof(struct election_message);`
			`election_data.dptr = (unsigned char *)&emsg;`


ctdb-recoverd: Drop calls to ctdb_ctrl_setrecmaster() Nothing fetches this value anymore. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-05-05 16:26:41 +03:00			`/* Assume this node will win the election, set leader accordingly */`
ctdb-recoverd: Rename recmaster field to leader Recovery master is being renamed to leader. This follows clustering best practice (e.g. RAFT). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 08:22:33 +03:00			`rec->leader = rec->pnn;`
Revert "if a new node enters the cluster, that node will already be frozen at start" This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94. Furthermore, if a node doesn't force an election but wins it then it can fail to record that it is the new recovery master. This can lead to a reverse split brain where there is no recovery master. This reverts commit c5035657606283d2e35bea40992505e84ca8e7be. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Conflicts: server/ctdb_recoverd.c (This used to be ctdb commit c8b542e059a54b8d524bd430cad9d82e5edd864d) 2013-10-29 09:38:42 +04:00
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`/* send an election message to all active nodes */`
When we create new election data to send during elections, we must re-read the node flags from the main daemon to catch when the STOPPED flag is changed. (This used to be ctdb commit ca4982c40d81db528fe915d5ecc01fcf7df0b522) 2009-07-17 05:37:03 +04:00			`DEBUG(DEBUG_INFO,(__location__ " Send election request to all active nodes\n"));`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`return ctdb_client_send_message(ctdb, CTDB_BROADCAST_ALL, srvid, election_data);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`}`

make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`/*`
			`we think we are winning the election - send a broadcast election request`
			`*/`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void election_send_request(struct tevent_context *ev,`
			`struct tevent_timer *te,`
			`struct timeval t, void *p)`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`{`
			`struct ctdb_recoverd *rec = talloc_get_type(p, struct ctdb_recoverd);`
			`int ret;`

ctdb-recoverd: Simplify arguments to some election functions The pnn and nodemap arguments to force_election() and send_election_request() are always effectively rec->pnn and rec->nodemap, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:27:01 +03:00			`ret = send_election_request(rec);`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Failed to send election request!\n"));`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`}`

ctdb-recoverd: Simplify using TALLOC_FREE() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 08:03:38 +03:00			`TALLOC_FREE(rec->send_election_te);`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`}`

add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`/*`
			`handler for memory dumps`
			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void mem_dump_handler(uint64_t srvid, TDB_DATA data, void *private_data)`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`TALLOC_CTX *tmp_ctx = talloc_new(ctdb);`
			`TDB_DATA *dump;`
			`int ret;`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message *rd;`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`if (data.dsize != sizeof(struct ctdb_srvid_message)) {`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`DEBUG(DEBUG_ERR, (__location__ " Wrong size of return address.\n"));`
fix a slow memory leak in the recovery daemon in the error paths for the memdump function (This used to be ctdb commit 5e641ef9d6cca286061138a9680dcf2495736e8b) 2008-09-16 03:00:48 +04:00			`talloc_free(tmp_ctx);`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`return;`
			`}`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`rd = (struct ctdb_srvid_message *)data.dptr;`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00
			`dump = talloc_zero(tmp_ctx, TDB_DATA);`
			`if (dump == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " Failed to allocate memory for memdump\n"));`
			`talloc_free(tmp_ctx);`
			`return;`
			`}`
			`ret = ctdb_dump_memory(ctdb, dump);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " ctdb_dump_memory() failed\n"));`
			`talloc_free(tmp_ctx);`
			`return;`
			`}`

ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`DBG_ERR("recovery daemon memory dump\n");`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00
rename ctdb_send_message to ctdb_client_send_message to resolve colission with the function of the same name in libctdb (This used to be ctdb commit ac3292c12832484a22715f1d46aa23f3b7c8a6f6) 2010-06-02 03:45:21 +04:00			`ret = ctdb_client_send_message(ctdb, rd->pnn, rd->srvid, *dump);`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR,("Failed to send rd memdump reply message\n"));`
fix a slow memory leak in the recovery daemon in the error paths for the memdump function (This used to be ctdb commit 5e641ef9d6cca286061138a9680dcf2495736e8b) 2008-09-16 03:00:48 +04:00			`talloc_free(tmp_ctx);`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`return;`
			`}`

			`talloc_free(tmp_ctx);`
			`}`

add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00			`/*`
			`handler for reload_nodes`
			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void reload_nodes_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00
			`DEBUG(DEBUG_ERR, (__location__ " Reload nodes file from recovery daemon\n"));`

ctdb-server: rename ctdb_load_nodes_file to ctdb_load_nodes Rename ctdb_load_nodes_file to ctdb_load_nodes as it can now load nodes from more than a regular file. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2024-06-06 20:53:43 +03:00			`ctdb_load_nodes(rec->ctdb);`
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00			`}`

add a new message to ask the recovery daemon to temporarily disable checking ip address consistency. This is useful when we are moving addresses using moveip in the cluster since otherwise if we collide with the recovery daemons own check we could cause a recovery (This used to be ctdb commit 9c63858c0b22c81eaccb9865a414af0bbb2833d4) 2009-10-06 05:11:32 +04:00
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void recd_node_rebalance_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`uint32_t pnn;`
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`uint32_t *t;`
			`int len;`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00
ctdb-recoverd: Factor out and use function this_node_is_leader() Make the code self-documenting. This preempts an upcoming change to terminology but doing it now saves a lot of churn. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:37:39 +03:00			`if (!this_node_is_leader(rec)) {`
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`return;`
			`}`

When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`if (data.dsize != sizeof(uint32_t)) {`
			`DEBUG(DEBUG_ERR,(__location__ " Incorrect size of node rebalance message. Was %zu but expected %zu bytes\n", data.dsize, sizeof(uint32_t)));`
			`return;`
			`}`

			`pnn = *(uint32_t *)&data.dptr[0];`

recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`DEBUG(DEBUG_NOTICE,("Setting up rebalance of IPs to node %u\n", pnn));`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`/* Copy any existing list of nodes. There's probably some`
			`* sort of realloc variant that will do this but we need to`
			`* make sure that freeing the old array also cancels the timer`
			`* event for the timeout... not sure if realloc will do that.`
			`*/`
			`len = (rec->force_rebalance_nodes != NULL) ?`
			`talloc_array_length(rec->force_rebalance_nodes) :`
			`0;`

			`/* This allows duplicates to be added but they don't cause`
			`* harm. A call to add a duplicate PNN arguably means that`
			`* the timeout should be reset, so this is the simplest`
			`* solution.`
			`*/`
			`t = talloc_zero_array(rec, uint32_t, len+1);`
			`CTDB_NO_MEMORY_VOID(ctdb, t);`
			`if (len > 0) {`
			`memcpy(t, rec->force_rebalance_nodes, sizeof(uint32_t) * len);`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`}`
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`t[len] = pnn;`

			`talloc_free(rec->force_rebalance_nodes);`

			`rec->force_rebalance_nodes = t;`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`}`
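The copy-then-free append used by the rebalance handler above can be sketched in plain C. This is a minimal illustration only: it uses `calloc`/`free` instead of talloc, and `struct pnn_list` and `pnn_list_append()` are hypothetical names that do not exist in ctdb. In the real code the old array is a talloc context, so freeing it also releases anything hanging off it (the comment above mentions the timer event); plain `free()` has no such effect.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for rec->force_rebalance_nodes plus its length. */
struct pnn_list {
	uint32_t *nodes;
	size_t len;
};

/*
 * Append pnn by building a fresh zeroed array of len+1 entries,
 * copying the old entries across and then freeing the old array.
 * Duplicates are allowed, as in the handler above.
 * Returns 0 on success, -1 if allocation fails (list left unchanged).
 */
static int pnn_list_append(struct pnn_list *list, uint32_t pnn)
{
	uint32_t *t;

	assert(list != NULL);

	t = calloc(list->len + 1, sizeof(uint32_t));
	if (t == NULL) {
		return -1;
	}
	if (list->len > 0) {
		memcpy(t, list->nodes, sizeof(uint32_t) * list->len);
	}
	t[list->len] = pnn;

	/* free(NULL) is a no-op, so the empty-list case needs no check */
	free(list->nodes);
	list->nodes = t;
	list->len++;
	return 0;
}
```

Replacing the array wholesale, rather than growing it in place, keeps the "free the old array" step explicit, which is what lets the talloc version drop per-array state at a well-defined point.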



ctdb-recoverd: Change argument to srvid_disable_and_reply() Reduce dependency on struct ctdb_context internals, enable a subsequent change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 13:28:05 +03:00			`static void srvid_disable_and_reply(struct ctdb_recoverd *rec,`
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 05:05:12 +03:00			`TDB_DATA data,`
			`struct ctdb_op_state *op_state)`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`{`
ctdb-recoverd: Change argument to srvid_disable_and_reply() Reduce dependency on struct ctdb_context internals, enable a subsequent change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 13:28:05 +03:00			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`struct ctdb_disable_message *r;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`uint32_t timeout;`
			`TDB_DATA result;`
			`int32_t ret = 0;`

			`/* Validate input data */`
ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`if (data.dsize != sizeof(struct ctdb_disable_message)) {`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`DEBUG(DEBUG_ERR,(__location__ " Wrong size for data :%lu "`
			`"expecting %lu\n", (long unsigned)data.dsize,`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`(long unsigned)sizeof(struct ctdb_disable_message)));`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`return;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`}`
			`if (data.dptr == NULL) {`
			`DEBUG(DEBUG_ERR,(__location__ " No data received\n"));`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`return;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`}`

ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`r = (struct ctdb_disable_message *)data.dptr;`
			`timeout = r->timeout;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 05:05:12 +03:00			`ret = ctdb_op_disable(op_state, ctdb->ev, timeout);`
ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`if (ret != 0) {`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`goto done;`
			`}`

			`/* Returning our PNN tells the caller that we succeeded */`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`ret = rec->pnn;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`done:`
			`result.dsize = sizeof(int32_t);`
			`result.dptr = (uint8_t *)&ret;`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`srvid_request_reply(ctdb, (struct ctdb_srvid_message *)r, result);`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`}`
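The reply convention in `srvid_disable_and_reply()` (drop malformed messages without replying, otherwise reply with this node's PNN on success or the error code on failure) can be sketched without the ctdb headers. Everything below is an assumption made for illustration: `struct disable_msg` is a made-up layout, `disable_fn` stands in for `ctdb_op_disable()`, and the `-11` returned by `disable_busy()` is an arbitrary example value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message layout; the real structure is defined elsewhere. */
struct disable_msg {
	uint32_t pnn;
	uint64_t srvid;
	uint32_t timeout;
};

/* Hypothetical stand-in for the disable operation: 0 means success. */
typedef int (*disable_fn)(uint32_t timeout);

/*
 * Returns 0 and fills *reply when a reply should be sent,
 * or -1 when the message is malformed and is dropped silently.
 */
static int disable_and_reply(const uint8_t *dptr, size_t dsize,
			     uint32_t my_pnn, disable_fn disable,
			     int32_t *reply)
{
	struct disable_msg msg;
	int ret;

	assert(reply != NULL);

	/* Validate input data before touching it */
	if (dptr == NULL || dsize != sizeof(msg)) {
		return -1;
	}
	memcpy(&msg, dptr, sizeof(msg));

	ret = disable(msg.timeout);

	/* Our own PNN signals success; anything else is the error code */
	*reply = (ret != 0) ? (int32_t)ret : (int32_t)my_pnn;
	return 0;
}

/* Example stubs: one that succeeds, one that fails with an arbitrary code */
static int disable_ok(uint32_t timeout) { (void)timeout; return 0; }
static int disable_busy(uint32_t timeout) { (void)timeout; return -11; }
```

A caller that receives a non-negative value can treat it as the PNN of the node that accepted the request; a negative value means the disable was refused.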

ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void disable_takeover_runs_handler(uint64_t srvid, TDB_DATA data,`
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 05:05:12 +03:00			`void *private_data)`
			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 05:05:12 +03:00
ctdb-recoverd: Change argument to srvid_disable_and_reply() Reduce dependency on struct ctdb_context internals, enable a subsequent change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 13:28:05 +03:00			`srvid_disable_and_reply(rec, data, rec->takeover_run);`
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 05:05:12 +03:00			`}`

ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:03:03 +03:00			`/* Backward compatibility for this SRVID */`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void disable_ip_check_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK Use disable_takeover_runs_handler() instead of maintaining duplicate logic. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0a51a85915486b2a8fded7ba6444b18c6c1ee8e8) 2013-08-28 05:32:54 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:03:03 +03:00			`uint32_t timeout;`
recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK Use disable_takeover_runs_handler() instead of maintaining duplicate logic. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0a51a85915486b2a8fded7ba6444b18c6c1ee8e8) 2013-08-28 05:32:54 +04:00
			`if (data.dsize != sizeof(uint32_t)) {`
			`DEBUG(DEBUG_ERR,(__location__ " Wrong size for data :%lu "`
			`"expecting %lu\n", (long unsigned)data.dsize,`
			`(long unsigned)sizeof(uint32_t)));`
			`return;`
			`}`
			`if (data.dptr == NULL) {`
			`DEBUG(DEBUG_ERR,(__location__ " No data received\n"));`
			`return;`
			`}`

ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:03:03 +03:00			`timeout = *((uint32_t *)data.dptr);`
recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK Use disable_takeover_runs_handler() instead of maintaining duplicate logic. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0a51a85915486b2a8fded7ba6444b18c6c1ee8e8) 2013-08-28 05:32:54 +04:00
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`ctdb_op_disable(rec->takeover_run, rec->ctdb->ev, timeout);`
recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK Use disable_takeover_runs_handler() instead of maintaining duplicate logic. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0a51a85915486b2a8fded7ba6444b18c6c1ee8e8) 2013-08-28 05:32:54 +04:00			`}`
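Several handlers in this file read a single `uint32_t` out of a raw message payload after checking its size. A minimal sketch of that check-then-read step follows. Note that it deliberately uses `memcpy` rather than the pointer cast used in the handlers, which sidesteps any alignment concern; `read_u32()` is a hypothetical helper, not a ctdb function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy a host-endian uint32_t out of a payload of exactly 4 bytes.
 * Returns 0 on success, -1 if the payload is missing or the wrong size.
 */
static int read_u32(const uint8_t *dptr, size_t dsize, uint32_t *out)
{
	assert(out != NULL);

	if (dptr == NULL || dsize != sizeof(uint32_t)) {
		return -1;
	}
	/* memcpy works regardless of how dptr is aligned */
	memcpy(out, dptr, sizeof(uint32_t));
	return 0;
}
```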
add a new message to ask the recovery daemon to temporarily disable checking ip address consistency. This is useful when we are moving addresses using moveip in the cluster since otherwise if we collide with the recovery daemons own check we could cause a recovery (This used to be ctdb commit 9c63858c0b22c81eaccb9865a414af0bbb2833d4) 2009-10-06 05:11:32 +04:00
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void disable_recoveries_handler(uint64_t srvid, TDB_DATA data,`
ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:06:44 +03:00			`void *private_data)`
			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:06:44 +03:00
ctdb-recoverd: Change argument to srvid_disable_and_reply() Reduce dependency on struct ctdb_context internals, enable a subsequent change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 13:28:05 +03:00			`srvid_disable_and_reply(rec, data, rec->recovery);`
ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:06:44 +03:00			`}`

add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`/*`
recoverd: Make the SRVID request structure generic No need for a separate one for each SRVID. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9c22b04d5aa7938a3965bd3144568664eb772ce) 2013-08-16 14:10:10 +04:00			`handler for ip reallocate, just add it to the list of requests and`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`handle this later in the monitor_cluster loop so we do not recurse`
recoverd: Make the SRVID request structure generic No need for a separate one for each SRVID. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9c22b04d5aa7938a3965bd3144568664eb772ce) 2013-08-16 14:10:10 +04:00			`with other requests to takeover_run()`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void ip_reallocate_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`{`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message *request;`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`if (data.dsize != sizeof(struct ctdb_srvid_message)) {`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`DEBUG(DEBUG_ERR, (__location__ " Wrong size of return address.\n"));`
			`return;`
			`}`

ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`request = (struct ctdb_srvid_message *)data.dptr;`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`srvid_request_add(rec->ctdb, &rec->reallocate_requests, request);`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`}`

recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`static void process_ipreallocate_requests(struct ctdb_context *ctdb,`
			`struct ctdb_recoverd *rec)`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`{`
			`TDB_DATA result;`
			`int32_t ret;`
ctdb-recoverd: Only respond to currently queued ipreallocated requests Otherwise new requests can come in during the latter parts of the takeover run when the IP allocation algorithm has already run, and the new requests will be dequeued even though they haven't really be processed. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-22 06:57:03 +04:00			`struct srvid_requests *current;`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00
ctdb-recoverd: Only respond to currently queued ipreallocated requests Otherwise new requests can come in during the latter parts of the takeover run when the IP allocation algorithm has already run, and the new requests will be dequeued even though they haven't really be processed. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-22 06:57:03 +04:00			`/* Only process requests that are currently pending. More`
			`* might come in while the takeover run is in progress and`
			`* they will need to be processed later since they might`
			`* be in response to flag changes.`
			`*/`
			`current = rec->reallocate_requests;`
			`rec->reallocate_requests = NULL;`

ctdb-takeover: Recovery daemon no longer passes fail callback Banning is now handled by the takeover code sending banning credit messages. This commit makes a change in behaviour quite obvious. Takeover runs were initiated from several locations in the code but banning was only done from one of these locations. Now banning can be done from any failed takeover run. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 08:35:08 +03:00			`if (do_takeover_run(rec, rec->nodemap)) {`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`ret = rec->pnn;`
ctdb-recoverd: Reload remote IPs as part of takeover run This is currently done before each IP takeover run, so just factor it in. ctdb_reload_remote_public_ips() becomes static. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Thu Nov 12 09:28:45 CET 2015 on sn-devel-104 2015-10-28 12:04:41 +03:00			`} else {`
			`ret = -1;`
server: reload the public addresses before doing a takeover run metze (This used to be ctdb commit 0e41a2204fa8a1e77dc83c0d4b253ab272b5c72d) 2010-01-19 10:42:48 +03:00			`}`

add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`result.dsize = sizeof(int32_t);`
			`result.dptr = (uint8_t *)&ret;`

ctdb-recoverd: Only respond to currently queued ipreallocated requests Otherwise new requests can come in during the latter parts of the takeover run when the IP allocation algorithm has already run, and the new requests will be dequeued even though they haven't really be processed. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-22 06:57:03 +04:00			`srvid_requests_reply(ctdb, &current, result);`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`}`
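The "only reply to requests that were pending when the run started" idea in `process_ipreallocate_requests()` can be sketched with a simple linked list. The types and functions below (`struct request`, `struct queue`, `queue_add()`, `queue_detach()`, `reply_all()`) are hypothetical and only model the pattern; the real code uses `struct srvid_requests` and `srvid_requests_reply()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request record: remembers whether it was answered */
struct request {
	int32_t result;
	int replied;
	struct request *next;
};

struct queue {
	struct request *head;
};

/* Push a request onto the pending queue */
static void queue_add(struct queue *q, struct request *r)
{
	assert(q != NULL && r != NULL);
	r->next = q->head;
	q->head = r;
}

/* Take ownership of everything currently queued and leave the queue empty */
static struct request *queue_detach(struct queue *q)
{
	struct request *current;

	assert(q != NULL);
	current = q->head;
	q->head = NULL;
	return current;
}

/* Send the same result to every request in a detached list */
static void reply_all(struct request *list, int32_t result)
{
	for (; list != NULL; list = list->next) {
		list->result = result;
		list->replied = 1;
	}
}
```

Because the queue is emptied before the long-running work starts, a request that arrives during the run lands in the fresh queue and is answered by a later run instead of being told that work it did not influence has completed.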
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00
ctdb-recoverd: Add message handler to assigning banning credits This will be called from recovery helper to assign banning credits to misbehaving node. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-03-17 09:26:30 +03:00			`/*`
			`* handler for assigning banning credits`
			`*/`
			`static void banning_handler(uint64_t srvid, TDB_DATA data, void *private_data)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`uint32_t ban_pnn;`

ctdb-recoverd: Factor out and use function this_node_is_leader() Make the code self-documenting. This preempts an upcoming change to terminology but doing it now saves a lot of churn. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:37:39 +03:00			`/* Ignore if we are not leader */`
			`if (!this_node_is_leader(rec)) {`
ctdb-recoverd: Add message handler to assigning banning credits This will be called from recovery helper to assign banning credits to misbehaving node. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-03-17 09:26:30 +03:00			`return;`
			`}`

			`if (data.dsize != sizeof(uint32_t)) {`
			`DEBUG(DEBUG_ERR, (__location__ " invalid data size %zu\n",`
			`data.dsize));`
			`return;`
			`}`

			`ban_pnn = *(uint32_t *)data.dptr;`

			`ctdb_set_culprit_count(rec, ban_pnn, rec->nodemap->num);`
			`}`
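The pattern used by banning_handler() above (check the payload size, then read a single PNN from it) can be sketched in isolation. This is a minimal sketch under stated assumptions: `struct blob` is a hypothetical stand-in for `TDB_DATA` with only `dptr` and `dsize`, and `blob_to_pnn()` is a hypothetical helper. It copies with `memcpy()` instead of casting the pointer, which is a substitution made here to avoid alignment assumptions, not what the handler itself does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for TDB_DATA: a pointer plus a length. */
struct blob {
	unsigned char *dptr;
	size_t dsize;
};

/*
 * Hypothetical helper: extract a PNN from a message payload.
 * Returns false unless the payload is exactly one uint32_t.
 */
static bool blob_to_pnn(struct blob data, uint32_t *pnn)
{
	assert(pnn != NULL);

	if (data.dptr == NULL || data.dsize != sizeof(uint32_t)) {
		return false;
	}

	/* memcpy() does not depend on the alignment of data.dptr */
	memcpy(pnn, data.dptr, sizeof(uint32_t));
	return true;
}
```

A caller would reject the message when `blob_to_pnn()` returns false, in the same way the handler returns early on a size mismatch.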
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`/*`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`* Handler for leader elections`
			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void election_handler(uint64_t srvid, TDB_DATA data, void *private_data)`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`struct election_message *em = (struct election_message *)data.dptr;`

ctdb-recoverd: A node refuses to play against itself Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-01 07:34:20 +04:00			`/* Ignore election packets from ourself */`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`if (rec->pnn == em->pnn) {`
ctdb-recoverd: A node refuses to play against itself Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-01 07:34:20 +04:00			`return;`
			`}`

make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`/* we got an election packet - update the timeout for the election */`
			`talloc_free(rec->election_timeout);`
ctdb-recoverd: Add an explicit flag for election in progress An alternate election method will be added that doesn't use the election timeout, so this provides a common way for recognising when an election is in progress. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 12:27:10 +03:00			`rec->election_in_progress = true;`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`rec->election_timeout = tevent_add_timer(`
			`ctdb->ev, ctdb,`
			`fast_start ?`
			`timeval_current_ofs(0, 500000) :`
			`timeval_current_ofs(ctdb->tunable.election_timeout, 0),`
			`ctdb_election_timeout, rec);`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`/* someone called an election. check their election data`
			`and if we disagree and we would rather be the elected node,`
			`send a new election message to all other nodes`
			`*/`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`if (ctdb_election_win(rec, em)) {`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`if (!rec->send_election_te) {`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`rec->send_election_te = tevent_add_timer(`
			`ctdb->ev, rec,`
			`timeval_current_ofs(0, 500000),`
			`election_send_request, rec);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`}`
			`return;`
			`}`
ctdb-recoverd: New function ctdb_recovery_have_lock() True if this recovery daemon holds the lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-12-09 05:50:22 +03:00
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`/* we didn't win */`
ctdb-recoverd: Simplify using TALLOC_FREE() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-31 05:59:02 +03:00			`TALLOC_FREE(rec->send_election_te);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`/* Release the cluster lock file */`
			`if (cluster_lock_held(rec)) {`
			`cluster_lock_release(rec);`
- startup frozen, and do an initial recovery - fixed a bug in traverse - get a lock on the node list file in the recmaster recovery daemon (This used to be ctdb commit 162a5647535ad1cb3e8e5d4042a2784365fb1913) 2007-05-23 08:35:19 +04:00			`}`

ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`/* Set leader to the winner of this round */`
ctdb-recoverd: Rename recmaster field to leader Recovery master is being renamed to leader. This follows clustering best practice (e.g. RAFT). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 08:22:33 +03:00			`rec->leader = em->pnn;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
			`return;`
			`}`

ctdb-recoverd: Use race for cluster lock as election when lock is enabled If the cluster is partitioned then nodes in one partition can not take the lock anyway, so election is pointless. It just introduces unnecessary corner cases. Instead just race for the lock. When a node notices a lack of leader and notifies other nodes of an election via an unknown leader broadcast, the cluster lock election is hooked into this broadcast. The test needs to be updated because losing the cluster lock can now result in a leadership change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 07:14:39 +03:00			`static void cluster_lock_election(struct ctdb_recoverd *rec)`
			`{`
			`bool ok;`

			`if (!this_node_can_be_leader(rec)) {`
			`if (cluster_lock_held(rec)) {`
			`cluster_lock_release(rec);`
			`}`
ctdb-recoverd: Always cancel election in progress Election-in-progress is set by unknown leader broadcast, so needs to be cleared in all cases when election completes. This was seen in a case where the leader node stalled, so didn't send leader broadcasts for some time. The node continued to hold the cluster lock, so another node could not become leader. However, after the node returned to normal it still did not send leader broadcasts because election-in-progress was never cleared. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-01-21 10:09:47 +03:00			`goto done;`
ctdb-recoverd: Use race for cluster lock as election when lock is enabled If the cluster is partitioned then nodes in one partition can not take the lock anyway, so election is pointless. It just introduces unnecessary corner cases. Instead just race for the lock. When a node notices a lack of leader and notifies other nodes of an election via an unknown leader broadcast, the cluster lock election is hooked into this broadcast. The test needs to be updated because losing the cluster lock can now result in a leadership change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 07:14:39 +03:00			`}`

			`/*`
			`* Don't need to unconditionally release the lock and then`
			`* attempt to retake it. This provides stability.`
			`*/`
			`if (cluster_lock_held(rec)) {`
ctdb-recoverd: Always cancel election in progress Election-in-progress is set by unknown leader broadcast, so needs to be cleared in all cases when election completes. This was seen in a case where the leader node stalled, so didn't send leader broadcasts for some time. The node continued to hold the cluster lock, so another node could not become leader. However, after the node returned to normal it still did not send leader broadcasts because election-in-progress was never cleared. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-01-21 10:09:47 +03:00			`goto done;`
ctdb-recoverd: Use race for cluster lock as election when lock is enabled If the cluster is partitioned then nodes in one partition can not take the lock anyway, so election is pointless. It just introduces unnecessary corner cases. Instead just race for the lock. When a node notices a lack of leader and notifies other nodes of an election via an unknown leader broadcast, the cluster lock election is hooked into this broadcast. The test needs to be updated because losing the cluster lock can now result in a leadership change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 07:14:39 +03:00			`}`

			`rec->leader = CTDB_UNKNOWN_PNN;`

			`ok = cluster_lock_take(rec);`
			`if (ok) {`
			`rec->leader = rec->pnn;`
			`D_WARNING("Took cluster lock, leader=%"PRIu32"\n", rec->leader);`
			`}`

ctdb-recoverd: Always cancel election in progress Election-in-progress is set by unknown leader broadcast, so needs to be cleared in all cases when election completes. This was seen in a case where the leader node stalled, so didn't send leader broadcasts for some time. The node continued to hold the cluster lock, so another node could not become leader. However, after the node returned to normal it still did not send leader broadcasts because election-in-progress was never cleared. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-01-21 10:09:47 +03:00			`done:`
ctdb-recoverd: Use race for cluster lock as election when lock is enabled If the cluster is partitioned then nodes in one partition can not take the lock anyway, so election is pointless. It just introduces unnecessary corner cases. Instead just race for the lock. When a node notices a lack of leader and notifies other nodes of an election via an unknown leader broadcast, the cluster lock election is hooked into this broadcast. The test needs to be updated because losing the cluster lock can now result in a leadership change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 07:14:39 +03:00			`rec->election_in_progress = false;`
			`}`
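The control flow of cluster_lock_election() above can be modelled without any ctdb dependencies. This is a rough sketch, not the daemon's code: `struct node_state`, `lock_election()` and `UNKNOWN_PNN` are hypothetical names, and taking the lock is reduced to a `lock_available` flag instead of a real cluster lock helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CTDB_UNKNOWN_PNN */
#define UNKNOWN_PNN UINT32_MAX

/* Hypothetical, simplified view of one recovery daemon's state. */
struct node_state {
	uint32_t pnn;
	uint32_t leader;
	bool can_be_leader;
	bool lock_held;
	bool lock_available;        /* would an attempt to take the lock succeed? */
	bool election_in_progress;
};

/* Race for the lock: whoever takes it becomes leader. */
static void lock_election(struct node_state *n)
{
	assert(n != NULL);

	/* A node that may not lead must give up the lock if it has it */
	if (!n->can_be_leader) {
		n->lock_held = false;
		goto done;
	}

	/* Already holding the lock: keep it, for stability */
	if (n->lock_held) {
		goto done;
	}

	n->leader = UNKNOWN_PNN;

	if (n->lock_available) {
		n->lock_available = false;
		n->lock_held = true;
		n->leader = n->pnn;
	}

done:
	/* Cleared on every path, as in the function above */
	n->election_in_progress = false;
}
```

The single `done:` label mirrors the fix for bug 14958 noted in the commit messages: every exit path clears the election-in-progress flag, including the paths where this node does not end up as leader.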

implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`force the start of the election process`
			`*/`
ctdb-recoverd: Simplify arguments to some election functions The pnn and nodemap arguments to force_election() and send_election_request() are always effectively rec->pnn and rec->nodemap, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:27:01 +03:00			`static void force_election(struct ctdb_recoverd *rec)`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`{`
			`int ret;`
use a priority time for the election data, not just the vnn (This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d) 2007-06-07 12:37:27 +04:00			`struct ctdb_context *ctdb = rec->ctdb;`
when starting a new election, also force all nodes into recovery mode so there is no internode traffic to interfere with our election (This used to be ctdb commit ccfb67a076c72a0e7f2b6dc5fce9c19f652ba2ad) 2007-05-10 03:48:14 +04:00
ctdb-recoverd: Consistently log start of election Elections should now be quite rare, so always log when one begins. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-01-22 22:21:51 +03:00			`D_ERR("Start election\n");`
When we create new election data to send during elections, we must re-read the node flags from the main daemon to catch when the STOPPED flag is changed. (This used to be ctdb commit ca4982c40d81db528fe915d5ecc01fcf7df0b522) 2009-07-17 05:37:03 +04:00
when starting a new election, also force all nodes into recovery mode so there is no internode traffic to interfere with our election (This used to be ctdb commit ccfb67a076c72a0e7f2b6dc5fce9c19f652ba2ad) 2007-05-10 03:48:14 +04:00			`/* set all nodes to recovery mode to stop all internode traffic */`
ctdb-recoverd: Simplify arguments to some election functions The pnn and nodemap arguments to force_election() and send_election_request() are always effectively rec->pnn and rec->nodemap, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:27:01 +03:00			`ret = set_recovery_mode(ctdb, rec, rec->nodemap, CTDB_RECOVERY_ACTIVE);`
in the destructor for the lock-wait child, make sure that we cancel any pending transactions. (This used to be ctdb commit 45b6ff64f6ddf037b810c4e5f8b9f04d71067b98) 2008-07-07 02:50:12 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode to active on cluster\n"));`
when starting a new election, also force all nodes into recovery mode so there is no internode traffic to interfere with our election (This used to be ctdb commit ccfb67a076c72a0e7f2b6dc5fce9c19f652ba2ad) 2007-05-10 03:48:14 +04:00			`return;`
			`}`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00
ctdb-recoverd: Consistently have caller set election-in-progress The problem here is that election-in-progress must be set to potentially avoid restarting the election broadcast timeout in main_loop(), so this is already done by leader_handler(). Have force_election() set election-in-progress for all election types and do not bother setting it in cluster_lock_election(). BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-01-22 21:49:18 +03:00			`rec->election_in_progress = true;`
ctdb-recoverd: Always send unknown leader broadcast when starting election This is currently missed when the cluster lock is lost. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-01-22 22:18:51 +03:00			`/* Let other nodes know that an election is underway */`
			`leader_broadcast_send(rec, CTDB_UNKNOWN_PNN);`
ctdb-recoverd: Consistently have caller set election-in-progress The problem here is that election-in-progress must be set to potentially avoid restarting the election broadcast timeout in main_loop(), so this is already done by leader_handler(). Have force_election() set election-in-progress for all election types and do not bother setting it in cluster_lock_election(). BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-01-22 21:49:18 +03:00
ctdb-recoverd: Use race for cluster lock as election when lock is enabled If the cluster is partitioned then nodes in one partition can not take the lock anyway, so election is pointless. It just introduces unnecessary corner cases. Instead just race for the lock. When a node notices a lack of leader and notifies other nodes of an election via an unknown leader broadcast, the cluster lock election is hooked into this broadcast. The test needs to be updated because losing the cluster lock can now result in a leadership change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 07:14:39 +03:00			`if (cluster_lock_enabled(rec)) {`
			`cluster_lock_election(rec);`
			`return;`
			`}`

make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`talloc_free(rec->election_timeout);`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`rec->election_timeout = tevent_add_timer(`
			`ctdb->ev, ctdb,`
			`fast_start ?`
			`timeval_current_ofs(0, 500000) :`
			`timeval_current_ofs(ctdb->tunable.election_timeout, 0),`
			`ctdb_election_timeout, rec);`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00
ctdb-recoverd: Simplify arguments to some election functions The pnn and nodemap arguments to force_election() and send_election_request() are always effectively rec->pnn and rec->nodemap, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:27:01 +03:00			`ret = send_election_request(rec);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`if (ret != 0) {`
ctdb: Add missing newlines to logging messages Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org> 2023-07-31 07:07:36 +03:00			`DBG_ERR("Failed to initiate leader election\n");`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`return;`
			`}`

moved system specific ip code to system.c (This used to be ctdb commit 9de9e4ccda9665108baac12a8716b189d26340b1) 2007-05-26 08:01:08 +04:00			`/* wait for a few seconds to collect all responses */`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`ctdb_wait_election(rec);`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`}`


ctdb-recoverd: Mark CTDB_SRVID_SET_NODE_FLAGS obsolete CTDB_SRVID_SET_NODE_FLAGS is no longer sent so drop monitor_handler() and replace with srvid_not_implemented(). Mark the SRVID obsolete in its comment. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-17 11:04:34 +03:00			`static void srvid_not_implemented(uint64_t srvid,`
			`TDB_DATA data,`
			`void *private_data)`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`{`
ctdb-recoverd: Mark CTDB_SRVID_SET_NODE_FLAGS obsolete CTDB_SRVID_SET_NODE_FLAGS is no longer sent so drop monitor_handler() and replace with srvid_not_implemented(). Mark the SRVID obsolete in its comment. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-17 11:04:34 +03:00			`const char *s;`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
ctdb-recoverd: Mark CTDB_SRVID_SET_NODE_FLAGS obsolete CTDB_SRVID_SET_NODE_FLAGS is no longer sent so drop monitor_handler() and replace with srvid_not_implemented(). Mark the SRVID obsolete in its comment. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-17 11:04:34 +03:00			`switch (srvid) {`
			`case CTDB_SRVID_SET_NODE_FLAGS:`
			`s = "CTDB_SRVID_SET_NODE_FLAGS";`
			`break;`
			`default:`
			`s = "UNKNOWN";`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`}`

ctdb-recoverd: Mark CTDB_SRVID_SET_NODE_FLAGS obsolete CTDB_SRVID_SET_NODE_FLAGS is no longer sent so drop monitor_handler() and replace with srvid_not_implemented(). Mark the SRVID obsolete in its comment. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-17 11:04:34 +03:00			`D_WARNING("SRVID %s (0x%" PRIx64 ") is obsolete\n", s, srvid);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`}`
add a test in the function that checks whether the cluster needs recovery or not that all active nodes are in normal mode. If we discover that some node is still in recoverymode it may indicate that a previous recovery ended prematurely and thus we should start a new recovery (This used to be ctdb commit c15517872e6c98c8c425a8d47d2b348ecb0620b0) 2007-05-06 22:41:12 +04:00
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`/*`
Spelling fixes s/ ot / to / Signed-off-by: Mathieu Parent <math.parent@gmail.com> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Gary Lockyer <gary@catalyst.net.nz> 2019-08-29 23:19:03 +03:00			`handler for when we need to push out flag changes to all other nodes`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void push_flags_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`int ret;`
			`struct ctdb_node_flag_change *c = (struct ctdb_node_flag_change *)data.dptr;`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap=NULL;`
server: if takeover runs when the recovery master becomes unhealthy The problem was this: When the monitor event fails, the node->flags get updated, and an update (containing the old and new flags) is sent to the recovery master. If the recovery master sends the update to itself (the same process), it was compairing the node->flags variable with the received new flags. This check always found both flag values to be equal and never sets the rec->need_takeover_run variable to true. There were two problem, first the push_flags_handler() function didn't pass the received old flags. And the ctdb_control_modflags() function ignored the received old flags. metze (This used to be ctdb commit 8ec633b64a05a2d903c2b9639909f15f6375548f) 2009-10-09 17:47:49 +04:00			`TALLOC_CTX *tmp_ctx = talloc_new(ctdb);`
			`uint32_t *nodes;`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`/* read the node flags from the leader */`
ctdb-recoverd: Rename recmaster field to leader Recovery master is being renamed to leader. This follows clustering best practice (e.g. RAFT). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 08:22:33 +03:00			`ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), rec->leader,`
ctdb-recoverd: Don't retrieve recovery master from local daemon The recovery daemon already knows which node is the master. This relies on rec->recmaster being correctly initialised and correctly set during elections. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:33:01 +03:00			`tmp_ctx, &nodemap);`
server: if takeover runs when the recovery master becomes unhealthy The problem was this: When the monitor event fails, the node->flags get updated, and an update (containing the old and new flags) is sent to the recovery master. If the recovery master sends the update to itself (the same process), it was comparing the node->flags variable with the received new flags. This check always found both flag values to be equal and never set the rec->need_takeover_run variable to true. There were two problems: first, the push_flags_handler() function didn't pass the received old flags; second, the ctdb_control_modflags() function ignored the received old flags. metze (This used to be ctdb commit 8ec633b64a05a2d903c2b9639909f15f6375548f) 2009-10-09 17:47:49 +04:00			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to get nodemap from node %u\n", c->pnn));`
			`talloc_free(tmp_ctx);`
			`return;`
rewrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`}`
server: if takeover runs when the recovery master becomes unhealthy The problem was this: When the monitor event fails, the node->flags get updated, and an update (containing the old and new flags) is sent to the recovery master. If the recovery master sends the update to itself (the same process), it was comparing the node->flags variable with the received new flags. This check always found both flag values to be equal and never set the rec->need_takeover_run variable to true. There were two problems: first, the push_flags_handler() function didn't pass the received old flags; second, the ctdb_control_modflags() function ignored the received old flags. metze (This used to be ctdb commit 8ec633b64a05a2d903c2b9639909f15f6375548f) 2009-10-09 17:47:49 +04:00			`if (c->pnn >= nodemap->num) {`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`DBG_ERR("Nodemap from leader does not contain node %d\n",`
			`c->pnn);`
server: if takeover runs when the recovery master becomes unhealthy The problem was this: When the monitor event fails, the node->flags get updated, and an update (containing the old and new flags) is sent to the recovery master. If the recovery master sends the update to itself (the same process), it was comparing the node->flags variable with the received new flags. This check always found both flag values to be equal and never set the rec->need_takeover_run variable to true. There were two problems: first, the push_flags_handler() function didn't pass the received old flags; second, the ctdb_control_modflags() function ignored the received old flags. metze (This used to be ctdb commit 8ec633b64a05a2d903c2b9639909f15f6375548f) 2009-10-09 17:47:49 +04:00			`talloc_free(tmp_ctx);`
			`return;`
			`}`

			`/* send the flags update to all connected nodes */`
			`nodes = list_of_connected_nodes(ctdb, nodemap, tmp_ctx, true);`

			`if (ctdb_client_async_control(ctdb, CTDB_CONTROL_MODIFY_FLAGS,`
			`nodes, 0, CONTROL_TIMEOUT(),`
			`false, data,`
			`NULL, NULL,`
			`NULL) != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " ctdb_control to modify node flags failed\n"));`

			`talloc_free(tmp_ctx);`
			`return;`
			`}`

			`talloc_free(tmp_ctx);`
rewrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`}`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particularly important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-17 06:42:47 +03:00			`static void leader_broadcast_timeout_handler(struct tevent_context *ev,`
			`struct tevent_timer *te,`
			`struct timeval current_time,`
			`void *private_data)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type_abort(`
			`private_data, struct ctdb_recoverd);`

			`rec->leader_broadcast_timeout_te = NULL;`

ctdb-recoverd: Consistently log start of election Elections should now be quite rare, so always log when one begins. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-01-22 22:21:51 +03:00			`D_NOTICE("Leader broadcast timeout\n");`

ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particularly important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-17 06:42:47 +03:00			`force_election(rec);`
			`}`

			`static void leader_broadcast_timeout_cancel(struct ctdb_recoverd *rec)`
			`{`
			`TALLOC_FREE(rec->leader_broadcast_timeout_te);`
			`}`

			`static int leader_broadcast_timeout_start(struct ctdb_recoverd *rec)`
			`{`
			`struct ctdb_context *ctdb = rec->ctdb;`

			`/*`
			`* This should not be necessary. However, there will be`
			`* interactions with election code here. It will want to`
			`* cancel and restart the timer around potentially long`
			`* elections.`
			`*/`
			`leader_broadcast_timeout_cancel(rec);`

			`rec->leader_broadcast_timeout_te =`
			`tevent_add_timer(`
			`ctdb->ev,`
			`rec,`
ctdb-config: Add configuration option [cluster] leader timeout Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2022-01-15 05:02:02 +03:00			`timeval_current_ofs(ctdb_config.leader_timeout, 0),`
ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particularly important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-17 06:42:47 +03:00			`leader_broadcast_timeout_handler,`
			`rec);`
			`if (rec->leader_broadcast_timeout_te == NULL) {`
			`D_ERR("Unable to start leader broadcast timeout\n");`
			`return ENOMEM;`
			`}`

			`return 0;`
			`}`

			`static bool leader_broadcast_timeout_active(struct ctdb_recoverd *rec)`
			`{`
			`return rec->leader_broadcast_timeout_te != NULL;`
			`}`
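The three helpers above follow a cancel-then-rearm pattern around a one-shot timer: starting always cancels first, "active" is just "handle is non-NULL", and the handler clears the handle before acting. The following is a minimal sketch of that pattern with a simulated clock. It does not use tevent; `struct watchdog`, `watchdog_tick()` and the `fired` counter are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a one-shot timer; this is not the tevent API. */
struct watchdog {
	bool armed;        /* plays the role of "timeout_te != NULL" */
	unsigned deadline; /* simulated time at which the timer fires */
	unsigned timeout;  /* plays the role of the configured leader timeout */
	unsigned fired;    /* how many times the expiry handler has run */
};

static void watchdog_cancel(struct watchdog *w)
{
	assert(w != NULL);
	w->armed = false;
}

/* Always cancel before arming, so at most one timer is ever pending. */
static void watchdog_start(struct watchdog *w, unsigned now)
{
	watchdog_cancel(w);
	w->deadline = now + w->timeout;
	w->armed = true;
}

static bool watchdog_active(const struct watchdog *w)
{
	return w->armed;
}

/* Advance simulated time; the timer is one-shot, so expiry disarms it. */
static void watchdog_tick(struct watchdog *w, unsigned now)
{
	if (w->armed && now >= w->deadline) {
		w->armed = false; /* clear the handle first, as the handler does */
		w->fired++;       /* stands in for forcing an election */
	}
}
```

In this sketch, calling `watchdog_start()` again before the deadline pushes the deadline out, which is the effect a received leader broadcast is meant to have on the real timer.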

ctdb-recoverd: Process leader broadcasts Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:07:26 +03:00			`static void leader_handler(uint64_t srvid, TDB_DATA data, void *private_data)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type_abort(`
			`private_data, struct ctdb_recoverd);`
			`uint32_t pnn;`
			`size_t npull;`
			`int ret;`

			`ret = ctdb_uint32_pull(data.dptr, data.dsize, &pnn, &npull);`
			`if (ret != 0) {`
			`DBG_WARNING("Unable to parse leader broadcast, ret=%d\n", ret);`
			`return;`
			`}`

ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particularly important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-17 06:42:47 +03:00			`leader_broadcast_timeout_cancel(rec);`

ctdb-recoverd: Process leader broadcasts Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:07:26 +03:00			`if (pnn == rec->leader) {`
ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particularly important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-17 06:42:47 +03:00			`goto done;`
ctdb-recoverd: Process leader broadcasts Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:07:26 +03:00			`}`

			`if (pnn == CTDB_UNKNOWN_PNN) {`
ctdb-recoverd: Use race for cluster lock as election when lock is enabled If the cluster is partitioned then nodes in one partition can not take the lock anyway, so election is pointless. It just introduces unnecessary corner cases. Instead just race for the lock. When a node notices a lack of leader and notifies other nodes of an election via an unknown leader broadcast, the cluster lock election is hooked into this broadcast. The test needs to be updated because losing the cluster lock can now result in a leadership change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 07:14:39 +03:00			`bool was_election_in_progress = rec->election_in_progress;`

ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particularly important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-17 06:42:47 +03:00			`/*`
			`* Leader broadcast timeout was cancelled above - stop`
			`* main loop from restarting it until election is`
			`* complete`
			`*/`
			`rec->election_in_progress = true;`
ctdb-recoverd: Use race for cluster lock as election when lock is enabled If the cluster is partitioned then nodes in one partition can not take the lock anyway, so election is pointless. It just introduces unnecessary corner cases. Instead just race for the lock. When a node notices a lack of leader and notifies other nodes of an election via an unknown leader broadcast, the cluster lock election is hooked into this broadcast. The test needs to be updated because losing the cluster lock can now result in a leadership change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 07:14:39 +03:00
			`/*`
			`* This is the only notification for a cluster lock`
			`* election, so handle it here...`
			`*/`
			`if (cluster_lock_enabled(rec) && !was_election_in_progress) {`
			`cluster_lock_election(rec);`
			`}`

ctdb-recoverd: Process leader broadcasts Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:07:26 +03:00			`return;`
			`}`

			`D_NOTICE("Received leader broadcast, leader=%"PRIu32"\n", pnn);`
			`rec->leader = pnn;`
ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particularly important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-17 06:42:47 +03:00
			`done:`
			`leader_broadcast_timeout_start(rec);`
ctdb-recoverd: Process leader broadcasts Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:07:26 +03:00			`}`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`struct verify_recmode_normal_data {`
			`uint32_t count;`
			`enum monitor_result status;`
			`};`

			`static void verify_recmode_normal_callback(struct ctdb_client_control_state *state)`
			`{`
change async.private to async.private_data since private is a reserved word in C++ (This used to be ctdb commit 79eb28f6cd5dcc30b04966d202a050eaf98a2552) 2007-09-26 08:25:32 +04:00			`struct verify_recmode_normal_data *rmdata = talloc_get_type(state->async.private_data, struct verify_recmode_normal_data);`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00

			`/* one more node has responded with recmode data*/`
			`rmdata->count--;`

			`/* if we failed to get the recmode, then return an error and let`
			`the main loop try again.`
			`*/`
			`if (state->state != CTDB_CONTROL_DONE) {`
			`if (rmdata->status == MONITOR_OK) {`
			`rmdata->status = MONITOR_FAILED;`
			`}`
			`return;`
			`}`

			`/* if we got a response, then the recmode will be stored in the`
			`status field`
			`*/`
			`if (state->status != CTDB_RECOVERY_NORMAL) {`
recoverd: Fix an unclear log message - "Restart recovery process" When the recovery master notices a node in recovery mode it starts the recovery process, it doesn't restart it. Update documentation to match. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 298c4d2c3b4ea3d900c91f5a0a5aca2952a13d61) 2013-06-30 11:57:33 +04:00			`DEBUG(DEBUG_NOTICE, ("Node:%u was in recovery mode. Start recovery process\n", state->c->hdr.destnode));`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`rmdata->status = MONITOR_RECOVERY_NEEDED;`
			`}`

			`return;`
			`}`


			`/* verify that all nodes are in normal recovery mode */`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`static enum monitor_result verify_recmode(struct ctdb_context *ctdb, struct ctdb_node_map_old *nodemap)`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`{`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`struct verify_recmode_normal_data *rmdata;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`TALLOC_CTX *mem_ctx = talloc_new(ctdb);`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`struct ctdb_client_control_state *state;`
			`enum monitor_result status;`
ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-23 01:43:58 +03:00			`unsigned int j;`

change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`rmdata = talloc(mem_ctx, struct verify_recmode_normal_data);`
			`CTDB_NO_MEMORY_FATAL(ctdb, rmdata);`
			`rmdata->count = 0;`
			`rmdata->status = MONITOR_OK;`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00
			`/* loop over all active nodes and send an async getrecmode call to`
			`them*/`
			`for (j=0; j<nodemap->num; j++) {`
			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
			`continue;`
			`}`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`state = ctdb_ctrl_getrecmode_send(ctdb, mem_ctx,`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`CONTROL_TIMEOUT(),`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumption that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitly lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn);`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`if (state == NULL) {`
			`/* we failed to send the control, treat this as`
			`an error and try again next iteration`
			`*/`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Failed to call ctdb_ctrl_getrecmode_send during monitoring\n"));`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`talloc_free(mem_ctx);`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`return MONITOR_FAILED;`
			`}`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`/* set up the callback functions */`
			`state->async.fn = verify_recmode_normal_callback;`
change async.private to async.private_data since private is a reserved word in C++ (This used to be ctdb commit 79eb28f6cd5dcc30b04966d202a050eaf98a2552) 2007-09-26 08:25:32 +04:00			`state->async.private_data = rmdata;`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00
			`/* one more control to wait for to complete */`
			`rmdata->count++;`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`}`

change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00
			`/* now wait for up to the maximum number of seconds allowed`
			`or until all nodes we expect a response from have replied`
			`*/`
			`while (rmdata->count > 0) {`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_loop_once(ctdb->ev);`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`}`

			`status = rmdata->status;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`talloc_free(mem_ctx);`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`return status;`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`}`
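`verify_recmode()` and its callback use a common fan-out idiom: increment a counter for each request sent, decrement it in the completion callback, and spin the event loop until the counter reaches zero while the callback folds each reply into one aggregate status. The sketch below reproduces only that bookkeeping, under stated assumptions: replies are delivered from an array instead of an event loop, and `struct fanout`, `struct reply`, `on_reply()` and `collect()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

enum result { RES_OK, RES_FAILED, RES_RECOVERY_NEEDED };

/* Shared bookkeeping, analogous to the count/status pair above. */
struct fanout {
	unsigned count;     /* replies still outstanding */
	enum result status; /* aggregate outcome so far */
};

/* Hypothetical completion record: did the request complete, and
 * did the peer report that it is in recovery mode? */
struct reply {
	int done;
	int in_recovery;
};

static void on_reply(struct fanout *f, const struct reply *r)
{
	assert(f->count > 0);
	f->count--; /* one more node has responded */

	if (!r->done) {
		/* A failed request only downgrades an OK status. */
		if (f->status == RES_OK) {
			f->status = RES_FAILED;
		}
		return;
	}

	if (r->in_recovery) {
		f->status = RES_RECOVERY_NEEDED;
	}
}

/* Deliver replies one by one until none are outstanding. The loop
 * body stands in for one iteration of the event loop. */
static enum result collect(const struct reply *replies, size_t n)
{
	struct fanout f = { .count = (unsigned)n, .status = RES_OK };
	size_t next = 0;

	while (f.count > 0) {
		on_reply(&f, &replies[next++]);
	}
	return f.status;
}
```

In this model a "recovery needed" reply wins over a failed request regardless of arrival order, which matches how the callback above only sets `MONITOR_FAILED` when the status is still `MONITOR_OK`.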

change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of all the interface data is done. So, if any IPs move then the reference counts for the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`static bool interfaces_have_changed(struct ctdb_context *ctdb,`
			`struct ctdb_recoverd *rec)`
			`{`
ctdb-daemon: Rename struct ctdb_control_get_ifaces to ctdb_iface_list_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 11:43:48 +03:00			`struct ctdb_iface_list_old *ifaces = NULL;`
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of all the interface data is done. So, if any IPs move then the reference counts for the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`TALLOC_CTX *mem_ctx;`
			`bool ret = false;`

			`mem_ctx = talloc_new(NULL);`

			`/* Read the interfaces from the local node */`
			`if (ctdb_ctrl_get_ifaces(ctdb, CONTROL_TIMEOUT(),`
			`CTDB_CURRENT_NODE, mem_ctx, &ifaces) != 0) {`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`D_ERR("Unable to get interfaces from local node %u\n", rec->pnn);`
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of all the interface data is done. So, if any IPs move then the reference counts for the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`/* We could return an error. However, this will be`
			`* rare so we'll decide that the interfaces have`
			`* actually changed, just in case.`
			`*/`
			`talloc_free(mem_ctx);`
			`return true;`
			`}`

			`if (!rec->ifaces) {`
			`/* We haven't been here before so things have changed */`
recoverd: Log more information when interfaces change Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3ef93a1a3e60cdf5d8954e7a16a988ea6126916b) 2013-08-15 11:04:01 +04:00			`DEBUG(DEBUG_NOTICE, ("Initial interface fetched\n"));`
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of all the interface data is done. So, if any IPs move then the reference counts for the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`ret = true;`
			`} else if (rec->ifaces->num != ifaces->num) {`
			`/* Number of interfaces has changed */`
recoverd: Log more information when interfaces change Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3ef93a1a3e60cdf5d8954e7a16a988ea6126916b) 2013-08-15 11:04:01 +04:00			`DEBUG(DEBUG_NOTICE, ("Interface count changed from %d to %d\n",`
			`rec->ifaces->num, ifaces->num));`
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of all the interface data is done. So, if any IPs move then the reference counts for the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`ret = true;`
			`} else {`
			`/* See if interface names or link states have changed */`
ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-23 01:43:58 +03:00			`unsigned int i;`
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of the all the interface data is done. So, if any IPs move then the reference counts for the the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`for (i = 0; i < rec->ifaces->num; i++) {`
ctdb-daemon: Rename struct ctdb_control_iface_info to ctdb_iface Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 11:37:17 +03:00			`struct ctdb_iface * iface = &rec->ifaces->ifaces[i];`
recoverd: Log more information when interfaces change Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3ef93a1a3e60cdf5d8954e7a16a988ea6126916b) 2013-08-15 11:04:01 +04:00			`if (strcmp(iface->name, ifaces->ifaces[i].name) != 0) {`
			`DEBUG(DEBUG_NOTICE,`
			`("Interface in slot %u changed: %s => %s\n",`
			`i, iface->name, ifaces->ifaces[i].name));`
			`ret = true;`
			`break;`
			`}`
			`if (iface->link_state != ifaces->ifaces[i].link_state) {`
			`DEBUG(DEBUG_NOTICE,`
			`("Interface %s changed state: %d => %d\n",`
			`iface->name, iface->link_state,`
			`ifaces->ifaces[i].link_state));`
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of the all the interface data is done. So, if any IPs move then the reference counts for the the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`ret = true;`
			`break;`
			`}`
			`}`
			`}`

			`talloc_free(rec->ifaces);`
			`rec->ifaces = talloc_steal(rec, ifaces);`

			`talloc_free(mem_ctx);`
			`return ret;`
			`}`
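
The comparison above can be read in isolation as follows. This is a minimal sketch, not the ctdb implementation: `struct example_iface` and `example_ifaces_changed()` are hypothetical names introduced only for illustration, and the snapshots are passed as plain arrays rather than talloc-managed lists. It keeps the property described in the blame history for this function: reference counts are not compared, so IP moves alone do not register as an interface change.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct ctdb_iface. */
struct example_iface {
	char name[16];
	uint16_t link_state;
	uint32_t references;	/* deliberately ignored by the comparison */
};

/*
 * Return true if the two snapshots differ in count, per-slot name or
 * per-slot link state.  A NULL old snapshot means "never fetched
 * before", which is treated as a change.
 */
static bool example_ifaces_changed(const struct example_iface *old_ifaces,
				   size_t old_num,
				   const struct example_iface *new_ifaces,
				   size_t new_num)
{
	size_t i;

	if (old_ifaces == NULL) {
		return true;
	}
	if (old_num != new_num) {
		return true;
	}
	for (i = 0; i < old_num; i++) {
		if (strcmp(old_ifaces[i].name, new_ifaces[i].name) != 0) {
			return true;
		}
		if (old_ifaces[i].link_state != new_ifaces[i].link_state) {
			return true;
		}
	}
	return false;
}
```

Two snapshots that differ only in `references` compare as unchanged, while a differing name, link state or count reports a change.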
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
ctdb-recoverd: Fold IP allocation house-keeping into IP verification Now all the IP takeover code for non-master node is in this function. The function can always be renamed to something more suitable. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri May 6 15:10:59 CEST 2016 on sn-devel-144 2016-05-03 09:36:37 +03:00			`/* Check that the local allocation of public IP addresses is correct`
			`* and do some house-keeping */`
ctdb-recoverd: Simplify arguments to verify_local_ip_allocation() All other arguments are available via rec, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-13 01:51:36 +03:00			`static int verify_local_ip_allocation(struct ctdb_recoverd *rec)`
track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00			`{`
			`TALLOC_CTX *mem_ctx = talloc_new(NULL);`
ctdb-recoverd: Simplify arguments to verify_local_ip_allocation() All other arguments are available via rec, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-13 01:51:36 +03:00			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-23 01:43:58 +03:00			`unsigned int j;`
			`int ret;`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`bool need_takeover_run = false;`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`struct ctdb_public_ip_list_old *ips = NULL;`
ctdb-server: Optimise local IP verification It is more efficient calling ctdb_sys_local_ip_check() inside a loop compared to calling ctdb_sys_have_ip(). There is a chance that this is premature optimisation... but it sure is easy. Fall back to checking with bind(). I think these checks really exist because of the weirdness fixed by commit 4b4e4d8870475d994fe42a7b2c57dc69842d91f6. However, we might as well do what we can. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Anoop C S <anoopcs@samba.org> 2024-09-29 07:10:22 +03:00			`struct ctdb_sys_local_ips_context *ips_ctx = NULL;`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00
ctdb-recoverd: Factor out and use function this_node_is_leader() Make the code self-documenting. This preempts an upcoming change to terminology but doing it now saves a lot of churn. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:37:39 +03:00			`/* If we are not the leader then do some housekeeping */`
			`if (!this_node_is_leader(rec)) {`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`/* Ignore any IP reallocate requests - only leader`
ctdb-recoverd: Fold IP allocation house-keeping into IP verification Now all the IP takeover code for non-master node is in this function. The function can always be renamed to something more suitable. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri May 6 15:10:59 CEST 2016 on sn-devel-144 2016-05-03 09:36:37 +03:00			`* processes them`
			`*/`
			`TALLOC_FREE(rec->reallocate_requests);`
			`/* Clear any nodes that should be force rebalanced in`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`* the next takeover run. If the leader has changed`
			`* then we don't want to process these some time in`
			`* the future.`
ctdb-recoverd: Fold IP allocation house-keeping into IP verification Now all the IP takeover code for non-master node is in this function. The function can always be renamed to something more suitable. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri May 6 15:10:59 CEST 2016 on sn-devel-144 2016-05-03 09:36:37 +03:00			`*/`
			`TALLOC_FREE(rec->force_rebalance_nodes);`
			`}`

ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`/* Return early if disabled... */`
ctdb-config: Switch tunable DisableIPFailover to a config option Use the "failover:disabled" option instead. BUG: https://bugzilla.samba.org/show_bug.cgi?id=13589 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-08-21 06:41:22 +03:00			`if (ctdb_config.failover_disabled ||`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`ctdb_op_is_disabled(rec->takeover_run)) {`
ctdb: Fix a memleak Bug: https://bugzilla.samba.org/show_bug.cgi?id=14348 Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri Apr 17 08:32:35 UTC 2020 on sn-devel-184 2020-04-16 15:38:34 +03:00			`talloc_free(mem_ctx);`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`return 0;`
			`}`
track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of the all the interface data is done. So, if any IPs move then the reference counts for the the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`if (interfaces_have_changed(ctdb, rec)) {`
server: monitor interfaces in verify_ip_allocation() metze (This used to be ctdb commit 965a65520693e3731b5b0250127b04c777087808) 2009-12-22 17:21:08 +03:00			`need_takeover_run = true;`
			`}`

ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`/* If there are unhosted IPs but this node can host them then`
			`* trigger an IP reallocation */`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`/* Read available IPs from local node */`
			`ret = ctdb_ctrl_get_public_ips_flags(`
			`ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, mem_ctx,`
			`CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE, &ips);`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`if (ret != 0) {`
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_ERR, ("Unable to retrieve available public IPs\n"));`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`talloc_free(mem_ctx);`
			`return -1;`
			`}`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`for (j=0; j<ips->num; j++) {`
ctdb-recovery: Avoid -1 as a PNN, use CTDB_UNKNOWN_PNN instead Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-23 10:50:32 +03:00			`if (ips->ips[j].pnn == CTDB_UNKNOWN_PNN &&`
ctdb-recoverd: Simplify arguments to verify_local_ip_allocation() All other arguments are available via rec, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-13 01:51:36 +03:00			`rec->nodemap->nodes[rec->pnn].flags == 0) {`
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_WARNING,`
			`("Unassigned IP %s can be served by this node\n",`
			`ctdb_addr_to_str(&ips->ips[j].addr)));`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`need_takeover_run = true;`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00			`}`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`}`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`talloc_free(ips);`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00
ctdb-recoverd: Skip known IP address checking when it is disabled When public IP checking is disabled, verify_local_ip_allocation() still retrieves known IP addresses and runs through a loop that does nothing. Instead, completely skip the retrieval and checking loop. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:44:15 +03:00			`if (!ctdb->do_checkpublicip) {`
			`goto done;`
			`}`

ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`/* Validate the IP addresses that this node has on network`
			`* interfaces. If there is an inconsistency between reality`
			`* and the state expected by CTDB then try to fix it by`
			`* triggering an IP reallocation or releasing extraneous IP`
			`* addresses. */`

			`/* Read known IPs from local node */`
			`ret = ctdb_ctrl_get_public_ips_flags(`
			`ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, mem_ctx, 0, &ips);`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`if (ret != 0) {`
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_ERR, ("Unable to retrieve known public IPs\n"));`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`talloc_free(mem_ctx);`
			`return -1;`
			`}`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00
ctdb-server: Optimise local IP verification It is more efficient calling ctdb_sys_local_ip_check() inside a loop compared to calling ctdb_sys_have_ip(). There is a chance that this is premature optimisation... but it sure is easy. Fall back to checking with bind(). I think these checks really exist because of the weirdness fixed by commit 4b4e4d8870475d994fe42a7b2c57dc69842d91f6. However, we might as well do what we can. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Anoop C S <anoopcs@samba.org> 2024-09-29 07:10:22 +03:00			`ret = ctdb_sys_local_ips_init(mem_ctx, &ips_ctx);`
			`if (ret != 0) {`
			`/*`
			`* What to do? The point here is to allow public`
			`* addresses to be checked when`
			`* net.ipv4.ip_nonlocal_bind = 1, which is probably`
			`* just Linux... though other platforms may have a`
			`* similar setting. For non-Linux platforms without a`
			`* usable getifaddrs(3) function/replacement, fall`
			`* back to bind() below...`
			`*/`
			`DBG_DEBUG("Failed to get local addresses, depending on bind\n");`
			`ips_ctx = NULL; /* Just in case */`
			`}`

ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`for (j=0; j<ips->num; j++) {`
ctdb-server: Add some local variables Improve readability by not repeating the complex expression now assigned to addr. ctdb_sys_have_ip() is called in both arms of the if/else, so call it once when declaring the new variable. Modernise debug macros while touching lines. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Anoop C S <anoopcs@samba.org> 2024-09-29 07:06:51 +03:00			`ctdb_sock_addr *addr = &ips->ips[j].addr;`
ctdb-server: Optimise local IP verification It is more efficient calling ctdb_sys_local_ip_check() inside a loop compared to calling ctdb_sys_have_ip(). There is a chance that this is premature optimisation... but it sure is easy. Fall back to checking with bind(). I think these checks really exist because of the weirdness fixed by commit 4b4e4d8870475d994fe42a7b2c57dc69842d91f6. However, we might as well do what we can. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Anoop C S <anoopcs@samba.org> 2024-09-29 07:10:22 +03:00			`bool have_ip;`

			`if (ips_ctx != NULL) {`
			`have_ip = ctdb_sys_local_ip_check(ips_ctx, addr);`
			`} else {`
			`have_ip = ctdb_sys_bind_ip_check(addr);`
			`}`
ctdb-server: Add some local variables Improve readability by not repeating the complex expression now assigned to addr. ctdb_sys_have_ip() is called in both arms of the if/else, so call it once when declaring the new variable. Modernise debug macros while touching lines. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Anoop C S <anoopcs@samba.org> 2024-09-29 07:06:51 +03:00
ctdb-recoverd: Simplify arguments to verify_local_ip_allocation() All other arguments are available via rec, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-13 01:51:36 +03:00			`if (ips->ips[j].pnn == rec->pnn) {`
ctdb-server: Add some local variables Improve readability by not repeating the complex expression now assigned to addr. ctdb_sys_have_ip() is called in both arms of the if/else, so call it once when declaring the new variable. Modernise debug macros while touching lines. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Anoop C S <anoopcs@samba.org> 2024-09-29 07:06:51 +03:00			`if (!have_ip) {`
			`D_ERR("Assigned IP %s not on an interface\n",`
			`ctdb_addr_to_str(addr));`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`need_takeover_run = true;`
			`}`
			`} else {`
ctdb-server: Add some local variables Improve readability by not repeating the complex expression now assigned to addr. ctdb_sys_have_ip() is called in both arms of the if/else, so call it once when declaring the new variable. Modernise debug macros while touching lines. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Anoop C S <anoopcs@samba.org> 2024-09-29 07:06:51 +03:00			`if (have_ip) {`
			`D_ERR("IP %s incorrectly on an interface\n",`
			`ctdb_addr_to_str(addr));`
ctdb-recoverd: Don't directly release rogue IP addresses This is inconsistent with the rest of the local IP verification. It should notice problems but not try to fix them directly. Like other cases, it should use an IP takeover run to try to fix the problem. In this case the address might have just been added and an out-of-band RELEASE_IP might cause conflicts (i.e. "another change is in flight") with a scheduled IP takeover run. This effectively reverts commit 694c1b269edc95df446b2e171919be0c266383c4. Not sure why this was needed after c7e648c2d11f9785f2493a3dd67567a635633489. More recently commit 6471541d6d2bc9f2af0ff92b280abbd1d933cf88 moves responsibility for determining interface/netmask to 10.interface so this should continue to work just fine. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-08-02 05:18:15 +03:00			`need_takeover_run = true;`
track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00			`}`
			`}`
			`}`

ctdb-server: Optimise local IP verification It is more efficient calling ctdb_sys_local_ip_check() inside a loop compared to calling ctdb_sys_have_ip(). There is a chance that this is premature optimisation... but it sure is easy. Fall back to checking with bind(). I think these checks really exist because of the weirdness fixed by commit 4b4e4d8870475d994fe42a7b2c57dc69842d91f6. However, we might as well do what we can. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Anoop C S <anoopcs@samba.org> 2024-09-29 07:10:22 +03:00			`TALLOC_FREE(ips_ctx);`

ctdb-recoverd: Skip known IP address checking when it is disabled When public IP checking is disabled, verify_local_ip_allocation() still retrieves known IP addresses and runs through a loop that does nothing. Instead, completely skip the retrieval and checking loop. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:44:15 +03:00			`done:`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`if (need_takeover_run) {`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message rd;`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`TDB_DATA data;`

ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_NOTICE, ("Trigger takeover run\n"));`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00
ctdb-recoverd: Fix some uninitialised memory issues The first element of these structures is a 32-bit PNN. On 64-bit systems this field can be followed by 32-bits of padding. When the structures are copied this can cause uninitialised memory to be copied. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> 2016-01-11 09:23:12 +03:00			`ZERO_STRUCT(rd);`
ctdb-recoverd: Simplify arguments to verify_local_ip_allocation() All other arguments are available via rec, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-13 01:51:36 +03:00			`rd.pnn = rec->pnn;`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`rd.srvid = 0;`
			`data.dptr = (uint8_t *)&rd;`
			`data.dsize = sizeof(rd);`

ctdb-recoverd: Broadcast takeover run message when verifying IPs This makes it consistent with the monitoring code. If the master has changed then this means the master will always get the message. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Aug 18 06:24:11 UTC 2020 on sn-devel-184 2020-07-29 00:02:45 +03:00			`ret = ctdb_client_send_message(ctdb,`
			`CTDB_BROADCAST_CONNECTED,`
			`CTDB_SRVID_TAKEOVER_RUN,`
			`data);`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`if (ret != 0) {`
ctdb-recoverd: Broadcast takeover run message when verifying IPs This makes it consistent with the monitoring code. If the master has changed then this means the master will always get the message. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Aug 18 06:24:11 UTC 2020 on sn-devel-184 2020-07-29 00:02:45 +03:00			`D_ERR("Failed to send takeover run request\n");`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`}`
			`}`
track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00			`talloc_free(mem_ctx);`
			`return 0;`
			`}`

redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00
ctdb-recoverd: Add an intermediate state struct for nodemap fetching This will allow an error callback to be added. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:52:22 +03:00			`struct remote_nodemaps_state {`
			`struct ctdb_node_map_old **remote_nodemaps;`
ctdb-recoverd: Add fail callback to assign banning credits Also drop error handling in main_loop() that is replaced by this change. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:58:15 +03:00			`struct ctdb_recoverd *rec;`
ctdb-recoverd: Add an intermediate state struct for nodemap fetching This will allow an error callback to be added. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:52:22 +03:00			`};`

ctdb-recoverd: Basic cleanups for get_remote_nodemaps() Don't log an error on failure - let the caller can do this. Apart from this: fix up coding style and modernise the remaining error message. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:19:36 +03:00			`static void async_getnodemap_callback(struct ctdb_context *ctdb,`
			`uint32_t node_pnn,`
			`int32_t res,`
			`TDB_DATA outdata,`
			`void *callback_data)`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`{`
ctdb-recoverd: Add an intermediate state struct for nodemap fetching This will allow an error callback to be added. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:52:22 +03:00			`struct remote_nodemaps_state *state =`
			`(struct remote_nodemaps_state *)callback_data;`
			`struct ctdb_node_map_old **remote_nodemaps = state->remote_nodemaps;`
ctdb-recoverd: Fix node_pnn check and assignment of nodemap into array This array is indexed by the same index as nodemap, not the PNN. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-30 04:57:51 +03:00			`struct ctdb_node_map_old *nodemap = state->rec->nodemap;`
			`size_t i;`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00
ctdb-recoverd: Fix node_pnn check and assignment of nodemap into array This array is indexed by the same index as nodemap, not the PNN. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-30 04:57:51 +03:00			`for (i = 0; i < nodemap->num; i++) {`
			`if (nodemap->nodes[i].pnn == node_pnn) {`
			`break;`
			`}`
			`}`

			`if (i >= nodemap->num) {`
			`DBG_ERR("Invalid PNN %"PRIu32"\n", node_pnn);`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`return;`
			`}`

ctdb-recoverd: Fix node_pnn check and assignment of nodemap into array This array is indexed by the same index as nodemap, not the PNN. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-30 04:57:51 +03:00			`remote_nodemaps[i] = (struct ctdb_node_map_old *)talloc_steal(`
ctdb-recoverd: Basic cleanups for get_remote_nodemaps() Don't log an error on failure - let the caller can do this. Apart from this: fix up coding style and modernise the remaining error message. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:19:36 +03:00			`remote_nodemaps, outdata.dptr);`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00
			`}`
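
The callback above maps a PNN to an array index by linear search, because the remote nodemap array is indexed like the local nodemap rather than by PNN. A minimal standalone sketch of that lookup follows. The names `struct node_entry`, `struct node_map` and `node_index_by_pnn` are hypothetical simplifications for illustration and are not part of ctdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the ctdb node map types. */
struct node_entry {
	uint32_t pnn;
};

struct node_map {
	size_t num;
	const struct node_entry *nodes;
};

/*
 * Return the array index of the node with the given PNN, or map->num
 * if no node in the map has that PNN.  Callers must compare the
 * result against map->num before using it as an index.
 */
static size_t node_index_by_pnn(const struct node_map *map, uint32_t pnn)
{
	size_t i;

	assert(map != NULL);

	for (i = 0; i < map->num; i++) {
		if (map->nodes[i].pnn == pnn) {
			break;
		}
	}

	return i;
}
```

Returning `map->num` for "not found" mirrors the `i >= nodemap->num` check in the callback, so an unknown PNN is rejected instead of being used as an out-of-range index.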

ctdb-recoverd: Add fail callback to assign banning credits Also drop error handling in main_loop() that is replaced by this change. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:58:15 +03:00			`static void async_getnodemap_error(struct ctdb_context *ctdb,`
			`uint32_t node_pnn,`
			`int32_t res,`
			`TDB_DATA outdata,`
			`void *callback_data)`
			`{`
			`struct remote_nodemaps_state *state =`
			`(struct remote_nodemaps_state *)callback_data;`
			`struct ctdb_recoverd *rec = state->rec;`

			`DBG_ERR("Failed to retrieve nodemap from node %u\n", node_pnn);`
			`ctdb_set_culprit(rec, node_pnn);`
			`}`

ctdb-recoverd: Change signature of get_remote_nodemaps() Change 1st argument to a rec context, since this will be needed later. Drop the nodemap argument and access it via rec->nodemap instead. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:41:19 +03:00			`static int get_remote_nodemaps(struct ctdb_recoverd *rec,`
ctdb-recoverd: Basic cleanups for get_remote_nodemaps() Don't log an error on failure - let the caller can do this. Apart from this: fix up coding style and modernise the remaining error message. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:19:36 +03:00			`TALLOC_CTX *mem_ctx,`
ctdb-recoverd: Move memory allocation into get_remote_nodemaps() BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:31:39 +03:00			`struct ctdb_node_map_old ***remote_nodemaps)`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`{`
ctdb-recoverd: Change signature of get_remote_nodemaps() Change 1st argument to a rec context, since this will be needed later. Drop the nodemap argument and access it via rec->nodemap instead. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:41:19 +03:00			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-recoverd: Move memory allocation into get_remote_nodemaps() BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:31:39 +03:00			`struct ctdb_node_map_old **t;`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`uint32_t *nodes;`
ctdb-recoverd: Add an intermediate state struct for nodemap fetching This will allow an error callback to be added. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:52:22 +03:00			`struct remote_nodemaps_state state;`
ctdb-recoverd: Basic cleanups for get_remote_nodemaps() Don't log an error on failure - let the caller can do this. Apart from this: fix up coding style and modernise the remaining error message. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:19:36 +03:00			`int ret;`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00
ctdb-recoverd: Move memory allocation into get_remote_nodemaps() BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:31:39 +03:00			`t = talloc_zero_array(mem_ctx,`
			`struct ctdb_node_map_old *,`
			`rec->nodemap->num);`
			`if (t == NULL) {`
			`DBG_ERR("Memory allocation error\n");`
			`return -1;`
			`}`

ctdb-recoverd: Do not fetch the nodemap from the recovery master The nodemap has already been fetched from the local node and is actually passed to this function. Care must be taken to avoid referencing the "remote" nodemap for the recovery master. It also isn't useful to do so, since it would be the same nodemap. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-06-13 17:23:22 +03:00			`nodes = list_of_connected_nodes(ctdb, rec->nodemap, mem_ctx, false);`
ctdb-recoverd: Basic cleanups for get_remote_nodemaps() Don't log an error on failure - let the caller can do this. Apart from this: fix up coding style and modernise the remaining error message. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:19:36 +03:00
ctdb-recoverd: Add an intermediate state struct for nodemap fetching This will allow an error callback to be added. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:52:22 +03:00			`state.remote_nodemaps = t;`
ctdb-recoverd: Add fail callback to assign banning credits Also drop error handling in main_loop() that is replaced by this change. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:58:15 +03:00			`state.rec = rec;`
ctdb-recoverd: Add an intermediate state struct for nodemap fetching This will allow an error callback to be added. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:52:22 +03:00
ctdb-recoverd: Basic cleanups for get_remote_nodemaps() Don't log an error on failure - let the caller can do this. Apart from this: fix up coding style and modernise the remaining error message. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:19:36 +03:00			`ret = ctdb_client_async_control(ctdb,`
			`CTDB_CONTROL_GET_NODEMAP,`
			`nodes,`
			`0,`
			`CONTROL_TIMEOUT(),`
			`false,`
			`tdb_null,`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`async_getnodemap_callback,`
ctdb-recoverd: Add fail callback to assign banning credits Also drop error handling in main_loop() that is replaced by this change. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:58:15 +03:00			`async_getnodemap_error,`
ctdb-recoverd: Add an intermediate state struct for nodemap fetching This will allow an error callback to be added. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 11:52:22 +03:00			`&state);`
ctdb-recoverd: Fix a local memory leak The memory is allocated off the memory context used by the current iteration of main loop. It is freed when main loop completes the fix doesn't require backporting to stable branches. However, it is sloppy so it is worth fixing. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-08-17 13:27:18 +03:00			`talloc_free(nodes);`
ctdb-recoverd: Move memory allocation into get_remote_nodemaps() BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-18 08:31:39 +03:00
			`if (ret != 0) {`
			`talloc_free(t);`
			`return ret;`
			`}`

			`*remote_nodemaps = t;`
			`return 0;`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`}`
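
`get_remote_nodemaps()` allocates its result array locally, frees it if the asynchronous control fails, and assigns the caller's output pointer only on success. The sketch below shows the same allocate-then-publish pattern using plain `calloc()`/`free()` instead of talloc. The names `fill_fn`, `fill_ok`, `fill_fail` and `get_results` are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical callback type: fills num slots, returns 0 on success. */
typedef int (*fill_fn)(int *slots, size_t num);

/* Example callback that succeeds: slot i receives the value i. */
static int fill_ok(int *slots, size_t num)
{
	size_t i;

	for (i = 0; i < num; i++) {
		slots[i] = (int)i;
	}
	return 0;
}

/* Example callback that always fails with an arbitrary error code. */
static int fill_fail(int *slots, size_t num)
{
	(void)slots;
	(void)num;
	return 7;
}

/*
 * Allocate a zeroed array of num slots (num is assumed to be > 0),
 * let the callback fill it, and hand it to the caller only on success.
 * On failure the array is freed and *results is left untouched.
 */
static int get_results(size_t num, fill_fn fill, int **results)
{
	int *t;
	int ret;

	t = calloc(num, sizeof(*t));
	if (t == NULL) {
		return -1;
	}

	ret = fill(t, num);
	if (ret != 0) {
		free(t);
		return ret;
	}

	*results = t;
	return 0;
}
```

Because the output pointer is written only on the success path, a caller that initialises it to `NULL` never sees a partially filled or already freed array.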

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,`
			`TALLOC_CTX *mem_ctx)`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`{`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap=NULL;`
			`struct ctdb_node_map_old **remote_nodemaps=NULL;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`struct ctdb_vnn_map *vnnmap=NULL;`
			`struct ctdb_vnn_map *remote_vnnmap=NULL;`
ctdb-recoverd: Make num_lmasters a local variable It isn't used anywhere else and is always re-initialised to 0. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-29 09:49:02 +03:00			`uint32_t num_lmasters;`
read the current debuglevel in each loop in the recovery daemon so that we pick up when they change in the parent daemon (This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf) 2008-02-18 11:38:04 +03:00			`int32_t debug_level;`
ctdb-recovery: Fix signed/unsigned comparisons by declaring as unsigned Simple cases where variables need to be declared as an unsigned type instead of an int. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-05-23 01:43:58 +03:00			`unsigned int i, j;`
			`int ret;`
recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00			`bool self_ban;`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
make recovery daemon values tunable (This used to be ctdb commit ec29dbf2f5110428df8b97801443ba7addf61353) 2007-06-04 14:22:44 +04:00
merge from ronnie (This used to be ctdb commit 0aa6e04438aa5ec727815689baa19544df042cf7) 2008-01-07 08:17:22 +03:00			`/* verify that the main daemon is still running */`
Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78) 2012-05-03 05:42:41 +04:00			`if (ctdb_kill(ctdb, ctdb->ctdbd_pid, 0) != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_CRIT,("CTDB daemon is no longer available. Shutting down recovery daemon\n"));`
merge from ronnie (This used to be ctdb commit 0aa6e04438aa5ec727815689baa19544df042cf7) 2008-01-07 08:17:22 +03:00			`exit(-1);`
			`}`

additional monitoring between the two daemons. we currently only monitor that the dameons are running by kill(0, pid) and verifying the the domain socket between them is ok. this is not sufficient since we can have a situation where the recovery daemon is hung. this new code monitors that the recovery daemon is operating. if the recovery hangs, we log this and shut down the main daemon (This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c) 2008-09-09 07:44:46 +04:00			`/* ping the local daemon to tell it we are alive */`
			`ctdb_ctrl_recd_ping(ctdb);`

ctdb-recoverd: Add an explicit flag for election in progress An alternate election method will be added that doesn't use the election timeout, so this provides a common way for recognising when an election is in progress. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-18 12:27:10 +03:00			`if (rec->election_in_progress) {`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`/* an election is in progress */`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`}`

ctdb-recoverd: Send leader broadcasts These are triggered on 1 second timer, but are only sent if the node is the current leader and there is no election underway. If this node can not be the leader then ensure it releases the recovery lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:16:44 +03:00			`/*`
			`* Start leader broadcasts if they are not active (1st time`
			`* through main loop? Memory allocation error?)`
			`*/`
			`if (!leader_broadcast_loop_active(rec)) {`
			`ret = leader_broadcast_loop(rec);`
			`if (ret != 0) {`
			`D_ERR("Failed to set up leader broadcast\n");`
			`ctdb_set_culprit(rec, rec->pnn);`
			`}`
			`}`
ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particular important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-17 06:42:47 +03:00			`/*`
			`* Similar for leader broadcast timeouts. These can also have`
			`* been stopped by another node receiving a leader broadcast`
			`* timeout and transmitting an "unknown leader broadcast".`
			`* Note that this should never be done during an election - at`
			`* the moment there is nothing between here and the above`
			`* election-in-progress check that can process an election`
			`* result (i.e. no event loop).`
			`*/`
			`if (!leader_broadcast_timeout_active(rec)) {`
			`ret = leader_broadcast_timeout_start(rec);`
			`if (ret != 0) {`
			`ctdb_set_culprit(rec, rec->pnn);`
			`}`
			`}`

ctdb-recoverd: Send leader broadcasts These are triggered on 1 second timer, but are only sent if the node is the current leader and there is no election underway. If this node can not be the leader then ensure it releases the recovery lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:16:44 +03:00
read the current debuglevel in each loop in the recovery daemon so that we pick up when they change in the parent daemon (This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf) 2008-02-18 11:38:04 +03:00			`/* read the debug level from the parent and update locally */`
			`ret = ctdb_ctrl_get_debuglevel(ctdb, CTDB_CURRENT_NODE, &debug_level);`
			`if (ret !=0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Failed to read debuglevel from parent\n"));`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
read the current debuglevel in each loop in the recovery daemon so that we pick up when they change in the parent daemon (This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf) 2008-02-18 11:38:04 +03:00			`}`
debug: Use debuglevel_(get\|set) function Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org> Autobuild-Date(master): Thu Nov 8 11:03:11 CET 2018 on sn-devel-144 2018-11-07 16:14:05 +03:00			`debuglevel_set(debug_level);`
read the current debuglevel in each loop in the recovery daemon so that we pick up when they change in the parent daemon (This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf) 2008-02-18 11:38:04 +03:00
make recovery daemon values tunable (This used to be ctdb commit ec29dbf2f5110428df8b97801443ba7addf61353) 2007-06-04 14:22:44 +04:00			`/* get relevant tunables */`
get all the tunables at once in recovery daemon (This used to be ctdb commit 8e60be6c22aab145e68b16ede5f32f4430c2af93) 2007-06-07 12:05:25 +04:00			`ret = ctdb_ctrl_get_all_tunables(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, &ctdb->tunable);`
			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Failed to get tunables - retrying\n"));`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
get all the tunables at once in recovery daemon (This used to be ctdb commit 8e60be6c22aab145e68b16ede5f32f4430c2af93) 2007-06-07 12:05:25 +04:00			`}`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
ctdb-recoverd: If obtaining recovery lock fails, try again When ctdb daemon starts up, it considers itself the recovery master and tries to do first recovery. However, it's possible that there is already a recovery master and the current node has not yet heard from it. So do not ban ourselves immediately if ctdb_recovery_lock() fails when doing first recovery. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2014-09-25 11:17:04 +04:00			`/* get runstate */`
			`ret = ctdb_ctrl_get_runstate(ctdb, CONTROL_TIMEOUT(),`
			`CTDB_CURRENT_NODE, &ctdb->runstate);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, ("Failed to get runstate - retrying\n"));`
			`return;`
			`}`

ctdb-recoverd: Simplify using TALLOC_FREE() The only non-obvious part here is dropping the setting of the nodemap local variable to NULL. If the following control succeeds then it is set, otherwise return and it doesn't matter. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 08:00:55 +03:00			`/* get nodemap */`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`ret = ctdb_ctrl_getnodemap(ctdb,`
			`CONTROL_TIMEOUT(),`
			`rec->pnn,`
			`rec,`
			`&nodemap);`
change the signature for ctdb_ctrl_getnodemap() so that a timeout parameter is added. change ctdb_get_connected_nodes in the same way (This used to be ctdb commit d85f23bcf4c1230225abb2f4a053c70b68d469aa) 2007-05-04 03:01:01 +04:00			`if (ret != 0) {`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`DBG_ERR("Unable to get nodemap from node %"PRIu32"\n", rec->pnn);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
change the signature for ctdb_ctrl_getnodemap() so that a timeout parameter is added. change ctdb_get_connected_nodes in the same way (This used to be ctdb commit d85f23bcf4c1230225abb2f4a053c70b68d469aa) 2007-05-04 03:01:01 +04:00			`}`
ctdb-recoverd: Avoid dereferencing NULL rec->nodemap Inside the nested event loop in ctdb_ctrl_getnodemap(), various asynchronous handlers may dereference rec->nodemap, which will be NULL. One example is lost_reclock_handler(), which causes rec->nodemap to be unconditionally dereferenced in list_of_nodes() via this call chain: list_of_nodes() list_of_active_nodes() set_recovery_mode() force_election() lost_reclock_handler() Instead of attempting to trace all of the cases, just avoid leaving rec->nodemap set to NULL. Attempting to use an old value is generally harmless, especially since it will be the same as the new value in most cases. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14324 Reported-by: Volker Lendecke <vl@samba.org> Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Tue Mar 24 01:22:45 UTC 2020 on sn-devel-184 2020-03-22 05:46:46 +03:00			`talloc_free(rec->nodemap);`
			`rec->nodemap = nodemap;`
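The commit note on these two lines explains why `rec->nodemap` is never left NULL: the new map is fetched into a local pointer and the cached one is replaced only after the fetch succeeds. A minimal sketch of that swap-on-success pattern follows, using plain `malloc`/`free` in place of talloc. `struct node_map`, `struct nodemap_cache`, `fetch_node_map()` and `refresh_node_map()` are hypothetical stand-ins for illustration, not CTDB APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real nodemap structure. */
struct node_map {
	uint32_t num;
};

/* Hypothetical stand-in for the part of the daemon state that caches it. */
struct nodemap_cache {
	struct node_map *nodemap;
};

/* Simulated fetch: fails when "fail" is non-zero, otherwise allocates a
 * fresh map. On failure *out is left untouched. */
static int fetch_node_map(int fail, uint32_t num, struct node_map **out)
{
	struct node_map *m;

	if (fail) {
		return -1;
	}
	m = malloc(sizeof(*m));
	if (m == NULL) {
		return -1;
	}
	m->num = num;
	*out = m;
	return 0;
}

/* Fetch into a local pointer first and replace the cached map only on
 * success, so code that runs while the fetch is in progress still sees
 * the previous (possibly stale, but valid) map. */
static int refresh_node_map(struct nodemap_cache *cache, int fail, uint32_t num)
{
	struct node_map *fresh = NULL;

	assert(cache != NULL);

	if (fetch_node_map(fail, num, &fresh) != 0) {
		return -1;		/* old map stays in place */
	}
	free(cache->nodemap);		/* free(NULL) is a no-op */
	cache->nodemap = fresh;
	return 0;
}
```

Calling `refresh_node_map()` with `fail` set leaves the cached pointer exactly as it was, which is the property the bug fix relies on.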
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within the timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
recoverd: Set node_flags information as soon as we get nodemap Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8d622660a14c929e365d306147b378ea6ab92175) 2013-06-28 08:09:35 +04:00			`/* remember our own node flags */`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`rec->node_flags = nodemap->nodes[rec->pnn].flags;`
recoverd: Set node_flags information as soon as we get nodemap Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8d622660a14c929e365d306147b378ea6ab92175) 2013-06-28 08:09:35 +04:00
recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00			`ban_misbehaving_nodes(rec, &self_ban);`
			`if (self_ban) {`
			`DEBUG(DEBUG_NOTICE, ("This node was banned, restart main_loop\n"));`
			`return;`
			`}`
recoverd: Move code to ban other nodes after we get local node flags If a node gets banned first, then it should not ban other nodes. This code was moved up in main_loop to avoid waiting for nodemap from other nodes (commit 83b0261f2cb453195b86f547d360400103a8b795). To prevent a banned node from banning other nodes, we need to first get nodemap information from local node, so trying to ban other nodes can fail if we are already banned. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ae1693905036ecdbc4594fde1f12500faae4a554) 2013-06-27 10:01:16 +04:00
ctdb-recovery: Get recmode unconditionally in the main_loop BUG: https://bugzilla.samba.org/show_bug.cgi?id=12857 This can be used later in the main_loop to avoid the local ip check. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-06-22 10:45:20 +03:00			`ret = ctdb_ctrl_getrecmode(ctdb, mem_ctx, CONTROL_TIMEOUT(),`
			`CTDB_CURRENT_NODE, &ctdb->recovery_mode);`
			`if (ret != 0) {`
			`D_ERR("Failed to read recmode from local node\n");`
			`return;`
			`}`

recoverd: Also check if current node is in recovery when it is banned Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8) 2013-06-28 08:02:44 +04:00			`/* if the local daemon is STOPPED or BANNED, we verify that the databases are`
recoverd: fix a comment typo Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 741944f118e98f178b860194eecb215180949d18) 2013-06-26 09:11:51 +04:00			`also frozen and that the recmode is set to active.`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00			`*/`
ctdb-recoverd: Simplify some stopped/banned checks to inactive checks Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-17 09:10:20 +03:00			`if (rec->node_flags & NODE_FLAGS_INACTIVE) {`
recoverd: Stabilise the recovery master role On rare occasions when a node that has been inactive it will trigger an election when it becomes active again. If that node has been up for the longest then it will win the election and the recovery master role will spuriously move. While a node remains inactive we reset the priority time to discourage it from winning elections. The priority time will now reflect roughly how long the node has been active rather than how long it has been up. That means the most stable node is more likely to win elections. Having a stable recovery master means that disabling takeover runs while reloading IPs is more likely to succeed. It also improves the chances of being able to cache information in the recovery master - for example, between takeover runs. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f0f48f22f45e4c82eba2582efae307e25385de81) 2013-09-17 06:00:26 +04:00			`/* If this node has become inactive then we want to`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`* reduce the chances of it taking over the leader`
			`* role when it becomes active again. This`
			`* helps to stabilise the leader role so that`
recoverd: Stabilise the recovery master role On rare occasions when a node that has been inactive it will trigger an election when it becomes active again. If that node has been up for the longest then it will win the election and the recovery master role will spuriously move. While a node remains inactive we reset the priority time to discourage it from winning elections. The priority time will now reflect roughly how long the node has been active rather than how long it has been up. That means the most stable node is more likely to win elections. Having a stable recovery master means that disabling takeover runs while reloading IPs is more likely to succeed. It also improves the chances of being able to cache information in the recovery master - for example, between takeover runs. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f0f48f22f45e4c82eba2582efae307e25385de81) 2013-09-17 06:00:26 +04:00			`* it stays on the most stable node.`
			`*/`
			`rec->priority_time = timeval_current();`

recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00			`if (ctdb->recovery_mode == CTDB_RECOVERY_NORMAL) {`
recoverd: Also check if current node is in recovery when it is banned Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8) 2013-06-28 08:02:44 +04:00			`DEBUG(DEBUG_ERR,("Node is stopped or banned but recovery mode is not active. Activate recovery mode and lock databases\n"));`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00
			`ret = ctdb_ctrl_setrecmode(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, CTDB_RECOVERY_ACTIVE);`
			`if (ret != 0) {`
recoverd: Also check if current node is in recovery when it is banned Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8) 2013-06-28 08:02:44 +04:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to activate recovery mode in STOPPED or BANNED state\n"));`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00			`}`
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00			`}`
			`if (! rec->frozen_on_inactive) {`
			`ret = ctdb_ctrl_freeze(ctdb, CONTROL_TIMEOUT(),`
			`CTDB_CURRENT_NODE);`
ctdb-recoverd: Set recovery mode before freezing databases Setting recovery mode to active is the only correct way to inform recovery daemon to run database recovery. Only freezing databases without setting recovery mode should not trigger database recovery, as this mechanism is used in tool to implement wipedb/restoredb commands. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2014-05-06 08:24:52 +04:00			`if (ret != 0) {`
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00			`DEBUG(DEBUG_ERR,`
			`(__location__ " Failed to freeze node "`
			`"in STOPPED or BANNED state\n"));`
ctdb-recoverd: Set recovery mode before freezing databases Setting recovery mode to active is the only correct way to inform recovery daemon to run database recovery. Only freezing databases without setting recovery mode should not trigger database recovery, as this mechanism is used in tool to implement wipedb/restoredb commands. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2014-05-06 08:24:52 +04:00			`return;`
			`}`
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00
			`rec->frozen_on_inactive = true;`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00			`}`
recoverd: Always do an early exit from main_loop if node is stopped or banned A stopped or banned node cannot do anything useful. So do not participate in any cluster activity and do not cause any unnecessary network traffic. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2396981c4bcf30530aeb7f4395093cc202105b50) 2013-06-27 09:39:15 +04:00
			`/* If this node is stopped or banned then it is not the recovery`
			`* master, so don't do anything. This prevents a stopped or banned`
			`* node from starting an election and sending unnecessary controls.`
			`*/`
			`return;`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00			`}`
recoverd: Always do an early exit from main_loop if node is stopped or banned A stopped or banned node cannot do anything useful. So do not participate in any cluster activity and do not cause any unnecessary network traffic. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2396981c4bcf30530aeb7f4395093cc202105b50) 2013-06-27 09:39:15 +04:00
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00			`rec->frozen_on_inactive = false;`

ctdb-recmaster: Update capabilities before calling first election Capabilities are used when computing an election result so having them up-to-date seems like a good idea. Also update several instances of an ambiguous comment. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-27 07:09:33 +03:00			`/* Retrieve capabilities from all connected nodes */`
			`ret = update_capabilities(rec, nodemap);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to update node capabilities.\n"));`
			`return;`
			`}`

ctdb-recovery: Do not run local ip verification when in recovery BUG: https://bugzilla.samba.org/show_bug.cgi?id=12857 If we drop public IPs because CTDB is in recovery for too long, then avoid spamming logs "Trigger takeoverrun" every second. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-06-22 09:15:47 +03:00			`if (ctdb->recovery_mode == CTDB_RECOVERY_NORMAL) {`
			`/* Check if an IP takeover run is needed and trigger one if`
			`* necessary */`
ctdb-recoverd: Simplify arguments to verify_local_ip_allocation() All other arguments are available via rec, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-13 01:51:36 +03:00			`verify_local_ip_allocation(rec);`
ctdb-recovery: Do not run local ip verification when in recovery BUG: https://bugzilla.samba.org/show_bug.cgi?id=12857 If we drop public IPs because CTDB is in recovery for too long, then avoid spamming logs "Trigger takeoverrun" every second. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-06-22 09:15:47 +03:00			`}`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
ctdb-recoverd: Factor out and use function this_node_is_leader() Make the code self-documenting. This preempts an upcoming change to terminology but doing it now saves a lot of churn. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:37:39 +03:00			`/* If this node is not the leader then skip recovery checks */`
			`if (!this_node_is_leader(rec)) {`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`}`

simplify election handling make sure we read and update the flags from all remote nodes before we reach the first codepath that can call do_recovery() since during do_recovery() we need to know what the flags are. (This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f) 2007-10-11 00:16:36 +04:00
ctdb-recoverd: Get remote nodemaps earlier update_local_flags() will be changed to use these nodemaps. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-06-13 20:51:01 +03:00			`/* Get the nodemaps for all connected remote nodes */`
			`ret = get_remote_nodemaps(rec, mem_ctx, &remote_nodemaps);`
			`if (ret != 0) {`
			`DBG_ERR("Failed to read remote nodemaps\n");`
			`return;`
			`}`

ctdb-recoverd: Rename update_local_flags() -> update_flags() This also updates remote flags so the name is misleading. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-24 02:21:37 +03:00			`/* Ensure our local and remote flags are correct */`
			`ret = update_flags(rec, nodemap, remote_nodemaps);`
ctdb-recoverd: Simplify return values when updating local flags Change this to return just 0 or -1. It isn't monitoring anything. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-27 14:47:08 +03:00			`if (ret != 0) {`
ctdb-recoverd: Rename update_local_flags() -> update_flags() This also updates remote flags so the name is misleading. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-24 02:21:37 +03:00			`D_ERR("Unable to update flags\n");`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
simplify election handling make sure we read and update the flags from all remote nodes before we reach the first codepath that can call do_recovery() since during do_recovery() we need to know what the flags are. (This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f) 2007-10-11 00:16:36 +04:00			`}`

when we reload the nodes file, we may need to reload the nodes file inside the recovery daemon as well. (This used to be ctdb commit 82fd2b6b5cd8e988c38fa6b74121a048757bdeef) 2008-10-17 14:18:06 +04:00			`if (ctdb->num_nodes != nodemap->num) {`
			`DEBUG(DEBUG_ERR, (__location__ " ctdb->num_nodes (%d) != nodemap->num (%d) reloading nodes file\n", ctdb->num_nodes, nodemap->num));`
ctdb-server: rename ctdb_load_nodes_file to ctdb_load_nodes Rename ctdb_load_nodes_file to ctdb_load_nodes as it can now load nodes from more than a regular file. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2024-06-06 20:53:43 +03:00			`ctdb_load_nodes(ctdb);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
when we reload the nodes file, we may need to reload the nodes file inside the recovery daemon as well. (This used to be ctdb commit 82fd2b6b5cd8e988c38fa6b74121a048757bdeef) 2008-10-17 14:18:06 +04:00			`}`
allow different nodes in the cluster to use different public_addresses files so that we can partition the cluster into different subsets of nodes which each serve a different subset of the public addresses (This used to be ctdb commit 889e0fe69e4c88c6166282b12843b8d9727552d6) 2007-09-04 17:15:23 +04:00
ctdb-recoverd: Move VNN map retrieval to where it is needed The VNN map is only needed on the recovery master, so no need for all recovery daemons to retrieve it. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-27 06:35:09 +03:00			`/* get the vnnmap */`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`ret = ctdb_ctrl_getvnnmap(ctdb,`
			`CONTROL_TIMEOUT(),`
			`rec->pnn,`
			`mem_ctx,`
			`&vnnmap);`
ctdb-recoverd: Move VNN map retrieval to where it is needed The VNN map is only needed on the recovery master, so no need for all recovery daemons to retrieve it. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-27 06:35:09 +03:00			`if (ret != 0) {`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`DBG_ERR("Unable to get vnnmap from node %"PRIu32"\n", rec->pnn);`
ctdb-recoverd: Move VNN map retrieval to where it is needed The VNN map is only needed on the recovery master, so no need for all recovery daemons to retrieve it. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-27 06:35:09 +03:00			`return;`
			`}`

- merge from ronnie - add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring (This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b) 2007-09-14 03:49:12 +04:00			`if (rec->need_recovery) {`
			`/* a previous recovery didn't finish */`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`do_recovery(rec, mem_ctx);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
- merge from ronnie - add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring (This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b) 2007-09-14 03:49:12 +04:00			`}`

add a test in the function that checks whether the cluster needs recovery or not that all active nodes are in normal mode. If we discover that some node is still in recoverymode it may indicate that a previous recovery ended prematurely and thus we should start a new recovery (This used to be ctdb commit c15517872e6c98c8c425a8d47d2b348ecb0620b0) 2007-05-06 22:41:12 +04:00			`/* verify that all active nodes are in normal mode`
			`and not in recovery mode`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`*/`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`switch (verify_recmode(ctdb, nodemap)) {`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`case MONITOR_RECOVERY_NEEDED:`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`do_recovery(rec, mem_ctx);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`case MONITOR_FAILED:`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`case MONITOR_ELECTION_NEEDED:`
			`/* can not happen */`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`case MONITOR_OK:`
			`break;`
add a test in the function that checks whether the cluster needs recovery or not that all active nodes are in normal mode. If we discover that some node is still in recoverymode it may indicate that a previous recovery ended prematurely and thus we should start a new recovery (This used to be ctdb commit c15517872e6c98c8c425a8d47d2b348ecb0620b0) 2007-05-06 22:41:12 +04:00			`}`
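The switch that ends here maps the result of the recovery-mode check onto one of three outcomes: run a recovery, abandon this iteration, or carry on. The sketch below restates that mapping as a small pure function. The enumerator names are taken from the listing, but their order and values, `enum next_step` and `step_for_recmode_check()` are assumptions introduced for illustration only.

```c
#include <assert.h>

/* Enumerator names as used in the listing; values are assumed. */
enum monitor_result {
	MONITOR_OK,
	MONITOR_RECOVERY_NEEDED,
	MONITOR_ELECTION_NEEDED,
	MONITOR_FAILED
};

/* Hypothetical summary of what the loop does next. */
enum next_step { STEP_CONTINUE, STEP_RECOVER, STEP_ABORT };

static enum next_step step_for_recmode_check(enum monitor_result r)
{
	switch (r) {
	case MONITOR_RECOVERY_NEEDED:
		return STEP_RECOVER;	/* run a recovery, then return */
	case MONITOR_FAILED:
		return STEP_ABORT;	/* give up on this iteration */
	case MONITOR_ELECTION_NEEDED:
		/* not expected from this check; handled like MONITOR_OK */
	case MONITOR_OK:
		return STEP_CONTINUE;
	}

	/* Unreachable for valid enumerators */
	assert(!"unknown monitor_result");
	return STEP_ABORT;
}
```

Keeping the mapping in a function without side effects makes the "cannot happen" case explicit and easy to check in isolation.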

ctdb-recoverd: Add and use function cluster_lock_enabled() Now all references to ctdb->recovery_lock are encapsulated in the cluster lock code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:43:10 +03:00			`if (cluster_lock_enabled(rec)) {`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`/* We must already hold the cluster lock */`
			`if (!cluster_lock_held(rec)) {`
			`D_ERR("Failed cluster lock sanity check\n");`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`ctdb_set_culprit(rec, rec->pnn);`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`do_recovery(rec, mem_ctx);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
Dont access the reclock file at all if VerifyRecoveryLock is zero and also make sure the reclock file is closed if the variable is cleared at runtime (This used to be ctdb commit a25f4888689a0725971606163d87c39a41669292) 2009-06-25 05:41:18 +04:00			`}`
- catch ESTALE in the recovery lock by trying a read() - priortise nodes that are unbanned and healthy in the election (This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f) 2007-10-05 07:28:21 +04:00			`}`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00
Add new control to reload the public ip address file on a node Also add a method to use the recovery master/daemon to reload the public ips on all nodes in the cluster. Reloading the public ips on all node sin the cluster is only suported if all nodes in the cluster are available and healthy. (This used to be ctdb commit 05603e914f8c12618d7e06943c0f7df207f645b0) 2012-04-30 09:50:44 +04:00
ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled The potential resulting recovery won't run anyway. Also recoveries may have been disabled by "reloadnodes" and if the nodemaps are inconsistent between nodes then avoid triggering an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 12:59:11 +03:00			`/* If recoveries are disabled then there is no use doing any`
			`* nodemap or flags checks. Recoveries might be disabled due`
			`* to "reloadnodes", so doing these checks might cause an`
			`* unnecessary recovery. */`
			`if (ctdb_op_is_disabled(rec->recovery)) {`
ctdb-recoverd: Move takeover run checks after recover checks If a recovery is going to be done then this will be followed by a takeover run anyway. So, there's no use doing the takeover run checks, potentially doing a takeover run and then doing a recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 09:00:02 +03:00			`goto takeover_run_checks;`
ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled The potential resulting recovery won't run anyway. Also recoveries may have been disabled by "reloadnodes" and if the nodemaps are inconsistent between nodes then avoid triggering an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 12:59:11 +03:00			`}`

redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`/* verify that all other nodes have the same nodemap as we have`
			`*/`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`for (j=0; j<nodemap->num; j++) {`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`if (nodemap->nodes[j].pnn == rec->pnn) {`
ctdb-recoverd: Do not fetch the nodemap from the recovery master The nodemap has already been fetched from the local node and is actually passed to this function. Care must be taken to avoid referencing the "remote" nodemap for the recovery master. It also isn't useful to do so, since it would be the same nodemap. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-06-13 17:23:22 +03:00			`continue;`
			`}`
We dont need to verify the nodemap on remote nodes that are banned (This used to be ctdb commit 7f8f9385deee6eff2b7303147bc6412bbdc122df) 2009-04-06 06:00:22 +04:00			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`continue;`
			`}`

redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`/* if the nodes disagree on how many nodes there are`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`then this is a good reason to try recovery`
			`*/`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`if (remote_nodemaps[j]->num != nodemap->num) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Remote node:%u has different node count. %u vs %u of the local node\n",`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`nodemap->nodes[j].pnn, remote_nodemaps[j]->num, nodemap->num));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`do_recovery(rec, mem_ctx);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`

			`/* if the nodes disagree on which nodes exist and are`
			`active, then that is also a good reason to do recovery`
			`*/`
			`for (i=0; i<nodemap->num; i++) {`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`if (remote_nodemaps[j]->nodes[i].pnn != nodemap->nodes[i].pnn) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Remote node:%u has different nodemap pnn for %d (%u vs %u).\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn, i,`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`remote_nodemaps[j]->nodes[i].pnn, nodemap->nodes[i].pnn));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`do_recovery(rec, mem_ctx);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`
			`}`
recoverd: Assemble up-to-date node flags information from remote nodes Currently nodemap used by recovery master is the one obtained from the local node. This information may have been updated while processing main loop. Before comparing node flags on all the nodes, create up-to-date node flags information based on the information received from all the nodes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit fcf77dec5af973a0e32f3999bc012053a6f47a96) 2013-07-22 11:26:28 +04:00			`}`

ctdb_recoverd: Move num_lmasters calculation to near where it is used Unless this node is the recovery master then this is not needed. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-29 12:00:17 +03:00			`/* count how many active nodes have the lmaster capability */`
			`num_lmasters = 0;`
			`for (i=0; i<nodemap->num; i++) {`
			`if (!(nodemap->nodes[i].flags & NODE_FLAGS_INACTIVE)) {`
			`if (ctdb_node_has_capabilities(rec->caps,`
			`ctdb->nodes[i]->pnn,`
			`CTDB_CAP_LMASTER)) {`
			`num_lmasters++;`
			`}`
			`}`
			`}`

update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00
recoverd: Fix the VNN lmaster consistency check It does cope with node that don't have the lmaster capability. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 588172bcb6bf267339e2bd09e23d2c4904a27a41) 2013-09-26 07:11:04 +04:00			`/* There must be the same number of lmasters in the vnn map as`
			`* there are active nodes with the lmaster capability... or`
			`* do a recovery.`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`*/`
ctdb-recoverd: Make num_lmasters a local variable It isn't used anywhere else and is always re-initialised to 0. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-29 09:49:02 +03:00			`if (vnnmap->size != num_lmasters) {`
recoverd: Fix the VNN lmaster consistency check It does cope with node that don't have the lmaster capability. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 588172bcb6bf267339e2bd09e23d2c4904a27a41) 2013-09-26 07:11:04 +04:00			`DEBUG(DEBUG_ERR, (__location__ " The vnnmap count is different from the number of active lmaster nodes: %u vs %u\n",`
ctdb-recoverd: Make num_lmasters a local variable It isn't used anywhere else and is always re-initialised to 0. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-29 09:49:02 +03:00			`vnnmap->size, num_lmasters));`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`ctdb_set_culprit(rec, rec->pnn);`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`do_recovery(rec, mem_ctx);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`

ctdb-recoverd: Only check for LMASTER nodes in the VNN map BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-08-21 07:35:09 +03:00			`/*`
			`* Verify that all active lmaster nodes in the nodemap also`
			`* exist in the vnnmap`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`*/`
			`for (j=0; j<nodemap->num; j++) {`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`continue;`
			`}`
ctdb-recoverd: Only check for LMASTER nodes in the VNN map BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-08-21 07:35:09 +03:00			`if (! ctdb_node_has_capabilities(rec->caps,`
ctdb-recoverd: Fix typo in previous fix BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Aug 27 15:29:11 UTC 2019 on sn-devel-184 2019-08-27 05:13:51 +03:00			`nodemap->nodes[j].pnn,`
ctdb-recoverd: Only check for LMASTER nodes in the VNN map BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-08-21 07:35:09 +03:00			`CTDB_CAP_LMASTER)) {`
			`continue;`
			`}`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`if (nodemap->nodes[j].pnn == rec->pnn) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`continue;`
			`}`

			`for (i=0; i<vnnmap->size; i++) {`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`if (vnnmap->map[i] == nodemap->nodes[j].pnn) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`break;`
			`}`
			`}`
			`/* i == vnnmap->size means the loop above found no matching entry */`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`if (i == vnnmap->size) {`
ctdb-recoverd: Only check for LMASTER nodes in the VNN map BUG: https://bugzilla.samba.org/show_bug.cgi?id=14085 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-08-21 07:35:09 +03:00			`D_ERR("Active LMASTER node %u is not in the vnnmap\n",`
			`nodemap->nodes[j].pnn);`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`do_recovery(rec, mem_ctx);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`
			`}`


also verify that the generation id is the same on all the nodes and if not, trigger a recovery (This used to be ctdb commit 46b8a66ee70419c153acf45eeec88c1fc8f230ce) 2007-05-04 05:57:45 +04:00			`/* verify that all other nodes have the same vnnmap`
			`and are from the same generation`
			`*/`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`for (j=0; j<nodemap->num; j++) {`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`continue;`
			`}`
ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`if (nodemap->nodes[j].pnn == rec->pnn) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`continue;`
			`}`

change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`ret = ctdb_ctrl_getvnnmap(ctdb, CONTROL_TIMEOUT(), nodemap->nodes[j].pnn,`
formatting fixes (This used to be ctdb commit ed63a2057698aed3931762605b2ea2368681af2b) 2007-06-07 12:39:37 +04:00			`mem_ctx, &remote_vnnmap);`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to get vnnmap from remote node %u\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn));`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`

also verify that the generation id is the same on all the nodes and if not, trigger a recovery (This used to be ctdb commit 46b8a66ee70419c153acf45eeec88c1fc8f230ce) 2007-05-04 05:57:45 +04:00			`/* verify the vnnmap generation is the same */`
			`if (vnnmap->generation != remote_vnnmap->generation) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Remote node %u has different generation of vnnmap. %u vs %u (ours)\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn, remote_vnnmap->generation, vnnmap->generation));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`do_recovery(rec, mem_ctx);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
also verify that the generation id is the same on all the nodes and if not, trigger a recovery (This used to be ctdb commit 46b8a66ee70419c153acf45eeec88c1fc8f230ce) 2007-05-04 05:57:45 +04:00			`}`

update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`/* verify the vnnmap size is the same */`
			`if (vnnmap->size != remote_vnnmap->size) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Remote node %u has different size of vnnmap. %u vs %u (ours)\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn, remote_vnnmap->size, vnnmap->size));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`do_recovery(rec, mem_ctx);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`

			`/* verify the vnnmap is the same */`
			`for (i=0;i<vnnmap->size;i++) {`
			`if (remote_vnnmap->map[i] != vnnmap->map[i]) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Remote node %u has different vnnmap.\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-16 08:20:05 +03:00			`do_recovery(rec, mem_ctx);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`
			`}`
			`}`

ctdb-ipalloc: Drop remote IP verification It is only run during a takeover run and only logs errors. It doesn't actually do anything to fix potential errors. The takeover run should fix any inconsistencies anyway. Instead, leave a comment in the recovery daemon's monitoring loop to add proper remote IP verification later. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-20 13:41:05 +03:00			`/* FIXME: Add remote public IP checking to ensure that nodes`
			`* have the IP addresses that are allocated to them. */`

ctdb-recoverd: Move takeover run checks after recover checks If a recovery is going to be done then this will be followed by a takeover run anyway. So, there's no use doing the takeover run checks, potentially doing a takeover run and then doing a recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 09:00:02 +03:00			`takeover_run_checks:`

ctdb-recoverd: Unify takeover run triggering code in main loop Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri May 13 17:15:57 CEST 2016 on sn-devel-144 2016-05-03 09:07:34 +03:00			`/* If there are IP takeover runs requested or the previous one`
			`* failed then perform one and notify the waiters */`
ctdb-recoverd: Move takeover run checks after recover checks If a recovery is going to be done then this will be followed by a takeover run anyway. So, there's no use doing the takeover run checks, potentially doing a takeover run and then doing a recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 09:00:02 +03:00			`if (!ctdb_op_is_disabled(rec->takeover_run) &&`
ctdb-recoverd: Unify takeover run triggering code in main loop Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri May 13 17:15:57 CEST 2016 on sn-devel-144 2016-05-03 09:07:34 +03:00			`(rec->reallocate_requests || rec->need_takeover_run)) {`
ctdb-recoverd: Move takeover run checks after recover checks If a recovery is going to be done then this will be followed by a takeover run anyway. So, there's no use doing the takeover run checks, potentially doing a takeover run and then doing a recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 09:00:02 +03:00			`process_ipreallocate_requests(ctdb, rec);`
			`}`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`}`
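
The three vnnmap checks near the end of the loop body above (generation, then size, then entry by entry) can be read as a single comparison routine. The following is a minimal standalone sketch of that ordering, not the code used by the recovery daemon: `struct vnn_map_sketch`, `enum vnn_map_diff` and `vnn_map_compare()` are hypothetical names introduced here for illustration, and the struct is a simplified stand-in for the real vnnmap type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the real vnnmap structure. */
struct vnn_map_sketch {
	uint32_t generation;   /* generation id of this map */
	uint32_t size;         /* number of entries in map[] */
	const uint32_t *map;   /* node number stored in each slot */
};

/* Hypothetical result codes, one per check performed in the loop. */
enum vnn_map_diff {
	VNN_MAP_SAME = 0,
	VNN_MAP_GENERATION_DIFFERS,
	VNN_MAP_SIZE_DIFFERS,
	VNN_MAP_ENTRY_DIFFERS
};

/*
 * Compare a remote map against ours in the same order as the checks
 * above: generation first, then size, then every entry.  The size
 * check must come before the entry loop so that the loop never reads
 * past the end of the shorter map.
 */
static enum vnn_map_diff vnn_map_compare(const struct vnn_map_sketch *ours,
					 const struct vnn_map_sketch *remote)
{
	uint32_t i;

	if (ours->generation != remote->generation) {
		return VNN_MAP_GENERATION_DIFFERS;
	}
	if (ours->size != remote->size) {
		return VNN_MAP_SIZE_DIFFERS;
	}
	for (i = 0; i < ours->size; i++) {
		if (ours->map[i] != remote->map[i]) {
			return VNN_MAP_ENTRY_DIFFERS;
		}
	}
	return VNN_MAP_SAME;
}
```

In the loop above, any result other than a match leads to the same reaction: the remote node is marked as culprit with `ctdb_set_culprit()` and `do_recovery()` is called. The sketch only separates the detection step from that reaction.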

ctdb-recoverd: Release recovery lock on exit The recovery lock helper must exit when it notices its parent is gone. However, that can take a few seconds. The usual way of terminating the recovery daemon is for the main ctdbd to send it a SIGTERM. Installing a handler is nice and simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-02 02:26:40 +03:00			`static void recd_sig_term_handler(struct tevent_context *ev,`
			`struct tevent_signal *se, int signum,`
			`int count, void *dont_care,`
			`void *private_data)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type_abort(`
			`private_data, struct ctdb_recoverd);`

ctdb-recoverd: Log a message when terminating Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-11-25 06:57:30 +03:00			`DEBUG(DEBUG_ERR, ("Received SIGTERM, exiting\n"));`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`cluster_lock_release(rec);`
ctdb-recoverd: Release recovery lock on exit The recovery lock helper must exit when it notices its parent is gone. However, that can take a few seconds. The usual way of terminating the recovery daemon is for the main ctdbd to send it a SIGTERM. Installing a handler is nice and simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-02 02:26:40 +03:00			`exit(0);`
			`}`

ctdb-recoverd: Periodically log recovery master of incomplete cluster Only do this if the recovery lock is unset. Log every minute for the first 10 minutes, then every 10 minutes, then every hour. This is useful for determining whether a split brain occurred. It is particularly useful if logging failed or was throttled at startup, so there is no evidence of the split brain when it began. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-07-16 01:58:33 +03:00			`/*`
			`* Periodically log elements of the cluster state`
			`*`
			`* This can be used to confirm a split brain has occurred`
			`*/`
			`static void maybe_log_cluster_state(struct tevent_context *ev,`
			`struct tevent_timer *te,`
			`struct timeval current_time,`
			`void *private_data)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type_abort(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
			`struct tevent_timer *tt;`

			`static struct timeval start_incomplete = {`
			`.tv_sec = 0,`
			`};`

			`bool is_complete;`
			`bool was_complete;`
			`unsigned int i;`
			`double seconds;`
			`unsigned int minutes;`
			`unsigned int num_connected;`

ctdb-recoverd: Factor out and use function this_node_is_leader() Make the code self-documenting. This preempts an upcoming change to terminology but doing it now saves a lot of churn. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 11:37:39 +03:00			`if (!this_node_is_leader(rec)) {`
ctdb-recoverd: Periodically log recovery master of incomplete cluster Only do this if the recovery lock is unset. Log every minute for the first 10 minutes, then every 10 minutes, then every hour. This is useful for determining whether a split brain occurred. It is particularly useful if logging failed or was throttled at startup, so there is no evidence of the split brain when it began. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-07-16 01:58:33 +03:00			`goto done;`
			`}`

			`if (rec->nodemap == NULL) {`
			`goto done;`
			`}`

			`is_complete = true;`
			`num_connected = 0;`
			`for (i = 0; i < rec->nodemap->num; i++) {`
			`struct ctdb_node_and_flags *n = &rec->nodemap->nodes[i];`

ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 12:25:46 +03:00			`if (n->pnn == rec->pnn) {`
ctdb-recoverd: Periodically log recovery master of incomplete cluster Only do this if the recovery lock is unset. Log every minute for the first 10 minutes, then every 10 minutes, then every hour. This is useful for determining whether a split brain occurred. It is particularly useful if logging failed or was throttled at startup, so there is no evidence of the split brain when it began. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-07-16 01:58:33 +03:00			`continue;`
			`}`
			`if ((n->flags & NODE_FLAGS_DELETED) != 0) {`
			`continue;`
			`}`
			`if ((n->flags & NODE_FLAGS_DISCONNECTED) != 0) {`
			`is_complete = false;`
			`continue;`
			`}`

			`num_connected++;`
			`}`

			`was_complete = timeval_is_zero(&start_incomplete);`

			`if (is_complete) {`
			`if (! was_complete) {`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`D_WARNING("Cluster complete with leader=%u\n",`
ctdb-recoverd: Rename recmaster field to leader Recovery master is being renamed to leader. This follows clustering best practice (e.g. RAFT). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 08:22:33 +03:00			`rec->leader);`
ctdb-recoverd: Periodically log recovery master of incomplete cluster Only do this if the recovery lock is unset. Log every minute for the first 10 minutes, then every 10 minutes, then every hour. This is useful for determining whether a split brain occurred. It is particularly useful if logging failed or was throttled at startup, so there is no evidence of the split brain when it began. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-07-16 01:58:33 +03:00			`start_incomplete = timeval_zero();`
			`}`
			`goto done;`
			`}`

			`/* Cluster is newly incomplete... */`
			`if (was_complete) {`
			`start_incomplete = current_time;`
			`minutes = 0;`
			`goto log;`
			`}`

			`/*`
			`* Cluster has been incomplete since previous check, so figure`
			`* out how long (in minutes) and decide whether to log anything`
			`*/`
			`seconds = timeval_elapsed2(&start_incomplete, &current_time);`
			`minutes = (unsigned int)seconds / 60;`
			`if (minutes >= 60) {`
			`/* Over an hour, log every hour */`
			`if (minutes % 60 != 0) {`
			`goto done;`
			`}`
			`} else if (minutes >= 10) {`
			`/* Over 10 minutes, log every 10 minutes */`
			`if (minutes % 10 != 0) {`
			`goto done;`
			`}`
			`}`

			`log:`
ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-08 03:07:25 +03:00			`D_WARNING("Cluster incomplete with leader=%u, elapsed=%u minutes, "`
ctdb-recoverd: Periodically log recovery master of incomplete cluster Only do this if the recovery lock is unset. Log every minute for the first 10 minutes, then every 10 minutes, then every hour. This is useful for determining whether a split brain occurred. It is particularly useful if logging failed or was throttled at startup, so there is no evidence of the split brain when it began. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-07-16 01:58:33 +03:00			`"connected=%u\n",`
ctdb-recoverd: Rename recmaster field to leader Recovery master is being renamed to leader. This follows clustering best practice (e.g. RAFT). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 08:22:33 +03:00			`rec->leader,`
ctdb-recoverd: Periodically log recovery master of incomplete cluster Only do this if the recovery lock is unset. Log every minute for the first 10 minutes, then every 10 minutes, then every hour. This is useful for determining whether a split brain occurred. It is particularly useful if logging failed or was throttled at startup, so there is no evidence of the split brain when it began. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-07-16 01:58:33 +03:00			`minutes,`
			`num_connected);`

			`done:`
			`tt = tevent_add_timer(ctdb->ev,`
			`rec,`
			`timeval_current_ofs(60, 0),`
			`maybe_log_cluster_state,`
			`rec);`
			`if (tt == NULL) {`
			`DBG_WARNING("Failed to set up cluster state timer\n");`
			`}`
			`}`
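
The log throttling in `maybe_log_cluster_state()` above depends only on the number of whole minutes the cluster has been incomplete. As a minimal sketch, the decision can be isolated into a pure function. The name `should_log_incomplete()` is hypothetical and does not appear in the source; the sketch also assumes the caller has already handled the "newly incomplete" case, where the source jumps straight to logging with `minutes = 0`.

```c
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the throttle above:
 *   - under 10 minutes: log on every check (one check per minute)
 *   - 10 to 59 minutes: log only on multiples of 10
 *   - 60 minutes or more: log only on multiples of 60
 */
static bool should_log_incomplete(unsigned int minutes)
{
	if (minutes >= 60) {
		return (minutes % 60) == 0;
	}
	if (minutes >= 10) {
		return (minutes % 10) == 0;
	}
	return true;
}
```

Because the timer is re-armed every 60 seconds and the elapsed time is truncated to whole minutes, this scheme relies on each check landing in a distinct minute. A check that fires slightly late could skip a multiple of 10 or 60, which is a limitation of the approach rather than of this sketch.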
ctdb-recoverd: Release recovery lock on exit The recovery lock helper must exit when it notices its parent is gone. However, that can take a few seconds. The usual way of terminating the recovery daemon is for the main ctdbd to send it a SIGTERM. Installing a handler is nice and simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-02 02:26:40 +03:00
ctdb-recoverd: Pass SIGHUP to running helper The recovery and takeover helpers can run for a while and generate non-trivial logs, so have them reopen their logs to support log rotation. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Jan 17 04:36:30 UTC 2022 on sn-devel-184 2021-09-30 14:16:44 +03:00			`static void recd_sighup_hook(void *private_data)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type_abort(`
			`private_data, struct ctdb_recoverd);`

			`if (rec->helper_pid > 0) {`
			`kill(rec->helper_pid, SIGHUP);`
			`}`
			`}`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`/*`
			`the main monitoring loop`
			`*/`
			`static void monitor_cluster(struct ctdb_context *ctdb)`
			`{`
ctdb-recoverd: Release recovery lock on exit The recovery lock helper must exit when it notices its parent is gone. However, that can take a few seconds. The usual way of terminating the recovery daemon is for the main ctdbd to send it a SIGTERM. Installing a handler is nice and simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-02 02:26:40 +03:00			`struct tevent_signal *se;`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`struct ctdb_recoverd *rec;`
ctdb-recoverd: Add basic log reopening Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-09-30 14:03:15 +03:00			`bool status;`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00
			`DEBUG(DEBUG_NOTICE,("monitor_cluster starting\n"));`

			`rec = talloc_zero(ctdb, struct ctdb_recoverd);`
			`CTDB_NO_MEMORY_FATAL(ctdb, rec);`

			`rec->ctdb = ctdb;`
ctdb-recoverd: Rename recmaster field to leader Recovery master is being renamed to leader. This follows clustering best practice (e.g. RAFT). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-07-14 08:22:33 +03:00			`rec->leader = CTDB_UNKNOWN_PNN;`
ctdb-recoverd: Add PNN to recovery daemon context This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? The intention is to always use rec->pnn when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-09 02:33:17 +03:00			`rec->pnn = ctdb_get_pnn(ctdb);`
ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:29:06 +03:00			`rec->cluster_lock_handle = NULL;`
ctdb-recoverd: Record helper PID in recovery daemon context Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-09-30 14:15:56 +03:00			`rec->helper_pid = -1;`
added health monitoring logic to ctdb, so a node loses its public IP address if one of the sybsystem event scripts reports a problem (This used to be ctdb commit c7a089256d86cec21097453bce5acbccee87413f) 2007-06-06 04:25:46 +04:00
ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`rec->takeover_run = ctdb_op_init(rec, "takeover runs");`
			`CTDB_NO_MEMORY_FATAL(ctdb, rec->takeover_run);`
recoverd: do_takeover_run() should mark when a takeover run is in progress Nested takeover runs should never happens so they should fail. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8ed29c60c0a7dd29f2a6efdf694d38e94281e1c4) 2013-09-03 05:20:01 +04:00
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`rec->recovery = ctdb_op_init(rec, "recoveries");`
			`CTDB_NO_MEMORY_FATAL(ctdb, rec->recovery);`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`rec->priority_time = timeval_current();`
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00			`rec->frozen_on_inactive = false;`
force an update of the flags from the recmaster after each monitoring run (This used to be ctdb commit 251aeadc8b16a9c27a4bae78c97ad6e93e6cfdf4) 2008-06-26 07:08:37 +04:00
ctdb-recoverd: Add basic log reopening Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-09-30 14:03:15 +03:00			`status = logging_setup_sighup_handler(rec->ctdb->ev,`
			`rec,`
ctdb-recoverd: Pass SIGHUP to running helper The recovery and takeover helpers can run for a while and generate non-trivial logs, so have them reopen their logs to support log rotation. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Jan 17 04:36:30 UTC 2022 on sn-devel-184 2021-09-30 14:16:44 +03:00			`recd_sighup_hook,`
			`rec);`
ctdb-recoverd: Add basic log reopening Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-09-30 14:03:15 +03:00			`if (!status) {`
			`D_ERR("Failed to install SIGHUP handler\n");`
			`exit(1);`
			`}`

ctdb-recoverd: Release recovery lock on exit The recovery lock helper must exit when it notices its parent is gone. However, that can take a few seconds. The usual way of terminating the recovery daemon is for the main ctdbd to send it a SIGTERM. Installing a handler is nice and simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-02 02:26:40 +03:00			`se = tevent_add_signal(ctdb->ev, ctdb, SIGTERM, 0,`
			`recd_sig_term_handler, rec);`
			`if (se == NULL) {`
			`DEBUG(DEBUG_ERR, ("Failed to install SIGTERM handler\n"));`
			`exit(1);`
			`}`

ctdb-recoverd: Add and use function cluster_lock_enabled() Now all references to ctdb->recovery_lock are encapsulated in the cluster lock code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2021-12-10 03:43:10 +03:00			`if (!cluster_lock_enabled(rec)) {`
ctdb-recoverd: Periodically log recovery master of incomplete cluster Only do this if the recovery lock is unset. Log every minute for the first 10 minutes, then every 10 minutes, then every hour. This is useful for determining whether a split brain occurred. It is particularly useful if logging failed or was throttled at startup, so there is no evidence of the split brain when it began. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2019-07-16 01:58:33 +03:00			`struct tevent_timer *tt;`

			`tt = tevent_add_timer(ctdb->ev,`
			`rec,`
			`timeval_current_ofs(60, 0),`
			`maybe_log_cluster_state,`
			`rec);`
			`if (tt == NULL) {`
			`DBG_WARNING("Failed to set up cluster state timer\n");`
			`}`
			`}`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`/* register a message port for sending memory dumps */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_MEM_DUMP, mem_dump_handler, rec);`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00
ctdb-recoverd: Add message handler to assigning banning credits This will be called from recovery helper to assign banning credits to misbehaving node. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-03-17 09:26:30 +03:00			`/* when a node is assigned banning credits */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_BANNING,`
			`banning_handler, rec);`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`/* register a message port for recovery elections */`
ctdb-include: Use new protocol definitions This gets rid of the duplicate definitions from ctdb_protocol.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:51:52 +03:00			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_ELECTION, election_handler, rec);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00
ctdb-recoverd: Mark CTDB_SRVID_SET_NODE_FLAGS obsolete CTDB_SRVID_SET_NODE_FLAGS is no longer sent so drop monitor_handler() and replace with srvid_not_implemented(). Mark the SRVID obsolete in its comment. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-01-17 11:04:34 +03:00			`ctdb_client_set_message_handler(ctdb,`
			`CTDB_SRVID_SET_NODE_FLAGS,`
			`srvid_not_implemented,`
			`rec);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00
			`/* when we are asked to push out a flag change */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_PUSH_NODE_FLAGS, push_flags_handler, rec);`

			`/* register a message port for reloadnodes */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_RELOAD_NODES, reload_nodes_handler, rec);`

			`/* register a message port for performing a takeover run */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_TAKEOVER_RUN, ip_reallocate_handler, rec);`

			`/* register a message port for disabling the ip check for a short while */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_DISABLE_IP_CHECK, disable_ip_check_handler, rec);`

When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`/* register a message port for forcing a rebalance of a node next`
			`reallocation */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_REBALANCE_NODE, recd_node_rebalance_handler, rec);`

recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`/* Register a message port for disabling takeover runs */`
			`ctdb_client_set_message_handler(ctdb,`
			`CTDB_SRVID_DISABLE_TAKEOVER_RUNS,`
			`disable_takeover_runs_handler, rec);`

ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:06:44 +03:00			`/* Register a message port for disabling recoveries */`
			`ctdb_client_set_message_handler(ctdb,`
			`CTDB_SRVID_DISABLE_RECOVERIES,`
			`disable_recoveries_handler, rec);`

ctdb-recoverd: Process leader broadcasts Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2020-03-16 08:07:26 +03:00			`ctdb_client_set_message_handler(ctdb,`
			`CTDB_SRVID_LEADER,`
			`leader_handler,`
			`rec);`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`for (;;) {`
			`TALLOC_CTX *mem_ctx = talloc_new(ctdb);`
speed startup: don't wait a full recovery interval if we've already waited We currently sleep for one second, whether or not we've already slept. Change this to sleep for the remainder of the second, if at all. Seconds between ctdbd first log message and node healthy: BEFORE: 18.09 AFTER: 17.08 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fde760b5f39c77172308a583da4c2443b71541c9) 2010-06-22 17:20:35 +04:00			`struct timeval start;`
			`double elapsed;`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`if (!mem_ctx) {`
			`DEBUG(DEBUG_CRIT,(__location__`
			`" Failed to create temp context\n"));`
			`exit(-1);`
			`}`

speed startup: don't wait a full recovery interval if we've already waited We currently sleep for one second, whether or not we've already slept. Change this to sleep for the remainder of the second, if at all. Seconds between ctdbd first log message and node healthy: BEFORE: 18.09 AFTER: 17.08 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fde760b5f39c77172308a583da4c2443b71541c9) 2010-06-22 17:20:35 +04:00			`start = timeval_current();`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`main_loop(ctdb, rec, mem_ctx);`
			`talloc_free(mem_ctx);`

			`/* we only check for recovery once every recover_interval seconds */`
speed startup: don't wait a full recovery interval if we've already waited We currently sleep for one second, whether or not we've already slept. Change this to sleep for the remainder of the second, if at all. Seconds between ctdbd first log message and node healthy: BEFORE: 18.09 AFTER: 17.08 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fde760b5f39c77172308a583da4c2443b71541c9) 2010-06-22 17:20:35 +04:00			`elapsed = timeval_elapsed(&start);`
			`if (elapsed < ctdb->tunable.recover_interval) {`
			`ctdb_wait_timeout(ctdb, ctdb->tunable.recover_interval`
			`- elapsed);`
			`}`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`}`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`}`
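The loop above runs `main_loop()` first and then waits only for whatever is left of `recover_interval`. A minimal standalone sketch of that pacing pattern follows. It assumes plain POSIX `clock_gettime()` and `nanosleep()` as stand-ins for `timeval_current()`, `timeval_elapsed()` and `ctdb_wait_timeout()` (which, unlike `nanosleep()`, keeps processing events while waiting). The names `remaining_wait`, `now_seconds` and `run_paced` are hypothetical and do not exist in ctdb.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Seconds still to wait so that one iteration lasts at least `interval`.
 * Hypothetical helper, not part of ctdb. */
double remaining_wait(double interval, double elapsed)
{
	assert(interval >= 0.0);
	return (elapsed < interval) ? (interval - elapsed) : 0.0;
}

/* Current monotonic time in seconds. */
double now_seconds(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Run work() once, then sleep for the unused part of the interval. */
void run_paced(void (*work)(void), double interval)
{
	double start = now_seconds();
	double left;

	work();

	left = remaining_wait(interval, now_seconds() - start);
	if (left > 0.0) {
		struct timespec ts;

		ts.tv_sec = (time_t)left;
		ts.tv_nsec = (long)((left - (double)ts.tv_sec) * 1e9);
		nanosleep(&ts, NULL);
	}
}
```

Doing the work before the wait is what lets the first check happen immediately at startup, and subtracting the elapsed time keeps a slow iteration from being followed by a full extra interval.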

added health monitoring logic to ctdb, so a node loses its public IP address if one of the sybsystem event scripts reports a problem (This used to be ctdb commit c7a089256d86cec21097453bce5acbccee87413f) 2007-06-06 04:25:46 +04:00			`/*`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`event handler for when the main ctdbd dies`
			`*/`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_recoverd_parent(struct tevent_context *ev,`
			`struct tevent_fd *fde,`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`uint16_t flags, void *private_data)`
			`{`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ALERT,("recovery daemon parent died - exiting\n"));`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`_exit(1);`
			`}`
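`ctdb_recoverd_parent()` fires when the read end of the pipe created in `ctdb_start_recoverd()` becomes readable. The parent holds the only write end and never writes to it, so readability can only mean end-of-file, that is, the parent has exited. The sketch below shows the same detection with plain `poll()` instead of tevent. The function name `pipe_peer_gone` is hypothetical, and treating `POLLERR` as "peer gone" is an assumption of this sketch.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Returns 1 if every write end of the pipe has been closed,
 * 0 if a writer still exists, -1 on error.
 * Note: consumes one byte if data happens to be pending. */
int pipe_peer_gone(int read_fd)
{
	struct pollfd pfd = { .fd = read_fd, .events = POLLIN };
	int n;

	assert(read_fd >= 0);

	n = poll(&pfd, 1, 0);	/* timeout 0: do not block */
	if (n < 0) {
		return -1;
	}
	if (n == 0) {
		return 0;	/* nothing to report, writer still open */
	}
	if (pfd.revents & POLLNVAL) {
		return -1;	/* not a valid descriptor */
	}
	if (pfd.revents & POLLIN) {
		char c;
		ssize_t r = read(read_fd, &c, 1);

		if (r == 0) {
			return 1;	/* EOF: all writers are gone */
		}
		return (r < 0) ? -1 : 0;
	}
	/* Some systems report a closed writer as POLLHUP without POLLIN */
	return (pfd.revents & (POLLHUP | POLLERR)) ? 1 : 0;
}
```

The technique needs no signals and no polling of the parent's pid: the kernel closes the parent's descriptors when it dies, whatever the cause.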

Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00			`/*`
			`called regularly to verify that the recovery daemon is still running`
			`*/`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_check_recd(struct tevent_context *ev,`
			`struct tevent_timer *te,`
			`struct timeval yt, void *p)`
Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00			`{`
			`struct ctdb_context *ctdb = talloc_get_type(p, struct ctdb_context);`

Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78) 2012-05-03 05:42:41 +04:00			`if (ctdb_kill(ctdb, ctdb->recoverd_pid, 0) != 0) {`
If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main deamon too. While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust. (This used to be ctdb commit 0566ef3d6cef809bda204877c493c80ff9eb2c40) 2011-03-01 04:09:42 +03:00			`DEBUG(DEBUG_ERR,("Recovery daemon (pid:%d) is no longer running. Trying to restart recovery daemon.\n", (int)ctdb->recoverd_pid));`
Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_add_timer(ctdb->ev, ctdb, timeval_zero(),`
			`ctdb_restart_recd, ctdb);`
Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00
If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main deamon too. While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust. (This used to be ctdb commit 0566ef3d6cef809bda204877c493c80ff9eb2c40) 2011-03-01 04:09:42 +03:00			`return;`
Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00			`}`

ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_add_timer(ctdb->ev, ctdb->recd_ctx,`
			`timeval_current_ofs(30, 0),`
			`ctdb_check_recd, ctdb);`
Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00			`}`
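`ctdb_check_recd()` probes the recovery daemon by sending signal 0, which performs the existence and permission checks without delivering anything. A minimal sketch of that probe with plain `kill(2)` follows. It deliberately leaves out the child tracking that `ctdb_kill()` adds, and the name `process_exists` is hypothetical. Unlike the code above, the sketch counts `EPERM` as "exists", which is an assumption that only matters when probing processes owned by another user.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Returns 1 if a process with this pid exists, 0 otherwise. */
int process_exists(pid_t pid)
{
	/* pid <= 0 would address process groups, not a single process */
	assert(pid > 0);

	if (kill(pid, 0) == 0) {
		return 1;
	}
	/* EPERM: the process exists but we may not signal it */
	return errno == EPERM;
}
```

A child that has exited but has not been reaped yet still counts as existing for this probe, so it is only reliable once exited children are collected with `waitpid()`.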

ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void recd_sig_child_handler(struct tevent_context *ev,`
			`struct tevent_signal *se, int signum,`
			`int count, void *dont_care,`
			`void *private_data)`
proper waitpid() fix. remove all waitpid() calls and use the event system to trap sigchld (This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358) 2008-07-09 08:02:54 +04:00			`{`
			`// struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);`
			`int status;`
			`pid_t pid = -1;`

			`while (pid != 0) {`
			`pid = waitpid(-1, &status, WNOHANG);`
			`if (pid == -1) {`
dont log an error if waitpid returns -1 and errno is ECHILD (This used to be ctdb commit fdf50f3e774e3980af81c0b6f4ff81d085f4f697) 2009-06-19 09:55:13 +04:00			`if (errno != ECHILD) {`
			`DEBUG(DEBUG_ERR, (__location__ " waitpid() returned error. errno:%s(%d)\n", strerror(errno),errno));`
			`}`
proper waitpid() fix. remove all waitpid() calls and use the event system to trap sigchld (This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358) 2008-07-09 08:02:54 +04:00			`return;`
			`}`
			`if (pid > 0) {`
			`DEBUG(DEBUG_DEBUG, ("RECD SIGCHLD from %d\n", (int)pid));`
			`}`
			`}`
			`}`

implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`startup the recovery daemon as a child of the main ctdb daemon`
			`*/`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`int ctdb_start_recoverd(struct ctdb_context *ctdb)`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`{`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`int fd[2];`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`struct tevent_signal *se;`
event: Update events to latest Samba version 0.9.8 In Samba this is now called "tevent", and while we use the backwards compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now a separate tevent_fd_set_auto_close() function. This is based on Samba version 7f29f817fa939ef1bbb740584f09e76e2ecd5b06. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726) 2010-08-18 03:46:31 +04:00			`struct tevent_fd *fde;`
ctdb-daemon: Initialize logging in recovery daemon Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-11-29 08:49:41 +03:00			`int ret;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`if (pipe(fd) != 0) {`
			`return -1;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`}`

ctdb-logging: Remove log ringbuffer As far as we know, nobody uses this and it just complicates the logging subsystem. Remove all ringbuffer code and documentation. Update the local daemons startup code correspondingly. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org> 2014-08-08 06:51:03 +04:00			`ctdb->recoverd_pid = ctdb_fork(ctdb);`
when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b) 2007-10-22 06:34:08 +04:00			`if (ctdb->recoverd_pid == -1) {`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`return -1;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`}`
daemon: On shutdown, destroy timed events that check if recoverd is active When CTDB is shutting down, recovery daemon is stopped, but the event that checks if recovery daemon is still alive is not destroyed. So recovery master is restarted during shutdown if CTDB daemon takes longer to shutdown. There are two processes that check if recovery daemon is working. 1. ctdb_check_recd() - which checks every 30 seconds if the recovery daemon process exists. 2. ctdb_recd_ping_timeout() - which is triggered when recovery daemon fails to ping CTDB daemon. Both the events are periodic and need to be destroyed when shutting down. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 746168df2e691058e601016110fae818c6a265c3) 2012-12-04 08:05:44 +04:00
when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b) 2007-10-22 06:34:08 +04:00			`if (ctdb->recoverd_pid != 0) {`
daemon: On shutdown, destroy timed events that check if recoverd is active When CTDB is shutting down, recovery daemon is stopped, but the event that checks if recovery daemon is still alive is not destroyed. So recovery master is restarted during shutdown if CTDB daemon takes longer to shutdown. There are two processes that check if recovery daemon is working. 1. ctdb_check_recd() - which checks every 30 seconds if the recovery daemon process exists. 2. ctdb_recd_ping_timeout() - which is triggered when recovery daemon fails to ping CTDB daemon. Both the events are periodic and need to be destroyed when shutting down. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 746168df2e691058e601016110fae818c6a265c3) 2012-12-04 08:05:44 +04:00			`talloc_free(ctdb->recd_ctx);`
			`ctdb->recd_ctx = talloc_new(ctdb);`
			`CTDB_NO_MEMORY(ctdb, ctdb->recd_ctx);`

moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`close(fd[0]);`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_add_timer(ctdb->ev, ctdb->recd_ctx,`
			`timeval_current_ofs(30, 0),`
			`ctdb_check_recd, ctdb);`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`return 0;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`}`

moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`close(fd[1]);`

			`srandom(getpid() ^ time(NULL));`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
ctdb-daemon: Initialize logging in recovery daemon Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-11-29 08:49:41 +03:00			`ret = logging_init(ctdb, NULL, NULL, "ctdb-recoverd");`
			`if (ret != 0) {`
			`return -1;`
			`}`

ctdb-recoverd: Set the process name correctly Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2018-06-19 09:50:41 +03:00			`prctl_set_comment("ctdb_recoverd");`
ctdb-daemon: Remove setting of debug_extra from switch_from_server_to_client() Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-11-25 06:44:10 +03:00			`if (switch_from_server_to_client(ctdb) != 0) {`
create a helper function that converts a ctdb instance in daemon mode to become a ctdb client instance. use this from the recovery daemon child process to switch to client mode and connect back to the main daemon (This used to be ctdb commit 16f31786a031255ab5b3099a0a3c745de973347a) 2009-03-23 04:37:30 +03:00			`DEBUG(DEBUG_CRIT, (__location__ " ERROR: failed to switch recovery daemon into client mode. shutting down.\n"));`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`exit(1);`
			`}`

Drop the debug level for logging fd creation to DEBUG_DEBUG (This used to be ctdb commit eae1d4f9e52e73b4d8769868fffdafa590d03784) 2010-02-03 22:37:41 +03:00			`DEBUG(DEBUG_DEBUG, (__location__ " Created PIPE FD:%d to recovery daemon\n", fd[0]));`
add logging everytime we create a filedescriptor in the main ctdb daemon so we can spot if there are leaks. plug two leaks for filedescriptors related to when sending ARP fail and one leak when we can not parse the local address during tcp connection establish (This used to be ctdb commit ddd089810a14efe4be6e1ff3eccaa604e4913c9e) 2009-10-15 04:24:54 +04:00
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`fde = tevent_add_fd(ctdb->ev, ctdb, fd[0], TEVENT_FD_READ,`
			`ctdb_recoverd_parent, &fd[0]);`
event: Update events to latest Samba version 0.9.8 In Samba this is now called "tevent", and while we use the backwards compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now a separate tevent_fd_set_auto_close() function. This is based on Samba version 7f29f817fa939ef1bbb740584f09e76e2ecd5b06. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726) 2010-08-18 03:46:31 +04:00			`tevent_fd_set_auto_close(fde);`
create a helper function that converts a ctdb instance in daemon mode to become a ctdb client instance. use this from the recovery daemon child process to switch to client mode and connect back to the main daemon (This used to be ctdb commit 16f31786a031255ab5b3099a0a3c745de973347a) 2009-03-23 04:37:30 +03:00
proper waitpid() fix. remove all waitpid() calls and use the event system to trap sigchld (This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358) 2008-07-09 08:02:54 +04:00			`/* set up a handler to pick up sigchld */`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`se = tevent_add_signal(ctdb->ev, ctdb, SIGCHLD, 0,`
			`recd_sig_child_handler, ctdb);`
proper waitpid() fix. remove all waitpid() calls and use the event system to trap sigchld (This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358) 2008-07-09 08:02:54 +04:00			`if (se == NULL) {`
			`DEBUG(DEBUG_CRIT,("Failed to set up signal handler for SIGCHLD in recovery daemon\n"));`
			`exit(1);`
			`}`

moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`monitor_cluster(ctdb);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ALERT,("ERROR: ctdb_recoverd finished!?\n"));`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`return -1;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`}`
when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b) 2007-10-22 06:34:08 +04:00
			`/*`
			`shutdown the recovery daemon`
			`*/`
			`void ctdb_stop_recoverd(struct ctdb_context *ctdb)`
			`{`
			`if (ctdb->recoverd_pid == 0) {`
			`return;`
			`}`

merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_NOTICE,("Shutting down recovery daemon\n"));`
Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78) 2012-05-03 05:42:41 +04:00			`ctdb_kill(ctdb, ctdb->recoverd_pid, SIGTERM);`
daemon: On shutdown, destroy timed events that check if recoverd is active When CTDB is shutting down, recovery daemon is stopped, but the event that checks if recovery daemon is still alive is not destroyed. So recovery master is restarted during shutdown if CTDB daemon takes longer to shutdown. There are two processes that check if recovery daemon is working. 1. ctdb_check_recd() - which checks every 30 seconds if the recovery daemon process exists. 2. ctdb_recd_ping_timeout() - which is triggered when recovery daemon fails to ping CTDB daemon. Both the events are periodic and need to be destroyed when shutting down. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 746168df2e691058e601016110fae818c6a265c3) 2012-12-04 08:05:44 +04:00
			`TALLOC_FREE(ctdb->recd_ctx);`
			`TALLOC_FREE(ctdb->recd_ping_count);`
when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b) 2007-10-22 06:34:08 +04:00			`}`
If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main deamon too. While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust. (This used to be ctdb commit 0566ef3d6cef809bda204877c493c80ff9eb2c40) 2011-03-01 04:09:42 +03:00
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_restart_recd(struct tevent_context *ev,`
			`struct tevent_timer *te,`
			`struct timeval t, void *private_data)`
If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main deamon too. While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust. (This used to be ctdb commit 0566ef3d6cef809bda204877c493c80ff9eb2c40) 2011-03-01 04:09:42 +03:00			`{`
			`struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);`

			`DEBUG(DEBUG_ERR,("Restarting recovery daemon\n"));`
			`ctdb_stop_recoverd(ctdb);`
			`ctdb_start_recoverd(ctdb);`
			`}`
