samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-23 17:34:34 +03:00

3259 lines

86 KiB

C


start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`/*`
			`ctdb recovery daemon`

			`Copyright (C) Ronnie Sahlberg 2007`

ctdb is GPL not LGPL (This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960) 2007-05-31 07:50:53 +04:00			`This program is free software; you can redistribute it and/or modify`
			`it under the terms of the GNU General Public License as published by`
update lib/replace from samba4 (This used to be ctdb commit f0555484105668c01c21f56322992e752e831109) 2007-07-10 09:29:31 +04:00			`the Free Software Foundation; either version 3 of the License, or`
ctdb is GPL not LGPL (This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960) 2007-05-31 07:50:53 +04:00			`(at your option) any later version.`

			`This program is distributed in the hope that it will be useful,`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`but WITHOUT ANY WARRANTY; without even the implied warranty of`
ctdb is GPL not LGPL (This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960) 2007-05-31 07:50:53 +04:00			`MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the`
			`GNU General Public License for more details.`

			`You should have received a copy of the GNU General Public License`
update lib/replace from samba4 (This used to be ctdb commit f0555484105668c01c21f56322992e752e831109) 2007-07-10 09:29:31 +04:00			`along with this program; if not, see <http://www.gnu.org/licenses/>.`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`*/`

ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:46 +03:00			`#include "replace.h"`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`#include "system/filesys.h"`
better timeout handling for calls, controls and traverses (This used to be ctdb commit 63346a6c59d4821b4c443939b5d88db8cd20f5fe) 2007-05-10 08:06:48 +04:00			`#include "system/time.h"`
let each node verify that they have a correct assignment of public ip addresses (i.e. htey hold those they should hold and they dont hold any of those they shouldnt hold) if an inconsistency is found, mark the local node as recovery mode active and wait for the recovery master to trigger a full blown recovery (This used to be ctdb commit 55a5bfc8244c5b9cdda3f11992f384f00566b5dc) 2007-09-14 04:16:36 +04:00			`#include "system/network.h"`
when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b) 2007-10-22 06:34:08 +04:00			`#include "system/wait.h"`
ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:46 +03:00
			`#include <popt.h>`
			`#include <talloc.h>`
			`#include <tevent.h>`
			`#include <tdb.h>`

ctdb-util: Rename db_wrap to tdb_wrap and make it a build subsystem This makes it consistent with Samba, to ease transition. Update unit test code to link to with tdb_wrap instead of including db_wrap.c. There are some potential whitespace fixes in this commit that have been ignored. CTDB's lib/tdb_wrap will be deleted after the transition to Samba's lib/tdb_wrap, so there's no point polishing it too much. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-08-15 09:46:33 +04:00			`#include "lib/tdb_wrap/tdb_wrap.h"`
ctdb-recoverd: Change include of dlinklist.h to contain directory This makes it consistent with the rest of the code and avoids problems when some variant of lib/util isn't in the include path. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-08-15 10:18:05 +04:00			`#include "lib/util/dlinklist.h"`
ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:46 +03:00			`#include "lib/util/debug.h"`
			`#include "lib/util/samba_util.h"`
ctdb-common: Drop CTDB's copy of sys_read() and sys_write() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Nov 29 11:22:40 CET 2016 on sn-devel-144 2016-11-29 04:55:06 +03:00			`#include "lib/util/sys_rw.h"`
ctdb: Use prctl_set_comment from lib/util Signed-off-by: Christof Schmitt <cs@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-09-24 02:10:59 +03:00			`#include "lib/util/util_process.h"`
ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:46 +03:00
			`#include "ctdb_private.h"`
			`#include "ctdb_client.h"`

ctdb-daemon: Separate prototypes for system specific functions This groups function prototypes for system specific functions in common/system.h and removes them from ctdb_private.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-23 06:11:53 +03:00			`#include "common/system.h"`
ctdb-daemon: Separate prototypes for common client/server functions This groups function prototypes for common client/server functions in common/common.h and removes them from ctdb_private.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-23 06:17:34 +03:00			`#include "common/common.h"`
ctdb-server: Replace ctdb_logging.h with common/logging.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> 2015-11-11 07:41:10 +03:00			`#include "common/logging.h"`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`#include "ctdb_cluster_mutex.h"`
added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`/* List of SRVID requests that need to be processed */`
			`struct srvid_list {`
			`struct srvid_list *next, *prev;`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message *request;`
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`};`

			`struct srvid_requests {`
			`struct srvid_list *requests;`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`};`

recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`static void srvid_request_reply(struct ctdb_context *ctdb,`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message *request,`
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`TDB_DATA result)`
			`{`
			`/* Someone that sent srvid==0 does not want a reply */`
			`if (request->srvid == 0) {`
			`talloc_free(request);`
			`return;`
			`}`

			`if (ctdb_client_send_message(ctdb, request->pnn, request->srvid,`
			`result) == 0) {`
			`DEBUG(DEBUG_INFO,("Sent SRVID reply to %u:%llu\n",`
			`(unsigned)request->pnn,`
			`(unsigned long long)request->srvid));`
			`} else {`
			`DEBUG(DEBUG_ERR,("Failed to send SRVID reply to %u:%llu\n",`
			`(unsigned)request->pnn,`
			`(unsigned long long)request->srvid));`
			`}`

			`talloc_free(request);`
			`}`

			`static void srvid_requests_reply(struct ctdb_context *ctdb,`
			`struct srvid_requests **requests,`
			`TDB_DATA result)`
			`{`
			`struct srvid_list *r;`

ctdb-recoverd: Add early return in srvid_requests_reply() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 08:56:09 +03:00			`if (*requests == NULL) {`
			`return;`
			`}`

recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`for (r = (*requests)->requests; r != NULL; r = r->next) {`
			`srvid_request_reply(ctdb, r->request, result);`
			`}`

			`/* Free the list structure... */`
			`TALLOC_FREE(*requests);`
			`}`

			`static void srvid_request_add(struct ctdb_context *ctdb,`
			`struct srvid_requests **requests,`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message *request)`
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`{`
			`struct srvid_list *t;`
			`int32_t ret;`
			`TDB_DATA result;`

			`if (*requests == NULL) {`
			`*requests = talloc_zero(ctdb, struct srvid_requests);`
			`if (*requests == NULL) {`
			`goto nomem;`
			`}`
			`}`

			`t = talloc_zero(*requests, struct srvid_list);`
			`if (t == NULL) {`
			`/* If requests was just allocated above then free it */`
			`if ((*requests)->requests == NULL) {`
			`TALLOC_FREE(*requests);`
			`}`
			`goto nomem;`
			`}`

ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`t->request = (struct ctdb_srvid_message *)talloc_steal(t, request);`
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`DLIST_ADD((*requests)->requests, t);`

			`return;`

			`nomem:`
			`/* Failed to add the request to the list. Send a fail. */`
			`DEBUG(DEBUG_ERR, (__location__`
			`" Out of memory, failed to queue SRVID request\n"));`
			`ret = -ENOMEM;`
			`result.dsize = sizeof(ret);`
			`result.dptr = (uint8_t *)&ret;`
			`srvid_request_reply(ctdb, request, result);`
			`}`
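
The queue-then-reply-to-all pattern used by `srvid_request_add()` and `srvid_requests_reply()` can be sketched without talloc or the CTDB client API. This is only an illustration under stated assumptions, not the daemon's code: `pending_request`, `queue_add`, `queue_reply_all` and `reply_always_ok` are hypothetical names, and the `reply_fn` callback stands in for `ctdb_client_send_message()`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ctdb_srvid_message. */
struct pending_request {
	uint32_t pnn;
	uint64_t srvid;
	struct pending_request *next;
};

/* Hypothetical reply callback; returns 0 on success. */
typedef int (*reply_fn)(uint32_t pnn, uint64_t srvid, int32_t result);

/* Demonstration callback that pretends every reply is delivered. */
static int reply_always_ok(uint32_t pnn, uint64_t srvid, int32_t result)
{
	(void)pnn;
	(void)srvid;
	(void)result;
	return 0;
}

/* Push a request onto the head of the list.
 * Returns 0 on success, -1 if allocation fails. */
static int queue_add(struct pending_request **head,
		     uint32_t pnn, uint64_t srvid)
{
	struct pending_request *r;

	assert(head != NULL);
	r = calloc(1, sizeof(*r));
	if (r == NULL) {
		return -1;
	}
	r->pnn = pnn;
	r->srvid = srvid;
	r->next = *head;
	*head = r;
	return 0;
}

/* Send the same result to every queued requester, then free the list.
 * A requester with srvid == 0 does not want a reply and is skipped.
 * Returns the number of replies that were sent successfully. */
static unsigned queue_reply_all(struct pending_request **head,
				int32_t result, reply_fn reply)
{
	unsigned sent = 0;
	struct pending_request *r;

	assert(head != NULL);
	r = *head;
	while (r != NULL) {
		struct pending_request *next = r->next;

		if (r->srvid != 0 && reply(r->pnn, r->srvid, result) == 0) {
			sent++;
		}
		free(r);
		r = next;
	}
	*head = NULL;
	return sent;
}
```

As in the original, the list head is reset once all replies are sent, so calling the reply function again on an empty queue is a harmless no-op.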

ctdb-recoverd: Add a new abstraction ctdb_op_disable() This can be used to disable and re-enable an operation, and do all the relevant sanity checking. Most of this is from existing functions disable_takeover_runs_handler(), clear_takeover_runs_disable() and reenable_takeover_runs(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:50:38 +03:00			`/* An abstraction to allow an operation (takeover runs, recoveries,`
			`* ...) to be disabled for a given timeout */`
			`struct ctdb_op_state {`
			`struct tevent_timer *timer;`
			`bool in_progress;`
			`const char *name;`
			`};`

			`static struct ctdb_op_state *ctdb_op_init(TALLOC_CTX *mem_ctx, const char *name)`
			`{`
			`struct ctdb_op_state *state = talloc_zero(mem_ctx, struct ctdb_op_state);`

			`if (state != NULL) {`
			`state->in_progress = false;`
			`state->name = name;`
			`}`

			`return state;`
			`}`

			`static bool ctdb_op_is_disabled(struct ctdb_op_state *state)`
			`{`
			`return state->timer != NULL;`
			`}`

			`static bool ctdb_op_begin(struct ctdb_op_state *state)`
			`{`
			`if (ctdb_op_is_disabled(state)) {`
			`DEBUG(DEBUG_NOTICE,`
			`("Unable to begin - %s are disabled\n", state->name));`
			`return false;`
			`}`

			`state->in_progress = true;`
			`return true;`
			`}`

			`static bool ctdb_op_end(struct ctdb_op_state *state)`
			`{`
			`return state->in_progress = false;`
			`}`

			`static bool ctdb_op_is_in_progress(struct ctdb_op_state *state)`
			`{`
			`return state->in_progress;`
			`}`

			`static void ctdb_op_enable(struct ctdb_op_state *state)`
			`{`
			`TALLOC_FREE(state->timer);`
			`}`

ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_op_timeout_handler(struct tevent_context *ev,`
			`struct tevent_timer *te,`
ctdb-recoverd: Add a new abstraction ctdb_op_disable() This can be used to disable and re-enable an operation, and do all the relevant sanity checking. Most of this is from existing functions disable_takeover_runs_handler(), clear_takeover_runs_disable() and reenable_takeover_runs(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:50:38 +03:00			`struct timeval yt, void *p)`
			`{`
			`struct ctdb_op_state *state =`
			`talloc_get_type(p, struct ctdb_op_state);`

			`DEBUG(DEBUG_NOTICE,("Reenabling %s after timeout\n", state->name));`
			`ctdb_op_enable(state);`
			`}`

			`static int ctdb_op_disable(struct ctdb_op_state *state,`
			`struct tevent_context *ev,`
			`uint32_t timeout)`
			`{`
			`if (timeout == 0) {`
			`DEBUG(DEBUG_NOTICE,("Reenabling %s\n", state->name));`
			`ctdb_op_enable(state);`
			`return 0;`
			`}`

			`if (state->in_progress) {`
			`DEBUG(DEBUG_ERR,`
			`("Unable to disable %s - in progress\n", state->name));`
			`return -EAGAIN;`
			`}`

			`DEBUG(DEBUG_NOTICE,("Disabling %s for %u seconds\n",`
			`state->name, timeout));`

			`/* Clear any old timers */`
			`talloc_free(state->timer);`

			`/* Arrange for the timeout to occur */`
			`state->timer = tevent_add_timer(ev, state,`
			`timeval_current_ofs(timeout, 0),`
			`ctdb_op_timeout_handler, state);`
			`if (state->timer == NULL) {`
			`DEBUG(DEBUG_ERR,(__location__ " Unable to setup timer\n"));`
			`return -ENOMEM;`
			`}`

			`return 0;`
			`}`
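
The disable/begin/end logic of the `ctdb_op_*` helpers can be sketched as a small state machine. This is a simplified illustration, not the daemon's implementation: it assumes a stored deadline that the caller checks against a supplied `now` value instead of a tevent timer, and `op_gate` and its functions are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical analogue of struct ctdb_op_state. */
struct op_gate {
	time_t disabled_until;	/* 0 means "not disabled" */
	bool in_progress;
};

/* An operation is disabled while its deadline has not yet passed. */
static bool op_gate_is_disabled(struct op_gate *g, time_t now)
{
	assert(g != NULL);
	if (g->disabled_until != 0 && now >= g->disabled_until) {
		g->disabled_until = 0;	/* deadline expired: re-enable */
	}
	return g->disabled_until != 0;
}

/* Refuse to start the operation while it is disabled. */
static bool op_gate_begin(struct op_gate *g, time_t now)
{
	if (op_gate_is_disabled(g, now)) {
		return false;
	}
	g->in_progress = true;
	return true;
}

static void op_gate_end(struct op_gate *g)
{
	assert(g != NULL);
	g->in_progress = false;
}

/* timeout == 0 re-enables immediately; otherwise disable for
 * `timeout` seconds unless the operation is currently running. */
static int op_gate_disable(struct op_gate *g, time_t now, uint32_t timeout)
{
	assert(g != NULL);
	if (timeout == 0) {
		g->disabled_until = 0;
		return 0;
	}
	if (g->in_progress) {
		return -EAGAIN;
	}
	g->disabled_until = now + (time_t)timeout;
	return 0;
}
```

Passing `now` explicitly keeps the sketch deterministic. The original instead lets the event loop fire `ctdb_op_timeout_handler()`, which frees the timer and thereby re-enables the operation.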

new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`struct ctdb_banning_state {`
			`uint32_t count;`
			`struct timeval last_reported_time;`
			`};`

implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`private state of recovery daemon`
			`*/`
			`struct ctdb_recoverd {`
			`struct ctdb_context *ctdb;`
change recmaster from being a local variable in monitor_cluster() to be a member of the ctdb_recoverd structure (This used to be ctdb commit b7f955338f50c92374b4f559268fb3a1a516aefa) 2008-03-02 23:53:46 +03:00			`uint32_t recmaster;`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`uint32_t last_culprit_node;`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap;`
use a priority time for the election data, not just the vnn (This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d) 2007-06-07 12:37:27 +04:00			`struct timeval priority_time;`
prevent recursion in the calling of ctdb_takeover_run (This used to be ctdb commit 0fbdeb7c91b965d9bc5ecc7b24e31070378d8f1d) 2007-09-13 08:08:18 +04:00			`bool need_takeover_run;`
- merge from ronnie - add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring (This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b) 2007-09-14 03:49:12 +04:00			`bool need_recovery;`
- catch ESTALE in the recovery lock by trying a read() - priortise nodes that are unbanned and healthy in the election (This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f) 2007-10-05 07:28:21 +04:00			`uint32_t node_flags;`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`struct tevent_timer *send_election_te;`
			`struct tevent_timer *election_timeout;`
recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`struct srvid_requests *reallocate_requests;`
ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`struct ctdb_op_state *takeover_run;`
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`struct ctdb_op_state *recovery;`
ctdb-daemon: Rename struct ctdb_control_get_ifaces to ctdb_iface_list_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 11:43:48 +03:00			`struct ctdb_iface_list_old *ifaces;`
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`uint32_t *force_rebalance_nodes;`
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`struct ctdb_node_capabilities *caps;`
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00			`bool frozen_on_inactive;`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`struct ctdb_cluster_mutex_handle *recovery_lock_handle;`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`};`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
make recovery daemon values tunable (This used to be ctdb commit ec29dbf2f5110428df8b97801443ba7addf61353) 2007-06-04 14:22:44 +04:00			`#define CONTROL_TIMEOUT() timeval_current_ofs(ctdb->tunable.recover_timeout, 0)`
added health monitoring logic to ctdb, so a node loses its public IP address if one of the sybsystem event scripts reports a problem (This used to be ctdb commit c7a089256d86cec21097453bce5acbccee87413f) 2007-06-06 04:25:46 +04:00			`#define MONITOR_TIMEOUT() timeval_current_ofs(ctdb->tunable.recover_interval, 0)`
raise the control timeout in recovery (This used to be ctdb commit 43424ff66daf28c202c12982f20a9f662b6fb125) 2007-05-24 07:49:27 +04:00
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_restart_recd(struct tevent_context *ev,`
			`struct tevent_timer *te, struct timeval t,`
			`void *private_data);`
convert much of the recovery logic to be async and parallel across all nodes (This used to be ctdb commit 8b72a02bf1045d8befb342a4111ca1316889262e) 2008-01-05 01:35:43 +03:00
added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00			`/*`
			`ban a node for a period of time`
			`*/`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`static void ctdb_ban_node(struct ctdb_recoverd *rec, uint32_t pnn, uint32_t ban_time)`
added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00			`{`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`int ret;`
added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-daemon: Rename struct ctdb_ban_time to ctdb_ban_state Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:18:33 +03:00			`struct ctdb_ban_state bantime;`

change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`if (!ctdb_validate_pnn(ctdb, pnn)) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Bad pnn %u in ctdb_ban_node\n", pnn));`
handle CTDB_CURRENT_NODE in ban commands (This used to be ctdb commit fefb53f1d22c5458a1e107f8352818aee87983de) 2007-06-07 10:48:31 +04:00			`return;`
			`}`

recoverd: Print banning message only after verifying pnn Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 4be8dff3a4451192f838497b4747273685959bed) 2013-06-24 08:18:58 +04:00			`DEBUG(DEBUG_NOTICE,("Banning node %u for %u seconds\n", pnn, ban_time));`

new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`bantime.pnn = pnn;`
			`bantime.time = ban_time;`
add log output for when ctdb_ban_node() and ctdb_unban_node() are called when these functions are called to ban or unban a node make sure we update the CTDB_NODE_BANNED flag in rec->node_flags since this field and flag are checked during the election process (This used to be ctdb commit 740c632ae96a2d34327d1b575780aaf079d93f4f) 2007-11-23 04:36:14 +03:00
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ret = ctdb_ctrl_set_ban(ctdb, CONTROL_TIMEOUT(), pnn, &bantime);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR,(__location__ " Failed to ban node %u\n", pnn));`
rework banning/unbanning nodes ctdb_recoverd.c Always handle banning/unbanning locally on the node that is being banned/unbanned instead of on the recovery master. This means that if a ban request comes in to the recovery master for a remote node, we pass the request on to the remote node instead of setting up the ban and ban timeouts locally. ctdb.c send ban/unban requests to the node being banned/unbanned instead of to the recmaster (This used to be ctdb commit 880dd9f5fd0b91e450da93e195cc5c62cb1dcd6e) 2007-12-03 07:45:53 +03:00			`return;`
use a priority time for the election data, not just the vnn (This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d) 2007-06-07 12:37:27 +04:00			`}`

added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00			`}`

add async versions of the freeze node control and freeze all nodes in parallell (This used to be ctdb commit f34e89f54d9f4380e76eb1b5b2385a4d8500b505) 2007-08-27 04:31:22 +04:00			`enum monitor_result { MONITOR_OK, MONITOR_RECOVERY_NEEDED, MONITOR_ELECTION_NEEDED, MONITOR_FAILED};`


add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00			`/*`
			`remember the trouble maker`
			`*/`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`static void ctdb_set_culprit_count(struct ctdb_recoverd *rec, uint32_t culprit, uint32_t count)`
add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00			`{`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`struct ctdb_context *ctdb = talloc_get_type(rec->ctdb, struct ctdb_context);`
			`struct ctdb_banning_state *ban_state;`

			`if (culprit >= ctdb->num_nodes) {`
			`DEBUG(DEBUG_ERR,("Trying to set culprit %u but num_nodes is %u\n", culprit, ctdb->num_nodes));`
			`return;`
			`}`

recoverd: Do not set banning credits on a node if current node is inactive If the current node is banned or stopped, then it should not assign banning credits to other nodes since the current node will not have up-to-date flags of other nodes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 38304f88e0c634e97d4687c25adef975f71537b8) 2013-06-28 08:10:47 +04:00			`/* If we are banned or stopped, do not set other nodes as culprits */`
			`if (rec->node_flags & NODE_FLAGS_INACTIVE) {`
			`DEBUG(DEBUG_NOTICE, ("This node is INACTIVE, cannot set culprit node %u\n", culprit));`
			`return;`
			`}`

new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`if (ctdb->nodes[culprit]->ban_state == NULL) {`
			`ctdb->nodes[culprit]->ban_state = talloc_zero(ctdb->nodes[culprit], struct ctdb_banning_state);`
			`CTDB_NO_MEMORY_VOID(ctdb, ctdb->nodes[culprit]->ban_state);`
add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00
			`}`
			`ban_state = ctdb->nodes[culprit]->ban_state;`
			`if (timeval_elapsed(&ban_state->last_reported_time) > ctdb->tunable.recovery_grace_period) {`
			`/* this was the first time in a long while this node`
			`misbehaved so we will forgive any old transgressions.`
			`*/`
			`ban_state->count = 0;`
add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00			`}`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00
			`ban_state->count += count;`
			`ban_state->last_reported_time = timeval_current();`
			`rec->last_culprit_node = culprit;`
add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00			`}`

If we can not pull a database from a node during recovery, mark this node as a "culprit" so that it will eventually become banned. (This used to be ctdb commit 69dc3bf60b86d8df6dc5c7c6ebf303e847fb2ba9) 2009-04-24 07:58:32 +04:00			`/*`
			`remember the trouble maker`
			`*/`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`static void ctdb_set_culprit(struct ctdb_recoverd *rec, uint32_t culprit)`
If we can not pull a database from a node during recovery, mark this node as a "culprit" so that it will eventually become banned. (This used to be ctdb commit 69dc3bf60b86d8df6dc5c7c6ebf303e847fb2ba9) 2009-04-24 07:58:32 +04:00			`{`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit_count(rec, culprit, 1);`
If we can not pull a database from a node during recovery, mark this node as a "culprit" so that it will eventually become banned. (This used to be ctdb commit 69dc3bf60b86d8df6dc5c7c6ebf303e847fb2ba9) 2009-04-24 07:58:32 +04:00			`}`
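
The credit bookkeeping in ctdb_set_culprit_count() can be looked at in isolation. What follows is a minimal sketch, not the CTDB implementation: `struct ban_state_sketch` and `add_culprit_credits()` are hypothetical names, and time is passed in as plain seconds rather than a `struct timeval` so that the grace-period reset is easy to exercise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ctdb_banning_state. */
struct ban_state_sketch {
	uint32_t count;        /* accumulated culprit credits */
	double last_reported;  /* time of the last report, in seconds */
};

/*
 * Add `count` credits at time `now`. Credits collected before a quiet
 * spell longer than `grace_period` seconds are forgiven first.
 * Returns the new credit total.
 */
static uint32_t add_culprit_credits(struct ban_state_sketch *bs, double now,
				    double grace_period, uint32_t count)
{
	assert(bs != NULL);

	if (now - bs->last_reported > grace_period) {
		bs->count = 0;
	}
	bs->count += count;
	bs->last_reported = now;
	return bs->count;
}
```

Reports that arrive within the grace period accumulate; a report that arrives after a longer quiet spell starts the total again from its own count.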
add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`/*`
ctdb-recmaster: Update capabilities before calling first election Capabilities are used when computing an election result so having them up-to-date seems like a good idea. Also update several instances of an ambiguous comment. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-27 07:09:33 +03:00			`Retrieve capabilities from all connected nodes`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`*/`
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`static int update_capabilities(struct ctdb_recoverd *rec,`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap)`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`{`
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`uint32_t *capp;`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`TALLOC_CTX *tmp_ctx;`
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`struct ctdb_node_capabilities *caps;`
			`struct ctdb_context *ctdb = rec->ctdb;`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00
ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`tmp_ctx = talloc_new(rec);`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`CTDB_NO_MEMORY(ctdb, tmp_ctx);`

ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`caps = ctdb_get_capabilities(ctdb, tmp_ctx,`
			`CONTROL_TIMEOUT(), nodemap);`

			`if (caps == NULL) {`
			`DEBUG(DEBUG_ERR,`
			`(__location__ " Failed to get node capabilities\n"));`
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`talloc_free(tmp_ctx);`
			`return -1;`
			`}`

ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-07-31 09:26:03 +04:00			`capp = ctdb_get_node_capabilities(caps, ctdb_get_pnn(ctdb));`
			`if (capp == NULL) {`
			`DEBUG(DEBUG_ERR,`
			`(__location__`
			`" Capabilities don't include current node.\n"));`
			`talloc_free(tmp_ctx);`
			`return -1;`
			`}`
			`ctdb->capabilities = *capp;`

			`TALLOC_FREE(rec->caps);`
			`rec->caps = talloc_steal(rec, caps);`

Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00			`talloc_free(tmp_ctx);`
			`return 0;`
			`}`

implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`change recovery mode on all nodes`
			`*/`
ctdb-recoverd: Do not freeze databases for election If election occurs during SMB activity, then trying to freeze all the databases can cause samba/ctdb deadlock which parallel database recovery is trying to avoid. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-06 03:52:06 +03:00			`static int set_recovery_mode(struct ctdb_context *ctdb,`
			`struct ctdb_recoverd *rec,`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap,`
ctdb-recoverd: Drop code to freeze databases from set_recovery_mode() This function is called only once from force_election() and does not require freezing of databases. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-09-13 08:45:54 +03:00			`uint32_t rec_mode)`
break out the setting/clearing of recovery mode into a dedicated helper function (This used to be ctdb commit dba4e4f8aa4f2fde1e9f8d93bdf3a33f7de8ce18) 2007-05-06 03:53:12 +04:00			`{`
new simpler and much faster recovery code based on tdb transactions (This used to be ctdb commit 9ef2268a1674b01f60c58fed72af8ac982fe77a3) 2008-01-06 04:38:01 +03:00			`TDB_DATA data;`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`uint32_t *nodes;`
			`TALLOC_CTX *tmp_ctx;`

			`tmp_ctx = talloc_new(ctdb);`
			`CTDB_NO_MEMORY(ctdb, tmp_ctx);`

add a callback for failed nodes to the async control helper. this callback is called for every node where the control failed (or timed out) when we issue the start recovery control from recovery master, set any node that fails as a culprit so it will eventually be banned (This used to be ctdb commit 72f89bac13cbe8c3ca3e7a942469cd2ff25abba2) 2008-06-12 10:53:36 +04:00			`nodes = list_of_active_nodes(ctdb, nodemap, tmp_ctx, true);`
ctdb-recoverd: Set recovery mode before freezing databases Setting recovery mode to active is the only correct way to inform recovery daemon to run database recovery. Only freezing databases without setting recovery mode should not trigger database recovery, as this mechanism is used in tool to implement wipedb/restoredb commands. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2014-05-06 08:24:52 +04:00
			`data.dsize = sizeof(uint32_t);`
			`data.dptr = (unsigned char *)&rec_mode;`

			`if (ctdb_client_async_control(ctdb, CTDB_CONTROL_SET_RECMODE,`
			`nodes, 0,`
			`CONTROL_TIMEOUT(),`
			`false, data,`
			`NULL, NULL,`
			`NULL) != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode. Recovery failed.\n"));`
			`talloc_free(tmp_ctx);`
			`return -1;`
			`}`

merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`talloc_free(tmp_ctx);`
break out the setting/clearing of recovery mode into a dedicated helper function (This used to be ctdb commit dba4e4f8aa4f2fde1e9f8d93bdf3a33f7de8ce18) 2007-05-06 03:53:12 +04:00			`return 0;`
			`}`
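
set_recovery_mode() passes its payload as a pointer-plus-length blob that refers directly to the `rec_mode` argument. The sketch below illustrates that pattern under stated assumptions: `struct blob_sketch`, `wrap_u32()` and `unwrap_u32()` are hypothetical names, and the receiving side is illustrative only, not how the CTDB daemon decodes the control.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for TDB_DATA: a pointer plus a length. */
struct blob_sketch {
	unsigned char *dptr;
	size_t dsize;
};

/*
 * Describe the bytes of *value without copying them. The blob is only
 * valid for as long as *value itself is.
 */
static struct blob_sketch wrap_u32(uint32_t *value)
{
	struct blob_sketch b;

	assert(value != NULL);
	b.dptr = (unsigned char *)value;
	b.dsize = sizeof(*value);
	return b;
}

/*
 * Illustrative receiver: check the length, then copy the value out.
 * Returns 0 on success, -1 if the blob is not exactly one uint32_t.
 */
static int unwrap_u32(struct blob_sketch b, uint32_t *out)
{
	if (b.dptr == NULL || b.dsize != sizeof(*out)) {
		return -1;
	}
	memcpy(out, b.dptr, sizeof(*out));
	return 0;
}
```

Because nothing is copied, the blob must not be used after the variable it points at goes out of scope.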

implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`ensure all other nodes have attached to any databases that we have`
			`*/`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`static int create_missing_remote_databases(struct ctdb_context *ctdb, struct ctdb_node_map_old *nodemap,`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`uint32_t pnn, struct ctdb_dbid_map_old *dbmap, TALLOC_CTX *mem_ctx)`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`{`
recovery daemon this program is a client to the local ctdb daemon every second it pulls all vnnmap and nodemaps from all nodes that are available and checks if a recovery is required a recovery is required if : * all nodes do NOT have an identical vnnmap and generation * all nodes do NOT have an identical nodemap * there are active nodes that are NOT in the nodemap * there are nodes in the nodemap that are NOT active During recovery, the recovery tool will also make sure that all nodes know about and have created all databases. (This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26) 2007-05-04 09:21:40 +04:00			`int i, j, db, ret;`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`struct ctdb_dbid_map_old *remote_dbmap;`
recovery daemon this program is a client to the local ctdb daemon every second it pulls all vnnmap and nodemaps from all nodes that are available and checks if a recovery is required a recovery is required if : * all nodes do NOT have an identical vnnmap and generation * all nodes do NOT have an identical nodemap * there are active nodes that are NOT in the nodemap * there are nodes in the nodemap that are NOT active During recovery, the recovery tool will also make sure that all nodes know about and have created all databases. (This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26) 2007-05-04 09:21:40 +04:00
update to rhe recovery daemon ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the cluster it crashes the recovery daemon afterwards with a SEGV but no useful stack backtrace (This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c) 2007-05-06 00:58:01 +04:00			`/* verify that all other nodes have all our databases */`
			`for (j=0; j<nodemap->num; j++) {`
Fix various spelling errors Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Fri Nov 6 13:43:45 CET 2015 on sn-devel-104 2015-07-27 00:02:57 +03:00			`/* we don't need to check ourselves */`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`if (nodemap->nodes[j].pnn == pnn) {`
update to rhe recovery daemon ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the cluster it crashes the recovery daemon afterwards with a SEGV but no useful stack backtrace (This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c) 2007-05-06 00:58:01 +04:00			`continue;`
			`}`
Fix various spelling errors Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Fri Nov 6 13:43:45 CET 2015 on sn-devel-104 2015-07-27 00:02:57 +03:00			`/* don't check nodes that are unavailable */`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
update to rhe recovery daemon ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the cluster it crashes the recovery daemon afterwards with a SEGV but no useful stack backtrace (This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c) 2007-05-06 00:58:01 +04:00			`continue;`
			`}`

change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`ret = ctdb_ctrl_getdbmap(ctdb, CONTROL_TIMEOUT(), nodemap->nodes[j].pnn,`
formatting fixes (This used to be ctdb commit ed63a2057698aed3931762605b2ea2368681af2b) 2007-06-07 12:39:37 +04:00			`mem_ctx, &remote_dbmap);`
update to rhe recovery daemon ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the cluster it crashes the recovery daemon afterwards with a SEGV but no useful stack backtrace (This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c) 2007-05-06 00:58:01 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to get dbids from node %u\n", nodemap->nodes[j].pnn));`
update to rhe recovery daemon ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the cluster it crashes the recovery daemon afterwards with a SEGV but no useful stack backtrace (This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c) 2007-05-06 00:58:01 +04:00			`return -1;`
			`}`

			`/* step through all local databases */`
			`for (db=0; db<dbmap->num;db++) {`
			`const char *name;`


			`for (i=0;i<remote_dbmap->num;i++) {`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`if (dbmap->dbs[db].db_id == remote_dbmap->dbs[i].db_id) {`
update to rhe recovery daemon ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the cluster it crashes the recovery daemon afterwards with a SEGV but no useful stack backtrace (This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c) 2007-05-06 00:58:01 +04:00			`break;`
			`}`
			`}`
			`/* the remote node already has this database */`
			`if (i!=remote_dbmap->num) {`
			`continue;`
			`}`
			`/* ok so we need to create this database */`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`ret = ctdb_ctrl_getdbname(ctdb, CONTROL_TIMEOUT(), pnn,`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`dbmap->dbs[db].db_id, mem_ctx,`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`&name);`
update to rhe recovery daemon ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the cluster it crashes the recovery daemon afterwards with a SEGV but no useful stack backtrace (This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c) 2007-05-06 00:58:01 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to get dbname from node %u\n", pnn));`
update to rhe recovery daemon ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the cluster it crashes the recovery daemon afterwards with a SEGV but no useful stack backtrace (This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c) 2007-05-06 00:58:01 +04:00			`return -1;`
			`}`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`ret = ctdb_ctrl_createdb(ctdb, CONTROL_TIMEOUT(),`
			`nodemap->nodes[j].pnn,`
			`mem_ctx, name,`
ctdb-client: Optionally return database id from ctdb_ctrl_createdb() BUG: https://bugzilla.samba.org/show_bug.cgi?id=12978 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-08-23 05:09:22 +03:00			`dbmap->dbs[db].flags, NULL);`
update to rhe recovery daemon ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the cluster it crashes the recovery daemon afterwards with a SEGV but no useful stack backtrace (This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c) 2007-05-06 00:58:01 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to create remote db:%s\n", name));`
update to rhe recovery daemon ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the cluster it crashes the recovery daemon afterwards with a SEGV but no useful stack backtrace (This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c) 2007-05-06 00:58:01 +04:00			`return -1;`
			`}`
			`}`
			`}`
recovery daemon this program is a client to the local ctdb daemon every second it pulls all vnnmap and nodemaps from all nodes that are available and checks if a recovery is required a recovery is required if : * all nodes do NOT have an identical vnnmap and generation * all nodes do NOT have an identical nodemap * there are active nodes that are NOT in the nodemap * there are nodes in the nodemap that are NOT active During recovery, the recovery tool will also make sure that all nodes know about and have created all databases. (This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26) 2007-05-04 09:21:40 +04:00
add a helper function to create all missing remote databases detected during recovery (This used to be ctdb commit 04758c6f7d8f61260be6d2472380cb7904984427) 2007-05-06 04:04:37 +04:00			`return 0;`
			`}`


implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`ensure we are attached to any databases that anyone else is attached to`
			`*/`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`static int create_missing_local_databases(struct ctdb_context *ctdb, struct ctdb_node_map_old *nodemap,`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`uint32_t pnn, struct ctdb_dbid_map_old **dbmap, TALLOC_CTX *mem_ctx)`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`{`
			`int i, j, db, ret;`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`struct ctdb_dbid_map_old *remote_dbmap;`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00
			`/* verify that we have all databases any other node has */`
			`for (j=0; j<nodemap->num; j++) {`
Fix various spelling errors Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Fri Nov 6 13:43:45 CET 2015 on sn-devel-104 2015-07-27 00:02:57 +03:00			`/* we don't need to check ourselves */`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`if (nodemap->nodes[j].pnn == pnn) {`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`continue;`
			`}`
Fix various spelling errors Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Fri Nov 6 13:43:45 CET 2015 on sn-devel-104 2015-07-27 00:02:57 +03:00			`/* don't check nodes that are unavailable */`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`continue;`
			`}`

change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`ret = ctdb_ctrl_getdbmap(ctdb, CONTROL_TIMEOUT(), nodemap->nodes[j].pnn,`
formatting fixes (This used to be ctdb commit ed63a2057698aed3931762605b2ea2368681af2b) 2007-06-07 12:39:37 +04:00			`mem_ctx, &remote_dbmap);`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to get dbids from node %u\n", nodemap->nodes[j].pnn));`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`return -1;`
			`}`

			`/* step through all databases on the remote node */`
			`for (db=0; db<remote_dbmap->num;db++) {`
			`const char *name;`

			`for (i=0;i<(*dbmap)->num;i++) {`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`if (remote_dbmap->dbs[db].db_id == (*dbmap)->dbs[i].db_id) {`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`break;`
			`}`
			`}`
			`/* we already have this db locally */`
			`if (i!=(*dbmap)->num) {`
			`continue;`
			`}`
			`/* ok so we need to create this database and`
			`rebuild dbmap`
			`*/`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`ret = ctdb_ctrl_getdbname(ctdb, CONTROL_TIMEOUT(), nodemap->nodes[j].pnn,`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`remote_dbmap->dbs[db].db_id, mem_ctx, &name);`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to get dbname from node %u\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn));`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`return -1;`
			`}`
ctdb-client: Fix ctdb_ctrl_createdb() to use database flags BUG: https://bugzilla.samba.org/show_bug.cgi?id=12978 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-08-18 06:50:39 +03:00			`ret = ctdb_ctrl_createdb(ctdb, CONTROL_TIMEOUT(), pnn,`
			`mem_ctx, name,`
ctdb-client: Optionally return database id from ctdb_ctrl_createdb() BUG: https://bugzilla.samba.org/show_bug.cgi?id=12978 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-08-23 05:09:22 +03:00			`remote_dbmap->dbs[db].flags, NULL);`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to create local db:%s\n", name));`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`return -1;`
			`}`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`ret = ctdb_ctrl_getdbmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx, dbmap);`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to reread dbmap on node %u\n", pnn));`
create a helper function to make sure the local node that does recovery has all the databases that exist on any other remote node (This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4) 2007-05-06 04:12:42 +04:00			`return -1;`
			`}`
			`}`
			`}`

			`return 0;`
			`}`
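
The membership check in `create_missing_local_databases` above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the daemon's code: `struct db_entry`, `struct db_map` and `find_missing_db_ids` are hypothetical names, and the real function creates each missing database and re-reads the local map over the control channel instead of collecting ids.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the ctdb dbid map types. */
struct db_entry {
	uint32_t db_id;
	uint8_t flags;
};

struct db_map {
	size_t num;
	const struct db_entry *dbs;
};

/*
 * Copy into `out` the ids present in `remote` but absent from `local`.
 * At most `out_len` ids are written; the return value is the number
 * written.
 */
static size_t find_missing_db_ids(const struct db_map *local,
				  const struct db_map *remote,
				  uint32_t *out, size_t out_len)
{
	size_t found = 0;
	size_t db, i;

	for (db = 0; db < remote->num && found < out_len; db++) {
		for (i = 0; i < local->num; i++) {
			if (remote->dbs[db].db_id == local->dbs[i].db_id) {
				break;
			}
		}
		/* the inner loop ran to the end: no local match */
		if (i == local->num) {
			out[found++] = remote->dbs[db].db_id;
		}
	}
	return found;
}
```

The "loop index equals the element count" test is the same idiom the function above uses (`i != (*dbmap)->num`) to decide whether a database already exists locally.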

implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`update flags on all active nodes`
			`*/`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`static int update_flags_on_all_nodes(struct ctdb_context *ctdb, struct ctdb_node_map_old *nodemap, uint32_t pnn, uint32_t flags)`
verify that the recmaster has the correct flags for us and if not tell the recmaster what the flags should be (This used to be ctdb commit 3387597926ad71e4140cc504b828486d99a3ec8e) 2008-06-26 05:08:09 +04:00			`{`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`int ret;`
verify that the recmaster has the correct flags for us and if not tell the recmaster what the flags should be (This used to be ctdb commit 3387597926ad71e4140cc504b828486d99a3ec8e) 2008-06-26 05:08:09 +04:00
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`ret = ctdb_ctrl_modflags(ctdb, CONTROL_TIMEOUT(), pnn, flags, ~flags);`
			`if (ret != 0) {`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to update nodeflags on remote nodes\n"));`
			`return -1;`
			`}`
verify that the recmaster has the correct flags for us and if not tell the recmaster what the flags should be (This used to be ctdb commit 3387597926ad71e4140cc504b828486d99a3ec8e) 2008-06-26 05:08:09 +04:00
			`return 0;`
			`}`
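
The call `ctdb_ctrl_modflags(..., flags, ~flags)` above passes one mask of bits to set and one of bits to clear. A minimal sketch of why that pair overwrites the node's flags, assuming the control applies the set mask first and then removes the clear mask; `apply_modflags` and `overwrite_flags` are hypothetical helpers, not part of ctdb:

```c
#include <stdint.h>

/*
 * Assumed semantics of a set/clear flag update: set bits are applied
 * first, then clear bits are removed.
 */
static uint32_t apply_modflags(uint32_t old_flags, uint32_t set, uint32_t clear)
{
	return (old_flags | set) & ~clear;
}

/* Passing (flags, ~flags) replaces the old value with exactly `flags`. */
static uint32_t overwrite_flags(uint32_t old_flags, uint32_t flags)
{
	return apply_modflags(old_flags, flags, ~flags);
}
```

Under that assumption the result no longer depends on the previous flag value, since `(old | flags) & flags` reduces to `flags`.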
create a helper function for recovery to push all local databases out onto the remote nodes (This used to be ctdb commit 1ba76d374652cfa29e56fb77c7190349e42d3bcc) 2007-05-06 04:38:44 +04:00
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`/*`
ensure the recovery daemon is not clagged up by vacuum calls (This used to be ctdb commit ff7e80e247bf5a86adda0ef850d901478449675b) 2008-01-08 13:28:42 +03:00			`called when a vacuum fetch has completed - just free it and do the next one`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`*/`
			`static void vacuum_fetch_callback(struct ctdb_client_call_state *state)`
			`{`
			`talloc_free(state);`
ensure the recovery daemon is not clagged up by vacuum calls (This used to be ctdb commit ff7e80e247bf5a86adda0ef850d901478449675b) 2008-01-08 13:28:42 +03:00			`}`


ctdb-recoverd/vacuum: factor vacuum_fetch_process_one out of vacuum_fetch_loop Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 22:39:00 +03:00			`/**`
			`* Process one element of the vacuum fetch list:`
			`* Migrate it over to us with the special flag`
			`* CTDB_CALL_FLAG_VACUUM_MIGRATION.`
			`*/`
			`static bool vacuum_fetch_process_one(struct ctdb_db_context *ctdb_db,`
			`uint32_t pnn,`
ctdb-daemon: Rename struct ctdb_rec_data to ctdb_rec_data_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:30:30 +03:00			`struct ctdb_rec_data_old *r)`
ctdb-recoverd/vacuum: factor vacuum_fetch_process_one out of vacuum_fetch_loop Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 22:39:00 +03:00			`{`
			`struct ctdb_client_call_state *state;`
			`TDB_DATA data;`
			`struct ctdb_ltdb_header *hdr;`
			`struct ctdb_call call;`

			`ZERO_STRUCT(call);`
			`call.call_id = CTDB_NULL_FUNC;`
			`call.flags = CTDB_IMMEDIATE_MIGRATION;`
			`call.flags |= CTDB_CALL_FLAG_VACUUM_MIGRATION;`

			`call.key.dptr = &r->data[0];`
			`call.key.dsize = r->keylen;`

			`/* ensure we don't block this daemon - just skip a record if we can't get`
			`the chainlock */`
			`if (tdb_chainlock_nonblock(ctdb_db->ltdb->tdb, call.key) != 0) {`
			`return true;`
			`}`

			`data = tdb_fetch(ctdb_db->ltdb->tdb, call.key);`
			`if (data.dptr == NULL) {`
			`tdb_chainunlock(ctdb_db->ltdb->tdb, call.key);`
			`return true;`
			`}`

			`if (data.dsize < sizeof(struct ctdb_ltdb_header)) {`
			`free(data.dptr);`
			`tdb_chainunlock(ctdb_db->ltdb->tdb, call.key);`
			`return true;`
			`}`

			`hdr = (struct ctdb_ltdb_header *)data.dptr;`
			`if (hdr->dmaster == pnn) {`
			`/* it's already local */`
			`free(data.dptr);`
			`tdb_chainunlock(ctdb_db->ltdb->tdb, call.key);`
			`return true;`
			`}`

			`free(data.dptr);`

			`state = ctdb_call_send(ctdb_db, &call);`
			`tdb_chainunlock(ctdb_db->ltdb->tdb, call.key);`
			`if (state == NULL) {`
			`DEBUG(DEBUG_ERR,(__location__ " Failed to setup vacuum fetch call\n"));`
			`return false;`
			`}`
			`state->async.fn = vacuum_fetch_callback;`
			`state->async.private_data = NULL;`

			`return true;`
			`}`

ensure the recovery daemon is not clagged up by vacuum calls (This used to be ctdb commit ff7e80e247bf5a86adda0ef850d901478449675b) 2008-01-08 13:28:42 +03:00
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`/*`
			`handler for vacuum fetch`
			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void vacuum_fetch_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`struct ctdb_marshall_buffer *recs;`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`int ret, i;`
			`TALLOC_CTX *tmp_ctx = talloc_new(ctdb);`
			`const char *name;`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`struct ctdb_dbid_map_old *dbmap=NULL;`
ctdb-client: Fix ctdb_attach() to use database flags BUG: https://bugzilla.samba.org/show_bug.cgi?id=12978 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri Aug 25 13:32:58 CEST 2017 on sn-devel-144 2017-08-18 07:00:47 +03:00			`uint8_t db_flags = 0;`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`struct ctdb_db_context *ctdb_db;`
ctdb-daemon: Rename struct ctdb_rec_data to ctdb_rec_data_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:30:30 +03:00			`struct ctdb_rec_data_old *r;`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00
rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`recs = (struct ctdb_marshall_buffer *)data.dptr;`
ensure the recovery daemon is not clagged up by vacuum calls (This used to be ctdb commit ff7e80e247bf5a86adda0ef850d901478449675b) 2008-01-08 13:28:42 +03:00
			`if (recs->count == 0) {`
ctdb-recoverd/vacuum: add common exit point to vacuum_fetch_handler Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 22:57:54 +03:00			`goto done;`
ensure the recovery daemon is not clagged up by vacuum calls (This used to be ctdb commit ff7e80e247bf5a86adda0ef850d901478449675b) 2008-01-08 13:28:42 +03:00			`}`

added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`/* work out if the database is persistent */`
			`ret = ctdb_ctrl_getdbmap(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, tmp_ctx, &dbmap);`
			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to get dbids from local node\n"));`
ctdb-recoverd/vacuum: add common exit point to vacuum_fetch_handler Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 22:57:54 +03:00			`goto done;`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`}`

			`for (i=0;i<dbmap->num;i++) {`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`if (dbmap->dbs[i].db_id == recs->db_id) {`
ctdb-client: Fix ctdb_attach() to use database flags BUG: https://bugzilla.samba.org/show_bug.cgi?id=12978 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri Aug 25 13:32:58 CEST 2017 on sn-devel-144 2017-08-18 07:00:47 +03:00			`db_flags = dbmap->dbs[i].flags;`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`break;`
			`}`
			`}`
			`if (i == dbmap->num) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to find db_id 0x%x on local node\n", recs->db_id));`
ctdb-recoverd/vacuum: add common exit point to vacuum_fetch_handler Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 22:57:54 +03:00			`goto done;`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`}`

			`/* find the name of this database */`
			`if (ctdb_ctrl_getdbname(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, recs->db_id, tmp_ctx, &name) != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to get name of db 0x%x\n", recs->db_id));`
ctdb-recoverd/vacuum: add common exit point to vacuum_fetch_handler Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 22:57:54 +03:00			`goto done;`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`}`

			`/* attach to it */`
ctdb-client: Fix ctdb_attach() to use database flags BUG: https://bugzilla.samba.org/show_bug.cgi?id=12978 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri Aug 25 13:32:58 CEST 2017 on sn-devel-144 2017-08-18 07:00:47 +03:00			`ctdb_db = ctdb_attach(ctdb, CONTROL_TIMEOUT(), name, db_flags);`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`if (ctdb_db == NULL) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to attach to database '%s'\n", name));`
ctdb-recoverd/vacuum: add common exit point to vacuum_fetch_handler Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 22:57:54 +03:00			`goto done;`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`}`

ctdb-daemon: Rename struct ctdb_rec_data to ctdb_rec_data_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:30:30 +03:00			`r = (struct ctdb_rec_data_old *)&recs->data[0];`
ctdb-recoverd/vacuum: Remove vacuum_info structure For all the records listed in VACUUM_FETCH, migration requests are sent immediately without waiting. This means there can only be a single VACUUM_FETCH processing active at a time. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2015-06-05 09:35:48 +03:00			`while (recs->count) {`
ctdb-recoverd/vacuum: move fetch loop back into fetch handler. With the processing of one element factored out, it is more natural to have the actual loop inside the handler function. This also makes the talloc/free bracked around the loop more obvious. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 23:17:03 +03:00			`bool ok;`

ctdb-recoverd/vacuum: Remove vacuum_info structure For all the records listed in VACUUM_FETCH, migration requests are sent immediately without waiting. This means there can only be a single VACUUM_FETCH processing active at a time. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2015-06-05 09:35:48 +03:00			`ok = vacuum_fetch_process_one(ctdb_db, rec->ctdb->pnn, r);`
ctdb-recoverd/vacuum: move fetch loop back into fetch handler. With the processing of one element factored out, it is more natural to have the actual loop inside the handler function. This also makes the talloc/free bracked around the loop more obvious. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 23:17:03 +03:00			`if (!ok) {`
			`break;`
			`}`

ctdb-daemon: Rename struct ctdb_rec_data to ctdb_rec_data_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:30:30 +03:00			`r = (struct ctdb_rec_data_old *)(r->length + (uint8_t *)r);`
ctdb-recoverd/vacuum: Remove vacuum_info structure For all the records listed in VACUUM_FETCH, migration requests are sent immediately without waiting. This means there can only be a single VACUUM_FETCH processing active at a time. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2015-06-05 09:35:48 +03:00			`recs->count--;`
ctdb-recoverd/vacuum: move fetch loop back into fetch handler. With the processing of one element factored out, it is more natural to have the actual loop inside the handler function. This also makes the talloc/free bracked around the loop more obvious. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 23:17:03 +03:00			`}`

ctdb-recoverd/vacuum: add common exit point to vacuum_fetch_handler Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-06-02 22:57:54 +03:00			`done:`
fix some slow memory leaks in the vacuuming handler in the recovery daemon (This used to be ctdb commit 95bf36559d62f29e6f538f3a173b504ef3258341) 2008-09-16 01:55:57 +04:00			`talloc_free(tmp_ctx);`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`}`
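
The loop in `vacuum_fetch_handler` above steps through a packed buffer by adding each record's `length` to the current record pointer. A minimal sketch of that traversal, assuming a hypothetical `struct packed_rec` header (the real `ctdb_rec_data_old` layout is not reproduced here). Unlike the handler, the sketch also bounds-checks every step against the buffer size:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; `length` covers the header plus payload. */
struct packed_rec {
	uint32_t length;
	uint32_t keylen;
};

/* Append one record with a `keylen`-byte key at `off`; return the new offset. */
static size_t packed_rec_append(uint8_t *buf, size_t off,
				const void *key, uint32_t keylen)
{
	struct packed_rec hdr;

	hdr.length = (uint32_t)sizeof(hdr) + keylen;
	hdr.keylen = keylen;
	memcpy(buf + off, &hdr, sizeof(hdr));
	memcpy(buf + off + sizeof(hdr), key, keylen);
	return off + hdr.length;
}

/*
 * Walk `count` records and return the sum of their key lengths, or
 * (size_t)-1 if a record would run past `buflen`.
 */
static size_t packed_rec_total_keylen(const uint8_t *buf, size_t buflen,
				      uint32_t count)
{
	size_t off = 0;
	size_t total = 0;

	while (count > 0) {
		struct packed_rec hdr;

		if (buflen - off < sizeof(hdr)) {
			return (size_t)-1;
		}
		/* memcpy avoids unaligned access to the header */
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.length < sizeof(hdr) || hdr.length > buflen - off) {
			return (size_t)-1;
		}
		total += hdr.keylen;
		off += hdr.length;	/* step to the next record */
		count--;
	}
	return total;
}
```

Tracking an offset instead of a raw pointer keeps the bounds check simple, because `off` never exceeds `buflen` while the loop runs.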

added admin commands to ban/unban nodes (This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad) 2007-06-07 10:34:33 +04:00
ctdb-recoverd: Detach database from recovery daemon As part of vacuuming, recoverd attaches to databases to migrate records. When detaching a database from main daemon, it should be removed from recovery daemon also. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Apr 23 17:05:45 CEST 2014 on sn-devel-104 2014-04-22 09:24:49 +04:00			`/*`
			`* handler for database detach`
			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void detach_database_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
ctdb-recoverd: Detach database from recovery daemon As part of vacuuming, recoverd attaches to databases to migrate records. When detaching a database from main daemon, it should be removed from recovery daemon also. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Apr 23 17:05:45 CEST 2014 on sn-devel-104 2014-04-22 09:24:49 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-recoverd: Detach database from recovery daemon As part of vacuuming, recoverd attaches to databases to migrate records. When detaching a database from main daemon, it should be removed from recovery daemon also. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Apr 23 17:05:45 CEST 2014 on sn-devel-104 2014-04-22 09:24:49 +04:00			`uint32_t db_id;`
			`struct ctdb_db_context *ctdb_db;`

			`if (data.dsize != sizeof(db_id)) {`
			`return;`
			`}`
			`db_id = *(uint32_t *)data.dptr;`

			`ctdb_db = find_ctdb_db(ctdb, db_id);`
			`if (ctdb_db == NULL) {`
			`/* database is not attached */`
			`return;`
			`}`

			`DLIST_REMOVE(ctdb->db_list, ctdb_db);`

			`DEBUG(DEBUG_NOTICE, ("Detached from database '%s'\n",`
			`ctdb_db->db_name));`
			`talloc_free(ctdb_db);`
			`}`

add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`/*`
			`called when ctdb_wait_timeout should finish`
			`*/`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_wait_handler(struct tevent_context *ev,`
			`struct tevent_timer *te,`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`struct timeval yt, void *p)`
			`{`
			`uint32_t *timed_out = (uint32_t *)p;`
			`(*timed_out) = 1;`
			`}`

			`/*`
			`wait for a given number of seconds`
			`*/`
speed startup: don't wait a full recovery interval if we've already waited We currently sleep for one second, whether or not we've already slept. Change this to sleep for the remainder of the second, if at all. Seconds between ctdbd first log message and node healthy: BEFORE: 18.09 AFTER: 17.08 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fde760b5f39c77172308a583da4c2443b71541c9) 2010-06-22 17:20:35 +04:00			`static void ctdb_wait_timeout(struct ctdb_context *ctdb, double secs)`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`{`
			`uint32_t timed_out = 0;`
speed startup: don't wait a full recovery interval if we've already waited We currently sleep for one second, whether or not we've already slept. Change this to sleep for the remainder of the second, if at all. Seconds between ctdbd first log message and node healthy: BEFORE: 18.09 AFTER: 17.08 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fde760b5f39c77172308a583da4c2443b71541c9) 2010-06-22 17:20:35 +04:00			`time_t usecs = (secs - (time_t)secs) * 1000000;`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_add_timer(ctdb->ev, ctdb, timeval_current_ofs(secs, usecs),`
			`ctdb_wait_handler, &timed_out);`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`while (!timed_out) {`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_loop_once(ctdb->ev);`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`}`
			`}`
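
The wait helper above takes a fractional number of seconds and hands the timer a whole-second part plus a microsecond part. A minimal sketch of that split, assuming only standard C; `struct secs_split` and `split_secs()` are hypothetical names used for illustration and are not part of ctdb:

```c
#include <assert.h>

/* Hypothetical container for the two parts of a timer offset. */
struct secs_split {
	long sec;   /* whole seconds */
	long usec;  /* remaining microseconds, 0..999999 */
};

/* Split a non-negative fractional second count into whole seconds and
 * microseconds. The cast truncates toward zero, so the fractional
 * remainder is what is left after removing the whole seconds. */
static struct secs_split split_secs(double secs)
{
	struct secs_split out;

	out.sec = (long)secs;
	out.usec = (long)((secs - (double)out.sec) * 1000000);
	return out;
}
```

With this split, a request for 1.5 seconds becomes 1 second plus 500000 microseconds, which is the shape of offset a timer call such as `timeval_current_ofs()` expects.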

make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`/*`
			`called when an election times out (ends)`
			`*/`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_election_timeout(struct tevent_context *ev,`
			`struct tevent_timer *te,`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`struct timeval t, void *p)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type(p, struct ctdb_recoverd);`
			`rec->election_timeout = NULL;`
speed startup: with --sloppy-start, cut initial election timeout to 1/2 second. Seconds between ctdbd first log message and node healthy: BEFORE: 4.03 AFTER: 2.02 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 8f17731dea4287d4f9b21dc58c1bdf26c8a0e628) 2010-06-22 17:25:20 +04:00			`fast_start = false;`
When we create new election data to send during elections, we must re-read the node flags from the main daemon to catch when the STOPPED flag is changed. (This used to be ctdb commit ca4982c40d81db528fe915d5ecc01fcf7df0b522) 2009-07-17 05:37:03 +04:00
ctdb-recoverd: Don't say "Election timed out" That makes people think there's a problem (and report bugs) so say something a bit less scary instead... Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-06-20 07:36:25 +04:00			`DEBUG(DEBUG_WARNING,("Election period ended\n"));`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`}`


			`/*`
			`wait for an election to finish. It finishes election_timeout seconds after`
			`the last election packet is received`
			`*/`
			`static void ctdb_wait_election(struct ctdb_recoverd *rec)`
			`{`
			`struct ctdb_context *ctdb = rec->ctdb;`
			`while (rec->election_timeout) {`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_loop_once(ctdb->ev);`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`}`
			`}`

sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`/*`
when we as the recovery daemon on the recovery master detects that the flags differ between the local ctdb daemon and the remote node we can force a flags update on all nodes and not just the local daemon (This used to be ctdb commit a924eb89c966ecbae029ca137e06cffd40cc70fd) 2007-11-23 03:31:42 +03:00			`Update our local flags from all remote connected nodes.`
			`This is only run when we are or we believe we are the recovery master`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`*/`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`static int update_local_flags(struct ctdb_recoverd *rec, struct ctdb_node_map_old *nodemap)`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`{`
			`int j;`
dont manipulate ctdb->monitoring_mode directly from the SET_MON_MODE control, instead call ctdb_start/stop_monitoring() ctdb_stop_monitoring() dont allocate a new monitoring context, leave it NULL. Also set the monitoring_mode in this function so that ctdb_stop/start_monitoring() and ->monitoring_mode are kept in sync. Add a debug message to log that we have stopped monitoring. ctdb_start_monitoring() check whether monitoring is already active and make the function idempotent. Create the monitoring context when monitoring is started. Update ->monitoring_mode once the monitoring has been started. Add a debug message to log that we have started monitoring. When we temporarily stop monitoring while running an event script, restart monitoring after the event script wrapper returns instead of in the event script callback. Let monitoring_mode start out as DISABLED and let it be enabled once we call ctdb_start_monitoring. dont check for MONITORING_DISABLED in check_fore_dead_nodes(). If monitoring is disabled, this event handler will not be called. (This used to be ctdb commit 3a93ae8bdcffb1adbd6243844f3058fc742f76aa) 2007-11-30 00:44:34 +03:00			`struct ctdb_context *ctdb = rec->ctdb;`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`TALLOC_CTX *mem_ctx = talloc_new(ctdb);`

			`/* get the nodemap for all active remote nodes and verify`
			`they are the same as for this node`
			`*/`
			`for (j=0; j<nodemap->num; j++) {`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *remote_nodemap=NULL;`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`int ret;`

			`if (nodemap->nodes[j].flags & NODE_FLAGS_DISCONNECTED) {`
			`continue;`
			`}`
			`if (nodemap->nodes[j].pnn == ctdb->pnn) {`
			`continue;`
			`}`

			`ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), nodemap->nodes[j].pnn,`
			`mem_ctx, &remote_nodemap);`
			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to get nodemap from remote node %u\n",`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`nodemap->nodes[j].pnn));`
move ctdb_set_culprit higher up in the file when we are the recmaster and we update the local flags for all the nodes, if one of the nodes fail to respond and give us his flags, set that node as a "culprit" as one of the first things to do in the monitor_cluster loop, check if the current culprit has caused too many (20) failures and if so ban that node. this is for the situation where a remote node may still be CONNECTED but it fails to respond to the getnodemap control causing the recovery master to loop in monitor_cluster aborting the monitoring when the node fails to respond but before anything will trigger a call to do_recovery(). If one or more of the databases or nodes are frozen at this stage, this would lead to smbd being blocked for potentially a longish time. (This used to be ctdb commit 83b0261f2cb453195b86f547d360400103a8b795) 2007-11-28 07:04:20 +03:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`talloc_free(mem_ctx);`
ctdb-recoverd: Simplify return values when updating local flags Change this to return just 0 or -1. It isn't monitoring anything. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-27 14:47:08 +03:00			`return -1;`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`}`
			`if (nodemap->nodes[j].flags != remote_nodemap->nodes[j].flags) {`
If update_local_flags() finds that a node has changed its BANNED status so it differs from what the local ctdb daemon on the recovery master thinks it should be we should call for a re-election (This used to be ctdb commit 21ad6039c31ef5cc0e40a35a41220f91943947cb) 2007-11-23 03:53:06 +03:00			`/* We should tell our daemon about this so it`
add an extra log if we get a modflags control but it doesnt change any flags in update_local_flags() (this is only called if we are or we belive we are the recmaster) when we detect that the flags of a remote node is different from what our local node thinks the flags should be for that remote node we should send a node-flag-changed message to the local daemon so that it updates the flags for that node. (This used to be ctdb commit 36225e4e271f7a4065398253747fb20054f99a53) 2007-11-23 02:52:29 +03:00			`updates its flags or else we will log the same`
			`message again in the next iteration of recovery.`
when we as the recovery daemon on the recovery master detects that the flags differ between the local ctdb daemon and the remote node we can force a flags update on all nodes and not just the local daemon (This used to be ctdb commit a924eb89c966ecbae029ca137e06cffd40cc70fd) 2007-11-23 03:31:42 +03:00			`Since we are the recovery master we can just as`
			`well update the flags on all nodes.`
add an extra log if we get a modflags control but it doesnt change any flags in update_local_flags() (this is only called if we are or we belive we are the recmaster) when we detect that the flags of a remote node is different from what our local node thinks the flags should be for that remote node we should send a node-flag-changed message to the local daemon so that it updates the flags for that node. (This used to be ctdb commit 36225e4e271f7a4065398253747fb20054f99a53) 2007-11-23 02:52:29 +03:00			`*/`
recoverd: When updating flags on nodes, send updated flags and not old flags This was broken by commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa. Instead of a SRVID_SET_NODE_FLAGS message to recovery daemon, a control was sent to the local daemon which in turn informed the recovery daemon. And while doing this change old flags were sent via CONTROL_MODIFY_FLAGS. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7eb2f89979360b6cc98ca9b17c48310277fa89fc) 2013-06-26 09:22:46 +04:00			`ret = ctdb_ctrl_modflags(ctdb, CONTROL_TIMEOUT(), nodemap->nodes[j].pnn, remote_nodemap->nodes[j].flags, ~remote_nodemap->nodes[j].flags);`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to update nodeflags on remote nodes\n"));`
			`return -1;`
			`}`
add an extra log if we get a modflags control but it doesnt change any flags in update_local_flags() (this is only called if we are or we belive we are the recmaster) when we detect that the flags of a remote node is different from what our local node thinks the flags should be for that remote node we should send a node-flag-changed message to the local daemon so that it updates the flags for that node. (This used to be ctdb commit 36225e4e271f7a4065398253747fb20054f99a53) 2007-11-23 02:52:29 +03:00
If update_local_flags() finds that a node has changed its BANNED status so it differs from what the local ctdb daemon on the recovery master thinks it should be we should call for a re-election (This used to be ctdb commit 21ad6039c31ef5cc0e40a35a41220f91943947cb) 2007-11-23 03:53:06 +03:00			`/* Update our local copy of the flags in the recovery`
			`daemon.`
			`*/`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_NOTICE,("Remote node %u had flags 0x%x, local had 0x%x - updating local\n",`
If update_local_flags() finds that a node has changed its BANNED status so it differs from what the local ctdb daemon on the recovery master thinks it should be we should call for a re-election (This used to be ctdb commit 21ad6039c31ef5cc0e40a35a41220f91943947cb) 2007-11-23 03:53:06 +03:00			`nodemap->nodes[j].pnn, remote_nodemap->nodes[j].flags,`
			`nodemap->nodes[j].flags));`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`nodemap->nodes[j].flags = remote_nodemap->nodes[j].flags;`
			`}`
			`talloc_free(remote_nodemap);`
			`}`
			`talloc_free(mem_ctx);`
ctdb-recoverd: Simplify return values when updating local flags Change this to return just 0 or -1. It isn't monitoring anything. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-27 14:47:08 +03:00			`return 0;`
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`}`


ctdbd: Fix a typo Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> 2015-10-12 17:52:49 +03:00			`/* Create a new random generation id.`
create a define to represent the 'invalid' generation id we used in two places. create a new helper function to generate new generation id values that know about the invalid id and avoids generating it. update the ctdb status tool to know about the invalid generation id and print the string INVALID instead (This used to be ctdb commit 4fbcd189543cb8a92227fdcd3d158472e558ccda) 2007-08-22 06:38:31 +04:00			`The generation id can not be the INVALID_GENERATION id`
			`*/`
			`static uint32_t new_generation(void)`
			`{`
			`uint32_t generation;`

			`while (1) {`
			`generation = random();`

			`if (generation != INVALID_GENERATION) {`
			`break;`
			`}`
			`}`

			`return generation;`
			`}`
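
The generation helper above keeps drawing random values until the result differs from the reserved invalid id. A minimal standalone sketch of the same retry pattern follows. The sentinel value `SKETCH_INVALID_GENERATION` and the function name are assumptions made for illustration (the real `INVALID_GENERATION` is defined elsewhere in ctdb), and `rand()` stands in for `random()` only to keep the sketch within standard C:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed sentinel for this sketch only; not taken from the ctdb headers. */
#define SKETCH_INVALID_GENERATION 1u

/* Draw random ids until one differs from the reserved sentinel.
 * The loop almost always exits on the first pass, since only a single
 * value is rejected. */
static uint32_t sketch_new_generation(void)
{
	uint32_t generation;

	do {
		generation = (uint32_t)rand();
	} while (generation == SKETCH_INVALID_GENERATION);

	return generation;
}
```

Rejecting and redrawing, rather than remapping the sentinel to some fixed value, keeps every valid id equally likely.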
we are the culprit if we can't get the reclock (This used to be ctdb commit 1d320e113c6134ff6822b985a47131d8204af35a) 2007-10-05 06:01:40 +04:00
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`static bool ctdb_recovery_have_lock(struct ctdb_recoverd *rec)`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`{`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`return (rec->recovery_lock_handle != NULL);`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`

			`struct hold_reclock_state {`
			`bool done;`
ctdb-recoverd: Don't expose internal cluster mutex status Just expose whether the lock was taken. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-31 11:37:30 +03:00			`bool locked;`
ctdb-recoverd: Simplify reclock handler Do the interesting work outside the handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-01 10:32:42 +03:00			`double latency;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`};`

ctdb-recoverd: Add handler for lost recovery lock If the process holding the recovery lock terminates unexpectedly then the recovery daemon needs to know that the lock is no longer held. While here, rename hold_reclock_handler() to take_reclock_handler() so there is a clear difference between the two handler names. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-29 00:25:05 +03:00			`static void take_reclock_handler(char status,`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`double latency,`
			`void *private_data)`
			`{`
			`struct hold_reclock_state *s =`
			`(struct hold_reclock_state *) private_data;`

			`switch (status) {`
			`case '0':`
ctdb-recoverd: Simplify reclock handler Do the interesting work outside the handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-01 10:32:42 +03:00			`s->latency = latency;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`break;`

			`case '1':`
			`DEBUG(DEBUG_ERR,`
			`("Unable to take recovery lock - contention\n"));`
			`break;`

			`default:`
			`DEBUG(DEBUG_ERR, ("ERROR: when taking recovery lock\n"));`
			`}`

			`s->done = true;`
ctdb-recoverd: Don't expose internal cluster mutex status Just expose whether the lock was taken. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-31 11:37:30 +03:00			`s->locked = (status == '0') ;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`

ctdb-recoverd: Add handler for lost recovery lock If the process holding the recovery lock terminates unexpectedly then the recovery daemon needs to know that the lock is no longer held. While here, rename hold_reclock_handler() to take_reclock_handler() so there is a clear difference between the two handler names. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-29 00:25:05 +03:00			`static bool ctdb_recovery_lock(struct ctdb_recoverd *rec);`

			`static void lost_reclock_handler(void *private_data)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type_abort(`
			`private_data, struct ctdb_recoverd);`

			`DEBUG(DEBUG_ERR,`
			`("Recovery lock helper terminated unexpectedly - "`
			`"trying to retake recovery lock\n"));`
			`TALLOC_FREE(rec->recovery_lock_handle);`
			`if (! ctdb_recovery_lock(rec)) {`
			`DEBUG(DEBUG_ERR, ("Failed to take recovery lock\n"));`
			`}`
			`}`

ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`static bool ctdb_recovery_lock(struct ctdb_recoverd *rec)`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`{`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`struct ctdb_context *ctdb = rec->ctdb;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`struct ctdb_cluster_mutex_handle *h;`
			`struct hold_reclock_state s = {`
			`.done = false,`
ctdb-recoverd: Don't expose internal cluster mutex status Just expose whether the lock was taken. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-31 11:37:30 +03:00			`.locked = false,`
ctdb-recoverd: Simplify reclock handler Do the interesting work outside the handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-01 10:32:42 +03:00			`.latency = 0,`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`};`

ctdb-cluster-mutex: ctdb_cluster_mutex() registers handler and private data Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-01 11:56:33 +03:00			`h = ctdb_cluster_mutex(rec, ctdb, ctdb->recovery_lock, 0,`
ctdb-recoverd: Add handler for lost recovery lock If the process holding the recovery lock terminates unexpectedly then the recovery daemon needs to know that the lock is no longer held. While here, rename hold_reclock_handler() to take_reclock_handler() so there is a clear difference between the two handler names. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-29 00:25:05 +03:00			`take_reclock_handler, &s,`
			`lost_reclock_handler, rec);`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`if (h == NULL) {`
ctdb-recoverd: Fix buggy function return on memory allocation failure Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-01 08:56:42 +03:00			`return false;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`

			`while (!s.done) {`
			`tevent_loop_once(ctdb->ev);`
			`}`

ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`if (! s.locked) {`
ctdb-recoverd: Simplify reclock handler Do the interesting work outside the handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-01 10:32:42 +03:00			`talloc_free(h);`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`return false;`
			`}`

			`rec->recovery_lock_handle = h;`
ctdb-recoverd: Simplify reclock handler Do the interesting work outside the handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-01 10:32:42 +03:00			`ctdb_ctrl_report_recd_lock_latency(ctdb, CONTROL_TIMEOUT(),`
			`s.latency);`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00
			`return true;`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`

ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`static void ctdb_recovery_unlock(struct ctdb_recoverd *rec)`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`{`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`if (rec->recovery_lock_handle != NULL) {`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`DEBUG(DEBUG_NOTICE, ("Releasing recovery lock\n"));`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`TALLOC_FREE(rec->recovery_lock_handle);`
ctdb-recovery: Move recovery lock functions to recovery daemon code ctdb_recovery_have_lock(), ctdb_recovery_lock(), ctdb_recovery_unlock() are only used by recovery daemon, so move them there. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-02-17 12:20:03 +03:00			`}`
			`}`

recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00			`static void ban_misbehaving_nodes(struct ctdb_recoverd *rec, bool *self_ban)`
recoverd: Refactor code to ban misbehaving nodes Since we have nodemap information, there is no need to hardcode the limit of 20. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe) 2013-06-28 08:31:02 +04:00			`{`
			`struct ctdb_context *ctdb = rec->ctdb;`
			`int i;`
			`struct ctdb_banning_state *ban_state;`

recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00			`*self_ban = false;`
recoverd: Refactor code to ban misbehaving nodes Since we have nodemap information, there is no need to hardcode the limit of 20. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe) 2013-06-28 08:31:02 +04:00			`for (i=0; i<ctdb->num_nodes; i++) {`
			`if (ctdb->nodes[i]->ban_state == NULL) {`
			`continue;`
			`}`
			`ban_state = (struct ctdb_banning_state *)ctdb->nodes[i]->ban_state;`
			`if (ban_state->count < 2*ctdb->num_nodes) {`
			`continue;`
			`}`

			`DEBUG(DEBUG_NOTICE,("Node %u reached %u banning credits - banning it for %u seconds\n",`
			`ctdb->nodes[i]->pnn, ban_state->count,`
			`ctdb->tunable.recovery_ban_period));`
			`ctdb_ban_node(rec, ctdb->nodes[i]->pnn, ctdb->tunable.recovery_ban_period);`
			`ban_state->count = 0;`
recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00
			`/* Banning ourself? */`
			`if (ctdb->nodes[i]->pnn == rec->ctdb->pnn) {`
			`*self_ban = true;`
			`}`
recoverd: Refactor code to ban misbehaving nodes Since we have nodemap information, there is no need to hardcode the limit of 20. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe) 2013-06-28 08:31:02 +04:00			`}`
			`}`
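The banning rule in `ban_misbehaving_nodes()` can be looked at in isolation: a node is banned once its banning credits reach twice the number of nodes, its credits are then reset, and the caller is told whether the local node was among those banned. The following is a minimal sketch of that rule only. `struct node_sketch` and `ban_over_threshold()` are hypothetical names, and the real call to `ctdb_ban_node()` is replaced by setting a flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a node and its banning state. */
struct node_sketch {
	uint32_t pnn;
	uint32_t ban_credits;
	bool banned;
};

/*
 * Ban every node whose credits have reached 2 * num_nodes, reset its
 * credits, and report through self_ban whether the local node was one
 * of them. Returns the number of nodes banned.
 */
unsigned int ban_over_threshold(struct node_sketch *nodes,
				uint32_t num_nodes,
				uint32_t self_pnn,
				bool *self_ban)
{
	unsigned int banned = 0;
	uint32_t i;

	*self_ban = false;
	for (i = 0; i < num_nodes; i++) {
		if (nodes[i].ban_credits < 2 * num_nodes) {
			continue;
		}

		/* Assumption: a flag stands in for the real ban request. */
		nodes[i].banned = true;
		nodes[i].ban_credits = 0;
		banned++;

		if (nodes[i].pnn == self_pnn) {
			*self_ban = true;
		}
	}

	return banned;
}
```

Because the threshold scales with the cluster size, no fixed limit has to be hardcoded, which is what the refactoring commit above describes.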

ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`struct helper_state {`
			`int fd[2];`
			`pid_t pid;`
			`int result;`
			`bool done;`
			`};`

			`static void helper_handler(struct tevent_context *ev,`
			`struct tevent_fd *fde,`
			`uint16_t flags, void *private_data)`
			`{`
			`struct helper_state *state = talloc_get_type_abort(`
			`private_data, struct helper_state);`
			`int ret;`

			`ret = sys_read(state->fd[0], &state->result, sizeof(state->result));`
			`if (ret != sizeof(state->result)) {`
			`state->result = EPIPE;`
			`}`

			`state->done = true;`
			`}`
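`helper_handler()` relies on a very small protocol: the helper writes one `int` result to its end of the pipe, and anything other than a complete `int` on the reading side is treated as `EPIPE`. The sketch below shows that protocol with plain POSIX calls, assuming `read()` in place of the source's `sys_read()` and no event loop. The function names `read_helper_result()`, `demo_result_roundtrip()` and `demo_helper_died()` are hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Read one int-sized result from fd. Anything other than a full
 * sizeof(int) read (EOF, error or short read) is reported as EPIPE,
 * mirroring the fallback in helper_handler().
 */
int read_helper_result(int fd)
{
	int result = 0;
	ssize_t n = read(fd, &result, sizeof(result));

	if (n != (ssize_t)sizeof(result)) {
		return EPIPE;
	}
	return result;
}

/* Demo: a "helper" writes value into a pipe and the parent reads it. */
int demo_result_roundtrip(int value)
{
	int fd[2];
	int result;

	if (pipe(fd) != 0) {
		return -1;
	}
	if (write(fd[1], &value, sizeof(value)) != (ssize_t)sizeof(value)) {
		result = -1;
	} else {
		result = read_helper_result(fd[0]);
	}
	close(fd[0]);
	close(fd[1]);
	return result;
}

/* Demo: the writer goes away without reporting a result. */
int demo_helper_died(void)
{
	int fd[2];
	int result;

	if (pipe(fd) != 0) {
		return -1;
	}
	close(fd[1]);	/* write end closed, so read() sees EOF */
	result = read_helper_result(fd[0]);
	close(fd[0]);
	return result;
}
```

Mapping a missing result to a non-zero error code means a helper that crashes before reporting is handled the same way as one that reports failure.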

			`static int helper_run(struct ctdb_recoverd *rec, TALLOC_CTX *mem_ctx,`
			`const char *prog, const char *arg, const char *type)`
			`{`
			`struct helper_state *state;`
			`struct tevent_fd *fde;`
			`const char **args;`
			`int nargs, ret;`
ctdb-recoverd: Abort recovery/takeover if recmaster changes Recovery and takeover are run via helper from recovery daemon. While the helpers are running, it's possible for the current node to lose election. If that happens, abort the currently running recovery/takeover helper. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-09-08 04:24:27 +03:00			`uint32_t recmaster = rec->recmaster;`
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00
			`state = talloc_zero(mem_ctx, struct helper_state);`
			`if (state == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " memory error\n"));`
			`return -1;`
			`}`

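			`state->pid = -1;`
			`/* talloc_zero() leaves both fds as 0; mark them unopened so the`
			`* fail path cannot close fd 0 if pipe() fails.`
			`*/`
			`state->fd[0] = -1;`
			`state->fd[1] = -1;`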

			`ret = pipe(state->fd);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR,`
			`("Failed to create pipe for %s helper\n", type));`
			`goto fail;`
			`}`

			`set_close_on_exec(state->fd[0]);`

			`nargs = 4;`
			`args = talloc_array(state, const char *, nargs);`
			`if (args == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " memory error\n"));`
			`goto fail;`
			`}`

			`args[0] = talloc_asprintf(args, "%d", state->fd[1]);`
			`if (args[0] == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " memory error\n"));`
			`goto fail;`
			`}`
			`args[1] = rec->ctdb->daemon.name;`
			`args[2] = arg;`
			`args[3] = NULL;`

			`if (args[2] == NULL) {`
			`nargs = 3;`
			`}`

			`state->pid = ctdb_vfork_exec(state, rec->ctdb, prog, nargs, args);`
			`if (state->pid == -1) {`
			`DEBUG(DEBUG_ERR,`
			`("Failed to create child for %s helper\n", type));`
			`goto fail;`
			`}`

			`close(state->fd[1]);`
			`state->fd[1] = -1;`

			`state->done = false;`

			`fde = tevent_add_fd(rec->ctdb->ev, rec->ctdb, state->fd[0],`
			`TEVENT_FD_READ, helper_handler, state);`
			`if (fde == NULL) {`
			`goto fail;`
			`}`
			`tevent_fd_set_auto_close(fde);`

			`while (!state->done) {`
			`tevent_loop_once(rec->ctdb->ev);`
ctdb-recoverd: Abort recovery/takeover if recmaster changes Recovery and takeover are run via helper from recovery daemon. While the helpers are running, it's possible for the current node to lose election. If that happens, abort the currently running recovery/takeover helper. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-09-08 04:24:27 +03:00
			`/* If recmaster changes, we have lost election */`
			`if (recmaster != rec->recmaster) {`
			`D_ERR("Recmaster changed to %u, aborting %s\n",`
			`rec->recmaster, type);`
			`state->result = 1;`
			`break;`
			`}`
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`}`

			`close(state->fd[0]);`
			`state->fd[0] = -1;`

			`if (state->result != 0) {`
			`goto fail;`
			`}`

			`ctdb_kill(rec->ctdb, state->pid, SIGKILL);`
			`talloc_free(state);`
			`return 0;`

			`fail:`
			`if (state->fd[0] != -1) {`
			`close(state->fd[0]);`
			`}`
			`if (state->fd[1] != -1) {`
			`close(state->fd[1]);`
			`}`
			`if (state->pid != -1) {`
			`ctdb_kill(rec->ctdb, state->pid, SIGKILL);`
			`}`
			`talloc_free(state);`
			`return -1;`
			`}`


ctdb-recoverd: Integrate takeover helper Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 08:21:39 +03:00			`static int ctdb_takeover(struct ctdb_recoverd *rec,`
			`uint32_t *force_rebalance_nodes)`
			`{`
			`static char prog[PATH_MAX+1] = "";`
			`char *arg;`
			`int i;`

			`if (!ctdb_set_helper("takeover_helper", prog, sizeof(prog),`
			`"CTDB_TAKEOVER_HELPER", CTDB_HELPER_BINDIR,`
			`"ctdb_takeover_helper")) {`
			`ctdb_die(rec->ctdb, "Unable to set takeover helper\n");`
			`}`

			`arg = NULL;`
			`for (i = 0; i < talloc_array_length(force_rebalance_nodes); i++) {`
			`uint32_t pnn = force_rebalance_nodes[i];`
			`if (arg == NULL) {`
			`arg = talloc_asprintf(rec, "%u", pnn);`
			`} else {`
			`arg = talloc_asprintf_append(arg, ",%u", pnn);`
			`}`
			`if (arg == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " memory error\n"));`
			`return -1;`
			`}`
			`}`

			`return helper_run(rec, rec, prog, arg, "takeover");`
			`}`
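`ctdb_takeover()` hands the rebalance targets to the helper as one comma-separated argument such as `1,3,5`. The sketch below shows the same formatting without talloc, assuming a caller-supplied fixed-size buffer. `format_pnn_list()` is a hypothetical name. Unlike the source, which leaves the argument `NULL` for an empty list, this sketch produces an empty string.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Write pnns[0..n-1] into buf as a comma-separated list.
 * Returns 0 on success, -1 if buf is too small.
 */
int format_pnn_list(const uint32_t *pnns, size_t n,
		    char *buf, size_t buflen)
{
	size_t i;

	if (buflen == 0) {
		return -1;
	}
	buf[0] = '\0';

	for (i = 0; i < n; i++) {
		size_t used = strlen(buf);
		/* No separator before the first element. */
		int w = snprintf(buf + used, buflen - used,
				 "%s%" PRIu32,
				 i == 0 ? "" : ",", pnns[i]);

		if (w < 0 || (size_t)w >= buflen - used) {
			return -1;	/* output would be truncated */
		}
	}

	return 0;
}
```

For example, the array `{1, 3, 5}` with a 32-byte buffer yields the string `1,3,5`, while a 4-byte buffer is rejected because the last element does not fit.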
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`static bool do_takeover_run(struct ctdb_recoverd *rec,`
ctdb-takeover: Recovery daemon no longer passes fail callback Banning is now handled by the takeover code sending banning credit messages. This commit makes a change in behaviour quite obvious. Takeover runs were initiated from several locations in the code but banning was only done from one of these locations. Now banning can be done from any failed takeover run. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 08:35:08 +03:00			`struct ctdb_node_map_old *nodemap)`
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`{`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`uint32_t *nodes = NULL;`
ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`struct ctdb_disable_message dtr;`
recoverd: Move disabling of IP checks into do_takeover_run() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 48b603fbf16311daa47b01e7a33d477ed51da56d) 2013-09-03 05:21:09 +04:00			`TDB_DATA data;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`int i;`
recoverd: Be careful about freeing the list of IP rebalance target nodes It can change during a takeover run. If it does then don't free it. There are potentially fancier solutions (e.g. check what PNNs are new to the list) to this issue but this is the simplest. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e81589b7084c661adf617e166cc2c25b4939f841) 2013-09-06 05:23:07 +04:00			`uint32_t *rebalance_nodes = rec->force_rebalance_nodes;`
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`int ret;`
			`bool ok;`

recoverd: Improve logging for takeover runs Takeover runs are currently silent when they succeed. However, they are important, so log something by default. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b39aa2e401fbb581207d986bac93778e9c01acdc) 2013-09-18 11:06:16 +04:00			`DEBUG(DEBUG_NOTICE, ("Takeover run starting\n"));`

ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`if (ctdb_op_is_in_progress(rec->takeover_run)) {`
recoverd: do_takeover_run() should mark when a takeover run is in progress Nested takeover runs should never happens so they should fail. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8ed29c60c0a7dd29f2a6efdf694d38e94281e1c4) 2013-09-03 05:20:01 +04:00			`DEBUG(DEBUG_ERR, (__location__`
			`" takeover run already in progress \n"));`
			`ok = false;`
			`goto done;`
			`}`

ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`if (!ctdb_op_begin(rec->takeover_run)) {`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`ok = false;`
			`goto done;`
recoverd: Move disabling of IP checks into do_takeover_run() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 48b603fbf16311daa47b01e7a33d477ed51da56d) 2013-09-03 05:21:09 +04:00			`}`

recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`/* Disable IP checks (takeover runs, really) on other nodes`
			`* while doing this takeover run. This will stop those other`
			`* nodes from triggering takeover runs when they think they should`
			`* be hosting an IP but it isn't yet on an interface. Don't`
			`* wait for replies since a failure here might cause some`
			`* noise in the logs but will not actually cause a problem.`
			`*/`
ctdb-recoverd: Fix some uninitialised memory issues The first element of these structures is a 32-bit PNN. On 64-bit systems this field can be followed by 32-bits of padding. When the structures are copied this can cause uninitialised memory to be copied. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> 2016-01-11 09:23:12 +03:00			`ZERO_STRUCT(dtr);`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`dtr.srvid = 0; /* No reply */`
			`dtr.pnn = -1;`

			`data.dptr = (uint8_t*)&dtr;`
			`data.dsize = sizeof(dtr);`

			`nodes = list_of_connected_nodes(rec->ctdb, nodemap, rec, false);`

Revert "recoverd: Disable takeover runs on other nodes for 5 minutes" 5 minutes is too long to leave the cluster in limbo if the recovery daemon dies during a takeover run, even though this is quite unlikely. We need a new recover master to be able to do takeover runs fairly quickly. This reverts commit 71080676bb4acbd0d9b595a30cf7fe6dddbf426f. (This used to be ctdb commit 3e41170c78fc7a2bf526129c9b7db3739b61c6bf) 2013-10-24 04:13:16 +04:00			`/* Disable for 60 seconds. This can be a tunable later if`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`* necessary.`
			`*/`
ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`dtr.timeout = 60;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`for (i = 0; i < talloc_array_length(nodes); i++) {`
			`if (ctdb_client_send_message(rec->ctdb, nodes[i],`
			`CTDB_SRVID_DISABLE_TAKEOVER_RUNS,`
			`data) != 0) {`
			`DEBUG(DEBUG_INFO,("Failed to disable takeover runs\n"));`
			`}`
			`}`
recoverd: do_takeover_run() should mark when a takeover run is in progress Nested takeover runs should never happens so they should fail. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8ed29c60c0a7dd29f2a6efdf694d38e94281e1c4) 2013-09-03 05:20:01 +04:00
ctdb-recoverd: Integrate takeover helper Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 08:21:39 +03:00			`ret = ctdb_takeover(rec, rec->force_rebalance_nodes);`
recoverd: Move disabling of IP checks into do_takeover_run() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 48b603fbf16311daa47b01e7a33d477ed51da56d) 2013-09-03 05:21:09 +04:00
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`/* Reenable takeover runs and IP checks on other nodes */`
ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`dtr.timeout = 0;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`for (i = 0; i < talloc_array_length(nodes); i++) {`
			`if (ctdb_client_send_message(rec->ctdb, nodes[i],`
			`CTDB_SRVID_DISABLE_TAKEOVER_RUNS,`
			`data) != 0) {`
Fix various spelling errors Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Fri Nov 6 13:43:45 CET 2015 on sn-devel-104 2015-07-27 00:02:57 +03:00			`DEBUG(DEBUG_INFO,("Failed to re-enable takeover runs\n"));`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`}`
recoverd: Move disabling of IP checks into do_takeover_run() Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 48b603fbf16311daa47b01e7a33d477ed51da56d) 2013-09-03 05:21:09 +04:00			`}`

recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`if (ret != 0) {`
recoverd: Improve logging for takeover runs Takeover runs are currently silent when they succeed. However, they are important, so log something by default. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b39aa2e401fbb581207d986bac93778e9c01acdc) 2013-09-18 11:06:16 +04:00			`DEBUG(DEBUG_ERR, ("ctdb_takeover() failed\n"));`
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`ok = false;`
			`goto done;`
			`}`

			`ok = true;`
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`/* Takeover run was successful so clear force rebalance targets */`
recoverd: Be careful about freeing the list of IP rebalance target nodes It can change during a takeover run. If it does then don't free it. There are potentially fancier solutions (e.g. check what PNNs are new to the list) to this issue but this is the simplest. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e81589b7084c661adf617e166cc2c25b4939f841) 2013-09-06 05:23:07 +04:00			`if (rebalance_nodes == rec->force_rebalance_nodes) {`
			`TALLOC_FREE(rec->force_rebalance_nodes);`
			`} else {`
			`DEBUG(DEBUG_WARNING,`
			`("Rebalance target nodes changed during takeover run - not clearing\n"));`
			`}`
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`done:`
			`rec->need_takeover_run = !ok;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`talloc_free(nodes);`
ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`ctdb_op_end(rec->takeover_run);`
recoverd: Improve logging for takeover runs Takeover runs are currently silent when they succeed. However, they are important, so log something by default. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b39aa2e401fbb581207d986bac93778e9c01acdc) 2013-09-18 11:06:16 +04:00
			`DEBUG(DEBUG_NOTICE, ("Takeover run %s\n", ok ? "completed successfully" : "unsuccessful"));`
recoverd: New function do_takeover_run() Factor the calling sequence for ctdb_takeover_run() into a new function and call it instead. This changes rec->need_takeover_run to false for each successful takeover run and that seems to be the right thing to do. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9a3f0c0e61ca5c17e020c6e0463d73c7cf4f7c09) 2013-08-27 06:14:34 +04:00			`return ok;`
			`}`
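The end of `do_takeover_run()` uses a snapshot comparison: the rebalance target list is cleared only if the run succeeded and the list pointer is still the one captured at the start. The sketch below isolates that pattern with `malloc`-style memory instead of talloc. `struct rec_sketch`, `takeover_run_sketch()` and the three `run_*` callbacks are hypothetical, and the callback stands in for the real takeover helper.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the relevant ctdb_recoverd fields. */
struct rec_sketch {
	uint32_t *force_rebalance_nodes;
	bool need_takeover_run;
};

/* A "takeover run": returns 0 on success. */
typedef int (*takeover_fn)(struct rec_sketch *rec);

bool takeover_run_sketch(struct rec_sketch *rec, takeover_fn run)
{
	/* Remember which list this run started with. */
	uint32_t *snapshot = rec->force_rebalance_nodes;
	bool ok = (run(rec) == 0);

	if (ok && snapshot == rec->force_rebalance_nodes) {
		/* Success and the list was not swapped: safe to clear. */
		free(rec->force_rebalance_nodes);
		rec->force_rebalance_nodes = NULL;
	}
	/* On failure, or if the list changed, the targets are kept. */

	rec->need_takeover_run = !ok;
	return ok;
}

/* Demo callbacks. */
int run_ok(struct rec_sketch *rec)
{
	(void)rec;
	return 0;
}

int run_fail(struct rec_sketch *rec)
{
	(void)rec;
	return -1;
}

/* Succeeds, but a new target list is installed while it runs. */
int run_ok_list_replaced(struct rec_sketch *rec)
{
	uint32_t *fresh = calloc(1, sizeof(*fresh));

	if (fresh == NULL) {
		return -1;
	}
	/* The previous list stays owned by whoever replaced it. */
	rec->force_rebalance_nodes = fresh;
	return 0;
}
```

Comparing pointers is the simplest possible check. As the commit message above notes, fancier approaches (such as working out which PNNs are new to the list) are possible but not needed for correctness here.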

ctdb-recoverd: Add code for parallel database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:22:38 +03:00			`static int db_recovery_parallel(struct ctdb_recoverd *rec, TALLOC_CTX *mem_ctx)`
			`{`
			`static char prog[PATH_MAX+1] = "";`
ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`const char *arg;`
ctdb-recoverd: Add code for parallel database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:22:38 +03:00
			`if (!ctdb_set_helper("recovery_helper", prog, sizeof(prog),`
			`"CTDB_RECOVERY_HELPER", CTDB_HELPER_BINDIR,`
			`"ctdb_recovery_helper")) {`
			`ctdb_die(rec->ctdb, "Unable to set recovery helper\n");`
			`}`

ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`arg = talloc_asprintf(mem_ctx, "%u", new_generation());`
			`if (arg == NULL) {`
ctdb-recoverd: Add code for parallel database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:22:38 +03:00			`DEBUG(DEBUG_ERR, (__location__ " memory error\n"));`
			`return -1;`
			`}`

ctdb-recovery: Create recovery databases in state dir This matches the behaviour during serial database recovery. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Thu Feb 11 08:01:14 CET 2016 on sn-devel-144 2016-02-11 06:32:34 +03:00			`setenv("CTDB_DBDIR_STATE", rec->ctdb->db_directory_state, 1);`

ctdb-recoverd: Generalise helper state, handler and launching These can also be used for takeover handler. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-12-09 07:04:03 +03:00			`return helper_run(rec, mem_ctx, prog, arg, "recovery");`
ctdb-recoverd: Add code for parallel database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:22:38 +03:00			`}`

ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`/*`
			`we are the recmaster, and recovery is needed - start a recovery run`
			`*/`
			`static int do_recovery(struct ctdb_recoverd *rec,`
			`TALLOC_CTX *mem_ctx, uint32_t pnn,`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap, struct ctdb_vnn_map *vnnmap)`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`{`
			`struct ctdb_context *ctdb = rec->ctdb;`
			`int i, ret;`
ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:46:05 +03:00			`struct ctdb_dbid_map_old *dbmap;`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`bool self_ban;`

			`DEBUG(DEBUG_NOTICE, (__location__ " Starting do_recovery\n"));`

ctdb-recoverd: Always check for recmaster before doing recovery Recovery daemon checks if it is the recovery master before performing certain checks. During those checks it's possible that re-election can change the recmaster. In such a case, the recovery daemon should never do a database recovery. This is not complete fix since the recovery master can still change while the recovery is going on. The correct fix is to abort recovery if the recovery master changes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Oct 7 17:55:05 CEST 2015 on sn-devel-104 2015-10-06 09:31:41 +03:00			`/* Check if the current node is still the recmaster. It's possible that`
ctdb-recoverd: Don't retrieve recovery master from local daemon The recovery daemon already knows which node is the master. This relies on rec->recmaster being correctly initialised and correctly set during elections. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:33:01 +03:00			`* re-election has changed the recmaster.`
ctdb-recoverd: Always check for recmaster before doing recovery Recovery daemon checks if it is the recovery master before performing certain checks. During those checks it's possible that re-election can change the recmaster. In such a case, the recovery daemon should never do a database recovery. This is not complete fix since the recovery master can still change while the recovery is going on. The correct fix is to abort recovery if the recovery master changes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Oct 7 17:55:05 CEST 2015 on sn-devel-104 2015-10-06 09:31:41 +03:00			`*/`
ctdb-recoverd: Don't retrieve recovery master from local daemon The recovery daemon already knows which node is the master. This relies on rec->recmaster being correctly initialised and correctly set during elections. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:33:01 +03:00			`if (pnn != rec->recmaster) {`
ctdb-recoverd: Always check for recmaster before doing recovery Recovery daemon checks if it is the recovery master before performing certain checks. During those checks it's possible that re-election can change the recmaster. In such a case, the recovery daemon should never do a database recovery. This is not complete fix since the recovery master can still change while the recovery is going on. The correct fix is to abort recovery if the recovery master changes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Oct 7 17:55:05 CEST 2015 on sn-devel-104 2015-10-06 09:31:41 +03:00			`DEBUG(DEBUG_NOTICE,`
			`("Recovery master changed to %u, aborting recovery\n",`
ctdb-recoverd: Don't retrieve recovery master from local daemon The recovery daemon already knows which node is the master. This relies on rec->recmaster being correctly initialised and correctly set during elections. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:33:01 +03:00			`rec->recmaster));`
ctdb-recoverd: Always check for recmaster before doing recovery Recovery daemon checks if it is the recovery master before performing certain checks. During those checks it's possible that re-election can change the recmaster. In such a case, the recovery daemon should never do a database recovery. This is not complete fix since the recovery master can still change while the recovery is going on. The correct fix is to abort recovery if the recovery master changes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Oct 7 17:55:05 CEST 2015 on sn-devel-104 2015-10-06 09:31:41 +03:00			`return -1;`
			`}`

ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`/* if recovery fails, force it again */`
			`rec->need_recovery = true;`

			`if (!ctdb_op_begin(rec->recovery)) {`
			`return -1;`
			`}`

			`if (rec->election_timeout) {`
			`/* an election is in progress */`
			`DEBUG(DEBUG_ERR, ("do_recovery called while election in progress - try again later\n"));`
			`goto fail;`
			`}`

			`ban_misbehaving_nodes(rec, &self_ban);`
			`if (self_ban) {`
			`DEBUG(DEBUG_NOTICE, ("This node was banned, aborting recovery\n"));`
			`goto fail;`
			`}`

ctdb-daemon: Rename recovery lock file to just recovery lock It isn't necessarily a file. Don't bother changing the control, since it doesn't pervade the code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-17 11:28:56 +03:00			`if (ctdb->recovery_lock != NULL) {`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`if (ctdb_recovery_have_lock(rec)) {`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`DEBUG(DEBUG_NOTICE, ("Already holding recovery lock\n"));`
			`} else {`
			`DEBUG(DEBUG_NOTICE, ("Attempting to take recovery lock (%s)\n",`
ctdb-daemon: Rename recovery lock file to just recovery lock It isn't necessarily a file. Don't bother changing the control, since it doesn't pervade the code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-17 11:28:56 +03:00			`ctdb->recovery_lock));`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`if (!ctdb_recovery_lock(rec)) {`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`if (ctdb->runstate == CTDB_RUNSTATE_FIRST_RECOVERY) {`
			`/* If ctdb is trying first recovery, it's`
			`* possible that current node does not know`
			`* yet who the recmaster is.`
			`*/`
			`DEBUG(DEBUG_ERR, ("Unable to get recovery lock"`
			`" - retrying recovery\n"));`
			`goto fail;`
			`}`

			`DEBUG(DEBUG_ERR,("Unable to get recovery lock - aborting recovery "`
			`"and ban ourself for %u seconds\n",`
			`ctdb->tunable.recovery_ban_period));`
			`ctdb_ban_node(rec, pnn, ctdb->tunable.recovery_ban_period);`
			`goto fail;`
			`}`
			`DEBUG(DEBUG_NOTICE,`
			`("Recovery lock taken successfully by recovery daemon\n"));`
			`}`
			`}`

			`DEBUG(DEBUG_NOTICE, (__location__ " Recovery initiated due to problem with node %u\n", rec->last_culprit_node));`

			`/* get a list of all databases */`
			`ret = ctdb_ctrl_getdbmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx, &dbmap);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to get dbids from node :%u\n", pnn));`
			`goto fail;`
			`}`

			`/* we do the db creation before we set the recovery mode, so the freeze happens`
			`on all databases we will be dealing with. */`

			`/* verify that we have all the databases any other node has */`
			`ret = create_missing_local_databases(ctdb, nodemap, pnn, &dbmap, mem_ctx);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to create missing local databases\n"));`
			`goto fail;`
			`}`

			`/* verify that all other nodes have all our databases */`
			`ret = create_missing_remote_databases(ctdb, nodemap, pnn, dbmap, mem_ctx);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to create missing remote databases\n"));`
			`goto fail;`
			`}`
			`DEBUG(DEBUG_NOTICE, (__location__ " Recovery - created remote databases\n"));`


ctdb-recmaster: Update capabilities before calling first election Capabilities are used when computing an election result so having them up-to-date seems like a good idea. Also update several instances of an ambiguous comment. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-27 07:09:33 +03:00			`/* Retrieve capabilities from all connected nodes */`
ctdb-recoverd: Update capabilities before the database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:07:37 +03:00			`ret = update_capabilities(rec, nodemap);`
			`if (ret!=0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to update node capabilities.\n"));`
			`goto fail;`
			`}`

ctdb-recoverd: Update flags on all nodes before database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 10:10:15 +03:00			`/*`
			`update all nodes to have the same flags that we have`
			`*/`
			`for (i=0;i<nodemap->num;i++) {`
			`if (nodemap->nodes[i].flags & NODE_FLAGS_DISCONNECTED) {`
			`continue;`
			`}`

			`ret = update_flags_on_all_nodes(ctdb, nodemap, i, nodemap->nodes[i].flags);`
			`if (ret != 0) {`
			`if (nodemap->nodes[i].flags & NODE_FLAGS_INACTIVE) {`
			`DEBUG(DEBUG_WARNING, (__location__ " Unable to update flags on inactive node %d\n", i));`
			`} else {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to update flags on all nodes for node %d\n", i));`
			`goto fail;`
			`}`
			`}`
			`}`

			`DEBUG(DEBUG_NOTICE, (__location__ " Recovery - updated flags\n"));`

ctdb-recovery: Remove serial database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-07-19 09:06:37 +03:00			`ret = db_recovery_parallel(rec, mem_ctx);`
ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-09-17 09:00:47 +03:00			`if (ret != 0) {`
			`goto fail;`
			`}`

ctdb-takeover: Recovery daemon no longer passes fail callback Banning is now handled by the takeover code sending banning credit messages. This commit makes a change in behaviour quite obvious. Takeover runs were initiated from several locations in the code but banning was only done from one of these locations. Now banning can be done from any failed takeover run. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 08:35:08 +03:00			`do_takeover_run(rec, nodemap);`
read the current debuglevel in each loop in the recovery daemon so that we pick up when they change in the parent daemon (This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf) 2008-02-18 11:38:04 +03:00
send a message to clients when an IP has been released (This used to be ctdb commit 8b7ab0b00253462593d368052c2cb10a385b4e63) 2007-05-25 18:05:30 +04:00			`/* send a message to all clients telling them that the cluster`
			`has been reconfigured */`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`ret = ctdb_client_send_message(ctdb, CTDB_BROADCAST_CONNECTED,`
			`CTDB_SRVID_RECONFIGURE, tdb_null);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Failed to send reconfigure message\n"));`
ctdb-recoverd: Use a goto for do_recovery() failures This will allow extra things to be done on failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:32:08 +03:00			`goto fail;`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`}`
recovery daemon this program is a client to the local ctdb daemon every second it pulls all vnnmap and nodemaps from all nodes that are available and checks if a recovery is required a recovery is required if : * all nodes do NOT have an identical vnnmap and generation * all nodes do NOT have an identical nodemap * there are active nodes that are NOT in the nodemap * there are nodes in the nodemap that are NOT active During recovery, the recovery tool will also make sure that all nodes know about and have created all databases. (This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26) 2007-05-04 09:21:40 +04:00
added debug constants to allow for better mapping to syslog levels (This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502) 2008-02-04 09:44:24 +03:00			`DEBUG(DEBUG_NOTICE, (__location__ " Recovery complete\n"));`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00
- merge from ronnie - add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring (This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b) 2007-09-14 03:49:12 +04:00			`rec->need_recovery = false;`
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`ctdb_op_end(rec->recovery);`
- merge from ronnie - add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring (This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b) 2007-09-14 03:49:12 +04:00
with the new banning logic with one struct for each node we no longer "forget" the other culprits as often as we used to do, which means that things like "ctdb recover" can now actually lead to a node becomming banned if we perform too many recoveries too frequently. change this to provide absolution to all nodes once they have participated in a recovery session. (This used to be ctdb commit f66d17fb2e81a35d5adb3754e1cc902f76b4590a) 2009-09-25 07:14:53 +04:00			`/* we managed to complete a full recovery, make sure to forgive`
			`any past sins by the nodes that could now participate in the`
			`recovery.`
			`*/`
			`DEBUG(DEBUG_ERR,("Resetting ban count to 0 for all nodes\n"));`
			`for (i=0;i<nodemap->num;i++) {`
			`struct ctdb_banning_state *ban_state;`

			`if (nodemap->nodes[i].flags & NODE_FLAGS_DISCONNECTED) {`
			`continue;`
			`}`

			`ban_state = (struct ctdb_banning_state *)ctdb->nodes[nodemap->nodes[i].pnn]->ban_state;`
			`if (ban_state == NULL) {`
			`continue;`
			`}`

			`ban_state->count = 0;`
			`}`

ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`/* We just finished a recovery successfully.`
			`We now wait for rerecovery_timeout before we allow`
add a tuneable to control how long we wait after a successful recovery before we alow another recovery to be initiated (This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf) 2007-07-04 02:36:59 +04:00			`another recovery to take place.`
			`*/`
Correct "supressed" typo. Signed-off-by: Chris Lamb <chris@chris-lamb.co.uk> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Garming Sam <garming@catalyst.net.nz> 2017-02-17 12:51:52 +03:00			`DEBUG(DEBUG_NOTICE, ("Just finished a recovery. New recoveries will now be suppressed for the rerecovery timeout (%d seconds)\n", ctdb->tunable.rerecovery_timeout));`
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`ctdb_op_disable(rec->recovery, ctdb->ev,`
			`ctdb->tunable.rerecovery_timeout);`
recovery daemon this program is a client to the local ctdb daemon every second it pulls all vnnmap and nodemaps from all nodes that are available and checks if a recovery is required a recovery is required if : * all nodes do NOT have an identical vnnmap and generation * all nodes do NOT have an identical nodemap * there are active nodes that are NOT in the nodemap * there are nodes in the nodemap that are NOT active During recovery, the recovery tool will also make sure that all nodes know about and have created all databases. (This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26) 2007-05-04 09:21:40 +04:00			`return 0;`
ctdb-recoverd: Use a goto for do_recovery() failures This will allow extra things to be done on failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:32:08 +03:00
			`fail:`
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`ctdb_op_end(rec->recovery);`
ctdb-recoverd: Use a goto for do_recovery() failures This will allow extra things to be done on failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:32:08 +03:00			`return -1;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`
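The failure handling in do_recovery() above follows a begin/end guard with a single cleanup label. A minimal standalone sketch of that pattern follows. The `sketch_` names are hypothetical and are not the real `ctdb_op_begin()`/`ctdb_op_end()` implementation; in particular, refusing to begin while an operation is already in progress is an assumption made for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the operation state guarded in do_recovery(). */
struct sketch_op {
	bool in_progress;
	bool disabled;
};

/* Assumption: beginning fails if the operation is disabled or already running. */
static bool sketch_op_begin(struct sketch_op *op)
{
	if (op->disabled || op->in_progress) {
		return false;
	}
	op->in_progress = true;
	return true;
}

static void sketch_op_end(struct sketch_op *op)
{
	op->in_progress = false;
}

/* Example steps: each returns 0 on success, non-zero on failure. */
static int sketch_step_ok(void) { return 0; }
static int sketch_step_fail(void) { return -1; }

/* Run the steps in order; every exit taken after a successful begin
 * passes through sketch_op_end(), so the guard is never left set. */
static int sketch_run_guarded(struct sketch_op *op,
			      int (*const steps[])(void), int nsteps)
{
	int i;

	if (!sketch_op_begin(op)) {
		return -1; /* nothing to clean up yet */
	}

	for (i = 0; i < nsteps; i++) {
		if (steps[i]() != 0) {
			goto fail;
		}
	}

	sketch_op_end(op);
	return 0;

fail:
	sketch_op_end(op);
	return -1;
}
```

The point of the single `fail:` label is that a step added later cannot forget the cleanup, provided it leaves through `goto fail` rather than a bare `return`.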
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
add a test in the function that checks whether the cluster needs recovery or not that all active nodes are in normal mode. If we discover that some node is still in recoverymode it may indicate that a previous recovery ended prematurely and thus we should start a new recovery (This used to be ctdb commit c15517872e6c98c8c425a8d47d2b348ecb0620b0) 2007-05-06 22:41:12 +04:00
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`/*`
			`elections are won by first checking the number of connected nodes, then`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`the priority time, then the pnn`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`*/`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`struct election_message {`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`uint32_t num_connected;`
use a priority time for the election data, not just the vnn (This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d) 2007-06-07 12:37:27 +04:00			`struct timeval priority_time;`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`uint32_t pnn;`
- catch ESTALE in the recovery lock by trying a read() - priortise nodes that are unbanned and healthy in the election (This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f) 2007-10-05 07:28:21 +04:00			`uint32_t node_flags;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`};`

choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`/*`
			`form this node's election data`
			`*/`
			`static void ctdb_election_data(struct ctdb_recoverd *rec, struct election_message *em)`
			`{`
			`int ret, i;`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap;`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`struct ctdb_context *ctdb = rec->ctdb;`

			`ZERO_STRUCTP(em);`

change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`em->pnn = rec->ctdb->pnn;`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`em->priority_time = rec->priority_time;`

			`ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, rec, &nodemap);`
			`if (ret != 0) {`
recoverd: Improve an error message in the election code Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 275ed9ebe287e39d891888c13810c70f347af8ac) 2013-10-30 04:32:28 +04:00			`DEBUG(DEBUG_ERR,(__location__ " unable to get node map\n"));`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`return;`
			`}`

When we create new election data to send during elections, we must re-read the node flags from the main daemon to catch when the STOPPED flag is changed. (This used to be ctdb commit ca4982c40d81db528fe915d5ecc01fcf7df0b522) 2009-07-17 05:37:03 +04:00			`rec->node_flags = nodemap->nodes[ctdb->pnn].flags;`
			`em->node_flags = rec->node_flags;`

choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`for (i=0;i<nodemap->num;i++) {`
			`if (!(nodemap->nodes[i].flags & NODE_FLAGS_DISCONNECTED)) {`
			`em->num_connected++;`
			`}`
			`}`
make sure we lose all elections for recmaster role if we do not have the recmaster capability. (unless there are no other node at all available with this capability) (This used to be ctdb commit 8556e9dc897c6b9b9be0b52f391effb1f72fbd80) 2008-05-06 07:56:56 +04:00
			`/* we shouldn't try to win this election if we can't be a recmaster */`
			`if ((ctdb->capabilities & CTDB_CAP_RECMASTER) == 0) {`
			`em->num_connected = 0;`
			`em->priority_time = timeval_current();`
			`}`

choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`talloc_free(nodemap);`
			`}`
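The connected-node count that ctdb_election_data() advertises can be illustrated in isolation. This is a rough sketch only: the flag value and the `sketch_` names are placeholders rather than the real `NODE_FLAGS_DISCONNECTED` definition, and the node map is reduced to a plain array of flag words.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit for this sketch; not the real NODE_FLAGS_DISCONNECTED value. */
#define SKETCH_FLAG_DISCONNECTED 0x1u

/* Count the nodes that are not flagged as disconnected. A node that
 * cannot act as recovery master advertises 0, so it does not outrank
 * capable nodes on this field. */
static uint32_t sketch_advertised_connected(const uint32_t *flags, size_t n,
					    bool can_be_recmaster)
{
	uint32_t count = 0;
	size_t i;

	if (!can_be_recmaster) {
		return 0;
	}

	for (i = 0; i < n; i++) {
		if (!(flags[i] & SKETCH_FLAG_DISCONNECTED)) {
			count++;
		}
	}
	return count;
}
```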

			`/*`
			`see if the given election data wins`
			`*/`
			`static bool ctdb_election_win(struct ctdb_recoverd *rec, struct election_message *em)`
			`{`
			`struct election_message myem;`
- catch ESTALE in the recovery lock by trying a read() - priortise nodes that are unbanned and healthy in the election (This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f) 2007-10-05 07:28:21 +04:00			`int cmp = 0;`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00
			`ctdb_election_data(rec, &myem);`

Fix various spelling errors Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Fri Nov 6 13:43:45 CET 2015 on sn-devel-104 2015-07-27 00:02:57 +03:00			`/* we can't win if we don't have the recmaster capability */`
make sure we lose all elections for recmaster role if we do not have the recmaster capability. (unless there are no other node at all available with this capability) (This used to be ctdb commit 8556e9dc897c6b9b9be0b52f391effb1f72fbd80) 2008-05-06 07:56:56 +04:00			`if ((rec->ctdb->capabilities & CTDB_CAP_RECMASTER) == 0) {`
			`return false;`
			`}`

simplify election handling make sure we read and update the flags from all remote nodes before we reach the first codepath that can call do_recovery() since during do_recovery() we need to know what the flags are. (This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f) 2007-10-11 00:16:36 +04:00			`/* we can't win if we are banned */`
			`if (rec->node_flags & NODE_FLAGS_BANNED) {`
merge from ronnie (This used to be ctdb commit d18712caba11855010be52f90bac656683076676) 2007-10-15 08:17:49 +04:00			`return false;`
recoverd: eliminate some trailing spaces from ctdb_election_win() Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit df30c0a05ed908fc2a997c56ff5484736b23b70f) 2013-06-21 16:06:22 +04:00			`}`
- catch ESTALE in the recovery lock by trying a read() - priortise nodes that are unbanned and healthy in the election (This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f) 2007-10-05 07:28:21 +04:00
stopped nodes can not win a recmaster election stopped nodes must yield the recmaster role (This used to be ctdb commit b75ac1185481060ab71bd743e1e48d333d716eba) 2009-07-09 08:44:03 +04:00			`/* we can't win if we are stopped */`
			`if (rec->node_flags & NODE_FLAGS_STOPPED) {`
			`return false;`
recoverd: eliminate some trailing spaces from ctdb_election_win() Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit df30c0a05ed908fc2a997c56ff5484736b23b70f) 2013-06-21 16:06:22 +04:00			`}`
stopped nodes can not win a recmaster election stopped nodes must yield the recmaster role (This used to be ctdb commit b75ac1185481060ab71bd743e1e48d333d716eba) 2009-07-09 08:44:03 +04:00
simplify election handling make sure we read and update the flags from all remote nodes before we reach the first codepath that can call do_recovery() since during do_recovery() we need to know what the flags are. (This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f) 2007-10-11 00:16:36 +04:00			`/* we will automatically win if the other node is banned */`
			`if (em->node_flags & NODE_FLAGS_BANNED) {`
merge from ronnie (This used to be ctdb commit d18712caba11855010be52f90bac656683076676) 2007-10-15 08:17:49 +04:00			`return true;`
- catch ESTALE in the recovery lock by trying a read() - priortise nodes that are unbanned and healthy in the election (This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f) 2007-10-05 07:28:21 +04:00			`}`

stopped nodes can not win a recmaster election stopped nodes must yield the recmaster role (This used to be ctdb commit b75ac1185481060ab71bd743e1e48d333d716eba) 2009-07-09 08:44:03 +04:00			`/* we will automatically win if the other node is stopped */`
			`if (em->node_flags & NODE_FLAGS_STOPPED) {`
			`return true;`
			`}`

choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`/* then the longest running node */`
			`if (cmp == 0) {`
later times are a lower priority, not a higher priority (This used to be ctdb commit e96424e7d366df29767c4eeaccdcc0cc975cb8ae) 2007-06-07 13:21:55 +04:00			`cmp = timeval_compare(&em->priority_time, &myem.priority_time);`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`}`

			`if (cmp == 0) {`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`cmp = (int)myem.pnn - (int)em->pnn;`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`}`

			`return cmp > 0;`
			`}`
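The decision order in ctdb_election_win() can be condensed into a standalone sketch: eligibility checks first, then the earlier priority time wins, then the higher pnn breaks a tie. Everything named `sketch_` here is hypothetical, the flag bits are placeholders rather than the real `NODE_FLAGS_*` values, and the time comparison is a local helper standing in for `timeval_compare()`.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/time.h>

/* Placeholder flag bits for this sketch; not the real NODE_FLAGS_* values. */
#define SKETCH_FLAG_BANNED  0x1u
#define SKETCH_FLAG_STOPPED 0x2u

struct sketch_candidate {
	struct timeval priority_time; /* earlier means running longer */
	uint32_t pnn;
	uint32_t node_flags;
	bool can_be_recmaster;
};

/* Three-way compare: -1 if a is earlier than b, 0 if equal, 1 if later. */
static int sketch_timeval_compare(const struct timeval *a,
				  const struct timeval *b)
{
	if (a->tv_sec != b->tv_sec) {
		return (a->tv_sec > b->tv_sec) ? 1 : -1;
	}
	if (a->tv_usec != b->tv_usec) {
		return (a->tv_usec > b->tv_usec) ? 1 : -1;
	}
	return 0;
}

/* Returns true if "me" should win the election against "other". */
static bool sketch_election_win(const struct sketch_candidate *me,
				const struct sketch_candidate *other)
{
	int cmp;

	/* A node that is incapable, banned or stopped never wins. */
	if (!me->can_be_recmaster) {
		return false;
	}
	if (me->node_flags & (SKETCH_FLAG_BANNED | SKETCH_FLAG_STOPPED)) {
		return false;
	}

	/* An eligible node always beats a banned or stopped one. */
	if (other->node_flags & (SKETCH_FLAG_BANNED | SKETCH_FLAG_STOPPED)) {
		return true;
	}

	/* The longest running node wins: positive if other started later. */
	cmp = sketch_timeval_compare(&other->priority_time, &me->priority_time);

	/* Tie-break on pnn; comparing instead of subtracting avoids
	 * any wrap-around on large unsigned values. */
	if (cmp == 0) {
		cmp = (me->pnn > other->pnn) - (me->pnn < other->pnn);
	}

	return cmp > 0;
}
```

Because the tie-break is strict, two candidates with identical time and pnn both report a loss, which matches the `cmp > 0` test in the function above.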
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
			`/*`
			`send out an election request`
			`*/`
Revert "if a new node enters the cluster, that node will already be frozen at start" This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94. Furthermore, if a node doesn't force an election but wins it then it can fail to record that it is the new recovery master. This can lead to a reverse split brain where there is no recovery master. This reverts commit c5035657606283d2e35bea40992505e84ca8e7be. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Conflicts: server/ctdb_recoverd.c (This used to be ctdb commit c8b542e059a54b8d524bd430cad9d82e5edd864d) 2013-10-29 09:38:42 +04:00			`static int send_election_request(struct ctdb_recoverd *rec, uint32_t pnn)`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`{`
			`int ret;`
			`TDB_DATA election_data;`
			`struct election_message emsg;`
			`uint64_t srvid;`
use a priority time for the election data, not just the vnn (This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d) 2007-06-07 12:37:27 +04:00			`struct ctdb_context *ctdb = rec->ctdb;`
simplify election handling make sure we read and update the flags from all remote nodes before we reach the first codepath that can call do_recovery() since during do_recovery() we need to know what the flags are. (This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f) 2007-10-11 00:16:36 +04:00
ctdb-include: Use new protocol definitions This gets rid of the duplicate definitions from ctdb_protocol.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:51:52 +03:00			`srvid = CTDB_SRVID_ELECTION;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`ctdb_election_data(rec, &emsg);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
			`election_data.dsize = sizeof(struct election_message);`
			`election_data.dptr = (unsigned char *)&emsg;`


Revert "if a new node enters the cluster, that node will already be frozen at start" This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94. Furthermore, if a node doesn't force an election but wins it then it can fail to record that it is the new recovery master. This can lead to a reverse split brain where there is no recovery master. This reverts commit c5035657606283d2e35bea40992505e84ca8e7be. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Conflicts: server/ctdb_recoverd.c (This used to be ctdb commit c8b542e059a54b8d524bd430cad9d82e5edd864d) 2013-10-29 09:38:42 +04:00			`/* first we assume we will win the election and set`
			`recoverymaster to be ourself on the current node`
			`*/`
ctdb-recoverd: Clarify that recmaster is being set on the current node That is, using CTDB_CURRENT_NODE makes this more obvious. Also fix incorrect error messages. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:27:12 +03:00			`ret = ctdb_ctrl_setrecmaster(ctdb, CONTROL_TIMEOUT(),`
			`CTDB_CURRENT_NODE, pnn);`
Revert "if a new node enters the cluster, that node will already be frozen at start" This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94. Furthermore, if a node doesn't force an election but wins it then it can fail to record that it is the new recovery master. This can lead to a reverse split brain where there is no recovery master. This reverts commit c5035657606283d2e35bea40992505e84ca8e7be. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Conflicts: server/ctdb_recoverd.c (This used to be ctdb commit c8b542e059a54b8d524bd430cad9d82e5edd864d) 2013-10-29 09:38:42 +04:00			`if (ret != 0) {`
ctdb-recoverd: Clarify that recmaster is being set on the current node That is, using CTDB_CURRENT_NODE makes this more obvious. Also fix incorrect error messages. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:27:12 +03:00			`DEBUG(DEBUG_ERR, (__location__ " failed to set recmaster\n"));`
Revert "if a new node enters the cluster, that node will already be frozen at start" This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94. Furthermore, if a node doesn't force an election but wins it then it can fail to record that it is the new recovery master. This can lead to a reverse split brain where there is no recovery master. This reverts commit c5035657606283d2e35bea40992505e84ca8e7be. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Conflicts: server/ctdb_recoverd.c (This used to be ctdb commit c8b542e059a54b8d524bd430cad9d82e5edd864d) 2013-10-29 09:38:42 +04:00			`return -1;`
			`}`
ctdb-recoverd: Have recovery daemon remember election result The recovery daemon pushes knowledge of recovery master election progress/result to local daemon. It then retrieves that information again. Instead, have the recovery daemon reliably track election progress/result in rec->recmaster so it doesn't need to be retrieved. Be careful to maintain consistency by only doing this when the local daemon has been updated. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 06:32:41 +03:00			`rec->recmaster = pnn;`
Revert "if a new node enters the cluster, that node will already be frozen at start" This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94. Furthermore, if a node doesn't force an election but wins it then it can fail to record that it is the new recovery master. This can lead to a reverse split brain where there is no recovery master. This reverts commit c5035657606283d2e35bea40992505e84ca8e7be. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Conflicts: server/ctdb_recoverd.c (This used to be ctdb commit c8b542e059a54b8d524bd430cad9d82e5edd864d) 2013-10-29 09:38:42 +04:00
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`/* send an election message to all active nodes */`
When we create new election data to send during elections, we must re-read the node flags from the main daemon to catch when the STOPPED flag is changed. (This used to be ctdb commit ca4982c40d81db528fe915d5ecc01fcf7df0b522) 2009-07-17 05:37:03 +04:00			`DEBUG(DEBUG_INFO,(__location__ " Send election request to all active nodes\n"));`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`return ctdb_client_send_message(ctdb, CTDB_BROADCAST_ALL, srvid, election_data);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`}`

make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`/*`
			`we think we are winning the election - send a broadcast election request`
			`*/`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void election_send_request(struct tevent_context *ev,`
			`struct tevent_timer *te,`
			`struct timeval t, void *p)`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`{`
			`struct ctdb_recoverd *rec = talloc_get_type(p, struct ctdb_recoverd);`
			`int ret;`

Revert "if a new node enters the cluster, that node will already be frozen at start" This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94. Furthermore, if a node doesn't force an election but wins it then it can fail to record that it is the new recovery master. This can lead to a reverse split brain where there is no recovery master. This reverts commit c5035657606283d2e35bea40992505e84ca8e7be. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Conflicts: server/ctdb_recoverd.c (This used to be ctdb commit c8b542e059a54b8d524bd430cad9d82e5edd864d) 2013-10-29 09:38:42 +04:00			`ret = send_election_request(rec, ctdb_get_pnn(rec->ctdb));`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Failed to send election request!\n"));`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`}`

ctdb-recoverd: Simplify using TALLOC_FREE() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 08:03:38 +03:00			`TALLOC_FREE(rec->send_election_te);`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`}`

add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`/*`
			`handler for memory dumps`
			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void mem_dump_handler(uint64_t srvid, TDB_DATA data, void *private_data)`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`TALLOC_CTX *tmp_ctx = talloc_new(ctdb);`
			`TDB_DATA *dump;`
			`int ret;`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message *rd;`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`if (data.dsize != sizeof(struct ctdb_srvid_message)) {`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`DEBUG(DEBUG_ERR, (__location__ " Wrong size of return address.\n"));`
fix a slow memory leak in the recovery daemon in the error paths for the memdump function (This used to be ctdb commit 5e641ef9d6cca286061138a9680dcf2495736e8b) 2008-09-16 03:00:48 +04:00			`talloc_free(tmp_ctx);`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`return;`
			`}`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`rd = (struct ctdb_srvid_message *)data.dptr;`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00
			`dump = talloc_zero(tmp_ctx, TDB_DATA);`
			`if (dump == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " Failed to allocate memory for memdump\n"));`
			`talloc_free(tmp_ctx);`
			`return;`
			`}`
			`ret = ctdb_dump_memory(ctdb, dump);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " ctdb_dump_memory() failed\n"));`
			`talloc_free(tmp_ctx);`
			`return;`
			`}`

			`DEBUG(DEBUG_ERR, ("recovery master memory dump\n"));`

rename ctdb_send_message to ctdb_client_send_message to resolve colission with the function of the same name in libctdb (This used to be ctdb commit ac3292c12832484a22715f1d46aa23f3b7c8a6f6) 2010-06-02 03:45:21 +04:00			`ret = ctdb_client_send_message(ctdb, rd->pnn, rd->srvid, *dump);`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR,("Failed to send rd memdump reply message\n"));`
fix a slow memory leak in the recovery daemon in the error paths for the memdump function (This used to be ctdb commit 5e641ef9d6cca286061138a9680dcf2495736e8b) 2008-09-16 03:00:48 +04:00			`talloc_free(tmp_ctx);`
add improvements to tracking memory usage in ctdbd adn the recovery daemon and a ctdb command to pull the talloc memory map from a recovery daemon ctdb rddumpmemory (This used to be ctdb commit d23950be7406cf288f48b660c0f57a9b8d7bdd05) 2008-04-01 08:34:54 +04:00			`return;`
			`}`

			`talloc_free(tmp_ctx);`
			`}`

add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00			`/*`
			`handler for reload_nodes`
			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void reload_nodes_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00
			`DEBUG(DEBUG_ERR, (__location__ " Reload nodes file from recovery daemon\n"));`

recoverd: Remove function reload_nodes_file() It is a 1 line wrapper around ctdb_load_nodes_file(), so use that instead. We need less code... :-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 4a5d5935f4410a93a3343d85a24dbcddae2c4c20) 2013-10-14 06:54:39 +04:00			`ctdb_load_nodes_file(rec->ctdb);`
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00			`}`

add a new message to ask the recovery daemon to temporarily disable checking ip address consistency. This is useful when we are moving addresses using moveip in the cluster since otherwise if we collide with the recovery daemons own check we could cause a recovery (This used to be ctdb commit 9c63858c0b22c81eaccb9865a414af0bbb2833d4) 2009-10-06 05:11:32 +04:00
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void recd_node_rebalance_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`uint32_t pnn;`
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`uint32_t *t;`
			`int len;`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`if (rec->recmaster != ctdb_get_pnn(ctdb)) {`
			`return;`
			`}`

When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`if (data.dsize != sizeof(uint32_t)) {`
			`DEBUG(DEBUG_ERR,(__location__ " Incorrect size of node rebalance message. Was %zu but expected %zu bytes\n", data.dsize, sizeof(uint32_t)));`
			`return;`
			`}`

			`pnn = *(uint32_t *)&data.dptr[0];`

recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`DEBUG(DEBUG_NOTICE,("Setting up rebalance of IPs to node %u\n", pnn));`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`/* Copy any existing list of nodes. There's probably some`
			`* sort of realloc variant that will do this but we need to`
			`* make sure that freeing the old array also cancels the timer`
			`* event for the timeout... not sure if realloc will do that.`
			`*/`
			`len = (rec->force_rebalance_nodes != NULL) ?`
			`talloc_array_length(rec->force_rebalance_nodes) :`
			`0;`

			`/* This allows duplicates to be added but they don't cause`
			`* harm. A call to add a duplicate PNN arguably means that`
			`* the timeout should be reset, so this is the simplest`
			`* solution.`
			`*/`
			`t = talloc_zero_array(rec, uint32_t, len+1);`
			`CTDB_NO_MEMORY_VOID(ctdb, t);`
			`if (len > 0) {`
			`memcpy(t, rec->force_rebalance_nodes, sizeof(uint32_t) * len);`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`}`
recoverd: Fix the implementation of CTDB_SRVID_REBALANCE_NODE The current implementation has a few flaws: * A takeover run is called unconditionally when the timer goes even if the recovery master role has moved. This means a node other than the recovery master can incorrectly do a takeover run. * The rebalancing target nodes are cleared in the setup for a takeover run, regardless of whether the takeover run succeeds. * The timer to force a rebalance isn't cleared if another takeover run occurs before the deadline. Any forced rebalancing will happen in the first takeover run and when the timer expires some time later then an unnecessary takeover run will occur. * If the recovery master role moves then the rebalancing data will stay on the original node and affect the next takeover run to occur if the recovery master role should come back to the original node. Instead, store an array of rebalance target nodes in the recovery master context. This is passed as an extra argument to ctdb_takeover_run() each time it is called and is cleared when a takeover run succeeds. The timer hangs off the array of rebalance target nodes, which is cleared if the node isn't the recovery master. This means that it is possible to lose rebalance data if the recovery master role moves. However, that's a difficult problem to solve. The best way of approaching it is probably to try to stop the recovery master role from jumping around unnecesarily when inactive nodes join the cluster. The long term solution is to avoid this nonsense completely. The IP allocation algorithm needs to cache state between runs so that it knows which nodes have just become healthy. This also needs recovery master stability. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c51c1efe5fc7fa668597f2acd435dee16e410fc9) 2013-09-04 08:30:04 +04:00			`t[len] = pnn;`

			`talloc_free(rec->force_rebalance_nodes);`

			`rec->force_rebalance_nodes = t;`
When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`}`
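The handler above decodes a PNN from a fixed-size payload and then appends it to a freshly allocated copy of the existing list. A minimal standalone sketch of that pattern follows. It assumes plain `malloc`-family allocation in place of talloc, so it does not reproduce the talloc behaviour the comment relies on (freeing the old array also cancelling the attached timer event). The names `example_decode_pnn` and `example_append_pnn` are hypothetical and are not part of ctdb.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: decode a PNN from a raw message payload.
 * Returns 0 on success, -1 if the payload is missing or is not
 * exactly sizeof(uint32_t) bytes. memcpy() is used so that the
 * read does not depend on the alignment of buf.
 */
int example_decode_pnn(const uint8_t *buf, size_t len, uint32_t *out)
{
	if (buf == NULL || len != sizeof(uint32_t)) {
		return -1;
	}
	memcpy(out, buf, sizeof(uint32_t));
	return 0;
}

/*
 * Hypothetical helper: build a new array holding the old entries
 * plus pnn, then release the old array. Duplicates are allowed,
 * as in the handler. On allocation failure the old array is left
 * untouched and NULL is returned.
 */
uint32_t *example_append_pnn(uint32_t *old, size_t len, uint32_t pnn)
{
	uint32_t *t;

	/* An empty list may be NULL; a non-empty one may not. */
	assert(old != NULL || len == 0);

	t = calloc(len + 1, sizeof(uint32_t));
	if (t == NULL) {
		return NULL;
	}
	if (len > 0) {
		memcpy(t, old, sizeof(uint32_t) * len);
	}
	t[len] = pnn;

	free(old);
	return t;
}
```

The caller owns the returned array and is expected to replace its stored pointer with it, as the handler does with `rec->force_rebalance_nodes`.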



ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 05:05:12 +03:00			`static void srvid_disable_and_reply(struct ctdb_context *ctdb,`
			`TDB_DATA data,`
			`struct ctdb_op_state *op_state)`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`{`
ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`struct ctdb_disable_message *r;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`uint32_t timeout;`
			`TDB_DATA result;`
			`int32_t ret = 0;`

			`/* Validate input data */`
ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`if (data.dsize != sizeof(struct ctdb_disable_message)) {`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`DEBUG(DEBUG_ERR,(__location__ " Wrong size for data :%lu "`
			`"expecting %lu\n", (long unsigned)data.dsize,`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`(long unsigned)sizeof(struct ctdb_disable_message)));`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`return;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`}`
			`if (data.dptr == NULL) {`
			`DEBUG(DEBUG_ERR,(__location__ " No data received\n"));`
ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-11 05:39:27 +04:00			`return;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`}`

ctdb-daemon: Rename struct srvid_request_data to ctdb_disable_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 10:23:13 +03:00			`r = (struct ctdb_disable_message *)data.dptr;`
			`timeout = r->timeout;`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 05:05:12 +03:00			`ret = ctdb_op_disable(op_state, ctdb->ev, timeout);`
ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`if (ret != 0) {`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`goto done;`
			`}`

			`/* Returning our PNN tells the caller that we succeeded */`
			`ret = ctdb_get_pnn(ctdb);`
			`done:`
			`result.dsize = sizeof(int32_t);`
			`result.dptr = (uint8_t *)&ret;`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`srvid_request_reply(ctdb, (struct ctdb_srvid_message *)r, result);`
recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`}`
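The reply built above carries a single int32_t: the responder's PNN on success, or a negative error code on failure. A caller could interpret such a reply roughly as sketched below. The helper name `parse_disable_reply` and its return convention are assumptions made for illustration; they are not part of the ctdb API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical client-side helper (not a ctdb function): interpret the
 * int32_t reply. Returns 0 and stores the responder's PNN on success;
 * otherwise returns the negative code carried in the reply, or -1 if
 * the reply is missing or has the wrong size.
 */
static int parse_disable_reply(const uint8_t *buf, size_t len, uint32_t *pnn)
{
	int32_t ret;

	assert(pnn != NULL);

	if (buf == NULL || len != sizeof(int32_t)) {
		return -1;
	}

	/* memcpy avoids assuming the reply buffer is aligned */
	memcpy(&ret, buf, sizeof(ret));
	if (ret < 0) {
		return ret;
	}

	*pnn = (uint32_t)ret;
	return 0;
}
```

Using one signed value for both outcomes keeps the reply format fixed-size, at the cost of the caller having to know that non-negative means "PNN".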

ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void disable_takeover_runs_handler(uint64_t srvid, TDB_DATA data,`
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 05:05:12 +03:00			`void *private_data)`
			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 05:05:12 +03:00
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`srvid_disable_and_reply(rec->ctdb, data, rec->takeover_run);`
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 05:05:12 +03:00			`}`

ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:03:03 +03:00			`/* Backward compatibility for this SRVID */`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void disable_ip_check_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK Use disable_takeover_runs_handler() instead of maintaining duplicate logic. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0a51a85915486b2a8fded7ba6444b18c6c1ee8e8) 2013-08-28 05:32:54 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:03:03 +03:00			`uint32_t timeout;`
recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK Use disable_takeover_runs_handler() instead of maintaining duplicate logic. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0a51a85915486b2a8fded7ba6444b18c6c1ee8e8) 2013-08-28 05:32:54 +04:00
			`if (data.dsize != sizeof(uint32_t)) {`
			`DEBUG(DEBUG_ERR,(__location__ " Wrong size for data :%lu "`
			`"expecting %lu\n", (long unsigned)data.dsize,`
			`(long unsigned)sizeof(uint32_t)));`
			`return;`
			`}`
			`if (data.dptr == NULL) {`
			`DEBUG(DEBUG_ERR,(__location__ " No data received\n"));`
			`return;`
			`}`

ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:03:03 +03:00			`timeout = *((uint32_t *)data.dptr);`
recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK Use disable_takeover_runs_handler() instead of maintaining duplicate logic. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0a51a85915486b2a8fded7ba6444b18c6c1ee8e8) 2013-08-28 05:32:54 +04:00
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`ctdb_op_disable(rec->takeover_run, rec->ctdb->ev, timeout);`
recoverd: Reimplement CTDB_SRVID_DISABLE_IP_CHECK Use disable_takeover_runs_handler() instead of maintaining duplicate logic. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0a51a85915486b2a8fded7ba6444b18c6c1ee8e8) 2013-08-28 05:32:54 +04:00			`}`
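The handler above validates the payload size and pointer before reading a uint32_t timeout out of it. The same check-then-decode step can be sketched in isolation as follows. `struct blob` is a hypothetical stand-in for TDB_DATA, and `decode_u32` is an illustrative name, not a ctdb function; the sketch copies with memcpy rather than dereferencing a cast pointer, which is a substitution made here to avoid alignment assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for TDB_DATA: a pointer plus a length. */
struct blob {
	uint8_t *dptr;
	size_t dsize;
};

/*
 * Decode a uint32_t payload after checking pointer and size.
 * Returns 0 on success, -1 if the payload is missing or the wrong size.
 */
static int decode_u32(struct blob data, uint32_t *out)
{
	assert(out != NULL);

	if (data.dptr == NULL || data.dsize != sizeof(uint32_t)) {
		return -1;
	}

	/* memcpy avoids assuming the buffer is suitably aligned */
	memcpy(out, data.dptr, sizeof(uint32_t));
	return 0;
}
```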
add a new message to ask the recovery daemon to temporarily disable checking ip address consistency. This is useful when we are moving addresses using moveip in the cluster since otherwise if we collide with the recovery daemons own check we could cause a recovery (This used to be ctdb commit 9c63858c0b22c81eaccb9865a414af0bbb2833d4) 2009-10-06 05:11:32 +04:00
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void disable_recoveries_handler(uint64_t srvid, TDB_DATA data,`
ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:06:44 +03:00			`void *private_data)`
			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:06:44 +03:00
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`srvid_disable_and_reply(rec->ctdb, data, rec->recovery);`
ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:06:44 +03:00			`}`

add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`/*`
recoverd: Make the SRVID request structure generic No need for a separate one for each SRVID. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9c22b04d5aa7938a3965bd3144568664eb772ce) 2013-08-16 14:10:10 +04:00			`handler for ip reallocate, just add it to the list of requests and`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`handle this later in the monitor_cluster loop so we do not recurse`
recoverd: Make the SRVID request structure generic No need for a separate one for each SRVID. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit d9c22b04d5aa7938a3965bd3144568664eb772ce) 2013-08-16 14:10:10 +04:00			`with other requests to takeover_run()`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void ip_reallocate_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`{`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message *request;`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`if (data.dsize != sizeof(struct ctdb_srvid_message)) {`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`DEBUG(DEBUG_ERR, (__location__ " Wrong size of return address.\n"));`
			`return;`
			`}`

ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`request = (struct ctdb_srvid_message *)data.dptr;`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`srvid_request_add(rec->ctdb, &rec->reallocate_requests, request);`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`}`

recoverd: Factor out the SRVID handling code The code that handles IP reallocate requests can be reused. This also changes the result back to a SRVID caller to the PNN on success or a negative error code on failure. None of the callers currently look at the result so this is harmless... but it will be useful later. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e4eae6e3291baa299a1d0f733ab11b138ee699a3) 2013-08-16 14:02:34 +04:00			`static void process_ipreallocate_requests(struct ctdb_context *ctdb,`
			`struct ctdb_recoverd *rec)`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`{`
			`TDB_DATA result;`
			`int32_t ret;`
ctdb-recoverd: Only respond to currently queued ipreallocated requests Otherwise new requests can come in during the latter parts of the takeover run when the IP allocation algorithm has already run, and the new requests will be dequeued even though they haven't really be processed. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-22 06:57:03 +04:00			`struct srvid_requests *current;`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00
ctdb-recoverd: Only respond to currently queued ipreallocated requests Otherwise new requests can come in during the latter parts of the takeover run when the IP allocation algorithm has already run, and the new requests will be dequeued even though they haven't really be processed. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-22 06:57:03 +04:00			`/* Only process requests that are currently pending. More`
			`* might come in while the takeover run is in progress and`
			`* they will need to be processed later since they might`
			`* be in response to flag changes.`
			`*/`
			`current = rec->reallocate_requests;`
			`rec->reallocate_requests = NULL;`

ctdb-takeover: Recovery daemon no longer passes fail callback Banning is now handled by the takeover code sending banning credit messages. This commit makes a change in behaviour quite obvious. Takeover runs were initiated from several locations in the code but banning was only done from one of these locations. Now banning can be done from any failed takeover run. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 08:35:08 +03:00			`if (do_takeover_run(rec, rec->nodemap)) {`
ctdb-recoverd: Reload remote IPs as part of takeover run This is currently done before each IP takeover run, so just factor it in. ctdb_reload_remote_public_ips() becomes static. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Thu Nov 12 09:28:45 CET 2015 on sn-devel-104 2015-10-28 12:04:41 +03:00			`ret = ctdb_get_pnn(ctdb);`
			`} else {`
			`ret = -1;`
server: reload the public addresses before doing a takeover run metze (This used to be ctdb commit 0e41a2204fa8a1e77dc83c0d4b253ab272b5c72d) 2010-01-19 10:42:48 +03:00			`}`

add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`result.dsize = sizeof(int32_t);`
			`result.dptr = (uint8_t *)&ret;`

ctdb-recoverd: Only respond to currently queued ipreallocated requests Otherwise new requests can come in during the latter parts of the takeover run when the IP allocation algorithm has already run, and the new requests will be dequeued even though they haven't really be processed. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-22 06:57:03 +04:00			`srvid_requests_reply(ctdb, &current, result);`
add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has comleted (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216) 2009-07-02 07:00:26 +04:00			`}`
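process_ipreallocate_requests() detaches the pending list before starting the takeover run, so requests that arrive during the run stay queued for the next pass instead of being answered prematurely. A minimal sketch of that detach-then-process pattern is shown below. All types and names here (`struct request`, `struct queue`, `process_pending`, `run_and_enqueue`) are hypothetical and much simpler than the real srvid request list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request node; not the real struct srvid_requests. */
struct request {
	struct request *next;
	int32_t reply;   /* filled in when the request is answered */
	int answered;
};

struct queue {
	struct request *pending;
};

static void queue_add(struct queue *q, struct request *r)
{
	r->next = q->pending;
	q->pending = r;
}

/*
 * Detach the pending list first, then run the operation and reply only
 * to the detached requests. "run" stands in for the takeover run and
 * may itself cause new requests to be queued.
 */
static void process_pending(struct queue *q, int32_t self_id,
			    int (*run)(struct queue *q, void *arg),
			    void *arg)
{
	struct request *current = q->pending;
	struct request *r;
	int32_t ret;

	assert(run != NULL);
	q->pending = NULL;

	/* Own ID on success, -1 on failure, as in the code above */
	ret = run(q, arg) ? self_id : -1;

	for (r = current; r != NULL; r = r->next) {
		r->reply = ret;
		r->answered = 1;
	}
}

/* Example "run" that succeeds and queues a late request (arg). */
static int run_and_enqueue(struct queue *q, void *arg)
{
	queue_add(q, (struct request *)arg);
	return 1;
}
```

With this shape, a request queued by `run_and_enqueue` is still on `q->pending` after `process_pending` returns, and only the requests that were present beforehand have been answered.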
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00
ctdb-recoverd: Add message handler to assigning banning credits This will be called from recovery helper to assign banning credits to misbehaving node. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-03-17 09:26:30 +03:00			`/*`
			`* handler for assigning banning credits`
			`*/`
			`static void banning_handler(uint64_t srvid, TDB_DATA data, void *private_data)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`uint32_t ban_pnn;`

			`/* Ignore if we are not recmaster */`
			`if (rec->ctdb->pnn != rec->recmaster) {`
			`return;`
			`}`

			`if (data.dsize != sizeof(uint32_t)) {`
			`DEBUG(DEBUG_ERR, (__location__ " invalid data size %zu\n",`
			`data.dsize));`
			`return;`
			`}`

			`ban_pnn = *(uint32_t *)data.dptr;`

			`ctdb_set_culprit_count(rec, ban_pnn, rec->nodemap->num);`
			`}`
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`/*`
			`handler for recovery master elections`
			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void election_handler(uint64_t srvid, TDB_DATA data, void *private_data)`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`int ret;`
			`struct election_message *em = (struct election_message *)data.dptr;`

ctdb-recoverd: A node refuses to play against itself Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> 2013-11-01 07:34:20 +04:00			`/* Ignore election packets from ourself */`
			`if (ctdb->pnn == em->pnn) {`
			`return;`
			`}`

make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`/* we got an election packet - update the timeout for the election */`
			`talloc_free(rec->election_timeout);`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`rec->election_timeout = tevent_add_timer(`
			`ctdb->ev, ctdb,`
			`fast_start ?`
			`timeval_current_ofs(0, 500000) :`
			`timeval_current_ofs(ctdb->tunable.election_timeout, 0),`
			`ctdb_election_timeout, rec);`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`/* someone called an election. check their election data`
			`and if we disagree and we would rather be the elected node,`
			`send a new election message to all other nodes`
			`*/`
choose the most connected node first (This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2) 2007-06-07 13:17:27 +04:00			`if (ctdb_election_win(rec, em)) {`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`if (!rec->send_election_te) {`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`rec->send_election_te = tevent_add_timer(`
			`ctdb->ev, rec,`
			`timeval_current_ofs(0, 500000),`
			`election_send_request, rec);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`}`
			`return;`
			`}`
ctdb-recoverd: New function ctdb_recovery_have_lock() True if this recovery daemon holds the lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-12-09 05:50:22 +03:00
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`/* we didn't win */`
ctdb-recoverd: Simplify using TALLOC_FREE() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-31 05:59:02 +03:00			`TALLOC_FREE(rec->send_election_te);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
ctdb-recoverd: Remove redundant condition when checking recovery lock It isn't possible to hold the recovery lock without having a lock file set. This is part of a goal to generalise the recovery lock mechanism to just use a helper program, which may use a lock file or may use something else. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-31 05:59:49 +03:00			`/* Release the recovery lock file */`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`if (ctdb_recovery_have_lock(rec)) {`
			`ctdb_recovery_unlock(rec);`
- startup frozen, and do an initial recovery - fixed a bug in traverse - get a lock on the node list file in the recmaster recovery daemon (This used to be ctdb commit 162a5647535ad1cb3e8e5d4042a2784365fb1913) 2007-05-23 08:35:19 +04:00			`}`

recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`/* ok, let that guy become recmaster then */`
ctdb-recoverd: Clarify that recmaster is being set on the current node That is, using CTDB_CURRENT_NODE makes this more obvious. Also fix incorrect error messages. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:27:12 +03:00			`ret = ctdb_ctrl_setrecmaster(ctdb, CONTROL_TIMEOUT(),`
			`CTDB_CURRENT_NODE, em->pnn);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`if (ret != 0) {`
ctdb-recoverd: Clarify that recmaster is being set on the current node That is, using CTDB_CURRENT_NODE makes this more obvious. Also fix incorrect error messages. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:27:12 +03:00			`DEBUG(DEBUG_ERR, (__location__ " failed to set recmaster\n"));`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`return;`
			`}`
ctdb-recoverd: Have recovery daemon remember election result The recovery daemon pushes knowledge of recovery master election progress/result to local daemon. It then retrieves that information again. Instead, have the recovery daemon reliably track election progress/result in rec->recmaster so it doesn't need to be retrieved. Be careful to maintain consistency by only doing this when the local daemon has been updated. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 06:32:41 +03:00			`rec->recmaster = em->pnn;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
			`return;`
			`}`
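The control flow of election_handler() reduces to three outcomes: ignore our own packet, contest the election if we would win, or yield to the sender. The sketch below models only that decision. The types and names (`struct node_state`, `handle_election`, the `we_win` argument) are hypothetical, the outcome of ctdb_election_win() is passed in rather than computed, and the timers and the setrecmaster control are reduced to plain flags and an assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum election_action {
	ELECTION_IGNORE,   /* packet came from ourselves */
	ELECTION_CONTEST,  /* we would win: schedule our own election message */
	ELECTION_YIELD     /* sender wins: step back and record the new recmaster */
};

/* Hypothetical, heavily simplified stand-in for struct ctdb_recoverd. */
struct node_state {
	uint32_t pnn;
	uint32_t recmaster;
	int send_pending;  /* stands in for rec->send_election_te */
	int have_lock;     /* stands in for holding the recovery lock */
};

static enum election_action handle_election(struct node_state *n,
					    uint32_t sender_pnn,
					    int we_win)
{
	assert(n != NULL);

	if (n->pnn == sender_pnn) {
		return ELECTION_IGNORE;
	}

	if (we_win) {
		/* Schedule at most one outgoing election message */
		if (!n->send_pending) {
			n->send_pending = 1;
		}
		return ELECTION_CONTEST;
	}

	/* We lost: cancel any pending send, drop the lock, accept the sender */
	n->send_pending = 0;
	n->have_lock = 0;
	n->recmaster = sender_pnn;
	return ELECTION_YIELD;
}
```

In the real handler the recmaster field is updated only after the local daemon has accepted the new value; the sketch omits that failure path.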


implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`force the start of the election process`
			`*/`
add a new tunable : reclockpingperiod once every such interval : * the recovery master on each node will uppdate the "connected" count in the reclock count file (ctdb getreclock) * if the node thinks it is a recovery master but it detects another node that is DISCONNECTED but which still holds a lock to the reclock count file this may mean that we have a split cluster. if that other node that is DISCONNECTED but still holds the lock on hte reclock pnn count file, is MORE connected than the local node, yield the recmaster role and let the other half of the lcuster take over this add a second, last chance mechanism to detect split clusters. IF the cluster is split but GPFS is not yet split, this mechanism makes the largest half of the cluster become the active half. (This used to be ctdb commit 07af425f444531942cce8abff112c1524228d287) 2008-03-03 01:19:30 +03:00			`static void force_election(struct ctdb_recoverd *rec, uint32_t pnn,`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap)`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`{`
			`int ret;`
use a priority time for the election data, not just the vnn (This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d) 2007-06-07 12:37:27 +04:00			`struct ctdb_context *ctdb = rec->ctdb;`
when starting a new election, also force all nodes into recovery mode so there is no internode traffic to interfere with our election (This used to be ctdb commit ccfb67a076c72a0e7f2b6dc5fce9c19f652ba2ad) 2007-05-10 03:48:14 +04:00
When we create new election data to send during elections, we must re-read the node flags from the main daemon to catch when the STOPPED flag is changed. (This used to be ctdb commit ca4982c40d81db528fe915d5ecc01fcf7df0b522) 2009-07-17 05:37:03 +04:00			`DEBUG(DEBUG_INFO,(__location__ " Force an election\n"));`

when starting a new election, also force all nodes into recovery mode so there is no internode traffic to interfere with our election (This used to be ctdb commit ccfb67a076c72a0e7f2b6dc5fce9c19f652ba2ad) 2007-05-10 03:48:14 +04:00			`/* set all nodes to recovery mode to stop all internode traffic */`
ctdb-recoverd: Drop code to freeze databases from set_recovery_mode() This function is called only once from force_election() and does not require freezing of databases. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-09-13 08:45:54 +03:00			`ret = set_recovery_mode(ctdb, rec, nodemap, CTDB_RECOVERY_ACTIVE);`
in the destructor for the lock-wait child, make sure that we cancel any pending transactions. (This used to be ctdb commit 45b6ff64f6ddf037b810c4e5f8b9f04d71067b98) 2008-07-07 02:50:12 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode to active on cluster\n"));`
when starting a new election, also force all nodes into recovery mode so there is no internode traffic to interfere with our election (This used to be ctdb commit ccfb67a076c72a0e7f2b6dc5fce9c19f652ba2ad) 2007-05-10 03:48:14 +04:00			`return;`
			`}`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00
			`talloc_free(rec->election_timeout);`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`rec->election_timeout = tevent_add_timer(`
			`ctdb->ev, ctdb,`
			`fast_start ?`
			`timeval_current_ofs(0, 500000) :`
			`timeval_current_ofs(ctdb->tunable.election_timeout, 0),`
			`ctdb_election_timeout, rec);`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00
Revert "if a new node enters the cluster, that node will already be frozen at start" This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94. Furthermore, if a node doesn't force an election but wins it then it can fail to record that it is the new recovery master. This can lead to a reverse split brain where there is no recovery master. This reverts commit c5035657606283d2e35bea40992505e84ca8e7be. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Conflicts: server/ctdb_recoverd.c (This used to be ctdb commit c8b542e059a54b8d524bd430cad9d82e5edd864d) 2013-10-29 09:38:42 +04:00			`ret = send_election_request(rec, pnn);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`if (ret!=0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " failed to initiate recmaster election\n"));`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`return;`
			`}`

moved system specific ip code to system.c (This used to be ctdb commit 9de9e4ccda9665108baac12a8716b189d26340b1) 2007-05-26 08:01:08 +04:00			`/* wait for a few seconds to collect all responses */`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`ctdb_wait_election(rec);`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`}`



			`/*`
			`handler for when a node changes its flags`
			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void monitor_handler(uint64_t srvid, TDB_DATA data, void *private_data)`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`int ret;`
			`struct ctdb_node_flag_change *c = (struct ctdb_node_flag_change *)data.dptr;`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap=NULL;`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`TALLOC_CTX *tmp_ctx;`
			`int i;`

			`if (data.dsize != sizeof(*c)) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Invalid data in ctdb_node_flag_change\n"));`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`return;`
			`}`

			`tmp_ctx = talloc_new(ctdb);`
			`CTDB_NO_MEMORY_VOID(ctdb, tmp_ctx);`

			`ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, tmp_ctx, &nodemap);`
fixed segv on failed ctdb_ctrl_getnodemap (This used to be ctdb commit 5daf9a72f0e60a9af7cf32ae6d759be7d94857ec) 2007-12-27 02:07:01 +03:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " ctdb_ctrl_getnodemap failed in monitor_handler\n"));`
fixed segv on failed ctdb_ctrl_getnodemap (This used to be ctdb commit 5daf9a72f0e60a9af7cf32ae6d759be7d94857ec) 2007-12-27 02:07:01 +03:00			`talloc_free(tmp_ctx);`
			`return;`
			`}`

implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
			`for (i=0;i<nodemap->num;i++) {`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`if (nodemap->nodes[i].pnn == c->pnn) break;`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`}`

			`if (i == nodemap->num) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_CRIT,(__location__ " Flag change for non-existent node %u\n", c->pnn));`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`talloc_free(tmp_ctx);`
			`return;`
			`}`

recoverd: Really fix bogus info in message about changed flags Commit 9119a568c2b4601318f7751f537dca2f92a7230b attempted to fix this. However, this was wrong because old_flags and new_flags were confused. The latter has since been fixed in commit 7eb2f89979360b6cc98ca9b17c48310277fa89fc so this can now be fixed properly. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 40f2825d6e818dc8c745b6385a545969dfb45fbc) 2013-07-11 07:01:13 +04:00			`if (c->old_flags != c->new_flags) {`
			`DEBUG(DEBUG_NOTICE,("Node %u has changed flags - now 0x%x was 0x%x\n", c->pnn, c->new_flags, c->old_flags));`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`}`

change the structure used for node flag change messages so that we can see both the old flags as well as the new flags (so we can tell which flags changed) send the CTDB_SRVID_RECONFIGURE messages to connected nodes only, not to every node, connected or not, in the cluster. in the handler inside the recovery daemon which is invoked for node flag change messages, only do a takeover_run() and redistribute the ip addresses IF it was the disabled or the unhealthy flags that changed. Also send out the cluster reconfigured message in this case. If any of the other flags changed we dont need to do the takeover_run(0 here since that will be done during recovery. (This used to be ctdb commit 5549b2058e2c148a8ca9d419123acf3247bb8829) 2007-08-21 11:25:15 +04:00			`nodemap->nodes[i].flags = c->new_flags;`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
			`talloc_free(tmp_ctx);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`}`
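
The size check at the top of `monitor_handler` above is what makes it safe to read the message payload as a flag-change record. A minimal standalone sketch of that validate-then-read step follows. All `sketch_*` names and the three-field layout are hypothetical stand-ins, not the ctdb definitions, and the sketch copies the payload with `memcpy` instead of casting the pointer as the handler does, so it makes no assumption about the buffer's alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a TDB_DATA-style blob: pointer plus length. */
struct sketch_blob {
	uint8_t *dptr;
	size_t dsize;
};

/* Hypothetical stand-in for the flag-change record carried in the blob. */
struct sketch_flag_change {
	uint32_t pnn;
	uint32_t new_flags;
	uint32_t old_flags;
};

/*
 * Accept the payload only if it is exactly one record long, then copy
 * it out. Returns 0 on success, -1 if the blob is missing or mis-sized.
 */
static int sketch_parse_flag_change(struct sketch_blob data,
				    struct sketch_flag_change *out)
{
	if (data.dptr == NULL || data.dsize != sizeof(*out)) {
		return -1;
	}
	memcpy(out, data.dptr, sizeof(*out));
	return 0;
}
```

A caller would reject the message and return early on `-1`, as the handler does when `data.dsize != sizeof(*c)`.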
add a test in the function that checks whether the cluster needs recovery or not that all active nodes are in normal mode. If we discover that some node is still in recoverymode it may indicate that a previous recovery ended prematurely and thus we should start a new recovery (This used to be ctdb commit c15517872e6c98c8c425a8d47d2b348ecb0620b0) 2007-05-06 22:41:12 +04:00
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`/*`
			`handler for when we need to push out flag changes to all other nodes`
			`*/`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`static void push_flags_handler(uint64_t srvid, TDB_DATA data,`
			`void *private_data)`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`{`
ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-04-08 07:38:26 +03:00			`struct ctdb_recoverd *rec = talloc_get_type(`
			`private_data, struct ctdb_recoverd);`
			`struct ctdb_context *ctdb = rec->ctdb;`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`int ret;`
			`struct ctdb_node_flag_change *c = (struct ctdb_node_flag_change *)data.dptr;`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap=NULL;`
server: if takeover runs when the recovery master becomes unhealthy The problem was this: When the monitor event fails, the node->flags get updated, and an update (containing the old and new flags) is sent to the recovery master. If the recovery master sends the update to itself (the same process), it was compairing the node->flags variable with the received new flags. This check always found both flag values to be equal and never sets the rec->need_takeover_run variable to true. There were two problem, first the push_flags_handler() function didn't pass the received old flags. And the ctdb_control_modflags() function ignored the received old flags. metze (This used to be ctdb commit 8ec633b64a05a2d903c2b9639909f15f6375548f) 2009-10-09 17:47:49 +04:00			`TALLOC_CTX *tmp_ctx = talloc_new(ctdb);`
			`uint32_t *nodes;`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00
server: if takeover runs when the recovery master becomes unhealthy The problem was this: When the monitor event fails, the node->flags get updated, and an update (containing the old and new flags) is sent to the recovery master. If the recovery master sends the update to itself (the same process), it was compairing the node->flags variable with the received new flags. This check always found both flag values to be equal and never sets the rec->need_takeover_run variable to true. There were two problem, first the push_flags_handler() function didn't pass the received old flags. And the ctdb_control_modflags() function ignored the received old flags. metze (This used to be ctdb commit 8ec633b64a05a2d903c2b9639909f15f6375548f) 2009-10-09 17:47:49 +04:00			`/* read the node flags from the recmaster */`
ctdb-recoverd: Don't retrieve recovery master from local daemon The recovery daemon already knows which node is the master. This relies on rec->recmaster being correctly initialised and correctly set during elections. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:33:01 +03:00			`ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), rec->recmaster,`
			`tmp_ctx, &nodemap);`
server: if takeover runs when the recovery master becomes unhealthy The problem was this: When the monitor event fails, the node->flags get updated, and an update (containing the old and new flags) is sent to the recovery master. If the recovery master sends the update to itself (the same process), it was compairing the node->flags variable with the received new flags. This check always found both flag values to be equal and never sets the rec->need_takeover_run variable to true. There were two problem, first the push_flags_handler() function didn't pass the received old flags. And the ctdb_control_modflags() function ignored the received old flags. metze (This used to be ctdb commit 8ec633b64a05a2d903c2b9639909f15f6375548f) 2009-10-09 17:47:49 +04:00			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to get nodemap from node %u\n", rec->recmaster));`
			`talloc_free(tmp_ctx);`
			`return;`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`}`
server: if takeover runs when the recovery master becomes unhealthy The problem was this: When the monitor event fails, the node->flags get updated, and an update (containing the old and new flags) is sent to the recovery master. If the recovery master sends the update to itself (the same process), it was compairing the node->flags variable with the received new flags. This check always found both flag values to be equal and never sets the rec->need_takeover_run variable to true. There were two problem, first the push_flags_handler() function didn't pass the received old flags. And the ctdb_control_modflags() function ignored the received old flags. metze (This used to be ctdb commit 8ec633b64a05a2d903c2b9639909f15f6375548f) 2009-10-09 17:47:49 +04:00			`if (c->pnn >= nodemap->num) {`
			`DEBUG(DEBUG_ERR,(__location__ " Nodemap from recmaster does not contain node %u\n", c->pnn));`
			`talloc_free(tmp_ctx);`
			`return;`
			`}`

			`/* send the flags update to all connected nodes */`
			`nodes = list_of_connected_nodes(ctdb, nodemap, tmp_ctx, true);`

			`if (ctdb_client_async_control(ctdb, CTDB_CONTROL_MODIFY_FLAGS,`
			`nodes, 0, CONTROL_TIMEOUT(),`
			`false, data,`
			`NULL, NULL,`
			`NULL) != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " ctdb_control to modify node flags failed\n"));`

			`talloc_free(tmp_ctx);`
			`return;`
			`}`

			`talloc_free(tmp_ctx);`
reqrite the handling of flag updates across the cluster to eliminate a race between the ctdb tool and the recovery daemon both at once trying to push flag changes across the cluster. (This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa) 2008-11-19 06:43:46 +03:00			`}`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`struct verify_recmode_normal_data {`
			`uint32_t count;`
			`enum monitor_result status;`
			`};`

			`static void verify_recmode_normal_callback(struct ctdb_client_control_state *state)`
			`{`
change async.private to async.private_data since private is a reserved work in c++ (This used to be ctdb commit 79eb28f6cd5dcc30b04966d202a050eaf98a2552) 2007-09-26 08:25:32 +04:00			`struct verify_recmode_normal_data *rmdata = talloc_get_type(state->async.private_data, struct verify_recmode_normal_data);`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00

			`/* one more node has responded with recmode data*/`
			`rmdata->count--;`

			`/* if we failed to get the recmode, then return an error and let`
			`the main loop try again.`
			`*/`
			`if (state->state != CTDB_CONTROL_DONE) {`
			`if (rmdata->status == MONITOR_OK) {`
			`rmdata->status = MONITOR_FAILED;`
			`}`
			`return;`
			`}`

			`/* if we got a response, then the recmode will be stored in the`
			`status field`
			`*/`
			`if (state->status != CTDB_RECOVERY_NORMAL) {`
recoverd: Fix an unclear log message - "Restart recovery process" When the recovery master notices a node in recovery mode it starts the recovery process, it doesn't restart it. Update documentation to match. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 298c4d2c3b4ea3d900c91f5a0a5aca2952a13d61) 2013-06-30 11:57:33 +04:00			`DEBUG(DEBUG_NOTICE, ("Node:%u was in recovery mode. Start recovery process\n", state->c->hdr.destnode));`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`rmdata->status = MONITOR_RECOVERY_NEEDED;`
			`}`

			`return;`
			`}`


			`/* verify that all nodes are in normal recovery mode */`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`static enum monitor_result verify_recmode(struct ctdb_context *ctdb, struct ctdb_node_map_old *nodemap)`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`{`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`struct verify_recmode_normal_data *rmdata;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`TALLOC_CTX *mem_ctx = talloc_new(ctdb);`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`struct ctdb_client_control_state *state;`
			`enum monitor_result status;`
			`int j;`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`rmdata = talloc(mem_ctx, struct verify_recmode_normal_data);`
			`CTDB_NO_MEMORY_FATAL(ctdb, rmdata);`
			`rmdata->count = 0;`
			`rmdata->status = MONITOR_OK;`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00
			`/* loop over all active nodes and send an async getrecmode call to`
			`them*/`
			`for (j=0; j<nodemap->num; j++) {`
			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
			`continue;`
			`}`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`state = ctdb_ctrl_getrecmode_send(ctdb, mem_ctx,`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`CONTROL_TIMEOUT(),`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn);`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`if (state == NULL) {`
			`/* we failed to send the control, treat this as`
			`an error and try again next iteration`
			`*/`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Failed to call ctdb_ctrl_getrecmode_send during monitoring\n"));`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`talloc_free(mem_ctx);`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`return MONITOR_FAILED;`
			`}`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`/* set up the callback functions */`
			`state->async.fn = verify_recmode_normal_callback;`
change async.private to async.private_data since private is a reserved work in c++ (This used to be ctdb commit 79eb28f6cd5dcc30b04966d202a050eaf98a2552) 2007-09-26 08:25:32 +04:00			`state->async.private_data = rmdata;`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00
			`/* one more control to wait for to complete */`
			`rmdata->count++;`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`}`

change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00
			`/* now wait for up to the maximum number of seconds allowed`
			`or until all nodes we expect a response from have replied`
			`*/`
			`while (rmdata->count > 0) {`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_loop_once(ctdb->ev);`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`}`

			`status = rmdata->status;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`talloc_free(mem_ctx);`
change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00			`return status;`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`}`
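
`verify_recmode` and its callback above follow a fan-out pattern: one async control is sent per active node, a counter tracks outstanding replies, and each callback decrements the counter and may downgrade a shared status. The sketch below models only that bookkeeping, with no event loop or networking. All `sketch_*` names are hypothetical, a recmode of `0` is assumed to stand for "normal", and the `while` loop stands in for `tevent_loop_once()` dispatching callbacks.

```c
#include <stddef.h>
#include <stdint.h>

enum sketch_result { SKETCH_OK, SKETCH_FAILED, SKETCH_RECOVERY_NEEDED };

/* Shared state handed to every callback. */
struct sketch_fanout {
	uint32_t count;            /* replies still outstanding */
	enum sketch_result status; /* aggregated verdict */
};

/* Simulated reply: did the control complete, and what mode was reported. */
struct sketch_reply {
	int completed;
	int recmode; /* assumption: 0 means normal, non-zero means in recovery */
};

/* Per-reply callback, mirroring the structure of the handler above. */
static void sketch_on_reply(struct sketch_fanout *f, struct sketch_reply r)
{
	f->count--;

	if (!r.completed) {
		/* a failed control only downgrades an otherwise clean status */
		if (f->status == SKETCH_OK) {
			f->status = SKETCH_FAILED;
		}
		return;
	}

	if (r.recmode != 0) {
		/* a node in recovery overrides any earlier verdict */
		f->status = SKETCH_RECOVERY_NEEDED;
	}
}

static enum sketch_result sketch_collect(const struct sketch_reply *replies,
					 size_t n)
{
	struct sketch_fanout f = { .count = 0, .status = SKETCH_OK };
	size_t i;

	/* send phase: one outstanding reply per node */
	for (i = 0; i < n; i++) {
		f.count++;
	}

	/* wait phase: deliver replies until none are outstanding */
	i = 0;
	while (f.count > 0) {
		sketch_on_reply(&f, replies[i]);
		i++;
	}

	return f.status;
}
```

Note the precedence this encodes: a failed control never overwrites a "recovery needed" verdict, while a node reporting recovery mode replaces an earlier failure.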

change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546) 2007-08-27 03:40:10 +04:00
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`struct verify_recmaster_data {`
when a node disgrees with us re who is recmaster make it mark that node as a lcuprit so it eventually gets banned (This used to be ctdb commit eff3f326f8ce6070c9f3c430cd14d1b71a8db220) 2008-04-21 18:56:27 +04:00			`struct ctdb_recoverd *rec;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`uint32_t count;`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`uint32_t pnn;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`enum monitor_result status;`
			`};`

change the api for managing callbacks to controls so that isntead of passing it as a parameter we set the callback function explicitely from the caller if the ..._send() function returned a valid state pointer. (This used to be ctdb commit aa939570662786455f63299b62c99882cff29d42) 2007-08-24 04:42:06 +04:00			`static void verify_recmaster_callback(struct ctdb_client_control_state *state)`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`{`
change async.private to async.private_data since private is a reserved work in c++ (This used to be ctdb commit 79eb28f6cd5dcc30b04966d202a050eaf98a2552) 2007-09-26 08:25:32 +04:00			`struct verify_recmaster_data *rmdata = talloc_get_type(state->async.private_data, struct verify_recmaster_data);`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00

			`/* one more node has responded with recmaster data */`
			`rmdata->count--;`

			`/* if we failed to get the recmaster, then return an error and let`
			`the main loop try again.`
			`*/`
change the api for managing callbacks to controls so that isntead of passing it as a parameter we set the callback function explicitely from the caller if the ..._send() function returned a valid state pointer. (This used to be ctdb commit aa939570662786455f63299b62c99882cff29d42) 2007-08-24 04:42:06 +04:00			`if (state->state != CTDB_CONTROL_DONE) {`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`if (rmdata->status == MONITOR_OK) {`
			`rmdata->status = MONITOR_FAILED;`
			`}`
change the api for managing callbacks to controls so that isntead of passing it as a parameter we set the callback function explicitely from the caller if the ..._send() function returned a valid state pointer. (This used to be ctdb commit aa939570662786455f63299b62c99882cff29d42) 2007-08-24 04:42:06 +04:00			`return;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`}`

			`/* if we got a response, then the recmaster will be stored in the`
			`status field`
			`*/`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`if (state->status != rmdata->pnn) {`
recoverd: Improve log message when nodes disagree on recmaster Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7b7aa7b599536cd60ebb84d363607bb4e953248a) 2013-08-14 05:44:12 +04:00			`DEBUG(DEBUG_ERR,("Node %d thinks node %d is recmaster. Need a new recmaster election\n", state->c->hdr.destnode, state->status));`
when a node disgrees with us re who is recmaster make it mark that node as a lcuprit so it eventually gets banned (This used to be ctdb commit eff3f326f8ce6070c9f3c430cd14d1b71a8db220) 2008-04-21 18:56:27 +04:00			`ctdb_set_culprit(rmdata->rec, state->c->hdr.destnode);`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`rmdata->status = MONITOR_ELECTION_NEEDED;`
			`}`

change the api for managing callbacks to controls so that isntead of passing it as a parameter we set the callback function explicitely from the caller if the ..._send() function returned a valid state pointer. (This used to be ctdb commit aa939570662786455f63299b62c99882cff29d42) 2007-08-24 04:42:06 +04:00			`return;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`}`


			`/* verify that all nodes agree that we are the recmaster */`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`static enum monitor_result verify_recmaster(struct ctdb_recoverd *rec, struct ctdb_node_map_old *nodemap, uint32_t pnn)`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`{`
when a node disgrees with us re who is recmaster make it mark that node as a lcuprit so it eventually gets banned (This used to be ctdb commit eff3f326f8ce6070c9f3c430cd14d1b71a8db220) 2008-04-21 18:56:27 +04:00			`struct ctdb_context *ctdb = rec->ctdb;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`struct verify_recmaster_data *rmdata;`
			`TALLOC_CTX *mem_ctx = talloc_new(ctdb);`
			`struct ctdb_client_control_state *state;`
			`enum monitor_result status;`
			`int j;`

			`rmdata = talloc(mem_ctx, struct verify_recmaster_data);`
			`CTDB_NO_MEMORY_FATAL(ctdb, rmdata);`
when a node disgrees with us re who is recmaster make it mark that node as a lcuprit so it eventually gets banned (This used to be ctdb commit eff3f326f8ce6070c9f3c430cd14d1b71a8db220) 2008-04-21 18:56:27 +04:00			`rmdata->rec = rec;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`rmdata->count = 0;`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`rmdata->pnn = pnn;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`rmdata->status = MONITOR_OK;`

ctdb-recoverd: Do not sanity check recovery master with local daemon Each recovery daemon knows who the recmaster is and is in sync with its local daemon. The recovery master is running this check so do not bother checking with its local daemon - both agree that it is the recovery master. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:05:08 +03:00			`/* loop over all active nodes and send an async getrecmaster call to`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`them*/`
			`for (j=0; j<nodemap->num; j++) {`
ctdb-recoverd: Do not sanity check recovery master with local daemon Each recovery daemon knows who the recmaster is and is in sync with its local daemon. The recovery master is running this check so do not bother checking with its local daemon - both agree that it is the recovery master. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 07:05:08 +03:00			`if (nodemap->nodes[j].pnn == rec->recmaster) {`
			`continue;`
			`}`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
			`continue;`
			`}`
			`state = ctdb_ctrl_getrecmaster_send(ctdb, mem_ctx,`
get rid of the explicit global timeout used in the previous example and try this time by relying on the timeouts for the individual controls (This used to be ctdb commit 448a0eb4fd896dc545aa0b4bb2ba4628491578be) 2007-08-23 13:38:54 +04:00			`CONTROL_TIMEOUT(),`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn);`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`if (state == NULL) {`
			`/* we failed to send the control, treat this as`
			`an error and try again next iteration`
			`*/`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Failed to call ctdb_ctrl_getrecmaster_send during monitoring\n"));`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`talloc_free(mem_ctx);`
			`return MONITOR_FAILED;`
			`}`

change the api for managing callbacks to controls so that isntead of passing it as a parameter we set the callback function explicitely from the caller if the ..._send() function returned a valid state pointer. (This used to be ctdb commit aa939570662786455f63299b62c99882cff29d42) 2007-08-24 04:42:06 +04:00			`/* set up the callback functions */`
			`state->async.fn = verify_recmaster_callback;`
change async.private to async.private_data since private is a reserved work in c++ (This used to be ctdb commit 79eb28f6cd5dcc30b04966d202a050eaf98a2552) 2007-09-26 08:25:32 +04:00			`state->async.private_data = rmdata;`
change the api for managing callbacks to controls so that isntead of passing it as a parameter we set the callback function explicitely from the caller if the ..._send() function returned a valid state pointer. (This used to be ctdb commit aa939570662786455f63299b62c99882cff29d42) 2007-08-24 04:42:06 +04:00
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`/* one more control to wait for to complete */`
			`rmdata->count++;`
			`}`


			`/* now wait for up to the maximum number of seconds allowed`
			`or until all nodes we expect a response from have replied`
			`*/`
get rid of the explicit global timeout used in the previous example and try this time by relying on the timeouts for the individual controls (This used to be ctdb commit 448a0eb4fd896dc545aa0b4bb2ba4628491578be) 2007-08-23 13:38:54 +04:00			`while (rmdata->count > 0) {`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_loop_once(ctdb->ev);`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`}`

			`status = rmdata->status;`
			`talloc_free(mem_ctx);`
			`return status;`
			`}`
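
The fan-out and gather pattern used by `verify_recmaster()` and its callback can be sketched in isolation. The following is a minimal sketch under stated assumptions: every `sketch_*` name is hypothetical and is not part of ctdb, and the array of pre-built replies stands in for the controls that the real code sends with `ctdb_ctrl_getrecmaster_send()` and collects through `tevent_loop_once()`. It only illustrates how the pending counter and the shared status interact.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not the ctdb types. */
enum sketch_result { SKETCH_OK, SKETCH_FAILED, SKETCH_ELECTION_NEEDED };

struct sketch_gather {
	uint32_t pending;          /* replies still outstanding */
	uint32_t expected_master;  /* the node we believe is recmaster */
	enum sketch_result status; /* worst outcome seen so far */
};

struct sketch_reply {
	int completed;             /* 0 models a timed-out or failed control */
	uint32_t reported_master;  /* only meaningful when completed != 0 */
};

/* Per-reply callback, shaped like verify_recmaster_callback(). */
static void sketch_on_reply(struct sketch_gather *g,
			    const struct sketch_reply *r)
{
	assert(g->pending > 0);
	g->pending--;

	if (!r->completed) {
		/* A failure only downgrades OK; it never hides a disagreement. */
		if (g->status == SKETCH_OK) {
			g->status = SKETCH_FAILED;
		}
		return;
	}

	if (r->reported_master != g->expected_master) {
		g->status = SKETCH_ELECTION_NEEDED;
	}
}

/* Count one outstanding request per reply, then drain until none remain. */
static enum sketch_result sketch_verify(uint32_t expected_master,
					const struct sketch_reply *replies,
					size_t num_replies)
{
	struct sketch_gather g = { 0, expected_master, SKETCH_OK };
	size_t next = 0;

	g.pending = (uint32_t)num_replies;

	/* Stand-in for the event loop: deliver exactly one reply per pass. */
	while (g.pending > 0) {
		sketch_on_reply(&g, &replies[next++]);
	}

	return g.status;
}
```

Because the callback decrements the counter on every path, including the failure path, the drain loop terminates once each outstanding request has either completed or timed out.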

recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of the all the interface data is done. So, if any IPs move then the reference counts for the the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`static bool interfaces_have_changed(struct ctdb_context *ctdb,`
			`struct ctdb_recoverd *rec)`
			`{`
ctdb-daemon: Rename struct ctdb_control_get_ifaces to ctdb_iface_list_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 11:43:48 +03:00			`struct ctdb_iface_list_old *ifaces = NULL;`
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of the all the interface data is done. So, if any IPs move then the reference counts for the the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`TALLOC_CTX *mem_ctx;`
			`bool ret = false;`

			`mem_ctx = talloc_new(NULL);`

			`/* Read the interfaces from the local node */`
			`if (ctdb_ctrl_get_ifaces(ctdb, CONTROL_TIMEOUT(),`
			`CTDB_CURRENT_NODE, mem_ctx, &ifaces) != 0) {`
			`DEBUG(DEBUG_ERR, ("Unable to get interfaces from local node %u\n", ctdb->pnn));`
			`/* We could return an error. However, this will be`
			`* rare so we'll decide that the interfaces have`
			`* actually changed, just in case.`
			`*/`
			`talloc_free(mem_ctx);`
			`return true;`
			`}`

			`if (!rec->ifaces) {`
			`/* We haven't been here before so things have changed */`
recoverd: Log more information when interfaces change Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3ef93a1a3e60cdf5d8954e7a16a988ea6126916b) 2013-08-15 11:04:01 +04:00			`DEBUG(DEBUG_NOTICE, ("Initial interface fetched\n"));`
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of the all the interface data is done. So, if any IPs move then the reference counts for the the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`ret = true;`
			`} else if (rec->ifaces->num != ifaces->num) {`
			`/* Number of interfaces has changed */`
recoverd: Log more information when interfaces change Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3ef93a1a3e60cdf5d8954e7a16a988ea6126916b) 2013-08-15 11:04:01 +04:00			`DEBUG(DEBUG_NOTICE, ("Interface count changed from %d to %d\n",`
			`rec->ifaces->num, ifaces->num));`
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of the all the interface data is done. So, if any IPs move then the reference counts for the the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`ret = true;`
			`} else {`
			`/* See if interface names or link states have changed */`
			`int i;`
			`for (i = 0; i < rec->ifaces->num; i++) {`
ctdb-daemon: Rename struct ctdb_control_iface_info to ctdb_iface Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-28 11:37:17 +03:00			`struct ctdb_iface * iface = &rec->ifaces->ifaces[i];`
recoverd: Log more information when interfaces change Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3ef93a1a3e60cdf5d8954e7a16a988ea6126916b) 2013-08-15 11:04:01 +04:00			`if (strcmp(iface->name, ifaces->ifaces[i].name) != 0) {`
			`DEBUG(DEBUG_NOTICE,`
			`("Interface in slot %d changed: %s => %s\n",`
			`i, iface->name, ifaces->ifaces[i].name));`
			`ret = true;`
			`break;`
			`}`
			`if (iface->link_state != ifaces->ifaces[i].link_state) {`
			`DEBUG(DEBUG_NOTICE,`
			`("Interface %s changed state: %d => %d\n",`
			`iface->name, iface->link_state,`
			`ifaces->ifaces[i].link_state));`
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of the all the interface data is done. So, if any IPs move then the reference counts for the the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`ret = true;`
			`break;`
			`}`
			`}`
			`}`

			`talloc_free(rec->ifaces);`
			`rec->ifaces = talloc_steal(rec, ifaces);`

			`talloc_free(mem_ctx);`
			`return ret;`
			`}`
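
The comparison in `interfaces_have_changed()` looks only at the interface count, names and link states, so changes in reference counts do not trigger a takeover run. A minimal sketch of that comparison, assuming a simplified hypothetical `struct sketch_iface` (not the real `struct ctdb_iface`) and plain arrays in place of talloc-managed lists:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified interface record for illustration only. */
struct sketch_iface {
	char name[16];
	int link_state;
	unsigned int references; /* deliberately ignored by the comparison */
};

/* Return true if the current list differs from the previous one in
 * count, names or link states. A NULL previous list means "first run". */
static bool sketch_ifaces_changed(const struct sketch_iface *prev,
				  size_t prev_num,
				  const struct sketch_iface *cur,
				  size_t cur_num)
{
	size_t i;

	assert(cur != NULL);

	if (prev == NULL) {
		return true; /* nothing cached yet, treat as changed */
	}
	if (prev_num != cur_num) {
		return true; /* number of interfaces has changed */
	}
	for (i = 0; i < prev_num; i++) {
		if (strcmp(prev[i].name, cur[i].name) != 0) {
			return true; /* a slot now holds a different interface */
		}
		if (prev[i].link_state != cur[i].link_state) {
			return true; /* link went up or down */
		}
	}
	return false;
}
```

Two lists that differ only in `references` compare as unchanged here, which matches the intent described for the real function.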
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
ctdb-recoverd: Fold IP allocation house-keeping into IP verification Now all the IP takeover code for non-master node is in this function. The function can always be renamed to something more suitable. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri May 6 15:10:59 CEST 2016 on sn-devel-144 2016-05-03 09:36:37 +03:00			`/* Check that the local allocation of public IP addresses is correct`
			`* and do some house-keeping */`
			`static int verify_local_ip_allocation(struct ctdb_context *ctdb,`
			`struct ctdb_recoverd *rec,`
			`uint32_t pnn,`
			`struct ctdb_node_map_old *nodemap)`
track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00			`{`
			`TALLOC_CTX *mem_ctx = talloc_new(NULL);`
			`int ret, j;`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`bool need_takeover_run = false;`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`struct ctdb_public_ip_list_old *ips = NULL;`

ctdb-recoverd: Fold IP allocation house-keeping into IP verification Now all the IP takeover code for non-master node is in this function. The function can always be renamed to something more suitable. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri May 6 15:10:59 CEST 2016 on sn-devel-144 2016-05-03 09:36:37 +03:00			`/* If we are not the recmaster then do some housekeeping */`
			`if (rec->recmaster != pnn) {`
			`/* Ignore any IP reallocate requests - only recmaster`
			`* processes them`
			`*/`
			`TALLOC_FREE(rec->reallocate_requests);`
			`/* Clear any nodes that should be force rebalanced in`
			`* the next takeover run. If the recovery master role`
			`* has moved then we don't want to process these some`
			`* time in the future.`
			`*/`
			`TALLOC_FREE(rec->force_rebalance_nodes);`
			`}`

ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`/* Return early if disabled... */`
			`if (ctdb->tunable.disable_ip_failover != 0 ||`
			`ctdb_op_is_disabled(rec->takeover_run)) {`
			`talloc_free(mem_ctx);`
			`return 0;`
			`}`
track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00
recoverd: Interface reference count changes should not cause takeover runs At the moment a naive compare of the all the interface data is done. So, if any IPs move then the reference counts for the the relevant interfaces change, interfaces appear to have changed and another takeover run is initiated by each node that took/released IPs. This change stops the spurious takeover runs by changing the interface comparison to ignore the reference counts. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c) 2013-02-21 03:43:35 +04:00			`if (interfaces_have_changed(ctdb, rec)) {`
server: monitor interfaces in verify_ip_allocation() metze (This used to be ctdb commit 965a65520693e3731b5b0250127b04c777087808) 2009-12-22 17:21:08 +03:00			`need_takeover_run = true;`
			`}`

ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`/* If there are unhosted IPs but this node can host them then`
			`* trigger an IP reallocation */`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`/* Read available IPs from local node */`
			`ret = ctdb_ctrl_get_public_ips_flags(`
			`ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, mem_ctx,`
			`CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE, &ips);`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`if (ret != 0) {`
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_ERR, ("Unable to retrieve available public IPs\n"));`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`talloc_free(mem_ctx);`
			`return -1;`
			`}`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`for (j=0; j<ips->num; j++) {`
			`if (ips->ips[j].pnn == -1 &&`
			`nodemap->nodes[pnn].flags == 0) {`
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_WARNING,`
			`("Unassigned IP %s can be served by this node\n",`
			`ctdb_addr_to_str(&ips->ips[j].addr)));`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`need_takeover_run = true;`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00			`}`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`}`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`talloc_free(ips);`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00
ctdb-recoverd: Skip known IP address checking when it is disabled When public IP checking is disabled, verify_local_ip_allocation() still retrieves known IP addresses and runs through a loop that does nothing. Instead, completely skip the retrieval and checking loop. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:44:15 +03:00			`if (!ctdb->do_checkpublicip) {`
			`goto done;`
			`}`

ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`/* Validate the IP addresses that this node has on network`
			`* interfaces. If there is an inconsistency between reality`
			`* and the state expected by CTDB then try to fix it by`
			`* triggering an IP reallocation or releasing extraneous IP`
			`* addresses. */`

			`/* Read known IPs from local node */`
			`ret = ctdb_ctrl_get_public_ips_flags(`
			`ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, mem_ctx, 0, &ips);`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`if (ret != 0) {`
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_ERR, ("Unable to retrieve known public IPs\n"));`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`talloc_free(mem_ctx);`
			`return -1;`
			`}`
recoverd: Verifying local IPs should only check for unhosted available IPs Currently it checks for unhosted IPs among the known IPs rather than available IPs. This means that a takeover run can be flagged even when that takeover run will be unable to assign a known, unhosted IP. Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3cc878bc97fdac764a60ed805f64d649eaab06e8) 2012-10-11 08:17:54 +04:00
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`for (j=0; j<ips->num; j++) {`
			`if (ips->ips[j].pnn == pnn) {`
ctdb-recoverd: Skip known IP address checking when it is disabled When public IP checking is disabled, verify_local_ip_allocation() still retrieves known IP addresses and runs through a loop that does nothing. Instead, completely skip the retrieval and checking loop. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:44:15 +03:00			`if (!ctdb_sys_have_ip(&ips->ips[j].addr)) {`
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_ERR,`
			`("Assigned IP %s not on an interface\n",`
			`ctdb_addr_to_str(&ips->ips[j].addr)));`
ctdb-recoverd: Check that IP failover is active in IP verification This makes verify_local_ip_allocation() self-contained and simplifies main_loop(). Due to indentation changes, this commit is most easily read when ignoring whitespace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:41:45 +03:00			`need_takeover_run = true;`
			`}`
			`} else {`
ctdb-recoverd: Skip known IP address checking when it is disabled When public IP checking is disabled, verify_local_ip_allocation() still retrieves known IP addresses and runs through a loop that does nothing. Instead, completely skip the retrieval and checking loop. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:44:15 +03:00			`if (ctdb_sys_have_ip(&ips->ips[j].addr)) {`
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_ERR,`
ctdb-recoverd: Don't directly release rogue IP addresses This is inconsistent with the rest of the local IP verification. It should notice problems but not try to fix them directly. Like other cases, it should use an IP takeover run to try to fix the problem. In this case the address might have just been added and an out-of-band RELEASE_IP might cause conflicts (i.e. "another change is in flight") with a scheduled IP takeover run. This effectively reverts commit 694c1b269edc95df446b2e171919be0c266383c4. Not sure why this was needed after c7e648c2d11f9785f2493a3dd67567a635633489. More recently commit 6471541d6d2bc9f2af0ff92b280abbd1d933cf88 moves responsibility for determining interface/netmask to 10.interface so this should continue to work just fine. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-08-02 05:18:15 +03:00			`("IP %s incorrectly on an interface\n",`
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`ctdb_addr_to_str(&ips->ips[j].addr)));`
ctdb-recoverd: Don't directly release rogue IP addresses This is inconsistent with the rest of the local IP verification. It should notice problems but not try to fix them directly. Like other cases, it should use an IP takeover run to try to fix the problem. In this case the address might have just been added and an out-of-band RELEASE_IP might cause conflicts (i.e. "another change is in flight") with a scheduled IP takeover run. This effectively reverts commit 694c1b269edc95df446b2e171919be0c266383c4. Not sure why this was needed after c7e648c2d11f9785f2493a3dd67567a635633489. More recently commit 6471541d6d2bc9f2af0ff92b280abbd1d933cf88 moves responsibility for determining interface/netmask to 10.interface so this should continue to work just fine. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-08-02 05:18:15 +03:00			`need_takeover_run = true;`
track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00			`}`
			`}`
			`}`

ctdb-recoverd: Skip known IP address checking when it is disabled When public IP checking is disabled, verify_local_ip_allocation() still retrieves known IP addresses and runs through a loop that does nothing. Instead, completely skip the retrieval and checking loop. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 07:44:15 +03:00			`done:`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`if (need_takeover_run) {`
ctdb-daemon: Rename struct srvid_request to ctdb_srvid_message Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 06:32:49 +03:00			`struct ctdb_srvid_message rd;`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`TDB_DATA data;`

ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_NOTICE,("Trigger takeoverrun\n"));`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00
ctdb-recoverd: Fix some uninitialised memory issues The first element of these structures is a 32-bit PNN. On 64-bit systems this field can be followed by 32-bits of padding. When the structures are copied this can cause uninitialised memory to be copied. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org> 2016-01-11 09:23:12 +03:00			`ZERO_STRUCT(rd);`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`rd.pnn = ctdb->pnn;`
			`rd.srvid = 0;`
			`data.dptr = (uint8_t *)&rd;`
			`data.dsize = sizeof(rd);`

rename ctdb_send_message to ctdb_client_send_message to resolve colission with the function of the same name in libctdb (This used to be ctdb commit ac3292c12832484a22715f1d46aa23f3b7c8a6f6) 2010-06-02 03:45:21 +04:00			`ret = ctdb_client_send_message(ctdb, rec->recmaster, CTDB_SRVID_TAKEOVER_RUN, data);`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`if (ret != 0) {`
ctdb-recoverd: Clean up local IP verification Update log levels and messages, comments and wrapping of long lines. No functional changes. Note that interfaces_have_changed() already does adequate logging. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-09 08:12:31 +03:00			`DEBUG(DEBUG_ERR,`
			`("Failed to send takeover run request\n"));`
server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311) 2009-12-22 17:21:08 +03:00			`}`
			`}`
track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00			`talloc_free(mem_ctx);`
			`return 0;`
			`}`
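
The `ZERO_STRUCT(rd)` call above is explained by its commit note: the structure starts with a 32-bit PNN, which on 64-bit systems can be followed by 32 bits of padding, so copying the structure can copy uninitialised memory. A minimal sketch of that idea follows. The names `demo_srvid_message`, `demo_init_message` and `demo_messages_equal_bytes` are hypothetical, the field layout is an assumption standing in for `struct ctdb_srvid_message`, and plain `memset` stands in for `ZERO_STRUCT`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct ctdb_srvid_message: a 32-bit PNN
 * followed by a 64-bit srvid. On common 64-bit ABIs the compiler
 * inserts 4 bytes of padding after pnn. */
struct demo_srvid_message {
	uint32_t pnn;
	uint64_t srvid;
};

/* Initialise a message so that every byte, padding included, has a
 * defined value before the object is shipped as a raw buffer. */
static void demo_init_message(struct demo_srvid_message *m,
			      uint32_t pnn, uint64_t srvid)
{
	assert(m != NULL);
	memset(m, 0, sizeof(*m));	/* assumed equivalent of ZERO_STRUCT(*m) */
	m->pnn = pnn;
	m->srvid = srvid;
}

/* Compare two messages byte for byte, which is how the receiver of a
 * (dptr, dsize) buffer effectively sees them. */
static int demo_messages_equal_bytes(const struct demo_srvid_message *a,
				     const struct demo_srvid_message *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

Without the `memset`, two messages carrying the same `pnn` and `srvid` could still differ in their padding bytes. Strictly speaking the C standard leaves padding unspecified after a member store, so the sketch relies on typical compiler behaviour rather than a language guarantee.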

redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00
			`static void async_getnodemap_callback(struct ctdb_context *ctdb, uint32_t node_pnn, int32_t res, TDB_DATA outdata, void *callback_data)`
			`{`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old **remote_nodemaps = callback_data;`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00
			`if (node_pnn >= ctdb->num_nodes) {`
			`DEBUG(DEBUG_ERR,(__location__ " pnn from invalid node\n"));`
			`return;`
			`}`

ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`remote_nodemaps[node_pnn] = (struct ctdb_node_map_old *)talloc_steal(remote_nodemaps, outdata.dptr);`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00
			`}`

			`static int get_remote_nodemaps(struct ctdb_context *ctdb, TALLOC_CTX *mem_ctx,`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap,`
			`struct ctdb_node_map_old **remote_nodemaps)`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`{`
			`uint32_t *nodes;`

			`nodes = list_of_active_nodes(ctdb, nodemap, mem_ctx, true);`
			`if (ctdb_client_async_control(ctdb, CTDB_CONTROL_GET_NODEMAP,`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`nodes, 0,`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`CONTROL_TIMEOUT(), false, tdb_null,`
			`async_getnodemap_callback,`
			`NULL,`
update to the flags handling make sure to abort the monitoring and restart if we failed to get the nodemap from a remote node (This used to be ctdb commit 4eac0214e732e6c2f867d66ec71d4406680dbb94) 2008-12-09 02:45:14 +03:00			`remote_nodemaps) != 0) {`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to pull all remote nodemaps\n"));`

			`return -1;`
			`}`

			`return 0;`
			`}`
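
`async_getnodemap_callback()` stores each reply in a slot indexed by the replying node's PNN and rejects any PNN outside the known node count before it touches the array. The pattern can be sketched in isolation as below. All `demo_*` names are hypothetical, and the sketch leaves out the memory-ownership transfer that the listing performs with `talloc_steal()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reply payload; the listing stores a nodemap here. */
struct demo_reply {
	uint32_t num;
};

/* Hypothetical collector: one slot per node, indexed by PNN. */
struct demo_collector {
	uint32_t num_nodes;
	struct demo_reply **slots;
	uint32_t rejected;	/* replies dropped because the PNN was out of range */
};

/* Per-reply callback: validate the index first, then record the reply. */
static void demo_on_reply(struct demo_collector *c, uint32_t node_pnn,
			  struct demo_reply *reply)
{
	assert(c != NULL);
	if (node_pnn >= c->num_nodes) {
		/* Same guard as "pnn from invalid node" in the listing. */
		c->rejected++;
		return;
	}
	c->slots[node_pnn] = reply;
}
```

Checking the index inside the callback keeps the caller simple: it only has to inspect the slots after all replies are in.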

ctdb-recoverd: Call election when necessary in recovery master validation There is no need to return one of several states and then trigger an election for one of those return states. Have the recovery master validation trigger the election directly and just return whether monitoring should continue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-28 09:58:35 +03:00			`static bool validate_recovery_master(struct ctdb_recoverd *rec,`
			`TALLOC_CTX *mem_ctx)`
ctdb-recoverd: Factor out recovery master validation Starting to untangle cluster management, database recovery and public IP allocation. This is a non-trivial subset of the cluster management code that runs in the recovery daemon on all nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Nov 16 11:47:45 CET 2015 on sn-devel-104 2015-10-27 08:43:07 +03:00			`{`
			`struct ctdb_context *ctdb = rec->ctdb;`
			`uint32_t pnn = ctdb_get_pnn(ctdb);`
			`struct ctdb_node_map_old *nodemap = rec->nodemap;`
			`struct ctdb_node_map_old *recmaster_nodemap = NULL;`
			`int ret;`

			`/* When recovery daemon is started, recmaster is set to`
			`* "unknown" so it knows to start an election.`
			`*/`
			`if (rec->recmaster == CTDB_UNKNOWN_PNN) {`
			`DEBUG(DEBUG_NOTICE,`
			`("Initial recovery master set - forcing election\n"));`
ctdb-recoverd: Call election when necessary in recovery master validation There is no need to return one of several states and then trigger an election for one of those return states. Have the recovery master validation trigger the election directly and just return whether monitoring should continue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-28 09:58:35 +03:00			`force_election(rec, pnn, nodemap);`
			`return false;`
ctdb-recoverd: Factor out recovery master validation Starting to untangle cluster management, database recovery and public IP allocation. This is a non-trivial subset of the cluster management code that runs in the recovery daemon on all nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Nov 16 11:47:45 CET 2015 on sn-devel-104 2015-10-27 08:43:07 +03:00			`}`

			`/*`
			`* If the current recmaster does not have CTDB_CAP_RECMASTER,`
			`* but we have, then force an election and try to become the new`
			`* recmaster.`
			`*/`
			`if (!ctdb_node_has_capabilities(rec->caps,`
			`rec->recmaster,`
			`CTDB_CAP_RECMASTER) &&`
			`(rec->ctdb->capabilities & CTDB_CAP_RECMASTER) &&`
			`!(nodemap->nodes[pnn].flags & NODE_FLAGS_INACTIVE)) {`
			`DEBUG(DEBUG_ERR,`
			`(" Current recmaster node %u does not have CAP_RECMASTER,"`
			`" but we (node %u) have - force an election\n",`
			`rec->recmaster, pnn));`
ctdb-recoverd: Call election when necessary in recovery master validation There is no need to return one of several states and then trigger an election for one of those return states. Have the recovery master validation trigger the election directly and just return whether monitoring should continue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-28 09:58:35 +03:00			`force_election(rec, pnn, nodemap);`
			`return false;`
ctdb-recoverd: Factor out recovery master validation Starting to untangle cluster management, database recovery and public IP allocation. This is a non-trivial subset of the cluster management code that runs in the recovery daemon on all nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Nov 16 11:47:45 CET 2015 on sn-devel-104 2015-10-27 08:43:07 +03:00			`}`

			`/* Verify that the master node has not been deleted. This`
			`* should not happen because a node should always be shutdown`
			`* before being deleted, causing a new master to be elected`
			`* before now. However, if something strange has happened`
			`* then checking here will ensure we don't index beyond the`
			`* end of the nodemap array. */`
			`if (rec->recmaster >= nodemap->num) {`
			`DEBUG(DEBUG_ERR,`
			`("Recmaster node %u has been deleted. Force election\n",`
			`rec->recmaster));`
ctdb-recoverd: Call election when necessary in recovery master validation There is no need to return one of several states and then trigger an election for one of those return states. Have the recovery master validation trigger the election directly and just return whether monitoring should continue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-28 09:58:35 +03:00			`force_election(rec, pnn, nodemap);`
			`return false;`
ctdb-recoverd: Factor out recovery master validation Starting to untangle cluster management, database recovery and public IP allocation. This is a non-trivial subset of the cluster management code that runs in the recovery daemon on all nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Nov 16 11:47:45 CET 2015 on sn-devel-104 2015-10-27 08:43:07 +03:00			`}`

			`/* if recovery master is disconnected/deleted we must elect a new recmaster */`
			`if (nodemap->nodes[rec->recmaster].flags &`
			`(NODE_FLAGS_DISCONNECTED|NODE_FLAGS_DELETED)) {`
			`DEBUG(DEBUG_NOTICE,`
			`("Recmaster node %u is disconnected/deleted. Force election\n",`
			`rec->recmaster));`
ctdb-recoverd: Call election when necessary in recovery master validation There is no need to return one of several states and then trigger an election for one of those return states. Have the recovery master validation trigger the election directly and just return whether monitoring should continue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-28 09:58:35 +03:00			`force_election(rec, pnn, nodemap);`
			`return false;`
ctdb-recoverd: Factor out recovery master validation Starting to untangle cluster management, database recovery and public IP allocation. This is a non-trivial subset of the cluster management code that runs in the recovery daemon on all nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Nov 16 11:47:45 CET 2015 on sn-devel-104 2015-10-27 08:43:07 +03:00			`}`

			`/* get nodemap from the recovery master to check if it is inactive */`
			`ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), rec->recmaster,`
			`mem_ctx, &recmaster_nodemap);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR,`
			`(__location__`
			`" Unable to get nodemap from recovery master %u\n",`
			`rec->recmaster));`
ctdb-recoverd: Call election when necessary in recovery master validation There is no need to return one of several states and then trigger an election for one of those return states. Have the recovery master validation trigger the election directly and just return whether monitoring should continue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-28 09:58:35 +03:00			`/* No election, just error */`
			`return false;`
ctdb-recoverd: Factor out recovery master validation Starting to untangle cluster management, database recovery and public IP allocation. This is a non-trivial subset of the cluster management code that runs in the recovery daemon on all nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Nov 16 11:47:45 CET 2015 on sn-devel-104 2015-10-27 08:43:07 +03:00			`}`


			`if ((recmaster_nodemap->nodes[rec->recmaster].flags & NODE_FLAGS_INACTIVE) &&`
			`(rec->node_flags & NODE_FLAGS_INACTIVE) == 0) {`
			`DEBUG(DEBUG_NOTICE,`
			`("Recmaster node %u is inactive. Force election\n",`
			`rec->recmaster));`
			`/*`
			`* update our nodemap to carry the recmaster's notion of`
			`* its own flags, so that we don't keep freezing the`
			`* inactive recmaster node...`
			`*/`
			`nodemap->nodes[rec->recmaster].flags =`
			`recmaster_nodemap->nodes[rec->recmaster].flags;`
ctdb-recoverd: Call election when necessary in recovery master validation There is no need to return one of several states and then trigger an election for one of those return states. Have the recovery master validation trigger the election directly and just return whether monitoring should continue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-28 09:58:35 +03:00			`force_election(rec, pnn, nodemap);`
			`return false;`
ctdb-recoverd: Factor out recovery master validation Starting to untangle cluster management, database recovery and public IP allocation. This is a non-trivial subset of the cluster management code that runs in the recovery daemon on all nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Nov 16 11:47:45 CET 2015 on sn-devel-104 2015-10-27 08:43:07 +03:00			`}`

ctdb-recoverd: Call election when necessary in recovery master validation There is no need to return one of several states and then trigger an election for one of those return states. Have the recovery master validation trigger the election directly and just return whether monitoring should continue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-28 09:58:35 +03:00			`return true;`
ctdb-recoverd: Factor out recovery master validation Starting to untangle cluster management, database recovery and public IP allocation. This is a non-trivial subset of the cluster management code that runs in the recovery daemon on all nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Nov 16 11:47:45 CET 2015 on sn-devel-104 2015-10-27 08:43:07 +03:00			`}`
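
The order of checks in `validate_recovery_master()` matters: the bounds check on `rec->recmaster` has to come before any indexing into the nodemap, and a failed nodemap fetch stops monitoring without an election. A minimal sketch of that decision order as a pure function is given below. Every `demo_*` name and every flag value is an assumption for illustration. Unlike the listing, which returns a single `bool` and reads the local flags from two places, the sketch uses one `self_flags` field and a three-way verdict so the "election" and "error" outcomes can be told apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real constants live in the CTDB headers. */
#define DEMO_UNKNOWN_PNN       0xFFFFFFFFu
#define DEMO_FLAG_DISCONNECTED 0x01u
#define DEMO_FLAG_BANNED       0x08u
#define DEMO_FLAG_DELETED      0x10u
#define DEMO_FLAG_STOPPED      0x20u
#define DEMO_FLAG_INACTIVE     (DEMO_FLAG_DISCONNECTED | DEMO_FLAG_BANNED | \
				DEMO_FLAG_DELETED | DEMO_FLAG_STOPPED)

enum demo_verdict {
	DEMO_CONTINUE,	/* listing returns true */
	DEMO_ELECTION,	/* listing calls force_election(), returns false */
	DEMO_ERROR	/* listing returns false without an election */
};

/* Hypothetical snapshot of the inputs the listing consults. */
struct demo_state {
	uint32_t recmaster;		/* current recovery master PNN */
	uint32_t num_nodes;		/* entries in the local nodemap */
	const uint32_t *flags;		/* local view of each node's flags */
	uint32_t self_flags;		/* this node's own flags */
	bool recmaster_has_cap;		/* recmaster has the recmaster capability */
	bool self_has_cap;		/* this node has the recmaster capability */
	bool fetch_ok;			/* nodemap fetch from recmaster succeeded */
	uint32_t recmaster_own_flags;	/* recmaster's view of its own flags */
};

static enum demo_verdict demo_validate(const struct demo_state *s)
{
	assert(s != NULL);
	if (s->recmaster == DEMO_UNKNOWN_PNN) {
		return DEMO_ELECTION;	/* nothing elected yet */
	}
	if (!s->recmaster_has_cap && s->self_has_cap &&
	    !(s->self_flags & DEMO_FLAG_INACTIVE)) {
		return DEMO_ELECTION;	/* we are a better candidate */
	}
	if (s->recmaster >= s->num_nodes) {
		return DEMO_ELECTION;	/* must precede flags[recmaster] */
	}
	if (s->flags[s->recmaster] &
	    (DEMO_FLAG_DISCONNECTED | DEMO_FLAG_DELETED)) {
		return DEMO_ELECTION;
	}
	if (!s->fetch_ok) {
		return DEMO_ERROR;	/* no election, just stop this pass */
	}
	if ((s->recmaster_own_flags & DEMO_FLAG_INACTIVE) &&
	    (s->self_flags & DEMO_FLAG_INACTIVE) == 0) {
		return DEMO_ELECTION;
	}
	return DEMO_CONTINUE;
}
```

A caller would map `DEMO_CONTINUE` to the `true` return of the listing and the other two verdicts to `false`.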

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,`
			`TALLOC_CTX *mem_ctx)`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`{`
change recmaster from being a local variable in monitor_cluster() to be a member of the ctdb_recoverd structure (This used to be ctdb commit b7f955338f50c92374b4f559268fb3a1a516aefa) 2008-03-02 23:53:46 +03:00			`uint32_t pnn;`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`struct ctdb_node_map_old *nodemap=NULL;`
			`struct ctdb_node_map_old **remote_nodemaps=NULL;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`struct ctdb_vnn_map *vnnmap=NULL;`
			`struct ctdb_vnn_map *remote_vnnmap=NULL;`
ctdb-recoverd: Make num_lmasters a local variable It isn't used anywhere else and is always re-initialised to 0. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-29 09:49:02 +03:00			`uint32_t num_lmasters;`
read the current debuglevel in each loop in the recovery daemon so that we pick up when they change in the parent daemon (This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf) 2008-02-18 11:38:04 +03:00			`int32_t debug_level;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`int i, j, ret;`
recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00			`bool self_ban;`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00
make recovery daemon values tunable (This used to be ctdb commit ec29dbf2f5110428df8b97801443ba7addf61353) 2007-06-04 14:22:44 +04:00
merge from ronnie (This used to be ctdb commit 0aa6e04438aa5ec727815689baa19544df042cf7) 2008-01-07 08:17:22 +03:00			`/* verify that the main daemon is still running */`
Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78) 2012-05-03 05:42:41 +04:00			`if (ctdb_kill(ctdb, ctdb->ctdbd_pid, 0) != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_CRIT,("CTDB daemon is no longer available. Shutting down recovery daemon\n"));`
merge from ronnie (This used to be ctdb commit 0aa6e04438aa5ec727815689baa19544df042cf7) 2008-01-07 08:17:22 +03:00			`exit(-1);`
			`}`

additional monitoring between the two daemons. we currently only monitor that the dameons are running by kill(0, pid) and verifying the the domain socket between them is ok. this is not sufficient since we can have a situation where the recovery daemon is hung. this new code monitors that the recovery daemon is operating. if the recovery hangs, we log this and shut down the main daemon (This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c) 2008-09-09 07:44:46 +04:00			`/* ping the local daemon to tell it we are alive */`
			`ctdb_ctrl_recd_ping(ctdb);`

make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`if (rec->election_timeout) {`
			`/* an election is in progress */`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
make election handling much more scalable (This used to be ctdb commit 05938d462b92bd9ecb8e35f53651bded47c48675) 2007-11-13 02:27:44 +03:00			`}`

read the current debuglevel in each loop in the recovery daemon so that we pick up when they change in the parent daemon (This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf) 2008-02-18 11:38:04 +03:00			`/* read the debug level from the parent and update locally */`
			`ret = ctdb_ctrl_get_debuglevel(ctdb, CTDB_CURRENT_NODE, &debug_level);`
			`if (ret !=0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Failed to read debuglevel from parent\n"));`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
read the current debuglevel in each loop in the recovery daemon so that we pick up when they change in the parent daemon (This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf) 2008-02-18 11:38:04 +03:00			`}`
ctdb-logging: Change LogLevel to DEBUGLEVEL For compatibility with current Samba debug.[ch]. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org> 2014-09-24 11:12:56 +04:00			`DEBUGLEVEL = debug_level;`
read the current debuglevel in each loop in the recovery daemon so that we pick up when they change in the parent daemon (This used to be ctdb commit 792d5471ff0c2947b6e66183925860de27f30eaf) 2008-02-18 11:38:04 +03:00
make recovery daemon values tunable (This used to be ctdb commit ec29dbf2f5110428df8b97801443ba7addf61353) 2007-06-04 14:22:44 +04:00			`/* get relevant tunables */`
get all the tunables at once in recovery daemon (This used to be ctdb commit 8e60be6c22aab145e68b16ede5f32f4430c2af93) 2007-06-07 12:05:25 +04:00			`ret = ctdb_ctrl_get_all_tunables(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, &ctdb->tunable);`
			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Failed to get tunables - retrying\n"));`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
get all the tunables at once in recovery daemon (This used to be ctdb commit 8e60be6c22aab145e68b16ede5f32f4430c2af93) 2007-06-07 12:05:25 +04:00			`}`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within the timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
ctdb-recoverd: If obtaining recovery lock fails, try again When ctdb daemon starts up, it considers itself the recovery master and tries to do first recovery. However, it's possible that there is already a recovery master and the current node has not yet heard from it. So do not ban ourselves immediately if ctdb_recovery_lock() fails when doing first recovery. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2014-09-25 11:17:04 +04:00			`/* get runstate */`
			`ret = ctdb_ctrl_get_runstate(ctdb, CONTROL_TIMEOUT(),`
			`CTDB_CURRENT_NODE, &ctdb->runstate);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, ("Failed to get runstate - retrying\n"));`
			`return;`
			`}`

recoverd: Recovery daemon should use ctdb_get_pnn, which can't fail Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c6fded59fa4da67f738a90fdacb51900e41801f9) 2013-07-08 06:45:31 +04:00			`pnn = ctdb_get_pnn(ctdb);`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within the timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
ctdb-recoverd: Simplify using TALLOC_FREE() The only non-obvious part here is dropping the setting of the nodemap local variable to NULL. If the following control succeeds then it is set, otherwise return and it doesn't matter. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-23 08:00:55 +03:00			`/* get nodemap */`
			`TALLOC_FREE(rec->nodemap);`
add a new tunable : reclockpingperiod once every such interval : * the recovery master on each node will update the "connected" count in the reclock count file (ctdb getreclock) * if the node thinks it is a recovery master but it detects another node that is DISCONNECTED but which still holds a lock to the reclock count file this may mean that we have a split cluster. if that other node that is DISCONNECTED but still holds the lock on the reclock pnn count file, is MORE connected than the local node, yield the recmaster role and let the other half of the cluster take over this adds a second, last chance mechanism to detect split clusters. IF the cluster is split but GPFS is not yet split, this mechanism makes the largest half of the cluster become the active half. (This used to be ctdb commit 07af425f444531942cce8abff112c1524228d287) 2008-03-03 01:19:30 +03:00			`ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), pnn, rec, &rec->nodemap);`
change the signature for ctdb_ctrl_getnodemap() so that a timeout parameter is added. change ctdb_get_connected_nodes in the same way (This used to be ctdb commit d85f23bcf4c1230225abb2f4a053c70b68d469aa) 2007-05-04 03:01:01 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to get nodemap from node %u\n", pnn));`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
change the signature for ctdb_ctrl_getnodemap() so that a timeout parameter is added. change ctdb_get_connected_nodes in the same way (This used to be ctdb commit d85f23bcf4c1230225abb2f4a053c70b68d469aa) 2007-05-04 03:01:01 +04:00			`}`
add a new tunable : reclockpingperiod once every such interval : * the recovery master on each node will update the "connected" count in the reclock count file (ctdb getreclock) * if the node thinks it is a recovery master but it detects another node that is DISCONNECTED but which still holds a lock to the reclock count file this may mean that we have a split cluster. if that other node that is DISCONNECTED but still holds the lock on the reclock pnn count file, is MORE connected than the local node, yield the recmaster role and let the other half of the cluster take over this adds a second, last chance mechanism to detect split clusters. IF the cluster is split but GPFS is not yet split, this mechanism makes the largest half of the cluster become the active half. (This used to be ctdb commit 07af425f444531942cce8abff112c1524228d287) 2008-03-03 01:19:30 +03:00			`nodemap = rec->nodemap;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within the timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
recoverd: Set node_flags information as soon as we get nodemap Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8d622660a14c929e365d306147b378ea6ab92175) 2013-06-28 08:09:35 +04:00			`/* remember our own node flags */`
			`rec->node_flags = nodemap->nodes[pnn].flags;`

recoverd: Don't continue if the current node gets banned Can not continue with recovery or monitoring cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a) 2013-06-28 10:31:07 +04:00			`ban_misbehaving_nodes(rec, &self_ban);`
			`if (self_ban) {`
			`DEBUG(DEBUG_NOTICE, ("This node was banned, restart main_loop\n"));`
			`return;`
			`}`
recoverd: Move code to ban other nodes after we get local node flags If a node gets banned first, then it should not ban other nodes. This code was moved up in main_loop to avoid waiting for nodemap from other nodes (commit 83b0261f2cb453195b86f547d360400103a8b795). To prevent a banned node from banning other nodes, we need to first get nodemap information from local node, so trying to ban other nodes can fail if we are already banned. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ae1693905036ecdbc4594fde1f12500faae4a554) 2013-06-27 10:01:16 +04:00
ctdb-recovery: Get recmode unconditionally in the main_loop BUG: https://bugzilla.samba.org/show_bug.cgi?id=12857 This can be used later in the main_loop to avoid the local ip check. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-06-22 10:45:20 +03:00			`ret = ctdb_ctrl_getrecmode(ctdb, mem_ctx, CONTROL_TIMEOUT(),`
			`CTDB_CURRENT_NODE, &ctdb->recovery_mode);`
			`if (ret != 0) {`
			`D_ERR("Failed to read recmode from local node\n");`
			`return;`
			`}`

recoverd: Also check if current node is in recovery when it is banned Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8) 2013-06-28 08:02:44 +04:00			`/* if the local daemon is STOPPED or BANNED, we verify that the databases are`
recoverd: fix a comment typo Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 741944f118e98f178b860194eecb215180949d18) 2013-06-26 09:11:51 +04:00			`also frozen and that the recmode is set to active.`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00			`*/`
recoverd: Always do an early exit from main_loop if node is stopped or banned A stopped or banned node cannot do anything useful. So do not participate in any cluster activity and do not cause any unnecessary network traffic. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2396981c4bcf30530aeb7f4395093cc202105b50) 2013-06-27 09:39:15 +04:00			`if (rec->node_flags & (NODE_FLAGS_STOPPED | NODE_FLAGS_BANNED)) {`
recoverd: Stabilise the recovery master role On rare occasions when a node that has been inactive it will trigger an election when it becomes active again. If that node has been up for the longest then it will win the election and the recovery master role will spuriously move. While a node remains inactive we reset the priority time to discourage it from winning elections. The priority time will now reflect roughly how long the node has been active rather than how long it has been up. That means the most stable node is more likely to win elections. Having a stable recovery master means that disabling takeover runs while reloading IPs is more likely to succeed. It also improves the chances of being able to cache information in the recovery master - for example, between takeover runs. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f0f48f22f45e4c82eba2582efae307e25385de81) 2013-09-17 06:00:26 +04:00			`/* If this node has become inactive then we want to`
			`* reduce the chances of it taking over the recovery`
			`* master role when it becomes active again. This`
			`* helps to stabilise the recovery master role so that`
			`* it stays on the most stable node.`
			`*/`
			`rec->priority_time = timeval_current();`

recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00			`if (ctdb->recovery_mode == CTDB_RECOVERY_NORMAL) {`
recoverd: Also check if current node is in recovery when it is banned Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8) 2013-06-28 08:02:44 +04:00			`DEBUG(DEBUG_ERR,("Node is stopped or banned but recovery mode is not active. Activate recovery mode and lock databases\n"));`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00
			`ret = ctdb_ctrl_setrecmode(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, CTDB_RECOVERY_ACTIVE);`
			`if (ret != 0) {`
recoverd: Also check if current node is in recovery when it is banned Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8) 2013-06-28 08:02:44 +04:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to activate recovery mode in STOPPED or BANNED state\n"));`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00			`}`
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00			`}`
			`if (! rec->frozen_on_inactive) {`
			`ret = ctdb_ctrl_freeze(ctdb, CONTROL_TIMEOUT(),`
			`CTDB_CURRENT_NODE);`
ctdb-recoverd: Set recovery mode before freezing databases Setting recovery mode to active is the only correct way to inform recovery daemon to run database recovery. Only freezing databases without setting recovery mode should not trigger database recovery, as this mechanism is used in tool to implement wipedb/restoredb commands. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2014-05-06 08:24:52 +04:00			`if (ret != 0) {`
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00			`DEBUG(DEBUG_ERR,`
			`(__location__ " Failed to freeze node "`
			`"in STOPPED or BANNED state\n"));`
ctdb-recoverd: Set recovery mode before freezing databases Setting recovery mode to active is the only correct way to inform recovery daemon to run database recovery. Only freezing databases without setting recovery mode should not trigger database recovery, as this mechanism is used in tool to implement wipedb/restoredb commands. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2014-05-06 08:24:52 +04:00			`return;`
			`}`
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00
			`rec->frozen_on_inactive = true;`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00			`}`
recoverd: Always do an early exit from main_loop if node is stopped or banned A stopped or banned node cannot do anything useful. So do not participate in any cluster activity and do not cause any unnecessary network traffic. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2396981c4bcf30530aeb7f4395093cc202105b50) 2013-06-27 09:39:15 +04:00
			`/* If this node is stopped or banned then it is not the recovery`
			`* master, so don't do anything. This prevents stopped or banned`
			`* node from starting election and sending unnecessary controls.`
			`*/`
			`return;`
recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a) 2009-07-09 08:19:32 +04:00			`}`
recoverd: Always do an early exit from main_loop if node is stopped or banned A stopped or banned node cannot do anything useful. So do not participate in any cluster activity and do not cause any unnecessary network traffic. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2396981c4bcf30530aeb7f4395093cc202105b50) 2013-06-27 09:39:15 +04:00
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00			`rec->frozen_on_inactive = false;`

ctdb-recmaster: Update capabilities before calling first election Capabilities are used when computing an election result so having them up-to-date seems like a good idea. Also update several instances of an ambiguous comment. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-27 07:09:33 +03:00			`/* Retrieve capabilities from all connected nodes */`
			`ret = update_capabilities(rec, nodemap);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to update node capabilities.\n"));`
			`return;`
			`}`

ctdb-recoverd: Call election when necessary in recovery master validation There is no need to return one of several states and then trigger an election for one of those return states. Have the recovery master validation trigger the election directly and just return whether monitoring should continue. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-28 09:58:35 +03:00			`if (! validate_recovery_master(rec, mem_ctx)) {`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`}`
let each node verify that they have a correct assignment of public ip addresses (i.e. they hold those they should hold and they don't hold any of those they shouldn't hold) if an inconsistency is found, mark the local node as recovery mode active and wait for the recovery master to trigger a full blown recovery (This used to be ctdb commit 55a5bfc8244c5b9cdda3f11992f384f00566b5dc) 2007-09-14 04:16:36 +04:00
ctdb-recovery: Do not run local ip verification when in recovery BUG: https://bugzilla.samba.org/show_bug.cgi?id=12857 If we drop public IPs because CTDB is in recovery for too long, then avoid spamming logs "Trigger takeoverrun" every second. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2017-06-22 09:15:47 +03:00			`if (ctdb->recovery_mode == CTDB_RECOVERY_NORMAL) {`
			`/* Check if an IP takeover run is needed and trigger one if`
			`* necessary */`
			`verify_local_ip_allocation(ctdb, rec, pnn, nodemap);`
			`}`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
			`/* if we are not the recmaster then we do not need to check`
			`if recovery is needed`
			`*/`
change recmaster from being a local variable in monitor_cluster() to be a member of the ctdb_recoverd structure (This used to be ctdb commit b7f955338f50c92374b4f559268fb3a1a516aefa) 2008-03-02 23:53:46 +03:00			`if (pnn != rec->recmaster) {`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`}`

simplify election handling make sure we read and update the flags from all remote nodes before we reach the first codepath that can call do_recovery() since during do_recovery() we need to know what the flags are. (This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f) 2007-10-11 00:16:36 +04:00
sync flags between nodes in monitor loop in recmaster (This used to be ctdb commit 6eef86e06388fc53a1212f1e2783ae174c6cd210) 2007-10-15 08:28:51 +04:00			`/* ensure our local copies of flags are right */`
dont manipulate ctdb->monitoring_mode directly from the SET_MON_MODE control, instead call ctdb_start/stop_monitoring() ctdb_stop_monitoring() dont allocate a new monitoring context, leave it NULL. Also set the monitoring_mode in this function so that ctdb_stop/start_monitoring() and ->monitoring_mode are kept in sync. Add a debug message to log that we have stopped monitoring. ctdb_start_monitoring() check whether monitoring is already active and make the function idempotent. Create the monitoring context when monitoring is started. Update ->monitoring_mode once the monitoring has been started. Add a debug message to log that we have started monitoring. When we temporarily stop monitoring while running an event script, restart monitoring after the event script wrapper returns instead of in the event script callback. Let monitoring_mode start out as DISABLED and let it be enabled once we call ctdb_start_monitoring. dont check for MONITORING_DISABLED in check_fore_dead_nodes(). If monitoring is disabled, this event handler will not be called. (This used to be ctdb commit 3a93ae8bdcffb1adbd6243844f3058fc742f76aa) 2007-11-30 00:44:34 +03:00			`ret = update_local_flags(rec, nodemap);`
ctdb-recoverd: Simplify return values when updating local flags Change this to return just 0 or -1. It isn't monitoring anything. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-04-27 14:47:08 +03:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Unable to update local flags\n"));`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
simplify election handling make sure we read and update the flags from all remote nodes before we reach the first codepath that can call do_recovery() since during do_recovery() we need to know what the flags are. (This used to be ctdb commit e85f3806483ea420559d449e0e4d81bec996740f) 2007-10-11 00:16:36 +04:00			`}`

when we reload the nodes file, we may need to reload the nodes file inside the recovery daemon as well. (This used to be ctdb commit 82fd2b6b5cd8e988c38fa6b74121a048757bdeef) 2008-10-17 14:18:06 +04:00			`if (ctdb->num_nodes != nodemap->num) {`
			`DEBUG(DEBUG_ERR, (__location__ " ctdb->num_nodes (%d) != nodemap->num (%d) reloading nodes file\n", ctdb->num_nodes, nodemap->num));`
recoverd: Remove function reload_nodes_file() It is a 1 line wrapper around ctdb_load_nodes_file(), so use that instead. We need less code... :-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 4a5d5935f4410a93a3343d85a24dbcddae2c4c20) 2013-10-14 06:54:39 +04:00			`ctdb_load_nodes_file(ctdb);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
when we reload the nodes file, we may need to reload the nodes file inside the recovery daemon as well. (This used to be ctdb commit 82fd2b6b5cd8e988c38fa6b74121a048757bdeef) 2008-10-17 14:18:06 +04:00			`}`
allow different nodes in the cluster to use different public_addresses files so that we can partition the cluster into different subsets of nodes which each serve a different subset of the public addresses (This used to be ctdb commit 889e0fe69e4c88c6166282b12843b8d9727552d6) 2007-09-04 17:15:23 +04:00
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`/* verify that all active nodes agree that we are the recmaster */`
when a node disagrees with us re who is recmaster make it mark that node as a culprit so it eventually gets banned (This used to be ctdb commit eff3f326f8ce6070c9f3c430cd14d1b71a8db220) 2008-04-21 18:56:27 +04:00			`switch (verify_recmaster(rec, nodemap, pnn)) {`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`case MONITOR_RECOVERY_NEEDED:`
			`/* can not happen */`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`case MONITOR_ELECTION_NEEDED:`
add a new tunable : reclockpingperiod once every such interval : * the recovery master on each node will update the "connected" count in the reclock count file (ctdb getreclock) * if the node thinks it is a recovery master but it detects another node that is DISCONNECTED but which still holds a lock to the reclock count file this may mean that we have a split cluster. if that other node that is DISCONNECTED but still holds the lock on the reclock pnn count file, is MORE connected than the local node, yield the recmaster role and let the other half of the cluster take over this adds a second, last chance mechanism to detect split clusters. IF the cluster is split but GPFS is not yet split, this mechanism makes the largest half of the cluster become the active half. (This used to be ctdb commit 07af425f444531942cce8abff112c1524228d287) 2008-03-03 01:19:30 +03:00			`force_election(rec, pnn, nodemap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`case MONITOR_OK:`
			`break;`
			`case MONITOR_FAILED:`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00			`}`


ctdb-recoverd: Move VNN map retrieval to where it is needed The VNN map is only needed on the recovery master, so no need for all recovery daemons to retrieve it. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-10-27 06:35:09 +03:00			`/* get the vnnmap */`
			`ret = ctdb_ctrl_getvnnmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx, &vnnmap);`
			`if (ret != 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Unable to get vnnmap from node %u\n", pnn));`
			`return;`
			`}`

- merge from ronnie - add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring (This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b) 2007-09-14 03:49:12 +04:00			`if (rec->need_recovery) {`
			`/* a previous recovery didn't finish */`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`do_recovery(rec, mem_ctx, pnn, nodemap, vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
- merge from ronnie - add a flag to check that recovery completed correctly. If not, re-trigger it in monitoring (This used to be ctdb commit d5ed941d9bab4af30d8b5f9b77bdf43d9218d69b) 2007-09-14 03:49:12 +04:00			`}`

add a test in the function that checks whether the cluster needs recovery or not that all active nodes are in normal mode. If we discover that some node is still in recoverymode it may indicate that a previous recovery ended prematurely and thus we should start a new recovery (This used to be ctdb commit c15517872e6c98c8c425a8d47d2b348ecb0620b0) 2007-05-06 22:41:12 +04:00			`/* verify that all active nodes are in normal mode`
			`and not in recovery mode`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`*/`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`switch (verify_recmode(ctdb, nodemap)) {`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`case MONITOR_RECOVERY_NEEDED:`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`do_recovery(rec, mem_ctx, pnn, nodemap, vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`case MONITOR_FAILED:`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c) 2007-08-23 13:27:09 +04:00			`case MONITOR_ELECTION_NEEDED:`
			`/* cannot happen; falls through to MONITOR_OK */`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00			`case MONITOR_OK:`
			`break;`
add a test in the function that checks whether the cluster needs recovery or not that all active nodes are in normal mode. If we discover that some node is still in recoverymode it may indicate that a previous recovery ended prematurely and thus we should start a new recovery (This used to be ctdb commit c15517872e6c98c8c425a8d47d2b348ecb0620b0) 2007-05-06 22:41:12 +04:00			`}`


ctdb-daemon: Rename recovery lock file to just recovery lock It isn't necessarily a file. Don't bother changing the control, since it doesn't pervade the code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-17 11:28:56 +03:00			`if (ctdb->recovery_lock != NULL) {`
ctdb-recoverd: Remove check_recovery_lock() This has not done anything useful since commit b9d8bb23af8abefb2d967e9b4e9d6e60c4a3b520. Instead, just check that the lock is held. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-12-09 06:45:08 +03:00			`/* We must already hold the recovery lock */`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`if (!ctdb_recovery_have_lock(rec)) {`
ctdb-recoverd: Remove check_recovery_lock() This has not done anything useful since commit b9d8bb23af8abefb2d967e9b4e9d6e60c4a3b520. Instead, just check that the lock is held. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2014-12-09 06:45:08 +03:00			`DEBUG(DEBUG_ERR,("Failed recovery lock sanity check. Force a recovery\n"));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, ctdb->pnn);`
			`do_recovery(rec, mem_ctx, pnn, nodemap, vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
Dont access the reclock file at all if VerifyRecoveryLock is zero and also make sure the reclock file is closed if the variable is cleared at runtime (This used to be ctdb commit a25f4888689a0725971606163d87c39a41669292) 2009-06-25 05:41:18 +04:00			`}`
- catch ESTALE in the recovery lock by trying a read() - priortise nodes that are unbanned and healthy in the election (This used to be ctdb commit 929feb475dfdf7283f0e99b50b179e1c91d3a39f) 2007-10-05 07:28:21 +04:00			`}`
break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939) 2007-08-23 07:48:39 +04:00
Add new control to reload the public ip address file on a node Also add a method to use the recovery master/daemon to reload the public ips on all nodes in the cluster. Reloading the public ips on all node sin the cluster is only suported if all nodes in the cluster are available and healthy. (This used to be ctdb commit 05603e914f8c12618d7e06943c0f7df207f645b0) 2012-04-30 09:50:44 +04:00
ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled The potential resulting recovery won't run anyway. Also recoveries may have been disabled by "reloadnodes" and if the nodemaps are inconsistent between nodes then avoid triggering an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 12:59:11 +03:00			`/* If recoveries are disabled then there is no use doing any`
			`* nodemap or flags checks. Recoveries might be disabled due`
			`* to "reloadnodes", so doing these checks might cause an`
			`* unnecessary recovery. */`
			`if (ctdb_op_is_disabled(rec->recovery)) {`
ctdb-recoverd: Move takeover run checks after recover checks If a recovery is going to be done then this will be followed by a takeover run anyway. So, there's no use doing the takeover run checks, potentially doing a takeover run and then doing a recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 09:00:02 +03:00			`goto takeover_run_checks;`
ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled The potential resulting recovery won't run anyway. Also recoveries may have been disabled by "reloadnodes" and if the nodemaps are inconsistent between nodes then avoid triggering an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 12:59:11 +03:00			`}`

redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`/* get the nodemap for all active remote nodes`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`*/`
ctdb-daemon: Rename struct ctdb_node_map to ctdb_node_map_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:22:48 +03:00			`remote_nodemaps = talloc_array(mem_ctx, struct ctdb_node_map_old *, nodemap->num);`
update to the flags handling make sure to abort the monitoring and restart if we failed to get the nodemap from a remote node (This used to be ctdb commit 4eac0214e732e6c2f867d66ec71d4406680dbb94) 2008-12-09 02:45:14 +03:00			`if (remote_nodemaps == NULL) {`
			`DEBUG(DEBUG_ERR, (__location__ " failed to allocate remote nodemap array\n"));`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update to the flags handling make sure to abort the monitoring and restart if we failed to get the nodemap from a remote node (This used to be ctdb commit 4eac0214e732e6c2f867d66ec71d4406680dbb94) 2008-12-09 02:45:14 +03:00			`}`
			`for(i=0; i<nodemap->num; i++) {`
			`remote_nodemaps[i] = NULL;`
			`}`
			`if (get_remote_nodemaps(ctdb, mem_ctx, nodemap, remote_nodemaps) != 0) {`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to read remote nodemaps\n"));`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`}`

			`/* verify that all other nodes have the same nodemap as we have`
			`*/`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`for (j=0; j<nodemap->num; j++) {`
We dont need to verify the nodemap on remote nodes that are banned (This used to be ctdb commit 7f8f9385deee6eff2b7303147bc6412bbdc122df) 2009-04-06 06:00:22 +04:00			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`continue;`
			`}`

update to the flags handling make sure to abort the monitoring and restart if we failed to get the nodemap from a remote node (This used to be ctdb commit 4eac0214e732e6c2f867d66ec71d4406680dbb94) 2008-12-09 02:45:14 +03:00			`if (remote_nodemaps[j] == NULL) {`
			`DEBUG(DEBUG_ERR,(__location__ " Did not get a remote nodemap for node %d, restarting monitoring\n", j));`
if we cant pull the remote nodemap off a node we should mark it as a culprit so it eventually becomes banned. (This used to be ctdb commit 0889ae3c237bdb3bd72d45f2f64f5e5d8420870c) 2009-04-02 07:50:43 +04:00			`ctdb_set_culprit(rec, j);`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update to the flags handling make sure to abort the monitoring and restart if we failed to get the nodemap from a remote node (This used to be ctdb commit 4eac0214e732e6c2f867d66ec71d4406680dbb94) 2008-12-09 02:45:14 +03:00			`}`

redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`/* if the nodes disagree on how many nodes there are`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`then this is a good reason to try recovery`
			`*/`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`if (remote_nodemaps[j]->num != nodemap->num) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Remote node:%u has different node count. %u vs %u of the local node\n",`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`nodemap->nodes[j].pnn, remote_nodemaps[j]->num, nodemap->num));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
			`do_recovery(rec, mem_ctx, pnn, nodemap, vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`

			`/* if the nodes disagree on which nodes exist and are`
			`active, then that is also a good reason to do recovery`
			`*/`
			`for (i=0;i<nodemap->num;i++) {`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`if (remote_nodemaps[j]->nodes[i].pnn != nodemap->nodes[i].pnn) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Remote node:%u has different nodemap pnn for %d (%u vs %u).\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn, i,`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`remote_nodemaps[j]->nodes[i].pnn, nodemap->nodes[i].pnn));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
store the num_active variable (number of connected/active nodes) inside the rec structure and avoid passing this as an extra parameter to do_recovery() (This used to be ctdb commit 8bb229aa3b4bd41e48d4e4e2e148d8680c8ba436) 2008-02-29 04:55:20 +03:00			`do_recovery(rec, mem_ctx, pnn, nodemap,`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`
			`}`
recoverd: Assemble up-to-date node flags information from remote nodes Currently nodemap used by recovery master is the one obtained from the local node. This information may have been updated while processing main loop. Before comparing node flags on all the nodes, create up-to-date node flags information based on the information received from all the nodes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit fcf77dec5af973a0e32f3999bc012053a6f47a96) 2013-07-22 11:26:28 +04:00			`}`

			`/*`
			`* Update node flags obtained from each active node. This ensures we have`
			`* up-to-date information for all the nodes.`
			`*/`
			`for (j=0; j<nodemap->num; j++) {`
			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
			`continue;`
			`}`
			`nodemap->nodes[j].flags = remote_nodemaps[j]->nodes[j].flags;`
			`}`

			`for (j=0; j<nodemap->num; j++) {`
			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
			`continue;`
			`}`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`/* verify the flags are consistent`
			`*/`
			`for (i=0; i<nodemap->num; i++) {`
			`if (nodemap->nodes[i].flags & NODE_FLAGS_DISCONNECTED) {`
			`continue;`
			`}`

			`if (nodemap->nodes[i].flags != remote_nodemaps[j]->nodes[i].flags) {`
			`DEBUG(DEBUG_ERR, (__location__ " Remote node:%u has different flags for node %u. It has 0x%02x vs our 0x%02x\n",`
			`nodemap->nodes[j].pnn,`
			`nodemap->nodes[i].pnn,`
			`remote_nodemaps[j]->nodes[i].flags,`
recoverd: Fix printing of node flags from local information Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 124e2a471aeda9c900fd898178a30522d7d74221) 2013-01-23 07:35:47 +04:00			`nodemap->nodes[i].flags));`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`if (i == j) {`
			`DEBUG(DEBUG_ERR,("Use flags 0x%02x from remote node %d for cluster update of its own flags\n", remote_nodemaps[j]->nodes[i].flags, j));`
			`update_flags_on_all_nodes(ctdb, nodemap, nodemap->nodes[i].pnn, remote_nodemaps[j]->nodes[i].flags);`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`do_recovery(rec, mem_ctx, pnn, nodemap,`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`} else {`
			`DEBUG(DEBUG_ERR,("Use flags 0x%02x from local recmaster node for cluster update of node %d flags\n", nodemap->nodes[i].flags, i));`
			`update_flags_on_all_nodes(ctdb, nodemap, nodemap->nodes[i].pnn, nodemap->nodes[i].flags);`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`do_recovery(rec, mem_ctx, pnn, nodemap,`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
redo and update how we synchronize flags across the cluster. this simplifies the code and should close a race condition between the local recovery daemon and a remote node when flags are changing. (This used to be ctdb commit 32d460b8469eb53145f04161a5d01166f9b5f09e) 2008-12-05 08:32:30 +03:00			`}`
			`}`
			`}`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`

ctdb_recoverd: Move num_lmasters calculation to near where it is used Unless this node is the recovery master then this is not needed. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-29 12:00:17 +03:00
			`/* count how many active nodes have the lmaster capability */`
			`num_lmasters = 0;`
			`for (i=0; i<nodemap->num; i++) {`
			`if (!(nodemap->nodes[i].flags & NODE_FLAGS_INACTIVE)) {`
			`if (ctdb_node_has_capabilities(rec->caps,`
			`ctdb->nodes[i]->pnn,`
			`CTDB_CAP_LMASTER)) {`
			`num_lmasters++;`
			`}`
			`}`
			`}`

update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00
recoverd: Fix the VNN lmaster consistency check It does cope with node that don't have the lmaster capability. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 588172bcb6bf267339e2bd09e23d2c4904a27a41) 2013-09-26 07:11:04 +04:00			`/* There must be the same number of lmasters in the vnn map as`
			`* there are active nodes with the lmaster capability... or`
			`* do a recovery.`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`*/`
ctdb-recoverd: Make num_lmasters a local variable It isn't used anywhere else and is always re-initialised to 0. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-29 09:49:02 +03:00			`if (vnnmap->size != num_lmasters) {`
recoverd: Fix the VNN lmaster consistency check It does cope with node that don't have the lmaster capability. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 588172bcb6bf267339e2bd09e23d2c4904a27a41) 2013-09-26 07:11:04 +04:00			`DEBUG(DEBUG_ERR, (__location__ " The vnnmap count is different from the number of active lmaster nodes: %u vs %u\n",`
ctdb-recoverd: Make num_lmasters a local variable It isn't used anywhere else and is always re-initialised to 0. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-03-29 09:49:02 +03:00			`vnnmap->size, num_lmasters));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, ctdb->pnn);`
			`do_recovery(rec, mem_ctx, pnn, nodemap, vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`

			`/* verify that all active nodes in the nodemap also exist in`
			`the vnnmap.`
			`*/`
			`for (j=0; j<nodemap->num; j++) {`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`continue;`
			`}`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`if (nodemap->nodes[j].pnn == pnn) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`continue;`
			`}`

			`for (i=0; i<vnnmap->size; i++) {`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`if (vnnmap->map[i] == nodemap->nodes[j].pnn) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`break;`
			`}`
			`}`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`if (i == vnnmap->size) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Node %u is active in the nodemap but did not exist in the vnnmap\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
			`do_recovery(rec, mem_ctx, pnn, nodemap, vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`
			`}`


also verify that the generation id is the same on all the nodes and if not, trigger a recovery (This used to be ctdb commit 46b8a66ee70419c153acf45eeec88c1fc8f230ce) 2007-05-04 05:57:45 +04:00			`/* verify that all other nodes have the same vnnmap`
			`and are from the same generation`
			`*/`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`for (j=0; j<nodemap->num; j++) {`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`continue;`
			`}`
change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a) 2007-09-04 04:33:10 +04:00			`if (nodemap->nodes[j].pnn == pnn) {`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`continue;`
			`}`

change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`ret = ctdb_ctrl_getvnnmap(ctdb, CONTROL_TIMEOUT(), nodemap->nodes[j].pnn,`
formatting fixes (This used to be ctdb commit ed63a2057698aed3931762605b2ea2368681af2b) 2007-06-07 12:39:37 +04:00			`mem_ctx, &remote_vnnmap);`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Unable to get vnnmap from remote node %u\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn));`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`

also verify that the generation id is the same on all the nodes and if not, trigger a recovery (This used to be ctdb commit 46b8a66ee70419c153acf45eeec88c1fc8f230ce) 2007-05-04 05:57:45 +04:00			`/* verify the vnnmap generation is the same */`
			`if (vnnmap->generation != remote_vnnmap->generation) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Remote node %u has different generation of vnnmap. %u vs %u (ours)\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn, remote_vnnmap->generation, vnnmap->generation));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
			`do_recovery(rec, mem_ctx, pnn, nodemap, vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
also verify that the generation id is the same on all the nodes and if not, trigger a recovery (This used to be ctdb commit 46b8a66ee70419c153acf45eeec88c1fc8f230ce) 2007-05-04 05:57:45 +04:00			`}`

update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`/* verify the vnnmap size is the same */`
			`if (vnnmap->size != remote_vnnmap->size) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Remote node %u has different size of vnnmap. %u vs %u (ours)\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn, remote_vnnmap->size, vnnmap->size));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
			`do_recovery(rec, mem_ctx, pnn, nodemap, vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`

			`/* verify the vnnmap is the same */`
			`for (i=0;i<vnnmap->size;i++) {`
			`if (remote_vnnmap->map[i] != vnnmap->map[i]) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR, (__location__ " Remote node %u has different vnnmap.\n",`
change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`nodemap->nodes[j].pnn));`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`ctdb_set_culprit(rec, nodemap->nodes[j].pnn);`
store the num_active variable (number of connected/active nodes) inside the rec structure and avoid passing this as an extra parameter to do_recovery() (This used to be ctdb commit 8bb229aa3b4bd41e48d4e4e2e148d8680c8ba436) 2008-02-29 04:55:20 +03:00			`do_recovery(rec, mem_ctx, pnn, nodemap,`
new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a) 2009-09-03 20:20:39 +04:00			`vnnmap);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`return;`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00			`}`
			`}`
			`}`

ctdb-ipalloc: Drop remote IP verification It is only run during a takeover run and only logs errors. It doesn't actually do anything to fix potential errors. The takeover run should fix any inconsistencies anyway. Instead, leave a comment in the recovery daemon's monitoring loop to add proper remote IP verification later. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-20 13:41:05 +03:00			`/* FIXME: Add remote public IP checking to ensure that nodes`
			`* have the IP addresses that are allocated to them. */`

ctdb-recoverd: Move takeover run checks after recover checks If a recovery is going to be done then this will be followed by a takeover run anyway. So, there's no use doing the takeover run checks, potentially doing a takeover run and then doing a recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 09:00:02 +03:00			`takeover_run_checks:`

ctdb-recoverd: Unify takeover run triggering code in main loop Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri May 13 17:15:57 CEST 2016 on sn-devel-144 2016-05-03 09:07:34 +03:00			`/* If there are IP takeover runs requested or the previous one`
			`* failed then perform one and notify the waiters */`
ctdb-recoverd: Move takeover run checks after recover checks If a recovery is going to be done then this will be followed by a takeover run anyway. So, there's no use doing the takeover run checks, potentially doing a takeover run and then doing a recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 09:00:02 +03:00			`if (!ctdb_op_is_disabled(rec->takeover_run) &&`
ctdb-recoverd: Unify takeover run triggering code in main loop Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri May 13 17:15:57 CEST 2016 on sn-devel-144 2016-05-03 09:07:34 +03:00			`(rec->reallocate_requests || rec->need_takeover_run)) {`
ctdb-recoverd: Move takeover run checks after recover checks If a recovery is going to be done then this will be followed by a takeover run anyway. So, there's no use doing the takeover run checks, potentially doing a takeover run and then doing a recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-03 09:00:02 +03:00			`process_ipreallocate_requests(ctdb, rec);`
			`}`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`}`
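
The consistency checks at the end of the loop above (lmaster count, vnnmap membership, and generation/size/content equality with each remote node) can be sketched in isolation. This is only an illustrative sketch and not the ctdb code: every `sketch_*` name, the simplified structs, and the `SKETCH_FLAG_INACTIVE` value are assumptions invented for the example, and the real code triggers `do_recovery()` instead of returning a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for NODE_FLAGS_INACTIVE; the value is made up. */
#define SKETCH_FLAG_INACTIVE 0x1u

/* Hypothetical, simplified node entry (not the ctdb node structure). */
struct sketch_node {
	uint32_t pnn;
	uint32_t flags;
	bool lmaster_capable;
};

/* Hypothetical, simplified vnnmap (not the ctdb vnnmap structure). */
struct sketch_vnnmap {
	uint32_t generation;
	uint32_t size;
	const uint32_t *map;
};

/* Count active nodes that have the lmaster capability. */
static uint32_t sketch_count_lmasters(const struct sketch_node *nodes,
				      size_t num)
{
	uint32_t count = 0;

	for (size_t i = 0; i < num; i++) {
		if (!(nodes[i].flags & SKETCH_FLAG_INACTIVE) &&
		    nodes[i].lmaster_capable) {
			count++;
		}
	}
	return count;
}

/* Return true if pnn appears somewhere in the vnnmap. */
static bool sketch_vnnmap_contains(const struct sketch_vnnmap *v,
				   uint32_t pnn)
{
	assert(v != NULL);

	for (uint32_t i = 0; i < v->size; i++) {
		if (v->map[i] == pnn) {
			return true;
		}
	}
	return false;
}

/* Return true if both maps agree on generation, size and every entry.
 * The size comparison comes before the entry loop so that the loop
 * never reads past the end of the shorter map. */
static bool sketch_vnnmap_equal(const struct sketch_vnnmap *local,
				const struct sketch_vnnmap *remote)
{
	if (local->generation != remote->generation) {
		return false;
	}
	if (local->size != remote->size) {
		return false;
	}
	for (uint32_t i = 0; i < local->size; i++) {
		if (local->map[i] != remote->map[i]) {
			return false;
		}
	}
	return true;
}
```

In this sketch a caller would treat `local->size != sketch_count_lmasters(...)`, a missing active node, or an unequal remote map as the condition that calls for a recovery.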

ctdb-recoverd: Release recovery lock on exit The recovery lock helper must exit when it notices its parent is gone. However, that can take a few seconds. The usual way of terminating the recovery daemon is for the main ctdbd to send it a SIGTERM. Installing a handler is nice and simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-02 02:26:40 +03:00			`static void recd_sig_term_handler(struct tevent_context *ev,`
			`struct tevent_signal *se, int signum,`
			`int count, void *dont_care,`
			`void *private_data)`
			`{`
			`struct ctdb_recoverd *rec = talloc_get_type_abort(`
			`private_data, struct ctdb_recoverd);`

ctdb-recoverd: Log a message when terminating Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-11-25 06:57:30 +03:00			`DEBUG(DEBUG_ERR, ("Received SIGTERM, exiting\n"));`
ctdb-recoverd: Release recovery lock on exit The recovery lock helper must exit when it notices its parent is gone. However, that can take a few seconds. The usual way of terminating the recovery daemon is for the main ctdbd to send it a SIGTERM. Installing a handler is nice and simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-02 02:26:40 +03:00			`ctdb_recovery_unlock(rec);`
			`exit(0);`
			`}`


speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`/*`
			`the main monitoring loop`
			`*/`
			`static void monitor_cluster(struct ctdb_context *ctdb)`
			`{`
ctdb-recoverd: Release recovery lock on exit The recovery lock helper must exit when it notices its parent is gone. However, that can take a few seconds. The usual way of terminating the recovery daemon is for the main ctdbd to send it a SIGTERM. Installing a handler is nice and simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-02 02:26:40 +03:00			`struct tevent_signal *se;`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`struct ctdb_recoverd *rec;`

			`DEBUG(DEBUG_NOTICE,("monitor_cluster starting\n"));`

			`rec = talloc_zero(ctdb, struct ctdb_recoverd);`
			`CTDB_NO_MEMORY_FATAL(ctdb, rec);`

			`rec->ctdb = ctdb;`
ctdb-recoverd: Explicitly set initial recovery master to unknown Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-11-10 05:54:47 +03:00			`rec->recmaster = CTDB_UNKNOWN_PNN;`
ctdb-recoverd: Recovery lock handle should be in recovery deamon context This shouldn't be in the CTDB context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-05-24 07:54:39 +03:00			`rec->recovery_lock_handle = NULL;`
added health monitoring logic to ctdb, so a node loses its public IP address if one of the sybsystem event scripts reports a problem (This used to be ctdb commit c7a089256d86cec21097453bce5acbccee87413f) 2007-06-06 04:25:46 +04:00
ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-08 12:52:12 +03:00			`rec->takeover_run = ctdb_op_init(rec, "takeover runs");`
			`CTDB_NO_MEMORY_FATAL(ctdb, rec->takeover_run);`
recoverd: do_takeover_run() should mark when a takeover run is in progress Nested takeover runs should never happens so they should fail. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8ed29c60c0a7dd29f2a6efdf694d38e94281e1c4) 2013-09-03 05:20:01 +04:00
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 06:47:33 +03:00			`rec->recovery = ctdb_op_init(rec, "recoveries");`
			`CTDB_NO_MEMORY_FATAL(ctdb, rec->recovery);`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`rec->priority_time = timeval_current();`
ctdb-recoverd: Freeze databases whenever the node is INACTIVE If the node becomes stopped or banned after recovery is marked active, then it will never freeze the databases, and hence the node will keep banning itself indefinitely, until ctdbd is restarted. This is a regression from 4.3, introduced with b4357a79d916b1f8ade8fa78563fbef0ce670aa9 and d8f3b490bbb691c9916eed0df5b980c1aef23c85 BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Jun 1 17:36:12 CEST 2016 on sn-devel-144 2016-06-01 05:10:46 +03:00			`rec->frozen_on_inactive = false;`
force an update of the flags from the recmaster after each monitoring run (This used to be ctdb commit 251aeadc8b16a9c27a4bae78c97ad6e93e6cfdf4) 2008-06-26 07:08:37 +04:00
ctdb-recoverd: Release recovery lock on exit The recovery lock helper must exit when it notices its parent is gone. However, that can take a few seconds. The usual way of terminating the recovery daemon is for the main ctdbd to send it a SIGTERM. Installing a handler is nice and simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2016-06-02 02:26:40 +03:00			`se = tevent_add_signal(ctdb->ev, ctdb, SIGTERM, 0,`
			`recd_sig_term_handler, rec);`
			`if (se == NULL) {`
			`DEBUG(DEBUG_ERR, ("Failed to install SIGTERM handler\n"));`
			`exit(1);`
			`}`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`/* register a message port for sending memory dumps */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_MEM_DUMP, mem_dump_handler, rec);`
update getvnnmap control to take a timeout parameter dont explicitely free the vnnmap pointer in the getvnnmap control this is freed by the mem_ctx instead add code to the recoverd to detect when/if recovery is required veiry that the number of active nodes, the nodemap and the vnn map is consistent across the entire cluster and if not trigger a recovery (which right now just prints "we need to do recovery" to the screen. (This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5) 2007-05-04 03:45:53 +04:00
ctdb-recoverd: Add message handler to assigning banning credits This will be called from recovery helper to assign banning credits to misbehaving node. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-03-17 09:26:30 +03:00			`/* when a node is assigned banning credits */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_BANNING,`
			`banning_handler, rec);`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`/* register a message port for recovery elections */`
ctdb-include: Use new protocol definitions This gets rid of the duplicate definitions from ctdb_protocol.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-29 09:51:52 +03:00			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_ELECTION, election_handler, rec);`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00
			`/* when nodes are disabled/enabled */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_SET_NODE_FLAGS, monitor_handler, rec);`

			`/* when we are asked to push out a flag change */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_PUSH_NODE_FLAGS, push_flags_handler, rec);`

			`/* register a message port for vacuum fetch */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_VACUUM_FETCH, vacuum_fetch_handler, rec);`

			`/* register a message port for reloadnodes */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_RELOAD_NODES, reload_nodes_handler, rec);`

			`/* register a message port for performing a takeover run */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_TAKEOVER_RUN, ip_reallocate_handler, rec);`

			`/* register a message port for disabling the ip check for a short while */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_DISABLE_IP_CHECK, disable_ip_check_handler, rec);`

When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2) 2012-02-27 23:56:04 +04:00			`/* register a message port for forcing a rebalance of a node next`
			`reallocation */`
			`ctdb_client_set_message_handler(ctdb, CTDB_SRVID_REBALANCE_NODE, recd_node_rebalance_handler, rec);`

recoverd: New SRVID message CTDB_SRVID_DISABLE_TAKEOVER_RUNS This implements a superset of CTDB_SRVID_DISABLE_IP_CHECK. It stops the IP checks but also causes any attempted takeover runs to fail and be rescheduled. This is meant to completely stop IP movements. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 00db4de53a0d86013e79e6577e7e6cf3ef864e56) 2013-08-27 09:04:40 +04:00			`/* Register a message port for disabling takeover runs */`
			`ctdb_client_set_message_handler(ctdb,`
			`CTDB_SRVID_DISABLE_TAKEOVER_RUNS,`
			`disable_takeover_runs_handler, rec);`

ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-02-06 07:06:44 +03:00			`/* Register a message port for disabling recoveries */`
			`ctdb_client_set_message_handler(ctdb,`
			`CTDB_SRVID_DISABLE_RECOVERIES,`
			`disable_recoveries_handler, rec);`

ctdb-recoverd: Detach database from recovery daemon As part of vacuuming, recoverd attaches to databases to migrate records. When detaching a database from main daemon, it should be removed from recovery daemon also. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Wed Apr 23 17:05:45 CEST 2014 on sn-devel-104 2014-04-22 09:24:49 +04:00			`/* register a message port for detaching database */`
			`ctdb_client_set_message_handler(ctdb,`
			`CTDB_SRVID_DETACH_DATABASE,`
			`detach_database_handler, rec);`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`for (;;) {`
			`TALLOC_CTX *mem_ctx = talloc_new(ctdb);`
speed startup: don't wait a full recovery interval if we've already waited We currently sleep for one second, whether or not we've already slept. Change this to sleep for the remainder of the second, if at all. Seconds between ctdbd first log message and node healthy: BEFORE: 18.09 AFTER: 17.08 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fde760b5f39c77172308a583da4c2443b71541c9) 2010-06-22 17:20:35 +04:00			`struct timeval start;`
			`double elapsed;`

speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`if (!mem_ctx) {`
			`DEBUG(DEBUG_CRIT,(__location__`
			`" Failed to create temp context\n"));`
			`exit(-1);`
			`}`

speed startup: don't wait a full recovery interval if we've already waited We currently sleep for one second, whether or not we've already slept. Change this to sleep for the remainder of the second, if at all. Seconds between ctdbd first log message and node healthy: BEFORE: 18.09 AFTER: 17.08 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fde760b5f39c77172308a583da4c2443b71541c9) 2010-06-22 17:20:35 +04:00			`start = timeval_current();`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`main_loop(ctdb, rec, mem_ctx);`
			`talloc_free(mem_ctx);`

			`/* we only check for recovery once every second */`
speed startup: don't wait a full recovery interval if we've already waited We currently sleep for one second, whether or not we've already slept. Change this to sleep for the remainder of the second, if at all. Seconds between ctdbd first log message and node healthy: BEFORE: 18.09 AFTER: 17.08 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fde760b5f39c77172308a583da4c2443b71541c9) 2010-06-22 17:20:35 +04:00			`elapsed = timeval_elapsed(&start);`
			`if (elapsed < ctdb->tunable.recover_interval) {`
			`ctdb_wait_timeout(ctdb, ctdb->tunable.recover_interval`
			`- elapsed);`
			`}`
speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2) 2010-06-22 17:20:23 +04:00			`}`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`}`
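The pacing at the end of the loop above (wait only for whatever is left of `recover_interval` after `main_loop()` has run) can be sketched in isolation. This is a minimal illustration, not the ctdb code: `remaining_wait` is a hypothetical helper name, and plain `double` seconds stand in for the `timeval_elapsed()` result and the tunable.

```c
/* Hypothetical helper (not part of ctdb): given the configured interval
 * and the time one pass of the loop took, both in seconds, return how
 * long the caller should still wait before the next pass.
 * If the pass already took at least `interval`, no wait is needed. */
static double remaining_wait(double interval, double elapsed)
{
	if (elapsed < interval) {
		return interval - elapsed;
	}
	return 0.0;
}
```

With an interval of 1 second, a pass that took 0.25 seconds leaves 0.75 seconds to wait, while a pass that took 2 seconds starts the next iteration immediately.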

added health monitoring logic to ctdb, so a node loses its public IP address if one of the sybsystem event scripts reports a problem (This used to be ctdb commit c7a089256d86cec21097453bce5acbccee87413f) 2007-06-06 04:25:46 +04:00			`/*`
implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`event handler for when the main ctdbd dies`
			`*/`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_recoverd_parent(struct tevent_context *ev,`
			`struct tevent_fd *fde,`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`uint16_t flags, void *private_data)`
			`{`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ALERT,("recovery daemon parent died - exiting\n"));`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`_exit(1);`
			`}`

Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00			`/*`
			`called regularly to verify that the recovery daemon is still running`
			`*/`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_check_recd(struct tevent_context *ev,`
			`struct tevent_timer *te,`
			`struct timeval yt, void *p)`
Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00			`{`
			`struct ctdb_context *ctdb = talloc_get_type(p, struct ctdb_context);`

Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78) 2012-05-03 05:42:41 +04:00			`if (ctdb_kill(ctdb, ctdb->recoverd_pid, 0) != 0) {`
If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main deamon too. While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust. (This used to be ctdb commit 0566ef3d6cef809bda204877c493c80ff9eb2c40) 2011-03-01 04:09:42 +03:00			`DEBUG(DEBUG_ERR,("Recovery daemon (pid:%d) is no longer running. Trying to restart recovery daemon.\n", (int)ctdb->recoverd_pid));`
Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_add_timer(ctdb->ev, ctdb, timeval_zero(),`
			`ctdb_restart_recd, ctdb);`
Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00
If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main deamon too. While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust. (This used to be ctdb commit 0566ef3d6cef809bda204877c493c80ff9eb2c40) 2011-03-01 04:09:42 +03:00			`return;`
Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00			`}`

ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_add_timer(ctdb->ev, ctdb->recd_ctx,`
			`timeval_current_ofs(30, 0),`
			`ctdb_check_recd, ctdb);`
Monitor that the recovery daemon is still running from the main ctdb daemon and if it has terminated, then we shut down the main daemon as well (This used to be ctdb commit 7e587acaf8006254e89ff9b4bf48454821c85863) 2008-05-06 05:19:17 +04:00			`}`
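The liveness probe in `ctdb_check_recd()` relies on sending signal 0, which performs the existence and permission checks without delivering a signal. A minimal sketch using plain POSIX `kill()` follows; `process_alive` is a hypothetical name, and the sketch deliberately omits the child tracking that `ctdb_kill()` adds to guard against pid reuse.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical helper (not part of ctdb): returns 1 if a process with
 * this pid currently exists, 0 otherwise. Signal 0 is never delivered;
 * kill() only reports whether it could have been. EPERM means the
 * process exists but we are not allowed to signal it. */
static int process_alive(pid_t pid)
{
	if (kill(pid, 0) == 0) {
		return 1;
	}
	return errno == EPERM;
}
```

A process that has exited but has not yet been reaped still counts as existing for this check, and a recycled pid would be reported as alive, which is the case the `ctdb_kill()` wrapper is described as protecting against.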

ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void recd_sig_child_handler(struct tevent_context *ev,`
			`struct tevent_signal *se, int signum,`
			`int count, void *dont_care,`
			`void *private_data)`
proper waitpid() fix. remove all waitpid() calls and use the event system to trap sigchld (This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358) 2008-07-09 08:02:54 +04:00			`{`
			`// struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);`
			`int status;`
			`pid_t pid = -1;`

			`while (pid != 0) {`
			`pid = waitpid(-1, &status, WNOHANG);`
			`if (pid == -1) {`
dont log an error if waitpid returns -1 and errno is ECHILD (This used to be ctdb commit fdf50f3e774e3980af81c0b6f4ff81d085f4f697) 2009-06-19 09:55:13 +04:00			`if (errno != ECHILD) {`
			`DEBUG(DEBUG_ERR, (__location__ " waitpid() returned error. errno:%s(%d)\n", strerror(errno),errno));`
			`}`
proper waitpid() fix. remove all waitpid() calls and use the event system to trap sigchld (This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358) 2008-07-09 08:02:54 +04:00			`return;`
			`}`
			`if (pid > 0) {`
			`DEBUG(DEBUG_DEBUG, ("RECD SIGCHLD from %d\n", (int)pid));`
			`}`
			`}`
			`}`
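The handler above drains every exited child in one pass, because several `SIGCHLD` notifications can be coalesced into a single handler call. A standalone sketch of the same non-blocking reap loop is shown below; `reap_children` is a hypothetical name, and unlike the handler it returns a count instead of logging. Retrying on `EINTR` is an addition of this sketch.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper (not part of ctdb): reap every child that has
 * already exited, without blocking. Returns the number of children
 * reaped, or -1 on an unexpected waitpid() error. */
static int reap_children(void)
{
	int reaped = 0;

	for (;;) {
		int status;
		pid_t pid = waitpid(-1, &status, WNOHANG);

		if (pid > 0) {
			reaped++;          /* one exited child collected */
			continue;
		}
		if (pid == 0) {
			return reaped;     /* children exist, none has exited yet */
		}
		if (errno == EINTR) {
			continue;          /* interrupted, try again */
		}
		if (errno == ECHILD) {
			return reaped;     /* no children at all: not an error */
		}
		return -1;
	}
}
```

As in the original handler, `ECHILD` is treated as the normal end of the loop rather than as a failure.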

implement a scheme where nodes are banned if they continuously caused the cluster to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes) (This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c) 2007-06-07 09:18:55 +04:00			`/*`
			`startup the recovery daemon as a child of the main ctdb daemon`
			`*/`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`int ctdb_start_recoverd(struct ctdb_context *ctdb)`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`{`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`int fd[2];`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`struct tevent_signal *se;`
event: Update events to latest Samba version 0.9.8 In Samba this is now called "tevent", and while we use the backwards compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now a separate tevent_fd_set_auto_close() function. This is based on Samba version 7f29f817fa939ef1bbb740584f09e76e2ecd5b06. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726) 2010-08-18 03:46:31 +04:00			`struct tevent_fd *fde;`
ctdb-daemon: Initialize logging in recovery daemon Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-11-29 08:49:41 +03:00			`int ret;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`if (pipe(fd) != 0) {`
			`return -1;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`}`

ctdb-logging: Remove log ringbuffer As far as we know, nobody uses this and it just complicates the logging subsystem. Remove all ringbuffer code and documentation. Update the local daemons startup code correspondingly. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org> 2014-08-08 06:51:03 +04:00			`ctdb->recoverd_pid = ctdb_fork(ctdb);`
when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b) 2007-10-22 06:34:08 +04:00			`if (ctdb->recoverd_pid == -1) {`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`return -1;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`}`
daemon: On shutdown, destroy timed events that check if recoverd is active When CTDB is shutting down, recovery daemon is stopped, but the event that checks if recovery daemon is still alive is not destroyed. So recovery master is restarted during shutdown if CTDB daemon takes longer to shutdown. There are two processes that check if recovery daemon is working. 1. ctdb_check_recd() - which checks every 30 seconds if the recovery daemon process exists. 2. ctdb_recd_ping_timeout() - which is triggered when recovery daemon fails to ping CTDB daemon. Both the events are periodic and need to be destroyed when shutting down. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 746168df2e691058e601016110fae818c6a265c3) 2012-12-04 08:05:44 +04:00
when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b) 2007-10-22 06:34:08 +04:00			`if (ctdb->recoverd_pid != 0) {`
daemon: On shutdown, destroy timed events that check if recoverd is active When CTDB is shutting down, recovery daemon is stopped, but the event that checks if recovery daemon is still alive is not destroyed. So recovery master is restarted during shutdown if CTDB daemon takes longer to shutdown. There are two processes that check if recovery daemon is working. 1. ctdb_check_recd() - which checks every 30 seconds if the recovery daemon process exists. 2. ctdb_recd_ping_timeout() - which is triggered when recovery daemon fails to ping CTDB daemon. Both the events are periodic and need to be destroyed when shutting down. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 746168df2e691058e601016110fae818c6a265c3) 2012-12-04 08:05:44 +04:00			`talloc_free(ctdb->recd_ctx);`
			`ctdb->recd_ctx = talloc_new(ctdb);`
			`CTDB_NO_MEMORY(ctdb, ctdb->recd_ctx);`

moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`close(fd[0]);`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`tevent_add_timer(ctdb->ev, ctdb->recd_ctx,`
			`timeval_current_ofs(30, 0),`
			`ctdb_check_recd, ctdb);`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`return 0;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`}`

moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`close(fd[1]);`

			`srandom(getpid() ^ time(NULL));`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00
ctdb-daemon: Initialize logging in recovery daemon Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-11-29 08:49:41 +03:00			`ret = logging_init(ctdb, NULL, NULL, "ctdb-recoverd");`
			`if (ret != 0) {`
			`return -1;`
			`}`

ctdb: Use prctl_set_comment from lib/util Signed-off-by: Christof Schmitt <cs@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com> 2015-09-24 02:10:59 +03:00			`prctl_set_comment("ctdb_recovered");`
ctdb-daemon: Remove setting of debug_extra from switch_from_server_to_client() Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2016-11-25 06:44:10 +03:00			`if (switch_from_server_to_client(ctdb) != 0) {`
create a helper function that converts a ctdb instance in daemon mode to become a ctdb client instance. use this from the recovery daemon child process to switch to client mode and connect back to the main daemon (This used to be ctdb commit 16f31786a031255ab5b3099a0a3c745de973347a) 2009-03-23 04:37:30 +03:00			`DEBUG(DEBUG_CRIT, (__location__ " ERROR: failed to switch recovery daemon into client mode. Shutting down.\n"));`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`exit(1);`
			`}`

Drop the debug level for logging fd creation to DEBUG_DEBUG (This used to be ctdb commit eae1d4f9e52e73b4d8769868fffdafa590d03784) 2010-02-03 22:37:41 +03:00			`DEBUG(DEBUG_DEBUG, (__location__ " Created PIPE FD:%d to recovery daemon\n", fd[0]));`
add logging everytime we create a filedescriptor in the main ctdb daemon so we can spot if there are leaks. plug two leaks for filedescriptors related to when sending ARP fail and one leak when we can not parse the local address during tcp connection establish (This used to be ctdb commit ddd089810a14efe4be6e1ff3eccaa604e4913c9e) 2009-10-15 04:24:54 +04:00
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`fde = tevent_add_fd(ctdb->ev, ctdb, fd[0], TEVENT_FD_READ,`
			`ctdb_recoverd_parent, &fd[0]);`
event: Update events to latest Samba version 0.9.8 In Samba this is now called "tevent", and while we use the backwards compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now a separate tevent_fd_set_auto_close() function. This is based on Samba version 7f29f817fa939ef1bbb740584f09e76e2ecd5b06. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726) 2010-08-18 03:46:31 +04:00			`tevent_fd_set_auto_close(fde);`
create a helper function that converts a ctdb instance in daemon mode to become a ctdb client instance. use this from the recovery daemon child process to switch to client mode and connect back to the main daemon (This used to be ctdb commit 16f31786a031255ab5b3099a0a3c745de973347a) 2009-03-23 04:37:30 +03:00
proper waitpid() fix. remove all waitpid() calls and use the event system to trap sigchld (This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358) 2008-07-09 08:02:54 +04:00			`/* set up a handler to pick up sigchld */`
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`se = tevent_add_signal(ctdb->ev, ctdb, SIGCHLD, 0,`
			`recd_sig_child_handler, ctdb);`
proper waitpid() fix. remove all waitpid() calls and use the event system to trap sigchld (This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358) 2008-07-09 08:02:54 +04:00			`if (se == NULL) {`
			`DEBUG(DEBUG_CRIT,("Failed to set up signal handler for SIGCHLD in recovery daemon\n"));`
			`exit(1);`
			`}`

moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`monitor_cluster(ctdb);`
recovery daemon with recovery master election election is primitive, it elects the lowest vnn as the recovery master two new controls, to get/set recovery master for a node to use recovery daemon, start one ./bin/recoverd --socket=ctdb.socket* for each ctdb daemon it has been briefly tested by deleting and adding nodes to a 4 node cluster but needs more testing (This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3) 2007-05-07 00:51:58 +04:00
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ALERT,("ERROR: ctdb_recoverd finished!?\n"));`
moved the recovery daemon into the main ctdbd and enable it by default (This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0) 2007-05-15 09:13:36 +04:00			`return -1;`
start working on a recovery daemon change ctdb_control so it takes a timeval pointer as argument. this is the timeout. if the node has not responded within hte timeout ctdb_control will return an error instead of hanging. if the timeval pointer is NULL then the call will block indefinitely if there is no response. this is used for now in the createdb control but all the helpers ctdb_ctrl_* should probably be updated to take a timeout parameter as well. (This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211) 2007-05-04 02:30:18 +04:00			`}`
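`ctdb_start_recoverd()` detects the death of the main daemon through a pipe: the parent keeps the write end open and never writes to it, while the child watches the read end. When the parent exits, the kernel closes the write end and the read end reports end-of-file. A minimal sketch of that check with `poll()` follows; `pipe_writer_gone` is a hypothetical name, and the real code registers the descriptor with tevent instead of polling it directly.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not part of ctdb): returns 1 if every write end
 * of the pipe has been closed, 0 if a writer still holds it open, and
 * -1 on error. Assumes nothing is ever written to the pipe, so any
 * readability can only mean end-of-file. */
static int pipe_writer_gone(int read_fd)
{
	struct pollfd pfd = { .fd = read_fd, .events = POLLIN };
	int n = poll(&pfd, 1, 0);   /* timeout 0: check without blocking */

	if (n < 0) {
		return -1;
	}
	if (n == 0) {
		return 0;               /* no event: writer still present */
	}
	if (pfd.revents & POLLHUP) {
		return 1;               /* hang-up: write end closed */
	}
	if (pfd.revents & POLLIN) {
		char c;
		/* Some systems report EOF as POLLIN; read() == 0 confirms it. */
		return read(read_fd, &c, 1) == 0 ? 1 : 0;
	}
	return -1;
}
```

The appeal of this design is that it needs no cooperation from the parent: the descriptor is closed by the kernel however the parent terminates.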
when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b) 2007-10-22 06:34:08 +04:00
			`/*`
			`shutdown the recovery daemon`
			`*/`
			`void ctdb_stop_recoverd(struct ctdb_context *ctdb)`
			`{`
			`if (ctdb->recoverd_pid == 0) {`
			`return;`
			`}`

merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_NOTICE,("Shutting down recovery daemon\n"));`
Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78) 2012-05-03 05:42:41 +04:00			`ctdb_kill(ctdb, ctdb->recoverd_pid, SIGTERM);`
daemon: On shutdown, destroy timed events that check if recoverd is active When CTDB is shutting down, recovery daemon is stopped, but the event that checks if recovery daemon is still alive is not destroyed. So recovery master is restarted during shutdown if CTDB daemon takes longer to shutdown. There are two processes that check if recovery daemon is working. 1. ctdb_check_recd() - which checks every 30 seconds if the recovery daemon process exists. 2. ctdb_recd_ping_timeout() - which is triggered when recovery daemon fails to ping CTDB daemon. Both the events are periodic and need to be destroyed when shutting down. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 746168df2e691058e601016110fae818c6a265c3) 2012-12-04 08:05:44 +04:00
			`TALLOC_FREE(ctdb->recd_ctx);`
			`TALLOC_FREE(ctdb->recd_ping_count);`
when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b) 2007-10-22 06:34:08 +04:00			`}`
If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main deamon too. While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust. (This used to be ctdb commit 0566ef3d6cef809bda204877c493c80ff9eb2c40) 2011-03-01 04:09:42 +03:00
ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> 2015-10-26 08:50:09 +03:00			`static void ctdb_restart_recd(struct tevent_context *ev,`
			`struct tevent_timer *te,`
			`struct timeval t, void *private_data)`
If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main deamon too. While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust. (This used to be ctdb commit 0566ef3d6cef809bda204877c493c80ff9eb2c40) 2011-03-01 04:09:42 +03:00			`{`
			`struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);`

			`DEBUG(DEBUG_ERR,("Restarting recovery daemon\n"));`
			`ctdb_stop_recoverd(ctdb);`
			`ctdb_start_recoverd(ctdb);`
			`}`
