samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-22 13:34:15 +03:00

1261 lines

34 KiB

C

break set/get vnn map out from ctdb_control and put it in ctdb_recover.c for the time being remove all the [de]marshalling and just pass a structure around instead (This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c) 2007-05-03 05:06:24 +04:00			`/*`
separate out the freeze/thaw handling from recovery (This used to be ctdb commit 0b0640bd8b8334961f240e0cf276ac112cd6e616) 2007-05-12 09:15:27 +04:00			`ctdb recovery code`
break set/get vnn map out from ctdb_control and put it in ctdb_recover.c for the time being remove all the [de]marshalling and just pass a structure around instead (This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c) 2007-05-03 05:06:24 +04:00
			`Copyright (C) Andrew Tridgell 2007`
			`Copyright (C) Ronnie Sahlberg 2007`

ctdb is GPL not LGPL (This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960) 2007-05-31 07:50:53 +04:00			`This program is free software; you can redistribute it and/or modify`
			`it under the terms of the GNU General Public License as published by`
update lib/replace from samba4 (This used to be ctdb commit f0555484105668c01c21f56322992e752e831109) 2007-07-10 09:29:31 +04:00			`the Free Software Foundation; either version 3 of the License, or`
ctdb is GPL not LGPL (This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960) 2007-05-31 07:50:53 +04:00			`(at your option) any later version.`

			`This program is distributed in the hope that it will be useful,`
break set/get vnn map out from ctdb_control and put it in ctdb_recover.c for the time being remove all the [de]marshalling and just pass a structure around instead (This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c) 2007-05-03 05:06:24 +04:00			`but WITHOUT ANY WARRANTY; without even the implied warranty of`
ctdb is GPL not LGPL (This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960) 2007-05-31 07:50:53 +04:00			`MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the`
			`GNU General Public License for more details.`

			`You should have received a copy of the GNU General Public License`
update lib/replace from samba4 (This used to be ctdb commit f0555484105668c01c21f56322992e752e831109) 2007-07-10 09:29:31 +04:00			`along with this program; if not, see <http://www.gnu.org/licenses/>.`
break set/get vnn map out from ctdb_control and put it in ctdb_recover.c for the time being remove all the [de]marshalling and just pass a structure around instead (This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c) 2007-05-03 05:06:24 +04:00			`*/`
			`#include "includes.h"`
			`#include "lib/events/events.h"`
			`#include "lib/tdb/include/tdb.h"`
add a ctdb uptime command that prints when ctdb was started and when the last recovery occured (This used to be ctdb commit b86e8ccbdac044bb949c4fc2ebb27635126272a9) 2008-01-17 03:33:23 +03:00			`#include "system/time.h"`
break set/get vnn map out from ctdb_control and put it in ctdb_recover.c for the time being remove all the [de]marshalling and just pass a structure around instead (This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c) 2007-05-03 05:06:24 +04:00			`#include "system/network.h"`
			`#include "system/filesys.h"`
			`#include "system/wait.h"`
			`#include "../include/ctdb_private.h"`
			`#include "lib/util/dlinklist.h"`
			`#include "db_wrap.h"`

more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`/*`
			`lock all databases - mark only`
			`*/`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`static int ctdb_lock_all_databases_mark(struct ctdb_context *ctdb, uint32_t priority)`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`{`
			`struct ctdb_db_context *ctdb_db;`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00
			`if ((priority < 1) || (priority > NUM_DB_PRIORITIES)) {`
			`DEBUG(DEBUG_ERR,(__location__ " Illegal priority when trying to mark all databases Prio:%u\n", priority));`
			`return -1;`
			`}`

			`if (ctdb->freeze_mode[priority] != CTDB_FREEZE_FROZEN) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Attempt to mark all databases locked when not frozen\n"));`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`return -1;`
			`}`
Port Volkers deadlock avoidance patch to HEAD. This patch ensures that we lock all non-notify related databases first and then the notify databases to avoiud a deadlock where samba needs to lock records on two databases at once (and notify being the second database). Newer versions of samba would instead use the set-db-prio control to set this explicitely on a database per database basis instead of relying on hardcoded database names. This patch will be reverted in the future when all updated versions of samba has been pushed out. (This used to be ctdb commit 70e7781df1f118a0e2632a9c634f3fd388fa6c8c) 2009-10-14 01:17:49 +04:00			`/* The dual loop is a workaround for older versions of samba`
			`that do not yet support the set-db-priority/lock order`
			`call. So that we get basic deadlock avoidance also for`
			`these old versions of samba.`
			`This code will be removed in the future.`
			`*/`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`for (ctdb_db=ctdb->db_list;ctdb_db;ctdb_db=ctdb_db->next) {`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`if (ctdb_db->priority != priority) {`
			`continue;`
			`}`
Port Volkers deadlock avoidance patch to HEAD. This patch ensures that we lock all non-notify related databases first and then the notify databases to avoiud a deadlock where samba needs to lock records on two databases at once (and notify being the second database). Newer versions of samba would instead use the set-db-prio control to set this explicitely on a database per database basis instead of relying on hardcoded database names. This patch will be reverted in the future when all updated versions of samba has been pushed out. (This used to be ctdb commit 70e7781df1f118a0e2632a9c634f3fd388fa6c8c) 2009-10-14 01:17:49 +04:00			`if (strstr(ctdb_db->db_name, "notify") != NULL) {`
			`continue;`
			`}`
			`if (tdb_lockall_mark(ctdb_db->ltdb->tdb) != 0) {`
			`return -1;`
			`}`
			`}`
			`for (ctdb_db=ctdb->db_list;ctdb_db;ctdb_db=ctdb_db->next) {`
			`if (ctdb_db->priority != priority) {`
			`continue;`
			`}`
			`if (strstr(ctdb_db->db_name, "notify") == NULL) {`
			`continue;`
			`}`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`if (tdb_lockall_mark(ctdb_db->ltdb->tdb) != 0) {`
			`return -1;`
			`}`
			`}`
			`return 0;`
			`}`
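
The two loops above visit the databases of one priority twice: first those whose name does not contain "notify", then those whose name does. A minimal, self-contained sketch of that ordering follows. `struct db_sketch` and `two_pass_order()` are hypothetical names made up for illustration; they are not part of ctdb, and the sketch only records the visiting order instead of calling `tdb_lockall_mark()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct ctdb_db_context: only the fields
   that the ordering logic needs. */
struct db_sketch {
	const char *db_name;
	unsigned priority;
	struct db_sketch *next;
};

/* Visit every database of the given priority in two passes:
   pass 0 takes names without "notify", pass 1 takes names with it.
   The visited names are written to out[]; returns the count written. */
static size_t two_pass_order(const struct db_sketch *list, unsigned priority,
			     const char **out, size_t max_out)
{
	size_t n = 0;
	int pass;

	for (pass = 0; pass < 2; pass++) {
		const struct db_sketch *db;

		for (db = list; db != NULL; db = db->next) {
			int is_notify = strstr(db->db_name, "notify") != NULL;

			if (db->priority != priority) {
				continue;
			}
			if (is_notify != pass) {
				continue;
			}
			if (n == max_out) {
				return n;
			}
			out[n++] = db->db_name;
		}
	}
	return n;
}
```

Folding both passes into one helper keeps the ordering rule in a single place; the original keeps two explicit loops, which is easier to delete once the workaround is no longer needed.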

			`/*`
			`lock all databases - unmark only`
			`*/`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`static int ctdb_lock_all_databases_unmark(struct ctdb_context *ctdb, uint32_t priority)`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`{`
			`struct ctdb_db_context *ctdb_db;`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00
			`if ((priority < 1) || (priority > NUM_DB_PRIORITIES)) {`
			`DEBUG(DEBUG_ERR,(__location__ " Illegal priority when trying to unmark all databases Prio:%u\n", priority));`
			`return -1;`
			`}`

			`if (ctdb->freeze_mode[priority] != CTDB_FREEZE_FROZEN) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("Attempt to unmark all databases locked when not frozen\n"));`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`return -1;`
			`}`
			`for (ctdb_db=ctdb->db_list;ctdb_db;ctdb_db=ctdb_db->next) {`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`if (ctdb_db->priority != priority) {`
			`continue;`
			`}`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`if (tdb_lockall_unmark(ctdb_db->ltdb->tdb) != 0) {`
			`return -1;`
			`}`
			`}`
			`return 0;`
			`}`


break set/get vnn map out from ctdb_control and put it in ctdb_recover.c for the time being remove all the [de]marshalling and just pass a structure around instead (This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c) 2007-05-03 05:06:24 +04:00			`int`
			`ctdb_control_getvnnmap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata)`
			`{`
			`CHECK_CONTROL_DATA_SIZE(0);`
separate the wire format and internal format for the vnn_map (This used to be ctdb commit 9a71718d87c5162f1423d85c2e86a01f6771925e) 2007-05-10 02:13:19 +04:00			`struct ctdb_vnn_map_wire *map;`
			`size_t len;`
break set/get vnn map out from ctdb_control and put it in ctdb_recover.c for the time being remove all the [de]marshalling and just pass a structure around instead (This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c) 2007-05-03 05:06:24 +04:00
separate the wire format and internal format for the vnn_map (This used to be ctdb commit 9a71718d87c5162f1423d85c2e86a01f6771925e) 2007-05-10 02:13:19 +04:00			`len = offsetof(struct ctdb_vnn_map_wire, map) + sizeof(uint32_t)*ctdb->vnn_map->size;`
			`map = talloc_size(outdata, len);`
fixed some incorrect CTDB_NO_MEMORY*() calls found after fixing the _VOID varient (This used to be ctdb commit 07c9133aedecaee3607ad3b6fa94e5c56417a9de) 2008-07-04 11:04:26 +04:00			`CTDB_NO_MEMORY(ctdb, map);`
separate the wire format and internal format for the vnn_map (This used to be ctdb commit 9a71718d87c5162f1423d85c2e86a01f6771925e) 2007-05-10 02:13:19 +04:00
			`map->generation = ctdb->vnn_map->generation;`
			`map->size = ctdb->vnn_map->size;`
			`memcpy(map->map, ctdb->vnn_map->map, sizeof(uint32_t)*map->size);`

			`outdata->dsize = len;`
			`outdata->dptr = (uint8_t *)map;`
break set/get vnn map out from ctdb_control and put it in ctdb_recover.c for the time being remove all the [de]marshalling and just pass a structure around instead (This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c) 2007-05-03 05:06:24 +04:00
			`return 0;`
			`}`
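
The length computation in `ctdb_control_getvnnmap()` sizes a header followed by a variable number of `uint32_t` entries. The sketch below shows the same `offsetof`-based sizing in isolation. The struct layout, `vnn_map_wire_len()` and `vnn_map_pack()` are assumptions made for illustration (the real `struct ctdb_vnn_map_wire` is defined elsewhere in ctdb), and plain `malloc()` stands in for `talloc_size()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wire layout modelled on the fields used above:
   a fixed header followed by `size` node numbers. */
struct vnn_map_wire_sketch {
	uint32_t generation;
	uint32_t size;
	uint32_t map[];	/* C99 flexible array member */
};

/* Bytes needed for a wire map holding `size` entries. */
static size_t vnn_map_wire_len(uint32_t size)
{
	return offsetof(struct vnn_map_wire_sketch, map)
		+ sizeof(uint32_t) * size;
}

/* Allocate and fill a wire map. Returns NULL on allocation failure.
   The caller owns the result and releases it with free(). */
static struct vnn_map_wire_sketch *vnn_map_pack(uint32_t generation,
						const uint32_t *nodes,
						uint32_t size,
						size_t *len_out)
{
	size_t len = vnn_map_wire_len(size);
	struct vnn_map_wire_sketch *w = malloc(len);

	if (w == NULL) {
		return NULL;
	}
	w->generation = generation;
	w->size = size;
	if (size > 0) {
		memcpy(w->map, nodes, sizeof(uint32_t) * size);
	}
	*len_out = len;
	return w;
}
```

Using `offsetof` rather than `sizeof` of the whole struct avoids counting any trailing padding or placeholder array element in the header size.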

			`int`
			`ctdb_control_setvnnmap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata)`
			`{`
fixed setvnnmap to use wire structures too (This used to be ctdb commit 1208e4219d220b80e2f74974cac8ed2b8956d3ef) 2007-05-10 02:22:26 +04:00			`struct ctdb_vnn_map_wire *map = (struct ctdb_vnn_map_wire *)indata.dptr;`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`int i;`
fixed setvnnmap to use wire structures too (This used to be ctdb commit 1208e4219d220b80e2f74974cac8ed2b8956d3ef) 2007-05-10 02:22:26 +04:00
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`for(i=1; i<=NUM_DB_PRIORITIES; i++) {`
			`if (ctdb->freeze_mode[i] != CTDB_FREEZE_FROZEN) {`
			`DEBUG(DEBUG_ERR,("Attempt to set vnnmap when not frozen\n"));`
			`return -1;`
			`}`
don't allow setvnnmap while not frozen (This used to be ctdb commit a73f47f565894cc7e346177d87f2e6813837e1c6) 2007-05-14 07:48:40 +04:00			`}`

fixed setvnnmap to use wire structures too (This used to be ctdb commit 1208e4219d220b80e2f74974cac8ed2b8956d3ef) 2007-05-10 02:22:26 +04:00			`talloc_free(ctdb->vnn_map);`

			`ctdb->vnn_map = talloc(ctdb, struct ctdb_vnn_map);`
			`CTDB_NO_MEMORY(ctdb, ctdb->vnn_map);`

			`ctdb->vnn_map->generation = map->generation;`
			`ctdb->vnn_map->size = map->size;`
			`ctdb->vnn_map->map = talloc_array(ctdb->vnn_map, uint32_t, map->size);`
			`CTDB_NO_MEMORY(ctdb, ctdb->vnn_map->map);`
break set/get vnn map out from ctdb_control and put it in ctdb_recover.c for the time being remove all the [de]marshalling and just pass a structure around instead (This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c) 2007-05-03 05:06:24 +04:00
fixed setvnnmap to use wire structures too (This used to be ctdb commit 1208e4219d220b80e2f74974cac8ed2b8956d3ef) 2007-05-10 02:22:26 +04:00			`memcpy(ctdb->vnn_map->map, map->map, sizeof(uint32_t)*map->size);`
break set/get vnn map out from ctdb_control and put it in ctdb_recover.c for the time being remove all the [de]marshalling and just pass a structure around instead (This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c) 2007-05-03 05:06:24 +04:00
			`return 0;`
			`}`

fixup getdbmap control so it looks a bit nicer (This used to be ctdb commit 78a4d61cb78da20af5210488e685c91bc3023e90) 2007-05-03 07:07:34 +04:00			`int`
			`ctdb_control_getdbmap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata)`
			`{`
			`uint32_t i, len;`
			`struct ctdb_db_context *ctdb_db;`
			`struct ctdb_dbid_map *dbid_map;`

			`CHECK_CONTROL_DATA_SIZE(0);`

			`len = 0;`
			`for(ctdb_db=ctdb->db_list;ctdb_db;ctdb_db=ctdb_db->next){`
			`len++;`
			`}`


added support for persistent databases in ctdbd (This used to be ctdb commit 3115090a0d882beca9d70761130b74bb0821f201) 2007-09-21 06:24:02 +04:00			`outdata->dsize = offsetof(struct ctdb_dbid_map, dbs) + sizeof(dbid_map->dbs[0])*len;`
fixup getdbmap control so it looks a bit nicer (This used to be ctdb commit 78a4d61cb78da20af5210488e685c91bc3023e90) 2007-05-03 07:07:34 +04:00			`outdata->dptr = (unsigned char *)talloc_zero_size(outdata, outdata->dsize);`
			`if (!outdata->dptr) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ALERT, (__location__ " Failed to allocate dbmap array\n"));`
fixup getdbmap control so it looks a bit nicer (This used to be ctdb commit 78a4d61cb78da20af5210488e685c91bc3023e90) 2007-05-03 07:07:34 +04:00			`exit(1);`
			`}`

			`dbid_map = (struct ctdb_dbid_map *)outdata->dptr;`
			`dbid_map->num = len;`
added support for persistent databases in ctdbd (This used to be ctdb commit 3115090a0d882beca9d70761130b74bb0821f201) 2007-09-21 06:24:02 +04:00			`for (i=0,ctdb_db=ctdb->db_list;ctdb_db;i++,ctdb_db=ctdb_db->next){`
			`dbid_map->dbs[i].dbid = ctdb_db->db_id;`
			`dbid_map->dbs[i].persistent = ctdb_db->persistent;`
fixup getdbmap control so it looks a bit nicer (This used to be ctdb commit 78a4d61cb78da20af5210488e685c91bc3023e90) 2007-05-03 07:07:34 +04:00			`}`

			`return 0;`
			`}`
cleanup getnodemap (This used to be ctdb commit 3867ccf71a167fb82dbc5a3f03f968a325a0c70b) 2007-05-03 07:30:38 +04:00
			`int`
			`ctdb_control_getnodemap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata)`
			`{`
			`uint32_t i, num_nodes;`
			`struct ctdb_node_map *node_map;`

			`CHECK_CONTROL_DATA_SIZE(0);`

first step towards fixing "make test" with the new daemon system (This used to be ctdb commit f95f7e4c93dea482e6cf0614b5415229a7c9f3fb) 2007-06-02 07:16:11 +04:00			`num_nodes = ctdb->num_nodes;`
cleanup getnodemap (This used to be ctdb commit 3867ccf71a167fb82dbc5a3f03f968a325a0c70b) 2007-05-03 07:30:38 +04:00
			`outdata->dsize = offsetof(struct ctdb_node_map, nodes) + num_nodes*sizeof(struct ctdb_node_and_flags);`
			`outdata->dptr = (unsigned char *)talloc_zero_size(outdata, outdata->dsize);`
			`if (!outdata->dptr) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ALERT, (__location__ " Failed to allocate nodemap array\n"));`
cleanup getnodemap (This used to be ctdb commit 3867ccf71a167fb82dbc5a3f03f968a325a0c70b) 2007-05-03 07:30:38 +04:00			`exit(1);`
			`}`

			`node_map = (struct ctdb_node_map *)outdata->dptr;`
			`node_map->num = num_nodes;`
			`for (i=0; i<num_nodes; i++) {`
Fix treatment of link local ipv6 addresses: set the scope id. metze / Michael Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 9d12de1ca6107801dada927729e755c0949d73bf) 2009-01-19 17:33:24 +03:00			`if (parse_ip(ctdb->nodes[i]->address.address,`
			`NULL, /* TODO: pass in the correct interface here*/`
we need to set the port properly in the parse_ip helper (This used to be ctdb commit 43fe18d86995744ba61c7a6405b70edcb265930a) 2009-03-24 05:45:11 +03:00			`0,`
Fix treatment of link local ipv6 addresses: set the scope id. metze / Michael Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 9d12de1ca6107801dada927729e755c0949d73bf) 2009-01-19 17:33:24 +03:00			`&node_map->nodes[i].addr) == 0)`
			`{`
initial ipv6 patch Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> (This used to be ctdb commit 1f131f21386f428bbbbb29098d56c2f64596583b) 2008-08-19 08:58:29 +04:00			`DEBUG(DEBUG_ERR, (__location__ " Failed to parse %s into a sockaddr\n", ctdb->nodes[i]->address.address));`
			`}`

update TAKEIP/RELEASEIP/GETPUBLICIP/GETNODEMAP controls so we retain an older ipv4-only version of these controls. We need this so that we are backwardcompatible with old versions of ctdb and so that we can interoperate with a ipv4-only recmaster during a rolling upgrade. (This used to be ctdb commit 6b76c520f97127099bd9fbaa0fa7af1c61947fb7) 2008-10-14 03:40:29 +04:00			`node_map->nodes[i].pnn = ctdb->nodes[i]->pnn;`
			`node_map->nodes[i].flags = ctdb->nodes[i]->flags;`
			`}`

			`return 0;`
			`}`

			`/*`
			`get an old style ipv4-only nodemap`
			`*/`
			`int`
			`ctdb_control_getnodemapv4(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata)`
			`{`
			`uint32_t i, num_nodes;`
			`struct ctdb_node_mapv4 *node_map;`

			`CHECK_CONTROL_DATA_SIZE(0);`

			`num_nodes = ctdb->num_nodes;`

			`outdata->dsize = offsetof(struct ctdb_node_mapv4, nodes) + num_nodes*sizeof(struct ctdb_node_and_flagsv4);`
			`outdata->dptr = (unsigned char *)talloc_zero_size(outdata, outdata->dsize);`
			`if (!outdata->dptr) {`
			`DEBUG(DEBUG_ALERT, (__location__ " Failed to allocate nodemap array\n"));`
			`exit(1);`
			`}`

			`node_map = (struct ctdb_node_mapv4 *)outdata->dptr;`
			`node_map->num = num_nodes;`
			`for (i=0; i<num_nodes; i++) {`
			`if (parse_ipv4(ctdb->nodes[i]->address.address, 0, &node_map->nodes[i].sin) == 0) {`
			`DEBUG(DEBUG_ERR, (__location__ " Failed to parse %s into a sockaddr\n", ctdb->nodes[i]->address.address));`
			`return -1;`
			`}`

change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8) 2007-09-04 03:50:07 +04:00			`node_map->nodes[i].pnn = ctdb->nodes[i]->pnn;`
cleanup getnodemap (This used to be ctdb commit 3867ccf71a167fb82dbc5a3f03f968a325a0c70b) 2007-05-03 07:30:38 +04:00			`node_map->nodes[i].flags = ctdb->nodes[i]->flags;`
			`}`

to make it easier/less disruptive to add nodes to a running cluster add a new control that causes the node to drop the current nodes list and reread it from the nodes file. During this operation, the node will also drop the tcp layer and restart it. When we drop the tcp layer, by talloc_free()ing the ctcp structure add a destructor to ctcp so that we also can clean up and remove the references in the ctdb structure to the transport layer add two new commands for the ctdb tool. one to list all nodes in the nodesfile and the second a command to trigger a node to drop the transport and reinitialize it with the nde nodes file (This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c) 2008-02-19 06:44:48 +03:00			`return 0;`
			`}`

			`static void`
			`ctdb_reload_nodes_event(struct event_context *ev, struct timed_event *te,`
			`struct timeval t, void *private_data)`
			`{`
redesign how reloadnodes is implemented. modify the transport methods to allow to restart individual connections and set up destructors properly. only tear down/set-up tcp connections to nodes removed from the cluster or nodes added to the cluster. Leave tcp connections to unchanged nodes connected. make "ctdb reloadnodes" explicitely cause a recovery of the cluster once the files have been realoaded (This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b) 2008-12-02 05:26:30 +03:00			`int i, num_nodes;`
When we reload the nodes file instead of shutting down/restarting the entire tcp layer just bounce all outgoing connections and reconnect (This used to be ctdb commit e701a531868149f16561011e65794a4a46ee6596) 2008-10-07 11:12:54 +04:00			`struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);`
redesign how reloadnodes is implemented. modify the transport methods to allow to restart individual connections and set up destructors properly. only tear down/set-up tcp connections to nodes removed from the cluster or nodes added to the cluster. Leave tcp connections to unchanged nodes connected. make "ctdb reloadnodes" explicitely cause a recovery of the cluster once the files have been realoaded (This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b) 2008-12-02 05:26:30 +03:00			`TALLOC_CTX *tmp_ctx;`
			`struct ctdb_node **nodes;`

			`tmp_ctx = talloc_new(ctdb);`

			`/* steal the old nodes file for a while */`
			`talloc_steal(tmp_ctx, ctdb->nodes);`
			`nodes = ctdb->nodes;`
			`ctdb->nodes = NULL;`
			`num_nodes = ctdb->num_nodes;`
			`ctdb->num_nodes = 0;`
to make it easier/less disruptive to add nodes to a running cluster add a new control that causes the node to drop the current nodes list and reread it from the nodes file. During this operation, the node will also drop the tcp layer and restart it. When we drop the tcp layer, by talloc_free()ing the ctcp structure add a destructor to ctcp so that we also can clean up and remove the references in the ctdb structure to the transport layer add two new commands for the ctdb tool. one to list all nodes in the nodesfile and the second a command to trigger a node to drop the transport and reinitialize it with the nde nodes file (This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c) 2008-02-19 06:44:48 +03:00
redesign how reloadnodes is implemented. modify the transport methods to allow to restart individual connections and set up destructors properly. only tear down/set-up tcp connections to nodes removed from the cluster or nodes added to the cluster. Leave tcp connections to unchanged nodes connected. make "ctdb reloadnodes" explicitely cause a recovery of the cluster once the files have been realoaded (This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b) 2008-12-02 05:26:30 +03:00			`/* load the new nodes file */`
to make it easier/less disruptive to add nodes to a running cluster add a new control that causes the node to drop the current nodes list and reread it from the nodes file. During this operation, the node will also drop the tcp layer and restart it. When we drop the tcp layer, by talloc_free()ing the ctcp structure add a destructor to ctcp so that we also can clean up and remove the references in the ctdb structure to the transport layer add two new commands for the ctdb tool. one to list all nodes in the nodesfile and the second a command to trigger a node to drop the transport and reinitialize it with the nde nodes file (This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c) 2008-02-19 06:44:48 +03:00			`ctdb_load_nodes_file(ctdb);`
ctdb->methods becomes NULL when we shutdown the transport. If we shutdown the transport and CTDB later decides to send a command out for queueing, the call to ctdb->methods->allocate_pkt() will SEGV. This could trigger for example when we are in the process of shuttind down CTDBD and have already shutdown the transport but we are still waiting for the "shutdown" eventscripts to finish. If the event scripts now take much much longer to execute for some reason, this race condition becomes much more probable. Decorate all dereferencing of ctdb->methods-> with a check that ctdb->menthods is non-NULL (This used to be ctdb commit c4c2c53918da6fb566d6e9cbd6b02e61ae2921e7) 2008-05-11 08:28:33 +04:00
When we reload the nodes file instead of shutting down/restarting the entire tcp layer just bounce all outgoing connections and reconnect (This used to be ctdb commit e701a531868149f16561011e65794a4a46ee6596) 2008-10-07 11:12:54 +04:00			`for (i=0; i<ctdb->num_nodes; i++) {`
redesign how reloadnodes is implemented. modify the transport methods to allow to restart individual connections and set up destructors properly. only tear down/set-up tcp connections to nodes removed from the cluster or nodes added to the cluster. Leave tcp connections to unchanged nodes connected. make "ctdb reloadnodes" explicitely cause a recovery of the cluster once the files have been realoaded (This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b) 2008-12-02 05:26:30 +03:00			`/* keep any identical pre-existing nodes and connections */`
			`if ((i < num_nodes) && ctdb_same_address(&ctdb->nodes[i]->address, &nodes[i]->address)) {`
			`talloc_free(ctdb->nodes[i]);`
			`ctdb->nodes[i] = talloc_steal(ctdb->nodes, nodes[i]);`
			`continue;`
			`}`

add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00			`if (ctdb->nodes[i]->flags & NODE_FLAGS_DELETED) {`
			`continue;`
			`}`

redesign how reloadnodes is implemented. modify the transport methods to allow to restart individual connections and set up destructors properly. only tear down/set-up tcp connections to nodes removed from the cluster or nodes added to the cluster. Leave tcp connections to unchanged nodes connected. make "ctdb reloadnodes" explicitely cause a recovery of the cluster once the files have been realoaded (This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b) 2008-12-02 05:26:30 +03:00			`/* any new or different nodes must be added */`
When we reload the nodes file instead of shutting down/restarting the entire tcp layer just bounce all outgoing connections and reconnect (This used to be ctdb commit e701a531868149f16561011e65794a4a46ee6596) 2008-10-07 11:12:54 +04:00			`if (ctdb->methods->add_node(ctdb->nodes[i]) != 0) {`
			`DEBUG(DEBUG_CRIT, (__location__ " methods->add_node failed at %d\n", i));`
			`ctdb_fatal(ctdb, "failed to add node. shutting down\n");`
			`}`
redesign how reloadnodes is implemented. modify the transport methods to allow to restart individual connections and set up destructors properly. only tear down/set-up tcp connections to nodes removed from the cluster or nodes added to the cluster. Leave tcp connections to unchanged nodes connected. make "ctdb reloadnodes" explicitely cause a recovery of the cluster once the files have been realoaded (This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b) 2008-12-02 05:26:30 +03:00			`if (ctdb->methods->connect_node(ctdb->nodes[i]) != 0) {`
			`DEBUG(DEBUG_CRIT, (__location__ " methods->connect_node failed at %d\n", i));`
			`ctdb_fatal(ctdb, "failed to connect to node. shutting down\n");`
			`}`
ctdb->methods becomes NULL when we shutdown the transport. If we shutdown the transport and CTDB later decides to send a command out for queueing, the call to ctdb->methods->allocate_pkt() will SEGV. This could trigger for example when we are in the process of shuttind down CTDBD and have already shutdown the transport but we are still waiting for the "shutdown" eventscripts to finish. If the event scripts now take much much longer to execute for some reason, this race condition becomes much more probable. Decorate all dereferencing of ctdb->methods-> with a check that ctdb->menthods is non-NULL (This used to be ctdb commit c4c2c53918da6fb566d6e9cbd6b02e61ae2921e7) 2008-05-11 08:28:33 +04:00			`}`
to make it easier/less disruptive to add nodes to a running cluster add a new control that causes the node to drop the current nodes list and reread it from the nodes file. During this operation, the node will also drop the tcp layer and restart it. When we drop the tcp layer, by talloc_free()ing the ctcp structure add a destructor to ctcp so that we also can clean up and remove the references in the ctdb structure to the transport layer add two new commands for the ctdb tool. one to list all nodes in the nodesfile and the second a command to trigger a node to drop the transport and reinitialize it with the nde nodes file (This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c) 2008-02-19 06:44:48 +03:00
add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343) 2009-06-01 08:18:34 +04:00			`/* tell the recovery daemon to reload the nodes file too */`
			`ctdb_daemon_send_message(ctdb, ctdb->pnn, CTDB_SRVID_RELOAD_NODES, tdb_null);`

redesign how reloadnodes is implemented. modify the transport methods to allow to restart individual connections and set up destructors properly. only tear down/set-up tcp connections to nodes removed from the cluster or nodes added to the cluster. Leave tcp connections to unchanged nodes connected. make "ctdb reloadnodes" explicitely cause a recovery of the cluster once the files have been realoaded (This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b) 2008-12-02 05:26:30 +03:00			`talloc_free(tmp_ctx);`
to make it easier/less disruptive to add nodes to a running cluster add a new control that causes the node to drop the current nodes list and reread it from the nodes file. During this operation, the node will also drop the tcp layer and restart it. When we drop the tcp layer, by talloc_free()ing the ctcp structure add a destructor to ctcp so that we also can clean up and remove the references in the ctdb structure to the transport layer add two new commands for the ctdb tool. one to list all nodes in the nodesfile and the second a command to trigger a node to drop the transport and reinitialize it with the nde nodes file (This used to be ctdb commit 4bc20ac73e9fa94ffd43cccb6eeb438eeff9963c) 2008-02-19 06:44:48 +03:00			`return;`
			`}`
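The DELETED-node annotation above says that commenting a line out of the nodes file keeps the remaining nodes on their old pnn values. The following is a minimal sketch of that numbering rule only, not ctdb's actual parser: `demo_node`, `demo_parse_nodes` and `demo_live_pnn_mask` are hypothetical names, and skipping blank lines is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical demo type; not a ctdb structure. */
struct demo_node {
	unsigned pnn;      /* position of the line in the nodes file */
	bool deleted;      /* true when the line starts with '#' */
	char addr[64];
};

/* Parse newline-separated nodes-file text. Every non-blank line takes
 * one pnn slot, whether or not it is commented out. */
static size_t demo_parse_nodes(const char *text, struct demo_node *out, size_t max)
{
	size_t n = 0;

	while (*text != '\0' && n < max) {
		size_t linelen = strcspn(text, "\n");
		const char *line = text;
		size_t len = linelen;

		text += linelen;
		if (*text == '\n') {
			text++;
		}
		if (len == 0) {
			continue; /* assumption: blank lines take no slot */
		}
		out[n].deleted = (line[0] == '#');
		if (out[n].deleted) {
			line++;
			len--;
		}
		if (len >= sizeof(out[n].addr)) {
			len = sizeof(out[n].addr) - 1;
		}
		memcpy(out[n].addr, line, len);
		out[n].addr[len] = '\0';
		out[n].pnn = (unsigned)n;
		n++;
	}
	return n;
}

/* Bitmask of pnns still in use (bit i set means pnn i is live). */
static unsigned demo_live_pnn_mask(const char *text)
{
	struct demo_node nodes[16];
	size_t n = demo_parse_nodes(text, nodes, 16);
	unsigned mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!nodes[i].deleted) {
			mask |= 1u << nodes[i].pnn;
		}
	}
	return mask;
}
```

With the three-line file from the annotation (second line commented out), the live pnns are 0 and 2. Deleting the line outright would instead renumber the last node to pnn 1.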

			`/*`
			`reload the nodes file after a short delay (so that we can send the response`
			`back first)`
			`*/`
			`int`
			`ctdb_control_reload_nodes_file(struct ctdb_context *ctdb, uint32_t opcode)`
			`{`
			`event_add_timed(ctdb->ev, ctdb, timeval_current_ofs(1,0), ctdb_reload_nodes_event, ctdb);`

cleanup getnodemap (This used to be ctdb commit 3867ccf71a167fb82dbc5a3f03f968a325a0c70b) 2007-05-03 07:30:38 +04:00			`return 0;`
			`}`
cleanup the control "write record" (This used to be ctdb commit 4dd5c26a21a5dc2b2f76eb23cfeb4df82ba4e956) 2007-05-03 10:18:03 +04:00
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`/*`
			`a traverse function for pulling all relevant records from pulldb`
			`*/`
			`struct pulldb_data {`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`struct ctdb_context *ctdb;`
renamed the pulldb structure to a ctdb_marshall_buffer (This used to be ctdb commit bad53b2d342bb9760497e6f4a61e64ca50d6e771) 2008-07-30 13:59:18 +04:00			`struct ctdb_marshall_buffer *pulldata;`
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`uint32_t len;`
			`bool failed;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`};`

more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`static int traverse_pulldb(struct tdb_context *tdb, TDB_DATA key, TDB_DATA data, void *p)`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`{`
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`struct pulldb_data *params = (struct pulldb_data *)p;`
			`struct ctdb_rec_data *rec;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`/* add the record to the blob */`
			`rec = ctdb_marshall_record(params->pulldata, 0, key, NULL, data);`
			`if (rec == NULL) {`
			`params->failed = true;`
			`return -1;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`}`
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`params->pulldata = talloc_realloc_size(NULL, params->pulldata, rec->length + params->len);`
			`if (params->pulldata == NULL) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to expand pulldb_data to %u (%u records)\n",`
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`rec->length + params->len, params->pulldata->count));`
			`params->failed = true;`
			`return -1;`
			`}`
			`params->pulldata->count++;`
			`memcpy(params->len+(uint8_t *)params->pulldata, rec, rec->length);`
			`params->len += rec->length;`
			`talloc_free(rec);`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00
			`return 0;`
			`}`

			`/*`
			`pull a bunch of records from a ltdb, filtering by lmaster`
			`*/`
			`int32_t ctdb_control_pull_db(struct ctdb_context *ctdb, TDB_DATA indata, TDB_DATA *outdata)`
cleanup the control "write record" (This used to be ctdb commit 4dd5c26a21a5dc2b2f76eb23cfeb4df82ba4e956) 2007-05-03 10:18:03 +04:00			`{`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`struct ctdb_control_pulldb *pull;`
cleanup the control "write record" (This used to be ctdb commit 4dd5c26a21a5dc2b2f76eb23cfeb4df82ba4e956) 2007-05-03 10:18:03 +04:00			`struct ctdb_db_context *ctdb_db;`
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`struct pulldb_data params;`
rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`struct ctdb_marshall_buffer *reply;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00
			`pull = (struct ctdb_control_pulldb *)indata.dptr;`

			`ctdb_db = find_ctdb_db(ctdb, pull->db_id);`
			`if (!ctdb_db) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Unknown db 0x%08x\n", pull->db_id));`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`return -1;`
			`}`

initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`if (ctdb->freeze_mode[ctdb_db->priority] != CTDB_FREEZE_FROZEN) {`
			`DEBUG(DEBUG_DEBUG,("rejecting ctdb_control_pull_db when not frozen\n"));`
			`return -1;`
			`}`

rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`reply = talloc_zero(outdata, struct ctdb_marshall_buffer);`
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`CTDB_NO_MEMORY(ctdb, reply);`

			`reply->db_id = pull->db_id;`
prioritise the dmaster in case of matching rsn (This used to be ctdb commit 4996a12174aa0d215a5b14cb970bdf83eed34a39) 2007-05-12 13:57:12 +04:00
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`params.ctdb = ctdb;`
			`params.pulldata = reply;`
rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`params.len = offsetof(struct ctdb_marshall_buffer, data);`
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`params.failed = false;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`if (ctdb_lock_all_databases_mark(ctdb, ctdb_db->priority) != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to get lock on entired db - failing\n"));`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`return -1;`
			`}`

more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`if (tdb_traverse_read(ctdb_db->ltdb->tdb, traverse_pulldb, &params) == -1) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to get traverse db '%s'\n", ctdb_db->db_name));`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`ctdb_lock_all_databases_unmark(ctdb, ctdb_db->priority);`
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`talloc_free(params.pulldata);`
			`return -1;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`}`

initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`ctdb_lock_all_databases_unmark(ctdb, ctdb_db->priority);`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00
more efficient traversal in pulldb control (This used to be ctdb commit fe614b10868e63b70e081b5bbfb74bf16fdf5716) 2008-01-07 06:07:01 +03:00			`outdata->dptr = (uint8_t *)params.pulldata;`
			`outdata->dsize = params.len;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00
			`return 0;`
			`}`

			`/*`
			`push a bunch of records into a ltdb, filtering by rsn`
			`*/`
			`int32_t ctdb_control_push_db(struct ctdb_context *ctdb, TDB_DATA indata)`
			`{`
rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`struct ctdb_marshall_buffer *reply = (struct ctdb_marshall_buffer *)indata.dptr;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`struct ctdb_db_context *ctdb_db;`
			`int i, ret;`
			`struct ctdb_rec_data *rec;`

rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`if (indata.dsize < offsetof(struct ctdb_marshall_buffer, data)) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " invalid data in pulldb reply\n"));`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`return -1;`
			`}`

			`ctdb_db = find_ctdb_db(ctdb, reply->db_id);`
cleanup the control "write record" (This used to be ctdb commit 4dd5c26a21a5dc2b2f76eb23cfeb4df82ba4e956) 2007-05-03 10:18:03 +04:00			`if (!ctdb_db) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Unknown db 0x%08x\n", reply->db_id));`
cleanup the control "write record" (This used to be ctdb commit 4dd5c26a21a5dc2b2f76eb23cfeb4df82ba4e956) 2007-05-03 10:18:03 +04:00			`return -1;`
			`}`

initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`if (ctdb->freeze_mode[ctdb_db->priority] != CTDB_FREEZE_FROZEN) {`
			`DEBUG(DEBUG_DEBUG,("rejecting ctdb_control_push_db when not frozen\n"));`
			`return -1;`
			`}`

			`if (ctdb_lock_all_databases_mark(ctdb, ctdb_db->priority) != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to get lock on entired db - failing\n"));`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`return -1;`
			`}`
cleanup the control "write record" (This used to be ctdb commit 4dd5c26a21a5dc2b2f76eb23cfeb4df82ba4e956) 2007-05-03 10:18:03 +04:00
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`rec = (struct ctdb_rec_data *)&reply->data[0];`

added debug constants to allow for better mapping to syslog levels (This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502) 2008-02-04 09:44:24 +03:00			`DEBUG(DEBUG_INFO,("starting push of %u records for dbid 0x%x\n",`
- merged ctdb_store test from ronnie - added DatabaseHashSize tunable - added logging of events inside recovery (for timing) (This used to be ctdb commit 3593cdb928b91e217faf1b3c537fa28dc82cdace) 2007-06-17 17:31:44 +04:00			`reply->count, reply->db_id));`

- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`for (i=0;i<reply->count;i++) {`
			`TDB_DATA key, data;`
new simpler and much faster recovery code based on tdb transactions (This used to be ctdb commit 9ef2268a1674b01f60c58fed72af8ac982fe77a3) 2008-01-06 04:38:01 +03:00			`struct ctdb_ltdb_header *hdr;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00
			`key.dptr = &rec->data[0];`
			`key.dsize = rec->keylen;`
			`data.dptr = &rec->data[key.dsize];`
			`data.dsize = rec->datalen;`

			`if (data.dsize < sizeof(struct ctdb_ltdb_header)) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_CRIT,(__location__ " bad ltdb record\n"));`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`goto failed;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`}`
			`hdr = (struct ctdb_ltdb_header *)data.dptr;`
			`data.dptr += sizeof(*hdr);`
			`data.dsize -= sizeof(*hdr);`

new simpler and much faster recovery code based on tdb transactions (This used to be ctdb commit 9ef2268a1674b01f60c58fed72af8ac982fe77a3) 2008-01-06 04:38:01 +03:00			`ret = ctdb_ltdb_store(ctdb_db, key, hdr, data);`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_CRIT, (__location__ " Unable to store record\n"));`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`goto failed;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`}`
this fixes the non-dmaster bug that has plagued us for months (This used to be ctdb commit 2acf6c6201862debfca054a09262f75c066d2deb) 2008-01-05 01:34:47 +03:00
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`rec = (struct ctdb_rec_data *)(rec->length + (uint8_t *)rec);`
			`}`

added debug constants to allow for better mapping to syslog levels (This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502) 2008-02-04 09:44:24 +03:00			`DEBUG(DEBUG_DEBUG,("finished push of %u records for dbid 0x%x\n",`
- merged ctdb_store test from ronnie - added DatabaseHashSize tunable - added logging of events inside recovery (for timing) (This used to be ctdb commit 3593cdb928b91e217faf1b3c537fa28dc82cdace) 2007-06-17 17:31:44 +04:00			`reply->count, reply->db_id));`

initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`ctdb_lock_all_databases_unmark(ctdb, ctdb_db->priority);`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`return 0;`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00
			`failed:`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`ctdb_lock_all_databases_unmark(ctdb, ctdb_db->priority);`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`return -1;`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`}`
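The pull and push controls above share one buffer layout: a small header followed by variable-length records packed back to back, each carrying its own total length so the reader can step to the next one. The following is a minimal sketch of that append-and-walk pattern using plain `realloc` instead of talloc. The `demo_*` names are hypothetical stand-ins, not the ctdb structures, and rounding each record up to 4 bytes is an assumption of the sketch to keep the headers aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the record and buffer structures. */
struct demo_rec {
	uint32_t length;   /* total size of this record, header included */
	uint32_t keylen;
	uint32_t datalen;
	uint8_t  data[];   /* key bytes followed by data bytes */
};

struct demo_buf {
	uint32_t db_id;
	uint32_t count;    /* number of records that follow */
	uint8_t  data[];   /* demo_rec entries packed back to back */
};

/* Append one key/value pair; *len is the number of bytes used so far.
 * Returns the (possibly moved) buffer, or NULL on allocation failure,
 * in which case the caller still owns the old buffer. */
static struct demo_buf *demo_append(struct demo_buf *buf, size_t *len,
				    const void *key, uint32_t keylen,
				    const void *val, uint32_t datalen)
{
	size_t reclen = offsetof(struct demo_rec, data) + keylen + datalen;
	struct demo_buf *nbuf;
	struct demo_rec *rec;

	reclen = (reclen + 3) & ~(size_t)3; /* sketch-only alignment padding */
	nbuf = realloc(buf, *len + reclen);
	if (nbuf == NULL) {
		return NULL;
	}
	rec = (struct demo_rec *)((uint8_t *)nbuf + *len);
	memset(rec, 0, reclen);
	rec->length  = (uint32_t)reclen;
	rec->keylen  = keylen;
	rec->datalen = datalen;
	memcpy(rec->data, key, keylen);
	memcpy(rec->data + keylen, val, datalen);
	nbuf->count++;
	*len += reclen;
	return nbuf;
}

/* Walk the buffer as the push side does: advance by rec->length. */
static size_t demo_total_payload(const struct demo_buf *buf)
{
	const struct demo_rec *rec = (const struct demo_rec *)buf->data;
	size_t total = 0;
	uint32_t i;

	for (i = 0; i < buf->count; i++) {
		total += rec->keylen + rec->datalen;
		rec = (const struct demo_rec *)((const uint8_t *)rec + rec->length);
	}
	return total;
}

/* Build a two-record buffer and return the payload bytes the walk finds. */
static size_t demo_roundtrip(void)
{
	size_t len = offsetof(struct demo_buf, data);
	struct demo_buf *buf = calloc(1, len);
	struct demo_buf *tmp;
	size_t total;

	if (buf == NULL) {
		return 0;
	}
	tmp = demo_append(buf, &len, "k1", 2, "value", 5);
	if (tmp == NULL) {
		free(buf);
		return 0;
	}
	buf = tmp;
	tmp = demo_append(buf, &len, "key2", 4, "v", 1);
	if (tmp == NULL) {
		free(buf);
		return 0;
	}
	buf = tmp;
	total = demo_total_payload(buf);
	free(buf);
	return total;
}
```

Keeping the old pointer until the reallocation succeeds lets the error path still report on, or free, the existing buffer.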


			`static int traverse_setdmaster(struct tdb_context *tdb, TDB_DATA key, TDB_DATA data, void *p)`
			`{`
			`uint32_t *dmaster = (uint32_t *)p;`
			`struct ctdb_ltdb_header *header = (struct ctdb_ltdb_header *)data.dptr;`
			`int ret;`

more optimisations to recovery (This used to be ctdb commit 9a41ad0a842cd4f3792d6e84b5c809b7ff6f342e) 2008-01-02 14:44:46 +03:00			`/* skip if already correct */`
			`if (header->dmaster == *dmaster) {`
			`return 0;`
			`}`

- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`header->dmaster = *dmaster;`

			`ret = tdb_store(tdb, key, data, TDB_REPLACE);`
			`if (ret) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_CRIT,(__location__ " failed to write tdb data back ret:%d\n",ret));`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`return ret;`
			`}`
this fixes the non-dmaster bug that has plagued us for months (This used to be ctdb commit 2acf6c6201862debfca054a09262f75c066d2deb) 2008-01-05 01:34:47 +03:00
			`/* TODO: add error checking here */`

- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`return 0;`
			`}`

			`int32_t ctdb_control_set_dmaster(struct ctdb_context *ctdb, TDB_DATA indata)`
			`{`
			`struct ctdb_control_set_dmaster *p = (struct ctdb_control_set_dmaster *)indata.dptr;`
			`struct ctdb_db_context *ctdb_db;`
cleanup the control "write record" (This used to be ctdb commit 4dd5c26a21a5dc2b2f76eb23cfeb4df82ba4e956) 2007-05-03 10:18:03 +04:00
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00			`ctdb_db = find_ctdb_db(ctdb, p->db_id);`
			`if (!ctdb_db) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Unknown db 0x%08x\n", p->db_id));`
cleanup the control "write record" (This used to be ctdb commit 4dd5c26a21a5dc2b2f76eb23cfeb4df82ba4e956) 2007-05-03 10:18:03 +04:00			`return -1;`
			`}`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`if (ctdb->freeze_mode[ctdb_db->priority] != CTDB_FREEZE_FROZEN) {`
			`DEBUG(DEBUG_DEBUG,("rejecting ctdb_control_set_dmaster when not frozen\n"));`
			`return -1;`
			`}`

			`if (ctdb_lock_all_databases_mark(ctdb, ctdb_db->priority) != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to get lock on entired db - failing\n"));`
cleanup the control "write record" (This used to be ctdb commit 4dd5c26a21a5dc2b2f76eb23cfeb4df82ba4e956) 2007-05-03 10:18:03 +04:00			`return -1;`
more robust freeze/thaw logic (This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76) 2007-05-12 09:29:06 +04:00			`}`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00
			`tdb_traverse(ctdb_db->ltdb->tdb, traverse_setdmaster, &p->dmaster);`

initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`ctdb_lock_all_databases_unmark(ctdb, ctdb_db->priority);`
- got rid of the complex hand marshalling in the recovery controls - fixed the re-send of ctdb calls after a generation change - fixed a reqid idr leak in controls - removed the write_record test code - use the new nonblock lockall code to prevent ctdbd from ever doing a blocking lock that could deadlock with smbd - moved more of the recovery controls into ctdb_recover.c (This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec) 2007-05-10 11:43:45 +04:00
			`return 0;`
			`}`

- make calling of recovered event script async - shutdown sockets before calling shutdown script (This used to be ctdb commit c5e099feef94a014a77742b6cc1d0afe78ef9da9) 2007-06-02 02:41:19 +04:00			`struct ctdb_set_recmode_state {`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`struct ctdb_context *ctdb;`
- make calling of recovered event script async - shutdown sockets before calling shutdown script (This used to be ctdb commit c5e099feef94a014a77742b6cc1d0afe78ef9da9) 2007-06-02 02:41:19 +04:00			`struct ctdb_req_control *c;`
			`uint32_t recmode;`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`int fd[2];`
			`struct timed_event *te;`
			`struct fd_event *fde;`
			`pid_t child;`
Track how long it takes to take out the recovery lock from both the main dameon and also from the recovery daemon. Log this in "ctdb statistics". Also add a varaible "RecLockLatencyMs" that will log an error everytime it takes longer than this to access the reclock file. (This used to be ctdb commit 042377ed803bb8f7ca9d6ea1a387427b7b8ba45a) 2009-05-14 04:33:25 +04:00			`struct timeval start_time;`
- make calling of recovered event script async - shutdown sockets before calling shutdown script (This used to be ctdb commit c5e099feef94a014a77742b6cc1d0afe78ef9da9) 2007-06-02 02:41:19 +04:00			`};`

add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`/*`
			`called if our set_recmode child times out. this would happen if`
			`ctdb_recovery_lock() would block.`
			`*/`
			`static void ctdb_set_recmode_timeout(struct event_context *ev, struct timed_event *te,`
			`struct timeval t, void *private_data)`
			`{`
			`struct ctdb_set_recmode_state *state = talloc_get_type(private_data,`
			`struct ctdb_set_recmode_state);`

fixed problem with looping ctdb recoveries After a node failure, GPFS can get into a state where non-blocking fcntl() locks can take a long time. This means to the ctdb set_recmode test timing out, which leads to a recovery failure, and a new recovery. The recovery loop can last a long time. The fix is to consider a fcntl timeout as a success of this test. The test is to see that we can't lock the shared reclock file, so a timeout is fine for a success. (This used to be ctdb commit 6579a6a2a7161214adedf0f67dce62f4a4ad1afe) 2008-11-21 00:05:59 +03:00			`/* we consider this a success, not a failure, as we failed to`
			`set the recovery lock which is what we wanted. This can be`
			`caused by the cluster filesystem being very slow to`
			`arbitrate locks immediately after a node failure.`
			`*/`
Add a new variable VerifyRecoveryLock which can be used to disable the test that the recovery daemon holds the lock properly when performing a recovery (This used to be ctdb commit 329df9e47e6ca8ab5143985a999e68f37c6d88a5) 2009-04-30 19:18:27 +04:00			`DEBUG(DEBUG_ERR,(__location__ " set_recmode child process hung/timedout CFS slow to grant locks? (allowing recmode set anyway)\n"));`
fixed problem with looping ctdb recoveries After a node failure, GPFS can get into a state where non-blocking fcntl() locks can take a long time. This means to the ctdb set_recmode test timing out, which leads to a recovery failure, and a new recovery. The recovery loop can last a long time. The fix is to consider a fcntl timeout as a success of this test. The test is to see that we can't lock the shared reclock file, so a timeout is fine for a success. (This used to be ctdb commit 6579a6a2a7161214adedf0f67dce62f4a4ad1afe) 2008-11-21 00:05:59 +03:00			`state->ctdb->recovery_mode = state->recmode;`
			`ctdb_request_control_reply(state->ctdb, state->c, NULL, 0, NULL);`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`talloc_free(state);`
			`}`


			`/* when we free the recmode state we must kill any child process.`
			`*/`
			`static int set_recmode_destructor(struct ctdb_set_recmode_state *state)`
			`{`
Track how long it takes to take out the recovery lock from both the main dameon and also from the recovery daemon. Log this in "ctdb statistics". Also add a varaible "RecLockLatencyMs" that will log an error everytime it takes longer than this to access the reclock file. (This used to be ctdb commit 042377ed803bb8f7ca9d6ea1a387427b7b8ba45a) 2009-05-14 04:33:25 +04:00			`double l = timeval_elapsed(&state->start_time);`

			`ctdb_reclock_latency(state->ctdb, "daemon reclock", &state->ctdb->statistics.reclock.ctdbd, l);`
dont leak file descriptors when set recmdoe timesout (This used to be ctdb commit fc8a364eb095ec11ca01246a583bf1dc53510141) 2009-06-19 08:58:06 +04:00
			`if (state->fd[0] != -1) {`
			`state->fd[0] = -1;`
			`}`
			`if (state->fd[1] != -1) {`
			`state->fd[1] = -1;`
			`}`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`kill(state->child, SIGKILL);`
			`return 0;`
			`}`

			`/* this is called when the child process has completed ctdb_recovery_lock()`
			`and has written data back to us through the pipe.`
			`*/`
			`static void set_recmode_handler(struct event_context *ev, struct fd_event *fde,`
			`uint16_t flags, void *private_data)`
			`{`
			`struct ctdb_set_recmode_state *state= talloc_get_type(private_data,`
			`struct ctdb_set_recmode_state);`
merge from ronnie (This used to be ctdb commit 75d4b386293e186a6bb8532515585ab72670d663) 2007-10-18 09:44:02 +04:00			`char c = 0;`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`int ret;`

			`/* we got a response from our child process so we can abort the`
			`timeout.`
			`*/`
			`talloc_free(state->te);`
			`state->te = NULL;`


			`/* read the child's status when trying to lock the reclock file.`
			`child wrote 0 if everything is fine and 1 if it did manage`
			`to lock the file, which would be a problem since that means`
			`we got a request to exit from recovery but we could still lock`
			`the file which at this time SHOULD be locked by the recovery`
			`daemon on the recmaster`
			`*/`
merge from ronnie (This used to be ctdb commit 75d4b386293e186a6bb8532515585ab72670d663) 2007-10-18 09:44:02 +04:00			`ret = read(state->fd[0], &c, 1);`
			`if (ret != 1 || c != 0) {`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`ctdb_request_control_reply(state->ctdb, state->c, NULL, -1, "managed to lock reclock file from inside daemon");`
			`talloc_free(state);`
			`return;`
			`}`

merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`state->ctdb->recovery_mode = state->recmode;`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`ctdb_request_control_reply(state->ctdb, state->c, NULL, 0, NULL);`
			`talloc_free(state);`
			`return;`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`}`

add a context and a timed event so that once we have been in recovery mode for too long we drop all public ip addresses (This used to be ctdb commit 403c68f96e1380dd07217c688de2730464f77ea0) 2008-10-22 04:04:41 +04:00			`static void`
			`ctdb_drop_all_ips_event(struct event_context *ev, struct timed_event *te,`
			`struct timeval t, void *private_data)`
			`{`
			`struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);`

increase the loglevel for the message we print when we automatically release all ips when we have been in recovery for too long (This used to be ctdb commit 7af060ded5113a49832f6a08a942523a202586b3) 2009-04-24 12:09:51 +04:00			`DEBUG(DEBUG_ERR,(__location__ " Been in recovery mode for too long. Dropping all IPS\n"));`
add a context and a timed event so that once we have been in recovery mode for too long we drop all public ip addresses (This used to be ctdb commit 403c68f96e1380dd07217c688de2730464f77ea0) 2008-10-22 04:04:41 +04:00			`talloc_free(ctdb->release_ips_ctx);`
			`ctdb->release_ips_ctx = NULL;`

			`ctdb_release_all_ips(ctdb);`
			`}`

added lockwait child code for entering recovery mode. A child processes holds lockall locks for the entire recovery process (This used to be ctdb commit f892f30def75b0d964c35eae38c4cf675597dd28) 2007-05-12 08:34:21 +04:00			`/*`
			`set the recovery mode`
			`*/`
- make calling of recovered event script async - shutdown sockets before calling shutdown script (This used to be ctdb commit c5e099feef94a014a77742b6cc1d0afe78ef9da9) 2007-06-02 02:41:19 +04:00			`int32_t ctdb_control_set_recmode(struct ctdb_context *ctdb,`
			`struct ctdb_req_control *c,`
			`TDB_DATA indata, bool *async_reply,`
added error messages in ctdb_control replies (This used to be ctdb commit bd848f5b760e6b2a73ebfc67fd8adb3c31479fb5) 2007-05-12 15:25:26 +04:00			`const char **errormsg)`
added lockwait child code for entering recovery mode. A child processes holds lockall locks for the entire recovery process (This used to be ctdb commit f892f30def75b0d964c35eae38c4cf675597dd28) 2007-05-12 08:34:21 +04:00			`{`
separate out the freeze/thaw handling from recovery (This used to be ctdb commit 0b0640bd8b8334961f240e0cf276ac112cd6e616) 2007-05-12 09:15:27 +04:00			`uint32_t recmode = *(uint32_t *)indata.dptr;`
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`int i, ret;`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`struct ctdb_set_recmode_state *state;`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`pid_t parent = getpid();`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00
add a context and a timed event so that once we have been in recovery mode for too long we drop all public ip addresses (This used to be ctdb commit 403c68f96e1380dd07217c688de2730464f77ea0) 2008-10-22 04:04:41 +04:00			`/* if we enter recovery but stay in recovery for too long`
			`we will eventually drop all our ip addresses`
			`*/`
			`if (recmode == CTDB_RECOVERY_NORMAL) {`
			`talloc_free(ctdb->release_ips_ctx);`
			`ctdb->release_ips_ctx = NULL;`
			`} else {`
			`talloc_free(ctdb->release_ips_ctx);`
			`ctdb->release_ips_ctx = talloc_new(ctdb);`
			`CTDB_NO_MEMORY(ctdb, ctdb->release_ips_ctx);`

add a tuneable RecoveryDropAllIPs so it is possible to control after how long a node that has been stuck in recovery will wait until it will yield all public addresses. this now defaults to 60 seconds This is useful if a split brain occurs due to network partitioning since it will make sure that the "other half" of the cluster that does not contain the recovery master will eventually release all ips and thus avoiding a duplicate ip situation for the public addresses (This used to be ctdb commit 70f21428c9eec96bcc787be191e7478ad68956dc) 2009-04-24 12:23:48 +04:00			`event_add_timed(ctdb->ev, ctdb->release_ips_ctx, timeval_current_ofs(ctdb->tunable.recovery_drop_all_ips, 0), ctdb_drop_all_ips_event, ctdb);`
add a context and a timed event so that once we have been in recovery mode for too long we drop all public ip addresses (This used to be ctdb commit 403c68f96e1380dd07217c688de2730464f77ea0) 2008-10-22 04:04:41 +04:00			`}`

show start/stop time of recovery on all nodes (This used to be ctdb commit 9f7662279c367eb3e8a58e6f4aeca521e6f1f1d0) 2008-01-08 01:30:11 +03:00			`if (recmode != ctdb->recovery_mode) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_NOTICE,(__location__ " Recovery mode set to %s\n",`
show start/stop time of recovery on all nodes (This used to be ctdb commit 9f7662279c367eb3e8a58e6f4aeca521e6f1f1d0) 2008-01-08 01:30:11 +03:00			`recmode==CTDB_RECOVERY_NORMAL?"NORMAL":"ACTIVE"));`
			`}`

- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`if (recmode != CTDB_RECOVERY_NORMAL ||`
			`ctdb->recovery_mode != CTDB_RECOVERY_ACTIVE) {`
			`ctdb->recovery_mode = recmode;`
			`return 0;`
- make calling of recovered event script async - shutdown sockets before calling shutdown script (This used to be ctdb commit c5e099feef94a014a77742b6cc1d0afe78ef9da9) 2007-06-02 02:41:19 +04:00			`}`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00
			`/* some special handling when ending recovery mode */`
test (This used to be ctdb commit 4f2d722cf29175c3c207e6ebb6d4f9e370767249) 2008-06-26 08:14:37 +04:00
initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2) 2009-10-12 05:08:39 +04:00			`/* force the databases to thaw */`
			`for (i=1; i<=NUM_DB_PRIORITIES; i++) {`
			`if (ctdb->freeze_handles[i] != NULL) {`
			`ctdb_control_thaw(ctdb, i);`
			`}`
test (This used to be ctdb commit 4f2d722cf29175c3c207e6ebb6d4f9e370767249) 2008-06-26 08:14:37 +04:00			`}`

- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`state = talloc(ctdb, struct ctdb_set_recmode_state);`
			`CTDB_NO_MEMORY(ctdb, state);`

Track how long it takes to take out the recovery lock from both the main dameon and also from the recovery daemon. Log this in "ctdb statistics". Also add a varaible "RecLockLatencyMs" that will log an error everytime it takes longer than this to access the reclock file. (This used to be ctdb commit 042377ed803bb8f7ca9d6ea1a387427b7b8ba45a) 2009-05-14 04:33:25 +04:00			`state->start_time = timeval_current();`
dont leak file descriptors when set recmdoe timesout (This used to be ctdb commit fc8a364eb095ec11ca01246a583bf1dc53510141) 2009-06-19 08:58:06 +04:00			`state->fd[0] = -1;`
			`state->fd[1] = -1;`
Add a new variable VerifyRecoveryLock which can be used to disable the test that the recovery daemon holds the lock properly when performing a recovery (This used to be ctdb commit 329df9e47e6ca8ab5143985a999e68f37c6d88a5) 2009-04-30 19:18:27 +04:00
			`if (ctdb->tunable.verify_recovery_lock == 0) {`
			`/* don't need to verify the reclock file */`
			`ctdb->recovery_mode = recmode;`
			`return 0;`
			`}`

add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`/* For the rest of what needs to be done, we need to do this in`
			`a child process since`
			`1, the call to ctdb_recovery_lock() can block if the cluster`
			`filesystem is in the process of recovery.`
			`*/`
			`ret = pipe(state->fd);`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`if (ret != 0) {`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`talloc_free(state);`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_CRIT,(__location__ " Failed to open pipe for set_recmode child\n"));`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`return -1;`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`}`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`state->child = fork();`
			`if (state->child == (pid_t)-1) {`
			`close(state->fd[0]);`
			`close(state->fd[1]);`
			`talloc_free(state);`
			`return -1;`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`}`
added timeouts in all event scripts (This used to be ctdb commit d986c91a607ed7c7d4869ea786b5cdf80e7862f1) 2007-06-06 07:45:12 +04:00
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`if (state->child == 0) {`
			`char cc = 0;`
			`close(state->fd[0]);`

Add a new variable VerifyRecoveryLock which can be used to disable the test that the recovery daemon holds the lock properly when performing a recovery (This used to be ctdb commit 329df9e47e6ca8ab5143985a999e68f37c6d88a5) 2009-04-30 19:18:27 +04:00			`/* we should not be able to get the lock on the reclock file,`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`as it should be held by the recovery master`
			`*/`
			`if (ctdb_recovery_lock(ctdb, false)) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_CRIT,("ERROR: recovery lock file %s not locked when recovering!\n", ctdb->recovery_lock_file));`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`cc = 1;`
			`}`

			`write(state->fd[1], &cc, 1);`
			`/* make sure we die when our parent dies */`
			`while (kill(parent, 0) == 0 || errno != ESRCH) {`
			`sleep(5);`
Add a new variable VerifyRecoveryLock which can be used to disable the test that the recovery daemon holds the lock properly when performing a recovery (This used to be ctdb commit 329df9e47e6ca8ab5143985a999e68f37c6d88a5) 2009-04-30 19:18:27 +04:00			`write(state->fd[1], &cc, 1);`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`}`
			`_exit(0);`
			`}`
			`close(state->fd[1]);`
add logging everytime we create a filedescriptor in the main ctdb daemon so we can spot if there are leaks. plug two leaks for filedescriptors related to when sending ARP fail and one leak when we can not parse the local address during tcp connection establish (This used to be ctdb commit ddd089810a14efe4be6e1ff3eccaa604e4913c9e) 2009-10-15 04:24:54 +04:00			`set_close_on_exec(state->fd[0]);`

dont leak file descriptors when set recmdoe timesout (This used to be ctdb commit fc8a364eb095ec11ca01246a583bf1dc53510141) 2009-06-19 08:58:06 +04:00			`state->fd[1] = -1;`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00
			`talloc_set_destructor(state, set_recmode_destructor);`

add logging everytime we create a filedescriptor in the main ctdb daemon so we can spot if there are leaks. plug two leaks for filedescriptors related to when sending ARP fail and one leak when we can not parse the local address during tcp connection establish (This used to be ctdb commit ddd089810a14efe4be6e1ff3eccaa604e4913c9e) 2009-10-15 04:24:54 +04:00			`DEBUG(DEBUG_NOTICE, (__location__ " Created PIPE FD:%d for setrecmode\n", state->fd[0]));`

reduce the timeout we wait for the reclock child process to finish to 5 seconds before we log an error and abort (This used to be ctdb commit 6d1e4321b63973c2e53c63d386e8cc0bd9605cae) 2009-06-19 07:09:11 +04:00			`state->te = event_add_timed(ctdb->ev, state, timeval_current_ofs(5, 0),`
fixed problem with looping ctdb recoveries After a node failure, GPFS can get into a state where non-blocking fcntl() locks can take a long time. This means to the ctdb set_recmode test timing out, which leads to a recovery failure, and a new recovery. The recovery loop can last a long time. The fix is to consider a fcntl timeout as a success of this test. The test is to see that we can't lock the shared reclock file, so a timeout is fine for a success. (This used to be ctdb commit 6579a6a2a7161214adedf0f67dce62f4a4ad1afe) 2008-11-21 00:05:59 +03:00			`ctdb_set_recmode_timeout, state);`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00
			`state->fde = event_add_fd(ctdb->ev, state, state->fd[0],`
			`EVENT_FD_READ|EVENT_FD_AUTOCLOSE,`
			`set_recmode_handler,`
			`(void *)state);`
reduce the timeout we wait for the reclock child process to finish to 5 seconds before we log an error and abort (This used to be ctdb commit 6d1e4321b63973c2e53c63d386e8cc0bd9605cae) 2009-06-19 07:09:11 +04:00
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00			`if (state->fde == NULL) {`
			`talloc_free(state);`
			`return -1;`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`}`
add back the test inside the daemon that if someone asks us to drop recovery mode back to NORMAL that we can not lock the reclock file since at this stage it MUST be locked by the recovery daemon. in order to avoid a non-blocking fnctl() lock from blocking and cause "issues" we move the 'test that we can not lock reclock file' into a child process. (This used to be ctdb commit 3af994641ec2234e37da1fa1f693441586471a7e) 2007-10-16 09:27:07 +04:00
			`state->ctdb = ctdb;`
			`state->recmode = recmode;`
			`state->c = talloc_steal(state, c);`

- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`*async_reply = true;`

separate out the freeze/thaw handling from recovery (This used to be ctdb commit 0b0640bd8b8334961f240e0cf276ac112cd6e616) 2007-05-12 09:15:27 +04:00			`return 0;`
added lockwait child code for entering recovery mode. A child processes holds lockall locks for the entire recovery process (This used to be ctdb commit f892f30def75b0d964c35eae38c4cf675597dd28) 2007-05-12 08:34:21 +04:00			`}`
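The pattern used by `ctdb_control_set_recmode()` above (fork a child, let it attempt the possibly slow lock, report a one-byte verdict through a pipe, and bound the wait with a timeout) can be sketched in isolation roughly as follows. This is a minimal, self-contained illustration and not the ctdb implementation: `probe_lock_in_child()`, `enum probe_result`, the locked byte range, and the use of `poll()` in place of the ctdb event loop are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of ctdb. */
enum probe_result {
	PROBE_ERROR     = -1, /* pipe/fork/open failed or child died silently */
	PROBE_LOCK_HELD = 0,  /* child could NOT take the lock (the expected case) */
	PROBE_LOCK_FREE = 1,  /* child took the lock, so nobody else holds it */
	PROBE_TIMEOUT   = 2   /* child did not answer within the timeout */
};

/* Hypothetical helper: attempt a non-blocking write lock on `path` from a
   child process and wait at most `timeout_ms` for its one-byte verdict. */
static enum probe_result probe_lock_in_child(const char *path, int timeout_ms)
{
	enum probe_result res = PROBE_ERROR;
	struct pollfd pfd;
	pid_t child;
	int fd[2];
	char c = 0;
	int n;

	assert(path != NULL);

	if (pipe(fd) != 0) {
		return PROBE_ERROR;
	}

	child = fork();
	if (child == (pid_t)-1) {
		close(fd[0]);
		close(fd[1]);
		return PROBE_ERROR;
	}

	if (child == 0) {
		/* child: fcntl() locks are per process and are not inherited
		   across fork(), so a lock held by the parent also blocks us */
		struct flock lock;
		char cc = 2; /* 2 = could not even open the file */
		int lfd;

		close(fd[0]);
		lfd = open(path, O_RDWR|O_CREAT, 0600);
		if (lfd != -1) {
			lock.l_type = F_WRLCK;
			lock.l_whence = SEEK_SET;
			lock.l_start = 0;
			lock.l_len = 1;
			lock.l_pid = 0;
			cc = (fcntl(lfd, F_SETLK, &lock) == 0) ? 1 : 0;
		}
		if (write(fd[1], &cc, 1) != 1) {
			_exit(1);
		}
		_exit(0);
	}

	/* parent: wait for the verdict, but never longer than timeout_ms */
	close(fd[1]);
	pfd.fd = fd[0];
	pfd.events = POLLIN;
	pfd.revents = 0;
	do {
		n = poll(&pfd, 1, timeout_ms);
	} while (n == -1 && errno == EINTR);

	if (n == 0) {
		res = PROBE_TIMEOUT;
	} else if (n == 1 && read(fd[0], &c, 1) == 1) {
		if (c == 0) {
			res = PROBE_LOCK_HELD;
		} else if (c == 1) {
			res = PROBE_LOCK_FREE;
		}
	}

	/* always reap the child so a hung lock attempt cannot linger */
	close(fd[0]);
	kill(child, SIGKILL);
	waitpid(child, NULL, 0);
	return res;
}
```

In terms of the code above, `PROBE_LOCK_HELD` corresponds to the child writing 0, and `PROBE_LOCK_FREE` to the "managed to lock reclock file from inside daemon" error. A caller following the policy in `ctdb_set_recmode_timeout()` would treat `PROBE_TIMEOUT` as success too, since a lock attempt that hangs has also failed to obtain the lock.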
merge from tridge (This used to be ctdb commit 7bca79ad6357149fd7c6b28ce4b05de3d223a7de) 2007-05-14 00:25:15 +04:00
- moved cmdline options that are only relevant to ctdbd into ctdbd.c - fixed a valgrind error on failing to send a control - don't mark node dead when already disconnected - moved node list lock code into common code (This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b) 2007-06-02 04:03:28 +04:00
			`/*`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`try and get the recovery lock in shared storage - should only work`
			`on the recovery master recovery daemon. Anywhere else is a bug`
- moved cmdline options that are only relevant to ctdbd into ctdbd.c - fixed a valgrind error on failing to send a control - don't mark node dead when already disconnected - moved node list lock code into common code (This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b) 2007-06-02 04:03:28 +04:00			`*/`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`bool ctdb_recovery_lock(struct ctdb_context *ctdb, bool keep)`
- moved cmdline options that are only relevant to ctdbd into ctdbd.c - fixed a valgrind error on failing to send a control - don't mark node dead when already disconnected - moved node list lock code into common code (This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b) 2007-06-02 04:03:28 +04:00			`{`
			`struct flock lock;`

add extra debug statements to the log to make it easier to see when a recovery dameon has hung due to the underlying filesystem hanging. (This used to be ctdb commit 5b0067a4e335cbbf6e606646e612d4bfcfdb7441) 2009-05-12 12:39:34 +04:00			`if (keep) {`
			`DEBUG(DEBUG_ERR, ("Take the recovery lock\n"));`
			`}`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`if (ctdb->recovery_lock_fd != -1) {`
			`close(ctdb->recovery_lock_fd);`
Dont access the reclock file at all if VerifyRecoveryLock is zero and also make sure the reclock file is closed if the variable is cleared at runtime (This used to be ctdb commit a25f4888689a0725971606163d87c39a41669292) 2009-06-25 05:41:18 +04:00			`ctdb->recovery_lock_fd = -1;`
- moved cmdline options that are only relevant to ctdbd into ctdbd.c - fixed a valgrind error on failing to send a control - don't mark node dead when already disconnected - moved node list lock code into common code (This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b) 2007-06-02 04:03:28 +04:00			`}`
Dont access the reclock file at all if VerifyRecoveryLock is zero and also make sure the reclock file is closed if the variable is cleared at runtime (This used to be ctdb commit a25f4888689a0725971606163d87c39a41669292) 2009-06-25 05:41:18 +04:00
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`ctdb->recovery_lock_fd = open(ctdb->recovery_lock_file, O_RDWR|O_CREAT, 0600);`
			`if (ctdb->recovery_lock_fd == -1) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,("ctdb_recovery_lock: Unable to open %s - (%s)\n",`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`ctdb->recovery_lock_file, strerror(errno)));`
- moved cmdline options that are only relevant to ctdbd into ctdbd.c - fixed a valgrind error on failing to send a control - don't mark node dead when already disconnected - moved node list lock code into common code (This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b) 2007-06-02 04:03:28 +04:00			`return false;`
			`}`

make sure we set close on exec on any possibly inherited fds (This used to be ctdb commit d9dec82076f14a348e7b67b4350180681ff86f32) 2007-09-19 05:46:37 +04:00			`set_close_on_exec(ctdb->recovery_lock_fd);`

- moved cmdline options that are only relevant to ctdbd into ctdbd.c - fixed a valgrind error on failing to send a control - don't mark node dead when already disconnected - moved node list lock code into common code (This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b) 2007-06-02 04:03:28 +04:00			`lock.l_type = F_WRLCK;`
			`lock.l_whence = SEEK_SET;`
			`lock.l_start = 0;`
			`lock.l_len = 1;`
			`lock.l_pid = 0;`

- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`if (fcntl(ctdb->recovery_lock_fd, F_SETLK, &lock) != 0) {`
fixed a fd leak on the recovery lock (This used to be ctdb commit 186f35c42ed4fcc9ed44390b0dd036ece475d45e) 2007-09-24 04:19:07 +04:00			`close(ctdb->recovery_lock_fd);`
			`ctdb->recovery_lock_fd = -1;`
added some debug lines to help track down a problem (This used to be ctdb commit 2ca31e9de179f76e392a26cc8305e2473357c760) 2007-10-18 10:27:36 +04:00			`if (keep) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_CRIT,("ctdb_recovery_lock: Failed to get recovery lock on '%s'\n", ctdb->recovery_lock_file));`
added some debug lines to help track down a problem (This used to be ctdb commit 2ca31e9de179f76e392a26cc8305e2473357c760) 2007-10-18 10:27:36 +04:00			`}`
- moved cmdline options that are only relevant to ctdbd into ctdbd.c - fixed a valgrind error on failing to send a control - don't mark node dead when already disconnected - moved node list lock code into common code (This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b) 2007-06-02 04:03:28 +04:00			`return false;`
			`}`

			`if (!keep) {`
- make specification of a recovery lock file compulsory - die if someone other than the recmaster can get the recovery lock (This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869) 2007-06-02 05:36:42 +04:00			`close(ctdb->recovery_lock_fd);`
			`ctdb->recovery_lock_fd = -1;`
- moved cmdline options that are only relevant to ctdbd into ctdbd.c - fixed a valgrind error on failing to send a control - don't mark node dead when already disconnected - moved node list lock code into common code (This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b) 2007-06-02 04:03:28 +04:00			`}`

add extra debug statements to the log to make it easier to see when a recovery dameon has hung due to the underlying filesystem hanging. (This used to be ctdb commit 5b0067a4e335cbbf6e606646e612d4bfcfdb7441) 2009-05-12 12:39:34 +04:00			`if (keep) {`
			`DEBUG(DEBUG_ERR, ("Recovery lock taken successfully\n"));`
			`}`

merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_NOTICE,("ctdb_recovery_lock: Got recovery lock on '%s'\n", ctdb->recovery_lock_file));`
added some debug lines to help track down a problem (This used to be ctdb commit 2ca31e9de179f76e392a26cc8305e2473357c760) 2007-10-18 10:27:36 +04:00
- moved cmdline options that are only relevant to ctdbd into ctdbd.c - fixed a valgrind error on failing to send a control - don't mark node dead when already disconnected - moved node list lock code into common code (This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b) 2007-06-02 04:03:28 +04:00			`return true;`
			`}`
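The locking technique used by `ctdb_recovery_lock` above can be shown in isolation. The following is a minimal sketch, not ctdb code: `try_byte_lock` is a hypothetical helper name, and it assumes a POSIX system where `fcntl` record locks are available on the filesystem holding the file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of ctdb): try to take an exclusive,
   non-blocking lock on byte 0 of `path`. On success the open file
   descriptor is stored in *fd_out; it must stay open to keep the lock. */
static bool try_byte_lock(const char *path, int *fd_out)
{
	struct flock lock;
	int fd;

	assert(path != NULL && fd_out != NULL);

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd == -1) {
		return false;
	}

	lock.l_type = F_WRLCK;    /* exclusive (write) lock */
	lock.l_whence = SEEK_SET; /* l_start is relative to the start of the file */
	lock.l_start = 0;
	lock.l_len = 1;           /* lock a single byte */
	lock.l_pid = 0;

	/* F_SETLK returns immediately with an error instead of waiting
	   when another process holds a conflicting lock */
	if (fcntl(fd, F_SETLK, &lock) != 0) {
		close(fd);
		return false;
	}

	*fd_out = fd;
	return true;
}
```

POSIX record locks belong to the process, so a second call from the same process also succeeds, and closing any descriptor the process has open on the file drops the lock. This is why the function above closes the old descriptor before reopening the file.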
new simpler and much faster recovery code based on tdb transactions (This used to be ctdb commit 9ef2268a1674b01f60c58fed72af8ac982fe77a3) 2008-01-06 04:38:01 +03:00
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`/*`
			`delete a record as part of the vacuum process`
			`only delete if we are not lmaster or dmaster, and our rsn is <= the provided rsn`
			`use non-blocking locks`
Redo the vacukming process to mkake it scalable. Vacumming used to delete one record at a time on all nodes, that was m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldnt scale at all. The new vacuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted. (This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53) 2008-03-12 23:53:29 +03:00
			`return 0 if the record was successfully deleted (i.e. it does not exist`
			`when the function returns)`
			`or !0 if the record still exists in the tdb after returning.`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`*/`
Redo the vacukming process to mkake it scalable. Vacumming used to delete one record at a time on all nodes, that was m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldnt scale at all. The new vacuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted. (This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53) 2008-03-12 23:53:29 +03:00			`static int delete_tdb_record(struct ctdb_context *ctdb, struct ctdb_db_context *ctdb_db, struct ctdb_rec_data *rec)`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`{`
			`TDB_DATA key, data;`
			`struct ctdb_ltdb_header *hdr, *hdr2;`
ensure the main daemon doesn't use a blocking lock on the freelist (This used to be ctdb commit 73f8257906b09e6516f675883d8e7a3c455ad869) 2008-01-08 14:31:48 +03:00
			`/* these are really internal tdb functions - but we need them here for`
			`non-blocking lock of the freelist */`
			`int tdb_lock_nonblock(struct tdb_context *tdb, int list, int ltype);`
			`int tdb_unlock(struct tdb_context *tdb, int list, int ltype);`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00

			`key.dsize = rec->keylen;`
			`key.dptr = &rec->data[0];`
			`data.dsize = rec->datalen;`
			`data.dptr = &rec->data[rec->keylen];`

			`if (ctdb_lmaster(ctdb, &key) == ctdb->pnn) {`
added debug constants to allow for better mapping to syslog levels (This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502) 2008-02-04 09:44:24 +03:00			`DEBUG(DEBUG_INFO,(__location__ " Called delete on record where we are lmaster\n"));`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`return -1;`
			`}`

			`if (data.dsize != sizeof(struct ctdb_ltdb_header)) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Bad record size\n"));`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`return -1;`
			`}`

			`hdr = (struct ctdb_ltdb_header *)data.dptr;`

			`/* use a non-blocking lock */`
			`if (tdb_chainlock_nonblock(ctdb_db->ltdb->tdb, key) != 0) {`
			`return -1;`
			`}`

			`data = tdb_fetch(ctdb_db->ltdb->tdb, key);`
			`if (data.dptr == NULL) {`
			`tdb_chainunlock(ctdb_db->ltdb->tdb, key);`
			`return 0;`
			`}`

			`if (data.dsize < sizeof(struct ctdb_ltdb_header)) {`
ensure the main daemon doesn't use a blocking lock on the freelist (This used to be ctdb commit 73f8257906b09e6516f675883d8e7a3c455ad869) 2008-01-08 14:31:48 +03:00			`if (tdb_lock_nonblock(ctdb_db->ltdb->tdb, -1, F_WRLCK) == 0) {`
			`tdb_delete(ctdb_db->ltdb->tdb, key);`
			`tdb_unlock(ctdb_db->ltdb->tdb, -1, F_WRLCK);`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_CRIT,(__location__ " Deleted corrupt record\n"));`
ensure the main daemon doesn't use a blocking lock on the freelist (This used to be ctdb commit 73f8257906b09e6516f675883d8e7a3c455ad869) 2008-01-08 14:31:48 +03:00			`}`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`tdb_chainunlock(ctdb_db->ltdb->tdb, key);`
			`free(data.dptr);`
			`return 0;`
			`}`

			`hdr2 = (struct ctdb_ltdb_header *)data.dptr;`

			`if (hdr2->rsn > hdr->rsn) {`
			`tdb_chainunlock(ctdb_db->ltdb->tdb, key);`
added debug constants to allow for better mapping to syslog levels (This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502) 2008-02-04 09:44:24 +03:00			`DEBUG(DEBUG_INFO,(__location__ " Skipping record with rsn=%llu - called with rsn=%llu\n",`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`(unsigned long long)hdr2->rsn, (unsigned long long)hdr->rsn));`
			`free(data.dptr);`
			`return -1;`
			`}`

			`if (hdr2->dmaster == ctdb->pnn) {`
			`tdb_chainunlock(ctdb_db->ltdb->tdb, key);`
added debug constants to allow for better mapping to syslog levels (This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502) 2008-02-04 09:44:24 +03:00			`DEBUG(DEBUG_INFO,(__location__ " Attempted delete record where we are the dmaster\n"));`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`free(data.dptr);`
			`return -1;`
			`}`

ensure the main daemon doesn't use a blocking lock on the freelist (This used to be ctdb commit 73f8257906b09e6516f675883d8e7a3c455ad869) 2008-01-08 14:31:48 +03:00			`if (tdb_lock_nonblock(ctdb_db->ltdb->tdb, -1, F_WRLCK) != 0) {`
			`tdb_chainunlock(ctdb_db->ltdb->tdb, key);`
			`free(data.dptr);`
			`return -1;`
			`}`

added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`if (tdb_delete(ctdb_db->ltdb->tdb, key) != 0) {`
ensure the main daemon doesn't use a blocking lock on the freelist (This used to be ctdb commit 73f8257906b09e6516f675883d8e7a3c455ad869) 2008-01-08 14:31:48 +03:00			`tdb_unlock(ctdb_db->ltdb->tdb, -1, F_WRLCK);`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`tdb_chainunlock(ctdb_db->ltdb->tdb, key);`
added debug constants to allow for better mapping to syslog levels (This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502) 2008-02-04 09:44:24 +03:00			`DEBUG(DEBUG_INFO,(__location__ " Failed to delete record\n"));`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`free(data.dptr);`
			`return -1;`
			`}`

ensure the main daemon doesn't use a blocking lock on the freelist (This used to be ctdb commit 73f8257906b09e6516f675883d8e7a3c455ad869) 2008-01-08 14:31:48 +03:00			`tdb_unlock(ctdb_db->ltdb->tdb, -1, F_WRLCK);`
added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4) 2008-01-08 09:23:27 +03:00			`tdb_chainunlock(ctdb_db->ltdb->tdb, key);`
			`free(data.dptr);`
			`return 0;`
			`}`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00

Redo the vacukming process to mkake it scalable. Vacumming used to delete one record at a time on all nodes, that was m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldnt scale at all. The new vacuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted. (This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53) 2008-03-12 23:53:29 +03:00
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`struct recovery_callback_state {`
			`struct ctdb_req_control *c;`
			`};`


			`/*`
			`called when the 'recovered' event script has finished`
			`*/`
			`static void ctdb_end_recovery_callback(struct ctdb_context *ctdb, int status, void *p)`
			`{`
			`struct recovery_callback_state *state = talloc_get_type(p, struct recovery_callback_state);`

			`ctdb_enable_monitoring(ctdb);`

			`if (status != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " recovered event script failed (status %d)\n", status));`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`}`

			`ctdb_request_control_reply(ctdb, state->c, NULL, status, NULL);`
			`talloc_free(state);`

track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00			`gettimeofday(&ctdb->last_recovery_finished, NULL);`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`}`

			`/*`
			`recovery has finished`
			`*/`
			`int32_t ctdb_control_end_recovery(struct ctdb_context *ctdb,`
			`struct ctdb_req_control *c,`
			`bool *async_reply)`
			`{`
			`int ret;`
			`struct recovery_callback_state *state;`

merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_NOTICE,("Recovery has finished\n"));`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00
			`state = talloc(ctdb, struct recovery_callback_state);`
			`CTDB_NO_MEMORY(ctdb, state);`

			`state->c = talloc_steal(state, c);`

			`ctdb_disable_monitoring(ctdb);`

			`ret = ctdb_event_script_callback(ctdb,`
change the eventscript handling to allow EventScriptTimeout for each individual script isntead of for the entire set of scripts restructure the talloc hierarchy to allow this (This used to be ctdb commit 64da4402c6ad485f1d0a604878a7b0c01a0ea5f0) 2009-10-28 08:11:54 +03:00			`timeval_set(ctdb->tunable.script_timeout, 0),`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`state,`
			`ctdb_end_recovery_callback,`
			`state, "recovered");`

			`if (ret != 0) {`
			`ctdb_enable_monitoring(ctdb);`

merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to end recovery\n"));`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`talloc_free(state);`
			`return -1;`
			`}`

			`/* tell the control that we will reply asynchronously */`
			`*async_reply = true;`
			`return 0;`
			`}`

			`/*`
			`called when the 'startrecovery' event script has finished`
			`*/`
			`static void ctdb_start_recovery_callback(struct ctdb_context *ctdb, int status, void *p)`
			`{`
			`struct recovery_callback_state *state = talloc_get_type(p, struct recovery_callback_state);`

			`if (status != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " startrecovery event script failed (status %d)\n", status));`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`}`

			`ctdb_request_control_reply(ctdb, state->c, NULL, status, NULL);`
			`talloc_free(state);`
			`}`

			`/*`
track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00			`run the startrecovery eventscript`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`*/`
			`int32_t ctdb_control_start_recovery(struct ctdb_context *ctdb,`
			`struct ctdb_req_control *c,`
			`bool *async_reply)`
			`{`
			`int ret;`
			`struct recovery_callback_state *state;`

update a comment to reflect that this is not always a real recovery it can also be printed when we just do an ip reallocation (This used to be ctdb commit e4c9e511fc5e15e0638ebb9117cb4a65ca8fda4b) 2008-07-02 06:01:19 +04:00			`DEBUG(DEBUG_NOTICE,(__location__ " startrecovery eventscript has been invoked\n"));`
track both when we last started and ended a recovery. make ctdb uptime print how long the recovery took in the recovery daemon when we check that the public ip address allocation on the local node is correct (we have the ips we should have and we dont have any we shouldnt have) use ctdb uptime and check the recovery start/stop times and make sure we dont check for ip allocation inconsistencies during a recovery where the ip address allocation is in flux. (This used to be ctdb commit f86551580349b7f662f9a07e4eb0c1189e38e429) 2008-07-02 07:55:59 +04:00			`gettimeofday(&ctdb->last_recovery_started, NULL);`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00
			`state = talloc(ctdb, struct recovery_callback_state);`
			`CTDB_NO_MEMORY(ctdb, state);`

			`state->c = talloc_steal(state, c);`

			`ctdb_disable_monitoring(ctdb);`

			`ret = ctdb_event_script_callback(ctdb,`
change the eventscript handling to allow EventScriptTimeout for each individual script isntead of for the entire set of scripts restructure the talloc hierarchy to allow this (This used to be ctdb commit 64da4402c6ad485f1d0a604878a7b0c01a0ea5f0) 2009-10-28 08:11:54 +03:00			`timeval_set(ctdb->tunable.script_timeout, 0),`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`state,`
			`ctdb_start_recovery_callback,`
			`state, "startrecovery");`

			`if (ret != 0) {`
merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c) 2008-02-04 12:07:15 +03:00			`DEBUG(DEBUG_ERR,(__location__ " Failed to start recovery\n"));`
merge async recovery changes from Ronnie (This used to be ctdb commit 576e317640d25f8059114f15c6f1ebcee5e5b6e2) 2008-01-29 05:59:28 +03:00			`talloc_free(state);`
			`return -1;`
			`}`

			`/* tell the control that we will reply asynchronously */`
			`*async_reply = true;`
			`return 0;`
			`}`

Redo the vacukming process to mkake it scalable. Vacumming used to delete one record at a time on all nodes, that was m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldnt scale at all. The new vacuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted. (This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53) 2008-03-12 23:53:29 +03:00			`/*`
			`try to delete all these records as part of the vacuuming process`
			`and return the records we failed to delete`
			`*/`
			`int32_t ctdb_control_try_delete_records(struct ctdb_context *ctdb, TDB_DATA indata, TDB_DATA *outdata)`
			`{`
rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`struct ctdb_marshall_buffer *reply = (struct ctdb_marshall_buffer *)indata.dptr;`
Redo the vacukming process to mkake it scalable. Vacumming used to delete one record at a time on all nodes, that was m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldnt scale at all. The new vacuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted. (This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53) 2008-03-12 23:53:29 +03:00			`struct ctdb_db_context *ctdb_db;`
			`int i;`
			`struct ctdb_rec_data *rec;`
rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`struct ctdb_marshall_buffer *records;`
Redo the vacukming process to mkake it scalable. Vacumming used to delete one record at a time on all nodes, that was m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldnt scale at all. The new vacuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted. (This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53) 2008-03-12 23:53:29 +03:00
rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`if (indata.dsize < offsetof(struct ctdb_marshall_buffer, data)) {`
Redo the vacukming process to mkake it scalable. Vacumming used to delete one record at a time on all nodes, that was m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldnt scale at all. The new vacuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted. (This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53) 2008-03-12 23:53:29 +03:00			`DEBUG(DEBUG_ERR,(__location__ " invalid data in try_delete_records\n"));`
			`return -1;`
			`}`

			`ctdb_db = find_ctdb_db(ctdb, reply->db_id);`
			`if (!ctdb_db) {`
			`DEBUG(DEBUG_ERR,(__location__ " Unknown db 0x%08x\n", reply->db_id));`
			`return -1;`
			`}`


			`DEBUG(DEBUG_DEBUG,("starting try_delete_records of %u records for dbid 0x%x\n",`
			`reply->count, reply->db_id));`


			`/* create a blob to send back the records we couldn't delete */`
rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`records = (struct ctdb_marshall_buffer *)`
Redo the vacukming process to mkake it scalable. Vacumming used to delete one record at a time on all nodes, that was m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldnt scale at all. The new vacuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted. (This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53) 2008-03-12 23:53:29 +03:00			`talloc_zero_size(outdata,`
rename the structure we use for marshalling multiple records (This used to be ctdb commit 4d205476d286570a6e1f52b59af42858ce051106) 2008-07-30 08:24:56 +04:00			`offsetof(struct ctdb_marshall_buffer, data));`
Redo the vacukming process to mkake it scalable. Vacumming used to delete one record at a time on all nodes, that was m*n behaviour and would require a huge storm of ctdb->ctdb controls and just wouldnt scale at all. The new vacuming process collects all records to be deleted locally and then only sends 1 control to the other nodes. This control contains a list of all records to be deleted. (This used to be ctdb commit 9e625ece19a91f362c9539fa73b6b2108f0d9c53) 2008-03-12 23:53:29 +03:00			`if (records == NULL) {`
			`DEBUG(DEBUG_ERR,(__location__ " Out of memory\n"));`
			`return -1;`
			`}`
			`records->db_id = ctdb_db->db_id;`


			`rec = (struct ctdb_rec_data *)&reply->data[0];`
			`for (i=0;i<reply->count;i++) {`
			`TDB_DATA key, data;`

			`key.dptr = &rec->data[0];`
			`key.dsize = rec->keylen;`
			`data.dptr = &rec->data[key.dsize];`
			`data.dsize = rec->datalen;`

			`if (data.dsize < sizeof(struct ctdb_ltdb_header)) {`
			`DEBUG(DEBUG_CRIT,(__location__ " bad ltdb record in indata\n"));`
			`return -1;`
			`}`

			`/* If we can't delete the record we must add it to the reply`
			`so the lmaster knows it may not purge this record`
			`*/`
			`if (delete_tdb_record(ctdb, ctdb_db, rec) != 0) {`
			`size_t old_size;`
			`struct ctdb_ltdb_header *hdr;`

			`hdr = (struct ctdb_ltdb_header *)data.dptr;`
			`data.dptr += sizeof(*hdr);`
			`data.dsize -= sizeof(*hdr);`

			`DEBUG(DEBUG_INFO, (__location__ " Failed to vacuum delete record with hash 0x%08x\n", ctdb_hash(&key)));`

			`old_size = talloc_get_size(records);`
			`records = talloc_realloc_size(outdata, records, old_size + rec->length);`
			`if (records == NULL) {`
			`DEBUG(DEBUG_ERR,(__location__ " Failed to expand\n"));`
			`return -1;`
			`}`
			`records->count++;`
			`memcpy(old_size+(uint8_t *)records, rec, rec->length);`
			`}`

			`rec = (struct ctdb_rec_data *)(rec->length + (uint8_t *)rec);`
			`}`


			`outdata->dptr = (uint8_t *)records;`
			`outdata->dsize = talloc_get_size(records);`

			`return 0;`
			`}`
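The loop in `ctdb_control_try_delete_records` steps through a buffer of variable-length records by adding each record's `length` to the current position. A minimal, bounds-checked sketch of that traversal follows. It is illustrative only: `struct demo_rec` is a hypothetical, simplified layout and `demo_total_keylen` is a made-up helper, neither of which is taken from ctdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified record layout (an assumption, not the
   real struct ctdb_rec_data) */
struct demo_rec {
	uint32_t length;  /* total size of this record, header included */
	uint32_t keylen;  /* number of key bytes at the start of data[] */
	uint32_t datalen; /* number of value bytes following the key */
	uint8_t data[1];  /* key bytes followed by value bytes */
};

/* Walk `count` packed records in buf[0..size) and return the sum of
   their key lengths, or -1 if any record would overrun the buffer. */
static long demo_total_keylen(const uint8_t *buf, size_t size, uint32_t count)
{
	const size_t hdr = offsetof(struct demo_rec, data);
	size_t off = 0;
	long total = 0;
	uint32_t i;

	for (i = 0; i < count; i++) {
		struct demo_rec r;

		if (size - off < hdr) {
			return -1; /* not enough room left for a header */
		}
		/* copy the header out so unaligned input is safe to read */
		memcpy(&r, buf + off, hdr);

		if (r.length < hdr || r.length > size - off) {
			return -1; /* record claims more bytes than remain */
		}
		if ((uint64_t)r.keylen + r.datalen > r.length - hdr) {
			return -1; /* key and value do not fit in the record */
		}

		total += r.keylen;
		off += r.length; /* advance to the next record */
	}
	return total;
}
```

Unlike the original loop, this sketch checks every length against the remaining buffer size before advancing, which is the usual precaution when the buffer arrives from another node.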
Expand the client async framework so that it can take a callback function. This allows us to use the async framework also for controls that return outdata. Add a "capabilities" field to the ctdb_node structure. This field is only initialized and kept valid inside the recovery daemon context and not inside the main ctdb daemon. change the GET_CAPABILITIES control to return the capabilities in outdata instead of in the res return variable. When performing a recovery inside the recovery daemon, read the capabilities from all connected nodes and update the ctdb->nodes list of nodes. when building the new vnnmap after the database rebuild in recovery, do not include any nodes which lack the LMASTER capability in the new vnnmap. Unless there are no available connected node that sports the LMASTER capability in which case we let the local node (recmaster) take on the lmaster role temporarily (i.e. become a member of the vnnmap list) (This used to be ctdb commit 0f1883c69c689b28b0c04148774840b2c4081df6) 2008-05-06 09:42:59 +04:00
			`/*`
			`report capabilities`
			`*/`
			`int32_t ctdb_control_get_capabilities(struct ctdb_context *ctdb, TDB_DATA *outdata)`
			`{`
			`uint32_t *capabilities = NULL;`

			`capabilities = talloc(outdata, uint32_t);`
			`CTDB_NO_MEMORY(ctdb, capabilities);`
			`*capabilities = ctdb->capabilities;`

			`outdata->dsize = sizeof(uint32_t);`
			`outdata->dptr = (uint8_t *)capabilities;`

			`return 0;`
			`}`

additional monitoring between the two daemons. we currently only monitor that the dameons are running by kill(0, pid) and verifying the the domain socket between them is ok. this is not sufficient since we can have a situation where the recovery daemon is hung. this new code monitors that the recovery daemon is operating. if the recovery hangs, we log this and shut down the main daemon (This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c) 2008-09-09 07:44:46 +04:00			`static void ctdb_recd_ping_timeout(struct event_context *ev, struct timed_event *te, struct timeval t, void *p)`
			`{`
			`struct ctdb_context *ctdb = talloc_get_type(p, struct ctdb_context);`
The ctdb daemon keeps track of whether the recovery process is running correctly by measuring how long it was since the last successful communication with the recovery daemon was recorded. After a certain timeout the ctdb daemon would deem the recovery daemon as inoperable and shut down. If the system clock is suddenly changed forward by many (60 or more) seconds this could cause the timeout to trigger prematurely/immediately where ctdb would incorrectly think that more than 60 seconds had passed since last successful communications and thus abort. Instead of cehcking for one timeout occuring, only deem the recovery daemon to be "down" and trigger a shutdown if communications have timedout for three intervals in a row. (This used to be ctdb commit 196968c552e6ebcb57389d769a4b25f42fa8bc5d) 2008-09-17 08:17:41 +04:00			`uint32_t *count = talloc_get_type(ctdb->recd_ping_count, uint32_t);`
additional monitoring between the two daemons. we currently only monitor that the dameons are running by kill(0, pid) and verifying the the domain socket between them is ok. this is not sufficient since we can have a situation where the recovery daemon is hung. this new code monitors that the recovery daemon is operating. if the recovery hangs, we log this and shut down the main daemon (This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c) 2008-09-09 07:44:46 +04:00
add extra debug statements to the log to make it easier to see when a recovery dameon has hung due to the underlying filesystem hanging. (This used to be ctdb commit 5b0067a4e335cbbf6e606646e612d4bfcfdb7441) 2009-05-12 12:39:34 +04:00			`DEBUG(DEBUG_ERR, ("Recovery daemon ping timeout. Count : %u\n", *count));`
The ctdb daemon keeps track of whether the recovery process is running correctly by measuring how long it was since the last successful communication with the recovery daemon was recorded. After a certain timeout the ctdb daemon would deem the recovery daemon as inoperable and shut down. If the system clock is suddenly changed forward by many (60 or more) seconds this could cause the timeout to trigger prematurely/immediately where ctdb would incorrectly think that more than 60 seconds had passed since last successful communications and thus abort. Instead of cehcking for one timeout occuring, only deem the recovery daemon to be "down" and trigger a shutdown if communications have timedout for three intervals in a row. (This used to be ctdb commit 196968c552e6ebcb57389d769a4b25f42fa8bc5d) 2008-09-17 08:17:41 +04:00
use the correct tunable failcount not timeout (This used to be ctdb commit 475cfada33b4c13aaaca773d5485bbe26bffbf46) 2008-09-17 08:24:12 +04:00			`if (*count < ctdb->tunable.recd_ping_failcount) {`
The ctdb daemon keeps track of whether the recovery process is running correctly by measuring how long it was since the last successful communication with the recovery daemon was recorded. After a certain timeout the ctdb daemon would deem the recovery daemon as inoperable and shut down. If the system clock is suddenly changed forward by many (60 or more) seconds this could cause the timeout to trigger prematurely/immediately where ctdb would incorrectly think that more than 60 seconds had passed since last successful communications and thus abort. Instead of cehcking for one timeout occuring, only deem the recovery daemon to be "down" and trigger a shutdown if communications have timedout for three intervals in a row. (This used to be ctdb commit 196968c552e6ebcb57389d769a4b25f42fa8bc5d) 2008-09-17 08:17:41 +04:00			`(*count)++;`
			`event_add_timed(ctdb->ev, ctdb->recd_ping_count,`
			`timeval_current_ofs(ctdb->tunable.recd_ping_timeout, 0),`
			`ctdb_recd_ping_timeout, ctdb);`
			`return;`
			`}`

add extra debug statements to the log to make it easier to see when a recovery dameon has hung due to the underlying filesystem hanging. (This used to be ctdb commit 5b0067a4e335cbbf6e606646e612d4bfcfdb7441) 2009-05-12 12:39:34 +04:00			`DEBUG(DEBUG_ERR, ("Final timeout for recovery daemon ping. Shutting down ctdb daemon. (This can be caused if the cluster filesystem has hung)\n"));`
additional monitoring between the two daemons. we currently only monitor that the dameons are running by kill(0, pid) and verifying the the domain socket between them is ok. this is not sufficient since we can have a situation where the recovery daemon is hung. this new code monitors that the recovery daemon is operating. if the recovery hangs, we log this and shut down the main daemon (This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c) 2008-09-09 07:44:46 +04:00
			`ctdb_stop_recoverd(ctdb);`
			`ctdb_stop_keepalive(ctdb);`
			`ctdb_stop_monitoring(ctdb);`
			`ctdb_release_all_ips(ctdb);`
			`if (ctdb->methods != NULL) {`
			`ctdb->methods->shutdown(ctdb);`
			`}`
			`ctdb_event_script(ctdb, "shutdown");`
add extra debug statements to the log to make it easier to see when a recovery dameon has hung due to the underlying filesystem hanging. (This used to be ctdb commit 5b0067a4e335cbbf6e606646e612d4bfcfdb7441) 2009-05-12 12:39:34 +04:00			`DEBUG(DEBUG_ERR, ("Recovery daemon ping timeout. Daemon has been shut down.\n"));`
additional monitoring between the two daemons. we currently only monitor that the dameons are running by kill(0, pid) and verifying the the domain socket between them is ok. this is not sufficient since we can have a situation where the recovery daemon is hung. this new code monitors that the recovery daemon is operating. if the recovery hangs, we log this and shut down the main daemon (This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c) 2008-09-09 07:44:46 +04:00			`exit(0);`
			`}`

			`/* The recovery daemon will ping us at regular intervals.`
			`If we haven't been pinged for a while we assume the recovery`
			`daemon is inoperable and we shut down.`
			`*/`
			`int32_t ctdb_control_recd_ping(struct ctdb_context *ctdb)`
			`{`
The ctdb daemon keeps track of whether the recovery process is running correctly by measuring how long it was since the last successful communication with the recovery daemon was recorded. After a certain timeout the ctdb daemon would deem the recovery daemon as inoperable and shut down. If the system clock is suddenly changed forward by many (60 or more) seconds this could cause the timeout to trigger prematurely/immediately where ctdb would incorrectly think that more than 60 seconds had passed since last successful communications and thus abort. Instead of cehcking for one timeout occuring, only deem the recovery daemon to be "down" and trigger a shutdown if communications have timedout for three intervals in a row. (This used to be ctdb commit 196968c552e6ebcb57389d769a4b25f42fa8bc5d) 2008-09-17 08:17:41 +04:00			`talloc_free(ctdb->recd_ping_count);`
additional monitoring between the two daemons. we currently only monitor that the dameons are running by kill(0, pid) and verifying the the domain socket between them is ok. this is not sufficient since we can have a situation where the recovery daemon is hung. this new code monitors that the recovery daemon is operating. if the recovery hangs, we log this and shut down the main daemon (This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c) 2008-09-09 07:44:46 +04:00
The ctdb daemon keeps track of whether the recovery process is running correctly by measuring how long it was since the last successful communication with the recovery daemon was recorded. After a certain timeout the ctdb daemon would deem the recovery daemon as inoperable and shut down. If the system clock is suddenly changed forward by many (60 or more) seconds this could cause the timeout to trigger prematurely/immediately where ctdb would incorrectly think that more than 60 seconds had passed since last successful communications and thus abort. Instead of cehcking for one timeout occuring, only deem the recovery daemon to be "down" and trigger a shutdown if communications have timedout for three intervals in a row. (This used to be ctdb commit 196968c552e6ebcb57389d769a4b25f42fa8bc5d) 2008-09-17 08:17:41 +04:00			`ctdb->recd_ping_count = talloc_zero(ctdb, uint32_t);`
			`CTDB_NO_MEMORY(ctdb, ctdb->recd_ping_count);`
additional monitoring between the two daemons. we currently only monitor that the dameons are running by kill(0, pid) and verifying the the domain socket between them is ok. this is not sufficient since we can have a situation where the recovery daemon is hung. this new code monitors that the recovery daemon is operating. if the recovery hangs, we log this and shut down the main daemon (This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c) 2008-09-09 07:44:46 +04:00
			`if (ctdb->tunable.recd_ping_timeout != 0) {`
The ctdb daemon keeps track of whether the recovery process is running correctly by measuring how long it was since the last successful communication with the recovery daemon was recorded. After a certain timeout the ctdb daemon would deem the recovery daemon as inoperable and shut down. If the system clock is suddenly changed forward by many (60 or more) seconds this could cause the timeout to trigger prematurely/immediately where ctdb would incorrectly think that more than 60 seconds had passed since last successful communications and thus abort. Instead of cehcking for one timeout occuring, only deem the recovery daemon to be "down" and trigger a shutdown if communications have timedout for three intervals in a row. (This used to be ctdb commit 196968c552e6ebcb57389d769a4b25f42fa8bc5d) 2008-09-17 08:17:41 +04:00			`event_add_timed(ctdb->ev, ctdb->recd_ping_count,`
additional monitoring between the two daemons. we currently only monitor that the dameons are running by kill(0, pid) and verifying the the domain socket between them is ok. this is not sufficient since we can have a situation where the recovery daemon is hung. this new code monitors that the recovery daemon is operating. if the recovery hangs, we log this and shut down the main daemon (This used to be ctdb commit cd69d292292eaab3aac0e9d9fc57cb621597c63c) 2008-09-09 07:44:46 +04:00			`timeval_current_ofs(ctdb->tunable.recd_ping_timeout, 0),`
			`ctdb_recd_ping_timeout, ctdb);`
			`}`

			`return 0;`
			`}`

add a context and a timed event so that once we have been in recovery mode for too long we drop all public ip addresses (This used to be ctdb commit 403c68f96e1380dd07217c688de2730464f77ea0) 2008-10-22 04:04:41 +04:00

			`int32_t ctdb_control_set_recmaster(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata)`
			`{`
			`CHECK_CONTROL_DATA_SIZE(sizeof(uint32_t));`
allow to change the recmaster even the database is not frozen (This used to be ctdb commit 03e2e436db5cfd29a56d13f5d2101e42389bfc94) 2008-11-21 08:24:12 +03:00
add a context and a timed event so that once we have been in recovery mode for too long we drop all public ip addresses (This used to be ctdb commit 403c68f96e1380dd07217c688de2730464f77ea0) 2008-10-22 04:04:41 +04:00			`ctdb->recovery_master = ((uint32_t *)(&indata.dptr[0]))[0];`
			`return 0;`
			`}`
add two new controls, CTOP_NODE and CONTINUE_NODE that are used to stop/continue a node instead of using modflags messages (This used to be ctdb commit 54b4a02053a0f98f8c424e7f658890254023d39a) 2009-07-09 06:22:46 +04:00
create a new event : stopped. This event is called when a node is stopped and is used by eventscripts that need to do certain cleanup and removal of configuration or ip addresses or routing ... Note that a STOPPED node is considered "inactive" and as such will not be running the "recovered" event when the rest of the cluster has recovered. (This used to be ctdb commit 65e9309564611bf937ded3c74a79abff895d7c59) 2009-07-17 06:26:16 +04:00
			`struct stop_node_callback_state {`
			`struct ctdb_req_control *c;`
			`};`

			`/*`
			`called when the 'stopped' event script has finished`
			`*/`
			`static void ctdb_stop_node_callback(struct ctdb_context *ctdb, int status, void *p)`
add two new controls, CTOP_NODE and CONTINUE_NODE that are used to stop/continue a node instead of using modflags messages (This used to be ctdb commit 54b4a02053a0f98f8c424e7f658890254023d39a) 2009-07-09 06:22:46 +04:00			`{`
create a new event : stopped. This event is called when a node is stopped and is used by eventscripts that need to do certain cleanup and removal of configuration or ip addresses or routing ... Note that a STOPPED node is considered "inactive" and as such will not be running the "recovered" event when the rest of the cluster has recovered. (This used to be ctdb commit 65e9309564611bf937ded3c74a79abff895d7c59) 2009-07-17 06:26:16 +04:00			`struct stop_node_callback_state *state = talloc_get_type(p, struct stop_node_callback_state);`

			`if (status != 0) {`
			`DEBUG(DEBUG_ERR,(__location__ " stopped event script failed (status %d)\n", status));`
			`ctdb->nodes[ctdb->pnn]->flags &= ~NODE_FLAGS_STOPPED;`
			`}`

			`ctdb_request_control_reply(ctdb, state->c, NULL, status, NULL);`
			`talloc_free(state);`
			`}`

			`int32_t ctdb_control_stop_node(struct ctdb_context *ctdb, struct ctdb_req_control *c, bool *async_reply)`
			`{`
			`int ret;`
			`struct stop_node_callback_state *state;`

change the infolevel when logging stop/continue commands (This used to be ctdb commit 1e007c833098b03dd81797c081da1ae1b10c971c) 2009-07-09 08:34:12 +04:00			`DEBUG(DEBUG_INFO,(__location__ " Stopping node\n"));`
create a new event : stopped. This event is called when a node is stopped and is used by eventscripts that need to do certain cleanup and removal of configuration or ip addresses or routing ... Note that a STOPPED node is considered "inactive" and as such will not be running the "recovered" event when the rest of the cluster has recovered. (This used to be ctdb commit 65e9309564611bf937ded3c74a79abff895d7c59) 2009-07-17 06:26:16 +04:00
			`state = talloc(ctdb, struct stop_node_callback_state);`
			`CTDB_NO_MEMORY(ctdb, state);`

			`state->c = talloc_steal(state, c);`

			`ctdb_disable_monitoring(ctdb);`

			`ret = ctdb_event_script_callback(ctdb,`
change the eventscript handling to allow EventScriptTimeout for each individual script isntead of for the entire set of scripts restructure the talloc hierarchy to allow this (This used to be ctdb commit 64da4402c6ad485f1d0a604878a7b0c01a0ea5f0) 2009-10-28 08:11:54 +03:00			`timeval_set(ctdb->tunable.script_timeout, 0),`
create a new event : stopped. This event is called when a node is stopped and is used by eventscripts that need to do certain cleanup and removal of configuration or ip addresses or routing ... Note that a STOPPED node is considered "inactive" and as such will not be running the "recovered" event when the rest of the cluster has recovered. (This used to be ctdb commit 65e9309564611bf937ded3c74a79abff895d7c59) 2009-07-17 06:26:16 +04:00			`state,`
			`ctdb_stop_node_callback,`
			`state, "stopped");`

			`if (ret != 0) {`
			`ctdb_enable_monitoring(ctdb);`

			`DEBUG(DEBUG_ERR,(__location__ " Failed to stop node\n"));`
			`talloc_free(state);`
			`return -1;`
			`}`

add two new controls, CTOP_NODE and CONTINUE_NODE that are used to stop/continue a node instead of using modflags messages (This used to be ctdb commit 54b4a02053a0f98f8c424e7f658890254023d39a) 2009-07-09 06:22:46 +04:00			`ctdb->nodes[ctdb->pnn]->flags |= NODE_FLAGS_STOPPED;`

create a new event : stopped. This event is called when a node is stopped and is used by eventscripts that need to do certain cleanup and removal of configuration or ip addresses or routing ... Note that a STOPPED node is considered "inactive" and as such will not be running the "recovered" event when the rest of the cluster has recovered. (This used to be ctdb commit 65e9309564611bf937ded3c74a79abff895d7c59) 2009-07-17 06:26:16 +04:00			`*async_reply = true;`

add two new controls, CTOP_NODE and CONTINUE_NODE that are used to stop/continue a node instead of using modflags messages (This used to be ctdb commit 54b4a02053a0f98f8c424e7f658890254023d39a) 2009-07-09 06:22:46 +04:00			`return 0;`
			`}`

			`int32_t ctdb_control_continue_node(struct ctdb_context *ctdb)`
			`{`
change the infolevel when logging stop/continue commands (This used to be ctdb commit 1e007c833098b03dd81797c081da1ae1b10c971c) 2009-07-09 08:34:12 +04:00			`DEBUG(DEBUG_INFO,(__location__ " Continue node\n"));`
add two new controls, CTOP_NODE and CONTINUE_NODE that are used to stop/continue a node instead of using modflags messages (This used to be ctdb commit 54b4a02053a0f98f8c424e7f658890254023d39a) 2009-07-09 06:22:46 +04:00			`ctdb->nodes[ctdb->pnn]->flags &= ~NODE_FLAGS_STOPPED;`

			`return 0;`
			`}`
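
Stopping and continuing a node come down to setting and clearing one bit in the node's flags word: `|=` sets `NODE_FLAGS_STOPPED`, and `&= ~` clears it, both in `ctdb_control_continue_node` and when the "stopped" event script fails. A small sketch of those bit operations in isolation is shown here. `SKETCH_FLAG_STOPPED` and its value are assumptions made for the example, as are the helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bit, assumed for this sketch only. */
#define SKETCH_FLAG_STOPPED 0x00000020u

/* Mark the node as stopped, leaving all other flag bits untouched. */
static uint32_t flags_stop(uint32_t flags)
{
	return flags | SKETCH_FLAG_STOPPED;
}

/* Clear the stopped bit, leaving all other flag bits untouched. */
static uint32_t flags_continue(uint32_t flags)
{
	return flags & ~SKETCH_FLAG_STOPPED;
}

/* Report whether the stopped bit is currently set. */
static bool flags_is_stopped(uint32_t flags)
{
	return (flags & SKETCH_FLAG_STOPPED) != 0;
}
```

Both operations are idempotent, so continuing a node that was never stopped, or rolling back after a failed event script, leaves the remaining flags as they were.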
