samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-24 21:34:56 +03:00

Author	SHA1	Message	Date
Martin Schwenke	97248de3a9	recoverd: An inactive node should not force recovery master elections An inactive node can't become the recovery master. So if an inactive node notices that the recovery master is inactive, it shouldn't force an election for recovery master and nominate itself as a candidate. This can cause the recovery master to flip-flop between nodes when all nodes are inactive. If there is actually an active node then it will trigger the election. This is fairly cosmetic but is a step along the way towards ironing out weirdness when all nodes are stopped. Also, fix a related comment. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit e7dc10da3ced54ea9d719ad167ee42dcca8dce75)	2012-08-08 16:14:52 +10:00
Martin Schwenke	20b75046fa	recoverd: main_loop() should not verify local IPs if node is stopped Doing these checks is pointless and potentially causes unnecessary log messages. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit a0c30c820fd47d4f8620dc060c825be10754f5d1)	2012-08-08 16:11:11 +10:00
Martin Schwenke	ae0cdd137f	recoverd: verify_local_ip_allocation() should dup ifaces before early return If CTDB starts in STOPPED state then it thinks it is in the middle of a recovery. rec->ifaces is also NULL and an early exit further down (that checks to see if a recovery is in process) means that it stays that way. However, each time this function is entered the need for a takeover run is re-flagged. The takeover run never happens due to the early exit, causing a couple of unneeded messages to be logged each time. This is avoided by moving the code that sets rec->ifaces so that it is executed earlier and, in this case, in the middle of a recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f586e8a2911fc6e7f6698f516653145d8fd45dad)	2012-08-08 16:11:11 +10:00
Martin Schwenke	d038b9e8ba	recoverd: Fix bogus info in message about changed flags Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9119a568c2b4601318f7751f537dca2f92a7230b)	2012-08-08 16:11:11 +10:00
Ronnie Sahlberg	694c1b269e	When we find an ip we shouldnt host, just release it Dont call a full blown clusterwide ipreallocation, just release it locally (This used to be ctdb commit 9a806dec8687e2ec08a308853b61af6aed5e5d1e)	2012-06-20 15:12:05 +10:00
Ronnie Sahlberg	e7d21834ae	RECOVER: When we pull databases during recovery, we used to reallocate the databuffer for each entry added. This would normally not be an issue, but for cases where memory is fragmented, this could start to cost significant cpu if we need to reallocate and move to a different region. Change this to instead preallocate , by default, 10MByte chunks to the data buffer. This significantly reduces the number of potential reallocate and move operations that may be required. Create a tunable to override/change how much preallocation should be used. (This used to be ctdb commit 1f262deaad0818f159f9c68330f7fec121679023)	2012-05-25 12:34:06 +10:00
Ronnie Sahlberg	a57eba2bb4	Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)	2012-05-03 14:03:26 +10:00
Ronnie Sahlberg	7a1aa560e7	Add new control to reload the public ip address file on a node Also add a method to use the recovery master/daemon to reload the public ips on all nodes in the cluster. Reloading the public ips on all nodes in the cluster is only supported if all nodes in the cluster are available and healthy. (This used to be ctdb commit 05603e914f8c12618d7e06943c0f7df207f645b0)	2012-05-01 10:48:08 +10:00
Ronnie Sahlberg	db411aaada	Merge remote branch 'amitay/tevent-sync' (This used to be ctdb commit 17ff3f240b0d72c72ed28d70fb9aeb3b20c80670)	2012-04-26 08:09:23 +10:00
Amitay Isaacs	4392591555	Remove explicit include of lib/tevent/tevent.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 0681014ca5ed2a9b56f63fdace7f894beccf8a9a)	2012-04-13 17:28:14 +10:00
Amitay Isaacs	202791cf72	recoverd: Fix spurious warnings when running with --nopublicipcheck Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7f8096f56d8274151705ac822b582d972078f8fe)	2012-04-13 15:38:11 +10:00
Martin Schwenke	fbe64dec01	Undo damage done by d8d37493478a26c5f1809a5f3df89ffd6e149281 The implementation of DisableIPFailover got intermingled with --nopublicipcheck. This just looks wrong - Ronnie must have been having a bad day. :-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 5083b266dd68b292c4275505f3d1b878dbf12f11)	2012-03-22 15:34:52 +11:00
Ronnie Sahlberg	f3600276fc	Add a tunable variable to control how long we defer after a ctdb addip until we force a rebalance and try to failback addresses onto this node Have it default to 300 seconds. (This used to be ctdb commit 49791db7dc74cffd7e88bd73091590cdc1909328)	2012-02-28 06:58:59 +11:00
Ronnie Sahlberg	ef2bd0b016	When adding ips to nodes, set up a deferred rebalance for the whole node to trigger after 60 seconds in case the normal ipreallocated is not sufficient to trigger rebalance. (This used to be ctdb commit 4340263b219d75c39f8de22abe3f6f1c1ee63ea2)	2012-02-28 06:56:04 +11:00
Ronnie Sahlberg	0420449a6c	Recover Persistent database DB by DB and not record by record Add a new tunable that changes the mode how persistent databases are recovered. RecoveryPDBBySeqNum When set to 1, persistent databases will be recovered in whole from the node which has the highest "__db_sequence_number__" record. This record is managed by samba for those databases where we do persistent writes and have inter-record relations. For these databases we do not want the usual "blend records from all nodes based on individual record RSN" but instead a mode where we pick one instance of the persistent database. If no node was found with a "__db_sequence_number__" record at all, we fail back to the original "recover records independently based on record RSN". Some persistent databases do not contain record interrelations and as such does not contain this special record at all. (This used to be ctdb commit 502150c764298a9fa8c4d8aa445bf7d85d4ee9dc)	2011-11-30 08:48:23 +11:00
Stefan Metzmacher	3aa5c979f3	recoverd: try to become the recovery master if we have the capability, but the current master doesn't metze (cherry picked from commit 6ba8af28f8a8f79db65120a97d7157dcc5c7e083) Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit ccd67cf7f26713e695000d89d9ce8cfa78bfe00f)	2011-11-29 10:28:52 +01:00
Ronnie Sahlberg	b18a22b820	This breaks the build since the recovery loop is different in master compared to old 1.0 branches This must have been mistakenly applied to master when you intended to push for a different branch i guess. Revert "recoverd: try to become the recovery master if we have the capability, but the current master doesn't" This reverts commit a97d417aba85e901540147a4dff4794249442939. (This used to be ctdb commit c19cb751077b78cf4b6e28a1e3746d4ffedbfd68)	2011-11-29 14:38:02 +11:00
Stefan Metzmacher	b02b55bd12	recoverd: try to become the recovery master if we have the capability, but the current master doesn't metze (This used to be ctdb commit a97d417aba85e901540147a4dff4794249442939)	2011-11-26 23:47:00 +01:00
Stefan Metzmacher	7a962685d3	recoverd: let async_getcap_callback() also update ctdb->capabilities metze (This used to be ctdb commit ef5b47d1183ee99c39ae63045a994d35255ac829)	2011-11-26 23:30:33 +01:00
Martin Schwenke	02612ea2bc	Clean up warnings: remove changed_flags in monitor_helper Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3e4fa518f02db75e4e4a7f326a71df226913f8a8)	2011-11-09 14:45:01 +11:00
Ronnie Sahlberg	0dc5584101	Merge branch 'master-readonly-records' into foo Conflicts: Makefile.in tools/ctdb.c (This used to be ctdb commit 0fedef0ffba4178126eee9544c5e2db52f5db893)	2011-09-12 09:34:34 +10:00
David Disseldorp	5296da5609	client: add timeout argument to ctdb_attach Rather than using a fixed 2 second CTDB_CONTROL_GETDBPATH timeout. (This used to be ctdb commit 9e178671560cb95121e11d718a76b05380ecd6c5)	2011-09-06 13:57:04 +02:00
Ronnie Sahlberg	63dc96cdb2	ReadOnly: Change the ctdb_db structure to keep a uint8_t for flags instead of a boolean for the persistent flag. This is the same size as the original boolean but allows us to add additional flags for the database (This used to be ctdb commit 7462761638d25880ad46024ad4ef21667eb99a98)	2011-09-01 10:21:55 +10:00
Ronnie Sahlberg	10caf186e1	remove log message we dont need S1026492 (This used to be ctdb commit c5f6e44b92210519d4bfc24611cae3f9978cc2e5)	2011-08-04 13:49:57 +10:00
Ronnie Sahlberg	ae35e9e5b2	Cleanup of logging messages/spamming Reduce an informational message about not performing ip reallocation from NOTICE (the default) to INFO. These messages are normal during startup or when stopped/banned when we will be in recovery mode for a while. Remove a message in the loop waiting for initial startup to complete about the generation being invalid. It is always invalid at this stage before we have finished initial recovery. Rate-limit the informational messages for CTDB_WAIT_UNTIL_RECOVERED so that we only print them once per second for the first 60 seconds and after that only once per 10 minutes. These messages are normal during startup, but we should not be logging them every second for cases where we will remain in recovery mode during startup for an extended period of time. Such as if suspended or permabanned. CQ S1023302 (This used to be ctdb commit 3a0af8780dc595acbed880f288fcbc4f62c862fb)	2011-05-04 10:42:32 +10:00
Michael Adam	2ad1c3f6c7	server: in the VACUUM_FETCH handler, add the VACUUM_MIGRATION to the call flags This way, the records coming in via this handler, can be treated appropriately. Namely, they can be deleted instead of being stored when they meet the fast-path vacuuming criteria (empty, never migrated with data...) (This used to be ctdb commit fb5d832104970320359b3e474eb291ca3d629380)	2011-03-14 13:35:44 +01:00
Michael Adam	89f27f9424	recoverd: in a recovery, set the MIGRATED_WITH_DATA flag on all records Those records that are kept after recovery, are non-empty, and stored identically on all nodes. So this is as if they had been migrated with data. Pair-Programmed-With: Stefan Metzmacher <metze@samba.org> (This used to be ctdb commit 101be642e492a3a54231e2e3e6553a59380fe702)	2011-03-14 13:35:43 +01:00
Ronnie Sahlberg	49a30783d3	If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main daemon too. While it does not address the reason for recovery daemon shutting down, it reduces the impact of such issues and makes the system more robust. (This used to be ctdb commit 0566ef3d6cef809bda204877c493c80ff9eb2c40)	2011-03-01 12:13:58 +11:00
Ronnie Sahlberg	d236c970d0	recoverd: avoid triggering a full recovery if just some ip allocation has failed. We dont need to rebuild the databases in this situation, we just need to try again to sort out the ip address allocations. (This used to be ctdb commit 044c398ffea23d36ee033c8ddf07d11028197346)	2011-01-11 07:40:49 +11:00
Ronnie Sahlberg	c4006ce844	Add ctdb_fork() which will fork a child process and drop the real-time scheduler for the child. Use ctdb_fork() from callers where we dont want the child to be running at real-time privilege. (This used to be ctdb commit 58795a4c9e0624e20fa3e0023b65127053edd103)	2011-01-11 07:40:41 +11:00
Ronnie Sahlberg	c2c53db49d	during ip allocation, there are failure modes where a node might hold a ip address but thinks it is still unassigned (-1). add code to the recovery daemon to detect this case and trigger a reallocation so that the ip gets covered and change the takeip code to allow for this condition, taking on an ip address that is already hosted. cq s1021073 (This used to be ctdb commit 9020baf27cab7821c9094cda185206fb7af0fee7)	2010-12-03 13:30:39 +11:00
Ronnie Sahlberg	7e29fd6093	Dont check remote ip allocation if public ip mgmt is disabled (This used to be ctdb commit 441ad00af842a8b7b5291de60d8ab08a064f5327)	2010-11-10 14:55:25 +11:00
Ronnie Sahlberg	a6ed66dfd0	dont check the public ip assignment or if even we are hosting them and shouldnt when public ips have been disabled (This used to be ctdb commit 7d07a74dc7f907ac757d20626f68e257d7ba16be)	2010-11-10 14:55:24 +11:00
Ronnie Sahlberg	5f76f3c0e2	Add a new tunable : DisableIPFailover that when set to non 0 will stop any ip reallocations at all from happening. (This used to be ctdb commit d8d37493478a26c5f1809a5f3df89ffd6e149281)	2010-11-10 14:55:24 +11:00
Ronnie Sahlberg	107d020cfa	update/improve the log message related to rerecovery timeouts (This used to be ctdb commit 8b4d1df3abcae03cf7a339d8390c816682a43019)	2010-09-28 08:47:12 +10:00
Stefan Metzmacher	5e46150490	server/recoverd: if we can't get the recovery lock, ban ourself metze (This used to be ctdb commit 80b8889267339b870868841ff077e850bc5b52e2)	2010-09-14 15:49:01 +10:00
Stefan Metzmacher	ff77985f38	server/recoverd: do takeover_run after verifying the reclock file metze (This used to be ctdb commit 93df096773c89f21f77b3bcf9aa90bf28881b852)	2010-09-14 15:48:37 +10:00
Ronnie Sahlberg	2e8aac6689	Merge commit 'rusty/ports-from-1.0.112' into foo (This used to be ctdb commit 13e58d92f5f1723e850a82ae030d0ca57e89b1ee)	2010-08-19 13:17:56 +10:00
Rusty Russell	9fbb191b78	logging: give a unique logging name to each forked child. This means we can distinguish which child is logging, esp. via syslog where we have no pid. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 68b3761a0874429b90731741f0531f76dcfbb081)	2010-08-18 11:46:32 +09:30
Rusty Russell	f93440c4b7	event: Update events to latest Samba version 0.9.8 In Samba this is now called "tevent", and while we use the backwards compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now a separate tevent_fd_set_auto_close() function. This is based on Samba version `7f29f817fa`. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)	2010-08-18 09:16:31 +09:30
Rusty Russell	8f8959a145	speed startup: with --sloppy-start, cut initial election timeout to 1/2 second. Seconds between ctdbd first log message and node healthy: BEFORE: 4.03 AFTER: 2.02 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 8f17731dea4287d4f9b21dc58c1bdf26c8a0e628)	2010-06-22 22:55:20 +09:30
Rusty Russell	fabeea6197	speed startup: don't wait a full recovery interval if we've already waited We currently sleep for one second, whether or not we've already slept. Change this to sleep for the remainder of the second, if at all. Seconds between ctdbd first log message and node healthy: BEFORE: 18.09 AFTER: 17.08 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit fde760b5f39c77172308a583da4c2443b71541c9)	2010-06-22 22:50:35 +09:30
Rusty Russell	f7efc1f8e8	speed startup: alter recovery loop We do a recovery on startup. But the code does: Sleep for ctdb->tunable.recover_interval. Check for recovery. We want to do it in the other order. This is best done by extracting the loop into a separate "main_loop" function. Seconds between ctdbd first log message and node healthy: BEFORE: 24.09 AFTER: 23.58 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 097046025176b9fcb670839d1a9f100f890e7ed2)	2010-06-22 22:50:23 +09:30
Ronnie Sahlberg	bc208bc916	rename ctdb_set_message_handler to ctdb_client_set_message_handler to avoid a collision with the function of the same name in libctdb (This used to be ctdb commit 41dbdd4fc0ab560420fb0e24a3179ff7c94c5bb7)	2010-06-02 09:51:47 +10:00
Ronnie Sahlberg	761a075de9	rename ctdb_send_message to ctdb_client_send_message to resolve collision with the function of the same name in libctdb (This used to be ctdb commit ac3292c12832484a22715f1d46aa23f3b7c8a6f6)	2010-06-02 09:45:21 +10:00
Rusty Russell	d5f6026a22	libctdb: reorganize headers: remove ctdb.h, add ctdb_client.h and ctdb_protocol.h ctdb_client.h is the existing internal client interface (which was mainly in ctdb.h), and ctdb_protocol.h is the information needed for the wire protocol only. ctdb.h will be the new, shiny, libctdb API. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 4bba6b8cd47b352f98d41f9f06258d5ac3c9adef)	2010-05-20 15:18:30 +09:30
Ronnie Sahlberg	7a62592fc5	when performing a recovery, ensure that all nodes use the same reclock file setting as the recovery master (This used to be ctdb commit 26793ad42b77c2328a00ac9a12bca813c7425245)	2010-05-06 09:33:08 +10:00
Ronnie Sahlberg	62742bd337	Dont check ip assignment across the cluster while ip-verification checks are disabled (This used to be ctdb commit 189f4a5af1053271b0834522e35c336df959aa03)	2010-05-03 15:52:02 +10:00
Ronnie Sahlberg	4a43428440	The recent change to the recovery daemon to keep track of and verify that all nodes agree on the most recent ip address assignments broke "ctdb moveip ..." since that call would never trigger a full takeover run and thus would immediately trigger an inconsistency. Add a new message to the recovery daemon where we can tell the recovery daemon to update its assignments. BZ62782 (This used to be ctdb commit e7069082e5f0380dcddee247db8754218ce18cab)	2010-05-03 15:47:17 +10:00
Ronnie Sahlberg	06885ea9a7	In the recovery daemon, keep track of which node we have assigned public ip addresses and verify that the remote nodes have/keep a consistent view of assigned addresses. If a remote node has an inconsistent view of addresses vis-a-vis the recovery master this will trigger a full ip reallocation. (This used to be ctdb commit f3bf2ab61f8dbbc806ec23a68a87aaedd458e712)	2010-04-08 14:25:26 +10:00
Ronnie Sahlberg	3f226d0c8e	Lower the loglevel for "Recovery lock successfully taken" from ERR to NOTICE BZ62086 (This used to be ctdb commit 7fa8486f9ffe2a039360b07423f734bdd884fe1d)	2010-04-07 10:45:03 +10:00
Volker Lendecke	184ca81bcd	Fix a typo in run_startrecovery_eventscript (This used to be ctdb commit 4f807b3a2d859f13c3e59e1ae737e9b145d7d613)	2010-03-29 17:06:28 +11:00
Ronnie Sahlberg	d7c00d8d7e	Drop the debug level for logging fd creation to DEBUG_DEBUG (This used to be ctdb commit eae1d4f9e52e73b4d8769868fffdafa590d03784)	2010-02-04 06:37:41 +11:00
Stefan Metzmacher	dbe912793e	server: reload the public addresses before doing a takeover run metze (This used to be ctdb commit 0e41a2204fa8a1e77dc83c0d4b253ab272b5c72d)	2010-01-20 11:11:04 +01:00
Stefan Metzmacher	5fa6a51388	server: monitor interfaces in verify_ip_allocation() metze (This used to be ctdb commit 965a65520693e3731b5b0250127b04c777087808)	2010-01-20 11:11:01 +01:00
Stefan Metzmacher	22ade0e456	server: only trigger one takeover run in verify_ip_allocation() metze (This used to be ctdb commit 10bc087d0280057962177721bdd6d4f28743b311)	2010-01-20 11:11:01 +01:00
Stefan Metzmacher	37880b0d0a	server: use CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE during a takeover run We now ask for the known and available interfaces. This means a node gets a RELEASE_IP event for all interfaces it "knows", but doesn't serve and a node only gets a TAKE_IP event for "available" interfaces. metze (This used to be ctdb commit a695a38e49e7c3e15a9706392dc920eeab1f11ba)	2010-01-20 11:10:59 +01:00
Stefan Metzmacher	2f36e78d88	server: add missing goto again after do_recovery() metze (This used to be ctdb commit 898894d3acbcc0add2ab0706a3172a446622f687)	2010-01-20 09:44:35 +01:00
Ronnie Sahlberg	4c722fe34c	fix a conflict in the merge from rusty Merge commit 'rusty/ctdb-no-setsched' Conflicts: server/ctdb_vacuum.c (This used to be ctdb commit b4365045797f520a7914afdb69ebd1a8dacfa0d9)	2009-12-17 08:18:04 +11:00
Rusty Russell	f148735928	Add --valgrinding flag instead of --nosetsched The do_setsched was being tested for whether to mmap tdbs: let's make it explicit. We can also happily move the kill-child eventscript hack under this flag. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 2ee86cc1f311d7b7504c7b14d142b9c4f6f4b469)	2009-12-16 20:59:15 +10:30
Stefan Metzmacher	8fbb5b7915	server/recovery: update flags on nodes before syncing dbs metze (This used to be ctdb commit 49d2dca9ad837e1b397294fb0e966bf0b77f751c)	2009-12-16 08:03:56 +01:00
Stefan Metzmacher	77d43d01aa	server: create recdb.tdb.X in /var/ctdb/state/ metze (This used to be ctdb commit 92e05282d6c4f16e55d914cc3bde3738ea2d44ad)	2009-12-16 08:03:56 +01:00
Stefan Metzmacher	003985acfd	ctdb: pass TDB_DISALLOW_NESTING to all tdb_open/tdb_wrap_open calls metze Signed-off-by: Stefan Metzmacher <metze@samba.org> (This used to be ctdb commit 1635e931b909c66eb3b1f5357e3a549b1a0da70d)	2009-12-16 08:03:55 +01:00
Ronnie Sahlberg	0982299bed	Revert "Make fetch_locked more scalable" This reverts commit 5736e17c139c9a8049e235429aeae0c6c9d0e93d. (This used to be ctdb commit 3d2d877d877146ca09a28a3a44f4840eb36fd377)	2009-12-15 14:26:28 +11:00
Michael Adam	b41d9a2bcc	Revert "recovery: add special pull-logic for persistent databases" This reverts commit 8aef46d2aab3efb322dda51eaa202653cefd5222. This special recovery logic is wrong now with the transaction rewrite. The treatment of persistent databases will later be rewritten to use the database sequence number. Michael (This used to be ctdb commit c5a0aef668a63f927d6184612b13ce316eb4a0be)	2009-12-12 00:45:40 +01:00
Volker Lendecke	f6ea3e6bcf	Make fetch_locked more scalable This patch improves the handling of the fetch_lock operation on non-persistent databases that ctdb clients have to do very frequently. The normal flow how this goes is the following: 1. Client does a local fetch_lock on the database 2. Client looks if the local node is dmaster. If yes, everything is fine If no, continue here 3. Client unlocks the local record 4. Client issues a "get me the record" call to ctdbd 5. ctdbd goes out and fetches the dmaster role 6. ctdbd tells the client to retry 7. Client starts over again The problem is between step 6 and 7: Before the client has had the chance to retry (i.e. catch the record with a fetch_locked), another node might have come asking ctdbd to migrate away the record again. This is a real problem, I've seen >20 loops of this kind in real workloads. This patch does the following: Whenever ctdb receives a record as result of step 5, it puts the key on a "holdback list". As long as a key is on this list, a request to migrate away the dmaster is put on hold. It is the client's duty to issue the "CTDB_CONTROL_GOTIT" control when it has successfully done step 2 after having asked ctdb to fetch the record. This will release the key from the "holdback list" and re-issue all dmaster migration requests. As a safeguard against malicious clients, once a second (default 1000msecs, tunable "HoldbackCleanupInterval" in milliseconds) ctdbd goes over the list of held back keys, deletes them and releases all held back migration requests. (This used to be ctdb commit 5736e17c139c9a8049e235429aeae0c6c9d0e93d)	2009-12-12 00:45:39 +01:00
Michael Adam	ffe62722cb	recovery: add special pull-logic for persistent databases The decision mechanism which records of a persistent db are to be pulled into the recdb during recovery is now as follows: * Usually a record with the higher rsn than that already stored is taken. (Just as for normal tdbs.) * If a transaction is running on some node, then those nodes copies of all records are taken and are not overwritten later by other nodes' copies. In order to keep track of whether a record's copy was obtained from a node with a transaction running, the recovery mechanism misuses the ctdb tdb header field 'lacount' in the recdb. It is cleared later when pushing out the recdb database to the other nodes. This way, an incomplete transaction is not spoiled when a recovery interrupts and the replay should usually succeed (possibly after a few retries). Michael (This used to be ctdb commit 8aef46d2aab3efb322dda51eaa202653cefd5222)	2009-12-04 15:00:21 +01:00
Michael Adam	9a8134e862	recovery: for persistent db's don't set the dmaster to the recmaster node number It is important to keep track of the dmaster (i.e. the node that last committed a transaction containing changes to this node). Michael (This used to be ctdb commit fe68972eb9cf3aa1f16ba1aacf57ade5d66e647c)	2009-12-04 11:30:21 +01:00
Michael Adam	f96e8166de	recovery: pass the persistent flag to recover_database() and further down to pull_remote_database(), pull_one_remote_database(), and push_recdb_database(). This is in preparation of special handling of persistent databases during recoveries. Michael (This used to be ctdb commit 90abc4ac7c16e854cf6e8f96b60a77bc92e35e07)	2009-12-04 11:30:21 +01:00
Ronnie Sahlberg	2000711cb1	when we detect a ip-allocation mismatch, just force a new ip reassignment instead of a full blown recovery (This used to be ctdb commit 4f50aa8bb8be544058523f2f544109a26c2b3b51)	2009-12-01 16:06:59 +11:00
Rusty Russell	e0c6e2f489	eventscript: introduce enum for different event script calls. Rather than doing strcmp everywhere, pass an explicit enum around. This also subtly documents what options are available. The "options" arg is now used for extra arguments only. Unfortunately, gcc complains on empty format strings, so we make ctdb_event_script() take no varargs, and add ctdb_event_script_args(). We leave ctdb_event_script_callback() taking varargs, which means callers have to do "%s", "". For the moment, we have CTDB_EVENT_UNKNOWN for handling forced scripts from the ctdb tool. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 470822b329f9d3ca9bef518b56e9ce28d5fedda2)	2009-11-24 11:16:49 +10:30
Stefan Metzmacher	198866d82d	server: if takeover runs when the recovery master becomes unhealthy The problem was this: When the monitor event fails, the node->flags get updated, and an update (containing the old and new flags) is sent to the recovery master. If the recovery master sends the update to itself (the same process), it was comparing the node->flags variable with the received new flags. This check always found both flag values to be equal and never sets the rec->need_takeover_run variable to true. There were two problems, first the push_flags_handler() function didn't pass the received old flags. And the ctdb_control_modflags() function ignored the received old flags. metze (This used to be ctdb commit 8ec633b64a05a2d903c2b9639909f15f6375548f)	2009-10-26 14:21:45 +11:00
Stefan Metzmacher	7a616a0d7b	server: print out the full 64-bit srvid on 32-bit hosts metze (This used to be ctdb commit 440e870d61267054b24404bcb69e599226353949)	2009-10-26 14:20:52 +11:00
Ronnie Sahlberg	902c476c03	From Volker L Fix some warnings and an incorrect check for a talloc failure (This used to be ctdb commit 27296a47b3d057a6729287acf128b2b67775ecde)	2009-10-22 12:19:40 +11:00
Ronnie Sahlberg	a92ba7f729	lower the debug levels for the "create FD messages" so we dont fill up the logs. (This used to be ctdb commit 87146db2769c2ec494813685bf9cec0d2a6336c3)	2009-10-21 15:26:24 +11:00
Ronnie Sahlberg	14b14a2efb	improve the log message when we skip the ip allocation check from the recovery daemon. we also skip this check if we are already in the process of performing an ip reallocation and not only when we are performing a full recovery. (This used to be ctdb commit 1a09b02767f3928d3c5db0e0afc59bb938e4a445)	2009-10-21 11:51:30 +11:00
Ronnie Sahlberg	9de3652380	add logging everytime we create a filedescriptor in the main ctdb daemon so we can spot if there are leaks. plug two leaks for filedescriptors related to when sending ARP fail and one leak when we can not parse the local address during tcp connection establish (This used to be ctdb commit ddd089810a14efe4be6e1ff3eccaa604e4913c9e)	2009-10-15 11:24:54 +11:00
Ronnie Sahlberg	122c423b82	add a new control for explicitly cancelling recovery transactions, i.e. the transactions we start across all tdb databases during the recovery. this allows us to properly clean up and delete these tdb transactions on a recovery failure. (This used to be ctdb commit b2ce8b900a7d00944c84e0574fea5b371064a06d)	2009-10-12 16:48:05 +11:00
Ronnie Sahlberg	771802b212	allow setting the recmode even when not completely frozen. we sometimes have to do this when we want to trigger a recovery (This used to be ctdb commit 46194e87e189521375b39b4ef33da2b493429fd8)	2009-10-12 13:06:16 +11:00
Ronnie Sahlberg	73c0adb029	initial attempt at freezing databases in priority order (This used to be ctdb commit e8d692590da1070c87a4144031e3306d190ebed2)	2009-10-12 12:08:39 +11:00
Ronnie Sahlberg	ae57e54566	during recovery, update all remote nodes so they use the same priorities for the databases as this node. (This used to be ctdb commit 465dc95fef0ff6651ff49fa94e4cf2ebd1036ac4)	2009-10-10 16:28:20 +11:00
Ronnie Sahlberg	e8e2f35985	verify the DISABLED flag and compare with the previous flag we have registered for that node and not what the node says is the difference. this prevents a situation where the remote node may cause spurious ip reallocations. (This used to be ctdb commit dd122351efaeef5475cdec111eb900110d83ec35)	2009-10-10 13:55:11 +11:00
Ronnie Sahlberg	342148628f	if a node fails to become frozen during recovery, mark it up with as a culprit so it will soon get banned (This used to be ctdb commit f72d33ac73ebb1af802bacdfb30279df3cd8b8f9)	2009-10-08 16:45:25 +11:00
Ronnie Sahlberg	166b1c97b4	add a new message to ask the recovery daemon to temporarily disable checking ip address consistency. This is useful when we are moving addresses using moveip in the cluster since otherwise if we collide with the recovery daemons own check we could cause a recovery (This used to be ctdb commit 9c63858c0b22c81eaccb9865a414af0bbb2833d4)	2009-10-06 12:11:32 +11:00
Ronnie Sahlberg	a82b9cfbfd	with the new banning logic with one struct for each node we no longer "forget" the other culprits as often as we used to do, which means that things like "ctdb recover" can now actually lead to a node becoming banned if we perform too many recoveries too frequently. change this to provide absolution to all nodes once they have participated in a recovery session. (This used to be ctdb commit f66d17fb2e81a35d5adb3754e1cc902f76b4590a)	2009-09-25 13:14:53 +10:00
Ronnie Sahlberg	4b7f6c8a29	dont mark the recovery daemon as a ban culprit just because a node in the cluster was set to recovery mode == ACTIVE. This happens normally when someone explicitly triggers a recovery using "ctdb recover" (This used to be ctdb commit 3085170be8460e59996a3eee4e29fec9ddbcf0f8)	2009-09-18 12:58:30 +10:00
Ronnie Sahlberg	e578bed20d	dont force an election just because the ban flag differs across the cluster. a simple push to resync this flag is sufficient (This used to be ctdb commit 8903b858ddd3a016d9cf765187839814443a67ca)	2009-09-09 10:57:39 +10:00
Ronnie Sahlberg	cda5f02c7c	new prototype banning code (This used to be ctdb commit 0c4c2240267af183d54ffd4c0aacda208f6eff6a)	2009-09-04 02:20:39 +10:00
Ronnie Sahlberg	df00979158	When we create new election data to send during elections, we must re-read the node flags from the main daemon to catch when the STOPPED flag is changed. (This used to be ctdb commit ca4982c40d81db528fe915d5ecc01fcf7df0b522)	2009-07-17 11:37:03 +10:00
Ronnie Sahlberg	0c5f5ae58d	stopped nodes can not win a recmaster election stopped nodes must yield the recmaster role (This used to be ctdb commit b75ac1185481060ab71bd743e1e48d333d716eba)	2009-07-09 14:44:03 +10:00
Ronnie Sahlberg	82c1be95ed	recovery daemon needs to monitor when the local ctdb daemon is stopped and ensure that the databases gets frozen and the node enters recovery mode (This used to be ctdb commit 99f239f8b96c8c0a06ac8ca8b8083be96265865a)	2009-07-09 14:19:32 +10:00
Ronnie Sahlberg	289c58e9b6	add a new command "ctdb ipreallocate", this command will force the recovery master to perform a full ip reallocation process. the ctdb command will block until the ip reallocation has completed (This used to be ctdb commit abad7b97fe0c066b33f6e75d0953bbed892a3216)	2009-07-02 13:00:26 +10:00
Ronnie Sahlberg	7f8d98ebb0	update the recovery daemon to read the recovery lock file off the main daemon and handle when the file is changed/enabled/disabled (This used to be ctdb commit 31acc11a6389d4dd9f7b71b7cfa2f2450076f1f7)	2009-06-25 12:55:43 +10:00
Ronnie Sahlberg	180a576f7b	Dont access the reclock file at all if VerifyRecoveryLock is zero and also make sure the reclock file is closed if the variable is cleared at runtime (This used to be ctdb commit a25f4888689a0725971606163d87c39a41669292)	2009-06-25 11:41:18 +10:00
Ronnie Sahlberg	de1402d471	dont log an error if waitpid returns -1 and errno is ECHILD (This used to be ctdb commit fdf50f3e774e3980af81c0b6f4ff81d085f4f697)	2009-06-19 15:55:13 +10:00
Ronnie Sahlberg	d3c5fb4bd1	dont leak file descriptors (This used to be ctdb commit 268c3e4b269a92741a02280c84384178e73de10e)	2009-06-19 14:54:22 +10:00
Ronnie Sahlberg	d72b14e86c	in the recovery daemon, check that the recovery master can access the recovery lock file and verify it is not stale from a child process. This allows us to timeout the operation if the underlying filesystem has become temporarily unresponsive without causing a new recovery. (This used to be ctdb commit d177b08f1dc79534491f27726b05405d47e12e20)	2009-06-19 14:44:26 +10:00
Ronnie Sahlberg	e6170b5389	add a new node state : DELETED. This is used to mark nodes as being DELETED internally in ctdb so that nodes are not renumbered if / when they are removed from the nodes file. This is used to be able to do "ctdb reloadnodes" at runtime without causing nodes to be renumbered. To do this, instead of deleting a node from the nodes file, just comment it out like 1.0.0.1 #1.0.0.2 1.0.0.3 After removing 1.0.0.2 from the cluster, the remaining nodes retain their pnn's from prior to the deletion, namely 0 and 2 Any line in the nodes file that is commented out represents a DELETED pnn (This used to be ctdb commit 6a5e4fd7fa391206b463bb4e976502f3ac5bd343)	2009-06-01 14:18:34 +10:00
Ronnie Sahlberg	98a54c4675	Track how long it takes to take out the recovery lock from both the main daemon and also from the recovery daemon. Log this in "ctdb statistics". Also add a variable "RecLockLatencyMs" that will log an error every time it takes longer than this to access the reclock file. (This used to be ctdb commit 042377ed803bb8f7ca9d6ea1a387427b7b8ba45a)	2009-05-14 10:33:25 +10:00
Ronnie Sahlberg	3363480da4	tweak some timeouts so that we do trigger a banning even if the control hangs/timesout (This used to be ctdb commit 1860a365e6ba8212e15c33016c80a2adcf8d10f4)	2009-04-24 14:45:07 +10:00
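
Commit e6170b5389 above describes a nodes-file convention in which a commented-out line marks a DELETED node while the remaining nodes keep their original pnn. The following is only a rough sketch of that numbering rule, not the ctdb implementation (which is written in C). The function names `parse_nodes_file` and `active_pnns` are hypothetical, and skipping blank lines is an assumption the commit message does not address.

```python
def parse_nodes_file(text):
    """Return a list of (pnn, address, deleted) tuples.

    Each non-blank line occupies one pnn slot, in file order.
    A line starting with '#' is treated as a DELETED node that
    still keeps its slot, so later nodes are not renumbered.
    Assumption: blank lines are ignored and take no slot.
    """
    lines = [raw.strip() for raw in text.splitlines() if raw.strip()]
    nodes = []
    for pnn, line in enumerate(lines):
        deleted = line.startswith("#")
        address = line.lstrip("#").strip()
        nodes.append((pnn, address, deleted))
    return nodes


def active_pnns(nodes):
    """Return the pnns of nodes that are not marked DELETED."""
    return [pnn for pnn, _address, deleted in nodes if not deleted]


# The example from the commit message: 1.0.0.2 is commented out.
NODES_FILE = "1.0.0.1\n#1.0.0.2\n1.0.0.3\n"
nodes = parse_nodes_file(NODES_FILE)
print(active_pnns(nodes))  # [0, 2]
```

With the three-line file from the commit message, the surviving nodes keep pnns 0 and 2, which matches the behaviour the message describes.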

292 Commits
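
Commit ae35e9e5b2 in the list above rate-limits the CTDB_WAIT_UNTIL_RECOVERED messages to once per second for the first 60 seconds and once per 10 minutes after that. The following sketches one way such a schedule could be expressed. It is an illustration only: the function name `should_log` is hypothetical, and it assumes the check runs once per elapsed whole second and that the later messages fall on exact 10-minute marks, neither of which the commit message states.

```python
def should_log(elapsed_seconds):
    """Decide whether to emit the 'still waiting' message.

    elapsed_seconds: whole seconds since waiting began
    (assumption: the caller checks once per second).
    """
    if elapsed_seconds < 60:
        # First minute: log on every one-second tick.
        return True
    # Afterwards: log only on each 10-minute boundary.
    return elapsed_seconds % 600 == 0


# Count how many messages one hour of waiting would produce:
# 60 in the first minute, then one at each of 600, 1200, ..., 3000 seconds.
messages_in_first_hour = sum(should_log(t) for t in range(3600))
print(messages_in_first_hour)  # 65
```

Without the rate limit, the same hour of waiting would produce 3600 messages, which is the log spam the commit sets out to avoid.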