samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-02-01 05:47:28 +03:00

Author	SHA1	Message	Date
Martin Schwenke	ab75f2a587	ctdb-recovery: Use a configurable handler when testing cluster mutex Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-04-28 09:39:16 +02:00
Martin Schwenke	419f57f378	ctdb-recovery: Factor out new function set_recmode_handler() This is used to reply to the recmode control for all the different cases. The callers can later be generalised to use a pointer, which can then be used for recovery lock handling in different contexts. Note that the handle is now freed in set_recmode_handler() rather than the callbacks. There is one difference in behaviour. Deferred attach calls are now processed in the timeout case, where they weren't before. That's a bug fix! Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-04-28 09:39:16 +02:00
Martin Schwenke	14a2330692	ctdb-recovery: Use single char ASCII numbers for status from child '0' = Child took the mutex '1' = Unable to take mutex - contention '2' = Unable to take mutex - timeout '3' = Unable to take mutex - error This is a straightforward API. When the child is generalised to an external helper then this makes it easier for a helper to be, for example, a simple script. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-04-28 09:39:16 +02:00
Martin Schwenke	4842b6bb91	ctdb-recovery: Rename recovery lock functions and struct Use the more general name "cluster mutex", since we are likely to end up with more than one cluster-wide lock. There will probably be a dedicated recovery lock, held only during recovery, and also a second lock that is held by the master node. Currently one lock is used for both purposes. At the moment the struct and functions are involved with setting the recovery mode. However, they'll be abstracted out to more generally deal with the cluster mutexes, so "recmode" -> "cluster_mutex". Drop "set" from names, since this is used to test the lock. Also drop "ctdb" prefix from functions, since they are local to this file. The struct will eventually be a long-lived handle that will release the mutex when freed, so name it accordingly. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-04-28 09:39:16 +02:00
Amitay Isaacs	95a15cde45	ctdb-daemon: Implement new controls DB_PULL and DB_PUSH_START/DB_PUSH_CONFIRM Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2016-03-25 03:26:15 +01:00
Martin Schwenke	46edef25df	ctdb-recovery: Limit scope of reclock latency statistics It does not make sense to update this statistic for the timeout case, since this could skew the statistic. To keep it simple, just update it for the usual case where there is lock contention, since this is the usual case. So the daemon statistic measures time to test the lock and the corresponding recovery daemon statistic measures time to take the lock. Additionally, the recovery daemon will eventually use this code to take the lock, and the method of updating the latency statistic will need to be pushed further out to a configurable handler that depends on the calling context. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Feb 23 10:32:06 CET 2016 on sn-devel-144	2016-02-23 10:32:06 +01:00
Martin Schwenke	188019b877	ctdb-recovery: Negate the status when checking the recovery lock Have 0 indicate that the lock was taken. This allows non-zero values to be used to indicate why the lock could not be taken. EACCES means lock contention. For now use just EACCES to cover all failures, since ctdb_recovery_lock() returns a bool and details of other errors will be lost. ctdb_recovery_lock() will undergo some big changes, so don't try to fix this now. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-02-23 07:23:18 +01:00
Martin Schwenke	fad3f367b7	ctdb-recovery: Clean up status handling from recmode child This currently returns an incorrect error when the expected number of bytes are not read. Separate out the different cases to clarify the logic and avoid reporting the wrong error. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-02-23 07:23:18 +01:00
Martin Schwenke	b6c3918457	ctdb-recovery: Don't bother ensuring file descriptor is -1 This is already done before the destructor is assigned. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-02-23 07:23:18 +01:00
Martin Schwenke	531e6724ba	ctdb-recovery: Don't store recmode in recovery mode state The callbacks that use this value are only ever called if recovery mode is being set to NORMAL. So do not check if recmode is NORMAL either. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-02-23 07:23:18 +01:00
Martin Schwenke	6695fa50ae	ctdb: Use ctdb_wait_for_process_to_exit() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-02-23 07:23:18 +01:00
Martin Schwenke	4d6ec81299	ctdb-recovery: Drop redundant status send when setting recovery mode The child process writes the status into the pipe before looping to wait. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-02-23 07:23:18 +01:00
Martin Schwenke	3e2f2169a4	ctdb-recovery: Include lib/util/time.h instead of samba_util.h Less is more... Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2016-02-23 07:23:18 +01:00
Martin Schwenke	24160ee6a4	ctdb-daemon: Don't leak memory if not using recovery lock Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org>	2016-01-12 19:16:17 +01:00
Christof Schmitt	03b27bd139	ctdb: Use prctl_set_comment from lib/util Signed-off-by: Christof Schmitt <cs@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-11-18 04:05:13 +01:00
Amitay Isaacs	f50db5cba5	ctdb-server: Replace ctdb_logging.h with common/logging.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org>	2015-11-16 00:46:15 +01:00
Amitay Isaacs	64d8bb626b	ctdb-daemon: Rename struct ctdb_control_pulldb to ctdb_pulldb Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-11-04 00:47:15 +01:00
Amitay Isaacs	645cd43200	ctdb-daemon: Rename struct ctdb_dbid_map to ctdb_dbid_map_old Match struct ctdb_dbid as per protocol/protocol.h Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-11-04 00:47:15 +01:00
Amitay Isaacs	b99436e425	ctdb-daemon: Rename struct ctdb_rec_data to ctdb_rec_data_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-11-04 00:47:14 +01:00
Amitay Isaacs	e1fed53e2a	ctdb-daemon: Rename struct ctdb_req_control to ctdb_req_control_old Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-11-04 00:47:14 +01:00
Amitay Isaacs	4647787773	ctdb-daemon: Separate prototypes for common client/server functions This groups function prototypes for common client/server functions in common/common.h and removes them from ctdb_private.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Amitay Isaacs	01c6c90e98	ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Amitay Isaacs	2fdb332fad	ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Amitay Isaacs	b900adc55c	ctdb-daemon: Separate prototypes for system specific functions This groups function prototypes for system specific functions in common/system.h and removes them from ctdb_private.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Amitay Isaacs	42f7722151	ctdb-daemon: Remove freeze requirement for updating vnnmap In the parallel database recovery model, not all the databases will remain frozen at the same time. So relax the condition to check if recovery is active. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:27 +02:00
Amitay Isaacs	3cbd0409f3	ctdb-daemon: Add a check for database generation consistency Before setting recovery mode to normal, confirm that all the databases are recovered by matching the database generation with the global generation. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:27 +02:00
Amitay Isaacs	66c7bcc777	ctdb-daemon: Use database specific mark/unmark routines Instead of marking all the databases with priority, mark only the database which is currently being processed. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:27 +02:00
Amitay Isaacs	e0fa182d93	ctdb-daemon: Use database specific freeze check routine Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:27 +02:00
Amitay Isaacs	7afabb1285	ctdb-daemon: Avoid the use of ctdb->freeze_handle variable These variables are used for state information related to freezing databases. Instead use the API functions to check if the databases are frozen. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:26 +02:00
Amitay Isaacs	8c58c7392f	ctdb-daemon: Avoid the use of ctdb->freeze_mode variable Use ctdb->freeze_mode only in ctdb_freeze.c and use the functions to check if databases are frozen everywhere else. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:26 +02:00
Amitay Isaacs	9b6865475e	ctdb-daemon: Remove obsolete IPv4 only controls Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Jeremy Allison <jra@samba.org>	2015-05-12 01:32:11 +02:00
Martin Schwenke	20a7945a26	Revert "ctdb-recoverd: Abort when daemon can take recovery lock during recovery" This reverts commit 39d2fd330a60ea590d76213f8cb406a42fa8d680. An election can occur in the middle of a recovery. During the election the recovery master can change. When a node loses a round of the election and stops being the recovery master it releases the recovery lock. Then at the end of the ongoing recovery all nodes are able to take the recovery lock so they will all abort. The most likely cause for a change in recovery master is that several (all?) nodes are starting up and the "connected-ness" of each node is a primary factor in winning the election. In this situation the recovery master can bounce around the cluster. The simplest solution is to revert this patch so that the recovery will fail. The new recovery master will then start a new recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon May 4 10:40:36 CEST 2015 on sn-devel-104	2015-05-04 10:40:36 +02:00
Martin Schwenke	1ef1cfdc4d	ctdb-common: Move ctdb_node_list_to_map() to utilities Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	dd52d82c73	ctdb-daemon: Factor out new function ctdb_node_list_to_map() Change ctdb_control_getnodemap() to use this. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	d340f308e7	ctdb-daemon: Don't delay reloading the nodes file Presumably this was done to minimise the chance of a recovery occurring while the nodemaps are inconsistent across nodes. Another potential theory is that the forced recovery in the ctdb.c:control_reload_nodes_file() stops another recovery occurring for ReRecoveryTimeout seconds, so this delay causes the reloads to occur during that period. This is no longer necessary because recoveries are now explicitly disabled while node files are reloaded. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	a5be2c245d	ctdb-daemon: Store node addresses as ctdb_sock_addr rather than strings Every time a nodemap is constructed the node IP addresses all need to be parsed. This isn't a very productive use of CPU. Instead, parse each string once when the nodes file is loaded. This results in much simpler code. This code also removes the use of ctdb_address. Duplicating the port is pointless without an abstraction layer around ctdb_address. If CTDB gets an incompatible transport in the future then add an abstraction layer. Note that the infiniband code is not updated. Compilation of the infiniband code is already broken. Fixing it will be a separate, properly tested effort. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	39d2fd330a	ctdb-recoverd: Abort when daemon can take recovery lock during recovery Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Feb 13 09:48:15 CET 2015 on sn-devel-104	2015-02-13 09:48:15 +01:00
Martin Schwenke	432d677489	ctdb-recoverd: Improve error messages on recovery lock coherence fail When the daemon is able to take the recovery lock during recovery we might as well guess that the cluster filesystem has a lock coherence problem and print a more useful message. This will be more helpful to those trying out cluster filesystems that don't have lock coherence or that are difficult to setup. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	1d6ed91f55	ctdb-recoverd: Simplify ctdb_recovery_lock() Have it just silently take or fail to take the lock, except on an unexpected failure (where it should log an error). This means that when it is called we need to keep the old behaviour and explicitly release the lock. In do_recovery() the lock is released and a message is printed before attempting to take the lock. In the daemon sanity check the lock must be released in the error path if it is actually taken. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	db32a2bce5	ctdb-recoverd: New function ctdb_recovery_unlock() Unlock the recovery lock file. This way knowledge of the file descriptor isn't sprinkled throughout the code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	72701be663	ctdb-recoverd: New function ctdb_recovery_have_lock() True if this recovery daemon holds the lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	d110fe2318	ctdb-daemon: Mark tunable VerifyRecoveryLock as obsolete It is pointless having a recovery lock but not sanity checking that it is working. Also, the logic that uses this tunable is confusing. In some places the recovery lock is released unnecessarily because the tunable isn't set. Simplify the logic by assuming that if a recovery lock is specified then it should be verified. Update documentation that references this tunable. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Michael Adam	a59fb322d6	ctdb: improve helpfulness of debug message when taking reclock fails Print out the errno of the fcntl call. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Richard Sharpe <rsharpe@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Fri Jan 9 04:25:02 CET 2015 on sn-devel-104	2015-01-09 04:25:02 +01:00
Martin Schwenke	acf26089f1	ctdb-util: Rename db_wrap to tdb_wrap and make it a build subsystem This makes it consistent with Samba, to ease transition. Update unit test code to link with tdb_wrap instead of including db_wrap.c. There are some potential whitespace fixes in this commit that have been ignored. CTDB's lib/tdb_wrap will be deleted after the transition to Samba's lib/tdb_wrap, so there's no point polishing it too much. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-09-10 01:36:15 +02:00
Martin Schwenke	b0f9d33058	ctdb: Fix some "declarations after code" problems Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-09-10 01:36:14 +02:00
Martin Schwenke	c1558adeaa	ctdb: Use sys_read() and sys_write() to ensure correct signal interaction ... and avoid compiler warnings in some cases. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-08-21 04:46:13 +02:00
Amitay Isaacs	f87b7f664f	ctdb-vacuum: Use existing function ctdb_marshall_finish Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Jul 23 09:44:00 CEST 2014 on sn-devel-104	2014-07-23 09:44:00 +02:00
Amitay Isaacs	2855173dac	ctdb-daemon: Do not thaw databases if recovery is active This prevents ctdb tool from thawing databases prematurely in thaw/wipedb/restoredb commands if recovery is active. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-07-07 13:29:50 +02:00
Amitay Isaacs	7aa20ccb5c	ctdb-daemon: No need to call event scripts with CTDB_CALLED_BY_USER This was added to support external monitoring using CTDB event scripts. However, it was never used. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-01-16 11:41:12 +11:00
Amitay Isaacs	6d1b74f052	ctdb-server: Coverity fixes Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2013-11-19 17:13:03 +01:00
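
Commit 14a2330692 in the log above describes a one-byte status protocol between the daemon and the child that tests the cluster mutex: '0' means the child took the mutex, '1' contention, '2' timeout, '3' error. The following is a minimal sketch of how a reader of that byte might decode it. It is not taken from the CTDB source: the enum, its member names, and the function `parse_mutex_status()` are hypothetical, and treating a wrong-length or unrecognised reply as an error is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical names for illustration only. The four status values
 * are the ones listed in the message of commit 14a2330692.
 */
enum mutex_status {
	MUTEX_TAKEN = 0,      /* '0': child took the mutex */
	MUTEX_CONTENTION = 1, /* '1': unable to take mutex, contention */
	MUTEX_TIMEOUT = 2,    /* '2': unable to take mutex, timeout */
	MUTEX_ERROR = 3       /* '3': unable to take mutex, error */
};

/*
 * Decode the reply read from the child. buf holds the bytes read and
 * len is how many were read. Assumption: anything other than exactly
 * one recognised ASCII digit is reported as MUTEX_ERROR.
 */
static enum mutex_status parse_mutex_status(const char *buf, size_t len)
{
	assert(buf != NULL);

	if (len != 1) {
		return MUTEX_ERROR;
	}

	switch (buf[0]) {
	case '0':
		return MUTEX_TAKEN;
	case '1':
		return MUTEX_CONTENTION;
	case '2':
		return MUTEX_TIMEOUT;
	case '3':
		return MUTEX_ERROR;
	default:
		return MUTEX_ERROR;
	}
}
```

A caller would pass the buffer and byte count obtained from reading the pipe, for example `parse_mutex_status(&c, 1)` after a successful one-byte read. Because each status is a single printable character, a helper as simple as a shell script could emit it, which is the motivation given in the commit message.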


228 Commits