samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-03-12 20:58:37 +03:00

Author	SHA1	Message	Date
Amitay Isaacs	01c6c90e98	ctdb-daemon: Remove dependency on includes.h Instead of includes.h, include the required header files explicitly. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Amitay Isaacs	2fdb332fad	ctdb-daemon: Stop using tevent compatibility definitions Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Amitay Isaacs	7084cb92e2	ctdb-include: Move include/internal/cmdline.h to common/ Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Amitay Isaacs	b900adc55c	ctdb-daemon: Separate prototypes for system specific functions This groups function prototypes for system specific functions in common/system.h and removes them from ctdb_private.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-30 02:00:27 +01:00
Volker Lendecke	d527ab1094	ctdbd: Fix a typo Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2015-10-14 02:19:14 +02:00
Amitay Isaacs	0101748287	ctdb-recoverd: Always check for recmaster before doing recovery Recovery daemon checks if it is the recovery master before performing certain checks. During those checks it's possible that re-election can change the recmaster. In such a case, the recovery daemon should never do a database recovery. This is not complete fix since the recovery master can still change while the recovery is going on. The correct fix is to abort recovery if the recovery master changes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Oct 7 17:55:05 CEST 2015 on sn-devel-104	2015-10-07 17:55:05 +02:00
Amitay Isaacs	3cf93d9136	ctdb-recoverd: Get rid of connected-ness comparison in election The reason for favouring more connected node is to create a larger cluster in case of a split brain. In split brain condition, the nodes are not communicating across partitions and each partition will run its own election. Among all the partitions, the node which holds the recovery lock will eventually "win". All the other nodes which won election but could not grab recovery lock will end up banning themselves. This also prevents the recovery master role from bouncing between nodes during startup when the entire cluster is restarted. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:29 +02:00
Amitay Isaacs	fbd9c9fd2f	ctdb-recoverd: Do not freeze databases for election If election occurs during SMB activity, then trying to freeze all the databases can cause samba/ctdb deadlock which parallel database recovery is trying to avoid. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:29 +02:00
Amitay Isaacs	e6ff36506c	ctdb-recoverd: Add code for parallel database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:29 +02:00
Amitay Isaacs	4b39a7706f	ctdb-recoverd: Update flags on all nodes before database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:29 +02:00
Amitay Isaacs	9843363629	ctdb-recoverd: Update capabilities before the database recovery Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:29 +02:00
Amitay Isaacs	14cacd2925	ctdb-recovery: Factor out existing database recovery code Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:29 +02:00
Amitay Isaacs	62f1e2579a	ctdb-daemon: Replace ctdb_message with srvid abstraction Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:28 +02:00
Amitay Isaacs	1df2594386	ctdb-daemon: Introduce per database generation The database generation for each database is updated only during recovery. After recovery is complete the database generation would be the same as the global generation. The database generation is required for parallel database recovery. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:27 +02:00
Amitay Isaacs	4f155e77a8	ctdb-daemon: Rename ctdb_control_wipe_database to ctdb_control_transdb The same structure is required in new controls for database transactions. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:27 +02:00
Martin Schwenke	b234ae0a90	ctdb-recoverd: Clear IP assignment tree on election loss If a node was previously recovery master (say, 20 years ago) and it becomes recovery master again then, if IP assignments have changed, verify_remote_ip_allocation() can produce messages like the following when called during recovery: ctdbd: recoverd:Inconsistent IP allocation - node 0 thinks 10.1.1.1 is held by node 0 while it is assigned to node 1 When a node loses an election it should clear all data specific to it being the recovery master. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-07-01 04:18:28 +02:00
Amitay Isaacs	941669ae36	ctdb-recovered: Drop unused variable Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2015-06-05 11:28:23 +02:00
Amitay Isaacs	2e2dba8d13	ctdb-recoverd/vacuum: Remove vacuum_info structure For all the records listed in VACUUM_FETCH, migration requests are sent immediately without waiting. This means there can only be a single VACUUM_FETCH processing active at a time. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org>	2015-06-05 11:28:23 +02:00
Michael Adam	92d1486b87	ctdb-recoverd/vacuum: move fetch loop back into fetch handler. With the processing of one element factored out, it is more natural to have the actual loop inside the handler function. This also makes the talloc/free bracked around the loop more obvious. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-06-05 11:28:23 +02:00
Michael Adam	4103463ad2	ctdb-recoverd/vacuum: slightly reorder the vacuum fetch loop Reads more naturally this way, imho. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-06-05 11:28:23 +02:00
Michael Adam	a1c941be6f	ctdb-recoverd/vacuum: add common exit point to vacuum_fetch_handler Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-06-05 11:28:23 +02:00
Michael Adam	9092617888	ctdb-recoverd/vacuum: factor vacuum_fetch_process_one out of vacuum_fetch_loop Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-06-05 11:28:23 +02:00
Michael Adam	84ab6d003a	ctdb-recoverd/vacuum: move two variables into scope. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-06-05 11:28:23 +02:00
Michael Adam	9e5cf6fd5c	ctdb-recoverd/vacuum: remove unneeded prototype. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-06-05 11:28:23 +02:00
Martin Schwenke	91f99ddfb3	ctdb-recoverd: Remove redundant condition when checking recovery lock It isn't possible to hold the recovery lock without having a lock file set. This is part of a goal to generalise the recovery lock mechanism to just use a helper program, which may use a lock file or may use something else. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	a45ab7d1fe	ctdb-recoverd: Simplify using TALLOC_FREE() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	2c72c9de48	ctdb-recoverd: Drop redundant condition in election handler Election packets from the current node are ignored at the beginning of the function, so this does not need to be checked. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	c75fdf208f	ctdb-recoverd: Remove unused memory context variable It is set, memory is allocated but it is never used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	4b4ba77f4a	ctdb-recoverd: Replace unnecessary use of ctdb->recovery_master Databases are only pulled by the recovery master, so it can compare with current node PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	6415edfa26	ctdb-recoverd: Rename some local variables to avoid conflict with convention rec is always a (struct ctdb_recoverd *) Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	36fc620898	ctdb_recoverd: Move num_lmasters calculation to near where it is used Unless this node is the recovery master then this is not needed. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	1fd2d3886c	ctdb-recoverd: Make num_lmasters a local variable It isn't used anywhere else and is always re-initialised to 0. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	385e9326ea	ctdb-recoverd: Remove unused struct members num_active and num_connected They are initialised and updated but the values are never used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	c3d6678dbc	ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	85bd9a33eb	ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled The potential resulting recovery won't run anyway. Also recoveries may have been disabled by "reloadnodes" and if the nodemaps are inconsistent between nodes then avoid triggering an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	ee9619c28b	ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	2ca484cd50	ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	108db3396f	ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	ec32d9bea8	ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	281f7e8152	ctdb-recoverd: Use a goto for do_recovery() failures This will allow extra things to be done on failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	a2044c65bc	ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	55b246195b	ctdb-recoverd: Add a new abstraction ctdb_op_disable() This can be used to disable and re-enable an operation, and do all the relevant sanity checking. Most of this is from existing functions disable_takeover_runs_handler(), clear_takeover_runs_disable() and reenable_takeover_runs(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	48c91407ab	ctdb-recoverd: Don't release and re-take the recovery lock Just continue to hold it, otherwise a broken node might win an election and grab the lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	1d6ed91f55	ctdb-recoverd: Simplify ctdb_recovery_lock() Have it just silently take or fail to take the lock, except on an unexpected failure (where it should log an error). This means that when it is called we need to keep the old behaviour and explicitly release the lock. In do_recovery() the lock is released and a message is printed before attempting to take the lock. In the daemon sanity check the lock must be released in the error path if it is actually taken. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	be19a17faf	ctdb-recoverd: Remove check_recovery_lock() This has not done anything useful since commit b9d8bb23af8abefb2d967e9b4e9d6e60c4a3b520. Instead, just check that the lock is held. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	668ed53662	ctdb-recoverd: Improve logging when recovery lock file is changed Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	db32a2bce5	ctdb-recoverd: New function ctdb_recovery_unlock() Unlock the recovery lock file. This way knowledge of the file descriptor isn't sprinkled throughout the code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	72701be663	ctdb-recoverd: New function ctdb_recovery_have_lock() True if this recovery daemon holds the lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	d110fe2318	ctdb-daemon: Mark tunable VerifyRecoveryLock as obsolete It is pointless having a recovery lock but not sanity checking that it is working. Also, the logic that uses this tunable is confusing. In some places the recovery lock is released unnecessarily because the tunable isn't set. Simplify the logic by assuming that if a recovery lock is specified then it should be verified. Update documentation that references this tunable. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Amitay Isaacs	959b9ea0ef	ctdb-recoverd: Process all the records for vacuum fetch in a loop Processing one migration request at a time is very slow and processing a batch of records can take longer than VacuumInterval. This causes subsequent vacuum fetch requests to be dropped. The dropped records can accumulate quickly and will cause the vacuum database traverse to be quite expensive. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Dec 5 17:06:58 CET 2014 on sn-devel-104	2014-12-05 17:06:58 +01:00

585 Commits