samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-28 07:21:54 +03:00

Author	SHA1	Message	Date
Martin Schwenke	d340f308e7	ctdb-daemon: Don't delay reloading the nodes file Presumably this was done to minimise the chance of a recovery occurring while the nodemaps are inconsistent across nodes. Another potential theory is that the forced recovery in the ctdb.c:control_reload_nodes_file() stops another recovery occurring for ReRecoveryTimeout seconds, so this delay causes the reloads to occur during that period. This is no longer necessary because recoveries are now explicitly disabled while node files are reloaded. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	85bd9a33eb	ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled The potential resulting recovery won't run anyway. Also recoveries may have been disabled by "reloadnodes" and if the nodemaps are inconsistent between nodes then avoid triggering an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	ee9619c28b	ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	2ca484cd50	ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	108db3396f	ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	ec32d9bea8	ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	281f7e8152	ctdb-recoverd: Use a goto for do_recovery() failures This will allow extra things to be done on failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	a2044c65bc	ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	55b246195b	ctdb-recoverd: Add a new abstraction ctdb_op_disable() This can be used to disable and re-enable an operation, and do all the relevant sanity checking. Most of this is from existing functions disable_takeover_runs_handler(), clear_takeover_runs_disable() and reenable_takeover_runs(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	ae9cd037ee	ctdb-daemon: Pass on consistent flag information to recovery daemon Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Amitay Isaacs	62ba95a9f3	ctdb-daemon: Drop tunable that is no longer in use Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-03-27 06:40:08 +01:00
Amitay Isaacs	41ed26cbf7	ctdb-recoverd: Fix typo in comment Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-03-27 06:40:08 +01:00
Martin Schwenke	81e526965c	ctdb-daemon: New control CTDB_CONTROL_GET_NODES_FILE This is like CTDB_CONTROL_GET_NODEMAP but it loads from the nodes file instead of the daemon. Also new client function ctdb_ctrl_getnodesfile() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	5148228f41	ctdb-daemon: Move ctdb_read_nodes_file() to utilities Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	1ada9c4ef7	ctdb-daemon: Factor out node parsing code New function ctdb_read_nodes_file() reads a nodes file into a node map, which is a useful intermediate format. This function should replace the node reading code in the ctdb CLI tool. It will also be useful for sanity checking of nodes files across the cluster. New function convert_node_map_to_list() converts a node map to a node array (and associated node count). This fills in the details that aren't present in the node map. This may also be useful as a separate function later if node list reloading stages the data after a sanity check - the approach is not yet finalised. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	a5be2c245d	ctdb-daemon: Store node addresses as ctdb_sock_addr rather than strings Every time a nodemap is constructed the node IP addresses all need to be parsed. This isn't a very productive use of CPU. Instead, parse each string once when the nodes file is loaded. This results in much simpler code. This code also removes the use of ctdb_address. Duplicating the port is pointless without an abstraction layer around ctdb_address. If CTDB gets an incompatible transport in the future then add an abstraction layer. Note that the infiniband code is not updated. Compilation of the infiniband code is already broken. Fixing it will be a separate, properly tested effort. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	3cbeb17d0f	ctdb-common: Drop ctdb context from ctdb_parse_address() Having it require a CTDB context stops ctdb_parse_address() from being used in more generic code. Just use the existing talloc context for memory allocations. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	a1e65d0c8d	ctdb-daemon: Remove function ctdb_add_deleted_node() Just add a flags parameter to ctdb_add_nodes() and use the same code. Less is more. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	876529054a	ctdb-daemon: Set node PNN in one place This is currently set in 2 places. One of them makes the node loading code difficult to refactor. Also, when the surrounding code in either place is touched then it might get broken. This only needs to be done once at startup, not on every reload. So do it once in a very obvious way, sacrificing a few CPU cycles for some added clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	db6385afe9	ctdb-daemon: Move VNN map initialisation out of node loading Each node reload unnecessarily and incorrectly resets the VNN map, causing a potentially unnecessary recovery. When nodes are reloaded any newly deleted nodes should already be disconnected and any newly added nodes should also be disconnected. This means that reloading the nodes file should not cause a change in the VNN map. The current implementation also leaks memory every time the nodes are reloaded. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Volker Lendecke	d171d2010a	ctdb: Fix CID 1125613 Destination buffer too small Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Fri Mar 13 19:14:20 CET 2015 on sn-devel-104	2015-03-13 19:14:20 +01:00
Volker Lendecke	8d9bb5c54a	ctdb: Introduce a helper var in ctdb_get_script_list Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org>	2015-03-13 16:39:05 +01:00
Volker Lendecke	c1e8bfb186	ctdb: Fix memleak in ctdb_get_script_list scandir allocates every name individually, see example code in susv4 or man scandir Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org>	2015-03-13 16:39:05 +01:00
Volker Lendecke	a8cc495b96	ctdb: Make for-loop in ctdb_get_script_list more idiomatic Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org>	2015-03-13 16:39:05 +01:00
Volker Lendecke	b584bdebf9	ctdb: Fix whitespace Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org>	2015-03-13 16:39:05 +01:00
Volker Lendecke	f724bfb44a	ctdb: Fix CID 1288201 Array compared against 0 "helper_prog" is now declared as a static array Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2015-03-11 16:11:07 +01:00
Martin Schwenke	b7b508c765	ctdb-daemon: Use statically allocated arrays for helper paths The use of talloc with a static variable is somewhat confusing. Statically allocate an array and use ctdb_set_helper() instead. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org>	2015-03-10 15:29:06 +01:00
Amitay Isaacs	3f97be6d0f	ctdb-locking: Back-off from logging every 10 seconds If ctdb_lock_helper cannot get a lock within 10 seconds, ctdb daemon logs a message and invokes an external debug script. This is repeated every 10 seconds. In case of a contention or on a loaded system, there can be multiple ctdb_lock_helper processes waiting to get lock on record(s). For each lock request taking longer, ctdb daemon will flood the log every 10 seconds. Instead of logging aggressively every 10 seconds, relax logging to every 100s and 1000s if the elapsed time has exceeded 100s and 1000s respectively. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Thu Mar 5 12:06:44 CET 2015 on sn-devel-104	2015-03-05 12:06:44 +01:00
Martin Schwenke	54f0c39e5a	ctdb-client: Return a value of 1 when setting obsolete tunable variable Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-18 05:34:06 +01:00
Martin Schwenke	39d2fd330a	ctdb-recoverd: Abort when daemon can take recovery lock during recovery Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Feb 13 09:48:15 CET 2015 on sn-devel-104	2015-02-13 09:48:15 +01:00
Martin Schwenke	432d677489	ctdb-recoverd: Improve error messages on recovery lock coherence fail When the daemon is able to take the recovery lock during recovery we might as well guess that the cluster filesystem has a lock coherence problem and print a more useful message. This will be more helpful to those trying out cluster filesystems that don't have lock coherence or that are difficult to set up. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	48c91407ab	ctdb-recoverd: Don't release and re-take the recovery lock Just continue to hold it, otherwise a broken node might win an election and grab the lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	1d6ed91f55	ctdb-recoverd: Simplify ctdb_recovery_lock() Have it just silently take or fail to take the lock, except on an unexpected failure (where it should log an error). This means that when it is called we need to keep the old behaviour and explicitly release the lock. In do_recovery() the lock is released and a message is printed before attempting to take the lock. In the daemon sanity check the lock must be released in the error path if it is actually taken. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	be19a17faf	ctdb-recoverd: Remove check_recovery_lock() This has not done anything useful since commit `b9d8bb23af`. Instead, just check that the lock is held. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	668ed53662	ctdb-recoverd: Improve logging when recovery lock file is changed Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	db32a2bce5	ctdb-recoverd: New function ctdb_recovery_unlock() Unlock the recovery lock file. This way knowledge of the file descriptor isn't sprinkled throughout the code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	72701be663	ctdb-recoverd: New function ctdb_recovery_have_lock() True if this recovery daemon holds the lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	4d3b52f1ce	ctdb-daemon: Log a warning when setting obsolete tunables Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	d110fe2318	ctdb-daemon: Mark tunable VerifyRecoveryLock as obsolete It is pointless having a recovery lock but not sanity checking that it is working. Also, the logic that uses this tunable is confusing. In some places the recovery lock is released unnecessarily because the tunable isn't set. Simplify the logic by assuming that if a recovery lock is specified then it should be verified. Update documentation that references this tunable. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-13 07:19:07 +01:00
Martin Schwenke	5e00673f2d	ctdb-daemon: Fix SET_RECLOCK_FILE regression If the recovery lock file is unset then this dereferences a NULL pointer. The regression is due to commit `6f1ac7af0f`. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-02-04 03:14:07 +01:00
Michael Adam	a59fb322d6	ctdb: improve helpfulness of debug message when taking reclock fails Print out the errno of the fcntl call. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-by: Richard Sharpe <rsharpe@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Fri Jan 9 04:25:02 CET 2015 on sn-devel-104	2015-01-09 04:25:02 +01:00
Martin Schwenke	6f1ac7af0f	ctdb-daemon: Handle out-of-memory when setting recovery lock file Log a message when the reclock file actually changes and avoid a memory allocation when it doesn't change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Michael Adam <obnox@samba.org>	2015-01-09 02:03:40 +01:00
Amitay Isaacs	e0bf5dd456	ctdb-daemon: Use correct tdb flags when enabling robust mutex support BUG: https://bugzilla.samba.org/show_bug.cgi?id=11000 Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2014-12-19 13:15:12 +01:00
Stefan Metzmacher	6604b7bd8d	ctdb/server: add format string checking to ctdb_tevent_logging() Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-12-17 09:26:07 +01:00
Martin Schwenke	108b1be0ee	ctdb-daemon: Trust vnn->interface for an IP when releasing it ctdb_sys_find_ifname() doesn't work for IPv6 addresses so don't use it. Trust the eventscript to do sanity checking on the interface. Current warnings are replaced with equivalents generated by the eventscript. The unlikely message: Public IP %s is hosted on interface %s but we have no VNN will be replaced by: WARNING: Public IP %s hosted on interface %s but VNN says __none__ which is clear enough. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-12-05 21:02:40 +01:00
Amitay Isaacs	959b9ea0ef	ctdb-recoverd: Process all the records for vacuum fetch in a loop Processing one migration request at a time is very slow and processing a batch of records can take longer than VacuumInterval. This causes subsequent vacuum fetch requests to be dropped. The dropped records can accumulate quickly and will cause the vacuum database traverse to be quite expensive. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Dec 5 17:06:58 CET 2014 on sn-devel-104	2014-12-05 17:06:58 +01:00
Amitay Isaacs	257311e337	ctdb-vacuum: Do not delete VACUUM MIGRATED records immediately Such records should be processed by the local vacuuming daemon to ensure that all the remote copies have been deleted first. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-12-05 14:43:07 +01:00
Amitay Isaacs	dbb1958284	ctdb-vacuum: Use non-blocking lock when traversing delete tree This avoids vacuuming getting in the way of ctdb daemon to process record requests. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-12-05 14:43:07 +01:00
Amitay Isaacs	d35f512cd9	ctdb-vacuum: Use non-blocking lock when traversing delete queue This avoids vacuuming getting in the way of ctdb daemon to process record requests. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-12-05 14:43:07 +01:00
Amitay Isaacs	e4597f8771	ctdb-vacuum: Stagger vacuuming child processes This prevents multiple child processes being forked at the same time for vacuuming TDBs. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-12-05 14:43:07 +01:00
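
The logging back-off described in commit 3f97be6d0f (ctdb-locking) can be illustrated with a small sketch. This is not the CTDB code: the function name should_log_lock_wait() is hypothetical, and the sketch assumes the check runs once every 10 seconds with the elapsed wait time given in whole seconds. It logs every 10 seconds at first, then every 100 seconds once 100 seconds have elapsed, then every 1000 seconds once 1000 seconds have elapsed.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not taken from CTDB: decide whether a message
 * should be logged for a lock request that has been waiting for
 * elapsed_secs seconds. Assumes it is called once every 10 seconds.
 */
static bool should_log_lock_wait(unsigned int elapsed_secs)
{
	unsigned int interval = 10;	/* initial logging interval */

	if (elapsed_secs >= 1000) {
		interval = 1000;	/* long waits: log every 1000s */
	} else if (elapsed_secs >= 100) {
		interval = 100;		/* medium waits: log every 100s */
	}

	/* Log only when the elapsed time lands on the current interval. */
	return elapsed_secs > 0 && (elapsed_secs % interval) == 0;
}
```

Under these assumptions a request that waits 2000 seconds would produce 20 log entries rather than 200: 9 in the first 90 seconds, 9 more from 100 to 900 seconds, and 2 at 1000 and 2000 seconds.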

1615 Commits