samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-13 13:18:06 +03:00

Author	SHA1	Message	Date
Amitay Isaacs	9b6865475e	ctdb-daemon: Remove obsolete IPv4 only controls Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Jeremy Allison <jra@samba.org>	2015-05-12 01:32:11 +02:00
Amitay Isaacs	4f4e6ebace	ctdb-daemon: Remove older data structure that supports only IPv4 addresses Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Jeremy Allison <jra@samba.org>	2015-05-12 01:32:11 +02:00
Martin Schwenke	c75f297ac3	ctdb-daemon: Fix typo in debug message Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Sun May 10 06:10:21 CEST 2015 on sn-devel-104	2015-05-10 06:10:21 +02:00
Martin Schwenke	d30b529ccc	ctdb-daemon: Initialise eventscript status earlier Don't initialise it after ctdb_event_script_callback_v() may have short-circuited. This can stop ctdb_event_script_args() from ever terminating. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:14 +02:00
Martin Schwenke	070964dbcf	ctdb-daemon: Make ctdb_event_script_args() terminate if no scripts status.done is never set to true unless event_script_callback() is invoked. The short-circuit in ctdb_event_script_callback_v() means that this doesn't happen. CTDB can't work very well without 00.ctdb (for tunable initialisation and the like) but it shouldn't get stuck. So call the callback when there are no scripts in event_script_callback(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:14 +02:00
Martin Schwenke	6808b0aa6a	ctdb-daemon: Drop interface monitoring This is done by 10.interface where the monitor event fails when there is a missing interface. The in-daemon interface checking adds no value. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:14 +02:00
Martin Schwenke	7ee57b8d7c	ctdb-recoverd: Short circuit takeover run if no nodes are RUNNING If all nodes are still in, say, FIRST_RECOVERY runstate, then the logs contain unfortunate noise like: recoverd:Failed to find node to cover ip 10.0.2.131 This avoids that by adding an early exit that avoids running takeover_run_core() when there are no nodes in the CTDB_RUNSTATE_RUNNING. To support this add the runstate to the ipflags structure. There are clearly other ways of hacking this but this seems the simplest. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	91f99ddfb3	ctdb-recoverd: Remove redundant condition when checking recovery lock It isn't possible to hold the recovery lock without having a lock file set. This is part of a goal to generalise the recovery lock mechanism to just use a helper program, which may use a lock file or may use something else. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	a45ab7d1fe	ctdb-recoverd: Simplify using TALLOC_FREE() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	2c72c9de48	ctdb-recoverd: Drop redundant condition in election handler Election packets from the current node are ignored at the beginning of the function, so this does not need to be checked. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	c75fdf208f	ctdb-recoverd: Remove unused memory context variable It is set, memory is allocated but it is never used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	e6f99fcba3	ctdb-daemon: Broadcast IP reallocation request from monitor code No need to just send it to the recovery master. This reduces the need for main daemon code to know which node is the recovery master. The end goal is for the main daemon to not need to know which node is the recovery master - this information would be stored in the recovery daemon (and subsequently a separate cluster management daemon). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	4b4ba77f4a	ctdb-recoverd: Replace unnecessary use of ctdb->recovery_master Databases are only pulled by the recovery master, so it can compare with current node PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	6415edfa26	ctdb-recoverd: Rename some local variables to avoid conflict with convention rec is always a (struct ctdb_recoverd *) Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	36fc620898	ctdb-recoverd: Move num_lmasters calculation to near where it is used Unless this node is the recovery master then this is not needed. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	1fd2d3886c	ctdb-recoverd: Make num_lmasters a local variable It isn't used anywhere else and is always re-initialised to 0. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	385e9326ea	ctdb-recoverd: Remove unused struct members num_active and num_connected They are initialised and updated but the values are never used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	c3d6678dbc	ctdb-recoverd: Use capabilities API Simplify update_capabilities() using the capabilities API and store the capabilities in new field rec->caps rather than scattered around ctdb->nodes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-05-10 03:22:13 +02:00
Martin Schwenke	20a7945a26	Revert "ctdb-recoverd: Abort when daemon can take recovery lock during recovery" This reverts commit `39d2fd330a`. An election can occur in the middle of a recovery. During the election the recovery master can change. When a node loses a round of the election and stops being the recovery master it releases the recovery lock. Then at the end of the ongoing recovery all nodes are able to take the recovery lock so they will all abort. The most likely cause for a change in recovery master is that several (all?) nodes are starting up and the "connected-ness" of each node is a primary factor in winning the election. In this situation the recovery master can bounce around the cluster. The simplest solution is to revert this patch so that the recovery will fail. The new recovery master will then start a new recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon May 4 10:40:36 CEST 2015 on sn-devel-104	2015-05-04 10:40:36 +02:00
Rajesh Joseph	9b33732a57	ctdb: Coverity fix for CID 1125630 Due to usage of CTDB_NO_MEMORY macro, some of the resources are not freed in failure cases. Signed-off-by: Rajesh Joseph <rjoseph@redhat.com> Reviewed-by: Guenther Deschner <gd@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Günther Deschner <gd@samba.org> Autobuild-Date(master): Fri Apr 17 16:49:05 CEST 2015 on sn-devel-104	2015-04-17 16:49:04 +02:00
Martin Schwenke	1ef1cfdc4d	ctdb-common: Move ctdb_node_list_to_map() to utilities Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	dd52d82c73	ctdb-daemon: Factor out new function ctdb_node_list_to_map() Change ctdb_control_getnodemap() to use this. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	d340f308e7	ctdb-daemon: Don't delay reloading the nodes file Presumably this was done to minimise the chance of a recovery occurring while the nodemaps are inconsistent across nodes. Another potential theory is that the forced recovery in the ctdb.c:control_reload_nodes_file() stops another recovery occurring for ReRecoveryTimeout seconds, so this delay causes the reloads to occur during that period. This is no longer necessary because recoveries are now explicitly disabled while node files are reloaded. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	85bd9a33eb	ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled The potential resulting recovery won't run anyway. Also recoveries may have been disabled by "reloadnodes" and if the nodemaps are inconsistent between nodes then avoid triggering an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	ee9619c28b	ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES Also add test stub support. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	2ca484cd50	ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	108db3396f	ctdb-recoverd: Add slightly more abstraction for disabling takeover runs Factor out new function srvid_disable_and_reply(), which can be re-used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:13 +02:00
Martin Schwenke	ec32d9bea8	ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	281f7e8152	ctdb-recoverd: Use a goto for do_recovery() failures This will allow extra things to be done on failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	a2044c65bc	ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	55b246195b	ctdb-recoverd: Add a new abstraction ctdb_op_disable() This can be used to disable and re-enable an operation, and do all the relevant sanity checking. Most of this is from existing functions disable_takeover_runs_handler(), clear_takeover_runs_disable() and reenable_takeover_runs(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Martin Schwenke	ae9cd037ee	ctdb-daemon: Pass on consistent flag information to recovery daemon Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-04-07 07:43:12 +02:00
Amitay Isaacs	62ba95a9f3	ctdb-daemon: Drop tunable that is no longer in use Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-03-27 06:40:08 +01:00
Amitay Isaacs	41ed26cbf7	ctdb-recoverd: Fix typo in comment Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-03-27 06:40:08 +01:00
Martin Schwenke	81e526965c	ctdb-daemon: New control CTDB_CONTROL_GET_NODES_FILE This is like CTDB_CONTROL_GET_NODEMAP but it loads from the nodes file instead of the daemon. Also new client function ctdb_ctrl_getnodesfile() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	5148228f41	ctdb-daemon: Move ctdb_read_nodes_file() to utilities Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	1ada9c4ef7	ctdb-daemon: Factor out node parsing code New function ctdb_read_nodes_file() reads a nodes file into a node map, which is a useful intermediate format. This function should replace the node reading code in the ctdb CLI tool. It will also be useful for sanity checking of nodes files across the cluster. New function convert_node_map_to_list() converts a node map to a node array (and associated node count). This fills in the details that aren't present in the node map. This may also be useful as a separate function later if node list reloading stages the data after a sanity check - the approach is not yet finalised. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	a5be2c245d	ctdb-daemon: Store node addresses as ctdb_sock_addr rather than strings Every time a nodemap is constructed the node IP addresses all need to be parsed. This isn't a very productive use of CPU. Instead, parse each string once when the nodes file is loaded. This results in much simpler code. This code also removes the use of ctdb_address. Duplicating the port is pointless without an abstraction layer around ctdb_address. If CTDB gets an incompatible transport in the future then add an abstraction layer. Note that the infiniband code is not updated. Compilation of the infiniband code is already broken. Fixing it will be a separate, properly tested effort. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	3cbeb17d0f	ctdb-common: Drop ctdb context from ctdb_parse_address() Having it require a CTDB context stops ctdb_parse_address() from being used in more generic code. Just use the existing talloc context for memory allocations. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	a1e65d0c8d	ctdb-daemon: Remove function ctdb_add_deleted_node() Just add a flags parameter to ctdb_add_nodes() and use the same code. Less is more. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	876529054a	ctdb-daemon: Set node PNN in one place This is currently set in 2 places. One of them makes the node loading code difficult to refactor. Also, when the surrounding code in either place is touched then it might get broken. This only needs to be done once at startup, not on every reload. So do it once in a very obvious way, sacrificing a few CPU cycles for some added clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Martin Schwenke	db6385afe9	ctdb-daemon: Move VNN map initialisation out of node loading Each node reload unnecessarily and incorrectly resets the VNN map, causing a potentially unnecessary recovery. When nodes are reloaded any newly deleted nodes should already be disconnected and any newly added nodes should also be disconnected. This means that reloading the nodes file should not cause a change in the VNN map. The current implementation also leaks memory every time the nodes are reloaded. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2015-03-23 12:23:12 +01:00
Volker Lendecke	d171d2010a	ctdb: Fix CID 1125613 Destination buffer too small Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Fri Mar 13 19:14:20 CET 2015 on sn-devel-104	2015-03-13 19:14:20 +01:00
Volker Lendecke	8d9bb5c54a	ctdb: Introduce a helper var in ctdb_get_script_list Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org>	2015-03-13 16:39:05 +01:00
Volker Lendecke	c1e8bfb186	ctdb: Fix memleak in ctdb_get_script_list scandir allocates every name individually, see example code in susv4 or man scandir Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org>	2015-03-13 16:39:05 +01:00
Volker Lendecke	a8cc495b96	ctdb: Make for-loop in ctdb_get_script_list more idiomatic Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org>	2015-03-13 16:39:05 +01:00
Volker Lendecke	b584bdebf9	ctdb: Fix whitespace Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Michael Adam <obnox@samba.org>	2015-03-13 16:39:05 +01:00
Volker Lendecke	f724bfb44a	ctdb: Fix CID 1288201 Array compared against 0 "helper_prog" is now declared as a static array Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2015-03-11 16:11:07 +01:00
Martin Schwenke	b7b508c765	ctdb-daemon: Use statically allocated arrays for helper paths The use of talloc with a static variable is somewhat confusing. Statically allocate an array and use ctdb_set_helper() instead. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org>	2015-03-10 15:29:06 +01:00
Amitay Isaacs	3f97be6d0f	ctdb-locking: Back-off from logging every 10 seconds If ctdb_lock_helper cannot get a lock within 10 seconds, ctdb daemon logs a message and invokes an external debug script. This is repeated every 10 seconds. In case of a contention or on a loaded system, there can be multiple ctdb_lock_helper processes waiting to get lock on record(s). For each lock request taking longer, ctdb daemon will flood the log every 10 seconds. Instead of logging aggressively every 10 seconds, relax logging to every 100s and 1000s if the elapsed time has exceeded 100s and 1000s respectively. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Michael Adam <obnox@samba.org> Autobuild-User(master): Michael Adam <obnox@samba.org> Autobuild-Date(master): Thu Mar 5 12:06:44 CET 2015 on sn-devel-104	2015-03-05 12:06:44 +01:00
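
The back-off described in 3f97be6d0f can be pictured with a minimal sketch. This is not the code from that commit: the function names `log_interval_secs()` and `should_log()` are hypothetical, and the sketch assumes the check fires on a 10-second tick and that "exceeded" means strictly greater than the threshold.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not from the ctdb source): choose how often to log
 * for a lock request that has been waiting elapsed_secs seconds.
 * Assumption: log every 10s at first, every 100s once more than 100s have
 * elapsed, and every 1000s once more than 1000s have elapsed.
 */
static unsigned int log_interval_secs(unsigned int elapsed_secs)
{
	if (elapsed_secs > 1000) {
		return 1000;
	}
	if (elapsed_secs > 100) {
		return 100;
	}
	return 10;
}

/*
 * Hypothetical helper: called on each 10-second tick, returns true when
 * this tick falls on the current logging interval.
 */
static bool should_log(unsigned int elapsed_secs)
{
	return (elapsed_secs % log_interval_secs(elapsed_secs)) == 0;
}
```

Under these assumptions a request that has waited 110 seconds stays quiet until the 200-second mark, and one that has waited 1100 seconds stays quiet until 2000 seconds.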

1637 Commits