samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-03 01:18:10 +03:00

Author	SHA1	Message	Date
Martin Schwenke	d89506449f	ctdb-failover: Add ctdb_smnotify_helper statd callout will shortly be updated to use NFS utils' sm-notify. This tiny helper will be used to create on-disk state files used by sm-notify. These state files contain endian-specific fields, so better to write a simple C implementation than to do crazy things in a shell script (or call out to Python). Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-29 22:48:33 +00:00
Volker Lendecke	3cc3329420	ctdb: Add a NULL check to convert_node_map_to_list() Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jennifer Sutton <jsutton@samba.org>	2024-08-27 07:19:32 +00:00
Martin Schwenke	578dfa5765	ctdb-scripts: Avoid flapping NFS services at startup If an NFS service check is set to, say, unhealthy_after=2 then it will always switch from the (default startup) unhealthy state to healthy, even if there is a fatal problem. If all services/scripts appear OK then the node will become healthy. When the counter hits the limit it will return to unhealthy. This is misleading. Instead, never use the counter at startup, until the service becomes healthy. This stops services flapping unhealthy-healthy-unhealthy. A side-effect is that a service that starts in a broken state will never be restarted to try to fix the problem. This makes sense. The counting and restarting really exist to deal with problems that might occur under load. The first monitor events occur before public IPs are hosted, so there can be no load. If a service doesn't start reliably the first time then the admin probably wants to know about it. nfs_iterate_test() is updated to run an initial monitor event to mark the services as healthy. This initialises the counter so it can be used for the important part of the test. Passing the -i option avoids running the extra monitor event, so the first iteration will be the initial monitor event. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	18a29ed367	ctdb-scripts: Make initial statistics output empty This makes initial failure to retrieve statistics less likely to result in a statistics change. To help with this, statistics retrieval stderr now goes to the log - only stdout goes to the file. This means that the test code for checking statistics changes needs to be redone to actually run the statistics command and check. As with rpcinfo output, this output needs to behave as deterministically in the test code as it does in the event script. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	032b7b49c9	ctdb-scripts: Only consider statistics on timeout Checking statistics is only really relevant to timeouts. That is, if an rpcinfo times out it is worth checking if the service is making progress. If the RPC service is not registered then the statistics don't need to be checked because they shouldn't be changing. The 2 tests previously added to check statistics progress now behave identically and fail on all iterations. To support testing with "timeouts", an optional TIMEOUT flag can now be added to the RPC service passed to nfs_iterate_test(). 2 new tests are added to exercise the new behaviour. The 2 new "if" statements in nfs_iterate_test() could be combined. However, a subsequent commit would split them and would be more difficult to read. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	f7a96deafa	ctdb-tests: Make _rpc_service_up() and _rpc_services_down() internal Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	0919701a68	ctdb-tests: Make NFS RPC monitoring tests consistent Update the remaining RPC monitoring tests to use nfs_iterate_test(), depending on it to set results. This makes all RPC monitoring tests consistent, so they will all benefit from future improvements. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	47c33a2442	ctdb-tests: Drop unnecessarily "else" Doing this in a previous commit would have made it more difficult to read that commit. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	8b2f228198	ctdb-tests: Replace implicit healthy behaviour with early exits The early exits from the sub-shell make the obvious successes much more obvious, and slightly simplify the code that follows. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	a522864138	ctdb-tests: Simplify handling of statistics change Handling this across two different functions led to insanity, so simplify. The handling of unhealthy_after when $_numfails = 0 implicitly causes the node to be healthy. This is how the "rpcinfo succeeds" case works. Doing it this way for statistics makes this patch easier to read. The implicit behaviour will go away in the next patch. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	084a69d552	ctdb-tests: Move result check to rpc_set_service_failure_response() The current structure here is wrong and repetitive. Checking rpcinfo result and determining output should be in the same place. Failure counting is now contained in rpc_set_service_failure_response(), but needs a file to survive the sub-shell. Don't attempt to combine and simplify code yet. That would make this commit harder to review. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	4754001200	ctdb-tests: Initialise return code file The output file is initialised, so doesn't need to be created on success. Treat the return code file the same way. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	833deb067d	ctdb-tests: Add function rpc_failure() to log failures and warnings Improves readability, makes future changes easier. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	1d9661d587	ctdb-tests: Argument 3 to nfs_iterate_test() is up iteration Nothing more complex is ever done, so we might as well simplify and reduce coupling. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	7c5e708001	ctdb-tests: nfs_iterate_test() marks RPC service down If an RPC service is given, it is automatically marked down. This avoids repetition in test cases and loosens coupling. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-08-20 22:50:34 +00:00
Martin Schwenke	8edb1fd13c	ctdb-tcp: Remove a use of ctdb_addr_to_str() This one is in a rarely used error path, so call a function that talloc()s the string instead. Again, this will also print the port, which might be useful if we ever add the ability to also specify ports in the nodes list. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Tue Aug 20 14:24:14 UTC 2024 on atb-devel-224	2024-08-20 14:24:14 +00:00
Martin Schwenke	afaf151193	ctdb-tcp: Consolidate failure code Same thing several times, so change to common failure code. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-08-20 13:06:33 +00:00
Martin Schwenke	f7aac2f755	ctdb-tcp: Use already constructed node name Node has been found, so use the pre-constructed name instead of calling ctdb_addr_to_str(). This will also print the port, which might be useful if we ever add the ability to also specify ports in the nodes list. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-08-20 13:06:33 +00:00
Martin Schwenke	02c9e7a63f	ctdb-tcp: Use path_rundir_append() to construct lock_path The current constant value doesn't respect CTDB_TEST_MODE/CTDB_BASE. Instead use the path module to allow automatic listening in test mode with local daemons. A single node can be tested with local daemons, using something like: $ tests/local_daemons.sh foo setup -n 1 -C "node address" $ grep "node address" foo/node.0/ctdb.conf # node address = 127.0.0.1 $ tests/local_daemons.sh foo start all $ tests/local_daemons.sh foo print-log 0 | grep -i chose ... node.0 ctdbd[24546]: ctdb chose network address 127.0.0.1:4379 The trick is that commenting out the node address in ctdb.conf means the chosen node address is the first one from the nodes file that allows bind/listen. In this case it is the only line. The following ensures that automatic listening works for a node that isn't the first: $ cat >mynodes 192.168.1.1 127.0.0.1 $ tests/local_daemons.sh foo setup -n 2 -N mynodes -C "node address" $ grep "node address" foo/node.1/ctdb.conf # node address = 127.0.0.1 $ tests/local_daemons.sh foo start 1 $ tests/local_daemons.sh foo print-log 1 | grep -i chose [...] node.1 ctdbd[22787]: ctdb chose network address 127.0.0.1:4379 Note that the first address isn't local on this host, so will always fail. So, doing the above and starting both nodes yields... ... $ tests/local_daemons.sh foo start 1 $ sleep 3; tests/local_daemons.sh foo start 0 $ tests/local_daemons.sh foo print-log all | grep -i 'chose\|bind' [...] node.1 ctdbd[26351]: ctdb chose network address 127.0.0.1:4379 [...] node.0 ctdbd[26438]: ctdb_tcp_listen_addr: Failed to bind() to socket - Address already in use (98) [...] node.0 ctdbd[26438]: Unable to bind to any node address - giving up ... as expected. It would be nice to add tests for this, but we don't really have infrastructure for that. At least manual testing shows, for the obvious cases, the previous commits didn't break anything. :-) Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-08-20 13:06:33 +00:00
Martin Schwenke	17959ccb4b	ctdb-ib: Remove a use of ctdb_set_error() Now the transport code is free of ctdb_set_error(). Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-08-20 13:06:33 +00:00
Martin Schwenke	b433663414	ctdb-tcp: Factor out listening code to avoid repetition Modernise debug and comments while here. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-08-20 13:06:33 +00:00
Martin Schwenke	2c75bb8687	ctdb-tcp: Use talloc_strdup() instead of repeating logic The node name is already constructed when the nodes file is loaded, so just copy the node name. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-08-20 13:06:33 +00:00
Martin Schwenke	f36f03172a	ctdb-daemon: Remove a use of ctdb_errstr() Code to setup the transport is about to be cleaned up, including removing uses of ctdb_set_error(), so avoid logging a NULL pointer or some other old error. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-08-20 13:06:33 +00:00
John Mulligan	a743a24d75	ctdb-doc: document nodes list configuration parameter Add the initial documentation of the node list configuration parameter. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Tue Aug 6 01:50:12 UTC 2024 on atb-devel-224	2024-08-06 01:50:12 +00:00
John Mulligan	6817eff833	ctdb-tests: add a nodestatus test that uses the nodes list command Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
John Mulligan	6d29c7f819	ctdb-tests: add reloadnodes unit tests that use the nodes list command Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
John Mulligan	8a5b743c43	ctdb-tests: add USENODESCOMMAND directive to fake ctdb Add a single line USENODESCOMMAND directive to the fake ctdb in order to enable use of a nodes script instead of a nodes file. For simplicity the fake ctdb always uses `nodes.sh` in the CTDB_BASE. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
John Mulligan	cdb5646b88	ctdb-tests: add unit test coverage for listnodes with command Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
John Mulligan	cfc0917135	ctdb-tools: update cli tool to optionally load nodes from command Similar to the recent changes to the ctdb server code, add the ability to load the nodes from a subprocess stdout. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
John Mulligan	ac926a506d	ctdb-conf: add boolean arg for verbosity when loading config In a future commit we will add support for loading the config file from the `ctdb` command line tool. Prior to this change the config file load func always called D_NOTICE that causes the command to emit new text and thus break all the tests that rely on the specific test output (not to mention something users could notice). This change plumbs a new `verbose` argument into some of the config file loading functions. Generally, all existing functions will have verbose set to true to match the existing behavior. Future callers of this function can set it to false in order to avoid emitting the extra text. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
John Mulligan	a0e8304ccf	ctdb-server: rename ctdb_load_nodes_file to ctdb_load_nodes Rename ctdb_load_nodes_file to ctdb_load_nodes as it can now load nodes from more than a regular file. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
John Mulligan	7e7cb91806	ctdb-server: rename nodes_file field to nodes_source Rename the `struct ctdb_context` field nodes_file to nodes_source to better match that the field may indicate something other than a true file. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
John Mulligan	dc65e7082d	ctdb-server: use the new "nodes list" configuration option Use the new "nodes list" configuration option, executing the given path if the path is prefixed by a `!`. The use case is to decouple the nodes file from the shared storage, especially in the case where the shared storage is provided by a vfs module. For an example, imagine a script that runs `curl` on a URL for a highly-available web server where the URL provides the content of the nodes file. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
John Mulligan	315890e845	ctdb-conf: add "nodes list" configuration option Add a "nodes list" configuration option to the [cluster] section of the ctdb server config. This option will be used similarly to how the `cluster lock` parameter works. When unset it defaults to the same value as before (/etc/ctdb/nodes). If given a path that is not prefixed by `!` it instead loads the nodes file from the given path. If given a path prefixed by `!` then it executes the path as a command and reads the standard output as if it were the content of the nodes file. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
John Mulligan	bab5170528	ctdb-conf: add ctdb_read_nodes_cmd function Add ctdb_read_nodes_cmd, a function that works similarly to ctdb_read_nodes_file but reads the nodes list from the stdout of a subprocess instead of a file in the file system. Signed-off-by: John Mulligan <jmulligan@redhat.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-08-06 00:43:36 +00:00
Pavel Filipenský	1fcaf066f4	ctdb:events: Add 46.update-keytabs.script for 'recovered' event BUG: https://bugzilla.samba.org/show_bug.cgi?id=6750 Signed-off-by: Pavel Filipenský <pfilipensky@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2024-07-26 17:12:36 +00:00
Martin Schwenke	ead5a3111f	ctdb-daemon: Use ctdb_parse_node_address() in ctdbd While here, fix a trivial memory leak (ctdbd will exit anyway if this function fails). Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Anoop C S <anoopcs@samba.org> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Tue Jul 23 12:39:18 UTC 2024 on atb-devel-224	2024-07-23 12:39:18 +00:00
Martin Schwenke	181cc097ef	ctdb-daemon: Use ctdb_read_nodes() in ctdbd ctdb_control_getnodesfile() calls ctdb_read_nodes(), which returns a struct ctdb_node_map rather than the old version, so update associated marshalling. While here modernise a debug message and wrap the function arguments. For ctdb_load_nodes_file() to use ctdb_read_nodes(), tweak convert_node_map_to_list() to also use the modern node map structure. Remove unused copy of ctdb_read_nodes_file(). Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Anoop C S <anoopcs@samba.org>	2024-07-23 11:37:34 +00:00
Martin Schwenke	5d2a864c0b	ctdb-protocol: Move ctdb_node_map_* to protocol_api.h Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Anoop C S <anoopcs@samba.org>	2024-07-23 11:37:34 +00:00
Martin Schwenke	fe97d04f18	ctdb-tests: Use ctdb_read_nodes() in the fake ctdbd Remove unused copy of ctdb_read_nodes_file(). Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Anoop C S <anoopcs@samba.org>	2024-07-23 11:37:34 +00:00
Martin Schwenke	3d52258d8a	ctdb-tools: Use ctdb_read_nodes() in the ctdb tool Remove unused copy of ctdb_read_nodes_file(). Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Anoop C S <anoopcs@samba.org>	2024-07-23 11:37:34 +00:00
Martin Schwenke	45da2281aa	ctdb-conf: Add a common node address handling module These functions are intended to be used in ctdbd, the ctdb tool and fake_ctdbd, replacing the different copies in each place. ctdb_read_nodes() will replace ctdb_read_nodes_file(). The name change is intentional - in future the location may be something other than a simple filename. The static copies of ctdb_read_nodes_file() and node_map_add() are slightly sanitised versions of those in tools/ctdb.c, with a call to ctdb_parse_node_address(). A bit more care is taken in node_map_add() to avoid undefined behaviour if talloc_realloc() fails. ctdb_parse_node_address() will replace ctdb_parse_address(). There is an obvious argument change, since the ctdb context argument was unused. It can only fail on an invalid node address, so return a bool. This function might be changed later to allow the input address string to include an optional port. Where to put this module isn't entirely clear. It could go in common, so be part of ctdb-util. However, if it later needs ctdb-conf (e.g. to allow the node list location to be configurable) then there would be a direct cyclic dependency. This is configuration handling, so conf/ seems sane. However, I didn't want to put it into the ctdb-conf target, since some code might need to parse a nodes list but not need to parse ctdb.conf. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Anoop C S <anoopcs@samba.org>	2024-07-23 11:37:34 +00:00
Martin Schwenke	79c5f451c8	ctdb-protocol: Move definition of CTDB_PORT to protocol Users of CTDB_PORT will all pick up the new definition. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Anoop C S <anoopcs@samba.org>	2024-07-23 11:37:34 +00:00
Martin Schwenke	67e49d3e54	ctdb-build: Remove unused dependencies on ctdb-util Since commit `ba8f8ef33c`. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Anoop C S <anoopcs@samba.org>	2024-07-23 11:37:34 +00:00
Martin Schwenke	8ba8fef8ac	ctdb-tests: Correctly handle adding a deleted node at the end The current fake_ctdbd code for reloading the nodes file overruns the allocation when adding a deleted node at the end. This is a very unlikely case, but it might as well work. Check the size of the internal node map when marking a node deleted. Also, update the code that adds a node to correctly set the deleted flag when appropriate. The included test case tests this. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Guenther Deschner <gd@samba.org> Autobuild-User(master): Günther Deschner <gd@samba.org> Autobuild-Date(master): Wed Jul 17 00:06:53 UTC 2024 on atb-devel-224	2024-07-17 00:06:53 +00:00
Martin Schwenke	340563633c	ctdb-tests: Add more reloadnodes unit tests There are no existing tests to exercise node IP address change detection. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Guenther Deschner <gd@samba.org>	2024-07-16 23:05:35 +00:00
Björn Baumbach	056dd415dd	ctdb-failover: omit "restrict" optimization keyword Fails with some compilers with error: expected ';', ',' or ')' before 'lineptr' Signed-off-by: Björn Baumbach <bb@sernet.de> Reviewed-by: Volker Lendecke <vl@samba.org> Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Jo Sutton <josutton@catalyst.net.nz> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Tue Jul 2 23:52:37 UTC 2024 on atb-devel-224	2024-07-02 23:52:37 +00:00
Anoop C S	6ba69da8d3	ctdb/wscript: Remove long pending unsupported option It has been a while since --with-libcephfs option was dropped. Therefore stop advertising it through waf scripts. Signed-off-by: Anoop C S <anoopcs@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Tue Jul 2 09:13:20 UTC 2024 on atb-devel-224	2024-07-02 09:13:20 +00:00
Xavi Hernandez	60550fbe18	Fix starvation of pending writes in CTDB queues CTDB uses a queue to receive requests and send answers. It works asynchronously using the tevent framework. However there was an issue that gave priority to the receiving side so, when a request was processed and the answer posted to the queue, if another incoming request arrived, it was served before sending the previous answer. This scenario could repeat for long periods of time if the frequency of incoming requests was high enough. Eventually, a small time gap between incoming requests gave a chance to process the pending output queue, sending many answers in a burst. This patch makes sure that both queues (input and output) are processed if the event contains the appropriate flag. Signed-off-by: Xavi Hernandez <xhernandez@redhat.com> Reviewed-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Mon Jul 1 09:17:43 UTC 2024 on atb-devel-224	2024-07-01 09:17:43 +00:00
Martin Schwenke	11c4b25331	ctdb-conf: Rename config loading to not be daemon-specific We might end up using it elsewhere. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Guenther Deschner <gd@samba.org> Reviewed-by: Anoop C S <anoopcs@samba.org>	2024-06-28 18:43:52 +05:30
Martin Schwenke	cf25243421	ctdb-conf: Move conf.[ch] to conf/ subdirectory Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Guenther Deschner <gd@samba.org> Reviewed-by: Anoop C S <anoopcs@samba.org>	2024-06-28 18:43:52 +05:30
Martin Schwenke	52e5e92693	ctdb-conf: Move all conf files to new conf/ subdirectory Leave common/conf.[ch] where they are to make this commit comprehensible. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Guenther Deschner <gd@samba.org> Reviewed-by: Anoop C S <anoopcs@samba.org>	2024-06-28 18:43:52 +05:30
Martin Schwenke	415f9f0745	ctdb-failover: Split statd_callout add-client/del-client rpc.statd is single-threaded and runs its HA callout synchronously. If it is too slow then latency accumulates and rpc.statd's backlog grows. Running a pair of add-client/del-client events with the current code averages ~0.030s in my test environment. This means that 1000 clients reclaiming locks after failover can easily cause 10s of latency. This could cause rpc.statd to become unresponsive, resulting in a time out for an rpcinfo-based health check of the status service. Split the add-client/del-client events out to a standalone statd_callout executable, written in C, to be used as the HA callout for rpc.statd. All other functions move to statd_callout_helper. Now, running a pair of add-client/del-client events in my test environment averages only ~0.002s. This seems less likely to cause latency problems. The standalone statd_callout executable needs to read a configuration file, which is generated by statd_callout_helper from the "startup" event. It also needs access to a list of currently assigned public IPs. For backward compatibility, during installation a symlink is created from $CTDB_BASE/statd-callout to the new statd_callout, which is installed in the helper directory. Testing this as part of the eventscript unit tests starts to become even more of a hack than it used to be. However, the dependency on stubs and the corresponding setup of fake state makes it hard to move this elsewhere. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Tue Jun 25 04:24:57 UTC 2024 on atb-devel-224	2024-06-25 04:24:57 +00:00
Martin Schwenke	089aec2885	ctdb-doc: Drop unnecessary, broken attempt at rpc.statd stack trace There is a typo here, since there will be no process called "status". Instead of fixing it, drop this because rpc.statd isn't the focus of this monitoring check and when systemd is init rpc.statd isn't restarted with nfs-ganesha. It stays running, so a confusing stack trace for rpc.statd is always logged. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-06-25 03:16:37 +00:00
Martin Schwenke	707e0ef55b	ctdb-scripts: Fail monitoring after 1 x NFS-Ganesha not running If ganesha.nfsd is gone then a node can't provide an NFS service, so should be marked unhealthy. A later restart may bring it back to health. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-06-25 03:16:37 +00:00
Martin Schwenke	4766d4568b	ctdb-doc: Add example for NFS-Ganesha RPC checking This one does an rpcinfo check, along with statistics mitigation. It can be used in combination with the existing 20.nfs_ganesha.check. The equivalent kernel NFS file only restarts every 10 failures. This one can be a little more proactive given that false positives are less likely with the statistics mitigation. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-06-25 03:16:37 +00:00
Martin Schwenke	d96078e263	ctdb-scripts: Implement NFS statistics retrieval for NFS-Ganesha Simplicity is preferred here over absolute correctness. If the ganesha_stats command exits with an error or times out then no output is produced so, implicitly, the statistics do not change. Also, the statistics always change at startup. However, it is likely that the statistics change when NFS makes progress and do not change when NFS does not make progress. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-06-25 03:16:37 +00:00
Martin Schwenke	5b7d17d44d	ctdb-scripts: Add service_stats_command variable to NFS checks When monitoring an RPC service, the rpcinfo command might time out even though the service is making progress. In this case, it is just slow, so counting the timeout as a failure and potentially restarting the service will not help. The problem is determining if a service is making progress. Add a new NFS checks service_stats_command. This command is intended to run a statistics command. The output is naively compared using cmp(1). If the output changes then rpcinfo failures are converted to successes. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2024-06-25 03:16:37 +00:00
Günther Deschner	35f6c3f3d4	ctdb/docs: Include ceph rados namespace support in man page Document the new optional argument to specify the namespace to be associated with RADOS objects in a pool. Pair-Programmed-With: Anoop C S <anoopcs@samba.org> Signed-off-by: Günther Deschner <gd@samba.org> Reviewed-by: Günther Deschner <gd@samba.org> Reviewed-by: David Disseldorp <ddiss@samba.org> Autobuild-User(master): Anoop C S <anoopcs@samba.org> Autobuild-Date(master): Fri Jun 14 07:42:25 UTC 2024 on atb-devel-224	2024-06-14 07:42:25 +00:00
Günther Deschner	d8c52995f6	ctdb/ceph: Add optional namespace support for mutex helper RADOS objects within a pool can be associated to a namespace for logical separation. librados already provides an API to configure such a namespace with respect to a context. Make use of it as an optional argument to the helper binary. Pair-Programmed-With: Anoop C S <anoopcs@samba.org> Signed-off-by: Günther Deschner <gd@samba.org> Reviewed-by: Günther Deschner <gd@samba.org> Reviewed-by: David Disseldorp <ddiss@samba.org>	2024-06-14 06:40:37 +00:00
Martin Schwenke	e9eb581043	ctdb-scripts: Protect against races when starting grace period While the PID check is worth it in relevant cases, NFS-Ganesha still might go away after the check. Unfortunately, neither grace command fails with an indicative exit code, so invent one by checking error messages. This can then be converted to success by the caller. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Thu May 30 12:50:01 UTC 2024 on atb-devel-224	2024-05-30 12:50:01 +00:00
Martin Schwenke	911117c79a	ctdb-scripts: Check NFS-Ganesha is running before attempting grace If monitoring has failed because it isn't running, then don't fail "startipreallocate" or "releaseip" by trying to go into grace. Don't check this for "takeip". In that case NFS-Ganesha had better be running. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	27c53880c2	ctdb-scripts: Improve service PID check No need to grovel around in /proc. ps will happily tell us the command. Factor out the actual check into a separate function that can be used elsewhere. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	bc10704aec	ctdb-scripts: Improve NFS-Ganesha export path extraction Path values do not need to have quotes. The current code fails if there aren't any. Instead, implement a 2 stage parser using 2 sed commands. See comments in the code for details. Regexps are POSIX basic regular expressions, apart from \<WORD\> (used to ensure WORD is on word boundaries) and the 'i' flag for case insensitivity. The latter is supported in FreeBSD sed. This code successfully parses Path values out of the following monstrosity: path = "/foo/bar1;a"; Path = /foo/bar2; Something = false; Pseudo = "/foo/bar3x" ; Path = "/foo/bar3; y" ; Access_type = RO; Pseudo = "/foo/bar4x" ; path=/foo/bar4; Access_type = RO; Pseudo = "/foo/barNONONO" ; not_Path=/foo/barNONONO; Access_type = RO; Path = /foo/bar5 Pseudo = "/foo/bar6x Path=foo" ; Path=/foo/bar6; Access_type = RO This is probably the best that can be done within a shell script. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	944d9d308d	ctdb-scripts: Add script option CTDB_NFS_EXPORTS_FILE Exports may be contained in an include file rather than the top-level ganesha.conf. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	1be5b1df1b	ctdb-scripts: Fix usage message An IP address is passed to these actions. Reported-by: Arnab Tah <atah@ddn.com> Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	2a3d7c0971	ctdb-scripts: Change NFS-Ganesha PID file location This is the current default. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	a534f71347	ctdb-scripts: Quote variable expansions Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	6ffb73bb55	ctdb-scripts: Reformat with "shfmt -w -p -i 0 -fn" Best reviewed with "git show -w". Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	991d21d075	ctdb-scripts: No longer run statd-callout under sudo This simplifies and removes a bad hack. Also, in my test environment, it drops the average time taken to run an add-client/del-client pair from ~0.055s to ~0.030s. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	ecb9545b3f	ctdb-scripts: Use find_statd_sm_dir() in one more place Take advantage of new function find_statd_sm_dir() when clearing the local system statd state directory, so it uses the correct directory when running on a non-RH distro. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	50b3cebeb3	ctdb-scripts: Set ownership of statd-callout state directory For add-client and del-client, statd-callout is called by rpc.statd, which runs as rpcuser, statd or some other non-root system user. This means that add-client and del-client can't write in the statd-callout state directory if it is only writable by root. rpc.statd must be able to write to its own local system statd state directory, so find this directory and use it as a reference to set the ownership of CTDB's statd-callout state directory. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	608557c6ce	ctdb-scripts: Avoid connecting to ctdbd in add-client/del-client rpc.statd runs statd-callout as a non-root user, which is currently hacked around using some sudo logic that fails to work in some contexts (e.g. in a container). Use $CTDB_MY_PUBLIC_IPS_CACHE to access the node's currently assigned public IPs, for add-client/del-client. This avoids connecting to ctdbd when called from rpc.statd. Also, use $CTDB_MY_PUBLIC_IPS_CACHE in other places where it makes sense. Connections to ctdbd are still made in the "notify" action, but this is always run as root. In the test code, set the PNN after public addresses setup so that the cache of assigned IPs is correctly initialised. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	5a4209b713	ctdb-tests: Default PNN is 0 This is called in a couple of places without an argument, so give it a default. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	ed3f041c30	ctdb-scripts: Add caching function for public IPs This is way more complicated than I would like but, as per the comment, this is due to complexities in the way public IPs work. The main consumer will be statd-callout, which will then be able to run as a non-root user. Also generate the cache file in test code, whenever the PNN is set. However, this can cause "ctdb ip" to generate a fake IP layout before public IPs are set up. So, have the "ctdb ip" stub generate the IP layout every time it is run to avoid it being stale. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	558cf280b2	ctdb-scripts: Move state directory creation to "startup" action Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	d02fb20d79	ctdb-scripts: Avoid globally changing to queue directory Add new variables statd_callout_state_dir and statd_callout_queue_dir - the latter is for files queued by add-client/del-client. Use $statd_callout_queue_dir to avoid a global cd to the queue directory near the top of the script. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	b90d72c7b8	ctdb-scripts: Move ctdb.tdb attach to statd-callout All of the other uses of ctdb.tdb are in statd-callout. New variable statd_callout_db makes it easy to change the database name in future, perhaps even allowing it to be configurable. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	c912e406c1	ctdb-scripts: Reformat with shfmt -w -p -i 0 -fn Tweak some lines to avoid overflowing 80 columns. Best viewed with "git show -w". Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	7b24cc032e	ctdb-scripts: Improve documentation Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	5176b43da7	ctdb-scripts: Avoid ShellCheck warning SC2162 SC2162 read without -r will mangle backslashes. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Martin Schwenke	5401522380	ctdb-scripts: Reformat with "shfmt -w -p -i 0 -fn" Best reviewed with "git show -w". Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-05-30 11:42:30 +00:00
Jo Sutton	82224fca78	ctdb: Report errors from getline() Signed-off-by: Jo Sutton <josutton@catalyst.net.nz> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-04-24 05:16:29 +00:00
Jo Sutton	f9309c221b	ctdb: Ensure ‘ret’ is always initialized This avoids a compilation error: ../../ctdb/protocol/protocol_util.c: In function ‘ctdb_connection_list_read’: ../../ctdb/protocol/protocol_util.c:787:9: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 787 | return ret; | ^~~ Signed-off-by: Jo Sutton <josutton@catalyst.net.nz> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-04-24 05:16:29 +00:00
Martin Schwenke	0159c48e89	ctdb-scripts: Do not de-duplicate the interfaces list Using xargs with sort -u to de-duplicate this list was my idea and causes a couple of things to go wrong. The use of xargs causes double-quotes to be lost. The resulting $public_ifaces value also contains newlines. The newlines could be removed with an additional xargs at the end of the pipeline... but that would add an extra level of quote stripping. I have unsuccessfully tried to find an alternative, but still elegant, command pipeline that de-duplicates the list, while maintaining quoting. So, just drop the de-duplication. This might make interface_ifindex_exists_with_options() slightly less efficient. However, that function walks the whole list, only terminating early when a match is found on both interface and options, so at least it will be correct. Include an extra testcase. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Thu Apr 18 09:08:34 UTC 2024 on atb-devel-224	2024-04-18 09:08:34 +00:00
Volker Lendecke	7e621b1b53	ctdb: Modernize a few DEBUGs Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Apr 17 00:54:55 UTC 2024 on atb-devel-224	2024-04-17 00:54:55 +00:00
Volker Lendecke	73e806c559	ctdb: Remove common/line.[ch] This was an implementation of getline(3), use that instead. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2024-04-16 23:51:45 +00:00
Volker Lendecke	ba8f8ef33c	ctdb: Use stdio's getline() in ctdb_connection_list_read() This is the only user of common/line.[ch], which can go next. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2024-04-16 23:51:45 +00:00
Volker Lendecke	0baae61e42	lib: Give lib/util/util_file.c its own header file Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2024-04-16 23:51:45 +00:00
Vinit Agnihotri	f42c5802fa	ctdb-scripts: Add options to generate smb.conf interfaces include file Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-04-16 23:51:45 +00:00
Vinit Agnihotri	56eeb058d2	ctdb-scripts: Rename and relocate function get_all_interfaces() The get_all_interfaces() function gets the names of all public interfaces. However, the name is misleading. Thus it is renamed to get_public_ifaces() and moved under functions. Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-04-16 23:51:45 +00:00
Volker Lendecke	a3e186b617	lib: Remove timeval_until() We have the same function in tevent, no need to duplicate code. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-03-22 05:03:35 +00:00
Volker Lendecke	78208d4fe4	ctdb: Remove an unnecessary cast nl->srvid is uint64_t, as is the srvid parameter of ctdb_daemon_send_message() Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Wed Mar 13 08:43:16 UTC 2024 on atb-devel-224	2024-03-13 08:43:16 +00:00
Vinit Agnihotri	6005de8cb3	ctdb-scripts: Remove usage of releaseip-pre, takeip-pre pseudo-events These were generated by 06.nfs.script. Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Wed Mar 6 07:09:06 UTC 2024 on atb-devel-224	2024-03-06 07:09:06 +00:00
Vinit Agnihotri	2de2d5dd20	ctdb-scripts: Remove unnecessary 06.nfs.script Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-03-06 06:05:38 +00:00
Vinit Agnihotri	e3294e5526	ctdb-doc: Put NFS in grace on startipreallocate Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-03-06 06:05:38 +00:00
Vinit Agnihotri	34c76ffec5	ctdb-doc: Factor out grace period function Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-03-06 06:05:38 +00:00
Vinit Agnihotri	9631e3569d	ctdb-client: Remove unused function Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-03-06 06:05:38 +00:00
Vinit Agnihotri	a4e492f728	ctdb-scripts: Add handling for startipreallocate Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-03-06 06:05:38 +00:00
Vinit Agnihotri	7dacbcd0ec	ctdb: send a CTDB_SRVID_START_IPREALLOCATE message after CTDB_EVENT_START_IPREALLOCATE Event scripts run the "start_ipreallocate" hook in order to notice that some ip addresses in the cluster potentially changed. CTDB_SRVID_START_IPREALLOCATE gives C code a chance to get notified as well once the event scripts are finished. Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-03-06 06:05:38 +00:00

9246 Commits