samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-22 13:34:15 +03:00

Author	SHA1	Message	Date
Martin Schwenke	8d04235f46	ctdb-common: Add trivial FD monitoring abstraction Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	f9467cdf3b	ctdb-build: Link in backtrace support for ctdb_util_tests Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	7a1c43fc74	ctdb-build: Separate test backtrace support into separate subsystem A convention when testing members of ctdb-util is to include the .c file so that static functions can potentially be tested. This means that such tests can't be linked against ctdb-util or duplicate symbols will be encountered. ctdb-tests-common depends on ctdb-client, which depends in turn on ctdb-util, so this can't be used to pull in backtrace support. Instead, make ctdb-tests-backtrace its own subsystem. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	b195e8c0d0	ctdb-build: Sort sources in ctdb-util and ctdb_unit_tests Also, rename ctdb_unit_tests to ctdb_util_tests. The sorting makes it clear that only items from ctdb-util are tested here. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	3efa56aa61	ctdb-daemon: Fix printing of tickle ACKs Commit `f5a2037734` arguably got this back-to-front: 2022-07-27T09:50:01.985857+10:00 testn1 ctdbd[17820]: ../../ctdb/server/ctdb_takeover.c:514 sending TAKE_IP for '10.0.1.173' 2022-07-27T09:50:01.990601+10:00 testn1 ctdbd[17820]: Send TCP tickle ACK: 10.0.1.77:33004 -> 10.0.1.173:2049 2022-07-27T09:50:01.991323+10:00 testn1 ctdb-takeover[19758]: TAKEOVER_IP 10.0.1.173 succeeded on node 0 Unfortunately there is an inconsistency somewhere in the connection tracking code used for tickle ACKs, making this less than obvious. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Thu Jul 28 09:02:08 UTC 2022 on sn-devel-184	2022-07-28 09:02:08 +00:00
Martin Schwenke	30c40046ef	ctdb-build: Add missing dependency on talloc The include isn't strictly necessary, since it is included via common/reqid.c anyway. However, it is a useful hint. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Jul 22 17:01:00 UTC 2022 on sn-devel-184	2022-07-22 17:01:00 +00:00
Martin Schwenke	e831af7b25	ctdb-tests: Work around unreadable file test failure when root root can read files for which the mode prohibits reading, so this test case fails when run as root. Work around this when running as root. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	b20ccaa36d	ctdb-scripts: Use "git config" as last resort to parse nfs.conf Some versions of nfs-utils (e.g. recent CentOS 7) use /etc/nfs.conf but do not include the nfsconf utility to extract values from the file. However, git has an excellent conf file parser, so use it as a last resort. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	db37043bc5	ctdb-scripts: Avoid ShellCheck warning SC2295 For example: In /home/martins/samba/samba/ctdb/tools/onnode line 304: [ "$nodes" != "${nodes%[ ${nl}]}" ] && verbose=true ^---^ SC2295 (info): Expansions inside ${..} need to be quoted separately, otherwise they match as patterns. Did you mean: [ "$nodes" != "${nodes%[ "${nl}"]}" ] && verbose=true For more information: https://www.shellcheck.net/wiki/SC2295 -- Expansions inside ${..} need to b... Who knew? Thanks ShellCheck! Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	00f1d6d947	ctdb-common: Use POSIX if_nameindex() to check interface existence This works as an unprivileged user, so avoids unnecessary errors when running in test mode (and not as root): 2022-02-18T12:21:12.436491+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket 2022-02-18T12:21:12.436534+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket 2022-02-18T12:21:12.436557+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket 2022-02-18T12:21:12.436577+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket The corresponding porting test would now become pointless because it would just confirm that "fake" does not exist. Attempt to make it useful by using a less likely name than "fake" and attempting to detect the loopback interface. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	c77a4fde7a	ctdb-daemon: Modernise debug in ctdb_add_public_address() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	d62fcba7dc	ctdb-daemon: Avoid spurious error sending ARPs for released IP A public IP address can be released in between (and probably before) attempts to send ARPs. One situation when this can occur is when a cluster is shutting down: node A shuts down first, public IPs from node A are taken over by node B, node B is shutdown. Notice this when it occurs and cancel further attempts to send ARPs. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	f5a2037734	ctdb-daemon: Modernise debug in ctdb_control_send_arp() For the tickle ACK logging, render the connection in a buffer. This produces more complete information. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	ec5f6425b7	ctdb-protocol: Add separator argument to ctdb_connection_to_buf() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	440bd86a99	ctdb-daemon: Drop unused ban_state element from CTDB node structure Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	9898e7c555	ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	19fbc2da38	ctdb-recoverd: Add pnn field to banning state structure This structure is now standalone, so indexing by PNN can be avoided via a subsequent commit. Index by culprit here to make this commit simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	0b5dd07604	ctdb-recoverd: Add function node_flags() and use it in elections Indexing a node map by PNN is suboptimal. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	e396eb9fbc	ctdb-scripts: Only run unhealthy call-out when passing threshold For memory usage, no need to dump all of this data on every failed monitor event. The first call will be enough to diagnose the problem. The node will then go unhealthy, drop clients and memory usage should then drop. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Jul 22 07:32:54 UTC 2022 on sn-devel-184	2022-07-22 07:32:54 +00:00
Martin Schwenke	36bd6fd01f	ctdb-scripts: Always check memory usage If filesystem usage exceeds the unhealthy threshold then memory usage checking is not done. Always do them both. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	5e7bbcb069	ctdb-scripts: Avoid ShellCheck info SC2162 SC2162 (info): read without -r will mangle backslashes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	dc7aaca889	ctdb-scripts: Reduce length of very long lines Use printf to allow easier line breaks and use some early returns. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	fc485feae8	ctdb-scripts: De-clutter validate_percentage() It always takes 2 arguments. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	a832c8e273	ctdb-scripts: Reformat using shfmt -w -p -i 0 -fn About to modify this file, so reformat first as per recent Samba convention. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	3df39aa7fb	ctdb-scripts: Avoid ShellCheck warning SC2164 SC2164 (warning): Use 'cd ... \|\| exit' or 'cd ... \|\| return' in case cd fails. A problem can only occur if /etc/ctdb/ or an important subdirectory is removed, which means the script itself would not be found. Use && to silence ShellCheck. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	be293a125f	ctdb-tests: Add new tool unit tests to cover UNKNOWN state Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Jun 28 10:16:59 UTC 2022 on sn-devel-184	2022-06-28 10:16:59 +00:00
Vinit Agnihotri	794f125802	ctdb-tool: Add UNKNOWN pseudo state When a node is starting, CTDB reports remote nodes as unhealthy by default. This can be misleading. To hide this, report an "UNKNOWN" pseudo state when a remote node is not disconnected and the runstate is less than or equal to "FIRST_RECOVERY". Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-28 09:24:31 +00:00
Vinit Agnihotri	428bc71f98	ctdb-tests: Add runstate handling to fake ctdbd Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-28 09:24:31 +00:00
Martin Schwenke	05601cebc9	ctdb-tests: Return error on empty fake ctdbd configuration blocks These would be unintended errors. The block should be omitted to keep the default value. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-28 09:24:31 +00:00
Martin Schwenke	80ba66013e	ctdb-scripts: Drop use of eval in CTDB callout handling eval is not required and causes the following ShellCheck warning: SC2294 (warning): eval negates the benefit of arrays. Drop eval to preserve whitespace/symbols (or eval as string). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Jun 24 10:40:50 UTC 2022 on sn-devel-184	2022-06-24 10:40:50 +00:00
Martin Schwenke	4cbb0b13ba	ctdb-tests: Do not require eval tricks for faking NFS callout The current code requires the use of eval in the NFS callout handling to facilitate testing. Improve the code to remove this need. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:33 +00:00
Martin Schwenke	0247fd8a02	ctdb-scripts: Avoid ShellCheck warning SC2162 SC2162 read without -r will mangle backslashes Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:33 +00:00
Martin Schwenke	7f799a8d6f	ctdb-tests: Fix faking of program stack traces The current code works in all current cases but is lazy and wrong. Fix it to avoid breaking on code changes involving different thread setups. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:33 +00:00
Martin Schwenke	0b728a4e8f	ctdb-tests: Improve Debian-style event script unit testing Tests can be run by hand using different distro styles, such as: CTDB_NFS_DISTRO_STYLE=systemd-debian \ ./tests/run_tests.sh ./tests/UNIT/eventscripts/{06,60}.nfs.* This fixes known problems for Debian styles, so the tests now pass for the following values of CTDB_NFS_DISTRO_STYLE: systemd-redhat sysvinit-redhat systemd-debian sysvinit-debian Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:33 +00:00
Martin Schwenke	7f3a0c7e9c	ctdb-scripts: Parameterise /etc directory to aid testing At the moment test results can be influenced by real system configuration files. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	337ef7c1b4	ctdb-scripts: Set NFS services to "AUTO" if started by another service For example, in Sys-V init "rquotad" is started by the main "nfs" service. At the moment the call-out can't distinguish between this case and "should never be run". Services set to "AUTO" are hand-stopped/started via service_stop()/service_start() on failure via restart_after. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	8b8660d883	ctdb-scripts: Refactor the manual RPC service start/stop This logic needs improving, so factor the decision making into new functions service_or_manual_stop() and service_or_manual_start(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	cd018d0ff5	ctdb-scripts: Simplify and rename basic_stop() and basic_start() Drop the argument. These now just stop/start the overall NFS service, so rename them appropriately. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	09fd1e5579	ctdb-scripts: Move nfslock out of basic_stop() and basic_start() These are only called in one place and should be done inline, since that is less confusing. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	a43a1ebe51	ctdb-tests: Reformat script Samba is reformatting shell scripts using shfmt -w -p -i 0 -fn so update this one before editing. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	e752f841e6	ctdb-daemon: Use DEBUG() macro for child logging Directly using dbgtext() with file logging results in a log entry with no header, which is wrong. This is a regression, introduced in commit `10d15c9e5d`. Prior to this, CTDB's callback for file logging would always add a header. Use DEBUG() instead of dbgtext(). Note that DEBUG() effectively compares the passed script_log_level with DEBUGLEVEL, so an explicit check is no longer necessary. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15090 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Thu Jun 16 13:33:10 UTC 2022 on sn-devel-184	2022-06-16 13:33:10 +00:00
Martin Schwenke	88f35cf862	ctdb-daemon: Drop unused prefix, logfn, logfn_private These aren't set anywhere in the code. Drop the log argument because it is also no longer used. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15090 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org>	2022-06-16 12:42:35 +00:00
Martin Schwenke	1596a3e84b	ctdb-common: Tell file logging not to redirect stderr This allows ctdb_set_child_logging() to work. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15090 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org>	2022-06-16 12:42:35 +00:00
Martin Schwenke	b20ee18031	ctdb-tests: Fix a cut and paste error in a comment Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue May 31 05:56:43 UTC 2022 on sn-devel-184	2022-05-31 05:56:43 +00:00
Martin Schwenke	90a96f06a9	ctdb-recoverd: Do not ban on unknown error when taking cluster lock If the cluster filesystem is unavailable then I/O errors may occur. This is no worse than contention, so don't ban. This avoids having services unavailable for longer than necessary. Update the associated test to simply confirm that this results in a leaderless cluster, and leadership is restored when the lock can once again be taken. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-31 05:06:29 +00:00
Martin Schwenke	a400f4e7cc	ctdb-doc: Fix typos in the policy routing documentation Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-31 05:06:29 +00:00
Martin Schwenke	da9decfc5e	ctdb-daemon: Remove unused #includes of rb_tree.h ctdb_takeover.c and eventscript.c no longer use this. ipalloc_common.c has never used it. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-31 05:06:29 +00:00
Martin Schwenke	80de84d36e	ctdb-daemon: Log per-database summary of resent calls After a recovery that takes a significant amount of time the logs are flooded with messages about every resent call. Log a summary instead and demote per-call messages to INFO level. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-31 05:06:29 +00:00
Pavel Filipenský	8cb6565011	ctdb: Covscan: unchecked return value for trbt_traversearray32() Signed-off-by: Pavel Filipenský <pfilipen@redhat.com> Reviewed-by: Jeremy Allison <jra@samba.org> Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>	2022-05-14 03:49:32 +00:00
Pavel Filipenský	91d1d0e4c8	ctdb: Fix trailing whitespace in rb_tree.c Signed-off-by: Pavel Filipenský <pfilipen@redhat.com> Reviewed-by: Jeremy Allison <jra@samba.org> Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>	2022-05-14 03:49:32 +00:00
Martin Schwenke	64275fc1a2	ctdb-tests: Add backtrace on abort to some tests These are easier to debug with a backtrace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue May 3 10:13:23 UTC 2022 on sn-devel-184	2022-05-03 10:13:23 +00:00
Martin Schwenke	d39377d6fc	ctdb-tests: Provide a method to dump the stack on abort Some tests make generous use of assert() and it can be difficult to guess the cause of failures without resorting to GDB. This provides some help. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	73b27def7b	build: Add missing ctdb-client dependencies Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	d57d624a77	ctdb-build: Drop unnecessary uses of include/ sub-directory None of these include any files from the include/ sub-directory. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	6d3c9e64d9	ctdb-tests: Use test_case() to help document test cases Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	d52b497d11	ctdb-locking: Don't pass NULL to tevent_req_is_unix_error() If there is an error then this pointer is unconditionally dereferenced. However, the only possible error appears to be ENOMEM, where a crash caused by dereferencing a NULL pointer isn't a terrible outcome. In the absence of a security issue this is probably not worth backporting. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	490e5f4d4c	ctdb-mutex: Don't pass NULL to tevent_req_is_unix_error() If there is an error then this pointer is unconditionally dereferenced. However, the only possible error appears to be ENOMEM, where a crash caused by dereferencing a NULL pointer isn't a terrible outcome. In the absence of a security issue this is probably not worth backporting. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	8deec3bc67	ctdb-scripts: Drop unused ctdbd_wrapper Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	a1e78cc372	ctdb-scripts: Drop uses of ctdbd_wrapper The only value this now provides is use of a notification script to log when start/stop are called. This was used for debugging strange start/stop failures, which have not been recently seen. Also, systemd does a good job of logging start/stop. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	aca5972233	ctdb-scripts: Remove failsafe that drops all IPs on failed shutdown IPs are dropped in the shutdown event. If a watchdog is necessary to ensure public IPs aren't on interfaces when CTDB isn't running, then see ctdb-crash-cleanup.sh. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	6fb08a6580	ctdb-daemon: Don't release all public IPs during shutdown sequence This further untangles public IP handling from the main daemon. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	cb438ecfd4	ctdb-scripts: Drop all public IPs in the "shutdown" event This is functionally the same as ctdb_release_all_ips(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	3caddaafa0	ctdb-config: Drop CTDB_STARTUP_TIMEOUT This was added to be able to notice startup failures when unknown tunables were present in the configuration. Tunables are now set by the daemon, so this is no longer necessary. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	208034ecfe	ctdb-doc: Update documentation for tunables configuration Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	0902553d15	ctdb-scripts: No longer load tunables via 00.ctdb.script setup event Drop related tests. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	f49446cb1e	ctdb-daemon: Load tunables from ctdb.tunables Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	a509ee059e	ctdb-daemon: New function ctdb_tunables_load() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	b14f2a205d	ctdb-tests: Add unit tests for tunables code This aims to test ctdb_tunable_load_file() but also exercises ctdb_tunable_names() and ctdb_tunable_get_value(). ctdb_tunable_set_value() is indirectly exercised via ctdb_tunable_load_file(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	381134939b	ctdb-tests: Add function test_case(), tweak unit test header format Instead of documenting test cases with a comment, this allows them to be documented via an argument to a function that is printed when the test case is run. This makes it easier to locate test case failures when commands used by test cases look similar. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	c413838f79	ctdb-tests: Strip trailing newlines from expected result output This allows the provided output to be specified a little more carelessly. As per the comment, trailing newlines can't be matched anyway, so this is notionally a bug fix. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	5fa0c86b61	ctdb-tests: Reformat script Samba is reformatting shell scripts using shfmt -w -p -i 0 -fn so update this one before editing. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	bcd66e17ee	ctdb-common: Add function ctdb_tunable_load_file() Allows direct loading of tunables from a file. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Vinit Agnihotri	93824b8c33	packaging: move CTDB service file to top-level Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	2f6b31788b	ctdb-packaging: Move RPM spec file to examples directory We used to use this for building test packages for standalone CTDB. However, our testing has now changed to use binary tarballs. We believe we were the only users of this spec file and expect CTDB to only be installed as part of a top-level Samba build, especially in RPM form. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Stefan Metzmacher	aa02cf3c44	ctdb/packaging/RPM: don't use waf directly ./configure && make && make install will always work. Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2022-03-29 22:32:32 +00:00
Stefan Metzmacher	22c46d9f41	configure/Makefile: export PYTHONHASHSEED=1 in all 'configure/Makefile' scripts Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2022-03-29 22:32:32 +00:00
Archana	7debfe7a23	ctdb-tools: Remove deprecated networking commands and replace with new commands The changes are made to replace the deprecated network commands (ifconfig,netstat) with the new commands (ip addr,ss) respectively Signed-off-by: Archana Chidirala <archana.chidirala.chidirala@ibm.com> Reviewed-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Tue Mar 8 12:30:53 UTC 2022 on sn-devel-184	2022-03-08 12:30:53 +00:00
Archana	e16cd0316f	ctdb-packaging: Remove deprecated networking command netstat and replace with "ss" command Signed-off-by: Archana Chidirala <archana.chidirala.chidirala@ibm.com> Reviewed-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2022-03-08 11:32:36 +00:00
Martin Schwenke	0d8084ed62	ctdb-protocol: CID 1499395: Uninitialized variables (UNINIT) Issue is reported here: 853 case CTDB_CONTROL_DB_VACUUM: { 854 struct ctdb_db_vacuum db_vacuum; 855 >>> CID 1499395: Uninitialized variables (UNINIT) >>> Using uninitialized value "db_vacuum.full_vacuum_run" when calling "ctdb_db_vacuum_len". 856 CHECK_CONTROL_DATA_SIZE(ctdb_db_vacuum_len(&db_vacuum)); 857 return ctdb_control_db_vacuum(ctdb, c, indata, async_reply); 858 } The problem is that ctdb_bool_len() unnecessarily dereferences its argument, which in this case is &db_vacuum.full_vacuum_run. Not a security issue because the value copied by dereferencing is not used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Feb 23 02:02:06 UTC 2022 on sn-devel-184	2022-02-23 02:02:06 +00:00
Martin Schwenke	0f373443ef	ctdb-tests: Fix missing #include for sigaction(2) Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-23 01:08:37 +00:00
Martin Schwenke	ef9017a150	ctdb-tests: Dump a stack trace on abort Debugging a test failure here without GDB is not possible. Dumping a stack trace gives a good hint. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-23 01:08:37 +00:00
Martin Schwenke	17d792e9aa	ctdb-tests: Iterate protocol tests internally Instead of repeatedly running a test binary. Run time for these tests reduces from ~90s to ~75s. When run under valgrind, the run time for protocol_test_001.sh reduces from ~390s to <1s. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Feb 14 04:32:29 UTC 2022 on sn-devel-184	2022-02-14 04:32:29 +00:00
Martin Schwenke	2329305019	ctdb-tests: Add iteration support for protocol tests The current method of repeatedly running a binary has huge overhead, especially with valgrind. protocol_test_iterate_tag() allows output that is usually used for hinting where a test failure occurred to be replaced with a tag stored in a buffer, which is printed on test failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 03:36:38 +00:00
Martin Schwenke	331c435ce5	ctdb-tests: Add a test for stalled node triggering election A stalled node probably continues to hold the cluster lock, so confirm elections work in this case. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Feb 14 02:46:01 UTC 2022 on sn-devel-184	2022-02-14 02:46:01 +00:00
Martin Schwenke	265e44abc4	ctdb-tests: Factor out functions to detect when generation changes BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 01:47:31 +00:00
Martin Schwenke	0e74e03c9c	ctdb-recoverd: Consistently log start of election Elections should now be quite rare, so always log when one begins. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 01:47:31 +00:00
Martin Schwenke	bf55a0117d	ctdb-recoverd: Always send unknown leader broadcast when starting election This is currently missed when the cluster lock is lost. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 01:47:31 +00:00
Martin Schwenke	9b3fab052b	ctdb-recoverd: Consistently have caller set election-in-progress The problem here is that election-in-progress must be set to potentially avoid restarting the election broadcast timeout in main_loop(), so this is already done by leader_handler(). Have force_election() set election-in-progress for all election types and do not bother setting it in cluster_lock_election(). BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 01:47:31 +00:00
Martin Schwenke	188a902156	ctdb-recoverd: Always cancel election in progress Election-in-progress is set by unknown leader broadcast, so needs to be cleared in all cases when election completes. This was seen in a case where the leader node stalled, so didn't send leader broadcasts for some time. The node continued to hold the cluster lock, so another node could not become leader. However, after the node returned to normal it still did not send leader broadcasts because election-in-progress was never cleared. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 01:47:31 +00:00
Martin Schwenke	f7de2132bb	ctdb-doc: Remove documentation for recovery process This is many years out of date and recent changes make it worse. It is unlikely that anyone has the time to fix this in the near future, so remove it because it is misleading. Database recovery steps are well documented in comments in the recovery helper. Cluster monitoring documentation can be re-added when things stop changing. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	a940ad9370	ctdb-doc: Update example configuration migration script Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	01313ea243	ctdb-tests: Improve test coverage for leader role yield and elections Rename test, clean up node selection. Duplicate for banning and removing leader capability cases. Repeat all 3 tests without cluster lock. All of the standard election triggers are now tested, with and without cluster lock. Due to test cluster configuration limitations, the tests without cluster lock are skipped on a real cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	5d31778149	ctdb-tests: Support commenting out local daemons configuration options Can be used to disable default options, such as cluster lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	34d2ca0ae6	ctdb-config: Add configuration option [cluster] leader timeout Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	1dfb266038	ctdb-config: [legacy] recmaster capability -> [cluster] leader capability Rename this configuration item and move it into the [cluster] configuration section. Update documentation to match. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	f5a39058f0	ctdb-config: [cluster] recovery lock -> [cluster] cluster lock Retain "recovery lock" and mark as deprecated for backward compatibility. Some documentation is still inconsistent. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	d752a92e11	ctdb-doc: Update documentation for leader and cluster lock Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	73555e8248	ctdb-recoverd: Use race for cluster lock as election when lock is enabled If the cluster is partitioned then nodes in one partition can not take the lock anyway, so election is pointless. It just introduces unnecessary corner cases. Instead just race for the lock. When a node notices a lack of leader and notifies other nodes of an election via an unknown leader broadcast, the cluster lock election is hooked into this broadcast. The test needs to be updated because losing the cluster lock can now result in a leadership change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	938d64c8ff	ctdb-protocol: Mark {GET,SET}_RECMASTER controls obsolete Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	03ae158cff	ctdb-protocol: Drop marshalling for {GET,SET}_RECMASTER controls Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	a76374070d	ctdb-daemon: Drop implementation of {GET,SET}_RECMASTER controls Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	193b624d26	ctdb-protocol: Drop protocol client functions for recmaster controls Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	cda673ff6d	ctdb-client: Drop unused recmaster functions Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	16efbca003	ctdb-daemon: Drop unused old client recmaster functions Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	c68267b2a6	ctdb-recoverd: Drop calls to ctdb_ctrl_setrecmaster() Nothing fetches this value anymore. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	58d7fcdf7c	ctdb-recoverd: Drop recovery master verification This doesn't make sense if leader broadcasts are used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	f02e097485	ctdb-tools: recovery master -> leader The following command names are changed: recmaster -> leader setrecmasterrole -> setleaderrole Command output changed for the following commands: status getcapabilities Documentation and tests are updated to reflect these changes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	e60581d5b5	ctdb-tools: Use leader broadcast in get_leader() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	92fb68e9b8	ctdb-tools: Factor out get_leader() This seems pointless but it localises a subsequent change and also starts a terminology change in the tool code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	17ba15ccd8	ctdb-tools: Handle leader broadcasts in ctdb tool Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	ec90f36cc6	ctdb-tools: Print "UNKNOWN" when leader PNN is unknown Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	01a8d1a4a4	ctdb-client: Factor out function ctdb_client_wait_func_timeout() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	403db5b528	ctdb-tests: Factor out getting leader and waiting for leader change Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	4786982cc8	ctdb-tests: Add leader broadcasts to fake_ctdbd Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Amitay Isaacs	756dfdfed9	ctdb-tests: Implement srvid_handler for dispatching messages Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2022-01-17 10:21:33 +00:00
Martin Schwenke	958746f947	ctdb-recoverd: Simplify some stopped/banned checks to inactive checks Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	358c59f51a	ctdb-recoverd: No longer take cluster lock during recovery Confirm instead that it is already held. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	36ffaaa691	ctdb-recoverd: Add and use function cluster_lock_enabled() Now all references to ctdb->recovery_lock are encapsulated in the cluster lock code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	5ee664ee17	ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	0f2250f4f9	ctdb-recoverd: Take cluster lock when election completes It is no longer just a recovery lock but is always held by the cluster leader. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	011e880002	ctdb-recoverd: Factor out function cluster_lock_take() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	037abf8620	ctdb-tests: Avoid a race See the comment in the code for details. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	ef7e3265f7	ctdb-tests: Setup cluster with expected arguments ctdb_test_init() doesn't actually pass arguments to local_daemons.sh. This needs to be done using ctdb_nodes_start_custom(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	b029ca4d51	ctdb-recoverd: Drop leader validation The introduction of the leader broadcast timeout provides an alternative to the current leader validation. Using the leader broadcast may not be as fast but it is more correct. When the leader node is stopped or banned, the only way of triggering an election is currently to fetch the leader's node map to check whether it is still active. This is because the leader will no longer push the node map to other nodes. However, having all nodes fetch the node map from an inactive leader may be unreliable. Most of the other cases are also handled more reliably by the leader broadcast timeout. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	7e53fab0a3	ctdb-recoverd: Drop special case for elected-before-connected This no longer occurs at startup due to the leader broadcast timeout. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	ef4b8c13c0	ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particularly important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	5c7f6da0f0	ctdb-recoverd: Send leader broadcasts These are triggered on a 1 second timer, but are only sent if the node is the current leader and there is no election underway. If this node can not be the leader then ensure it releases the recovery lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	789a75abfa	ctdb-recoverd: Process leader broadcasts Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	3d3767a259	ctdb-protocol: Add CTDB_SRVID_LEADER CTDB_SRVID_LEADER will be regularly broadcast to all connected nodes by the leader. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	c2cfd9c21a	ctdb-recoverd: Add an explicit flag for election in progress An alternate election method will be added that doesn't use the election timeout, so this provides a common way for recognising when an election is in progress. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	ac5a3ca063	ctdb-recoverd: Only start election if node can be leader Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	7baadfe27e	ctdb-recoverd: Add and use function this_node_can_be_leader() This makes the code self-documenting. In ctdb_election_data() there is a slight behaviour change. An inactive node will now try to lose an election. This case should not happen because: * An inactive node can't win an election round and then send a reply. * Any inactive node should never start an election. There are currently places where this happens and they will be fixed later. There is an instance where this could be used in validate_recovery_master() but this involves a more serious logic change. Overhaul this function later. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	94b546c268	ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	dd79e9bd14	ctdb-recoverd: Rename recmaster field to leader Recovery master is being renamed to leader. This follows clustering best practice (e.g. RAFT). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	2ee6763c7d	ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	4af3b10a37	ctdb-recoverd: Change argument to srvid_disable_and_reply() Reduce dependency on struct ctdb_context internals, enable a subsequent change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	b7c138ca99	ctdb-recoverd: Simplify arguments to ctdb_ban_node() ban_time argument is always ctdb->tunable.recovery_ban_period, so build this in and make the calling code more readable. ctdb_ban_node() already logs how long a node is banned for, so don't repeatedly log this. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	a5e0ddac62	ctdb-recoverd: Simplify arguments to verify_local_ip_allocation() All other arguments are available via rec, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	67b5191640	ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	57882beb16	ctdb-recoverd: Simplify arguments to some election functions The pnn and nodemap arguments to force_election() and send_election_request() are always effectively rec->pnn and rec->nodemap, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	9dbe7cc85e	ctdb-recoverd: Add PNN to recovery daemon context This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? The intention is to always use rec->pnn when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	ff0140e470	ctdb-recoverd: Use this_node_is_leader() in an extra context This is arguably clearer. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	c8721d01c6	ctdb-recoverd: Factor out and use function this_node_is_leader() Make the code self-documenting. This preempts an upcoming change to terminology but doing it now saves a lot of churn. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	57a32cebdd	ctdb-recoverd: Pass SIGHUP to running helper The recovery and takeover helpers can run for a while and generate non-trivial logs, so have them reopen their logs to support log rotation. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Jan 17 04:36:30 UTC 2022 on sn-devel-184	2022-01-17 04:36:30 +00:00
Martin Schwenke	8e949a6082	ctdb-recoverd: Record helper PID in recovery daemon context Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	97a45f6f25	ctdb-recoverd: Add log reopening on SIGHUP to helpers Recovery and takeover helpers can run for a while and generate non-trivial logs. They should support log reopening. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	51f0380e83	ctdb-daemon: Enable log reopening for event daemon Add and call hook to pass on SIGHUP to eventd. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	4f14d7c0b9	ctdb-event: Reopen logs on SIGHUP Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	c554a325fe	ctdb-daemon: Enable log reopening for recovery daemon Pass on a SIGHUP to the recovery daemon, which will then reopen its logs. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	4acfefed61	ctdb-recoverd: Add basic log reopening Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	4ed37de82b	ctdb-daemon: Add basic top-level log reopening Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	7277385390	ctdb-common: Add support for reopening logs Now that CTDB uses Samba's file logging it is possible to reopen the logs, so that log rotation can be supported. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	d0a19778cd	ctdb-common: Separate sock_daemon's SIGHUP and SIGUSR1 handling SIGHUP is for reopening logs, SIGUSR1 is for reconfigure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	10d15c9e5d	ctdb-common: Use Samba's DEBUG_FILE logging This has support for log rotation (or re-opening). The log format is updated to use an RFC5424 timestamp and to include a hostname. The addition of the hostname allows trivial merging of log files from multiple cluster nodes. The hostname is faked from the CTDB_BASE environment variable during testing, as per the comment in the code. It is currently faked in a similar manner in local_daemons.sh when printing logs, so drop this. Unit tests need updating because stderr logging no longer produces a "PROGNAME[PID]: " header. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	666a048707	ctdb-common: Switch initial debug type to DEBUG_DEFAULT_STDERR This can be overridden by DEBUG_FILE, whereas DEBUG_STDERR can not. Signed-off-by: Martin Schwenke <martin@meltin.net> Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00
Martin Schwenke	7163846a49	ctdb-protocol: Print IPv6 sockets with RFC5952 "[2001:db8::1]:80" notation RFC5952 says the existing style is not recommended and the [] style should be employed. There are more optimised ways of adding the square brackets but they tend to be uglier. Parsing IPv6 sockets without [] is now tested indirectly by parsing examples in both styles and comparing the results. Signed-off-by: Martin Schwenke <martin@meltin.net> Signed-off-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Thu Jan 13 17:02:21 UTC 2022 on sn-devel-184	2022-01-13 17:02:21 +00:00
Martin Schwenke	255fe69c90	ctdb-tests: Add extra IPv6 socket parsing tests Add tests to confirm that square brackets are handled and that IPv4-mapped IPv6 addresses are parsed as expected. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org>	2022-01-13 16:13:38 +00:00
Volker Lendecke	224e99804e	ctdb-protocol: Allow rfc5952 "[2001:db8::1]:80" ipv6 notation Bug: https://bugzilla.samba.org/show_bug.cgi?id=14934 Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2022-01-13 16:13:38 +00:00
Volker Lendecke	820b0a63cc	ctdb-protocol: Save 50 bytes .text segment Having this as a small static .text is simpler than having to create this on the stack. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2022-01-13 16:13:38 +00:00
Volker Lendecke	baaedd69b3	ctdb-protocol: rindex->strrchr According to "man rindex" on debian bullseye rindex() was deprecated in Posix.1-2001 and removed from Posix.1-2008. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2022-01-13 16:13:38 +00:00
Pavel Filipenský	5ac8762256	ctdb:utils: Improve error handling of hex_decode() This has been found by covscan and makes analyzers happy. Pair-programmed-with: Andreas Schneider <asn@samba.org> Signed-off-by: Pavel Filipenský <pfilipen@redhat.com> Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2022-01-10 23:31:33 +00:00
Andreas Schneider	90fd7674f8	ctdb:client: Initialize structs and pointers in ctdb_ctrl_(en|dis)able_node() Found by covscan. Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2021-12-15 19:32:30 +00:00
Martin Schwenke	1719ef7893	ctdb-tests: Drop unused function ctdb_get_all_public_addresses() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Ralph Boehme <slow@samba.org> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Tue Oct 12 23:24:18 UTC 2021 on sn-devel-184	2021-10-12 23:24:18 +00:00
Ralph Boehme	4e3676cb3c	ctdb-tests: add a comment to the generated public_addresses file used by eventscript UNIT tests test stub code has been updated to handle this, so now let's put it to work. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14826 RN: Correctly ignore comments in CTDB public addresses file Signed-off-by: Ralph Boehme <slow@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2021-10-12 22:38:32 +00:00
Martin Schwenke	5426c104f5	ctdb-tests: Fix typo in ctdb stub comment matching BUG: https://bugzilla.samba.org/show_bug.cgi?id=14826 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Ralph Boehme <slow@samba.org>	2021-10-12 22:38:32 +00:00
Ralph Boehme	530e8d4b9e	ctdb-scripts: filter out comments in public_addresses file Note that order of sed expressions matters: the expression to delete comment lines must come first as the second expression would transform # comment to comment BUG: https://bugzilla.samba.org/show_bug.cgi?id=14826 Signed-off-by: Ralph Boehme <slow@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2021-10-12 22:38:32 +00:00
Martin Schwenke	9e7d2d9794	ctdb-daemon: Don't mark a node as unhealthy when connecting to it Remote nodes are already initialised as UNHEALTHY when the node list is initialised at startup (ctdb_load_nodes_file() calls convert_node_map_to_list()) and when disconnected (ctdb_node_dead()). So, drop this code. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Thu Sep 9 02:38:34 UTC 2021 on sn-devel-184	2021-09-09 02:38:34 +00:00
Martin Schwenke	7f697b1938	ctdb-daemon: Ignore flag changes for disconnected nodes If this node is not connected to a node then we shouldn't know anything about it. The state will be pushed later by the recovery master. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Signed-off-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	ae10a8a4b7	ctdb-daemon: Simplify ctdb_control_modflags() Now that there are separate disable/enable controls used by the ctdb tool this control can ignore any flag updates for the current nodes. These only come from the recovery master, which depends on being able to fetch flags for all nodes. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	916c5ee131	ctdb-recoverd: Mark CTDB_SRVID_SET_NODE_FLAGS obsolete CTDB_SRVID_SET_NODE_FLAGS is no longer sent so drop monitor_handler() and replace with srvid_not_implemented(). Mark the SRVID obsolete in its comment. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	e75256767f	ctdb-daemon: Don't bother sending CTDB_SRVID_SET_NODE_FLAGS The code that handles this message is ctdb_recoverd.c:monitor_handler(). Although it appears to do something potentially useful, it only logs the flags changes. All changes made are to local structures - there are no actual side-effects. It used to trigger a takeover run when the DISABLED flag changed. This was dropped back in commit `662f06de9f`. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	0132bd5a22	ctdb-daemon: Modernise remaining debug macro in this function BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	b6d25d079e	ctdb-daemon: Update logging for flag changes When flags change, promote the message to NOTICE level and switch the message to the style that is currently generated by ctdb-recoverd.c:monitor_handler(). This will allow monitor_handler() to go away in future. Drop logging when flags do not change. The recovery master now logs when it pushes flags for a node, so the lack of a corresponding "changed flags" message here indicates that no update was required. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	eec44e2862	ctdb-daemon: Correct the condition for logging unchanged flags Don't trust the old flags from the recovery master. Surrounding code will change in future commits, including the use of old-style debug macros, so just make this change clear. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	5914054698	ctdb-tools: Use disable and enable controls in tool Note that there is a change from broadcast to a directed control here. This is OK because the recovery master will push flags if any nodes disagree with the canonical flags fetched from a node. Static function ctdb_ctrl_modflags() is no longer used so drop it. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	6fe6a54e7f	ctdb-client: Add client code for disable/enable controls BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	15a6489c28	ctdb_daemon: Implement controls DISABLE_NODE/ENABLE_NODE BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	60c1ef1465	ctdb-daemon: Start as disabled means PERMANENTLY_DISABLED DISABLED is UNHEALTHY | PERMANENTLY_DISABLED, which is not what is intended here. Luckily, it doesn't do any harm because nodes are marked unhealthy at startup anyway. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	1ac7bc7532	ctdb-daemon: Factor out a function to get node structure from PNN BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	e0a7b5a9e8	ctdb-daemon: Add a helper variable Simplifies a subsequent change. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	6845dca87e	ctdb-protocol: Add marshalling for controls DISABLE_NODE/ENABLE_NODE BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	49dc5d8cd2	ctdb-protocol: Add new controls to disable and enable nodes These are CTDB_CONTROL_DISABLE_NODE and CTDB_CONTROL_ENABLE_NODE. For consistency these match CTDB_CONTROL_STOP_NODE and CTDB_CONTROL_CONTINUE_NODE. It would be possible to add a single control but it would need to take data. The aim is to finally fix races in flag handling. Previous fixes have improved the situation but they have only narrowed the race window. The problem is that the recovery daemon on the master node pushes flags to nodes the same way that disable and enable are implemented. So the following sequence is still racy: 1. Node A is disabled 2. Recovery master pulls flags from all nodes including A 3. Node A is enabled 4. Recovery master notices A is disabled and pushes a flag update to all nodes including node A 5. Node A is erroneously marked disabled Node A can not tell if the MODIFY_FLAGS control is from a "ctdb disable" command or a flag update from the recovery master. The solution is to use a different mechanism for disable/enable and for a node to ignore MODIFY_FLAGS controls for their own flags. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	8305f6a7f1	ctdb-recoverd: Push flags for a node if any remote node disagrees This will usually happen if flags on the node in question change, so keeping the code simple and pushing to all nodes won't hurt. When all nodes come up there might be differences in connected nodes, causing such "fix ups". Receiving nodes will ignore no-op pushes. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	620d078714	ctdb-recoverd: Update the local node map before pushing out flags The resulting code structure looks a little weird. However, there is another condition that requires the flags to be pushed that will be inserted before the continue statement in a subsequent commit. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	82a075d4d7	ctdb-recoverd: Add a helper variable Improves readability and simplifies subsequent changes. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-09-09 01:46:49 +00:00
Martin Schwenke	b724c1e6a6	utils: Avoid pylint warning pylint warns: Use lazy % formatting in logging functions Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: David Disseldorp <ddiss@samba.org> Reviewed-by: Jose A. Rivera <jarrpa@samba.org> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Tue Jul 20 05:29:18 UTC 2021 on sn-devel-184	2021-07-20 05:29:18 +00:00
Martin Schwenke	319e27343d	utils: Reformat lines that are longer than 80 columns Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: David Disseldorp <ddiss@samba.org> Reviewed-by: Jose A. Rivera <jarrpa@samba.org>	2021-07-20 04:43:37 +00:00
Martin Schwenke	98c7a38b71	utils: Tweak exception handling to stop flake8 complaining Don't bother with "as e" to avoid warning about unused variable. Don't use bare "except:" (though pylint still complains about this version). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: David Disseldorp <ddiss@samba.org> Reviewed-by: Jose A. Rivera <jarrpa@samba.org>	2021-07-20 04:43:37 +00:00
Martin Schwenke	12d3e215a6	utils: Simplify log level logic, drop global variable Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: David Disseldorp <ddiss@samba.org> Reviewed-by: Jose A. Rivera <jarrpa@samba.org>	2021-07-20 04:43:37 +00:00
Martin Schwenke	e323d16a9d	utils: Inline defaults and help strings Removes an unnecessary level of indirection: defaults and help strings are now where they are expected. Also removes some global variables. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: David Disseldorp <ddiss@samba.org> Reviewed-by: Jose A. Rivera <jarrpa@samba.org>	2021-07-20 04:43:37 +00:00
Martin Schwenke	af5aecced1	utils: Move argument processing into function and call from main() Removes the need for the global variables currently associated with this processing. Also removes unnecessarily double-handling the defaults, which are assigned to the global variables and set via add_argument(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: David Disseldorp <ddiss@samba.org> Reviewed-by: Jose A. Rivera <jarrpa@samba.org>	2021-07-20 04:43:37 +00:00
Martin Schwenke	e66637a079	utils: Reorder imports so that standard imports are first Avoids numerous pylint warnings. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: David Disseldorp <ddiss@samba.org> Reviewed-by: Jose A. Rivera <jarrpa@samba.org>	2021-07-20 04:43:37 +00:00
Martin Schwenke	bd0b2bb6ee	utils: Clean up ctdb_etcd_lock using autopep8 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: David Disseldorp <ddiss@samba.org> Reviewed-by: Jose A. Rivera <jarrpa@samba.org>	2021-07-20 04:43:37 +00:00
Martin Schwenke	939aed0498	utils: Use Python 3 Due to the number of flake8 and pylint warnings it is unclear if the source has Python 3 incompatibilities. These will be cleaned up in subsequent commits. Signed-off-by: "L.P.H. van Belle" <belle@bazuin.nl> Reviewed-by: Martin Schwenke <martin@meltin.net> Reviewed-by: David Disseldorp <ddiss@samba.org> Reviewed-by: Jose A. Rivera <jarrpa@samba.org>	2021-07-20 04:43:37 +00:00
Martin Schwenke	466aa8b6f5	ctdb-scripts: Ignore ShellCheck SC3013 for test -nt In ShellCheck 0.7.2, POSIX compatibility warnings got their own SC3xxx error codes, so now both the old and new codes need to be ignored. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Jun 25 10:06:48 UTC 2021 on sn-devel-184	2021-06-25 10:06:48 +00:00
Martin Schwenke	fc0da6b0f8	ctdb-tests: Force stub version of service in eventscript tests Fedora 34 now has a shell function for the which command, which causes these uses of which to return the enclosing function definition rather than the executable file as expected. The event script unit tests always expect the stub service command to be used, so the conditional in these functions is unnecessary. $CTDB_HELPER_BINDIR already conveniently points to the stub directory, so use it here. Signed-off-by: Martin Schwenke <martin@meltin.net> Signed-off-by: Amitay Isaacs <amitay@gmail.com>	2021-06-25 09:16:31 +00:00
Martin Schwenke	23b2fab2c8	ctdb-common: Drop unused include of mkdir_p.h Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-06-25 09:16:31 +00:00
Martin Schwenke	e40d452722	ctdb-daemon: Close server socket when switching to client The socket is set close-on-exec but that doesn't help for processes that do not exec(). This should be done for all child processes. This has been seen in testing where "ctdb shutdown" waits for the socket to close before succeeding. It appears that lingering vacuuming processes have not closed the socket when becoming clients so they cause "ctdb shutdown" to hang even though the main daemon process has exited. The cause of the lingering vacuuming processes has been previously examined but still isn't understood. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2021-06-25 09:16:31 +00:00
Martin Schwenke	f7cf8132b0	ctdb-tests: Add debug_locks.sh tests for mutexes Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri May 28 07:34:23 UTC 2021 on sn-devel-184	2021-05-28 07:34:23 +00:00
Amitay Isaacs	99c3b49260	ctdb-scripts: Add lock debugging for tdb mutex locks Signed-off-by: Amitay Isaacs <amitay@gmail.com> Signed-off-by: Martin Schwenke <martin@meltin.net>	2021-05-28 06:46:29 +00:00
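
The flag-handling race described in commit 49dc5d8cd2 (steps 1 to 5) can be illustrated with a small model. This is only a sketch under stated assumptions, not CTDB code: the Node class, its methods, the DISABLED bit value and run_sequence() are all hypothetical names chosen for illustration. For simplicity, disable/enable are modelled as direct state changes, and modify_flags() stands in for a flag push from the recovery master. The ignore_own_flag_pushes switch models the fix the commit message proposes, where a node ignores MODIFY_FLAGS controls for its own flags.

```python
# Hypothetical model of the race from commit 49dc5d8cd2.
# None of these names are CTDB APIs; they exist only for illustration.

DISABLED = 0x4  # illustrative flag bit (assumption)


class Node:
    def __init__(self, pnn, ignore_own_flag_pushes):
        self.pnn = pnn
        self.flags = 0
        self.ignore_own_flag_pushes = ignore_own_flag_pushes

    def disable(self):
        # Stands in for a dedicated "disable" request to this node.
        self.flags |= DISABLED

    def enable(self):
        # Stands in for a dedicated "enable" request to this node.
        self.flags &= ~DISABLED

    def modify_flags(self, pnn, flags):
        # Stands in for a flag push from the recovery master.
        if pnn != self.pnn:
            return  # this model only tracks the node's own flags
        if self.ignore_own_flag_pushes:
            return  # proposed fix: ignore pushes about our own flags
        self.flags = flags


def run_sequence(ignore_own_flag_pushes):
    """Replay steps 1-5; return True if node A ends up disabled."""
    node_a = Node(pnn=0, ignore_own_flag_pushes=ignore_own_flag_pushes)
    node_a.disable()                          # 1. Node A is disabled
    pulled = node_a.flags                     # 2. Master pulls flags from A
    node_a.enable()                           # 3. Node A is enabled
    node_a.modify_flags(node_a.pnn, pulled)   # 4. Master pushes stale flags
    return bool(node_a.flags & DISABLED)      # 5. Is A marked disabled?


if __name__ == "__main__":
    print("without fix, A disabled:", run_sequence(False))
    print("with fix, A disabled:", run_sequence(True))
```

Without the fix, the stale flags pulled in step 2 overwrite the enable from step 3. With the fix, the push in step 4 is ignored by node A and the node stays enabled.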

9112 Commits