samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-22 13:34:15 +03:00

Author	SHA1	Message	Date
Volker Lendecke	a6b66661c7	ctdb: Add "home_nodes" file to deterministic IP allocation With a file "home_nodes" next to "public_addresses" you can assign public IPs to specific nodes when using the deterministic allocation algorithm. Whenever the "home node" is up, the IP address will be assigned to that node, independent of any other deterministic calculation. The line "192.168.21.254 2" in the file "home_nodes" assigns the IP address to node 2. Only when node 2 is not able to host IP addresses does 192.168.21.254 undergo the normal deterministic IP allocation algorithm. Signed-off-by: Volker Lendecke <vl@samba.org> add home_nodes Reviewed-by: Ralph Boehme <slow@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Tue Oct 10 14:17:19 UTC 2023 on atb-devel-224	2023-10-10 14:17:19 +00:00
Volker Lendecke	ea9cbbd830	ctdb: setup $CTDB_BASE for deterministic ip alloc tests ipalloc_deterministic() will require it in the next patch Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Ralph Boehme <slow@samba.org>	2023-10-10 13:14:31 +00:00
Volker Lendecke	23ccb1c0ca	ctdb: Align variable signedness ipalloc_state->num_nodes is uint32_t Reviewed-by: Ralph Boehme <slow@samba.org>	2023-10-10 13:14:31 +00:00
Volker Lendecke	ce3243d7b2	ctdb: Reduce indentation in get_tunable_values() Use an early "return tvals"; review with "git show -b". Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Ralph Boehme <slow@samba.org>	2023-10-10 13:14:31 +00:00
Volker Lendecke	58ec800928	ctdb: Fix whitespace Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Ralph Boehme <slow@samba.org>	2023-10-10 13:14:31 +00:00
Martin Schwenke	3ee348a966	ctdb-scripts: Convert 40.vsftpd to use threshold-based fail counting This effectively provides simple testing for the threshold-based approach. Add new script option CTDB_VSFTPD_MONITOR_THRESHOLDS. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Oct 3 04:53:38 UTC 2023 on atb-devel-224	2023-10-03 04:53:38 +00:00
Martin Schwenke	8303c3a534	ctdb-scripts: Implement failcount handling with thresholds This can be used for simple failure counting, without restarts, as used in the 40.vsftpd event script. That case will subsequently be converted and this functionality can also be used elsewhere. Add documentation to ctdb-script.options(5) to allow parameters that use this to be more easily described. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-10-03 03:53:35 +00:00
Martin Schwenke	4981984dd4	ctdb-scripts: Avoid errors for uninitialised counters Uninitialised counters are treated as 0, but still produce an error. The redirect to stderr needs to come before the redirect for a missing counter file. The seemingly saner alternative of moving it outside the subshell works when dash is /bin/sh (e.g. on Debian) but does not work when bash is /bin/sh (e.g. on Fedora). Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-10-03 03:53:35 +00:00
Martin Schwenke	7c468d9d28	ctdb-doc: Add some subsection names in description A subsequent commit will add a new section, which looks out of place without these new sections. Best reviewed with "git show -w". Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-10-03 03:53:35 +00:00
Martin Schwenke	749bc56876	ctdb-doc: Update CTDB manual pages to UTF-8 This will allow Unicode characters to be used, resulting in more readable source files. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-10-03 03:53:35 +00:00
Martin Schwenke	8b9f464420	ctdb-daemon: Call setproctitle_init() Commit `19c82c19c0` changed the behaviour of prctl_set_comment() so it now calls setproctitle(3bsd) by default. In some Linux distributions (e.g. Rocky Linux 8.8), this results in messages like this spamming the logs: ctdbd: setproctitle not initialized, please either call setproctitle_init() or link against libbsd-ctor. Most Samba daemons seem to call setproctitle_init(), so do it here. In the longer term CTDB should also switch to using lib/util's process_set_title(), like the rest of Samba, for more flexible process names. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15479 Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Ralph Boehme <slow@samba.org> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Thu Sep 21 00:46:50 UTC 2023 on atb-devel-224	2023-09-21 00:46:50 +00:00
Joseph Sutton	c62491473a	ctdb: Fix code spelling Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-09-11 02:42:41 +00:00
Martin Schwenke	dc7b48c404	ctdb-common: Set immediate mode for pcap capture Fix a problem where ctdb_killtcp (almost always) fails to capture packets with --enable-pcap and libpcap ≥ 1.9.1. The problem is due to a gradual change in libpcap semantics when using pcap_get_selectable_fd(3PCAP) to get a file descriptor and then using that file descriptor in non-blocking mode. pcap_set_immediate_mode(3PCAP) says: pcap_set_immediate_mode() sets whether immediate mode should be set on a capture handle when the handle is activated. In immediate mode, packets are always delivered as soon as they arrive, with no buffering. and On Linux, with previous releases of libpcap, capture devices are always in immediate mode; however, in 1.5.0 and later, they are, by default, not in immediate mode, so if pcap_set_immediate_mode() is available, it should be used. However, it wasn't until libpcap commit 2ade7676101366983bd4f86bc039ffd25da8c126 (before libpcap 1.9.1) that it became a requirement to use pcap_set_immediate_mode(), even with a timeout of 0. More explanation in this libpcap issue comment: https://github.com/the-tcpdump-group/libpcap/issues/860#issuecomment-541204548 Do a configure check for pcap_set_immediate_mode() even though it has existed for 10 years. It is easy enough. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15451 Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Aug 15 10:53:52 UTC 2023 on atb-devel-224	2023-08-15 10:53:52 +00:00
Martin Schwenke	ffc2ae616d	ctdb-common: Replace pcap_open_live() by lower level calls A subsequent commit will insert an additional call before pcap_activate(). This sequence of calls is taken from the source for pcap_open_live(), so there should be no change in behaviour. Given the defaults set by pcap_create_common(), it would be possible to omit the calls to pcap_set_promisc() and pcap_set_timeout(). However, those defaults don't seem to be well documented, so continue to explicitly set everything that was set before. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15451 Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-08-15 09:49:38 +00:00
Martin Schwenke	d87041d896	ctdb-common: Improve error handling Factor out a failure label, which will get more use in subsequent commits, and only set private_data when success is certain. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15451 Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-08-15 09:49:38 +00:00
Joseph Sutton	9769b594f4	ctdb: Add missing newline to logging message Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-08-08 04:39:37 +00:00
Joseph Sutton	a8085b3dd5	ctdb: Add missing newlines to logging messages Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-08-08 04:39:36 +00:00
Martin Schwenke	f87f02f6f9	ctdb-doc: Fix documentation for ctdb event status Behaviour was changed, documentation wasn't. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15438 Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Aug 7 09:43:33 UTC 2023 on atb-devel-224	2023-08-07 09:43:33 +00:00
Martin Schwenke	f01a179abc	ctdb-tools: Fix CID 1539212 - signed/unsigned issue >>> CID 1539212: Control flow issues (NO_EFFECT) >>> This greater-than-or-equal-to-zero comparison of an unsigned value is always true. "p >= 0UL". 216 while (p >= 0 && output[p] == '\n') { This is a real problem in the unlikely event that the output contains only newlines. Fix the issue by using a pointer and add a test to cover this case. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15438 Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-08-07 08:43:39 +00:00
Martin Schwenke	7920d2ff62	ctdb-tools: Improve printing of multi-line event script output Multi-line output currently prints like this: OUTPUT: aaa bbb ccc This is less beautiful than it could be. Instead, print multi-line output with no inlining and each line indented: OUTPUT: aaa bbb ccc However, continue to inline single line output: OUTPUT: foo Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-28 10:48:33 +00:00
Martin Schwenke	e3c0b72c34	ctdb-tools: Always print script output in event status When event scripts succeed they generally produce no output. However, when a script succeeds and produces output, such output almost certainly contains warnings. So, always print script output. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-28 10:48:33 +00:00
Martin Schwenke	6e4c7ae9a2	ctdb-tests: Log to stderr in statd-callout tests Errors logged when testing statd-callout don't currently go anywhere. This is because arguments to the hacked version of script_log() are ignored. Remove the hack and configure logging to stderr. This could go in the local statd-callout.sh setup script. However, make it available for other script tests. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Jul 19 09:57:37 UTC 2023 on atb-devel-224	2023-07-19 09:57:37 +00:00
Martin Schwenke	ef15a34d5d	ctdb-scripts: Support script logging to stderr Logging in statd-callout tests is currently useless. This will provide a way of seeing errors in those tests. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-19 09:01:33 +00:00
Martin Schwenke	0ac9413735	ctdb-scripts: Avoid ShellCheck warning SC2162 SC2162 read without -r will mangle backslashes. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-19 09:01:33 +00:00
Martin Schwenke	59c5010b6e	ctdb-scripts: Reformat with "shfmt -w -p -i 0 -fn" Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-19 09:01:33 +00:00
Martin Schwenke	2e2d81b92a	ctdb-recoverd: CID 1509028 - Use of 32-bit time_t (Y2K38_SAFETY) usecs is going to be passed as a uint32_t. There is no need to calculate it as a time_t. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-19 09:01:33 +00:00
Martin Schwenke	862fc5770c	ctdb: Do not use egrep On some platforms, egrep prints a deprecation warning to stderr: egrep: warning: egrep is obsolescent; using grep -E Use grep -E instead. This is nice and simple, so no use splitting this commit into 2 separate commits for each of tools and test. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-19 09:01:33 +00:00
Martin Schwenke	4deb178eb3	ctdb-doc: Correct bit-rotted documentation Loading tunables is now done in ctdbd, so find another example for the "setup" event. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-19 09:01:33 +00:00
Martin Schwenke	dbbede407f	ctdb-utils: Drop unused scsi_io.c source file It will be in the git history if we ever decide to use SCSI persistent reservations as a cluster lock. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-19 09:01:33 +00:00
Martin Schwenke	61dfc8bc06	ctdb-server: Avoid logging a count of 0 resent calls This fixes a little thinko in commit `80de84d36e`, where this was overlooked. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Jul 10 15:15:06 UTC 2023 on atb-devel-224	2023-07-10 15:15:06 +00:00
Martin Schwenke	60bf6f68e1	ctdb-tools: Switch tickle ACK sending message to INFO level DEBUG level logging in ctdb_killtcp is very noisy. The most important messages when debugging are those for tickle ACKs and TCP RSTs. TCP RSTs are already logged at INFO level, so promote tickle ACKs to INFO level too. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-10 14:21:30 +00:00
Martin Schwenke	6dac1da9cd	ctdb-tools: Fix a typo in a log message Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reported-by: Ulrich Sibiller <ulrich.sibiller@atos.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-10 14:21:30 +00:00
Martin Schwenke	51d0445a7d	ctdb-logging: Really make NOTICE the default debug level NOTICE level debug messages in common/run_event.c are not logged by default. Currently eventd ends up using ERROR, since this is specified as LOGGING_LOG_LEVEL_DEFAULT. It doesn't inherit the debug level from ctdbd and only uses NOTICE level when interactive. Change the real logging default to NOTICE and use it everywhere. Followups might be: * Remove the default_log_level argument to logging_conf_init() * Kick eventd to update debug level when "ctdb setdebug" is used Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2023-07-10 14:21:30 +00:00
Martin Schwenke	d2940694c6	ctdb-tests: Run ShellCheck on event-script unit test support scripts Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andreas Schneider <asn@samba.org> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Jul 5 12:16:57 UTC 2023 on atb-devel-224	2023-07-05 12:16:56 +00:00
Martin Schwenke	b2026e92d6	ctdb-tests: Avoid ShellCheck warnings These are all trivial, so handle them in bulk. * Change code to avoid (approximately sorted by frequency): SC2004 $/${} is unnecessary on arithmetic variables. SC2086 Double quote to prevent globbing and word splitting. SC2162 read without -r will mangle backslashes. SC2254 Quote expansions in case patterns to match literally rather than as a glob. SC2154 (warning): <variable> is referenced but not assigned. SC3037 (warning): In POSIX sh, echo flags are undefined. SC2016 (info): Expressions don't expand in single quotes, use double quotes for that. SC2069 (warning): To redirect stdout+stderr, 2>&1 must be last (or use '{ cmd > file; } 2>&1' to clarify). SC2124 (warning): Assigning an array to a string! Assign as array, or use * instead of @ to concatenate. SC2166 (warning): Prefer [ p ] && [ q ] as [ p -a q ] is not well defined. SC2223 (info): This default assignment may cause DoS due to globbing. Quote it. * Locally disable checks: SC2034 (warning): <variable> appears unused. Verify use (or export if used externally). SC2086 (info): Double quote to prevent globbing and word splitting. [once] SC2120 (warning): <function> references arguments, but none are ever passed. SC2317 (info): Command appears to be unreachable. Check usage (or ignore if invoked indirectly). While touching reads for SC2162, switch unused variables to "_" instead of "_x", which seems to be preferred by ShellCheck. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andreas Schneider <asn@samba.org>	2023-07-05 11:18:37 +00:00
Martin Schwenke	a45a76fd19	ctdb-tests: Avoid ShellCheck warning SC2059 SC2059 (info): Don't use variables in the printf format string. Use printf '..%s..' "$foo". Move the format string to the function and just parameterise the share type. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andreas Schneider <asn@samba.org>	2023-07-05 11:18:37 +00:00
Martin Schwenke	58a117d3d5	ctdb-tests: Avoid ShellCheck warnings SC2046, SC2005 In ./tests/UNIT/eventscripts/scripts/local.sh line 328: echo $(ctdb ifaces -X | awk -F'|' 'FNR > 1 {print $2}') ^-- SC2046 (warning): Quote this to prevent word splitting. ^-- SC2005 (style): Useless echo? Instead of 'echo $(cmd)', just use 'cmd'. Use xargs to get output on 1 line. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andreas Schneider <asn@samba.org>	2023-07-05 11:18:37 +00:00
Martin Schwenke	1190c91090	ctdb-tests: Drop unreachable code This generates ShellCheck warnings: In ./tests/UNIT/eventscripts/scripts/60.nfs.sh line 412: if [ -n "$service_check_cmd" ]; then ^----------------^ SC2031 (info): service_check_cmd was modified in a subshell. That change might be lost. In ./tests/UNIT/eventscripts/scripts/60.nfs.sh line 413: if eval "$service_check_cmd"; then ^----------------^ SC2031 (info): service_check_cmd was modified in a subshell. That change might be lost. service_check_cmd will never be set here because it is only set in a sub-shell in rpc_set_service_failure_response(). This reverts some of commit `713ec21750`. If testcases requiring use of service_check_cmd are later added then this will need to be redone properly. This would probably start by renaming this function nfs_iterate_rpc_test(). Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andreas Schneider <asn@samba.org>	2023-07-05 11:18:37 +00:00
Martin Schwenke	cbda1a78dc	ctdb-tests: Reformat with "shfmt -w -p -i 0 -fn" Best reviewed with "git show -w". Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andreas Schneider <asn@samba.org>	2023-07-05 11:18:37 +00:00
Martin Schwenke	7813c979ed	ctdb-tests: Drop unused test code for tunables This is unused since loading tunables was moved to ctdbd. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andreas Schneider <asn@samba.org>	2023-07-05 11:18:37 +00:00
Martin Schwenke	92f1747448	ctdb-tests: Avoid ShellCheck warning SC2086 SC2086 Double quote to prevent globbing and word splitting. Apparently ShellCheck is more picky about some of these than it used to be. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andreas Schneider <asn@samba.org>	2023-07-05 11:18:37 +00:00
Martin Schwenke	37105addec	ctdb-scripts: Avoid ShellCheck warnings SC2317, SC2086 New in ShellCheck 0.9.0: SC2317 (info): Command appears to be unreachable. Check usage (or ignore if invoked indirectly). Also: SC2086 (info): Double quote to prevent globbing and word splitting. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andreas Schneider <asn@samba.org>	2023-07-05 11:18:37 +00:00
Martin Schwenke	aeb5b0adfa	ctdb-tools: Avoid ShellCheck warning SC2317 New in ShellCheck 0.9.0: SC2317 (info): Command appears to be unreachable. Check usage (or ignore if invoked indirectly). Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Andreas Schneider <asn@samba.org>	2023-07-05 11:18:37 +00:00
Christof Schmitt	4dccf5afa4	ctdb-recovery: Use correct struct ban_node_state type for state If this codepath is hit, ctdb aborts with: ctdb/server/ctdb_recovery_helper.c:2687: Type mismatch: name[struct ban_node_state] expected[struct node_ban_state]") at ../../lib/talloc/talloc.c:505 Fix this by using the correct type. Signed-off-by: Christof Schmitt <cs@samba.org> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Wed May 3 08:04:09 UTC 2023 on atb-devel-224	2023-05-03 08:04:09 +00:00
Joseph Sutton	440c3e8697	ctdb:tool: Remove unnecessary strlen() Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andreas Schneider <asn@samba.org>	2023-04-12 13:52:31 +00:00
Andreas Schneider	8f18fadd31	ctdb: Fix code spelling Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>	2023-04-11 09:06:35 +00:00
Andrew Bartlett	83fe7a0316	lib/util: Add "debug syslog format = always", which logs to stdout in syslog style Signed-off-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2023-04-06 12:51:30 +00:00
Andreas Schneider	409ede2d1f	ctdb:doc: Fix code spelling Best reviewed with: `git show --word-diff`. Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-04-04 07:31:36 +00:00
Andreas Schneider	d964700a19	ctdb:utils: Fix code spelling Best reviewed with: `git show --word-diff` Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com> Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org> Autobuild-Date(master): Fri Mar 24 07:57:37 UTC 2023 on atb-devel-224	2023-03-24 07:57:37 +00:00
Andreas Schneider	8ccd915587	ctdb:utils: Remove trailing whitespaces in scsi_io.c Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Andreas Schneider	88ee870e67	ctdb:tool: Fix code spelling Best reviewed with: `git show --word-diff` Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Andreas Schneider	9a37aa3969	ctdb:tests: Fix code spelling Best reviewed with: `git show --word-diff` Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Andreas Schneider	7aeed61dc5	ctdb:tcp: Fix code spelling Best reviewed with: `git show --word-diff` Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Andreas Schneider	7749df4992	ctdb:server: Fix code spelling Best reviewed with: `git show --word-diff` Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Andreas Schneider	19f418b68f	ctdb:server: Remove trailing whitespaces in ctdb_server.c Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Andreas Schneider	59af504999	ctdb:server: Remove trailing whitespaces in ctdb_recover.c Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Andreas Schneider	200bc1f937	ctdb:include: Fix code spelling Best reviewed with: `git show --word-diff` Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Andreas Schneider	44bde7a788	ctdb:include: Remove trailing whitespaces in ctdb_protocol.h Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Andreas Schneider	2e10481dac	ctdb:common: Fix code spelling Best reviewed with: `git show --word-diff` Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Andreas Schneider	6d7d82938b	ctdb:client: Fix code spelling Best reviewed with: `git show --word-diff` Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Martin Schwenke <mschwenke@ddn.com>	2023-03-24 07:01:31 +00:00
Martin Schwenke	238056e5aa	ctdb-scripts: Avoid using testparm to process its own output When testparm processes the output of "testparm -v" (which includes default values) it appears to do global checks (or some other sort of initialisation logic) for all specified values. This includes a DNS lookup for the node's hostname, as a side-effect of a libldap ldap_set_option() call when processing "ldap debug level". If DNS servers are down then this can induce timeouts, possibly resulting in monitor timeouts. Avoid this by using sed to extract configuration values from the testparm cache file. This is already shown to work when retrieving share paths, where testparm is basically used as cat. Update the sed pattern to avoid matching empty values on the right-hand side of the equals ('=') - this avoids the default empty path value (and "smb ports" never has an empty value). Corresponding test changes: * 50.samba.monitor.111.sh no longer expects a failure from being unable to set smb ports, since testparm is no longer used in that code path. * smb ports needs to be set in fake smb.conf so it is in the default output and can be extracted using sed. * Although testparm --parameter-name is no longer used in 50.samba.script, update the stub implementation (in case it is ever used again) to extract from fake smb.conf, since "smb ports" is now set there. The change from $parameter to $param allows a long line to stay below 80 columns. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Tue Feb 14 08:43:53 UTC 2023 on atb-devel-224	2023-02-14 08:43:53 +00:00
Martin Schwenke	9a04ca1e1c	ctdb-scripts: Do not replace commas with spaces in "smb ports" list The list changed back to space-separated in commit `93448f4be9`, so simplify the code a little. Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2023-02-14 07:44:30 +00:00
Martin Schwenke	029dddfb79	ctdb-scripts: Reformat script with "shfmt -w -p -i 0 -fn" Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>	2023-02-14 07:44:30 +00:00
Rob van der Linde	851127f5c9	Python: remove pydoctor Removes: * waf pydoctor * waf wafdocs * make pydoctor There is no "make wafdocs"; it only appears to be in wscript. The reasoning is that these are broken and appear not to have been run for some time. Signed-off-by: Rob van der Linde <rob@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org> Autobuild-User(master): Jeremy Allison <jra@samba.org> Autobuild-Date(master): Thu Feb 2 21:15:54 UTC 2023 on atb-devel-224	2023-02-02 21:15:54 +00:00
Michael Tokarev	96154a26fe	spelling fixes for 4.18 (errror implemenation proces Controler) One of changes is somewhat interesting, it is "tfork waiter proces" process title in tfork.c. I wonder why no one noticed this before. There's another similar process title in there, "tfork waiter process(%d)". Hopefully no one does grep for "proces$" (and there's no reason to). Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Volker Lendecke <vl@samba.org> Reviewed-by: Rowland Penny <rpenny@samba.org> Autobuild-User(master): Jeremy Allison <jra@samba.org> Autobuild-Date(master): Thu Jan 26 20:46:11 UTC 2023 on atb-devel-224	2023-01-26 20:46:11 +00:00
Volker Lendecke	35ee3e0231	ctdb: Fix the build on FreeBSD "basename" is defined in libgen.h included from system/dir.h Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Stefan Metzmacher <metze@samba.org>	2023-01-18 11:49:38 +00:00
Volker Lendecke	688be0177b	ctdb: Fix a use-after-free in run_proc If you happen to talloc_free(run_ctx) before all the tevent_req's hanging off it, you run into the following: ==495196== Invalid read of size 8 ==495196== at 0x10D757: run_proc_state_destructor (run_proc.c:413) ==495196== by 0x488F736: _tc_free_internal (talloc.c:1158) ==495196== by 0x488FBDD: _talloc_free_internal (talloc.c:1248) ==495196== by 0x4890F41: _talloc_free (talloc.c:1792) ==495196== by 0x48538B1: tevent_req_received (tevent_req.c:293) ==495196== by 0x4853429: tevent_req_destructor (tevent_req.c:129) ==495196== by 0x488F736: _tc_free_internal (talloc.c:1158) ==495196== by 0x4890AF6: _tc_free_children_internal (talloc.c:1669) ==495196== by 0x488F967: _tc_free_internal (talloc.c:1184) ==495196== by 0x488FBDD: _talloc_free_internal (talloc.c:1248) ==495196== by 0x4890F41: _talloc_free (talloc.c:1792) ==495196== by 0x10DE62: main (run_proc_test.c:86) ==495196== Address 0x55b77f8 is 152 bytes inside a block of size 160 free'd ==495196== at 0x48399AB: free (vg_replace_malloc.c:538) ==495196== by 0x488FB25: _tc_free_internal (talloc.c:1222) ==495196== by 0x488FBDD: _talloc_free_internal (talloc.c:1248) ==495196== by 0x4890F41: _talloc_free (talloc.c:1792) ==495196== by 0x10D315: run_proc_context_destructor (run_proc.c:329) ==495196== by 0x488F736: _tc_free_internal (talloc.c:1158) ==495196== by 0x488FBDD: _talloc_free_internal (talloc.c:1248) ==495196== by 0x4890F41: _talloc_free (talloc.c:1792) ==495196== by 0x10DE62: main (run_proc_test.c:86) ==495196== Block was alloc'd at ==495196== at 0x483877F: malloc (vg_replace_malloc.c:307) ==495196== by 0x488EAD9: __talloc_with_prefix (talloc.c:783) ==495196== by 0x488EC73: __talloc (talloc.c:825) ==495196== by 0x488F0FC: _talloc_named_const (talloc.c:982) ==495196== by 0x48925B1: _talloc_zero (talloc.c:2421) ==495196== by 0x10C8F2: proc_new (run_proc.c:61) ==495196== by 0x10D4C9: run_proc_send (run_proc.c:381) ==495196== by 0x10DDF6: main (run_proc_test.c:79) This happens because run_proc_context_destructor() directly does a talloc_free() on the struct proc_context's and not the enclosing tevent_req's. run_proc_kill() makes sure that we don't follow proc->req, but it forgets the "state->proc", which is free()'ed, but later dereferenced in run_proc_state_destructor(). This is an attempt at a quick fix, I believe we should convert run_proc_context->plist into an array of tevent_req's, so that we can properly TALLOC_FREE() according to the "natural" hierarchy and not just pull an arbitrary thread out of that heap. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Thu Oct 6 15:10:20 UTC 2022 on sn-devel-184	2022-10-06 15:10:20 +00:00
Martin Schwenke	d9dda4b7af	ctdb-scripts: Add debugging variable CTDB_KILLTCP_DEBUGLEVEL To debug ctdb_killtcp failures, add CTDB_KILLTCP_DEBUGLEVEL=DEBUG to script.options. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Sep 20 11:42:16 UTC 2022 on sn-devel-184	2022-09-20 11:42:16 +00:00
Martin Schwenke	9f7d69a05b	ctdb-common: Support IB in pcap-based capture Add simple support for IPoIB via DLT_LINUX_SLL and DLT_LINUX_SLL2. This seems to work, even when an IB interface is specified. If this is later found to be insufficient, support for DLT_IPOIB can be implemented. See https://www.tcpdump.org/linktypes.html for a starting point. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	e5541a7e02	ctdb-common: Support "any" interface for pcap-based capture This uses Linux cooked capture link-layer headers. See: https://www.tcpdump.org/linktypes/LINKTYPE_LINUX_SLL.html https://www.tcpdump.org/linktypes/LINKTYPE_LINUX_SLL2.html The header type needs to be checked to ensure the protocol type (i.e. ether type, for the protocols we might be interested in) is meaningful. The size of the header needs to be known so it can be skipped, allowing the IP header to be found and parsed. It would be possible to define support for DLT_LINUX_SLL2 if it is missing. However, if a platform is missing support in the header file then it is almost certainly missing in the run-time library too. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	3bf20300ac	ctdb-common: Add packet type detection to pcap-based capture The current code will almost certainly generate ENOMSG for non-ethernet packets, even for ethernet packets when the "any" interface is used. pcap_datalink(3PCAP) says: Do NOT assume that the packets for a given capture or ``savefile`` will have any given link-layer header type, such as DLT_EN10MB for Ethernet. For example, the "any" device on Linux will have a link-layer header type of DLT_LINUX_SLL or DLT_LINUX_SLL2 even if all devices on the system at the time the "any" device is opened have some other data link type, such as DLT_EN10MB for Ethernet. So, pcap_datalink() must be used. Detect pcap packet types that are supported (currently only ethernet) in the open code. There is no use continuing if the read code can't parse packets. The pattern of using switch statements supports future addition of other packet types. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	5dd964aa02	ctdb-tools: Improve/add debug In particular, knowing the reason fetching the packet fails can help with debugging unsupported protocols in the pcap code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	33a80c1d63	ctdb-common: Improve/add debug Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	075414dc05	ctdb-common: Use pcap_get_selectable_fd() This is preferred because it will fail for devices that do not support epoll_wait() and similar. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	40380a8042	ctdb-common: Stop a pcap-related crash on error errbuf can't be NULL. Might as well use it. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	8b54587b1a	ctdb-common: Fix a warning in the pcap code [173/416] Compiling ctdb/common/system_socket.c ../../common/system_socket.c: In function ‘ctdb_sys_read_tcp_packet’: ../../common/system_socket.c:1016:15: error: cast discards ‘const’ qualifier from pointer target type [-Werror=cast-qual] 1016 | eth = (struct ether_header *)buffer; | ^ Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	ad445abebd	ctdb-common: Do not use raw socket when ENABLE_PCAP is defined Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	c522f4f604	ctdb-common: Move a misplaced comment Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	d1543d5c78	ctdb-build: Add --enable-pcap configure option This forces the use of pcap for packet capture on Linux. It appears that using a raw socket for capture does not work with infiniband - pcap support for that to come. Don't (yet?) change the default capture method to pcap. On some platforms (e.g. my personal Intel NUC, running Debian testing), pcap is much less reliable than the raw socket. However, pcap seems fine on most other platforms. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	a83e9ca696	ctdb-build: Use pcap-config when available The build currently fails on AIX, which can't find the pcap headers because they're installed in a non-standard place. However, there is a pcap-config script available. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-20 10:43:37 +00:00
Martin Schwenke	4f5b4bd9df	ctdb-tests: Reformat remaining test stubs with "shfmt -w -p -i 0 -fn" Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Sep 16 04:35:09 UTC 2022 on sn-devel-184	2022-09-16 04:35:09 +00:00
Martin Schwenke	0e388a1994	ctdb-tests: Include eventscript stub commands in shellcheck test Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-16 03:36:32 +00:00
Martin Schwenke	4ee0abaece	ctdb-tests: Avoid shellcheck warnings in remaining test stubs A small amount of effort... Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-16 03:36:32 +00:00
Martin Schwenke	a31fb7e5ab	ctdb-scripts: Simplify determination of real interface This can now be made trivial. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-16 03:36:32 +00:00
Martin Schwenke	5abaec4992	ctdb-tests: Implement "ip -brief link show" in ip stub Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-16 03:36:32 +00:00
Martin Schwenke	ef921bdbdb	ctdb-tests: Avoid ShellCheck warnings Although this is a test stub, it is complicated enough to encourage ShellCheck cleanliness. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-16 03:36:32 +00:00
Martin Schwenke	67e0ca5e01	ctdb-tests: Reformat script with "shfmt -w -p -i 0 -fn" As per current Samba convention. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-16 03:36:32 +00:00
Martin Schwenke	517f09eb6f	ctdb-scripts: Drop assumption that there are VLANs with no '@' VLAN configuration on Linux often uses a convention of naming a VLAN on <iface> with VLAN ID <tag> as <iface>.<tag>. To be able to monitor the underlying interface, the original 10.interface code naively simply stripped off the '.' and everything after (i.e. ".*", as a glob pattern). Some users do not use the above convention. A VLAN can be named without including the underlying interface, but still with a tag (e.g. vlan<tag> - the word "vlan" followed by the tag) or, more generally, perhaps without a tag (e.g. <vlan> - an arbitrary name). The ip(8) command lists a VLAN as <vlan>@<iface>. The underlying interface can be found by stripping everything up to and including an '@' (i.e. "*@"). Commit `bc71251433` added support for stripping "*@". However, on suspicion, it kept support for the case where there is no '@', falling back to stripping ".*". If ip(8) ever did this then it was a long time ago - it has been printing a format including '@' since at least 2004. Stripping ".*" interferes with interesting administrative decisions, like having '.' in interface names. So, drop the fallback to stripping ".*" because it appears to be unnecessary and can cause inconvenience. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-09-16 03:36:32 +00:00
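The '@'-stripping described in the commit above can be sketched in POSIX shell. This is only an illustration of the parameter expansion involved: the helper name `real_iface` is hypothetical and is not taken from the 10.interface script.

```shell
#!/bin/sh
# real_iface is a hypothetical helper name, not the code from 10.interface.
# It maps an ip(8)-style name such as "vlan10@eth0" to the underlying
# interface by removing everything up to and including the '@'.
real_iface()
{
	# "##*@" removes the longest prefix matching the glob "*@".
	# A name without '@' is returned unchanged, dots included.
	printf '%s\n' "${1##*@}"
}

real_iface "vlan10@eth0"  # prints eth0
real_iface "eth0.10@eth0" # prints eth0
real_iface "my.iface"     # prints my.iface
```

Because nothing is stripped at a '.', an interface name that legitimately contains a dot passes through untouched.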
Michael Tokarev	3ce1d2fde5	Fix spelling mistakes. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Jeremy Allison <jra@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Jeremy Allison <jra@samba.org> Autobuild-Date(master): Mon Sep 12 02:29:32 UTC 2022 on sn-devel-184	2022-09-12 02:29:32 +00:00
Martin Schwenke	a0e0fde039	ctdb-tests: Avoid shellcheck warnings Mostly SC2086: Double quote to prevent globbing and word splitting. Use ctdb_onnode() where it simplifies code. No behaviour changes intended. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Thu Aug 25 16:15:45 UTC 2022 on sn-devel-184	2022-08-25 16:15:45 +00:00
Martin Schwenke	ff4935d180	ctdb-tests: Simplify IP address checking Use a new function and wait_until() to simplify. get_test_ip_mask_and_iface() not needed here because select_test_node_and_ips() sets $test_ip, and neither $mask nor $iface is used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-08-25 15:22:36 +00:00
Martin Schwenke	42aedc62e3	ctdb-tests: Fix typos These lines are just wrong: try_command_on_node -v $test_node "ip addr show to ${test_node}" if -n "$out"; then The 2nd variable referenced should be $test_ip. The 2nd line causes "-n: command not found" because it is missing [] test command brackets. Both typos would probably make the test pass unconditionally. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-08-25 15:22:36 +00:00
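A minimal sketch of why the missing brackets matter, using an invented value for `$out`: without `[ ]`, the shell tries to run a command literally named `-n`, so the condition always fails and the `then` branch never runs.

```shell
#!/bin/sh
# Illustrative value only; in the real test $out comes from the node.
out="10.0.0.1/24"

# Broken form: "-n" is looked up as a command and is not found
# (status 127), so the "then" branch is skipped whatever $out holds.
if -n "$out" 2>/dev/null; then
	broken="taken"
else
	broken="skipped"
fi

# Fixed form: [ -n ... ] actually tests for a non-empty string.
if [ -n "$out" ]; then
	fixed="taken"
else
	fixed="skipped"
fi

printf 'broken: %s, fixed: %s\n' "$broken" "$fixed"
# prints: broken: skipped, fixed: taken
```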
Martin Schwenke	b88e7322d9	ctdb-tests: Reformat script using shfmt -w -p -i 0 -fn Whitespace changes only. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-08-25 15:22:36 +00:00
Martin Schwenke	3aecd6e7b5	ctdb-common: CID 1507498: Control flow issues (DEADCODE) Fix typo in error checking. While here adjust the bottom of the range, making errno 0 invalid. Add corresponding test cases using an alternative syntax for errno packets (#nnn[;] - trailing ';' is optional). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Aug 1 09:19:55 UTC 2022 on sn-devel-184	2022-08-01 09:19:55 +00:00
Martin Schwenke	dde461868f	ctdb-tests: Add tests for cluster mutex I/O timeout Block the locker helper child by taking a lock on the 2nd byte of the lock file. This will cause a ping timeout if the process is blocked for long enough. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Thu Jul 28 11:10:54 UTC 2022 on sn-devel-184	2022-07-28 11:10:54 +00:00
Martin Schwenke	25d32ae97a	ctdb-tests: Terminate event loop if lock is no longer held Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	061315cc79	ctdb-mutex: Test the lock by locking a 2nd byte range Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	97a1714ee9	ctdb-mutex: open() and fstat() when testing lock file This makes a file descriptor available for other I/O. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	c07e81abf0	ctdb-mutex: Factor out function fcntl_lock_fd() Allows blocking mode and start offset to be specified. Always locks a 1-byte range. Make the lock structure static to avoid initialising the whole structure each time. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	9daf22a5c9	ctdb-mutex: Handle pings from lock checking child to parent The ping timeout is specified by passing an extra argument to the mutex helper, representing the ping timeout in seconds. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	b5db286791	ctdb-mutex: Do inode checks in a child process In future this will allow extra I/O tests and a timeout in the parent to (hopefully) release the lock if the child gets wedged. For simplicity, use tmon only to detect when either parent or child goes away. Plumbing a timeout for pings from child to parent will be done later. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	2ecdbcb22c	ctdb-mutex: Rename wait_for_lost to lock_io_check This will be generalised to do more I/O-based checks. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	7ab2e8f127	ctdb-mutex: Rename recheck_time to recheck_interval There will be more timeouts so clarify the intent of this one. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	c396b61504	ctdb-mutex: Consistently use progname in error messages To avoid error messages having ridiculously long paths, set progname to basename(argv[0]). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	a8da8810f1	ctdb-tests: Add tests for trivial FD monitoring tmon_ping_test covers complex 2-way interaction between processes using tmon_ping_send(), including via a socketpair(). tmon_test covers the more general functionality of tmon_send() but uses a simpler 1-way harness with wide coverage. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	8d04235f46	ctdb-common: Add trivial FD monitoring abstraction Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	f9467cdf3b	ctdb-build: Link in backtrace support for ctdb_util_tests Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	7a1c43fc74	ctdb-build: Separate test backtrace support into separate subsystem A convention when testing members of ctdb-util is to include the .c file so that static functions can potentially be tested. This means that such tests can't be linked against ctdb-util or duplicate symbols will be encountered. ctdb-tests-common depends on ctdb-client, which depends in turn on ctdb-util, so this can't be used to pull in backtrace support. Instead, make ctdb-tests-backtrace its own subsystem. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	b195e8c0d0	ctdb-build: Sort sources in ctdb-util and ctdb_unit_tests Also, rename ctdb_unit_tests to ctdb_util_tests. The sorting makes it clear that only items from ctdb-util are tested here. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-28 10:09:34 +00:00
Martin Schwenke	3efa56aa61	ctdb-daemon: Fix printing of tickle ACKs Commit `f5a2037734` arguably got this back-to-front: 2022-07-27T09:50:01.985857+10:00 testn1 ctdbd[17820]: ../../ctdb/server/ctdb_takeover.c:514 sending TAKE_IP for '10.0.1.173' 2022-07-27T09:50:01.990601+10:00 testn1 ctdbd[17820]: Send TCP tickle ACK: 10.0.1.77:33004 -> 10.0.1.173:2049 2022-07-27T09:50:01.991323+10:00 testn1 ctdb-takeover[19758]: TAKEOVER_IP 10.0.1.173 succeeded on node 0 Unfortunately there is an inconsistency somewhere in the connection tracking code used for tickle ACKs, making this less than obvious. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Thu Jul 28 09:02:08 UTC 2022 on sn-devel-184	2022-07-28 09:02:08 +00:00
Martin Schwenke	30c40046ef	ctdb-build: Add missing dependency on talloc The include isn't strictly necessary, since it is included via common/reqid.c anyway. However, it is a useful hint. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Jul 22 17:01:00 UTC 2022 on sn-devel-184	2022-07-22 17:01:00 +00:00
Martin Schwenke	e831af7b25	ctdb-tests: Work around unreadable file test failure when root root can read files for which the mode prohibits reading, so this test case fails when run as root. Work around this when running as root. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	b20ccaa36d	ctdb-scripts: Use "git config" as last resort to parse nfs.conf Some versions of nfs-utils (e.g. recent CentOS 7) use /etc/nfs.conf but do not include the nfsconf utility to extract values from the file. However, git has an excellent conf file parser, so use it as a last resort. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
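As a rough illustration of the approach in the commit above, `git config --file` can read a key from an INI-style file. The helper name `nfs_conf_get` and the sample file contents are assumptions made for this sketch; the actual script may be structured differently.

```shell
#!/bin/sh
# nfs_conf_get is a hypothetical helper, not the code from the commit.
# Usage: nfs_conf_get <file> <section> <key>
nfs_conf_get()
{
	git config --file "$1" --get "$2.$3"
}

# Use a throwaway file rather than the real /etc/nfs.conf.
conf=$(mktemp)
cat >"$conf" <<EOF
[nfsd]
# comment lines are ignored
threads = 16
EOF

# Only attempt the lookup when git is installed.
if command -v git >/dev/null 2>&1; then
	nfs_conf_get "$conf" nfsd threads
fi
rm -f "$conf"
```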
Martin Schwenke	db37043bc5	ctdb-scripts: Avoid ShellCheck warning SC2295 For example: In /home/martins/samba/samba/ctdb/tools/onnode line 304: [ "$nodes" != "${nodes%[ ${nl}]}" ] && verbose=true ^---^ SC2295 (info): Expansions inside ${..} need to be quoted separately, otherwise they match as patterns. Did you mean: [ "$nodes" != "${nodes%[ "${nl}"]}" ] && verbose=true For more information: https://www.shellcheck.net/wiki/SC2295 -- Expansions inside ${..} need to b... Who knew? Thanks ShellCheck! Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
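The effect behind SC2295 can be reproduced with a small example. The variable names and values here are invented for illustration and are unrelated to onnode.

```shell
#!/bin/sh
name="abc"
suffix="?"

# Unquoted: $suffix is interpreted as a glob, "?" matches any single
# character, so the trailing "c" is removed.
unquoted=${name%$suffix}

# Quoted: only a literal "?" would be removed, and there is none.
quoted=${name%"$suffix"}

printf 'unquoted: %s, quoted: %s\n' "$unquoted" "$quoted"
# prints: unquoted: ab, quoted: abc
```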
Martin Schwenke	00f1d6d947	ctdb-common: Use POSIX if_nameindex() to check interface existence This works as an unprivileged user, so avoids unnecessary errors when running in test mode (and not as root): 2022-02-18T12:21:12.436491+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket 2022-02-18T12:21:12.436534+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket 2022-02-18T12:21:12.436557+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket 2022-02-18T12:21:12.436577+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket The corresponding porting test would now become pointless because it would just confirm that "fake" does not exist. Attempt to make it useful by using a less likely name than "fake" and attempting to detect the loopback interface. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	c77a4fde7a	ctdb-daemon: Modernise debug in ctdb_add_public_address() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	d62fcba7dc	ctdb-daemon: Avoid spurious error sending ARPs for released IP A public IP address can be released in between (and probably before) attempts to send ARPs. One situation when this can occur is when a cluster is shutting down: node A shuts down first, public IPs from node A are taken over by node B, node B is shutdown. Notice this when it occurs and cancel further attempts to send ARPs. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	f5a2037734	ctdb-daemon: Modernise debug in ctdb_control_send_arp() For the tickle ACK logging, render the connection in a buffer. This produces more complete information. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	ec5f6425b7	ctdb-protocol: Add separator argument to ctdb_connection_to_buf() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	440bd86a99	ctdb-daemon: Drop unused ban_state element from CTDB node structure Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	9898e7c555	ctdb-recoverd: Clean up banning culprit code Make this fully self-contained in the recovery daemon and avoid indexing by PNN. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	19fbc2da38	ctdb-recoverd: Add pnn field to banning state structure This structure is now standalone, so indexing by PNN can be avoided via a subsequent commit. Index by culprit here to make this commit simple. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	0b5dd07604	ctdb-recoverd: Add function node_flags() and use it in elections Indexing a node map by PNN is suboptimal. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 16:09:31 +00:00
Martin Schwenke	e396eb9fbc	ctdb-scripts: Only run unhealthy call-out when passing threshold For memory usage, no need to dump all of this data on every failed monitor event. The first call will be enough to diagnose the problem. The node will then go unhealthy, drop clients and memory usage should then drop. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Jul 22 07:32:54 UTC 2022 on sn-devel-184	2022-07-22 07:32:54 +00:00
Martin Schwenke	36bd6fd01f	ctdb-scripts: Always check memory usage If filesystem usage exceeds the unhealthy threshold then memory usage checking is not done. Always do them both. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	5e7bbcb069	ctdb-scripts: Avoid ShellCheck info SC2162 SC2162 (info): read without -r will mangle backslashes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
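What SC2162 warns about can be seen in a short sketch. The input string is an invented example; the plain `read` is deliberate here to show the mangling.

```shell
#!/bin/sh
line_in='C:\temp\new'

# Without -r, read treats each backslash as an escape character and
# drops it, so the backslashes vanish from the result.
plain=$(printf '%s\n' "$line_in" | {
	read line
	printf '%s' "$line"
})

# With -r, the line is stored exactly as it was written.
raw=$(printf '%s\n' "$line_in" | {
	read -r line
	printf '%s' "$line"
})

printf 'plain: %s\nraw: %s\n' "$plain" "$raw"
# prints:
# plain: C:tempnew
# raw: C:\temp\new
```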
Martin Schwenke	dc7aaca889	ctdb-scripts: Reduce length of very long lines Use printf to allow easier line breaks and use some early returns. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	fc485feae8	ctdb-scripts: De-clutter validate_percentage() It always takes 2 arguments. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	a832c8e273	ctdb-scripts: Reformat using shfmt -w -p -i 0 -fn About to modify this file, so reformat first as per recent Samba convention. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	3df39aa7fb	ctdb-scripts: Avoid ShellCheck warning SC2164 SC2164 (warning): Use 'cd ... || exit' or 'cd ... || return' in case cd fails. A problem can only occur if /etc/ctdb/ or an important subdirectory is removed, which means the script itself would not be found. Use && to silence ShellCheck. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-07-22 06:38:32 +00:00
Martin Schwenke	be293a125f	ctdb-tests: Add new tool unit tests to cover UNKNOWN state Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue Jun 28 10:16:59 UTC 2022 on sn-devel-184	2022-06-28 10:16:59 +00:00
Vinit Agnihotri	794f125802	ctdb-tool: Add UNKNOWN pseudo state When a node is starting, CTDB reports remote nodes as unhealthy by default. This can be misleading. To hide this, report an "UNKNOWN" pseudo state when a remote node is not disconnected and the runstate is less than or equal to "FIRST_RECOVERY". Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-28 09:24:31 +00:00
Vinit Agnihotri	428bc71f98	ctdb-tests: Add runstate handling to fake ctdbd Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-28 09:24:31 +00:00
Martin Schwenke	05601cebc9	ctdb-tests: Return error on empty fake ctdbd configuration blocks These would be unintended errors. The block should be omitted to keep the default value. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-28 09:24:31 +00:00
Martin Schwenke	80ba66013e	ctdb-scripts: Drop use of eval in CTDB callout handling eval is not required and causes the following ShellCheck warning: SC2294 (warning): eval negates the benefit of arrays. Drop eval to preserve whitespace/symbols (or eval as string). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Fri Jun 24 10:40:50 UTC 2022 on sn-devel-184	2022-06-24 10:40:50 +00:00
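A small sketch of the difference that dropping `eval` makes when arguments contain whitespace. `count_args` is a hypothetical stand-in for a callout; it is not part of CTDB.

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for a callout program.
count_args()
{
	printf '%s\n' "$#"
}

callout="count_args"
arg="two  words"

# eval joins its arguments into one string and parses it again, so the
# single argument is split at the whitespace.
with_eval=$(eval "$callout" "$arg")

# A direct call passes the argument through intact.
without_eval=$("$callout" "$arg")

printf 'with eval: %s, without eval: %s\n' "$with_eval" "$without_eval"
# prints: with eval: 2, without eval: 1
```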
Martin Schwenke	4cbb0b13ba	ctdb-tests: Do not require eval tricks for faking NFS callout The current code requires the use of eval in the NFS callout handling to facilitate testing. Improve the code to remove this need. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:33 +00:00
Martin Schwenke	0247fd8a02	ctdb-scripts: Avoid ShellCheck warning SC2162 SC2162 read without -r will mangle backslashes Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:33 +00:00
Martin Schwenke	7f799a8d6f	ctdb-tests: Fix faking of program stack traces The current code works in all current cases but is lazy and wrong. Fix it to avoid breaking on code changes involving different thread setups. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:33 +00:00
Martin Schwenke	0b728a4e8f	ctdb-tests: Improve Debian-style event script unit testing Tests can be run by hand using different distro styles, such as: CTDB_NFS_DISTRO_STYLE=systemd-debian \ ./tests/run_tests.sh ./tests/UNIT/eventscripts/{06,60}.nfs.* This fixes known problems for Debian styles, so the tests now pass for the following values of CTDB_NFS_DISTRO_STYLE: systemd-redhat sysvinit-redhat systemd-debian sysvinit-debian Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:33 +00:00
Martin Schwenke	7f3a0c7e9c	ctdb-scripts: Parameterise /etc directory to aid testing At the moment test results can be influenced by real system configuration files. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	337ef7c1b4	ctdb-scripts: Set NFS services to "AUTO" if started by another service For example, in Sys-V init "rquotad" is started by the main "nfs" service. At the moment the call-out can't distinguish between this case and "should never be run". Services set to "AUTO" are hand-stopped/started via service_stop()/service_start() on failure via restart_after. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	8b8660d883	ctdb-scripts: Refactor the manual RPC service start/stop This logic needs improving, so factor the decision making into new functions service_or_manual_stop() and service_or_manual_start(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	cd018d0ff5	ctdb-scripts: Simplify and rename basic_stop() and basic_start() Drop the argument. These now just stop/start the overall NFS service, so rename them appropriately. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	09fd1e5579	ctdb-scripts: Move nfslock out of basic_stop() and basic_start() These are only called in one place and should be done inline, since that is less confusing. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	a43a1ebe51	ctdb-tests: Reformat script Samba is reformatting shell scripts using shfmt -w -p -i 0 -fn so update this one before editing. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-06-24 09:49:32 +00:00
Martin Schwenke	e752f841e6	ctdb-daemon: Use DEBUG() macro for child logging Directly using dbgtext() with file logging results in a log entry with no header, which is wrong. This is a regression, introduced in commit `10d15c9e5d`. Prior to this, CTDB's callback for file logging would always add a header. Use DEBUG() instead of dbgtext(). Note that DEBUG() effectively compares the passed script_log_level with DEBUGLEVEL, so an explicit check is no longer necessary. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15090 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Thu Jun 16 13:33:10 UTC 2022 on sn-devel-184	2022-06-16 13:33:10 +00:00
Martin Schwenke	88f35cf862	ctdb-daemon: Drop unused prefix, logfn, logfn_private These aren't set anywhere in the code. Drop the log argument because it is also no longer used. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15090 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org>	2022-06-16 12:42:35 +00:00
Martin Schwenke	1596a3e84b	ctdb-common: Tell file logging not to redirect stderr This allows ctdb_set_child_logging() to work. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15090 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Volker Lendecke <vl@samba.org>	2022-06-16 12:42:35 +00:00
Martin Schwenke	b20ee18031	ctdb-tests: Fix a cut and paste error in a comment Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue May 31 05:56:43 UTC 2022 on sn-devel-184	2022-05-31 05:56:43 +00:00
Martin Schwenke	90a96f06a9	ctdb-recoverd: Do not ban on unknown error when taking cluster lock If the cluster filesystem is unavailable then I/O errors may occur. This is no worse than contention, so don't ban. This avoids having services unavailable for longer than necessary. Update the associated test to simply confirm that this results in a leaderless cluster, and leadership is restored when the lock can once again be taken. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-31 05:06:29 +00:00
Martin Schwenke	a400f4e7cc	ctdb-doc: Fix typos in the policy routing documentation Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-31 05:06:29 +00:00
Martin Schwenke	da9decfc5e	ctdb-daemon: Remove unused #includes of rb_tree.h ctdb_takeover.c and eventscript.c no longer use this. ipalloc_common.c has never used it. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-31 05:06:29 +00:00
Martin Schwenke	80de84d36e	ctdb-daemon: Log per-database summary of resent calls After a recovery that takes a significant amount of time the logs are flooded with messages about every resent call. Log a summary instead and demote per-call messages to INFO level. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-31 05:06:29 +00:00
Pavel Filipenský	8cb6565011	ctdb: Covscan: unchecked return value for trbt_traversearray32() Signed-off-by: Pavel Filipenský <pfilipen@redhat.com> Reviewed-by: Jeremy Allison <jra@samba.org> Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>	2022-05-14 03:49:32 +00:00
Pavel Filipenský	91d1d0e4c8	ctdb: Fix trailing whitespace in rb_tree.c Signed-off-by: Pavel Filipenský <pfilipen@redhat.com> Reviewed-by: Jeremy Allison <jra@samba.org> Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>	2022-05-14 03:49:32 +00:00
Martin Schwenke	64275fc1a2	ctdb-tests: Add backtrace on abort to some tests These are easier to debug with a backtrace. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Tue May 3 10:13:23 UTC 2022 on sn-devel-184	2022-05-03 10:13:23 +00:00
Martin Schwenke	d39377d6fc	ctdb-tests: Provide a method to dump the stack on abort Some tests make generous use of assert() and it can be difficult to guess the cause of failures without resorting to GDB. This provides some help. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	73b27def7b	build: Add missing ctdb-client dependencies Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	d57d624a77	ctdb-build: Drop unnecessary uses of include/ sub-directory None of these include any files from the include/ sub-directory. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	6d3c9e64d9	ctdb-tests: Use test_case() to help document test cases Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	d52b497d11	ctdb-locking: Don't pass NULL to tevent_req_is_unix_error() If there is an error then this pointer is unconditionally dereferenced. However, the only possible error appears to be ENOMEM, where a crash caused by dereferencing a NULL pointer isn't a terrible outcome. In the absence of a security issue this is probably not worth backporting. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	490e5f4d4c	ctdb-mutex: Don't pass NULL to tevent_req_is_unix_error() If there is an error then this pointer is unconditionally dereferenced. However, the only possible error appears to be ENOMEM, where a crash caused by dereferencing a NULL pointer isn't a terrible outcome. In the absence of a security issue this is probably not worth backporting. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-05-03 09:19:31 +00:00
Martin Schwenke	8deec3bc67	ctdb-scripts: Drop unused ctdbd_wrapper Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	a1e78cc372	ctdb-scripts: Drop uses of ctdbd_wrapper The only value this now provides is use of a notification script to log when start/stop are called. This was used for debugging strange start/stop failures, which have not been recently seen. Also, systemd does a good job of logging start/stop. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	aca5972233	ctdb-scripts: Remove failsafe that drops all IPs on failed shutdown IPs are dropped in the shutdown event. If a watchdog is necessary to ensure public IPs aren't on interfaces when CTDB isn't running, then see ctdb-crash-cleanup.sh. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	6fb08a6580	ctdb-daemon: Don't release all public IPs during shutdown sequence This further untangles public IP handling from the main daemon. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	cb438ecfd4	ctdb-scripts: Drop all public IPs in the "shutdown" event This is functionally the same as ctdb_release_all_ips(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	3caddaafa0	ctdb-config: Drop CTDB_STARTUP_TIMEOUT This was added to be able to notice startup failures when unknown tunables were present in the configuration. Tunables are now set by the daemon, so this is no longer necessary. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	208034ecfe	ctdb-doc: Update documentation for tunables configuration Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	0902553d15	ctdb-scripts: No longer load tunables via 00.ctdb.script setup event Drop related tests. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	f49446cb1e	ctdb-daemon: Load tunables from ctdb.tunables Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	a509ee059e	ctdb-daemon: New function ctdb_tunables_load() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	b14f2a205d	ctdb-tests: Add unit tests for tunables code This aims to test ctdb_tunable_load_file() but also exercises ctdb_tunable_names() and ctdb_tunable_get_value(). ctdb_tunable_set_value() is indirectly exercised via ctdb_tunable_load_file(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	381134939b	ctdb-tests: Add function test_case(), tweak unit test header format Instead of documenting test cases with a comment, this allows them to be documented via an argument to a function that is printed when the test case is run. This makes it easier to locate test case failures when commands used by test cases look similar. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	c413838f79	ctdb-tests: Strip trailing newlines from expected result output This allows the provided output to be specified a little more carelessly. As per the comment, trailing newlines can't be matched anyway, so this is notionally a bug fix. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	5fa0c86b61	ctdb-tests: Reformat script Samba is reformatting shell scripts using shfmt -w -p -i 0 -fn so update this one before editing. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	bcd66e17ee	ctdb-common: Add function ctdb_tunable_load_file() Allows direct loading of tunables from a file. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Vinit Agnihotri	93824b8c33	packaging: move CTDB service file to top-level Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Martin Schwenke	2f6b31788b	ctdb-packaging: Move RPM spec file to examples directory We used to use this for building test packages for standalone CTDB. However, our testing has now changed to use binary tarballs. We believe we were the only users of this spec file and expect CTDB to only be installed as part of a top-level Samba build, especially in RPM form. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-04-06 06:34:37 +00:00
Stefan Metzmacher	aa02cf3c44	ctdb/packaging/RPM: don't use waf directly ./configure && make && make install will always work. Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2022-03-29 22:32:32 +00:00
Stefan Metzmacher	22c46d9f41	configure/Makefile: export PYTHONHASHSEED=1 in all 'configure/Makefile' scripts Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2022-03-29 22:32:32 +00:00
Archana	7debfe7a23	ctdb-tools: Remove deprecated networking commands and replace with new commands The changes are made to replace the deprecated network commands (ifconfig, netstat) with the new commands (ip addr, ss) respectively. Signed-off-by: Archana Chidirala <archana.chidirala.chidirala@ibm.com> Reviewed-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Tue Mar 8 12:30:53 UTC 2022 on sn-devel-184	2022-03-08 12:30:53 +00:00
Archana	e16cd0316f	ctdb-packaging: Remove deprecated networking command netstat and replace with "ss" command Signed-off-by: Archana Chidirala <archana.chidirala.chidirala@ibm.com> Reviewed-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2022-03-08 11:32:36 +00:00
Martin Schwenke	0d8084ed62	ctdb-protocol: CID 1499395: Uninitialized variables (UNINIT) Issue is reported here: 853 case CTDB_CONTROL_DB_VACUUM: { 854 struct ctdb_db_vacuum db_vacuum; 855 >>> CID 1499395: Uninitialized variables (UNINIT) >>> Using uninitialized value "db_vacuum.full_vacuum_run" when calling "ctdb_db_vacuum_len". 856 CHECK_CONTROL_DATA_SIZE(ctdb_db_vacuum_len(&db_vacuum)); 857 return ctdb_control_db_vacuum(ctdb, c, indata, async_reply); 858 } The problem is that ctdb_bool_len() unnecessarily dereferences its argument, which in this case is &db_vacuum.full_vacuum_run. Not a security issue because the value copied by dereferencing is not used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Wed Feb 23 02:02:06 UTC 2022 on sn-devel-184	2022-02-23 02:02:06 +00:00
Martin Schwenke	0f373443ef	ctdb-tests: Fix missing #include for sigaction(2) Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-23 01:08:37 +00:00
Martin Schwenke	ef9017a150	ctdb-tests: Dump a stack trace on abort Debugging a test failure here without GDB is not possible. Dumping a stack trace gives a good hint. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-23 01:08:37 +00:00
Martin Schwenke	17d792e9aa	ctdb-tests: Iterate protocol tests internally Instead of repeatedly running a test binary. Run time for these tests reduces from ~90s to ~75s. When run under valgrind, the run time for protocol_test_001.sh reduces from ~390s to <1s. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Feb 14 04:32:29 UTC 2022 on sn-devel-184	2022-02-14 04:32:29 +00:00
Martin Schwenke	2329305019	ctdb-tests: Add iteration support for protocol tests The current method of repeatedly running a binary has huge overhead, especially with valgrind. protocol_test_iterate_tag() allows output that is usually used for hinting where a test failure occurred to be replaced with a tag stored in a buffer, which is printed on test failure. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 03:36:38 +00:00
Martin Schwenke	331c435ce5	ctdb-tests: Add a test for stalled node triggering election A stalled node probably continues to hold the cluster lock, so confirm elections work in this case. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Feb 14 02:46:01 UTC 2022 on sn-devel-184	2022-02-14 02:46:01 +00:00
Martin Schwenke	265e44abc4	ctdb-tests: Factor out functions to detect when generation changes BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 01:47:31 +00:00
Martin Schwenke	0e74e03c9c	ctdb-recoverd: Consistently log start of election Elections should now be quite rare, so always log when one begins. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 01:47:31 +00:00
Martin Schwenke	bf55a0117d	ctdb-recoverd: Always send unknown leader broadcast when starting election This is currently missed when the cluster lock is lost. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 01:47:31 +00:00
Martin Schwenke	9b3fab052b	ctdb-recoverd: Consistently have caller set election-in-progress The problem here is that election-in-progress must be set to potentially avoid restarting the election broadcast timeout in main_loop(), so this is already done by leader_handler(). Have force_election() set election-in-progress for all election types and do not bother setting it in cluster_lock_election(). BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 01:47:31 +00:00
Martin Schwenke	188a902156	ctdb-recoverd: Always cancel election in progress Election-in-progress is set by unknown leader broadcast, so needs to be cleared in all cases when election completes. This was seen in a case where the leader node stalled, so didn't send leader broadcasts for some time. The node continued to hold the cluster lock, so another node could not become leader. However, after the node returned to normal it still did not send leader broadcasts because election-in-progress was never cleared. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14958 Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-02-14 01:47:31 +00:00
Martin Schwenke	f7de2132bb	ctdb-doc: Remove documentation for recovery process This is many years out of date and recent changes make it worse. It is unlikely that anyone has the time to fix this in the near future, so remove it because it is misleading. Database recovery steps are well documented in comments in the recovery helper. Cluster monitoring documentation can be re-added when things stop changing. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	a940ad9370	ctdb-doc: Update example configuration migration script Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	01313ea243	ctdb-tests: Improve test coverage for leader role yield and elections Rename test, clean up node selection. Duplicate for banning and removing leader capability cases. Repeat all 3 tests without cluster lock. All of the standard election triggers are now tested, with and without cluster lock. Due to test cluster configuration limitations, the tests without cluster lock are skipped on a real cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	5d31778149	ctdb-tests: Support commenting out local daemons configuration options Can be used to disable default options, such as cluster lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	34d2ca0ae6	ctdb-config: Add configuration option [cluster] leader timeout Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	1dfb266038	ctdb-config: [legacy] recmaster capability -> [cluster] leader capability Rename this configuration item and move it into the [cluster] configuration section. Update documentation to match. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	f5a39058f0	ctdb-config: [cluster] recovery lock -> [cluster] cluster lock Retain "recovery lock" and mark as deprecated for backward compatibility. Some documentation is still inconsistent. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	d752a92e11	ctdb-doc: Update documentation for leader and cluster lock Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	73555e8248	ctdb-recoverd: Use race for cluster lock as election when lock is enabled If the cluster is partitioned then nodes in one partition can not take the lock anyway, so election is pointless. It just introduces unnecessary corner cases. Instead just race for the lock. When a node notices a lack of leader and notifies other nodes of an election via an unknown leader broadcast, the cluster lock election is hooked into this broadcast. The test needs to be updated because losing the cluster lock can now result in a leadership change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	938d64c8ff	ctdb-protocol: Mark {GET,SET}_RECMASTER controls obsolete Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	03ae158cff	ctdb-protocol: Drop marshalling for {GET,SET}_RECMASTER controls Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	a76374070d	ctdb-daemon: Drop implementation of {GET,SET}_RECMASTER controls Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	193b624d26	ctdb-protocol: Drop protocol client functions for recmaster controls Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	cda673ff6d	ctdb-client: Drop unused recmaster functions Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	16efbca003	ctdb-daemon: Drop unused old client recmaster functions Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	c68267b2a6	ctdb-recoverd: Drop calls to ctdb_ctrl_setrecmaster() Nothing fetches this value anymore. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	58d7fcdf7c	ctdb-recoverd: Drop recovery master verification This doesn't make sense if leader broadcasts are used. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	f02e097485	ctdb-tools: recovery master -> leader The following command names are changed: recmaster -> leader setrecmasterrole -> setleaderrole Command output changed for the following commands: status getcapabilities Documentation and tests are updated to reflect these changes. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	e60581d5b5	ctdb-tools: Use leader broadcast in get_leader() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	92fb68e9b8	ctdb-tools: Factor out get_leader() This seems pointless but it localises a subsequent change and also starts a terminology change in the tool code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	17ba15ccd8	ctdb-tools: Handle leader broadcasts in ctdb tool Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	ec90f36cc6	ctdb-tools: Print "UNKNOWN" when leader PNN is unknown Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	01a8d1a4a4	ctdb-client: Factor out function ctdb_client_wait_func_timeout() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	403db5b528	ctdb-tests: Factor out getting leader and waiting for leader change Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	4786982cc8	ctdb-tests: Add leader broadcasts to fake_ctdbd Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Amitay Isaacs	756dfdfed9	ctdb-tests: Implement srvid_handler for dispatching messages Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2022-01-17 10:21:33 +00:00
Martin Schwenke	958746f947	ctdb-recoverd: Simplify some stopped/banned checks to inactive checks Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	358c59f51a	ctdb-recoverd: No longer take cluster lock during recovery Confirm instead that it is already held. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	36ffaaa691	ctdb-recoverd: Add and use function cluster_lock_enabled() Now all references to ctdb->recovery_lock are encapsulated in the cluster lock code. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	5ee664ee17	ctdb-recoverd: Terminology change: recovery lock -> cluster lock No functional changes, just name changes for clarity. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	0f2250f4f9	ctdb-recoverd: Take cluster lock when election completes It is no longer just a recovery lock but is always held by the cluster leader. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	011e880002	ctdb-recoverd: Factor out function cluster_lock_take() Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:33 +00:00
Martin Schwenke	037abf8620	ctdb-tests: Avoid a race See the comment in the code for details. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	ef7e3265f7	ctdb-tests: Setup cluster with expected arguments ctdb_test_init() doesn't actually pass arguments to local_daemons.sh. This needs to be done using ctdb_nodes_start_custom(). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	b029ca4d51	ctdb-recoverd: Drop leader validation The introduction of the leader broadcast timeout provides an alternative to the current leader validation. Using the leader broadcast may not be as fast but it is more correct. When the leader node is stopped or banned, the only way of triggering an election is currently to fetch the leader's node map to check whether it is still active. This is because the leader will no longer push the node map to other nodes. However, having all nodes fetch the node map from an inactive leader may be unreliable. Most of the other cases are also handled more reliably by the leader broadcast timeout. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	7e53fab0a3	ctdb-recoverd: Drop special case for elected-before-connected This no longer occurs at startup due to the leader broadcast timeout. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	ef4b8c13c0	ctdb-recoverd: Handle leader broadcast timeout If no leader broadcasts have been received from the leader for more than 5s then trigger an election. Apart from being sane behaviour, this avoids elected-before-connected bugs at startup, where a node elects itself leader before it is connected to other nodes. When a node processes a leader broadcast timeout it sends an unknown leader broadcast to all nodes. That causes cancellation of the leader broadcast timeout across the cluster. This is particularly important at startup, since nodes may be started in a staggered fashion. Without this cluster-wide cancellation, a node might notice the lack of leader, win an election and complete a recovery before other nodes notice the lack of leader. When the leader broadcast timeout finally occurs on the other nodes then they'll put the cluster back into an unnecessary recovery. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	5c7f6da0f0	ctdb-recoverd: Send leader broadcasts These are triggered on a 1 second timer, but are only sent if the node is the current leader and there is no election underway. If this node can not be the leader then ensure it releases the recovery lock. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	789a75abfa	ctdb-recoverd: Process leader broadcasts Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	3d3767a259	ctdb-protocol: Add CTDB_SRVID_LEADER CTDB_SRVID_LEADER will be regularly broadcast to all connected nodes by the leader. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	c2cfd9c21a	ctdb-recoverd: Add an explicit flag for election in progress An alternate election method will be added that doesn't use the election timeout, so this provides a common way for recognising when an election is in progress. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	ac5a3ca063	ctdb-recoverd: Only start election if node can be leader Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	7baadfe27e	ctdb-recoverd: Add and use function this_node_can_be_leader() This makes the code self-documenting. In ctdb_election_data() there is a slight behaviour change. An inactive node will now try to lose an election. This case should not happen because: * An inactive node can't win an election round and then send a reply. * Any inactive node should never start an election. There are currently places where this happens and they will be fixed later. There is an instance where this could be used in validate_recovery_master() but this involves a more serious logic change. Overhaul this function later. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	94b546c268	ctdb-recoverd: Logging/comments: recovery master -> leader There are some remaining instances in this file but they will be removed in subsequent commits. Modernise debug macros as appropriate. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	dd79e9bd14	ctdb-recoverd: Rename recmaster field to leader Recovery master is being renamed to leader. This follows clustering best practice (e.g. RAFT). Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	2ee6763c7d	ctdb-recoverd: Use rec->pnn everywhere This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? rec->pnn is now always used when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	4af3b10a37	ctdb-recoverd: Change argument to srvid_disable_and_reply() Reduce dependency on struct ctdb_context internals, enable a subsequent change. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	b7c138ca99	ctdb-recoverd: Simplify arguments to ctdb_ban_node() ban_time argument is always ctdb->tunable.recovery_ban_period, so build this in and make the calling code more readable. ctdb_ban_node() already logs how long a node is banned for, so don't repeatedly log this. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	a5e0ddac62	ctdb-recoverd: Simplify arguments to verify_local_ip_allocation() All other arguments are available via rec, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	67b5191640	ctdb-recoverd: Simplify arguments to do_recovery() pnn and nodemap are both available via the rec context, so simplify. vnnmap is unused. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	57882beb16	ctdb-recoverd: Simplify arguments to some election functions The pnn and nodemap arguments to force_election() and send_election_request() are always effectively rec->pnn and rec->nodemap, so simplify. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	9dbe7cc85e	ctdb-recoverd: Add PNN to recovery daemon context This is currently referenced in a number of inconsistent ways, including: * pnn * rec->ctdb->pnn * ctdb->pnn * ctdb_get_pnn(ctdb) * ctdb_get_pnn(rec->ctdb) The first of these always requires some thought about the context - is this the node PNN or some other PNN (e.g. argument to function)? The intention is to always use rec->pnn when referring to the recovery daemon's PNN. Doing this also reduces reliance on struct ctdb_context internals. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	ff0140e470	ctdb-recoverd: Use this_node_is_leader() in an extra context This is arguably clearer. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	c8721d01c6	ctdb-recoverd: Factor out and use function this_node_is_leader() Make the code self-documenting. This preempts an upcoming change to terminology but doing it now saves a lot of churn. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 10:21:32 +00:00
Martin Schwenke	57a32cebdd	ctdb-recoverd: Pass SIGHUP to running helper The recovery and takeover helpers can run for a while and generate non-trivial logs, so have them reopen their logs to support log rotation. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Jan 17 04:36:30 UTC 2022 on sn-devel-184	2022-01-17 04:36:30 +00:00
Martin Schwenke	8e949a6082	ctdb-recoverd: Record helper PID in recovery daemon context Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2022-01-17 03:43:30 +00:00


9267 Commits