mirror of https://github.com/samba-team/samba.git synced 2025-01-20 14:03:59 +03:00

9257 Commits

Author SHA1 Message Date
Volker Lendecke
25a222225d ctdb: Use str_list_add_printf() in lock_helper_args()
Saves lines, str_list_add_printf takes care of NULL checks

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Sun Sep 22 10:44:59 UTC 2024 on atb-devel-224
2024-09-22 10:44:59 +00:00
Volker Lendecke
83716809a8 ctdb: Change the ctdb_vfork_exec prototype to const char*const*
I could not find out how to cast a char ** to const char ** without
warning. This transfers fine to the execv call as well.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-09-22 09:36:36 +00:00
Volker Lendecke
53750d9deb ctdb: Fix a typo
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Noel Power <noel.power@suse.com>
2024-09-20 17:13:37 +00:00
Volker Lendecke
65b3081f4b ctdb: Use str_list_add_printf() in debug_locks_args()
Saves lines, str_list_add_printf takes care of NULL checks

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Noel Power <noel.power@suse.com>
2024-09-20 17:13:37 +00:00
Volker Lendecke
2fa0eabe64 ctdb: Make ctdb_lock_timeout_handler() easier to understand
Don't hide the real action inside an if-branch

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Noel Power <noel.power@suse.com>
2024-09-20 17:13:37 +00:00
Martin Schwenke
574f2c3ed8 ctdb-tests: Add persistent TDB backup tests
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Aug 30 00:08:41 UTC 2024 on atb-devel-224
2024-08-30 00:08:41 +00:00
Martin Schwenke
05da9001b9 ctdb-scripts: Add support for backing up persistent TDBs
Signed-off-by: Vinit Agnihotri <vagnihotri@ddn.com>
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-29 22:48:33 +00:00
Martin Schwenke
82250f3629 ctdb-scripts: Move database handling to its own event script
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-29 22:48:33 +00:00
Martin Schwenke
9c354e358e ctdb-scripts: Reformat with "shfmt -w -p -i 0 -fn"
Best reviewed with "git show -w".

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-29 22:48:33 +00:00
Martin Schwenke
b4c7a4f7f0 ctdb-scripts: Remove unused variable NFS_HOSTNAME
This was passed to CTDB's old smnotify.  This has been replaced by use
of nfs-utils' sm-notify, which doesn't need this.

In test, a fake NFS_HOSTNAME is still needed.  Real sm-notify will get
it from a reverse host lookup of the IP address.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-29 22:48:33 +00:00
Martin Schwenke
ece6153038 ctdb-scripts: Use nfs-utils' sm-notify instead of CTDB's smnotify
CTDB's smnotify does not support IPv6 and is difficult to maintain.

So, create directories of files and pass them to nfs-utils' sm-notify.

There is an implied change here, because nfs-utils' sm-notify stopped
sending IP addresses as mon_name back in 2010:

  http://git.linux-nfs.org/?p=steved/nfs-utils.git;a=commitdiff;h=900df0e7c0b9006d72d8459b30dc2cd69ce495a5

This will change advice given in the wiki to use a hostname for the
cluster with round-robin DNS, since this is what is best supported.

Another behavioural change is that sm-notify only sends "up"
notifications with an odd state.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-29 22:48:33 +00:00
Martin Schwenke
d89506449f ctdb-failover: Add ctdb_smnotify_helper
statd callout will shortly be updated to use NFS utils' sm-notify.
This tiny helper will be used to create on-disk state files used by
sm-notify.  These state files contain endian-specific fields, so
better to write a simple C implementation than to do crazy things in a
shell script (or call out to Python).

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-29 22:48:33 +00:00
Volker Lendecke
3cc3329420 ctdb: Add a NULL check to convert_node_map_to_list()
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jennifer Sutton <jsutton@samba.org>
2024-08-27 07:19:32 +00:00
Martin Schwenke
578dfa5765 ctdb-scripts: Avoid flapping NFS services at startup
If an NFS service check is set to, say, unhealthy_after=2 then it will
always switch from the (default startup) unhealthy state to healthy,
even if there is a fatal problem.  If all services/scripts appear OK
then the node will become healthy.  When the counter hits the limit it
will return to unhealthy.  This is misleading.

Instead, never use the counter at startup, until the service becomes
healthy.  This stops services flapping unhealthy-healthy-unhealthy.

A side-effect is that a service that starts in a broken state will
never be restarted to try to fix the problem.  This makes sense.  The
counting and restarting really exist to deal with problems that might
occur under load.  The first monitor events occur before public IPs
are hosted, so there can be no load.  If a service doesn't start
reliably the first time then the admin probably wants to know about
it.

nfs_iterate_test() is updated to run an initial monitor event to mark
the services as healthy.  This initialises the counter so it can be
used for the important part of the test.  Passing the -i option avoids
running the extra monitor event, so the first iteration will be the
initial monitor event.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
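The startup rule described in the commit above can be sketched roughly as follows. This is a simplified illustration, not the code from the event script: the names `monitor_event`, `ever_healthy`, `fail_count` and `state` are hypothetical, and the exact threshold comparison is an assumption.

```shell
#!/bin/sh
# Illustrative sketch only; all names are hypothetical.

unhealthy_after=2 # failures tolerated once the service has been healthy
ever_healthy=no   # has the service passed a check since startup?
fail_count=0

# $1 is the check result, "ok" or "fail".  Sets $state.
monitor_event()
{
	if [ "$1" = "ok" ]; then
		ever_healthy=yes
		fail_count=0
		state=healthy
		return
	fi

	# At startup, before the first success, do not use the counter:
	# a failing service simply stays unhealthy.
	if [ "$ever_healthy" = "no" ]; then
		state=unhealthy
		return
	fi

	# After the first success, tolerate failures up to the limit.
	fail_count=$((fail_count + 1))
	if [ "$fail_count" -ge "$unhealthy_after" ]; then
		state=unhealthy
	else
		state=healthy
	fi
}

for result in fail fail ok fail fail; do
	monitor_event "$result"
	echo "$result -> $state"
done
```

In this sketch the first two failures leave the service unhealthy, so there is no unhealthy-healthy-unhealthy flap at startup. Counting only begins after the first successful check.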
Martin Schwenke
18a29ed367 ctdb-scripts: Make initial statistics output empty
This makes initial failure to retrieve statistics less likely to
result in a statistics change.  To help with this, statistics
retrieval stderr now goes to the log - only stdout goes to the file.

This means that the test code for checking statistics changes needs to
be redone to actually run the statistics command and check.  As with
rpcinfo output, this output needs to behave as deterministically in
the test code as it does in the event script.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
Martin Schwenke
032b7b49c9 ctdb-scripts: Only consider statistics on timeout
Checking statistics is only really relevant to timeouts.  That is, if
an rpcinfo times out it is worth checking if the service is making
progress.  If the RPC service is not registered then the statistics
don't need to be checked because they shouldn't be changing.

The 2 tests previously added to check statistics progress now
behave identically and fail on all iterations.  To support testing
with "timeouts", an optional TIMEOUT flag can now be added to the RPC
service passed to nfs_iterate_test().  2 new tests are added to
exercise the new behaviour.

The 2 new "if" statements in nfs_iterate_test() could be combined.
However, a subsequent commit would split them and would be more
difficult to read.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
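The decision described in the commit above might look something like the sketch below. It assumes that a timeout with changing statistics is treated as progress and therefore not as a failure. The commit message implies this but does not spell it out. The function name and its arguments are hypothetical.

```shell
#!/bin/sh
# Illustrative sketch only; check_rpc_service is a hypothetical name.

# $1: rpcinfo outcome: "ok", "timeout" or "unregistered"
# $2: "changed" if statistics differ from the previous check, else "same"
# Prints "pass" or "fail".
check_rpc_service()
{
	case "$1" in
	ok)
		echo pass
		;;
	timeout)
		# Only a timeout makes statistics relevant: if they are
		# changing then the service is assumed to be making progress.
		if [ "$2" = "changed" ]; then
			echo pass
		else
			echo fail
		fi
		;;
	*)
		# Not registered: statistics are not consulted at all.
		echo fail
		;;
	esac
}

check_rpc_service timeout changed
check_rpc_service unregistered changed
```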
Martin Schwenke
f7a96deafa ctdb-tests: Make _rpc_service_up() and _rpc_services_down() internal
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
Martin Schwenke
0919701a68 ctdb-tests: Make NFS RPC monitoring tests consistent
Update the remaining RPC monitoring tests to use nfs_iterate_test(),
depending on it to set results.  This makes all RPC monitoring tests
consistent, so they will all benefit from future improvements.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
Martin Schwenke
47c33a2442 ctdb-tests: Drop unnecessary "else"
Doing this in a previous commit would have made it more difficult to
read that commit.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
Martin Schwenke
8b2f228198 ctdb-tests: Replace implicit healthy behaviour with early exits
The early exits from the sub-shell make the obvious successes much
more obvious, and slightly simplify the code that follows.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
Martin Schwenke
a522864138 ctdb-tests: Simplify handling of statistics change
Handling this across two different functions led to insanity, so
simplify.

The handling of unhealthy_after when $_numfails = 0 implicitly causes
the node to be healthy.  This is how the "rpcinfo succeeds" case
works.  Doing it this way for statistics makes this patch easier to
read.  The implicit behaviour will go away in the next patch.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
Martin Schwenke
084a69d552 ctdb-tests: Move result check to rpc_set_service_failure_response()
The current structure here is wrong and repetitive.  Checking rpcinfo
result and determining output should be in the same place.

Failure counting is now contained in
rpc_set_service_failure_response(), but needs a file to survive the
sub-shell.

Don't attempt to combine and simplify code yet.  That would make this
commit harder to review.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
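The remark above about needing a file to survive the sub-shell reflects general shell behaviour, which the following minimal sketch demonstrates. The variable and function names are hypothetical and are not taken from the CTDB test code.

```shell
#!/bin/sh
# Illustrative sketch only; names are hypothetical.

# A counter kept in a variable is lost when it is updated in a sub-shell.
count=0
( count=$((count + 1)) )
echo "variable after sub-shell: $count"

# A counter kept in a file survives, because the file outlives the sub-shell.
count_file=$(mktemp)
echo 0 >"$count_file"

increment_count()
{
	read -r _n <"$count_file"
	echo $((_n + 1)) >"$count_file"
}

(increment_count)
(increment_count)
read -r count <"$count_file"
echo "file-backed count: $count"
rm -f "$count_file"
```

The first `echo` reports 0 because the increment happened in a child process. The second reports 2 because both sub-shells updated the same file.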
Martin Schwenke
4754001200 ctdb-tests: Initialise return code file
The output file is initialised, so doesn't need to be created on
success.  Treat the return code file the same way.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
Martin Schwenke
833deb067d ctdb-tests: Add function rpc_failure() to log failures and warnings
Improves readability, makes future changes easier.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
Martin Schwenke
1d9661d587 ctdb-tests: Argument 3 to nfs_iterate_test() is up iteration
Nothing more complex is ever done, so we might as well simplify and
reduce coupling.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
Martin Schwenke
7c5e708001 ctdb-tests: nfs_iterate_test() marks RPC service down
If an RPC service is given, it is automatically marked down.  This
avoids repetition in test cases and loosens coupling.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2024-08-20 22:50:34 +00:00
Martin Schwenke
8edb1fd13c ctdb-tcp: Remove a use of ctdb_addr_to_str()
This one is in a rarely used error path, so call a function that
talloc()s the string instead.

Again, this will also print the port, which might be useful if we ever
add the ability to also specify ports in the nodes list.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>

Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Tue Aug 20 14:24:14 UTC 2024 on atb-devel-224
2024-08-20 14:24:14 +00:00
Martin Schwenke
afaf151193 ctdb-tcp: Consolidate failure code
Same thing several times, so change to common failure code.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2024-08-20 13:06:33 +00:00
Martin Schwenke
f7aac2f755 ctdb-tcp: Use already constructed node name
Node has been found, so use the pre-constructed name instead of
calling ctdb_addr_to_str().

This will also print the port, which might be useful if we ever add
the ability to also specify ports in the nodes list.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2024-08-20 13:06:33 +00:00
Martin Schwenke
02c9e7a63f ctdb-tcp: Use path_rundir_append() to construct lock_path
The current constant value doesn't respect CTDB_TEST_MODE/CTDB_BASE.
Instead use the path module to allow automatic listening in test mode
with local daemons.

A single node can be tested with local daemons, using something like:

  $ tests/local_daemons.sh foo setup -n 1 -C "node address"
  $ grep "node address" foo/node.0/ctdb.conf
      # node address = 127.0.0.1
  $ tests/local_daemons.sh foo start all
  $ tests/local_daemons.sh foo print-log 0 | grep -i chose
  ... node.0 ctdbd[24546]: ctdb chose network address 127.0.0.1:4379

The trick is that commenting out the node address in ctdb.conf means
the chosen node address is the first one from the nodes file that
allows bind/listen.  In this case it is the only line.

The following ensures that automatic listening works for a node that
isn't the first:

  $ cat >mynodes
  192.168.1.1
  127.0.0.1
  $ tests/local_daemons.sh foo setup -n 2 -N mynodes -C "node address"
  $ grep "node address" foo/node.1/ctdb.conf
      # node address = 127.0.0.1
  $ tests/local_daemons.sh foo start 1
  $ tests/local_daemons.sh foo print-log 1 | grep -i chose
  [...] node.1 ctdbd[22787]: ctdb chose network address 127.0.0.1:4379

Note that the first address isn't local on this host, so will always
fail.

So, doing the above and starting both nodes yields...

  ...
  $ tests/local_daemons.sh foo start 1
  $ sleep 3; tests/local_daemons.sh foo start 0
  $ tests/local_daemons.sh foo print-log all | grep -i 'chose\|bind'
  [...] node.1 ctdbd[26351]: ctdb chose network address 127.0.0.1:4379
  [...] node.0 ctdbd[26438]: ctdb_tcp_listen_addr: Failed to bind() to socket - Address already in use (98)
  [...] node.0 ctdbd[26438]: Unable to bind to any node address - giving up

... as expected.

It would be nice to add tests for this, but we don't really have
infrastructure for that.  At least manual testing shows, for the
obvious cases, the previous commits didn't break anything.  :-)

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2024-08-20 13:06:33 +00:00
Martin Schwenke
17959ccb4b ctdb-ib: Remove a use of ctdb_set_error()
Now the transport code is free of ctdb_set_error().

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2024-08-20 13:06:33 +00:00
Martin Schwenke
b433663414 ctdb-tcp: Factor out listening code to avoid repetition
Modernise debug and comments while here.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2024-08-20 13:06:33 +00:00
Martin Schwenke
2c75bb8687 ctdb-tcp: Use talloc_strdup() instead of repeating logic
The node name is already constructed when the nodes file is loaded, so
just copy the node name.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2024-08-20 13:06:33 +00:00
Martin Schwenke
f36f03172a ctdb-daemon: Remove a use of ctdb_errstr()
Code to set up the transport is about to be cleaned up, including
removing uses of ctdb_set_error(), so avoid logging a NULL pointer or
some other old error.

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2024-08-20 13:06:33 +00:00
John Mulligan
a743a24d75 ctdb-doc: document nodes list configuration parameter
Add the initial documentation of the node list configuration parameter.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Aug  6 01:50:12 UTC 2024 on atb-devel-224
2024-08-06 01:50:12 +00:00
John Mulligan
6817eff833 ctdb-tests: add a nodestatus test that uses the nodes list command
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
John Mulligan
6d29c7f819 ctdb-tests: add reloadnodes unit tests that use the nodes list command
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
John Mulligan
8a5b743c43 ctdb-tests: add USENODESCOMMAND directive to fake ctdb
Add a single line USENODESCOMMAND directive to the fake ctdb in order to
enable use of a nodes script instead of a nodes file. For simplicity
the fake ctdb always uses `nodes.sh` in the CTDB_BASE.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
John Mulligan
cdb5646b88 ctdb-tests: add unit test coverage for listnodes with command
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
John Mulligan
cfc0917135 ctdb-tools: update cli tool to optionally load nodes from command
Similar to the recent changes to the ctdb server code, add the ability
to load the nodes from a subprocess stdout.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
John Mulligan
ac926a506d ctdb-conf: add boolean arg for verbosity when loading config
In a future commit we will add support for loading the config file from
the `ctdb` command line tool. Prior to this change the config file load
function always called D_NOTICE, which would cause the command to emit new
text and thus break all the tests that rely on specific output (not to
mention being something users could notice). This change plumbs a new
`verbose` argument into some of the config file loading functions.
Generally, all existing functions will have verbose set to true to match
the existing behavior. Future callers of this function can set it to
false in order to avoid emitting the extra text.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
John Mulligan
a0e8304ccf ctdb-server: rename ctdb_load_nodes_file to ctdb_load_nodes
Rename ctdb_load_nodes_file to ctdb_load_nodes as it can now load nodes
from more than a regular file.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
John Mulligan
7e7cb91806 ctdb-server: rename nodes_file field to nodes_source
Rename the `struct ctdb_context` field nodes_file to nodes_source to
better match that the field may indicate something other than a true
file.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
John Mulligan
dc65e7082d ctdb-server: use the new "nodes list" configuration option
Use the new "nodes list" configuration option, executing the given path
if it is prefixed by a `!`. The use case is to decouple the nodes
file from the shared storage, especially in the case where the shared
storage is provided by a vfs module.

For an example, imagine a script that runs `curl` on a URL for a
highly-available web server where the URL provides the content
of the nodes file.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
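The `curl` use case mentioned in the commit above could be served by a script along these lines. This is a hypothetical sketch: the URL is a placeholder, the function name `fetch_nodes` is invented for illustration, and the served content is assumed to be in the same format as a nodes file.

```shell
#!/bin/sh
# Hypothetical nodes script for a "!"-prefixed "nodes list" value.
# NODES_URL is a placeholder; it must serve the nodes file content.
NODES_URL="${NODES_URL:-https://nodes.example.com/ctdb/nodes}"

# Fetch the nodes list and print it to stdout.
#   -f          fail (non-zero exit) on HTTP errors
#   -sS         no progress output, but still report errors
#   --max-time  do not hang forever if the server is unreachable
fetch_nodes()
{
	curl -fsS --max-time 10 "$NODES_URL"
}

# A real script would finish by calling the function:
#   fetch_nodes
```

A non-zero exit status from `curl` becomes the script's exit status. How CTDB reacts to a failing nodes command is not covered by the commit message.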
John Mulligan
315890e845 ctdb-conf: add "nodes list" configuration option
Add a "nodes list" configuration option to the [cluster] section of the
ctdb server config. This option will work similarly to the `cluster
lock` parameter. When unset it defaults to the same value as
before (/etc/ctdb/nodes). If given a path that is not prefixed by `!` it
instead loads the nodes file from the given path. If given a path
prefixed by `!` then it executes the path as a command and reads the
standard output as if it were the content of the nodes file.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
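The three cases described in the commit above (unset, plain path, `!`-prefixed path) can be sketched in shell as follows. This only illustrates the rule; CTDB implements it in C. The function name `read_nodes_list` is hypothetical, and running the command via `sh -c` is an assumption, since the commit message does not say how the command is executed.

```shell
#!/bin/sh
# Illustrative sketch only; read_nodes_list is a hypothetical name.

# Print the nodes list for a given "nodes list" value.
#   unset/empty -> the default file, /etc/ctdb/nodes
#   "!command"  -> run the command, its stdout is the nodes list
#   other value -> treat the value as the path of a nodes file
read_nodes_list()
{
	_value="${1:-/etc/ctdb/nodes}"

	case "$_value" in
	'!'*)
		# Drop the leading "!" and execute the remainder
		# (assumption: via a shell)
		sh -c "${_value#?}"
		;;
	*)
		cat "$_value"
		;;
	esac
}
```

For example, `read_nodes_list '!/usr/local/bin/nodes.sh'` would run the (hypothetical) script and print whatever it writes to standard output, while `read_nodes_list /some/path/nodes` would print that file.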
John Mulligan
bab5170528 ctdb-conf: add ctdb_read_nodes_cmd function
Add ctdb_read_nodes_cmd, a function that works similarly to
ctdb_read_nodes_file but reads the nodes list from the stdout of a
subprocess instead of a file in the file system.

Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2024-08-06 00:43:36 +00:00
Pavel Filipenský
1fcaf066f4 ctdb:events: Add 46.update-keytabs.script for 'recovered' event
BUG: https://bugzilla.samba.org/show_bug.cgi?id=6750

Signed-off-by: Pavel Filipenský <pfilipensky@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
2024-07-26 17:12:36 +00:00
Martin Schwenke
ead5a3111f ctdb-daemon: Use ctdb_parse_node_address() in ctdbd
While here, fix a trivial memory leak (ctdbd will exit anyway if this
function fails).

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Jul 23 12:39:18 UTC 2024 on atb-devel-224
2024-07-23 12:39:18 +00:00
Martin Schwenke
181cc097ef ctdb-daemon: Use ctdb_read_nodes() in ctdbd
ctdb_control_getnodesfile() calls ctdb_read_nodes(), which returns a
struct ctdb_node_map rather than the old version, so update associated
marshalling.  While here modernise a debug message and wrap the
function arguments.

For ctdb_load_nodes_file() to use ctdb_read_nodes(), tweak
convert_node_map_to_list() to also use the modern node map structure.

Remove unused copy of ctdb_read_nodes_file().

Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
2024-07-23 11:37:34 +00:00
Martin Schwenke
5d2a864c0b ctdb-protocol: Move ctdb_node_map_* to protocol_api.h
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
2024-07-23 11:37:34 +00:00