1
0
mirror of https://github.com/samba-team/samba.git synced 2024-12-23 17:34:34 +03:00
Commit Graph

9087 Commits

Author SHA1 Message Date
Martin Schwenke
ef4b8c13c0 ctdb-recoverd: Handle leader broadcast timeout
If no leader broadcasts have been received from the leader for more
than 5s then trigger an election.

Apart from being sane behaviour, this avoids elected-before-connected
bugs at startup, where a node elects itself leader before it is
connected to other nodes.

When a node processes a leader broadcast timeout it sends an unknown
leader broadcast to all nodes.  That causes cancellation of the leader
broadcast timeout across the cluster.  This is particular important at
startup, since nodes may be started in a staggered fashion.  Without
this cluster-wide cancellation, a node might notice the lack of
leader, win an election and complete a recovery before other nodes
notice the lack of leader.  When the leader broadcast timeout finally
occurs on the other nodes then they'll put the cluster back into an
unnecessary recovery.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
5c7f6da0f0 ctdb-recoverd: Send leader broadcasts
These are triggered on 1 second timer, but are only sent if the node
is the current leader and there is no election underway.

If this node can not be the leader then ensure it releases the
recovery lock.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
789a75abfa ctdb-recoverd: Process leader broadcasts
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
3d3767a259 ctdb-protocol: Add CTDB_SRVID_LEADER
CTDB_SRVID_LEADER will be regularly broadcast to all connected nodes
by the leader.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
c2cfd9c21a ctdb-recoverd: Add an explicit flag for election in progress
An alternate election method will be added that doesn't use the
election timeout, so this provides a common way for recognising when
an election is in progress.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
ac5a3ca063 ctdb-recoverd: Only start election if node can be leader
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
7baadfe27e ctdb-recoverd: Add and use function this_node_can_be_leader()
This makes the code self-documenting.

In ctdb_election_data() there is a slight behaviour change.  An
inactive node will now try to lose an election.  This case should not happen
because:

* An inactive node can't win an election round and then send a reply.

* Any inactive node should never start an election.  There are
  currently places where this happens and they will be fixed later.

There is an instance where this could be used in
validate_recovery_master() but this involves a more serious logic
change.  Overhaul this function later.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
94b546c268 ctdb-recoverd: Logging/comments: recovery master -> leader
There are some remaining instances in this file but they will be
removed in subsequent commits.

Modernise debug macros as appropriate.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
dd79e9bd14 ctdb-recoverd: Rename recmaster field to leader
Recovery master is being renamed to leader.  This follows clustering
best practice (e.g. RAFT).

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
2ee6763c7d ctdb-recoverd: Use rec->pnn everywhere
This is currently referenced in a number of inconsistent
ways, including:

* pnn
* rec->ctdb->pnn
* ctdb->pnn
* ctdb_get_pnn(ctdb)
* ctdb_get_pnn(rec->ctdb)

The first of these always requires some thought about the context - is
this the node PNN or some other PNN (e.g. argument to function)?

rec->pnn is now always used when referring to the recovery daemon's
PNN.

Doing this also reduces reliance on struct ctdb_context internals.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
4af3b10a37 ctdb-recoverd: Change argument to srvid_disable_and_reply()
Reduce dependency on struct ctdb_context internals, enable a
subsequent change.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
b7c138ca99 ctdb-recoverd: Simplify arguments to ctdb_ban_node()
ban_time argument is always ctdb->tunable.recovery_ban_period, so
build this in and make the calling code more readable.

ctdb_ban_node() already logs how long a node is banned for, so don't
repeatedly log this.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
a5e0ddac62 ctdb-recoverd: Simplify arguments to verify_local_ip_allocation()
All other arguments are available via rec, so simplify.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
67b5191640 ctdb-recoverd: Simplify arguments to do_recovery()
pnn and nodemap are both available via the rec context, so simplify.
vnnmap is unused.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
57882beb16 ctdb-recoverd: Simplify arguments to some election functions
The pnn and nodemap arguments to force_election() and
send_election_request() are always effectively rec->pnn and
rec->nodemap, so simplify.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
9dbe7cc85e ctdb-recoverd: Add PNN to recovery daemon context
This is currently referenced in a number of inconsistent
ways, including:

* pnn
* rec->ctdb->pnn
* ctdb->pnn
* ctdb_get_pnn(ctdb)
* ctdb_get_pnn(rec->ctdb)

The first of these always requires some thought about the context - is
this the node PNN or some other PNN (e.g. argument to function)?

The intention is to always use rec->pnn when referring to the recovery
daemon's PNN.

Doing this also reduces reliance on struct ctdb_context internals.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
ff0140e470 ctdb-recoverd: Use this_node_is_leader() in an extra context
This is arguably clearer.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
c8721d01c6 ctdb-recoverd: Factor out and use function this_node_is_leader()
Make the code self-documenting.

This preempts an upcoming change to terminology but doing it now saves
a lot of churn.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 10:21:32 +00:00
Martin Schwenke
57a32cebdd ctdb-recoverd: Pass SIGHUP to running helper
The recovery and takeover helpers can run for a while and generate
non-trivial logs, so have them reopen their logs to support log
rotation.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Jan 17 04:36:30 UTC 2022 on sn-devel-184
2022-01-17 04:36:30 +00:00
Martin Schwenke
8e949a6082 ctdb-recoverd: Record helper PID in recovery daemon context
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
97a45f6f25 ctdb-recoverd: Add log reopening on SIGHUP to helpers
Recovery and takeover helpers can run for a while and generate
non-trivial logs.  They should support log reopening.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
51f0380e83 ctdb-daemon: Enable log reopening for event daemon
Add and call hook to pass on SIGHUP to eventd.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
4f14d7c0b9 ctdb-event: Reopen logs on SIGHUP
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
c554a325fe ctdb-daemon: Enable log reopening for recovery daemon
Pass on a SIGHUP to the recovery daemon, which will then reopen its
logs.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
4acfefed61 ctdb-recoverd: Add basic log reopening
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
4ed37de82b ctdb-daemon: Add basic top-level log reopening
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
7277385390 ctdb-common: Add support for reopening logs
Now that CTDB uses Samba's file logging it is possible to reopen the
logs, so that log rotation can be supported.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
d0a19778cd ctdb-common: Separate sock_daemon's SIGHUP and SIGUSR1 handling
SIGHUP is for reopening logs, SIGUSR1 is for reconfigure.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
10d15c9e5d ctdb-common: Use Samba's DEBUG_FILE logging
This has support for log rotation (or re-opening).

The log format is updated to use an RFC5424 timestamp and to include a
hostname.  The addition of the hostname allows trivial merging of log
files from multiple cluster nodes.

The hostname is faked from the CTDB_BASE environment variable during
testing, as per the comment in the code.  It is currently faked in a
similar manner in local_daemons.sh when printing logs, so drop this.

Unit tests need updating because stderr logging no longer produces a
"PROGNAME[PID]: " header.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
666a048707 ctdb-common: Switch initial debug type to DEBUG_DEFAULT_STDERR
This can be overridden by DEBUG_FILE, whereas DEBUG_STDERR can not.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2022-01-17 03:43:30 +00:00
Martin Schwenke
7163846a49 ctdb-protocol: Print IPv6 sockets with RFC5952 "[2001:db8::1]:80" notation
RFC5952 says the existing style is not recommended and the [] style
should be employed.

There are more optimised ways of adding the square brackets but they
tend to be uglier.

Parsing IPv6 sockets without [] is now tested indirectly by parsing
examples in both styles and comparing the results.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Volker Lendecke <vl@samba.org>

Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Thu Jan 13 17:02:21 UTC 2022 on sn-devel-184
2022-01-13 17:02:21 +00:00
Martin Schwenke
255fe69c90 ctdb-tests: Add extra IPv6 socket parsing tests
Add tests to confirm that square brackets are handled and that
IPv4-mapped IPv6 addresses are parsed as expected.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Volker Lendecke <vl@samba.org>
2022-01-13 16:13:38 +00:00
Volker Lendecke
224e99804e ctdb-protocol: Allow rfc5952 "[2001:db8::1]:80" ipv6 notation
Bug: https://bugzilla.samba.org/show_bug.cgi?id=14934
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2022-01-13 16:13:38 +00:00
Volker Lendecke
820b0a63cc ctdb-protocol: Save 50 bytes .text segment
Having this as a small static .text is simpler than having to create
this on the stack.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2022-01-13 16:13:38 +00:00
Volker Lendecke
baaedd69b3 ctdb-protocol: rindex->strrchr
According to "man rindex" on debian bullseye rindex() was deprecated
in Posix.1-2001 and removed from Posix.1-2008.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2022-01-13 16:13:38 +00:00
Pavel Filipenský
5ac8762256 ctdb:utils: Improve error handling of hex_decode()
This has been found by covscan and make analyzers happy.

Pair-programmed-with: Andreas Schneider <asn@samba.org>

Signed-off-by: Pavel Filipenský <pfilipen@redhat.com>
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2022-01-10 23:31:33 +00:00
Andreas Schneider
90fd7674f8 ctdb:client: Initialize structs and pointers in ctdb_ctrl_(en|dis)able_node()
Found by covscan.

Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2021-12-15 19:32:30 +00:00
Martin Schwenke
1719ef7893 ctdb-tests: Drop unused function ctdb_get_all_public_addresses()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Ralph Boehme <slow@samba.org>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Oct 12 23:24:18 UTC 2021 on sn-devel-184
2021-10-12 23:24:18 +00:00
Ralph Boehme
4e3676cb3c ctdb-tests: add a comment to the generated public_addresses file used by eventscript UNIT tests
test stub code has been updated to handle this, so now let's put it
to work.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14826
RN: Correctly ignore comments in CTDB public addresses file

Signed-off-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2021-10-12 22:38:32 +00:00
Martin Schwenke
5426c104f5 ctdb-tests: Fix typo in ctdb stub comment matching
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14826

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Ralph Boehme <slow@samba.org>
2021-10-12 22:38:32 +00:00
Ralph Boehme
530e8d4b9e ctdb-scripts: filter out comments in public_addresses file
Note that order of sed expressions matters: the expression to delete
comment lines must come first as the second expression would transform

  # comment

to

  comment

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14826

Signed-off-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2021-10-12 22:38:32 +00:00
Martin Schwenke
9e7d2d9794 ctdb-daemon: Don't mark a node as unhealthy when connecting to it
Remote nodes are already initialised as UNHEALTHY when the node list
is initialised at startup (ctdb_load_nodes_file() calls
convert_node_map_to_list()) and when disconnected (ctdb_node_dead()).
So, drop this code.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Sep  9 02:38:34 UTC 2021 on sn-devel-184
2021-09-09 02:38:34 +00:00
Martin Schwenke
7f697b1938 ctdb-daemon: Ignore flag changes for disconnected nodes
If this node is not connected to a node then we shouldn't know
anything about it.  The state will be pushed later by the recovery
master.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
ae10a8a4b7 ctdb-daemon: Simplify ctdb_control_modflags()
Now that there are separate disable/enable controls used by the ctdb
tool this control can ignore any flag updates for the current nodes.
These only come from the recovery master, which depends on being able
to fetch flags for all nodes.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
916c5ee131 ctdb-recoverd: Mark CTDB_SRVID_SET_NODE_FLAGS obsolete
CTDB_SRVID_SET_NODE_FLAGS is no longer sent so drop monitor_handler()
and replace with srvid_not_implemented().  Mark the SRVID obsolete in
its comment.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
e75256767f ctdb-daemon: Don't bother sending CTDB_SRVID_SET_NODE_FLAGS
The code that handles this message is
ctdb_recoverd.c:monitor_handler().  Although it appears to do
something potentially useful, it only logs the flags changes.  All
changes made are to local structures - there are no actual
side-effects.

It used to trigger a takeover run when the DISABLED flag changed.
This was dropped back in commit
662f06de9f.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
0132bd5a22 ctdb-daemon: Modernise remaining debug macro in this function
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
b6d25d079e ctdb-daemon: Update logging for flag changes
When flags change, promote the message to NOTICE level and switch the
message to the style that is currently generated by
ctdb-recoverd.c:monitor_handler().  This will allow monitor_handler()
to go away in future.

Drop logging when flags do not change.  The recovery master now logs
when it pushes flags for a node, so the lack of a corresponding
"changed flags" message here indicates that no update was required.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
eec44e2862 ctdb-daemon: Correct the condition for logging unchanged flags
Don't trust the old flags from the recovery master.

Surrounding code will change in future comments, including the use of
old-style debug macros, so just make this change clear.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
5914054698 ctdb-tools: Use disable and enable controls in tool
Note that there a change from broadcast to a directed control here.
This is OK because the recovery master will push flags if any nodes
disagree with the canonical flags fetched from a node.

Static function ctdb_ctrl_modflags() is no longer used to drop it.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
6fe6a54e7f ctdb-client: Add client code for disable/enable controls
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
15a6489c28 ctdb_daemon: Implement controls DISABLE_NODE/ENABLE_NODE
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
60c1ef1465 ctdb-daemon: Start as disabled means PERMANENTLY_DISABLED
DISABLED is UNHEALTHY | PERMANENTLY_DISABLED, which is not what is
intended here.  Luckily, it doesn't do any harm because nodes are
marked unhealthy at startup anyway.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
1ac7bc7532 ctdb-daemon: Factor out a function to get node structure from PNN
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
e0a7b5a9e8 ctdb-daemon: Add a helper variable
Simplifies a subsequent change.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
6845dca87e ctdb-protocol: Add marshalling for controls DISABLE_NODE/ENABLE_NODE
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
49dc5d8cd2 ctdb-protocol: Add new controls to disable and enable nodes
These are CTDB_CONTROL_DISABLE_NODE and CTDB_CONTROL_ENABLE_NODE.

For consistency these match CTDB_CONTROL_STOP_NODE and
CTDB_CONTROL_CONTINUE_NODE.  It would be possible to add a single
control but it would need to take data.

The aim is to finally fix races in flag handling.  Previous fixes have
improved the situation but they have only narrowed the race window.
The problem is that the recovery daemon on the master node pushes
flags to nodes the same way that disable and enable are implemented.
So the following sequence is still racy:

1. Node A is disabled
2. Recovery master pulls flags from all nodes including A
3. Node A is enabled
4. Recovery master notices A is disabled and pushes a flag update to
   all nodes including node A
5. Node A is erroneously marked disabled

Node A can not tell if the MODIFY_FLAGS control is from a "ctdb
disable" command or a flag update from the recovery master.

The solution is to use a different mechanism for disable/enable and
for a node to ignore MODIFY_FLAGS controls for their own flags.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
8305f6a7f1 ctdb-recoverd: Push flags for a node if any remote node disagrees
This will usually happen if flags on the node in question change, so
keeping the code simple and pushing to all nodes won't hurt.  When all
nodes come up there might be differences in connected nodes, causing
such "fix ups".  Receiving nodes will ignore no-op pushes.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
620d078714 ctdb-recoverd: Update the local node map before pushing out flags
The resulting code structure looks a little weird.  However, there is
another condition that requires the flags to be pushed that will be
inserted before the continue statement in a subsequent commit..

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
82a075d4d7 ctdb-recoverd: Add a helper variable
Improves readability and simplifies subsequent changes.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-09-09 01:46:49 +00:00
Martin Schwenke
b724c1e6a6 utils: Avoid pylint warning
pylint warns:

  Use lazy % formatting in logging functions

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Jul 20 05:29:18 UTC 2021 on sn-devel-184
2021-07-20 05:29:18 +00:00
Martin Schwenke
319e27343d utils: Reformat lines that are longer than 80 columns
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
2021-07-20 04:43:37 +00:00
Martin Schwenke
98c7a38b71 utils: Tweak exception handling to stop flake8 complaining
Don't bother with "as e" to avoid warning about unused variable.
Don't use bare "except:" (though pylint still complains about this
version).

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
2021-07-20 04:43:37 +00:00
Martin Schwenke
12d3e215a6 utils: Simplify log level logic, drop global variable
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
2021-07-20 04:43:37 +00:00
Martin Schwenke
e323d16a9d utils: Inline defaults and help strings
Removes an unnecessary level of indirection: defaults and help strings
are now where they are expected.  Also removes some global variables.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
2021-07-20 04:43:37 +00:00
Martin Schwenke
af5aecced1 utils: Move argument processing into function and call from main()
Removes the need for the global variables currently associated with
this processing.  Also removes unnecessarily double-handling the
defaults, which are assigned to the global variables and set via
add_argument().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
2021-07-20 04:43:37 +00:00
Martin Schwenke
e66637a079 utils: Reorder imports so that standard imports are first
Avoids numerous pylint warnings.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
2021-07-20 04:43:37 +00:00
Martin Schwenke
bd0b2bb6ee utils: Clean up ctdb_etcd_lock using autopep8
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
2021-07-20 04:43:37 +00:00
Martin Schwenke
939aed0498 utils: Use Python 3
Due to the number of flake8 and pylint warnings it is unclear if the
source has Python 3 incompatibilities.  These will be cleaned up in
subsequent commits.

Signed-off-by: "L.P.H. van Belle" <belle@bazuin.nl>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jose A. Rivera <jarrpa@samba.org>
2021-07-20 04:43:37 +00:00
Martin Schwenke
466aa8b6f5 ctdb-scripts: Ignore ShellCheck SC3013 for test -nt
In ShellCheck 0.7.2, POSIX compatibility warnings got their own SC3xxx
error codes, so now both the old and new codes need to be ignored.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jun 25 10:06:48 UTC 2021 on sn-devel-184
2021-06-25 10:06:48 +00:00
Martin Schwenke
fc0da6b0f8 ctdb-tests: Force stub version of service in eventscript tests
Fedora 34 now has a shell function for the which command, which causes
these uses of which to return the enclosing function definition rather
than the executable file as expected.

The event script unit tests always expect the stub service command to
be used, so the conditional in these functions is unnecessary.
$CTDB_HELPER_BINDIR already conveniently points to the stub directory,
so use it here.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
2021-06-25 09:16:31 +00:00
Martin Schwenke
23b2fab2c8 ctdb-common: Drop unused include of mkdir_p.h
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-06-25 09:16:31 +00:00
Martin Schwenke
e40d452722 ctdb-daemon: Close server socket when switching to client
The socket is set close-on-exec but that doesn't help for processes
that do not exec().  This should be done for all child processes.

This has been seen in testing where "ctdb shutdown" waits for the
socket to close before succeeding.  It appears that lingering
vacuuming processes have not closed the socket when becoming clients
so they cause "ctdb shutdown" to hang even though the main daemon
process has exited.  The cause of the lingering vacuuming processes
has been previously examined but still isn't understood.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-06-25 09:16:31 +00:00
Martin Schwenke
f7cf8132b0 ctdb-tests: Add debug_locks.sh tests for mutexes
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri May 28 07:34:23 UTC 2021 on sn-devel-184
2021-05-28 07:34:23 +00:00
Amitay Isaacs
99c3b49260 ctdb-scripts: Add lock debugging for tdb mutex locks
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
2021-05-28 06:46:29 +00:00
Amitay Isaacs
cb55b68b3e ctdb-utils: Add tdb_mutex_check utility
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2021-05-28 06:46:29 +00:00
Martin Schwenke
dd5972b699 ctdb-scripts: Simplify logic in debug_via_proc_locks()
The path of the TDB is known, so calculate the file ID (device number
+ inode number) from it and use this to directly filter /proc/locks to
find processes holding locks.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-05-28 06:46:29 +00:00
Martin Schwenke
e62ae53ef6 ctdb-scripts: Update debug_locks.sh to handle arguments
Don't use the  arguments yet.  They will be used in a simplified
version of the code.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-05-28 06:46:29 +00:00
Martin Schwenke
1dfff9751b ctdb-scripts: Move current lock debugging to a function
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-05-28 06:46:29 +00:00
Amitay Isaacs
d07875330a ctdb-locking: Pass additional arguments to debug locks script
1. PID of lock helper waiting for lock
2. Scope of lock: "record" or "db"
3. Path to database that lock helper is trying to lock
4. Whether the database uses mutexes: "mutex" or "fcntl"

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2021-05-28 06:46:29 +00:00
Martin Schwenke
2c7dbb043f ctdb-tests: Add debug_locks.sh testing
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-05-28 06:46:29 +00:00
Martin Schwenke
a3e7fd9c61 ctdb-tests: Fix nonsense arguments to ps stub
These were fine (though still lazy) when these tests were the only
user of this stub.  However, the ps stub is about to be enhanced, so
fix these uses of it to represent the intended usage.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-05-28 06:46:29 +00:00
Martin Schwenke
ffb56c9143 ctdb-scripts: Avoid direct /proc access
The main reason for this is to facilitate testing.

Avoid some /proc accesses entirely by using ps(1) (which can be
replaced by a stub when testing) because this script might as well be
more portable in case anyone wants to add lock debugging for a
non-Linux platform.  While the "state" format specification isn't
POSIX-compliant, it works on both Linux and FreeBSD so it is a
reasonable improvement.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-05-28 06:46:29 +00:00
Martin Schwenke
55d4b3438f ctdb-scripts: Factor out function dump_stacks()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2021-05-28 06:46:29 +00:00
Volker Lendecke
adef87a621 ctdb: Fix a crash in run_proc_signal_handler()
If a script times out the caller can talloc_free() the script_list
output of run_event_recv, which talloc_free's proc->output from
run_proc.c as well. If the script generates further output after the
timeout and then exits after a while, the SIGCHLD handler in the
eventd tries to read into proc->output, which was already free'ed.

Fix this by not doing just a talloc_steal but a talloc_move. This way
proc_read_handler() called from run_proc_signal_handler() does not try
to realloc the stale reference to proc->output but gets a NULL
reference.

I don't really know how to do a knownfail in ctdb, so this commit
actually activates catching the signal by waiting long enough for
22.bar to exit and generate the SIGCHLD.

Bug: https://bugzilla.samba.org/show_bug.cgi?id=14475
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
2021-05-18 10:42:32 +00:00
Volker Lendecke
f320d1a7ab ctdb: Introduce output before and after the 10-second timeout
This will lead to a crash in run_event_test.c soon

Bug: https://bugzilla.samba.org/show_bug.cgi?id=14475
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
2021-05-18 10:42:32 +00:00
Volker Lendecke
19290f10c7 ctdb: Wait for SIGCHLD if script timed out
Bug: https://bugzilla.samba.org/show_bug.cgi?id=14475
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
2021-05-18 10:42:32 +00:00
Volker Lendecke
07ab9b7a71 ctdb: Introduce a helper variable in run_event_test.c
Bug: https://bugzilla.samba.org/show_bug.cgi?id=14475
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
2021-05-18 10:42:32 +00:00
Volker Lendecke
9398d4b912 ctdb: Call run_event_recv() in a callback function
Triggers a different code path in run_event_* and aligns it more what
the ctdb eventd really does.

Bug: https://bugzilla.samba.org/show_bug.cgi?id=14475
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
2021-05-18 10:42:32 +00:00
Volker Lendecke
f188c9d732 ctdb: fix typos
Bug: https://bugzilla.samba.org/show_bug.cgi?id=14475
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
2021-05-18 10:42:32 +00:00
Volker Lendecke
cf43f331be lib: Make pidfile_path_create() return the existing PID on conflict
Use F_GETLK to get the lock holder PID, this is more accurate than
reading the file contents: A conflicting process might not have
written its PID yet. Also, F_GETLK easily allows to do a retry if the
lock holder just died.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2021-03-16 17:09:32 +00:00
Volker Lendecke
06b740e2fb ctdb: Fix a typo
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2021-03-09 22:36:28 +00:00
Martin Schwenke
6a81f43177 ctdb-tests: Actually wait for record to migrate to lmaster node
This test has been failing with:

  Wait until record is migrated to lmaster node 0
  <30|BAD: node 0 is not dmaster
  dmaster: 1
  rsn: 8
  flags: 0x00010000 MIGRATED_WITH_DATA
  data(6) = "value1"
  *** TEST COMPLETED (RC=1) AT 2021-02-02 06:18:48, CLEANING UP...

This should never happen.  If this really fails then the wait should
time out.

The problem is that wait_until() does:

  "$@" || _rc=$?

and vacuum_test_key_dmaster() currently calls ctdb_test_fail() on
failure, which causes the shell to exit.  Instead, pass a variant to
wait_until() that simply returns the correct status instead of
exiting.

An alternative would be to change the statement in wait_until() to do:

  ("$@") || _rc=$?

so it captures the exit.  However, this is a global change and
requires more thought.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Jeremy Allison <jra@samba.org>
2021-02-08 22:33:14 +00:00
Volker Lendecke
e593f96960 lib: Make accept_recv() return the listening socket
This is helpful if you are in a listening loop with the same receiver
for many sockets doing the same thing.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2021-01-22 19:54:38 +00:00
Volker Lendecke
40e4958953 lib: Make accept_recv() return struct samba_sockaddr
Avoid casting problems by using the samba_sockaddr union

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2021-01-22 19:54:38 +00:00
Volker Lendecke
6aa672a41c ctdb: Use hex_byte() in hex_to_data()
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2021-01-08 20:31:33 +00:00
Martin Schwenke
65ab8cb014 ctdb-daemon: Do not attempt to chown Unix domain socket in test mode
If run with UID wrapper and UID_WRAPPER_ROOT=1 then securing the
socket will fail.

Test mode means that local daemons are in use, so securing the socket
is not important.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2020-11-02 08:58:31 +00:00
Martin Schwenke
78c3b5b6a8 ctdb-daemon: Clean up call to bind socket
Variable res is only used once and ret is re-used many times.  Drop
res, use ret, which doesn't need to be initialised.  Modernise debug
macro.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2020-11-02 08:58:31 +00:00
Martin Schwenke
9404f8631e ctdb-daemon: Clean up socket bind/secure/listen
Obey the coding style, modernise debug macros, clean up whitespace.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2020-11-02 08:58:31 +00:00
Amitay Isaacs
6aa396b0cd ctdb-common: Avoid aliasing errors during code optimization
When compiling with GCC 10.x and -O3 optimization, the IP checksum
calculation code generates wrong checksum.  The function uint16_checksum
gets inlined during optimization and ip4pkt->tcp data gets wrongly
aliased.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14537

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Oct 21 05:52:28 UTC 2020 on sn-devel-184
2020-10-21 05:52:28 +00:00
Martin Schwenke
b68105b8f7 ctdb-tests: Strengthen node state checking in ctdb disable/enable test
Check that the desired state is set on all nodes instead of just the
test node.  This ensures that node flags have correctly propagated
across the cluster.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14513
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Oct  6 04:32:06 UTC 2020 on sn-devel-184
2020-10-06 04:32:06 +00:00
Martin Schwenke
4b01f54041 ctdb-recoverd: Drop unnecessary and broken code
update_flags() has already updated the recovery master's canonical
node map, based on the flags from each remote node, and pushed out
these flags to all nodes.

If i == j then the node map has already been updated from this remote
node's flags, so simply drop this case.

Although update_flags() has updated flags for all nodes, it did not
update each node map in remote_nodemaps[] to reflect this.  This means
that remote_nodemaps[] may contain inconsistent flags for some nodes
so it should not be used to check consistency when i != j.

Further, a meaningful difference in flags can only really occur if
update_flags() failed.  In that case this code is never reached.

These observations combine to imply that this whole loop should be
dropped.

This leaves potential sub-second inconsistencies due to out-of-band
healthy/unhealthy flag changes pushed via CTDB_SRVID_PUSH_NODE_FLAGS.
These updates could be dropped (takeover run asks each node for
available IPs rather than making centralised decisions based on node
flags) but for now they will be fixed in the next iteration of
main_loop().

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14513
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-10-06 03:12:35 +00:00
Martin Schwenke
3ab52b5286 ctdb-recoverd: Drop unnecessary code
This has already been done in update_flags().

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14513
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-10-06 03:12:35 +00:00
David Disseldorp
68b981ee8a ctdb/test_ceph_rados_reclock: check for service registration
Signed-off-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Samuel Cabrero <scabrero@samba.org>

Autobuild-User(master): David Disseldorp <ddiss@samba.org>
Autobuild-Date(master): Thu Sep 24 00:52:42 UTC 2020 on sn-devel-184
2020-09-24 00:52:42 +00:00
David Disseldorp
55dbd1080d ctdb/doc: mention ctdb_mutex_ceph_rados_helper mgr registration
Signed-off-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Samuel Cabrero <scabrero@samba.org>
2020-09-23 23:29:41 +00:00
David Disseldorp
ff36cb7402 ctdb/ceph: register recovery lock holder with ceph-mgr
The Ceph Manager's service map is useful for tracking the status of
Ceph related services. By registering the CTDB recovery lock holder,
Ceph storage administrators can more easily identify where and when a
CTDB cluster is up and running.

Signed-off-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Samuel Cabrero <scabrero@samba.org>
2020-09-23 23:29:41 +00:00
Martin Schwenke
d98f68f918 ctdb-daemon: Drop implementation of old-style database pull/push controls
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Sep 11 06:29:32 UTC 2020 on sn-devel-184
2020-09-11 06:29:32 +00:00
Martin Schwenke
7d826731d4 ctdb-protocol: Drop marshalling functions for old-style database pull/push
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-09-11 05:06:42 +00:00
Martin Schwenke
3bbb4a8535 ctdb-protocol: Drop client functions for old-style database pull/push
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-09-11 05:06:42 +00:00
Martin Schwenke
2898695473 ctdb-client: Drop unused synchronous functions for database pull/push
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-09-11 05:06:42 +00:00
Martin Schwenke
2efce7d477 ctdb-recovery: Simplify database push function names
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-09-11 05:06:42 +00:00
Martin Schwenke
f4e2206e88 ctdb-recovery: Drop unnecessary database push wrapper
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-09-11 05:06:42 +00:00
Martin Schwenke
225a699633 ctdb-recovery: Drop passing of capabilities into database pull
This is no longer necessary because the capability new style database
pull is assumed to always be available.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-09-11 05:06:42 +00:00
Martin Schwenke
595c1a7c0f ctdb-recovery: Simplify database pull function names
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-09-11 05:06:42 +00:00
Martin Schwenke
f968576642 ctdb-recovery: Remove use of old pull and push controls
Removes use of the old controls without cleaning up the code.  Clean
up can be done later.

After this change the CTDB_CAP_FRAGMENTED_CONTROLS capability is no
longer checked.  This capability can be removed along with the
controls.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-09-11 05:06:42 +00:00
Martin Schwenke
d9d8bf8c54 ctdb-tests: Simplify comment in large database recovery test
The older style controls mentioned are being removed.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-09-11 05:06:42 +00:00
David Mulder
6f5b0fef59 ctdb: Prevent man page duplication
The new waf detects a duplicate instance of
ctdb_mutex_ceph_rados_helper.7.xml, which is due
to manpages_extra being a pointer to
manpages_misc, therefore each call to build()
added duplicate entries to the manpages_misc
global entry.

Signed-off-by: David Mulder <dmulder@suse.com>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2020-09-11 03:43:40 +00:00
Martin Schwenke
8bb6a6607d ctdb-recoverd: Broadcast takeover run message when verifying IPs
This makes it consistent with the monitoring code.  If the master has
changed then this means the master will always get the message.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Aug 18 06:24:11 UTC 2020 on sn-devel-184
2020-08-18 06:24:11 +00:00
Martin Schwenke
4aa8e72d60 ctdb-recoverd: Rename update_local_flags() -> update_flags()
This also updates remote flags so the name is misleading.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
702c7c4934 ctdb-recoverd: Change update_local_flags() to use already retrieved nodemaps
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
910a0b3b74 ctdb-recoverd: Get remote nodemaps earlier
update_local_flags() will be changed to use these nodemaps.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
d50919b0cb ctdb-recoverd: Do not fetch the nodemap from the recovery master
The nodemap has already been fetched from the local node and is
actually passed to this function.  Care must be taken to avoid
referencing the "remote" nodemap for the recovery master.  It also
isn't useful to do so, since it would be the same nodemap.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
762d1d8a96 ctdb-recoverd: Change get_remote_nodemaps() to use connected nodes
The plan here is to use the nodemaps retrieved by get_remote_nodes()
in update_local_flags().  This will improve efficiency, since
get_remote_nodes() fetches flags from nodes in parallel.  It also
means that get_remote_nodes() can be used exactly once early on in
main_loop() to retrieve remote nodemaps.  Retrieving nodemaps multiple
times is unnecessary and racy - a single monitoring iteration should
not fetch flags multiple times and compare them.

This introduces a temporary behaviour change but it will be of no
consequence when the above changes are made.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
368c83bfe3 ctdb-recoverd: Fix node_pnn check and assignment of nodemap into array
This array is indexed by the same index as nodemap, not the PNN.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
10ce0dbf1c ctdb-recoverd: Add fail callback to assign banning credits
Also drop error handling in main_loop() that is replaced by this
change.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
a079ee3169 ctdb-recoverd: Add an intermediate state struct for nodemap fetching
This will allow an error callback to be added.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
2eaa0af616 ctdb-recoverd: Move memory allocation into get_remote_nodemaps()
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
3324dd272c ctdb-recoverd: Change signature of get_remote_nodemaps()
Change 1st argument to a rec context, since this will be needed later.
Drop the nodemap argument and access it via rec->nodemap instead.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
d2d90f2502 ctdb-recoverd: Fix a local memory leak
The memory is allocated off the memory context used by the current
iteration of main loop.  It is freed when main loop completes the fix
doesn't require backporting to stable branches.  However, it is sloppy
so it is worth fixing.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
52f520d39c ctdb-recoverd: Basic cleanups for get_remote_nodemaps()
Don't log an error on failure - let the caller can do this.  Apart
from this: fix up coding style and modernise the remaining error
message.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-18 05:02:25 +00:00
Martin Schwenke
0cb61c6fb6 ctdb-doc: Link to CTDB page in wiki
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Aug 17 06:13:11 UTC 2020 on sn-devel-184
2020-08-17 06:13:11 +00:00
Martin Schwenke
971c20e9dc ctdb-tools: Drop "ctdb isnotrecmaster" command
This isn't used anywhere and can easily be checked via "ctdb pnn" and
"ctdb recmaster" commands.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-08-17 04:51:32 +00:00
Ralph Boehme
2327471756 lib: relicense smb_strtoul(l) under LGPLv3
Signed-off-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Volker Lendecke <vl@samba.org>

Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Mon Aug  3 22:21:04 UTC 2020 on sn-devel-184
2020-08-03 22:21:02 +00:00
Martin Schwenke
642dc6ded6 ctdb-scripts: Use nfsconf as a last resort get nfsd thread count
If nfsconf exists then use it as last resort to attempt to extract
[nfsd]:threads from /etc/nfs.conf.

Invocation of nfsconf requires "|| true" because this script uses "set
-e".  Add a stub that always fails to at least test this much.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14444
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Jul 27 07:06:58 UTC 2020 on sn-devel-184
2020-07-27 07:06:57 +00:00
Martin Schwenke
334dd8cedd ctdb-scripts: Use nfsconf as a last resort to set NFS_HOSTNAME
If nfsconf exists then use it as last resort to attempt to extract
[statd]:name from /etc/nfs.conf.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14444
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-27 05:42:31 +00:00
Martin Schwenke
f37b3cf2a6 ctdb: Change LVS to use leader/follower
Instead of master/slave.

Nearly all of these are simple textual substitutions, which preserve
the case of the original.    A couple of minor cleanups were made in the
documentation (such as "LVSMASTER" -> "LVS leader").

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 08:37:31 +00:00
Martin Schwenke
16b848553d ctdb: Change NAT gateway to use leader/follower
Instead of master/slave.

Nearly all of these are simple textual substitutions, which preserve
the case of the original.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 08:37:31 +00:00
Martin Schwenke
5ce6133a75 ctdb-recoverd: Simplify calculation of new flags
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Jul 24 06:03:23 UTC 2020 on sn-devel-184
2020-07-24 06:03:23 +00:00
Martin Schwenke
3654e41677 ctdb-recoverd: Correctly find nodemap entry for pnn
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 04:41:25 +00:00
Martin Schwenke
9475ab0441 ctdb-recoverd: Do not retrieve nodemap from recovery master
It is already in rec->nodemap.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 04:41:25 +00:00
Martin Schwenke
0c6a7db3ba ctdb-recoverd: Flatten update_flags_on_all_nodes()
The logic currently in ctdb_ctrl_modflags() will be optimised so that
it no longer matches the pattern for a control function.  So, remove
this function and squash its functionality into the only caller.

Although there are some superficial changes, the behaviour is
unchanged.

Flattening the 2 functions produces some seriously weird logic for
setting the new flags, to the point where using ctdb_ctrl_modflags()
for this purpose now looks very strange.  The weirdness will be
cleaned up in a subsequent commit.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 04:41:25 +00:00
Martin Schwenke
a88c10c5a9 ctdb-recoverd: Move ctdb_ctrl_modflags() to ctdb_recoverd.c
This file is the only user of this function.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 04:41:25 +00:00
Martin Schwenke
b1e631ff92 ctdb-recoverd: Improve a call to update_flags_on_all_nodes()
This should take a PNN, not an array index.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 04:41:25 +00:00
Martin Schwenke
915d24ac12 ctdb-recoverd: Use update_flags_on_all_nodes()
This is clearer than using the MODFLAGS control directly.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 04:41:25 +00:00
Martin Schwenke
f681c0e947 ctdb-recoverd: Introduce some local variables to improve readability
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 04:41:25 +00:00
Martin Schwenke
cb3a3147b7 ctdb-recoverd: Change update_flags_on_all_nodes() to take rec argument
This makes fields such as recmaster and nodemap easily available if
required.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 04:41:25 +00:00
Martin Schwenke
6982fcb3e6 ctdb-recoverd: Drop unused nodemap argument from update_flags_on_all_nodes()
An unused argument needlessly extends the length of function calls.  A
subsequent change will allow rec->nodemap to be used if necessary.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-24 04:41:25 +00:00
Martin Schwenke
484a764e83 ctdb-tests: Improve test portability/quality
Avoid use of non-portable md5sum by constructing database names using
index.  Improve indentation, use more modern commands, code
improvements (shellcheck).

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Jul 22 09:14:35 UTC 2020 on sn-devel-184
2020-07-22 09:14:35 +00:00
Martin Schwenke
f4c2c77ff7 ctdb-tests: Improve test quality
Simplify code, use more modern commands, code improvements (shellcheck).

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:36 +00:00
Martin Schwenke
c6c81ea287 ctdb-tests: Improve test portability
"wc -l" on some platforms (e.g. FreeBSD) contains leading spaces, so
strip them.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:36 +00:00
Martin Schwenke
244eaad76a ctdb-tests: Improve test quality
Select test node with IPs instead of using a fixed node.  Remove
unnecessary code, use more modern commands, code
improvements (shellcheck).

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:36 +00:00
Martin Schwenke
760c3039b0 ctdb-tests: Improve test portability
"wc -l" on some platforms (e.g. FreeBSD) contains leading spaces and
stops "$num from being a number.  Create a more portable solution and
put it in a function instead of repeating the logic.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:36 +00:00
Martin Schwenke
41ff58338a ctdb-tests: Drop uses of "onnode any ..." in testcases
It would be nice to get rid of "onnode any".  There's no use making
tests nondeterministic.  If covering different cases matters then they
should be explicitly handled.

In most places "any" is replaced by "$test_node".  In some cases,
where $test_node is not set, a fixed node that is already used
elsewhere can be reused.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:36 +00:00
Martin Schwenke
ce3de39894 ctdb-tests: Don't bother shutting down daemons in ctdb_init()
They'll never be up here...

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:36 +00:00
Martin Schwenke
37c26a9590 ctdb-tests: Separate custom cluster startup from test initialisation
Separate cluster startup from test initialisation for tests that start
the cluster with customised configuration.  In these cases the result
of the cluster startup is actually the point of the test.
Additionally, pubips.013.failover_noop.sh claims to have completed
test initialisation twice, which just seems wrong.

The result is:

* ctdb_test_init() takes one option (-n) to indicate when it should
  not configure/start the cluster

* New function ctdb_nodes_start_custom() accepts options for special
  cluster configuration, only operates on local daemons and triggers a
  test failure rather than a test error on failure.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:36 +00:00
Martin Schwenke
a766136df4 ctdb-tests: Do not trigger ctdb_test_error() from ctdb_init()
The only caller calls ctdb_test_error() on failure and nesting this
calls can be confusing.  A future change will make this even more
confusing.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:36 +00:00
Martin Schwenke
a369bedf8c ctdb-tests: Make unit.sh pass shellcheck
Mostly avoidance of quoting warnings.

Silencing warnings about unquoted $CTDB_TEST_CAT_RESULTS_OPTS is
handled by passing '-' to cat when that variable's value is empty.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:36 +00:00
Martin Schwenke
be3065ea95 ctdb-tests: Make integration.bash pass shellcheck
Apart from the non-constant sourcing of include files.

Mostly avoidance of quoting warnings.

One subtle change is to simply pass "120" to wait_until_ready() to
stop warnings that it expects arguments but none are passed (both
SC2119 and SC2120).  There seems no way to indicate to structure
function argument handling so that shellcheck realises arguments are
optional.  In later shellcheck versions, disabling SC2120 for a
function also silences complaints about its callers... but not all of
our testing uses "later" shellcheck versions.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:36 +00:00
Martin Schwenke
d667352805 ctdb-tests: Use "#!/usr/bin/env bash" for improved portability
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:35 +00:00
Martin Schwenke
8b24cae630 ctdb-tests: Update preamble for INTEGRATION tests
* Use "#!/usr/bin/env bash" for improved portability

* Drop test_info() definition and replace it with a comment

  The use of test_info() is pointless.

* Drop call to cluster_is_healthy()

  This is a holdover from when the previous test would restart daemons
  to get things ready for a test.  There was also a bug where going
  into recovery during the restart would sometimes cause the cluster
  to become unhealthy.  If we really need something like this then we
  can add it to ctdb_test_init().

* Make order of preamble consistent

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:35 +00:00
Martin Schwenke
0f201dd67a ctdb-tests: Drop unreachable line
ctdb_test_skip() will exit.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:35 +00:00
Martin Schwenke
29a3fce28f ctdb-tests: Redirect stderr too when checking for shellcheck
Avoid:

  .../UNIT/shellcheck/scripts/local.sh: line 14: type: shellcheck: not found

The "type" command in dash prints the "not found" message to stdout
but the bash version prints to stderr, so redirect stderr too.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:35 +00:00
Martin Schwenke
1565446508 ctdb-tests: Show hung script debugging output
The output in a test failure appears to contain no pstree output
because "00\.test\.script,.*" does not match.  However, this is just a
guess because the output is not shown.

Showing the output makes it easier to understand test failures.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:35 +00:00
Martin Schwenke
70c38d404b ctdb-tests: Enable SOCKET_WRAPPER_DIR_ALLOW_ORIG
This will allow local daemons to be used in more contexts, especially
in tests run by Jenkins where the directory names for some targets can
be very long.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:35 +00:00
Martin Schwenke
066c205e5f ctdb-build: Don't build/install tests in top-level build by default
The standalone build still includes tests, as does the top-level build
when --enable-selftest is used.  The latter is consistent with the use
of --enable-selftest in the rest of the tree.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:35 +00:00
Martin Schwenke
3ff8765d04 ctdb-tests: Stop cat command failure from causing test failure
In certain circumstance, which aren't obvious, cat(1) can fail when
attempting to write a lot of data.  This is due to something (probably
write(2)) returning EAGAIN.

Given that the -v option should only really be used for test
debugging, ignore the failure instead of spending time debugging it.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14446
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 07:53:35 +00:00
Martin Schwenke
6436c74ebf Revert "ctdb-build: Don't build/install tests in top-level build by default"
Fix missing Reviewed-by: tag.

This reverts commit 91c36c16c8.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Jul 22 06:29:43 UTC 2020 on sn-devel-184
2020-07-22 06:29:43 +00:00
Martin Schwenke
bdd89d5276 Revert "ctdb-tests: Enable SOCKET_WRAPPER_DIR_ALLOW_ORIG"
Fix missing Reviewed-by: tag.

This reverts commit 9694ba6fe4.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:47 +00:00
Martin Schwenke
6a3372e895 Revert "ctdb-tests: Show hung script debugging output"
Fix missing Reviewed-by: tag.

This reverts commit c78de201f8.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:46 +00:00
Martin Schwenke
e4b1cdc709 Revert "ctdb-tests: Redirect stderr too when checking for shellcheck"
Fix missing Reviewed-by: tag.

This reverts commit 847aa0e367.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:46 +00:00
Martin Schwenke
a694c07126 Revert "ctdb-tests: Drop unreachable line"
Fix missing Reviewed-by: tag.

This reverts commit a55dd6f17b.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:46 +00:00
Martin Schwenke
4438e44f88 Revert "ctdb-tests: Update preamble for INTEGRATION tests"
Fix missing Reviewed-by: tag.

This reverts commit 65f56505e2.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:46 +00:00
Martin Schwenke
271ad95e23 Revert "ctdb-tests: Use "#!/usr/bin/env bash" for improved portability"
Fix missing Reviewed-by: tag.

This reverts commit 9a7cabd342.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:46 +00:00
Martin Schwenke
60d999ad94 Revert "ctdb-tests: Make integration.bash pass shellcheck"
Fix missing Reviewed-by: tag.

This reverts commit 0f04b8a70b.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:46 +00:00
Martin Schwenke
548f2021df Revert "ctdb-tests: Make unit.sh pass shellcheck"
Fix missing Reviewed-by: tag.

This reverts commit 30293baae5.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:46 +00:00
Martin Schwenke
da654f9795 Revert "ctdb-tests: Do not trigger ctdb_test_error() from ctdb_init()"
Fix missing Reviewed-by: tag.

This reverts commit 44e05ac851.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:46 +00:00
Martin Schwenke
e11526ad54 Revert "ctdb-tests: Separate custom cluster startup from test initialisation"
Fix missing Reviewed-by: tag.

This reverts commit e9df17b500.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:45 +00:00
Martin Schwenke
941a2d0a3b Revert "ctdb-tests: Don't bother shutting down daemons in ctdb_init()"
Fix missing Reviewed-by: tag.

This reverts commit 58f9f699f1.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:45 +00:00
Martin Schwenke
c9dfdeaddc Revert "ctdb-tests: Drop uses of "onnode any ..." in testcases"
Fix missing Reviewed-by: tag.

This reverts commit aa5b214eaa.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:45 +00:00
Martin Schwenke
635d5cfa31 Revert "ctdb-tests: Improve test portability"
Fix missing Reviewed-by: tag.

This reverts commit 1079d6e3ae.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:45 +00:00
Martin Schwenke
c83ece42e5 Revert "ctdb-tests: Improve test quality"
Fix missing Reviewed-by: tag.

This reverts commit ea1cbff624.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:45 +00:00
Martin Schwenke
cf3b1fb390 Revert "ctdb-tests: Improve test portability"
Fix missing Reviewed-by: tag.

This reverts commit 1f6556916e.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:45 +00:00
Martin Schwenke
979a6c8c5f Revert "ctdb-tests: Improve test quality"
Fix missing Reviewed-by: tag.

This reverts commit a308f2534d.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:45 +00:00
Martin Schwenke
d035b69b53 Revert "ctdb-tests: Improve test portability/quality"
Fix missing Reviewed-by: tag.

This reverts commit d2f8cd835d.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:45 +00:00
Martin Schwenke
5948a57920 Revert "ctdb-tests: Stop cat command failure from causing test failure"
Fix missing Reviewed-by: tag.

This reverts commit 5707781ccf.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-22 05:07:45 +00:00
Martin Schwenke
5707781ccf ctdb-tests: Stop cat command failure from causing test failure
In certain circumstance, which aren't obvious, cat(1) can fail when
attempting to write a lot of data.  This is due to something (probably
write(2)) returning EAGAIN.

Given that the -v option should only really be used for test
debugging, ignore the failure instead of spending time debugging it.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14446
Signed-off-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Jul 22 04:10:47 UTC 2020 on sn-devel-184
2020-07-22 04:10:47 +00:00
Martin Schwenke
d2f8cd835d ctdb-tests: Improve test portability/quality
Avoid use of non-portable md5sum by constructing database names using
index.  Improve indentation, use more modern commands, code
improvements (shellcheck).

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:38 +00:00
Martin Schwenke
a308f2534d ctdb-tests: Improve test quality
Simplify code, use more modern commands, code improvements (shellcheck).

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:38 +00:00
Martin Schwenke
1f6556916e ctdb-tests: Improve test portability
"wc -l" on some platforms (e.g. FreeBSD) contains leading spaces, so
strip them.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:38 +00:00
Martin Schwenke
ea1cbff624 ctdb-tests: Improve test quality
Select test node with IPs instead of using a fixed node.  Remove
unnecessary code, use more modern commands, code
improvements (shellcheck).

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:38 +00:00
Martin Schwenke
1079d6e3ae ctdb-tests: Improve test portability
"wc -l" on some platforms (e.g. FreeBSD) contains leading spaces and
stops "$num from being a number.  Create a more portable solution and
put it in a function instead of repeating the logic.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:38 +00:00
Martin Schwenke
aa5b214eaa ctdb-tests: Drop uses of "onnode any ..." in testcases
It would be nice to get rid of "onnode any".  There's no use making
tests nondeterministic.  If covering different cases matters then they
should be explicitly handled.

In most places "any" is replaced by "$test_node".  In some cases,
where $test_node is not set, a fixed node that is already used
elsewhere can be reused.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:38 +00:00
Martin Schwenke
58f9f699f1 ctdb-tests: Don't bother shutting down daemons in ctdb_init()
They'll never be up here...

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:38 +00:00
Martin Schwenke
e9df17b500 ctdb-tests: Separate custom cluster startup from test initialisation
Separate cluster startup from test initialisation for tests that start
the cluster with customised configuration.  In these cases the result
of the cluster startup is actually the point of the test.
Additionally, pubips.013.failover_noop.sh claims to have completed
test initialisation twice, which just seems wrong.

The result is:

* ctdb_test_init() takes one option (-n) to indicate when it should
  not configure/start the cluster

* New function ctdb_nodes_start_custom() accepts options for special
  cluster configuration, only operates on local daemons and triggers a
  test failure rather than a test error on failure.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:38 +00:00
Martin Schwenke
44e05ac851 ctdb-tests: Do not trigger ctdb_test_error() from ctdb_init()
The only caller calls ctdb_test_error() on failure and nesting this
calls can be confusing.  A future change will make this even more
confusing.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:38 +00:00
Martin Schwenke
30293baae5 ctdb-tests: Make unit.sh pass shellcheck
Mostly avoidance of quoting warnings.

Silencing warnings about unquoted $CTDB_TEST_CAT_RESULTS_OPTS is
handled by passing '-' to cat when that variable's value is empty.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:37 +00:00
Martin Schwenke
0f04b8a70b ctdb-tests: Make integration.bash pass shellcheck
Apart from the non-constant sourcing of include files.

Mostly avoidance of quoting warnings.

One subtle change is to simply pass "120" to wait_until_ready() to
stop warnings that it expects arguments but none are passed (both
SC2119 and SC2120).  There seems no way to indicate to structure
function argument handling so that shellcheck realises arguments are
optional.  In later shellcheck versions, disabling SC2120 for a
function also silences complaints about its callers... but not all of
our testing uses "later" shellcheck versions.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:37 +00:00
Martin Schwenke
9a7cabd342 ctdb-tests: Use "#!/usr/bin/env bash" for improved portability
Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:37 +00:00
Martin Schwenke
65f56505e2 ctdb-tests: Update preamble for INTEGRATION tests
* Use "#!/usr/bin/env bash" for improved portability

* Drop test_info() definition and replace it with a comment

  The use of test_info() is pointless.

* Drop call to cluster_is_healthy()

  This is a holdover from when the previous test would restart daemons
  to get things ready for a test.  There was also a bug where going
  into recovery during the restart would sometimes cause the cluster
  to become unhealthy.  If we really need something like this then we
  can add it to ctdb_test_init().

* Make order of preamble consistent

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:37 +00:00
Martin Schwenke
a55dd6f17b ctdb-tests: Drop unreachable line
ctdb_test_skip() will exit.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:37 +00:00
Martin Schwenke
847aa0e367 ctdb-tests: Redirect stderr too when checking for shellcheck
Avoid:

  .../UNIT/shellcheck/scripts/local.sh: line 14: type: shellcheck: not found

The "type" command in dash prints the "not found" message to stdout
but the bash version prints to stderr, so redirect stderr too.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:37 +00:00
Martin Schwenke
c78de201f8 ctdb-tests: Show hung script debugging output
The output in a test failure appears to contain no pstree output
because "00\.test\.script,.*" does not match.  However, this is just a
guess because the output is not shown.

Showing the output makes it easier to understand test failures.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:37 +00:00
Martin Schwenke
9694ba6fe4 ctdb-tests: Enable SOCKET_WRAPPER_DIR_ALLOW_ORIG
This will allow local daemons to be used in more contexts, especially
in tests run by Jenkins where the directory names for some targets can
be very long.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:37 +00:00
Martin Schwenke
91c36c16c8 ctdb-build: Don't build/install tests in top-level build by default
The standalone build still includes tests, as does the top-level build
when --enable-selftest is used.  The latter is consistent with the use
of --enable-selftest in the rest of the tree.

Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-07-22 02:42:37 +00:00
Martin Schwenke
0e287127cb ctdb-tools: Improve onnode's ShellCheck credibility
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jul 16 06:51:47 UTC 2020 on sn-devel-184
2020-07-16 06:51:47 +00:00
Martin Schwenke
5f217d6037 ctdb-tools: Allow onnode -P to respect ONNODE_SSH
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-16 05:28:42 +00:00
Martin Schwenke
00eb88b241 ctdb-tools: Whitespace fixups
Drop some unnecessary whitespace and re-indent push().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-16 05:28:42 +00:00
Martin Schwenke
bc174243d7 ctdb-tools: Drop undocumented ONNODE_SSH_OPTS variable
Options can be set in ONNODE_SSH, so this variable is unnecessary.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-07-16 05:28:42 +00:00
Martin Schwenke
1e55591bc5 ctdb-tests: Add a new fetch ring test that also checks hot keys
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri May 22 08:05:54 UTC 2020 on sn-devel-184
2020-05-22 08:05:54 +00:00
Martin Schwenke
fb38252677 ctdb-tests: Update fetch_ring to take database and key on command line
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-05-22 06:41:45 +00:00
Martin Schwenke
53b73b9b0f ctdb-daemon: Fix sorting of hot keys
The current code only ever swaps with slot 0.  This will only ever
happen with slots 0 and 1, so probably never sorts.

Replace with qsort().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-05-22 06:41:45 +00:00
Martin Schwenke
5c8dfbbf9b ctdb-daemon: Add extra logging of hot keys
ctdbd currently only logs when a new hot key is added.  If a key gets
hotter then nothing new is logged.

Log hot key updates when the number of migrations has doubled since
the last time that key was logged.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-05-22 06:41:45 +00:00
Martin Schwenke
baf058dcf7 ctdb-daemon: Update hot key logging
This message indicates that a hot key was added, so say that.  After
all the hot key slots have been filled the id will always be 0, so
don't bother logging it.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-05-22 06:41:44 +00:00
Martin Schwenke
1ab39b3270 ctdb-daemon: Fix bug in slot 0 comparison optimisation
This is only valid if all slots are in use.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-05-22 06:41:44 +00:00
Martin Schwenke
f9f60c2a60 ctdb-daemon: Switch some variables to unsigned
These should be unsigned but luck is currently on our side.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-05-22 06:41:44 +00:00
Martin Schwenke
21b9844bcb ctdb-daemon: Add separate hot keys array for database statistics
There are 2 reasons for this.  Sorting of hot keys is broken and will
be changed to an implementation that needs a named (i.e. not
anonymous) structure.  Also, at least one non-protocol field will be
added to facilitate more useful logging.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-05-22 06:41:44 +00:00
Martin Schwenke
c28914bfa7 ctdb-build: Fix a typo
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-05-22 06:41:44 +00:00
Ralph Boehme
6e419dda71 ctdb: increase TasksMax limit, the systemd default is just 512
In 2015 systemd introduced a TasksMax which limits the number of processes in a
unit:

https://lists.freedesktop.org/archives/systemd-devel/2015-November/035006.html

The default of 512 may be too low in certain situations leading to vfork()
failing with errno=EAGAIN when trying to spawn lock-helper processes.

With the default for LockProcessesPerDB being 200 the increased TasksMax limit
should cover the problematic scenario.

Additional background: the failing vfork()s have been seen on production
clusters and were tracked down to being logged in the context of ctdb calling
tdb_repack().

Links:

9ded9cd14c
https://www.suse.com/support/kb/doc/?id=000015901
https://success.docker.com/article/how-to-reserve-resource-temporarily-unavailable-errors-due-to-tasksmax-setting
https://www.percona.com/blog/2019/01/02/tasksmax-another-setting-that-can-cause-mysql-error-messages/

Signed-off-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed May 13 13:30:12 UTC 2020 on sn-devel-184
2020-05-13 13:30:12 +00:00
Amitay Isaacs
23c2195e2c ctdb-build: Add messages_dgm build to ctdb
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed May  6 01:47:16 UTC 2020 on sn-devel-184
2020-05-06 01:47:16 +00:00
Amitay Isaacs
a59fd8164c lib/util: Build genrand for util core
messages_dgm depends on genrand.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
2020-05-06 00:06:40 +00:00
Volker Lendecke
d9ccd853c3 ctdb: Implement CTDB_CONTROL_ECHO_DATA
Testing control: 4 bytes msec delay plus a blob, return the request after the
delay. This is an enhanced "ping" which can be used to test asynchronous
clients.

Doesn't have the full protocol implementation yet

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-04-28 09:08:39 +00:00
Volker Lendecke
bdabf78122 ctdb-protocol: Add marshalling for control ECHO_DATA
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-04-28 09:08:39 +00:00
Volker Lendecke
6f56f45639 ctdb-protocol: Add marshalling for struct ctdb_echo_data
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-04-28 09:08:39 +00:00
Volker Lendecke
4f3db63d5e ctdb-protocol: Add new control CTDB_CONTROL_ECHO_DATA
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-04-28 09:08:39 +00:00
Volker Lendecke
861dd8c48a ctdb: Fix duplicate ;;
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-04-28 09:08:39 +00:00
Renaud Fortier
fdfc480a56 ctdb-scripts: Update nfs-ganesha-callout
On debian buster, this variable doesn't exist anymore. Look at this PR
as a reference:

  https://github.com/gluster/storhaug/pull/30

Signed-off-by: Renaud Fortier <renaud.fortier@fsaa.ulaval.ca>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Apr 23 08:07:51 UTC 2020 on sn-devel-184
2020-04-23 08:07:51 +00:00
Volker Lendecke
ad4b53f2d9 ctdb: Fix a memleak
Bug: https://bugzilla.samba.org/show_bug.cgi?id=14348
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Apr 17 08:32:35 UTC 2020 on sn-devel-184
2020-04-17 08:32:35 +00:00
Martin Schwenke
f8f3d7954d ctdb-vacuum: Reschedule vacuum event if VacuumInterval has increased
The vacuuming integration tests set VacuumInterval to a very high
number to avoid vacuuming collisions.  This is done after the cluster
is healthy, so Samba will have already been started and vacuuming will
already be scheduled *at the default interval* for databases attached
by Samba.  This means that vacuuming controls used by vacuuming tests
can still collide with the scheduled vacuuming events.

Add some logic to reschedule a vacuuming event that has fired but
where VacuumInterval has increased since it was originally scheduled.
The increase in VacuumInterval is used as the time offset for
rescheduling the event.

Although this changes production behaviour for the convenience of
testing, the new behaviour is completely reasonable and obeys the
principle of least surprise.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Apr  7 03:04:57 UTC 2020 on sn-devel-184
2020-04-07 03:04:57 +00:00
Martin Schwenke
5d03a3c86e ctdb-vacuum: Store value of VacuumInterval in ctdb_vacuum_handle
No behaviour change.  This is final staging to make the next change
completely obvious.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-04-07 01:26:41 +00:00
Martin Schwenke
7ad7c0b932 ctdb-vacuum: Use vacuum_handle local variables
No behaviour change.  This just makes future changes clearer by
avoiding reformatting (or introducing local variables).

Clean up error handling while touching a relevant line.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-04-07 01:26:41 +00:00
Martin Schwenke
716f52f68b ctdb-recoverd: Avoid dereferencing NULL rec->nodemap
Inside the nested event loop in ctdb_ctrl_getnodemap(), various
asynchronous handlers may dereference rec->nodemap, which will be
NULL.

One example is lost_reclock_handler(), which causes rec->nodemap to be
unconditionally dereferenced in list_of_nodes() via this call chain:

  list_of_nodes()
  list_of_active_nodes()
  set_recovery_mode()
  force_election()
  lost_reclock_handler()

Instead of attempting to trace all of the cases, just avoid leaving
rec->nodemap set to NULL.  Attempting to use an old value is generally
harmless, especially since it will be the same as the new value in
most cases.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14324

Reported-by: Volker Lendecke <vl@samba.org>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Mar 24 01:22:45 UTC 2020 on sn-devel-184
2020-03-24 01:22:45 +00:00
Martin Schwenke
147afe77de ctdb-daemon: Don't allow attach from recovery if recovery is not active
Neither the recovery daemon nor the recovery helper should attach
databases outside of the recovery process.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:38 +00:00
Martin Schwenke
052f1bdb9c ctdb-daemon: Remove more unused old client database functions
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:38 +00:00
Martin Schwenke
3a66d181b6 ctdb-recovery: Remove old code for creating missing databases
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:38 +00:00
Martin Schwenke
76a8174279 ctdb-recovery: Create database on nodes where it is missing
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:38 +00:00
Martin Schwenke
e6e63f8fb8 ctdb-recovery: Fetch database name from all nodes where it is attached
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:38 +00:00
Martin Schwenke
1bdfeb3fdc ctdb-recovery: Pass db structure for each database recovery
Instead of db_id and db_flags.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:38 +00:00
Martin Schwenke
c6f74e590f ctdb-recovery: GET_DBMAP from all nodes
This builds a complete list of databases across the cluster so it can
be used to create databases on the nodes where they are missing.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:38 +00:00
Martin Schwenke
4c0b9c3605 ctdb-recovery: Replace use of ctdb_dbid_map with local db_list
This will be used to build a merged list of databases from all nodes,
allowing the recovery helper to create missing databases.

It would be possible to also include the db_name field in this
structure but that would cause a lot of churn.  This field is used
locally in the recovery of each database so can continue to live in
the relevant state structure(s).

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:38 +00:00
Martin Schwenke
7e5a8a4884 ctdb-daemon: Respect CTDB_CTRL_FLAG_ATTACH_RECOVERY when attaching databases
This is currently only set by the recovery daemon when it attaches
missing databases, so there is no obvious behaviour change.  However,
attaching missing databases can now be moved to the recovery helper as
long as it sets this flag.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:38 +00:00
Martin Schwenke
98e3d0db2b ctdb-recovery: Use CTDB_CTRL_FLAG_ATTACH_RECOVERY to attach during recovery
ctdb_ctrl_createdb() is only called by the recovery daemon, so this is
a safe, temporary change.  This is temporary because
ctdb_ctrl_createdb(), create_missing_remote_databases() and
create_missing_local_databases() will all go away soon.

Note that this doesn't cause a change in behaviour.  The main daemon
will still only defer attaches from non-recoverd processes during
recovery.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:38 +00:00
Martin Schwenke
17ed042590 ctdb-protocol: Add control flag CTDB_CTRL_FLAG_ATTACH_RECOVERY
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:37 +00:00
Martin Schwenke
fc23cd1b9c ctdb-daemon: Remove unused old client database functions
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:37 +00:00
Martin Schwenke
c6c89495fb ctdb-daemon: Fix database attach deferral logic
Commit 3cc230b5ee says:

  Dont allow clients to connect to databases untile we are well past
  and through the initial recovery phase

It is unclear what this commit was attempting to do.  The commit
message implies that more attaches should be deferred but the code
change adds a conjunction that causes less attaches to be deferred.
In particular, no attaches will be deferred after startup is complete.
This seems wrong.

To implement what seems to be stated in the commit message an "or"
needs to be used so that non-recovery daemon attaches are deferred
either when in recovery or before startup is complete.  Making this
change highlights that attaches need to be allowed during the
"startup" event because this is when smbd is started.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-23 23:45:37 +00:00
Amitay Isaacs
1c56d6413f ctdb-recovery: Refactor banning a node into separate computation
If a node is marked for banning, confirm that it's not become inactive
during the recovery.  If yes, then don't ban the node.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-03-23 23:45:37 +00:00
Amitay Isaacs
c6a0ff1bed ctdb-recovery: Don't trust nodemap obtained from local node
It's possible to have a node stopped, but recovery master not yet
updated flags on the local ctdb daemon when recovery is started.  So do
not trust the list of active nodes obtained from the local node.  Query
the connected nodes to calculate the list of active nodes.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-03-23 23:45:37 +00:00
Amitay Isaacs
6e2f8756f1 ctdb-recovery: Consolidate node state
This avoids passing multiple arguments to async computation.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-03-23 23:45:37 +00:00
Amitay Isaacs
072ff4d12b ctdb-recovery: Fetched vnnmap is never used, so don't fetch it
New vnnmap is constructed using the information from all the connected
nodes.  So there is no need to fetch the vnnmap from recovery master.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14294

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-03-23 23:45:37 +00:00
Martin Schwenke
319c93f0c6 ctdb-tcp: Do not stop outbound connection in ctdb_tcp_node_connect()
The only place the outgoing connection needs to be stopped is when
there is a timeout when waiting for the connection to become writable.
Add a new function ctdb_tcp_node_connect_timeout() to handle this
case.

All of the other cases are attempts to establish a new outgoing
connection (initial attempt, retry after an error or disconnect, ...)
so drop stopping the connection in those cases.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Mar 12 05:29:20 UTC 2020 on sn-devel-184
2020-03-12 05:29:20 +00:00
Martin Schwenke
3c8747fe29 ctdb-tcp: Factor out function ctdb_tcp_start_outgoing()
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-03-12 03:47:30 +00:00
Ralph Boehme
2c73dbafba ctdb-tcp: add ctdb_tcp_stop_incoming()
No change in behaviour.  This makes the code self-documenting.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295

Signed-off-by: Ralph Boehme <slow@samba.org>
Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-03-12 03:47:30 +00:00
Ralph Boehme
1e2a967ff4 ctdb-tcp: rename ctdb_tcp_stop_connection() to ctdb_tcp_stop_outgoing()
No change in behaviour.  This makes the code self-documenting.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295

Signed-off-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-03-12 03:47:30 +00:00
Ralph Boehme
ea37ecdcd5 ctdb-tcp: Remove redundant restart in ctdb_tcp_tnode_cb()
The node dead upcall has already restarted the outgoing connection.
There's no need to repeat it.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295

Signed-off-by: Ralph Boehme <slow@samba.org>
Signed-off-by: Martin Schwenke <martin@meltin.net>
2020-03-12 03:47:30 +00:00
Ralph Boehme
b83ef98c74 ctdb-tcp: always call node_dead() upcall in ctdb_tcp_tnode_cb()
ctdb_tcp_tnode_cb() is called when we receive data on the outgoing connection.

This can happen when we get an EOF on the connection because the other side as
closed. In this case data will be NULL.

It would also be called if we received data from the peer. In this case data
will not be NULL.

The latter case is a fatal error though and we already call
ctdb_tcp_stop_connection() for this case as well, which means even though the
node is not fully connected anymore, by not calling the node_dead() upcall
NODE_FLAGS_DISCONNECTED will not be set.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295

Signed-off-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-03-12 03:47:30 +00:00
Noel Power
0ff1b78fc2 ctdb-tcp: move free of inbound queue to TCP restart
Since commit 77deaadca8, a nodeA which
had previously accepted a connection from nodeB (where nodeB dies
e.g. as as result of fencing) when nodeB attempts to connect again
after restarting is always rejected with

 ctdb_listen_event: Incoming queue active, rejecting connection from w.x.y.z

messages.

Consolidate dead node handling in the TCP restart handling.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295

Signed-off-by: Noel Power <noel.power@suse.com>
Reviewed-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-03-12 03:47:30 +00:00
Martin Schwenke
15762a3455 ctdb-daemon: more logical whitespace, debug modernisation
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Ralph Boehme <slow@samba.org>
2020-03-12 03:47:30 +00:00
Ralph Boehme
6a4fa0785f ctdb-daemon: ensure restart() callback is called in half-connected state
If NODE_FLAGS_DISCONNECTED is set the node can be in half-connected state. With
this change we ensure to restart the transport for this case.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14295

Signed-off-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-03-12 03:47:30 +00:00
Martin Schwenke
9f9dcfb6c3 ctdb-tests: Use built-in hexdump() in system socket tests
Better compatibility, since od output isn't consistent on FreeBSD.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Mar 10 09:17:12 UTC 2020 on sn-devel-184
2020-03-10 09:17:12 +00:00
Martin Schwenke
602694522f ctdb-tests: Split system socket test
One test for each of types, TCP, ARP.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-10 07:37:34 +00:00
Martin Schwenke
b10e79f208 ctdb-tests: Skip "ctdb process-exists" tests when not on Linux
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-10 07:37:34 +00:00
Martin Schwenke
c5dd476715 ctdb-tests: Add function ctdb_test_check_supported_OS
Skips test if not on one of the supported OSes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-10 07:37:34 +00:00
Martin Schwenke
8402dabf88 ctdb-tests: Use ctdb_test_skip() when initscript can not be found
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-10 07:37:34 +00:00
Martin Schwenke
30180ef6c2 ctdb-tests: Use ctdb_test_skip() when shellcheck is not installed
When the tests are run interactively this will make it more noticeable
that shellcheck is not installed because the test summary will
indicate missing tests.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-10 07:37:34 +00:00
Martin Schwenke
77f6977102 ctdb-tests: Skipped tests should not cause failure
Skipped tests return a status that indicates failure.  In combination
with the -e option this results in an exit with failure on the first
skipped test.

Convert skipped test status to success.  The skip has already been
counted.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-03-10 07:37:34 +00:00
Martin Schwenke
be90ab01bb ctdb-docs: Improve recovery lock documentation
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Christof Schmitt <cs@samba.org>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Mon Mar  9 02:27:18 UTC 2020 on sn-devel-184
2020-03-09 02:27:18 +00:00
Martin Schwenke
7cff3ed12c ctdb-tests: Use a local "ctdb shutdown" command to avoid a race
When "ctdb shutdown" is run with -n <N> it does not wait for the node
<N>'s ctdbd to go down but exits immediately.  This means that the
local_daemons.sh shutdown command can find the PID file still present
and then attempt the shutdown, but the daemon can have exited between
the check and the shutdown.  Although the test waits until the node is
disconnected, the transport is taken down just before the exit, so
this does not guarantee the daemon has exited.

A local shutdown command (no -n <N>) waits until the socket
disconnects and this happens *after* the PID file is gone, so this is
safe to use with the local_daemons.sh shutdown command.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Mar  2 10:39:28 UTC 2020 on sn-devel-184
2020-03-02 10:39:28 +00:00
Martin Schwenke
1e72fbdde0 ctdb-tests: Silence a ShellCheck warning
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Ralph Boehme <slow@samba.org>

Autobuild-User(master): Ralph Böhme <slow@samba.org>
Autobuild-Date(master): Sat Feb 29 11:53:42 UTC 2020 on sn-devel-184
2020-02-29 11:53:42 +00:00
Ralph Boehme
c2dba1f53b ctdb: add tail logs option to local_daemons.sh
Signed-off-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Sat Feb 29 08:02:50 UTC 2020 on sn-devel-184
2020-02-29 08:02:50 +00:00
Anoop C S
959235fffb ctdb-docs: Move CTDB_SERVICE_NMB to new 48.netbios section
Signed-off-by: Anoop C S <anoopcs@redhat.com>
Reviewed-by: Guenther Deschner <gd@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Feb 27 07:34:53 UTC 2020 on sn-devel-184
2020-02-27 07:34:53 +00:00
Anoop C S
512fa29cce ctdb-scripts: Change CTDB_SERVICE_NMB default value to 'nmb'
Till now 50.samba script was based on RHEL versions <=6 where we didn't
have separate start up script for nmb and smbd used to start nmbd when
required. Now that nmbd has its own start up script named "nmb" it is
reasonable to have "nmb" as default value for CTDB_SERVICE_NMB inside
new 48.netbios ctdb script.

Signed-off-by: Anoop C S <anoopcs@redhat.com>
Reviewed-by: Guenther Deschner <gd@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-02-27 06:07:41 +00:00
Günther Deschner
26e1556819 ctdb-scripts: add new 48.netbios script for starting nmbd
This change basically moves out nmbd references from 50.samba script to
a new 48.netbios script. Accordingly ctdb test scripts are tweaked to
cope with newly added script.

Signed-off-by: Guenther Deschner <gd@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-02-27 06:07:41 +00:00
Martin Schwenke
4de1e3207b ctdb-docs: Provide example commands for "ctdb event ..."
The example output doesn't tell a user what command generated it.
Adding the command makes the examples much more useful.

Reported-by: Stefan Kania <stefan@kania-online.de>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Feb 18 04:22:56 UTC 2020 on sn-devel-184
2020-02-18 04:22:56 +00:00
Martin Schwenke
3d5de9b26d ctdb-tests: Flag setup, startup, shutdown failures as test errors
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-18 02:56:38 +00:00
Martin Schwenke
455d931a16 ctdb-tests: Dump logs on shutdown failure
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-18 02:56:38 +00:00
Martin Schwenke
03403aacfe ctdb-tests: Avoid shutdown error when daemon already cleanly shut down
This depends on a small amount of internal knowledge but is the
cleanest way of avoiding errors for nodes that have already been shut
down.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-18 02:56:38 +00:00
Martin Schwenke
dc076b835f ctdb-tests: Rationalise node stop/start/restart
Separate functions are not needed for stopping/starting/restarting
individual nodes.  The stop and start functions essentially just use
onnode, though for local daemons this is embedded in local_daemons.sh.
So, just provide one stop and one start function that takes an
optional nodespec, defaulting to all nodes.

Restarting becomes common.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-18 02:56:38 +00:00
Martin Schwenke
a20403adf8 ctdb-daemon: Fix signed/unsigned comparison
csbuild says:

  ctdb/server/ctdb_lock.c: scope_hint: In function ‘ctdb_find_lock_context’
  ctdb/server/ctdb_lock.c:671:33: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-18 02:56:38 +00:00
Martin Schwenke
c9405aec70 ctdb-daemon: Check for lock count underflow
This is a programming error.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-18 02:56:38 +00:00
Amitay Isaacs
c16da0e8f0 ctdb-common: Remove signed/unsigned comparisons
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2020-02-18 02:56:38 +00:00
Martin Schwenke
bd279d3f98 ctdb-tests: Fix getdbmap test so that it actually works sanely
* Typo in variable name db_map_pattern
* Variable num_db_init used before set
* dbmap_pattern does not cover database flags

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Feb 12 04:38:47 UTC 2020 on sn-devel-184
2020-02-12 04:38:47 +00:00
Martin Schwenke
224e897872 ctdb-tests: Fix handling of --no-event-scripts option
Shellcheck noticed that pnn was never referenced.  Not sure this ever
worked or whether it got broken somewhere along the way.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-12 03:11:39 +00:00
Martin Schwenke
a6d464aa2e ctdb-tests: Use a here document to improve readability
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-12 03:11:39 +00:00
Martin Schwenke
b9f23f5b49 ctdb-tests: Use select_test_node()
select_test_node_and_ips() is not required in these cases.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-12 03:11:39 +00:00
Martin Schwenke
0162fd87ed ctdb-tests: Increase to dumping up to 500 lines of logs on error
100 lines are not enough to debug a current issue.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-12 03:11:39 +00:00
Martin Schwenke
5a702b01f6 ctdb-tests: Fix return value of DB test tool delete command
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-12 03:11:39 +00:00
Martin Schwenke
a40fc709cc ctdb-tcp: Make error handling for outbound connection consistent
If we can't bind the local end of an outgoing connection then
something has gone wrong.  Retrying is better than failing into a
zombie state.  The interface might come back up and/or the address my
be reconfigured.

While here, do the same thing for the other (potentially transient)
failures.

The unknown address family failure is special but just handle it via a
retry.  Technically it can't happen because the node address parsing
can only return values with address family AF_INET or AF_INET6.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=14274

Reported-by: 耿纪超 <gengjichao@jd.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-12 03:11:39 +00:00
Martin Schwenke
0b3db29bd5 ctdb-tests: Add some tool unit tests to ensure that timeouts work
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Feb 10 05:34:08 UTC 2020 on sn-devel-184
2020-02-10 05:34:08 +00:00
Martin Schwenke
0e59cd25e1 ctdb-tools: Allow shorter runtime limit to be specified
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
39206fd327 ctdb-tools: When in test mode set process group in top-level ctdb tool
If ctdbd hangs when shutting down in post-test clean-up then killing
the process group can kill the test.  When in test mode, create a
process group but only in the top-level ctdb tool - the natgw and lvs
helpers also run the ctdb tool.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
3b0b830e40 ctdb-tests: Use $PWD/bin/ if it exists when running in-tree
When running tests from a top-level build, a stale build in ctdb/bin/
will be preferred and may cause confusing results.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
b0b14e4edd ctdb-tests: Make $ctdb_dir absolute
This is used to set several variables so it might as well be cd-proof.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
1a0e1f8924 ctdb-daemon: Fork when not interactive and test mode is enabled
There is no sane way of keeping stdin open when using the shell to
background ctdbd in local_daemons.sh.  Instead, have ctdbd fork when
not interactive and when test mode is enabled.  become_daemon() can't
be used for this: if it forks then it also closes stdin.

For the interactive case, become_daemon() wasn't doing anything
special, so do nothing instead.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
a220e9454a ctdb-daemon: Make some conditions more explicit
These don't need to depend on do_fork.  Child logging should be set up
whenever the daemon is not interactive.  The stdin handler should be
setup whenever test mode is enabled.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:39 +00:00
Martin Schwenke
cefb3327c6 ctdb-daemon: Pass more information to ctdb_start_daemon()
No functional changes.

This is staging for a change that makes ctdbd fork when test mode is
enabled but interactive is not set.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:38 +00:00
Martin Schwenke
3509aa28d4 ctdb-tests: Don't actually close stdin in fake ssh
A subsequent file descriptor allocation may return 0 and unexpected
things may then happen.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:38 +00:00
Martin Schwenke
8de1bb75e5 ctdb-tests: Redirect stdin from /dev/null when running a test
Otherwise, if the test is run via ssh it will "unexpectedly" find
itself at the other end of a pipe.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:38 +00:00
Martin Schwenke
0737849b90 Revert "ctdb-tests: Enable job control when keeping stdin open"
This doesn't work when stdin is not a tty.

This reverts commit ea754bfdec.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-02-10 04:07:38 +00:00
Volker Lendecke
3d40efaed8 ctdb-test: Fix a typo
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>

Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org>
Autobuild-Date(master): Thu Jan 30 13:53:22 UTC 2020 on sn-devel-184
2020-01-30 13:53:22 +00:00
Martin Schwenke
ea754bfdec ctdb-tests: Enable job control when keeping stdin open
POSIX says:

  If job control is disabled (see set, -m), the standard input for an
  asynchronous list, before any explicit redirections are performed,
  shall be considered to be assigned to a file that has the same
  properties as /dev/null. This shall not happen if job control is
  enabled. In all cases, explicit redirection of standard input shall
  override this activity.

ctdbd is backgrounded at startup, so the above causes stdin to be
redirected from /dev/null.  Enable job control to work around this.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Jan 28 11:24:35 UTC 2020 on sn-devel-184
2020-01-28 11:24:35 +00:00
Martin Schwenke
2380b13bf8 ctdb-tests: Don't close stdin when starting local daemons
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2020-01-28 09:57:33 +00:00