mirror of https://github.com/samba-team/samba.git synced 2025-01-11 05:18:09 +03:00
Commit Graph

8393 Commits

Author SHA1 Message Date
Martin Schwenke
dc89db8ca6 ctdb-tests: Fix logic error in simple ctdb reloadips test
There is a chance that restoring IP addresses to the test node will
result in different IP addresses being assigned to that node.
Removing a single IP address may then fail (or be a no-op) if it is
done after the restore.

So, swap the single IP address removal to happen first, then restore,
then remove all IP addresses.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-14 07:25:37 +00:00
Martin Schwenke
8be4ee1a28 ctdb-tests: Make ctdb reloadips tests more reliable
ctdb reloadips will fail if it can't disable takeover runs.  The most
likely reason for this is that there is already a takeover run in
progress.  We can't predict when this will happen, so retry if this
occurs.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-14 07:25:37 +00:00
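The retry approach described in 8be4ee1a28 can be illustrated with a small C sketch. This is not the test code itself (the ctdb tests are shell scripts); `retry_command`, `command_fn`, `flaky_command` and `struct flaky` are hypothetical names, used only to show the pattern of re-running a command that may fail transiently, for example while a takeover run is in progress.

```c
#include <unistd.h>

/* Hypothetical callback type: returns 0 on success, non-zero on failure. */
typedef int (*command_fn)(void *ctx);

/*
 * Run fn up to max_attempts times, sleeping delay_secs between failed
 * attempts.  Returns 0 as soon as one attempt succeeds, otherwise the
 * result of the last attempt (or -1 if max_attempts is not positive).
 */
int retry_command(command_fn fn, void *ctx, int max_attempts,
		  unsigned int delay_secs)
{
	int ret = -1;
	int i;

	for (i = 0; i < max_attempts; i++) {
		ret = fn(ctx);
		if (ret == 0) {
			break;
		}
		if (i + 1 < max_attempts && delay_secs > 0) {
			sleep(delay_secs);
		}
	}
	return ret;
}

/* Stand-in for a command that fails a fixed number of times first. */
struct flaky {
	int failures_left;
	int calls;
};

int flaky_command(void *ctx)
{
	struct flaky *f = ctx;

	f->calls++;
	if (f->failures_left > 0) {
		f->failures_left--;
		return 1;
	}
	return 0;
}
```

A bounded attempt count keeps a genuinely broken command from looping forever, while still absorbing failures that are only due to timing.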
Martin Schwenke
cf00db4035 ctdb-tests: Capture output in $out on failure as well
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-14 07:25:37 +00:00
Martin Schwenke
c75fbeaa96 ctdb-tests: Remove old socket wrapper state directory during setup
Otherwise, when looping tests for a long time, nodes are unable to
connect to each other.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon May 13 08:42:44 UTC 2019 on sn-devel-184
2019-05-13 08:42:44 +00:00
Martin Schwenke
97ad353a67 ctdb-tests: Actually restart if cluster doesn't become healthy
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-13 07:27:24 +00:00
Martin Schwenke
a60e77157c ctdb-tests: Add dump-logs command for local daemons
Dump a single merged log to stdout.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-13 07:27:24 +00:00
Amitay Isaacs
a0a82f1b6a ctdb-tests: Add reqid wrapping test
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13930

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-05-13 07:27:24 +00:00
Martin Schwenke
8663e0a64f ctdb-daemon: Never use 0 as a client ID
ctdb_control_db_attach() and ctdb_control_db_detach() assume that any
control with client ID 0 comes from another daemon and treat it
specially.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13930

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-13 07:27:24 +00:00
David Disseldorp
04c0e5212d ctdb/build: fix ctdb_mutex_ceph_rados_helper builds
2b5dbb3525 fixed builds with an explicit
--with-libcephfs but broke builds against system Ceph libraries. This
change handles both cases.

Signed-off-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu May  9 04:24:56 UTC 2019 on sn-devel-184
2019-05-09 04:24:56 +00:00
Andreas Schneider
830cb7e675 ctdb:common: Do not print NULL if we don't get a sockpath
sock_socket_start_recv() might not fill sockpath if we return early.

Found by GCC 9.

Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2019-05-08 16:33:24 +00:00
Andreas Schneider
d16e9dfc81 ctdb: Fix format in db_hash_test
error: ‘%04d’ directive writing between 4 and 11 bytes into a region of
size 5 [-Werror=format-overflow=]
   sprintf(key, "key%04d", i);

Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
2019-05-07 17:31:23 +00:00
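The warning quoted in d16e9dfc81 arises because "%04d" sets only a minimum width: an int can format to as many as 11 characters (for example "-2147483648"). The sketch below shows one common way to avoid this class of problem; it is not necessarily the change made in that commit. `make_key` and `KEY_BUF_SIZE` are hypothetical names.

```c
#include <stdio.h>

/* "key" (3) + up to 11 characters for a formatted int + NUL (1) = 15. */
#define KEY_BUF_SIZE 16

/*
 * Format "key%04d" into buf.  Returns 0 on success, or -1 if the
 * output was truncated or a formatting error occurred.
 */
int make_key(char *buf, size_t buflen, int i)
{
	int n = snprintf(buf, buflen, "key%04d", i);

	if (n < 0 || (size_t)n >= buflen) {
		return -1;
	}
	return 0;
}
```

Sizing the buffer for the worst case removes the overflow, and checking the snprintf() return value catches truncation if a caller passes a smaller buffer.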
Martin Schwenke
5a9e338330 ctdb-tests: Don't clean up test var directory in autotest target
If the directory is always cleaned up then it is not possible to look
at daemon logs to debug test failures.

This target is only really used by autobuild.py, which (optionally)
cleans up the parent directory anyway.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue May  7 06:56:01 UTC 2019 on sn-devel-184
2019-05-07 06:56:01 +00:00
Martin Schwenke
a2ab6485e0 ctdb-tests: Fix usage message
Since commit 0e9ead8f28 daemons have
been shut down after each test, so this option no longer has anything
to do with killing daemons.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-07 05:45:35 +00:00
Martin Schwenke
3cb53a7a05 ctdb-tests: Wait to allow database attach/detach to take effect
Sometimes the detach test fails:

  Check detaching single test database detach_test1.tdb
  BAD: database detach_test1.tdb is still attached
  Number of databases:4
  dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.0/db/volatile/detach_test4.tdb.0
  dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.0/db/volatile/detach_test3.tdb.0
  dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.0/db/volatile/detach_test2.tdb.0
  dbid:0xc62491f4 name:detach_test1.tdb path:tests/var/simple/node.0/db/volatile/detach_test1.tdb.0
  Number of databases:3
  dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.1/db/volatile/detach_test4.tdb.1
  dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.1/db/volatile/detach_test3.tdb.1
  dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.1/db/volatile/detach_test2.tdb.1
  Number of databases:4
  dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.2/db/volatile/detach_test4.tdb.2
  dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.2/db/volatile/detach_test3.tdb.2
  dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.2/db/volatile/detach_test2.tdb.2
  dbid:0xc62491f4 name:detach_test1.tdb path:tests/var/simple/node.2/db/volatile/detach_test1.tdb.2
  *** TEST COMPLETED (RC=1) AT 2019-04-27 03:35:40, CLEANING UP...

When issued from a client, the detach control re-broadcasts itself
asynchronously to all nodes and then returns success.  The controls to
some nodes to do the actual detach may still be in flight when success
is returned to the client.  Therefore, the test should wait for a few
seconds to allow the asynchronous controls to complete.

The same is true for the attach control, so work around the problem in
the attach test too.

An alternative is to make the attach and detach controls synchronous
by avoiding the broadcast and waiting for the results of the
individual controls sent to the nodes.  However, a simple
implementation would involve adding new nested event loops.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-07 05:45:35 +00:00
Martin Schwenke
066cc5b0c5 ctdb-tests: Avoid bulk output in $out, prefer $outfile
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-07 05:45:35 +00:00
Martin Schwenke
9d02452a24 ctdb-tests: Make try_command_on_node less error-prone
This sometimes fails, apparently due to a cat process in onnode
getting EAGAIN.  The conclusion is that tests that process large
amounts of output should not depend on a sub-shell delivering that
output into a shell variable.

Change try_command_on_node() to leave all of the output in file
$outfile and just put the first 1KB into $out.  $outfile is removed
after each test completes.

Change the implementation of sanity_check_output() to use $outfile
instead of $out.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-07 05:45:34 +00:00
Martin Schwenke
7c3819d1ac ctdb-tests: Change sanity_check_output() to internally use $out
All callers are currently passed $out.  Global variable $out is used
in many other places so use it here to simplify the interface and make
future changes simpler.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-07 05:45:34 +00:00
Martin Schwenke
b80967f5dc ctdb-scripts: Drop script configuration variable CTDB_MONITOR_SWAP_USAGE
CTDB's system memory monitoring in 05.system.script monitors both main
memory and swap.  The swap monitoring was originally based on
the (possibly incorrect, see below) idea that swap space stacks on top
of main memory, so that when a system starts filling swap space then
this is supposed to be a good sign that the system is running out of
memory.  Additionally, performance on a Linux system tends to be
destroyed by the I/O associated with a lot of swapping to spinning
disks.

However, some platforms default to creating only 4GB of swap space
even when there is 128GB of main memory.  With such a small swap to
main memory ratio, memory pressure can force swap to be nearly full
even when a significant amount of main memory is still available and
the system is performing well.  This suggests that checking swap
utilisation might be less than useful in many circumstances.

So, remove the separate swap space checking and change the memory
check to cover the total of main memory and swap space.

Test function set_mem_usage() still takes an argument for each of main
memory and swap space utilisation.  For simplicity, the same number is
now passed twice to make the intended results comprehensible.  This
could be changed later.

A couple of tests are cleaned up to no longer use hard-coded
/proc/meminfo and ps output.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-07 05:45:34 +00:00
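The arithmetic behind b80967f5dc can be sketched in C, although the real check lives in a shell event script. The function below is a hypothetical illustration, assuming the four inputs are the totals and the available/free amounts in kB, as /proc/meminfo reports them. The exact fields the script reads are not stated in the commit message.

```c
#include <stdint.h>

/*
 * Percentage of (main memory + swap) in use.  All arguments are in
 * the same unit, e.g. kB.  Returns 0 if there is no memory at all.
 */
unsigned int combined_usage_pct(uint64_t mem_total, uint64_t mem_avail,
				uint64_t swap_total, uint64_t swap_free)
{
	uint64_t total = mem_total + swap_total;
	uint64_t mem_used, swap_used;

	if (total == 0) {
		return 0;
	}
	/* Clamp so that inconsistent inputs cannot underflow. */
	mem_used = (mem_avail < mem_total) ? mem_total - mem_avail : 0;
	swap_used = (swap_free < swap_total) ? swap_total - swap_free : 0;

	return (unsigned int)((mem_used + swap_used) * 100 / total);
}
```

For the scenario in the commit message (128GB of main memory, 4GB of swap), a system with swap completely full and half of main memory still available comes out at 51% combined utilisation, whereas a separate swap check would have reported 100%.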
Martin Schwenke
8108b3134c ctdb-tests: Extend test to cover ctdb rddumpmemory
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13923

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-07 05:45:34 +00:00
Martin Schwenke
f78d9388fb ctdb-tools: Fix ctdb dumpmemory to avoid printing trailing NUL
Fix ctdb rddumpmemory too.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13923

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-07 05:45:34 +00:00
Martin Schwenke
95477e69e3 ctdb-daemon: Log when ctdbd CPU utilisation exceeds a threshold
This is to help us notice when ctdbd is using the full capacity of a
CPU, so is saturated.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-07 05:45:34 +00:00
Martin Schwenke
87032ccebd ctdb-build: Add check for getrusage()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-05-07 05:45:34 +00:00
David Disseldorp
2b5dbb3525 build: add explicit cephfs include path for vfs_ceph builds
Needed if building with a custom --with-libcephfs path.

Signed-off-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2019-04-12 18:38:20 +00:00
Amitay Isaacs
289201277c ctdb-common: Avoid race between fd and signal events
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13895

In run_proc, there was an implicit assumption that when a process exits,
fd event (pipe between parent and child) would be processed first and
signal event (SIGCHLD for the child) would be processed later.

However, that is not the case.  SIGCHLD can be received asynchronously
any time even when the pipe data has not fully been read.  This causes
run_proc to miss some of the output from child process in tests.

When SIGCHLD is being processed, if the pipe between parent and child is
still open, then do an explicit read from the pipe to ensure we read any
data still in the pipe before closing the pipe.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Apr 12 08:19:29 UTC 2019 on sn-devel-144
2019-04-12 08:19:29 +00:00
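The fix described in 289201277c (read whatever is left in the pipe when SIGCHLD is handled, before closing it) can be sketched with plain POSIX calls. `drain_pipe` is a hypothetical helper, not the function in run_proc, and it assumes the caller supplies a buffer large enough for the remaining output.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read from fd until EOF, until the buffer is full, or until a
 * non-blocking fd reports that no more data is available.
 * Returns the number of bytes read, or -1 on a real error.
 */
ssize_t drain_pipe(int fd, char *buf, size_t buflen)
{
	size_t total = 0;

	while (total < buflen) {
		ssize_t n = read(fd, buf + total, buflen - total);

		if (n > 0) {
			total += (size_t)n;
			continue;
		}
		if (n == 0) {
			break;		/* EOF: writer closed its end */
		}
		if (errno == EINTR) {
			continue;	/* interrupted, try again */
		}
		if (errno == EAGAIN || errno == EWOULDBLOCK) {
			break;		/* nothing more right now */
		}
		return -1;
	}
	return (ssize_t)total;
}
```

Calling something like this from the child-exit path means the output is collected regardless of whether the fd event or the signal event is processed first.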
Martin Schwenke
38dc6d11a2 ctdb-daemon: Revert "We can not assume that just because we could complete a TCP handshake"
We also can not assume that nodes can be marked as connected via only
the keepalive mechanism.  Keepalives are not sent to disconnected
nodes so, in the absence of other packets (e.g. broadcasts), 2 nodes
may never become marked as connected to each other.

Revert to marking nodes as connected in the TCP transport code.  If a
connection is to a non(-operational) ctdbd then it will revert to
disconnected after a short while and may actually flap.  This should
be rare.

This reverts commit 66919db3d7.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13888

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-04-12 07:11:30 +00:00
Martin Schwenke
10291d91f2 Revert "ctdb-scripts: Do not "correct" number of nfsd threads when it is 0"
I thought this was being triggered during automated testing.
However, it appears that a poor choice of fixed ports for NFS RPC
services was the real problem.  Revert, since the original behaviour
may be useful.

This reverts commit f1a1c300e1.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-04-12 07:11:30 +00:00
Swen Schillig
6c1068ac00 ctdb-tools: Update error check for new string conversion wrapper
The new string conversion wrappers detect and flag errors
which occurred during the string to integer conversion.
Those modifications required an update of the callees
error checks.

Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Christof Schmitt <cs@samba.org>
2019-04-11 22:29:27 +00:00
Swen Schillig
c0c1004cd0 ctdb-protocol: Update error check for new string conversion wrapper
The new string conversion wrappers detect and flag errors
which occurred during the string to integer conversion.
Those modifications required an update of the callees
error checks.

Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Ralph Boehme <slow@samba.org>
Reviewed-by: Christof Schmitt <cs@samba.org>
2019-04-11 22:29:26 +00:00
Swen Schillig
9ee32f3a96 ctdb-test: Adding test case to verify queue resizing
If a data packet arrives which exceeds the queue's current buffer size,
the buffer needs to be increased to hold the full packet. Once the packet
is processed the buffer size should be decreased to its standard size again.
This test case verifies this process.

Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Christof Schmitt <cs@samba.org>

Autobuild-User(master): Christof Schmitt <cs@samba.org>
Autobuild-Date(master): Wed Apr 10 00:17:37 UTC 2019 on sn-devel-144
2019-04-10 00:17:37 +00:00
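The behaviour that 9ee32f3a96 verifies (grow the buffer for an oversized packet, then return to the standard size once it has been processed) can be sketched as below. The structure, the function names and the value of `STD_BUF_SIZE` are assumptions made for illustration and do not reflect the actual queue code in ctdb.

```c
#include <stdlib.h>

/* Assumed standard buffer size, chosen only for this sketch. */
#define STD_BUF_SIZE 1024

struct pkt_buf {
	unsigned char *data;
	size_t size;
};

/* Grow the buffer if needed so that it can hold pkt_len bytes. */
int pkt_buf_reserve(struct pkt_buf *b, size_t pkt_len)
{
	unsigned char *p;

	if (pkt_len <= b->size) {
		return 0;
	}
	p = realloc(b->data, pkt_len);
	if (p == NULL) {
		return -1;	/* old buffer is still valid */
	}
	b->data = p;
	b->size = pkt_len;
	return 0;
}

/* After a packet is processed, shrink back to the standard size. */
void pkt_buf_processed(struct pkt_buf *b)
{
	unsigned char *p;

	if (b->size <= STD_BUF_SIZE) {
		return;
	}
	p = realloc(b->data, STD_BUF_SIZE);
	if (p != NULL) {
		b->data = p;
		b->size = STD_BUF_SIZE;
	}
}
```

Shrinking after processing keeps one unusually large packet from permanently inflating the memory held by the queue.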
Swen Schillig
1f193174f2 ctdb-test: Adding test case verifying data in buffer move
Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Christof Schmitt <cs@samba.org>
2019-04-09 23:14:19 +00:00
Swen Schillig
5c1009b319 ctdb-test: Modify ctdb_io_test test_setup to provide queue reference
Some test scenarios require access to the created queue.
Prepare the test_setup function to provide it as additional parameter.

Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Christof Schmitt <cs@samba.org>
2019-04-09 23:14:19 +00:00
Volker Lendecke
43cacaad57 ctdb: Fix a typo
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Sat Apr  6 11:51:55 UTC 2019 on sn-devel-144
2019-04-06 11:51:55 +00:00
Volker Lendecke
bb1e32297e ctdb: Slightly simplify ctdb_ltdb_lock_fetch_requeue
Reduce indentation with an early return

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-04-06 10:47:13 +00:00
Martin Schwenke
f1a1c300e1 ctdb-scripts: Do not "correct" number of nfsd threads when it is 0
While 0 may indicate that all threads have exited after being stuck,
it may also indicate that nfsd should not be running due to being shut
down.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Sun Mar 31 11:47:44 UTC 2019 on sn-devel-144
2019-03-31 11:47:44 +00:00
Martin Schwenke
a2bd408589 ctdb-scripts: Update statd-callout to try several configuration files
The alternative seems to be to try something via CTDB_NFS_CALLOUT.
That would be complicated and seems like overkill for something this
simple.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Martin Schwenke
0d67ea5fcc ctdb-scripts: Allow load_system_config() to take multiple alternatives
The situation for NFS config has got more complicated and is probably
broken in statd-callout on Debian-like systems at the moment.  Allow
several alternative configuration names to be tried.  Stop after the
first that is found and loaded.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Martin Schwenke
95283bdf2e ctdb-scripts: Default to using systemd services in NFS call-out
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Martin Schwenke
2833ddcfcb ctdb-tests: Update NFS test infrastructure to support systemd services
The tests are written around the default of sysvinit-redhat.  Add
support for systemd-redhat.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Martin Schwenke
a8fafd377f ctdb-scripts: Add systemd services to NFS call-out
At least Red Hat and Debian appear to use (a variant of?) the upstream
systemd units for NFS, so adding support for these services is
relatively easy.  Distributions using Sys-V init can patch the
call-out to use the relevant Sys-V init services.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Martin Schwenke
708c04071a ctdb-scripts: Start NFS quota service if defined
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Martin Schwenke
42103b5686 ctdb-scripts: Stop/start mount/rquotad/status via NFS call-out
When an NFS check restarts a failed service by hand then systemd will
be unable to stop or start this service again because (at least) the
PID file will be wrong.  Do this via the NFS Linux kernel call-out
instead.  Allow the call-out to use the services instead of doing
manual restarts.  Add variables for mount, status and rquotad services
to support this.

Adding systemd NFS services to the call-out will follow.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Martin Schwenke
8de0a339b5 ctdb-scripts: Factor out nfs_load_config()
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Martin Schwenke
e72c3c800a ctdb-scripts: Add test variable CTDB_NFS_DISTRO_STYLE
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Martin Schwenke
9981353ab7 ctdb-scripts: Rename variable nfslock_service to nfs_lock_service
There will be more of these variables for other services so, for
readability, it makes sense for them to start with "nfs_".

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Martin Schwenke
d7e187c1a7 ctdb-scripts: Reindent some functions prior to making changes
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13860

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-31 10:45:20 +00:00
Andrew Bartlett
a574e8f517 build: Standardise on calling conf.SAMBA_CHECK_PYTHON() in libraries
We do this by removing the confusing mandatory option to
conf.SAMBA_CHECK_PYTHON{,_HEADERS}(), instead just use the value of
--disable-python internally

This follows the default minimum of Python 3.4 and keeps things consistent
with the main Samba build where --disable-python is required to skip building
python bindings.

Signed-off-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
2019-03-21 04:06:16 +00:00
Andrew Bartlett
a459650054 build: Remove manual specification of minimum python version
We now use the default of 3.4 from conf.SAMBA_CHECK_PYTHON()

Signed-off-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
2019-03-21 04:06:16 +00:00
Amitay Isaacs
edd4a23d76 ctdb-version: Simplify version string usage
There is no need to write SAMBA_VERSION_STRING as CTDB_VERSION_STRING.
Wherever required use SAMBA_VERSION_STRING directly.

Avoids the confusion with two version.h files.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13789

Signed-off-by: Amitay Isaacs <amitay@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Mar 15 06:31:50 UTC 2019 on sn-devel-144
2019-03-15 06:31:50 +00:00
Martin Schwenke
148306674d ctdb-build: Drop creation of .distversion in tarball
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13789

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-03-15 05:17:14 +00:00
Stefan Metzmacher
05c28fee21 ctdb-build: use a fixed ctdb_version.h using SAMBA_VERSION_STRING
This way we don't get constant rebuild as SAMBA_VERSION_STRING
is "4.7.0pre1.DEVELOPERBUILD" for the binaries under bin/
instead of "4.7.0pre1.GIT.59e51f6".

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13789

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-03-15 05:17:14 +00:00
Martin Schwenke
2c3df58132 ctdb-tests: Add a test for version consistency checking
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-15 05:17:14 +00:00
Martin Schwenke
8c2ff3f2b5 ctdb-daemon: Add an environment variable to set version
This can be used to test the version checking logic.  Cache the
version to avoid re-checking the environment variable each time.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-15 05:17:14 +00:00
Martin Schwenke
627a5cf1e7 ctdb-tests: Fix remaining common.sh ShellCheck hits
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-15 05:17:14 +00:00
Martin Schwenke
6555fbce99 ctdb-tests: Shell cleanups in wait_until() function
This file is included by local_daemons.sh, which is not a bash script
and wait_until() uses the "local" keyword.  Prefixing variable names
with '_' to indicate that they are local changes a lot of lines in
this function.  So, fix indentation, potential quoting problems and
other ShellCheck hits while touching this function.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-15 05:17:14 +00:00
Martin Schwenke
2fce893b2c ctdb-tests: export CTDB_SCRIPTS_TOOLS_BIN_DIR
This isn't used anywhere that requires it to be exported, but the lack
of consistency will cause problems and confusion at some later stage.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-15 05:17:14 +00:00
Martin Schwenke
957c38b65c ctdb-packaging: Test package requires tcpdump
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13838

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-15 05:17:14 +00:00
Martin Schwenke
b2b8dce4fc ctdb-packaging: ctdb package should not own system library directory
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13838

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-15 05:17:14 +00:00
Martin Schwenke
d9286701cd ctdb-tests: Add some testing for IPv4-mapped IPv6 address parsing
ctdb_sock_addr values are hashed in some contexts.  This means that
all of the memory used for the ctdb_sock_addr should be consistent
regardless of how parsing is done.  The first 2 cases are just sanity
checks but the 3rd case involving an IPv4-mapped IPv6 address is the
real target of this test addition.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13839

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-15 05:17:13 +00:00
Zhu Shangzhong
539b5ff32b ctdb: Initialize addr struct to zero before reparsing as IPV4
Killing a TCP connection that uses IPv4-mapped IPv6 addresses fails
(e.g. ctdb_killtcp eth0 ::ffff:192.168.200.44:2049
::ffff:192.168.200.45:863).

When ctdb_killtcp is used to kill a TCP connection, the IPs and
ports of the connection are parsed into conn.client and conn.server
(call stack: main->ctdb_sock_addr_from_string->ip_from_string). In
ip_from_string, because these are IPv4-mapped IPv6 addresses,
ipv6_from_string is first used to parse the IP into addr.ip6. In the
next step, ipv4_from_string is used to reparse the IP into addr.ip.

As a result, the data dumped from conn.server is "2 0 8 1 192 168
200 44 0 0 0 0 0 0 0 0 0 0 255 255 192 168 200 44 0 0 0 0" and the data
from conn.client is "2 0 3 95 192 168 200 45 0 0 0 0 0 0 0 0 0 0 255 255
192 168 200 45 0 0 0 0". The connection is added to conn_list by
ctdb_connection_list_add. Then reset_connections_send takes conn_list
as a parameter and starts resetting the connections in it.

In reset_connections_send, the database "connections" is
created. The connections from conn_list are written to the
database (via db_hash_add), using the data dumped from conn.client
and conn.server as the key.

In reset_connections_capture_tcp_handler,
ctdb_sys_read_tcp_packet receives data on the raw socket and
extracts the IPs and ports from the TCP packet. Either tcp4_extract
or tcp6_extract is used to extract the IP and port. This yields a
new conn.client and conn.server. The data dumped from
conn.server is "2 0 8 1 192 168 200 44 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0" and the data from conn.client is "2 0 3 95 192 168 200 45 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0". Finally, this data is used as the key to
check whether the connection is one being reset (via db_hash_delete).
db_hash_delete returns ENOENT because the keys used by
db_hash_delete and db_hash_add differ.

So, the TCP RST is never sent for the connection. We should
initialize the addr struct to zero before reparsing as IPv4 in
ip_from_string.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13839

Signed-off-by: Zhu Shangzhong <zhu.shangzhong@zte.com.cn>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-15 05:17:13 +00:00
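The problem analysed in 539b5ff32b comes down to comparing or hashing the raw bytes of an address union that still contains leftovers from an earlier parse. A minimal sketch of the remedy follows. `union sock_addr_sketch` only mimics the general shape of such a union and `parse_ipv4_clean` is a hypothetical function, neither is taken from the ctdb sources.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for an address union used as a hash key. */
union sock_addr_sketch {
	struct sockaddr sa;
	struct sockaddr_in ip;
	struct sockaddr_in6 ip6;
};

/*
 * Parse a dotted-quad IPv4 address.  The whole union is zeroed first
 * so that bytes written by any earlier (e.g. IPv6) parse cannot leak
 * into the result.  Returns 0 on success, -1 on a parse failure.
 */
int parse_ipv4_clean(const char *str, unsigned short port,
		     union sock_addr_sketch *addr)
{
	memset(addr, 0, sizeof(*addr));
	addr->ip.sin_family = AF_INET;
	addr->ip.sin_port = htons(port);
	if (inet_pton(AF_INET, str, &addr->ip.sin_addr) != 1) {
		return -1;
	}
	return 0;
}
```

Because every byte of the union is deterministic after the memset(), two values describing the same endpoint compare equal with memcmp() and therefore produce the same key.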
Martin Schwenke
a215d2017f ctdb-tests: Build cluster mutex path manually
CTDB_CLUSTER_MUTEX_HELPER can't be depended on because it is only set
when the tests are not installed and setting it unconditionally for
this particular use would be wrong.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@samba.org>
2019-03-15 05:17:13 +00:00
David Disseldorp
e735160389 ctdb_mutex_ceph_rados_helper: revert strtoull_err() usage
Compilation currently fails, as ctdb_mutex_ceph_rados_helper doesn't
include or link against the samba-util library. Revert back to the
previous strtoull() behaviour, which works fine.

Signed-off-by: David Disseldorp <ddiss@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>

Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Fri Mar  1 18:34:18 UTC 2019 on sn-devel-144
2019-03-01 18:34:18 +00:00
Amitay Isaacs
278eb236ae ctdb-daemon: Fix maybe-uninitialized error with picky developer
[263/386] Compiling ctdb/server/ctdb_recovery_helper.c
In file included from ../../server/ctdb_recovery_helper.c:24:0:
../../server/ctdb_recovery_helper.c: In function ‘main’:
../../../lib/talloc/talloc.h:911:34: error: ‘mem_ctx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
 #define TALLOC_FREE(ctx) do { if (ctx != NULL) { talloc_free(ctx); ctx=NULL; } } while(0)

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Jeremy Allison <jra@samba.org>
2019-03-01 17:21:15 +00:00
Swen Schillig
fa2c919e1d ctdb-utils: Use wrapper for string to integer conversion
In order to detect a value overflow error during
the string to integer conversion with strtoul/strtoull,
the errno variable must be set to zero before the execution and
checked after the conversion is performed. This is achieved by
using the wrapper functions strtoul_err and strtoull_err.

Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Ralph Böhme <slow@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2019-03-01 00:32:11 +00:00
Swen Schillig
81cc7a3518 ctdb-tools: Use wrapper for string to integer conversion
In order to detect a value overflow error during
the string to integer conversion with strtoul/strtoull,
the errno variable must be set to zero before the execution and
checked after the conversion is performed. This is achieved by
using the wrapper functions strtoul_err and strtoull_err.

Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Ralph Böhme <slow@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2019-03-01 00:32:11 +00:00
Swen Schillig
55acae774a ctdb-server: Use wrapper for string to integer conversion
In order to detect a value overflow error during
the string to integer conversion with strtoul/strtoull,
the errno variable must be set to zero before the execution and
checked after the conversion is performed. This is achieved by
using the wrapper functions strtoul_err and strtoull_err.

Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Ralph Böhme <slow@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2019-03-01 00:32:11 +00:00
Swen Schillig
e96bccc879 ctdb-protocol: Use wrapper for string to integer conversion
In order to detect a value overflow error during
the string to integer conversion with strtoul/strtoull,
the errno variable must be set to zero before the execution and
checked after the conversion is performed. This is achieved by
using the wrapper functions strtoul_err and strtoull_err.

Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Ralph Böhme <slow@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
2019-03-01 00:32:10 +00:00
Martin Schwenke
c93430fe8f ctdb-cluster-mutex: Separate out command and file handling
This code is difficult to read and there really is no common code
between the 2 cases.  For example, there is no need to split a
filename into words.  Separating each of the 2 cases into its own
function makes the logic much easier to understand.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Feb 25 03:40:16 CET 2019 on sn-devel-144
2019-02-25 03:40:16 +01:00
Martin Schwenke
ebc082122f ctdb-tests: Add a test for configuring the recovery lock as a command
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-02-25 02:12:17 +01:00
Martin Schwenke
e74f5243fc ctdb-tests: Add -R option for local daemons to use recovery lock command
Under the covers, a command is always used.  However, there is no way
of testing the code path where a command is explicitly configured.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-02-25 02:12:17 +01:00
Martin Schwenke
ce09d9c3e4 ctdb-tests: Force test failure if local daemon setup fails
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-02-25 02:12:17 +01:00
Martin Schwenke
13a1a48089 ctdb-recoverd: Time out attempt to take recovery lock after 120s
Currently this will wait forever.  It really needs a timeout in case
the cluster filesystem (or other lock mechanism) is completely wedged.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-02-25 02:12:17 +01:00
Martin Schwenke
45a77d65b2 ctdb-recoverd: Ban node on unknown error when taking recovery lock
We really shouldn't see unknown errors.  They probably represent a
misconfigured recovery lock or similar.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-02-25 02:12:17 +01:00
Martin Schwenke
c0fb62ed39 ctdb-recoverd: Make recoverd context available in recovery lock handle
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-02-25 02:12:16 +01:00
Martin Schwenke
7e4aae6943 ctdb-recoverd: Clean up logging on failure to take recovery lock
Add an explicit case for a timeout and clean up the other messages.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-02-25 02:12:16 +01:00
Martin Schwenke
621658cbed ctdb-recoverd: Free cluster mutex handler on failure to take lock
If nested events occur while the file descriptor handler is still
active then chaos can ensue.  For example, if a node is banned and the
lock is explicitly cancelled (e.g. due to election loss) then
double-talloc-free()s abound.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13800

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-02-25 02:12:16 +01:00
Martin Schwenke
82e7f38214 ctdb-config: Change example recovery lock setting to one that fails
ctdbd will start without a recovery lock configured.  It will log a
message saying that this is not optimal.  However, a careless user may
overlook both this message and the importance of setting a recovery
lock.  If the existing example configuration is uncommented then the
directory containing it will be created (by 01.reclock.script) and the
failure (i.e. multiple nodes able to take the lock) will be confusing.

Instead, change the example setting to one that will result in banned
nodes, encouraging users to consciously configure (or deconfigure) the
recovery lock.  Tweak the corresponding comment.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13790

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2019-02-25 02:12:16 +01:00
Christof Schmitt
92a9052437 ctdb-tests: Add test for ctdb_io.c
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13791

Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Feb 22 03:51:37 CET 2019 on sn-devel-144
2019-02-22 03:51:37 +01:00
Swen Schillig
fa8e69ac95 ctdb: buffer write beyond limits
To correctly calculate the number of bytes that can be read
into the buffer, buffer.offset must be taken into account.
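
The calculation can be illustrated with a small sketch. The struct layout and the names read_buffer and buffer_space are assumptions made for this example, not the ctdb_io.c code; the point is only that the free space starts after offset + length, not after length alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: valid data occupies [offset, offset + length). */
struct read_buffer {
	size_t size;	/* total capacity in bytes */
	size_t offset;	/* start of the unprocessed data */
	size_t length;	/* number of unprocessed bytes */
};

/* Number of bytes that may be read in without writing past the end. */
static size_t buffer_space(const struct read_buffer *b, size_t wanted)
{
	size_t used, space;

	assert(b != NULL);

	used = b->offset + b->length;	/* ignoring offset overstates space */
	assert(used <= b->size);
	space = b->size - used;

	return wanted < space ? wanted : space;
}
```

With a 1024-byte buffer, an offset of 512 and 256 bytes of data, only 256 bytes may be read, whereas a calculation that ignores the offset would allow 768.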

This patch fixes a regression introduced by 382705f495.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13791

Signed-off-by: Swen Schillig <swen@linux.ibm.com>
Reviewed-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2019-02-22 02:08:07 +01:00
Douglas Bagnall
d0e26ea67f spelling of associated
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2019-02-13 04:15:14 +01:00
Andreas Schneider
fb57c97ce4 ctdb:tools: Use correct C99 initializer for ltdb_header
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
2019-01-28 10:29:21 +01:00
Andreas Schneider
6c520978e2 ctdb:common: Use C99 initializer for 'struct ifreq'
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
2019-01-28 10:29:21 +01:00
Andreas Schneider
611b6c7ebc ctdb: Use C99 initializer for last element of tunables
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
2019-01-28 10:29:21 +01:00
Andreas Schneider
23709cc351 ctdb: Use C99 initializer for poptOption in test_options
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
2019-01-28 10:29:12 +01:00
Andreas Schneider
db6992c2e9 ctdb: Use C99 initializer for poptOption in ctdb tool
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
2019-01-28 10:29:12 +01:00
Volker Lendecke
193a0d6f01 ctdb: Print locks latency in machinereadable stats
Bug: https://bugzilla.samba.org/show_bug.cgi?id=13742
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>

Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Wed Jan 16 05:34:17 CET 2019 on sn-devel-144
2019-01-16 05:34:17 +01:00
Martin Schwenke
944c92a15d ctdb-daemon: Modernise debug during record deletion for vacuuming
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Dec 18 10:13:50 CET 2018 on sn-devel-144
2018-12-18 10:13:50 +01:00
Martin Schwenke
cdca0d7e78 ctdb-daemon: Add extra debug during record deletion for vacuuming
It isn't currently possible to distinguish these 2 cases.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 07:12:10 +01:00
Martin Schwenke
2e3ad8c20d ctdb-tests: Minimise chances of test interfering with itself
Checking that the database contains 0 records causes a traverse.  This
may take a lock and cause vacuuming to fail (or be deferred for a
particular record/chain).  Minimise the chance of this happening by
only checking for 0 records every 10 seconds.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 07:12:10 +01:00
Martin Schwenke
f1b594dce1 ctdb-daemon: Do not force full vacuum on first vacuuming run
When the number of fast path vacuuming runs is 0 then a full vacuuming
run is done.  This means the first one is a full run, which is almost
certainly not what is intended.

Combine the 2 conditionals to only flag a full vacuuming run when the
count exceeds the configured limit.  This means that the
full_vacuum_run flag is set in both parent and child, but this is
harmless... and is better than getting it wrong.

Also tweak the comparison to be less-than-or-equal, since the zeroth
run needs to be counted.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 07:12:09 +01:00
Amitay Isaacs
9bdd6814e4 ctdb-packaging: Update library versions to upstream versions
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-12-18 07:12:09 +01:00
Amitay Isaacs
59e244c9d0 ctdb-packaging: Match configure command as per spec file
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-12-18 07:12:09 +01:00
Amitay Isaacs
4443124fe8 ctdb-packaging: Call waf with python wrapper
This allows packages to be built even when python3 is not available,
by setting the PYTHON variable.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-12-18 07:12:09 +01:00
Amitay Isaacs
9912709eca ctdb-build: Use open() instead of file() for python3
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-12-18 07:12:09 +01:00
Amitay Isaacs
1e061ff1e3 ctdb-tool: Avoid data uninitialized warnings
../../tools/ctdb.c: In function 'str_to_data':
../../tools/ctdb.c:624: warning: 'data.dsize' may be used uninitialized in this function
../../tools/ctdb.c:624: warning: 'data.dptr' may be used uninitialized in this function

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-12-18 07:12:09 +01:00
Martin Schwenke
1ed91f0e10 ctdb-tests: Do not force TEST_VAR_DIR to be absolute
This can result in Unix domain socket names that are too long.
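
The length limit comes from the fixed-size sun_path member of struct sockaddr_un. A minimal check might look like the sketch below; the helper name socket_path_fits is hypothetical and is not part of CTDB.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: true if path, including its terminating NUL,
 * fits into the sun_path member of struct sockaddr_un.
 */
static bool socket_path_fits(const char *path)
{
	assert(path != NULL);

	return strlen(path) < sizeof(((struct sockaddr_un *)0)->sun_path);
}
```

The size of sun_path is platform dependent and fairly small, so an absolute path into a deep build tree can exceed it while the equivalent relative path still fits.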

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Dec 18 05:31:00 CET 2018 on sn-devel-144
2018-12-18 05:31:00 +01:00
Martin Schwenke
61b54193fe ctdb-event: Force symbolic link targets to be absolute
If CTDB_BASE is relative then symbolic link targets will be incorrect.

Don't force CTDB_BASE to be absolute because this can result in Unix
domain socket names that are too long.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 02:02:04 +01:00
Martin Schwenke
108aca0a9e ctdb-event: Declare and construct data_script only if needed
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 02:02:04 +01:00
Martin Schwenke
63a4c634a6 ctdb-tests: Force symbolic link targets to be absolute
If CTDB_BASE is relative then the symbolic link target will be
incorrect.

Don't force CTDB_BASE to be absolute because this can result in Unix
domain socket names that are too long.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 02:02:04 +01:00
Martin Schwenke
45f96c7346 ctdb-event: Force EVENTSCRIPTS_TESTS_VAR_DIR to be absolute
Event scripts (well, statd_callout) can change directory, causing
stubs to be unable to locate EVENTSCRIPTS_TESTS_VAR_DIR if it is
relative.

Don't force TEST_VAR_DIR to be absolute because this can result in
Unix domain socket names that are too long.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 02:02:04 +01:00
Martin Schwenke
ba83aa9d8b ctdb-event: Force script directory to be absolute
If TEST_VAR_DIR is relative then symbolic link targets will be
incorrect.

Don't force TEST_VAR_DIR to be absolute because this can result in
Unix domain socket names that are too long.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 02:02:03 +01:00
Martin Schwenke
da8aaf2aee ctdb-recoverd: Call an election when the recovery lock is lost
The lock may have been lost due to a failure in the underlying locking
mechanism.  This could be due to quorum loss or similar.  It is best
to call an election to confirm that this node should still be master.
At worst, the node will reelect itself, fail to take the lock and then
ban itself.  This is a suitable outcome for a node that has been
partitioned from others in the cluster.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 02:02:03 +01:00
Martin Schwenke
9d1d5fa4ac ctdb-doc: Add non-breaking space to lock_buckets documentation
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 02:02:03 +01:00
Martin Schwenke
93284ed032 ctdb-daemon: Divide by 2 when calculating hop count bucket
This provides finer resolution while still maintaining a reasonable
maximum.  In this case the top bucket contains any hop counts
>= 16384, compared to the current situation where the top bucket contains
hop counts >= 268435456.
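
The effect of the divisor can be sketched as follows. The bucket count of 8 and the floor(log2(hopcount)) / divisor formula are assumptions chosen because they reproduce the two thresholds quoted above; the function names are hypothetical and this is not the daemon's code.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BUCKETS 8	/* assumed; inferred from the thresholds above */

/* floor(log2(v)) for v > 0 */
static unsigned int floor_log2(uint32_t v)
{
	unsigned int n = 0;

	assert(v > 0);
	while (v >>= 1) {
		n++;
	}
	return n;
}

/* Map a hop count to a bucket; a smaller divisor gives finer buckets. */
static unsigned int hop_bucket(uint32_t hopcount, unsigned int divisor)
{
	unsigned int bucket;

	assert(divisor > 0);
	if (hopcount == 0) {
		return 0;
	}
	bucket = floor_log2(hopcount) / divisor;
	if (bucket >= NUM_BUCKETS) {
		bucket = NUM_BUCKETS - 1;	/* clamp into the top bucket */
	}
	return bucket;
}
```

Under these assumptions a divisor of 2 puts every hop count from 16384 upwards into the top bucket, while a divisor of 4 only reaches the top bucket at 268435456.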

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-18 02:02:03 +01:00
Andrew Bartlett
5ddff307b4 build: Move python detection back into waf (instead of in configure and Makefile)
This avoids creating a mini-configure in the configure script.

Users wishing to use python2 to build need to specify PYTHON=
to both ./configure and make.

After we merged the python3 change, it became clear that relying on systems prefixing
the correct python just causes trouble and makes debugging harder, so only use $PYTHON
for the override, not the default case.

This essentially reverts a660b7fb8e but
leaves the files more consistent.

Signed-off-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Noel Power <npower@samba.org>
2018-12-14 14:40:19 +01:00
Noel Power
a660b7fb8e PY3: switch current build to use python3
Make sure that make and configure now default to building
with python3 for all components.

To build Samba (or a subcomponent, e.g. talloc) with python3:
  ./configure && make

To build Samba (or a subcomponent, e.g. talloc) with python2:
  PYTHON=python ./configure && PYTHON=python make

Signed-off-by: Noel Power <noel.power@suse.com>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2018-12-10 10:38:25 +01:00
Christof Schmitt
7d271450f7 ctdb: Remove <file> parameter from pfetch usage info
The code does not implement saving the record data to a file, so update
the usage info accordingly.

Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Dec 10 05:02:13 CET 2018 on sn-devel-144
2018-12-10 05:02:12 +01:00
Christof Schmitt
81de8f0641 ctdb: Fix hex to int conversion in h2i
Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-12-10 02:01:14 +01:00
Swen Schillig
353a947b4a ctdb: Adding memory pool for queue callback
The received packet is copied into a newly allocated memory chunk for further
processing by the assigned callback. Once this is done, the memory is freed.
This is repeated for each received packet making the memory allocation / free
an expensive task. To optimize this process, a memory pool is defined which
is sized identically to the queue's buffer.
During tests it could be seen that more than 95% of all messages were sized
below the standard buffer_size of 1k.
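
The idea of serving most packets from preallocated memory can be shown with a plain malloc-based sketch. This is not the implementation from the commit: the single-slot design and the names pkt_pool, pkt_pool_init, pkt_copy and pkt_release are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical single-slot pool, sized like the queue's buffer. */
struct pkt_pool {
	unsigned char *mem;	/* allocated once, reused for each packet */
	size_t size;
	bool in_use;
};

static bool pkt_pool_init(struct pkt_pool *p, size_t size)
{
	assert(p != NULL && size > 0);

	p->mem = malloc(size);
	p->size = size;
	p->in_use = false;
	return p->mem != NULL;
}

/* Copy a packet, using the pool if it fits, else falling back to malloc(). */
static void *pkt_copy(struct pkt_pool *p, const void *pkt, size_t len)
{
	void *dst;

	assert(p != NULL && pkt != NULL && len > 0);

	if (!p->in_use && len <= p->size) {
		p->in_use = true;
		dst = p->mem;		/* common case: no allocation */
	} else {
		dst = malloc(len);	/* oversized or pool already busy */
	}
	if (dst != NULL) {
		memcpy(dst, pkt, len);
	}
	return dst;
}

/* Return a copy obtained from pkt_copy(). */
static void pkt_release(struct pkt_pool *p, void *pkt)
{
	assert(p != NULL);

	if (pkt == p->mem) {
		p->in_use = false;	/* pool slot becomes reusable */
	} else {
		free(pkt);
	}
}
```

If most packets are smaller than the pool, as the measurement above suggests, the malloc() fallback is rarely taken.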

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Christof Schmitt <cs@samba.org>

Autobuild-User(master): Christof Schmitt <cs@samba.org>
Autobuild-Date(master): Fri Dec  7 23:27:16 CET 2018 on sn-devel-144
2018-12-07 23:27:16 +01:00
Swen Schillig
382705f495 ctdb: Introduce buffer.offset to avoid memmove
The memmove operation is quite expensive, therefore,
a new buffer attribute "offset" is introduced to support
an optimized buffer processing.
The optimization is to "walk" through the buffer and process
each packet until the buffer is fully processed (empty)
without requiring any memmove.
Only if a packet is incomplete is the buffer content moved
and new data read from the queue.
This way almost all memmove operations are eliminated.
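
The walk through the buffer can be sketched like this. The struct layout, the callback type and the 4-byte native-endian length prefix that includes itself are assumptions made for the example; it is not the ctdb_io.c code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical buffer: valid data occupies [offset, offset + length). */
struct io_buffer {
	uint8_t *data;
	size_t size;
	size_t offset;
	size_t length;
};

typedef void (*pkt_cb)(const uint8_t *pkt, size_t len, void *priv);

/* Example callback: add each packet's length to a size_t total. */
static void count_bytes(const uint8_t *pkt, size_t len, void *priv)
{
	(void)pkt;
	*(size_t *)priv += len;
}

/* Deliver every complete packet; return how many were delivered. */
static unsigned int process_buffer(struct io_buffer *b, pkt_cb cb, void *priv)
{
	unsigned int n = 0;

	assert(b != NULL && cb != NULL);

	while (b->length >= sizeof(uint32_t)) {
		uint32_t pkt_len;

		memcpy(&pkt_len, b->data + b->offset, sizeof(pkt_len));
		if (pkt_len < sizeof(uint32_t) || pkt_len > b->length) {
			break;	/* malformed or incomplete packet */
		}
		cb(b->data + b->offset, pkt_len, priv);
		b->offset += pkt_len;	/* advance instead of memmove */
		b->length -= pkt_len;
		n++;
	}

	if (b->length == 0) {
		b->offset = 0;	/* fully drained: rewind at no cost */
	} else if (b->offset > 0) {
		/* Leftover partial packet: move it to the front once. */
		memmove(b->data, b->data + b->offset, b->length);
		b->offset = 0;
	}
	return n;
}
```

In this sketch the only memmove happens when a partial packet remains, so a buffer that ends exactly on a packet boundary is processed without any copying.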

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Christof Schmitt <cs@samba.org>
2018-12-07 19:56:16 +01:00
Stefan Metzmacher
f87d6cbfff ctdb/wscript: make use of MODE_{644,744,755,777}
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2018-12-05 13:35:19 +01:00
Stefan Metzmacher
8ba0a9a1ab ctdb/wscript: use python 3.6 compatible functions
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2018-12-05 13:35:19 +01:00
Martin Schwenke
dd7574afd1 ctdb-daemon: Exit with error if a database directory does not exist
Since 4.9.0, the log messages can be confusing if a required database
directory does not exist.  Explicitly check for database directories,
logging a clear error and exiting if one is missing.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13696

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Dec  3 06:56:41 CET 2018 on sn-devel-144
2018-12-03 06:56:41 +01:00
Olly Betts
28aeb86a9f Fix spelling mistakes
Signed-off-by: Olly Betts <olly@survex.com>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2018-11-30 03:35:13 +01:00
Andreas Schneider
ab5f26f3f3 ctdb: Use #ifdef instead of #if for config.h definitions
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Gary Lockyer <gary@catalyst.net.nz>
2018-11-28 23:19:21 +01:00
Martin Schwenke
c1dd6382e3 ctdb-tests: Make the debug hung script test cope with unreadable stacks
Ideally this would just involve using "test -r".  However, operating
system security features may mean that kernel stacks are not readable
even though they appear to be.

Instead, try reading the stack of a process on the test node.  If
that succeeds then so should reading the stack of the "stuck" sleep
process in the test.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13684

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Tim Beale <timbeale@catalyst.net.nz>

Autobuild-User(master): Tim Beale <timbeale@samba.org>
Autobuild-Date(master): Thu Nov 15 08:15:32 CET 2018 on sn-devel-144
2018-11-15 08:15:32 +01:00
Andreas Schneider
008b9652ca ctdb: Fix an out of bound array access
Found by cppcheck.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13680

Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2018-11-14 05:07:15 +01:00
Andreas Schneider
2d512b278e debug: Use debuglevel_(get|set) function
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>

Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org>
Autobuild-Date(master): Thu Nov  8 11:03:11 CET 2018 on sn-devel-144
2018-11-08 11:03:11 +01:00
Martin Schwenke
6e16e95f74 ctdb-daemon: Do not fork when CTDB_TEST_MODE is set
Explicitly background ctdbd instead.

This has the advantage of leaving stdin open.  ctdbd can then be
enhanced to exit when stdin closes, allowing better cleanup in a test
environment.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Nov  6 10:30:14 CET 2018 on sn-devel-144
2018-11-06 10:30:14 +01:00
Martin Schwenke
01f6fbba4e ctdb-daemon: Switch interactive variable to a bool
popt uses an int in place of a bool, so declare an extra int and make
the conversion explicit.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:18 +01:00
Martin Schwenke
804bdf9719 ctdb-tests: Add local_daemons.sh onnode and socket commands
These aren't used by simple tests but they will be useful for
integrating ctdbd local daemons into other test suites and for
debugging.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:18 +01:00
Martin Schwenke
19de5f463d ctdb-tests: Use local_daemons.sh in local_daemons.bash
The etc-ctdb/ subdirectory containing the event script moves into the
top-level tests/ directory because the subdirectory is really now
owned by local_daemons.sh instead of simple/.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:17 +01:00
Martin Schwenke
46fd4f144e ctdb-tests: Add local_daemons.sh
This provides a separate script for handling local daemons.  It can be
used for testing outside of the CTDB simple test suite.  It is
installed as ctdb_local_daemons.

The logic is copied from ctdb/tests/simple/scripts/local_daemons.bash.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:17 +01:00
Martin Schwenke
25efb924bf ctdb-tests: Allow use of setup_ctdb_base() outside of test cases
Always create an empty event script directory.  If $TEST_SUBDIR is
unset then don't use it to look for etc-ctdb/.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:17 +01:00
Martin Schwenke
9679c9e9b6 ctdb-build: Don't set unused variable TEST_BIN_DIR
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:17 +01:00
Martin Schwenke
aa0c7ccfc8 ctdb-tests: Move setting of ctdb_dir and top_dir
These are only used in script_install_paths.sh.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:17 +01:00
Martin Schwenke
1d86fd537a ctdb-tests: Use $CTDB_SCRIPTS_TOOLS_BIN_DIR
Don't calculate this locally as _tools_dir.  Add it to PATH
unconditionally - this may result in duplicate entries in PATH but the
resulting code is simpler.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:17 +01:00
Martin Schwenke
7ac5e7ae44 ctdb-tests: Use $CTDB_SCRIPTS_TESTS_BINDIR
Don't calculate this locally as _test_bin_dir.  Just calculate
top_dir, source script_install_paths.sh and use
$CTDB_SCRIPTS_TESTS_BINDIR.

Don't bother sanity checking if TEST_BIN_DIR is set.  It will go away
soon.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:17 +01:00
Martin Schwenke
bde73c7a04 ctdb-tests: Add new variable CTDB_SCRIPTS_TESTS_BINDIR
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:17 +01:00
Martin Schwenke
2cb82ef453 ctdb-tests: Change all cluster setup to use ctdb_test_init()
ctdb_test_init() now passes any arguments to setup_ctdb().

Update tests that have custom local daemon configuration to call
ctdb_test_init() directly.  Remove the redundant, initial call to
ctdb_test_init() to avoid starting the cluster an extra time.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:17 +01:00
Martin Schwenke
9a2910c60b ctdb-tests: Drop passing of test arguments to ctdb_test_init()
Arguments are currently ignored anyway.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:17 +01:00
Martin Schwenke
d762e52e65 ctdb-tests: Drop ctdb_restart_when_done()
This no longer does anything.  Integration test cases now start and
shut down the cluster.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
79db138de5 ctdb-tests: Drop dependency on variable ctdb_test_restart_scheduled
The remainder of the scheduled restart logic is about to be removed,
so produce debugging information any time the cluster is not healthy.

While here, reindent and drop the else since there is already an early
return before it.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
bc8df7191c ctdb-tests: Drop tests that only start and stop daemons
Integration test cases now start and shut down the cluster.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
412fb7b7d5 ctdb-tests: Move enabling of event scripts to setup_ctdb()
This is for the real cluster case.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
aa2ee4bea8 ctdb-tests: Improve signal handling trap
Interrupting a test run currently moves on to the next test.  It
should exit.

Follow the practice of exiting with 128 + signal number.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
92337234e5 ctdb-tests: Drop cleanup_handler()
Running testsuite-specific code here isn't a good option.

Daemons are now shut down in ctdb_test_exit(), even when testing is
interrupted.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
0e9ead8f28 ctdb-tests: Start daemons in ctdb_test_init(), stop them in ctdb_test_exit()
This makes tests self-contained.  They can also now be individually
looped, if necessary.

Most tests (all but 1 complex, more than 50% of simple) restart the
daemons anyway, so this simplification is worth it.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
e733e4cb74 ctdb-tests: Ignore SIGPIPE during simple test cleanup
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
6d9c89bfa3 ctdb-tests: Drop setting of unused variable scriptname
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
520568051c ctdb-tests: Drop use of confusing testfailures variable
Exit on first test failure instead of setting a variable.  The bizarre
logic in ctdb_test_exit() makes this worth dropping.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
9ebcebe519 ctdb-tests: Drop useless "ctdb version" test
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:16 +01:00
Martin Schwenke
43c26e1e64 ctdb-tests: Rationalise tunable simple tests
These 3 tests duplicate various checks and can easily be handled as a
single test.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:15 +01:00
Martin Schwenke
ba86eacb66 ctdb-tests: Rationalise ctdb stop/continue/disable/enable simple tests
The "continue" and "enable" tests are just extensions of the "stop"
and "disable" tests, so drop the latter 2.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:15 +01:00
Martin Schwenke
5fdac517fa ctdb-tests: Use wait_until_node_has_no_ips() in some tests
This strengthens those tests to ensure that released IPs aren't
replaced with others.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:15 +01:00
Martin Schwenke
eda1296d67 ctdb-tests: Add function wait_until_node_has_no_ips()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:15 +01:00
Martin Schwenke
44019b5577 ctdb-event: Only run talloc report if CTDB_INTERACTIVE is set
This is only really wanted for interactive testing when logging to
stderr.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:15 +01:00
Martin Schwenke
e952f0316b ctdb-event: Never fork to become daemon in eventd
Forking stops ctdbd from being able to shut down eventd, since the PID
it records will be invalid.  There's no need for eventd to fork.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:15 +01:00
Martin Schwenke
4e6bd42493 ctdb-daemon: Improve documentation for -i option
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:15 +01:00
Martin Schwenke
9c41481f21 ctdb-daemon: Don't set log_to_stdout for become_daemon()
ctdbd logs to stderr in interactive mode, not stdout.  This way stdout
is always closed.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:15 +01:00
Martin Schwenke
c84254d23d ctdb-daemon: Avoid unnecessarily spamming the logs when in test mode
Logging the logging location to syslog can be useful on production
systems when the configuration goes unexpectedly missing.  However, in
test mode this just adds noise to the logs on the test system.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
1bbc4fad43 ctdb-tools: Detect unknown node number
If there aren't enough addresses in the list then the shift will
silently fail and the printed address will be the unshifted value of
$1, which is incorrect/unexpected.  So, sanity check the node number.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
08469408c3 ctdb-tests: README updates
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
4f4a835c34 ctdb-tests: Remove export of CTDB_SOCKET
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
d246b1dadf ctdb-tests: Use path_socket() in dummy client
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
3b1e5977d8 ctdb-tests: Drop incorrect comment, unused function
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
82e589e388 ctdb-tests: Drop setting of CTDB_SOCKET and CTDB_PIDFILE
The local daemons ssh stub doesn't need to do this because the ctdbd
and the ctdb tool now only need CTDB_TEST_MODE and CTDB_BASE for local
daemon tests.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
d75fa2c3fd ctdb-daemon: Drop unused function ctdb_set_socketname()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
5f478b7c5f ctdb-daemon: Use path functions for socket and PID file
Drop the use of ctdb_set_socketname() because it complicates the memory
allocation and this is the only place it is used.  Just assign to the
relevant pointer.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
cd021596da ctdb-tests: Use path_socket() in test client tools
Just leak the memory allocated by path_socket().  This is only used in
short-lived test programs, so it isn't worth the hassle of plumbing a
talloc context through several layers to get here.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
cc3aedd307 ctdb-tools: Use path_socket() in ctdb tool
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:14 +01:00
Martin Schwenke
38566780d2 ctdb-tests: Use ctdb-path for fake_ctdbd directory setup
This needs to be done before any of the code changes are made,
including updating the ctdb tool.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:13 +01:00
Martin Schwenke
d4a1f897af ctdb-tests: Use ctdb-path-like values for local daemons socket and PID file
However, don't use ctdb-path itself because some tests use nested
instances of onnode.  The outermost instance would set CTDB_SOCKET and
any inner instance would pick up that value, regardless of CTDB_BASE.
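
The nesting problem can be illustrated with a small sketch. The function socket_path, the socket file name and the /tmp/node.N directories are all hypothetical; the sketch only assumes that an exported CTDB_SOCKET takes precedence over a path derived from CTDB_BASE.

```shell
# Hypothetical lookup: an explicit CTDB_SOCKET wins over CTDB_BASE
socket_path ()
{
    echo "${CTDB_SOCKET:-${CTDB_BASE}/ctdbd.socket}"
}

nested_lookup ()
{
    (
        # Outer instance targets node 0 and exports its socket
        export CTDB_BASE="/tmp/node.0"
        export CTDB_SOCKET="${CTDB_BASE}/ctdbd.socket"
        (
            # Inner instance only changes CTDB_BASE, intending node 1,
            # but still inherits CTDB_SOCKET from the outer instance
            export CTDB_BASE="/tmp/node.1"
            socket_path
        )
    )
}
```

Under these assumptions nested_lookup prints the node 0 socket even though the inner instance set CTDB_BASE for node 1, which is the behaviour the commit avoids.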

This is a temporary measure to avoid breaking testing while use of the
path functions is added to ctdbd and the ctdb tool.  When this is
complete these variables can be removed altogether because the code
will just depend on CTDB_TEST_MODE and CTDB_BASE.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:13 +01:00
Martin Schwenke
32c2ec8fa2 ctdb-common: Allow path_socket() to use $CTDB_SOCKET
Use of CTDB_SOCKET is being generally removed.  However, this override
is being added to allow test code outside of ctdb/ to be able to
specify the socket, if desired.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-11-06 07:16:13 +01:00
Martin Schwenke
27df4f002a ctdb-recovery: Ban a node that causes recovery failure
... instead of applying banning credits.

There have been a couple of cases where recovery repeatedly takes just
over 2 minutes to fail.  Therefore, banning credits expire between
failures and a continuously problematic node is never banned,
resulting in endless recoveries.  This is because it takes 2
applications of banning credits before a node is banned, which
generally involves 2 recovery failures.

The recovery helper makes up to 3 attempts to recover each database
during a single run.  If a node causes 3 failures then this is really
equivalent to 3 recovery failures in the model that existed before the
recovery helper added retries.  In that case the node would have been
banned after 2 failures.

So, instead of applying banning credits to the "most failing" node,
simply ban it directly from the recovery helper.

If multiple nodes are causing recovery failures then this can cause a
node to be banned more quickly than it might otherwise have been, even
pre-recovery-helper.  However, 90 seconds (i.e. 3 failures) is a long
time to be in recovery, so banning earlier seems like the best
approach.
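
The credit expiry problem described above can be modelled with a deliberately simplified sketch. The function name credits_model, the single expiry parameter and the reset rule are assumptions made for illustration, not the daemon's actual accounting: one credit is applied per failure, old credits are forgotten if failures are further apart than the expiry time, and a node is banned once it holds 2 credits.

```shell
# Hypothetical model of banning credits.
# Usage: credits_model INTERVAL EXPIRY FAILURES   (times in seconds)
credits_model ()
{
    _interval="$1"   # time between consecutive recovery failures
    _expiry="$2"     # assumed lifetime of a banning credit
    _failures="$3"   # number of failures to simulate
    _credits=0
    _n=1

    while [ "$_n" -le "$_failures" ] ; do
        # Earlier credits have expired if failures are too far apart
        if [ "$_n" -gt 1 ] && [ "$_interval" -gt "$_expiry" ] ; then
            _credits=0
        fi
        _credits=$((_credits + 1))
        if [ "$_credits" -ge 2 ] ; then
            echo "banned after ${_n} failures"
            return 0
        fi
        _n=$((_n + 1))
    done

    echo "never banned"
}
```

With an assumed expiry of 120 seconds, failures 125 seconds apart never accumulate 2 credits in this model, while failures 90 seconds apart lead to a ban on the second failure.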

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13670

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Nov  5 06:52:33 CET 2018 on sn-devel-144
2018-11-05 06:52:33 +01:00
Martin Schwenke
fbea9d3699 ctdb-daemon: Fix valgrind hit in event code
==25741== Syscall param write(buf) points to uninitialised byte(s)
==25741==    at 0x4939291: write (write.c:27)
==25741==    by 0x4868285: sys_write (sys_rw.c:68)
==25741==    by 0x13915D: sock_queue_trigger (sock_io.c:316)
==25741==    by 0x4DE6478: tevent_common_invoke_immediate_handler (in /usr/lib/x86_64-linux-gnu/libtevent.so.0.9.37)
==25741==    by 0x4DE64A2: tevent_common_loop_immediate (in /usr/lib/x86_64-linux-gnu/libtevent.so.0.9.37)
==25741==    by 0x4DEBE5A: ??? (in /usr/lib/x86_64-linux-gnu/libtevent.so.0.9.37)
==25741==    by 0x4DEA2D6: ??? (in /usr/lib/x86_64-linux-gnu/libtevent.so.0.9.37)
==25741==    by 0x4DE57E3: _tevent_loop_once (in /usr/lib/x86_64-linux-gnu/libtevent.so.0.9.37)
==25741==    by 0x15D1BA: ctdb_event_script_args (eventscript.c:821)
==25741==    by 0x13B437: ctdb_start_daemon (ctdb_daemon.c:1315)
==25741==    by 0x110642: main (ctdbd.c:393)
==25741==  Address 0x57888a4 is 100 bytes inside a block of size 144 alloc'd
==25741==    at 0x48357BF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25741==    by 0x4B9B7C0: talloc_named_const (in /usr/lib/x86_64-linux-gnu/libtalloc.so.2.1.14)
==25741==    by 0x15CCC6: eventd_client_write (eventscript.c:430)
==25741==    by 0x15CCC6: eventd_client_run (eventscript.c:556)
==25741==    by 0x15CCC6: ctdb_event_script_run (eventscript.c:649)
==25741==    by 0x15D198: ctdb_event_script_args (eventscript.c:812)
==25741==    by 0x13B437: ctdb_start_daemon (ctdb_daemon.c:1315)
==25741==    by 0x110642: main (ctdbd.c:393)
==25741==

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13659

Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Oct 22 09:27:15 CEST 2018 on sn-devel-144
2018-10-22 09:27:15 +02:00
Amitay Isaacs
a190960380 ctdb-event: Check the return status of sock_daemon_set_startup_fd
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13659

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-22 06:04:20 +02:00
Amitay Isaacs
80549927bc ctdb-common: Set close-on-exec for startup fd
The startup_fd should not be propagated to the child processes created
from a daemon.  It should only be used in the daemon code to return the
status of the startup.  Another use of startup_fd is to notify the
parent if the daemon process has exited.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13659

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-22 06:04:20 +02:00
Martin Schwenke
c9e1603a5d ctdb-daemon: Exit if eventd goes away
ctdbd enters a broken state if eventd goes away.  A clean shutdown is
not possible because that involves running events.  Restarting eventd
is possible but this might mask a serious problem and it is possible
that eventd might keep on disappearing.  Just exit.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13659

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-22 06:04:20 +02:00
Martin Schwenke
a3d12252fa ctdb-daemon: Return early when refusing to run an event script
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13659

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-22 06:04:20 +02:00
Martin Schwenke
80f3f7c188 ctdb-tests: Improve counting of database records
Record counts are sometimes incomplete for large databases when
relevant tests are run on a real cluster.

This probably has something to do with ssh, pipes and buffering, so
move the filtering and counting to the remote end.  This means that
only the count comes across the pipe, instead of all the record data.

Instead of explicitly excluding the key for persistent database
sequence numbers, just exclude any key starting with '_'.  Such keys
are not used in tests.
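
A minimal sketch of such a filter, meant to run on the remote end so that only a single number crosses the pipe. The function name count_test_records is hypothetical, and the sketch assumes that each record in the dump contributes one line of the form key(N) = "...".

```shell
# Hypothetical filter: read a database dump on stdin and print the number
# of records whose key does not start with '_'
count_test_records ()
{
    awk '/^key\([0-9]+\) = "[^_]/ { n++ } END { print n + 0 }'
}
```

Counting in awk rather than piping to wc keeps the output free of padding, so the caller can compare it directly as an integer.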

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Oct  8 05:36:11 CEST 2018 on sn-devel-144
2018-10-08 05:36:11 +02:00
Martin Schwenke
52dcecbc92 ctdb-tests: Add extra debug to large database recovery test
This test sometimes fails, probably because the test is flaky.
Either the records aren't being added correctly or the counting of
records loses records.  Try to debug both possibilities.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:23 +02:00
Martin Schwenke
d67d8ed44a ctdb-tests: Shut down transaction_loop clients more cleanly
A transaction_loop client can exit with a transaction active when its
time limit expires.  This causes a recovery and causes problems with
the test cleanup, which detects unwanted recoveries and fails.

Set a flag when the time limit expires and exit cleanly before the
next transaction is started.

Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:23 +02:00
Martin Schwenke
2aa006a311 ctdb-tools: Have onnode pass -n option even when regular ssh not in use
ONNODE_SSH is really a test hook, so it doesn't need to support
completely random values.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:23 +02:00
Martin Schwenke
6ac5124b01 ctdb-tests: Support closing of stdin in local daemons ssh stub
Not sure this is needed but this makes it behave the same as ssh.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:23 +02:00
Martin Schwenke
0dfb3c87b5 ctdb-tests: Be more careful when building public IP addresses
The goal is to allow more local daemons by expanding the address range
rather than generating invalid addresses.

For IPv6, use a separate address space instead of an offset for the
2nd address.

For IPv4, use the last 2 octets with addresses starting at
192.168.100.1 and 192.168.200.1.  Avoid addresses with 0 and 255 in
the last octet by using a maximum of 100 addresses per "subnet"
starting at .1.
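
One way to realise the IPv4 scheme is sketched below. The function name public_ipv4 and the exact index-to-address mapping are assumptions: the index is 0-based, the third octet starts at the given base (100 or 200) and advances after every 100 addresses, and the last octet runs from 1 to 100.

```shell
# Hypothetical mapping from a 0-based index to a public IPv4 address.
# Usage: public_ipv4 BASE INDEX   (BASE is 100 or 200)
public_ipv4 ()
{
    _base="$1"
    _i="$2"

    # At most 100 addresses per "subnet", so the last octet is never 0 or 255
    echo "192.168.$((_base + _i / 100)).$((_i % 100 + 1))"
}
```

For example, public_ipv4 100 0 gives 192.168.100.1 and public_ipv4 100 100 rolls over to 192.168.101.1.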

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:23 +02:00
Martin Schwenke
36eb738877 ctdb-tests: Be more careful when building node addresses
The goal is to allow more local daemons by expanding the address range
rather than generating invalid addresses.

For IPv6, use all 4 trailing hex digits.

For IPv4, use the last 2 octets.  Although 127.0.0.0 is a /8 network,
avoid unexpected issues due to 0 and 255 in the last octet.  Use a
maximum of 100 addresses per "subnet" starting at .1.  Keep the first
group of addresses in 127.0.0.0/24 to continue to allow a reasonable
number of nodes to be tested with socket-wrapper.
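
A sketch of node address generation along these lines. The function name node_ip, the 0-based node index and the IPv6 prefix fc00:10:: are hypothetical; only the use of the last 2 IPv4 octets, the limit of 100 addresses per "subnet" and the hex formatting of the trailing IPv6 group follow the description above.

```shell
# Hypothetical mapping from a 0-based node index to a node address.
# Usage: node_ip FAMILY INDEX   (FAMILY is 4 or 6)
node_ip ()
{
    _family="$1"
    _i="$2"

    case "$_family" in
    4)
        # Decimal octets; the first 100 nodes stay in 127.0.0.0/24
        echo "127.0.$((_i / 100)).$((_i % 100 + 1))"
        ;;
    6)
        # The trailing group is hexadecimal and may use up to 4 digits
        printf 'fc00:10::%x\n' "$((_i + 1))"
        ;;
    *)
        echo "ERROR: unknown address family ${_family}" >&2
        return 1
        ;;
    esac
}
```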

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:23 +02:00
Martin Schwenke
03dddc37b5 ctdb-tests: Don't format IPv4 octets as hex digits
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:22 +02:00
Martin Schwenke
0eabac5295 ctdb-tests: Be more efficient about starting/stopping local daemons
Don't loop, just use onnode all.

For shutting down, use onnode -p all.  This results in a significant
time saving for stopping many daemons because "ctdb shutdown" is now
synchronous.

onnode -p all can be used to start daemons directly because they
daemonize.  However, this does not work under valgrind because the
valgrind process does not exit, so onnode will wait forever for it.
In this case, use onnode without the -p option.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:22 +02:00
Martin Schwenke
a9ac33015b ctdb-tests: Do not use ctdbd_wrapper in local daemon tests
Run the daemon directly and shut it down using ctdb shutdown.

The wrapper waits for ctdbd to reach >=FIRST_RECOVERY runstate within
a timeout period and shuts ctdbd down if that doesn't happen.  This is
only really used to ensure that ctdbd doesn't exit early after an
apparently successful start.  There are no known cases where ctdbd
will continue running but fail to reach >=FIRST_RECOVERY runstate.

When ctdbd is started in tests, the test code will wait until ctdbd is
in a healthy state on all nodes before proceeding, so there is
effectively no change in behaviour.
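
The kind of wait performed by the test code can be sketched as a generic polling loop. The helper name wait_for is hypothetical and this is not the test suite's own implementation; it simply retries a command once per second until it succeeds or the timeout (in seconds) is reached.

```shell
# Hypothetical polling helper.
# Usage: wait_for TIMEOUT COMMAND [ARG...]
wait_for ()
{
    _timeout="$1"
    shift
    _t=0

    while : ; do
        # Succeed as soon as the command does
        if "$@" ; then
            return 0
        fi
        # Give up once the timeout has been reached
        [ "$_t" -lt "$_timeout" ] || return 1
        sleep 1
        _t=$((_t + 1))
    done
}
```

A test would pass a health check of its choice as the command, for example a function that succeeds only once every node reports a healthy state.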

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:22 +02:00
Martin Schwenke
8bde6fa09c ctdb-tests: Don't remove non-existent test database directory
This directory is no longer used.  Lack of removal doesn't seem to
cause a problem.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:22 +02:00
Martin Schwenke
f2e4a5e9fa ctdb-tests: Drop unused function maybe_stop_ctdb()
There are too many functions to start/stop daemons.  Simplify this.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:22 +02:00
Martin Schwenke
2cd6a00399 ctdb-tests: Explicitly check for local daemons when shutting down
This is clearer if the logic is explicit...  and...

There are too many functions to start/stop daemons.  Simplify this.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:22 +02:00
Martin Schwenke
90f6b0a1ed ctdb-tests: Drop functions daemons_start(), daemons_stop()
There are too many functions to start/stop daemons.  Simplify this.

Inline the functionality into ctdb_start_all() and ctdb_stop_all().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:22 +02:00
Martin Schwenke
f1ede41adf ctdb-tests: Don't use daemons_start()/daemons_stop() directly in tests
There are too many functions to start/stop daemons.  Simplify this.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:22 +02:00
Martin Schwenke
4642a347d0 ctdb-tests: Rename _ctdb_start_all() -> ctdb_start_all()
There are too many functions to start/stop daemons.  Simplify this.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:22 +02:00
Martin Schwenke
f57e5bbde7 ctdb-tests: Rename ctdb_start_all() -> ctdb_init()
There are too many functions to start/stop daemons.  Simplify this.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:21 +02:00
Martin Schwenke
a66a96934a ctdb-tests: Drop ps_ctdbd()
This was used for debugging tests by ensuring that the arguments to
ctdbd were as expected.  It no longer outputs anything useful because
ctdbd is now started without arguments.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:21 +02:00
Amitay Isaacs
83b3c5670d ctdb-tests: Drop code for RECEIVE_RECORDS control
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-08 02:46:21 +02:00
Amitay Isaacs
2f89bd96fb ctdb-protocol: Drop marshalling code for RECEIVE_RECORDS control
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-08 02:46:21 +02:00
Amitay Isaacs
81dae71fa7 ctdb-protocol: Mark RECEIVE_RECORDS control obsolete
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-08 02:46:21 +02:00
Amitay Isaacs
d18385ea2a ctdb-daemon: Drop implementation of RECEIVE_RECORDS control
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-08 02:46:21 +02:00
Amitay Isaacs
e15cdc652d ctdb-vacuum: Remove unnecessary check for zero records in delete list
Since no records are deleted from RB tree during step 1, there is no
need for the check.  Run step 2 unconditionally.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-08 02:46:21 +02:00
Amitay Isaacs
ef05239717 ctdb-vacuum: Fix the incorrect counting of remote errors
If a node fails to delete a record in TRY_DELETE_RECORDS control during
vacuuming, then it's possible that other nodes also may fail to delete a
record.  So instead of deleting the record from RB tree on first failure,
keep track of the remote failures.

Update delete_list.remote_error and delete_list.left statistics only
once per record during the delete_record_traverse.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-08 02:46:21 +02:00
Amitay Isaacs
202b9027ba ctdb-vacuum: Simplify the deletion of vacuumed records
The 3-phase deletion of vacuumed records was introduced to overcome
the problem of record resurrection during recovery.  This problem
is now handled by excluding records from recently INACTIVE nodes from
the recovery process.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-08 02:46:20 +02:00
Martin Schwenke
dcc9935995 ctdb-tests: Add recovery record resurrection test for volatile databases
Ensure that deleted records and vacuumed records are not resurrected
from recently inactive nodes.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
2018-10-08 02:46:20 +02:00
Amitay Isaacs
c4ec99b1d3 ctdb-daemon: Invalidate records if a node becomes INACTIVE
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-08 02:46:20 +02:00
Amitay Isaacs
040401ca3a ctdb-daemon: Don't pull any records if records are invalidated
This avoids unnecessary work during recovery to pull records from nodes
that were INACTIVE just before the recovery.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-08 02:46:20 +02:00
Amitay Isaacs
71896fddf1 ctdb-daemon: Add invalid_records flag to ctdb_db_context
If a node becomes INACTIVE, then all the records in volatile databases
are invalidated.  This avoids the need to include records from such
nodes during subsequent recovery after the node comes out of INACTIVE state.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13641

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
2018-10-08 02:46:20 +02:00
Noel Power
2e59a3343f PY3: make sure print stmt is enclosed by '(' & ')'
Signed-off-by: Noel Power <noel.power@suse.com>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
2018-09-19 22:25:05 +02:00
Martin Schwenke
486022ef8f ctdb-recoverd: Set recovery lock handle at start of attempt
This allows the attempt to be cancelled if an election is lost and an
unlock is done before the attempt is completed.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13617

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>

Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Sep 18 02:18:30 CEST 2018 on sn-devel-144
2018-09-18 02:18:30 +02:00