There's a tiny chance that the connection information may not be
transferred to other nodes quickly enough, so add an explicit wait.
Also clean up the description and recognise that it is the takeover
node that does the tickling.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is a gawk extension and can't be used reliably if just running
"awk". It is simple enough to switch to using the standard sub() and
gsub() functions.
The alternative is to switch to explicitly running "gawk". However,
although the eventscripts aren't exactly portable, it is probably
better to move closer to portability than further away.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
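As an illustration of the switch described above, here is a minimal
sketch that uses only the standard sub() and gsub() functions. The
commit message does not name the gawk extension, so this assumes it is
gensub(); the function name and input string are made up.

```shell
#!/bin/sh
# Hypothetical example: collapse runs of spaces and strip a trailing colon
# using only standard awk functions.
# Assumed gawk-only equivalent of the first step:
#   line = gensub(/ +/, " ", "g", $0)
normalise () {
    printf '%s\n' "$1" | awk '{
        line = $0
        gsub(/ +/, " ", line)   # replace every run of spaces with one space
        sub(/:$/, "", line)     # remove a single trailing colon, if present
        print line
    }'
}

normalise "a   b  c:"    # prints "a b c"
```

Unlike gensub(), which returns the modified string, sub() and gsub()
modify their target in place and return the number of substitutions,
so the value is copied into a variable first.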
tcptickle_sniff_start() assumes that if $dst contains a ':' then it
should use the IPv6 sniffing code. However, $dst is a socket, so has
a trailing ":<port>".
Strip the trailing ":<port>" before checking for ':' as a marker for
an IPv6 address.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
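The check described above can be sketched with shell parameter
expansion. This is not the code from tcptickle_sniff_start(); the
function name is_ipv6_socket is hypothetical and the addresses are
examples only.

```shell
#!/bin/sh
# Sketch: decide whether a socket string "<address>:<port>" holds an
# IPv6 address. The function name is hypothetical.
is_ipv6_socket () {
    _addr="${1%:*}"        # strip the trailing ":<port>"
    case "$_addr" in
        *:*) return 0 ;;   # a ':' remains, so treat it as IPv6
        *)   return 1 ;;
    esac
}

is_ipv6_socket "10.0.0.1:2049" && echo "IPv6" || echo "IPv4"   # prints "IPv4"
is_ipv6_socket "fd00::1:2049"  && echo "IPv6" || echo "IPv4"   # prints "IPv6"
```

Without the stripping step, the IPv4 socket would also contain a ':'
and be misclassified.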
These tests simulate a dead node rather than a CTDB failure, so drop
IP addresses when killing a "node" to avoid problems with duplicates.
To cope with a CTDB failure a watchdog would be needed to ensure that
the public IPs are dropped when CTDB dies. Let's not do that now.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Dec 5 23:29:39 CET 2014 on sn-devel-104
Extend select_test_node_and_ips() to set $test_prefix in addition to
$test_ip.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The default pattern is missing parentheses, so it matches commands
with trailing garbage (e.g. "exportfs.orig").
A careful check of POSIX (and running GNU sed with --posix) suggests
that "\|" isn't a supported way of specifying alternation in a regular
expression. Therefore, it is clearer to switch to extended regular
expressions so that this has a chance of being portable (even though
the point is to print /proc/<pid>/stack, which only works on Linux).
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Nov 18 06:37:45 CET 2014 on sn-devel-104
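The effect of the parentheses can be sketched with an extended regular
expression. The pattern below is made up for illustration (the real
default pattern is not shown here), and it uses grep -E rather than
sed to keep the example short.

```shell
#!/bin/sh
# Sketch with a made-up pattern and a hypothetical function name.
matches_stack_pattern () {
    # -E selects POSIX extended regular expressions, where "|" is
    # alternation. The parentheses make both anchors apply to every
    # alternative; without them "^exportfs" would match "exportfs.orig".
    printf '%s\n' "$1" | grep -Eq '^(exportfs|rpcinfo)$'
}

for cmd in exportfs exportfs.orig rpcinfo ; do
    if matches_stack_pattern "$cmd" ; then
        echo "$cmd: match"
    else
        echo "$cmd: no match"
    fi
done
```

Only "exportfs" and "rpcinfo" match; "exportfs.orig" is rejected.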
Some of this implements logic that exists in functions. Some of it is
overly complicated and potentially failure-prone.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Debugging can still be running when a monitor event times out and
scriptstatus output changes.
When debugging a hung script to a log file, write to a temporary file
and move the temporary file over the log file when done. The test
then waits for the log file to appear.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jul 3 08:19:23 CEST 2014 on sn-devel-104
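The temporary-file approach described above can be sketched as
follows. The function name, the ".part" suffix and the log contents
are assumptions for illustration, not taken from the debug script.

```shell
#!/bin/sh
# Sketch: write all of stdin to a temporary file next to the log file,
# then rename it into place. The function name is hypothetical.
write_log_when_done () {
    _logfile="$1"
    _tmp="${_logfile}.part.$$"
    # The log file only appears once the output is complete
    cat >"$_tmp" && mv "$_tmp" "$_logfile"
}

dir=$(mktemp -d)
printf 'stack trace line\n' | write_log_when_done "$dir/debug.log"
cat "$dir/debug.log"    # prints "stack trace line"
rm -rf "$dir"
```

Because the temporary file is in the same directory as the log file,
the rename is atomic, so a test waiting for the log file to appear
never sees partial output.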
About a year ago a check was added to _cluster_is_healthy() to make
sure that node 0 isn't in recovery. This was to avoid unexpected
recoveries causing tests to fail. However, it was misguided because
each test initially calls cluster_is_healthy() and will now fail if an
unexpected recovery occurs.
Instead, have cluster_is_healthy() warn if the cluster is in recovery.
Also:
* Rename wait_until_healthy() to wait_until_ready() because it waits
  until both healthy and out of recovery.
* Change the post-recovery sleep in restart_ctdb() to 2 seconds and
  add a loop to wait (for 2 seconds at a time) if the cluster is back
  in recovery. The logic here is that the re-recovery timeout has
  been set to 1 second, so sleeping for just 1 second might race
  against the next recovery.
* Use reverse logic in node_has_status() so that it works for "all".
* Tweak wait_until() so that it can handle timeouts with a
  recheck-interval specified.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This one ensures that a newly started node gets an up-to-date tickle
list. Tweak some of the integration test functions to accommodate
this.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
It is hard to diagnose failures in the NFS tickle test because there's
no way of telling if the test node doesn't have the tickle or if it
didn't get propagated.
Factor out check_tickles() into local.bash and give it some
parameters.
Have the NFS test call it first to ensure the tickle has been
registered. Then use new function check_tickles_all() to ensure the
tickle has been propagated to all nodes. Give this a bit of extra
time (double the timeout) just in case we're racing with the update.
Add a useful comment to the CIFS test so that I stop asking myself how
the test could ever have worked reliably. :-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
* Add stack dumps for "interesting" processes. These sometimes get
  stuck, so try to print stack traces for them if they appear in the
  pstree output.
* Add new configuration variables CTDB_DEBUG_HUNG_SCRIPT_LOGFILE and
  CTDB_DEBUG_HUNG_SCRIPT_STACKPAT. These are primarily for testing
  but the latter may be useful for live debugging.
* Load CTDB configuration so that the above configuration variables
  can be set/changed without restarting ctdbd.
Add a test that tries to ensure that all of this is working.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This adds a lot of IPs (currently 100) in a new network and deletes
them in a few steps. First the primary is deleted and then a check is
done to ensure that the remaining IPs are all correct. Then about 1/2
of the IPs are deleted and the remaining IPs are checked. Then the
remaining IPs are deleted and a check is done to ensure they are all
gone.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This currently requires an eventscript to be dynamically installed.
This eventscript is only used to help determine when a monitor event
has occurred. This code is horrible and fragile.
A better way is to just monitor the output of "ctdb scriptstatus".
When it changes then a monitor event has occurred.
Also remove the old code that checks for tickle information in shared
storage. CTDB hasn't done things this way for a long time.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
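Detecting a monitor event by watching for a change in command output
can be sketched as below. This is not the function from the test
suite: wait_for_output_change is a hypothetical name, and the demo
uses a temporary file in place of "ctdb scriptstatus" so that it runs
anywhere.

```shell
#!/bin/sh
# Sketch: poll a command until its output differs from the first run.
# $1 = timeout in seconds, remaining arguments = command to run.
wait_for_output_change () {
    _timeout="$1" ; shift
    _before=$("$@")
    _n=0
    while [ "$_n" -lt "$_timeout" ] ; do
        if [ "$("$@")" != "$_before" ] ; then
            return 0    # output changed
        fi
        sleep 1
        _n=$((_n + 1))
    done
    return 1            # timed out with no change
}

# Demo: a background job rewrites the file after about a second
f=$(mktemp)
echo "run 1" >"$f"
( sleep 1 ; echo "run 2" >"$f" ) &
wait_for_output_change 5 cat "$f" && echo "output changed"
wait
rm -f "$f"
```

On a cluster node the call might look like
"wait_for_output_change 30 ctdb scriptstatus", where the 30 second
timeout is an arbitrary choice.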
This is a needlessly complex way of testing the same thing as the
eventscripts unit tests 60.nfs.monitor.161.sh and
60.nfs.monitor.162.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit d1674aad224f8f0c9a03c3cd38a647318ba0f03e)
This is adequately covered by eventscripts unit tests
50.samba.monitor.105.sh and 50.samba.monitor.106.sh.
This test is broken if CTDB_SAMBA_CHECK_PORTS is not specified in the
CTDB configuration. Fixing it is hard and involves adding a more
complex stub for testparm. We already have that in the eventscript
unit tests above.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 81b94fbb7495ac3204f1a84c673c8babf04663bc)
Refactor the NFS test setup/cleanup code into new common functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 29e98017221326bdc9b1c4f7c05b3b495c1de29b)
Tickle tests fail if run from a node involved in the test.
The condition is actually weaker than this: the test can't be run from
a CTDB node that is hosting public addresses that may be used by the
test.
Rework ctdb_test_check_real_cluster() to support checking this.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 14012781c3751a514055df29ea70adfb12ecb2d9)
This is made possible by separation of public addresses files for
local daemons and the addition of get_ctdbd_command_line_option().
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 2bcd58b30d7cf6dd48ad7f019810c6965a44c85a)
Testing with local daemons is the current default but this is not the
most common use case. Therefore, we make local daemons optional by
using the -l switch with run_tests or by setting TEST_LOCAL_DAEMONS to
the number of daemons to be used (-l sets this to 3).
TEST_LOCAL_DAEMONS replaces CTDB_TEST_NUM_DAEMONS and
CTDB_TEST_REAL_CLUSTER is removed.
Most relevant logic is moved from ctdb_test_env to integration.bash.
ctdb_test_check_real_cluster() is moved from integration.bash to
complex/scripts/local.bash.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 72ecae61c43b318ec94b527a12cbb0a382e8c3db)
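The relationship between the -l switch and TEST_LOCAL_DAEMONS can be
sketched with getopts. This is not the option parsing from run_tests;
the function name parse_args is hypothetical and only -l is handled.

```shell
#!/bin/sh
# Sketch: -l sets TEST_LOCAL_DAEMONS to 3; without it, any value
# already set by the caller is left alone.
parse_args () {
    OPTIND=1
    while getopts "l" _opt ; do
        case "$_opt" in
            l) TEST_LOCAL_DAEMONS=3 ;;
            *) return 1 ;;
        esac
    done
}

TEST_LOCAL_DAEMONS="${TEST_LOCAL_DAEMONS:-}"
parse_args -l
if [ -n "$TEST_LOCAL_DAEMONS" ] ; then
    echo "using $TEST_LOCAL_DAEMONS local daemons"
else
    echo "not using local daemons"
fi
```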
* run_tests no longer includes common.sh, which is only to be included
  by test cases. Therefore, it defines its own die() function.
* TEST_SUBDIR is now set in common.sh
* Move complex-only functions to complex/scripts/local.bash
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit bfa1d6638d3e116640eb4e3bb71b21ba6ef8cae5)
... on Debian systems and derivatives.
(ctdb_diagnostics still hardcodes /etc/sysconfig/)
(This used to be ctdb commit 1341329f6125d491b82c873f793af819e677f714)
The manual replacement of loadconfig() had bit rotted and no longer
worked.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit bf23e7166385d305c6860b37c120f70a9aa33aa5)
Use "onnode any" where possible rather than a fixed node.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 51561720d2b4db5b307da3d410661075e2a6c3ca)
We now kill ctdbd on the test node instead of disabling it. This
ensures that the only tickles we see will come from the takeover node.
We also sleep for TickleUpdateInterval before asking ctdb about the
tickles.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 48cd8325c070f6942aa13a25269021e4c8ed188f)
The test depended on the exit code of "ctdb gettickles", which always
succeeds. This change wraps the command in a function that checks
whether the tickle we're interested in is registered.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c4b05a731e1bee8f5b46529773a4f5389b2b6064)
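The idea of checking output rather than exit status can be sketched
as below. The stand-in command and its output format are invented for
illustration and are not the real "ctdb gettickles" output; both
function names are hypothetical.

```shell
#!/bin/sh
# Stand-in for a command that always succeeds; output format invented.
fake_gettickles () {
    echo "SRC: 10.0.0.5:1001 DST: 10.0.0.1:2049"
    return 0
}

# $1 = source socket, $2 = destination socket, rest = command to run.
# Succeeds only if the command's output mentions the given connection.
tickle_is_registered () {
    _src="$1" ; _dst="$2" ; shift 2
    "$@" | grep -Fq "SRC: $_src DST: $_dst"
}

if tickle_is_registered 10.0.0.5:1001 10.0.0.1:2049 fake_gettickles ; then
    echo "registered"
else
    echo "not registered"
fi
```

The exit status of the wrapper comes from grep, so it reflects
whether the tickle is present rather than whether the query ran.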
The NFS test sleeps for MonitorInterval to give CTDB time to record an
NFS tickle. However, this isn't always long enough. This changes the
test to wait until a monitor event has actually occurred.
The CIFS test assumes that Samba is able to register a tickle with
CTDB before it notices that netstat has registered the tickle and can
use onnode to ask CTDB about it. That is an incorrect assumption -
sometimes we can get to the point of asking CTDB about the tickle
before Samba and CTDB have processed it. This adds a timeout loop
that makes the CIFS test wait until the tickle has been registered or
fail after 10 seconds.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 20a9d35933d89dc7eb710075f360686a49d78609)
The timeout for waiting for state changes isn't very predictable. It
is "about" MonitorInterval seconds... but can be longer given the
duration of eventscript runs and other things. So, we change the
timeout to MonitorInterval + EventScriptTimeout, hoping it never takes
that long.
Move the eventscript installation/removal from the old fake-tests into
a function in the functions file. Implement supporting functions to
create/remove/check-for various files that it handles. Also add a
function that uses all of this that waits for the next monitor event
(but only if all other monitor events pass).
The final check in the skip share check tests uses the above and waits
for a monitor event, and then checks that the node is still healthy.
Also enhance the wait_until function to handle a command starting with
'!' (as a separate word) to make it easy to wait for a file not to
exist.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 25e82a8a667a54c6921ef076c63fdd738dd75d19)
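The '!' handling described above can be sketched as follows. This is
a simplified stand-in, not the wait_until function from the functions
file: the name wait_until_sketch, the fixed 1 second recheck interval
and the example path are assumptions.

```shell
#!/bin/sh
# Sketch: retry a command once a second until it succeeds or the
# timeout expires. A leading '!' word inverts the sense of the command.
# $1 = timeout in seconds, remaining arguments = [!] command.
wait_until_sketch () {
    _timeout="$1" ; shift
    _negate=false
    if [ "$1" = "!" ] ; then
        _negate=true
        shift
    fi
    _elapsed=0
    while [ "$_elapsed" -lt "$_timeout" ] ; do
        if "$@" ; then _ok=true ; else _ok=false ; fi
        # Done when the command succeeded (or failed, if negated)
        if [ "$_ok" != "$_negate" ] ; then
            return 0
        fi
        sleep 1
        _elapsed=$((_elapsed + 1))
    done
    return 1
}

# Wait for a file not to exist
wait_until_sketch 2 '!' test -e /nonexistent/example/file && echo "file is absent"
```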