IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
ctdb_client.h is the existing internal client interface (which was mainly
in ctdb.h), and ctdb_protocol.h is the information needed for the wire
protocol only.
ctdb.h will be the new, shiny, libctdb API.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit 4bba6b8cd47b352f98d41f9f06258d5ac3c9adef)
Recent updates to the test meant that it only worked with the latest
ctdb versions. This changes things so that we never bother matching
the machine readable header, just the actual data in the output. It
also takes a slightly more liberal approach in massaging the human
readable output to ensure it matches the machine readable output.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 11e72356e849ed4cb315c942e30e9bcadc624f42)
This makes sure that we don't get public addresses assigned during the
initial recovery and remove them again in the startup event.
metze
(This used to be ctdb commit f872e8c63a2f8979e6a0d088630575bdd4d7b4f1)
This is needed because the "startup" event runs after the initial recovery,
but we need to do some actions before the initial recovery.
metze
(This used to be ctdb commit e953808449c102258abb6cba6f4abf486dda3b82)
Make it return success for make test.
This is temporarily disabled until the rewrite of the
transaction code (in samba and the daemon) using the global
lock feature has been ported to the ctdb client code.
Michael
(This used to be ctdb commit 78ca29352aa39f4ef4e41096b92d55cb2e0d348a)
Writes without transaction are not possible any more on
persistent databases.
Michael
(This used to be ctdb commit 59f46d7261dfdbdef900bf95dd9eb28ad22a46b2)
This is useless now that persistent write operations without
transaction are forbidden.
Michael
(This used to be ctdb commit b022863d44026c19d5aae54aa485b670bea0540e)
This is useless now that persistent writes without transactions are forbidden.
Michael
(This used to be ctdb commit 9ac82311d796e1fab31f8de62b8ccc754445093c)
This is like the 53_ctdb_transaction test, but it additionally
runs a loop with recoveries while the transactions are running.
When called like this, the transaction loops run for 10 minutes:
CTDB_TEST_TIMELIMIT=600 tests/scripts/run_tests tests/simple/54_ctdb_transaction_recovery.sh
The default timelimit is 30 seconds.
Michael
(This used to be ctdb commit 2ff2679e8f3d50ebf735f2c420898a84268bdc95)
The test depended on the exit code of "ctdb gettickles", which always
succeeds. This change wraps the command in a function that checks
whether the tickle we're interested in is registered.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c4b05a731e1bee8f5b46529773a4f5389b2b6064)
The NFS test sleeps for MonitorInterval to give CTDB time to record an
NFS tickle. However, this isn't always long enough. This changes the
test to wait until a monitor event has actually occurred.
The CIFS test assumes that Samba is able to register a tickle with
CTDB before it notices that netstat has registered the tickle and can
use onnode to ask CTDB about it. That is an incorrect assumption -
sometimes we can get to the point of asking CTDB about the tickle
before Samba and CTDB have processed it. This adds a timeout loop
that makes the CIFS test wait until the tickle has been registered or
fail after 10 seconds.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 20a9d35933d89dc7eb710075f360686a49d78609)
This reverts commit 4e9a3a5dc232bac12ab387ea0cf4f1b279bed5c1.
Transaction commit should not be allowed to fail.
This is a real error.
Michael
(This used to be ctdb commit 825c506da76d7afd0714b75b8c8727874183a618)
Commit 25e82a8a667a54c6921ef076c63fdd738dd75d19 changed wait_until()
to protect the command it runs from "set -e" by running it in a
subshell. This breaks uses where the command is expected to set
global variables. For example, wait_until_get_src_socket lost the
value of $out from its call to get_src_socket().
The fix is to not be lazy and use a sub-shell!
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 39642e745254d93d74dde907787503854fe6ca4a)
The timeout for waiting for state changes isn't very predictable. It
is "about" MonitorInterval seconds... but can be longer given the
duration of eventscript runs and other things. So, we change the
timeout to MonitorInterval + EventScriptTimeout, hoping it never takes
that long.
Move the eventscript installation/removal from the old fake-tests into
a function in the functions file. Implement supporting functions to
create/remove/check-for various files that it handles. Also add a
function that uses all of this that waits for the next monitor event
(but only if all other monitor events pass).
The final check in the skip share check tests uses the above and waits
for a monitor event, and then checks that the node is still healthy.
Also enhance the wait_until function to handle a command starting with
'!' (as a separate word) to make it easy to wait for a file not to
exist.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 25e82a8a667a54c6921ef076c63fdd738dd75d19)
Monitor events sometimes happen a little bit more than MonitorInterval
seconds apart. This changes some timeouts to MonitorInterval + 1
seconds.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 6ef4364b3349145b2fec23e0431cd6df6dcadd41)
This function has been broken since it was updated to work with the
"stopped" state (probably commit
67c5bfb5f02c9d45a32d976021ede4fb2174dfe9). Although ${var#:*:0}
removes the shortest matching prefix of $var, '*' can match substrings
that include ':' if '0' isn't where you expect. So we were making
unexpected matches and incorrectly returning true for some cases.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 11137bc2d492a62a26ec9f9f62ff362e81643f66)
This facilitates tracing of tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 1f906bd3476e7cebf217e35b5477d6a7bb615a0c)
Time can actually go backwards during this test if ntpd happens to
adjust it little bit. So we should cope...
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 23ae9e9863ea90c6fb3f105403fd098041fa73f4)
This reverts commit affa6f47432507e84b7e76b88a2c27fff8e6e2e4.
Transaction commit should not be allowed to fail.
This is a fatal error.
Michael
(This used to be ctdb commit 4364419a486c1995bea56dab603cc4960e7c8e7a)
This reverts commit 7a6134e684c9ac4763bf198ef1410867b6082c94.
Transaction commit should not be allowed to fail.
This is a fatal error.
Michael
(This used to be ctdb commit 74e416108df6934f45ca646d709785dd76ab3c35)
Many tests currently do this sort of thing:
onnode 0 $CTDB_TEST_WRAPPER wait_until_node_has_status 1 disconnected
In fact, they all use exactly the same "onnode 0 $CTDB_TEST_WRAPPER"
idiom. This is both repetitious and dangerous, since node 0 might be
shutdown during a test. Instead, we push "onnode any
$CTDB_TEST_WRAPPER" (which selects a connected node) into
wait_until_node_has_status() and just call that function directly in
tests, like this:
wait_until_node_has_status 1 disconnected
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a2aaef03d4d6bbd4b42f50f732254935d4d3469c)
Make it possible to start on only 1 node - for tests that need to
restart a particular node.
_ctdb_hack_options() attempts to see what options are being passed to
a daemon that is being run via the initscript. It then sets a
corresponding environment variable that the initscript knows about.
Currently only the --start-as-stopped option is supported. This is
extremely ugly but it seems like the only way... :-(
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 407b3117dfc1072117abf681ec98b9e252d8744c)
The debug code should run "ctdb status" on a cluster node, not on the
test client.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 448cd8db1305c1e6dfab323f92eac4a576596e4e)
The parser for the output of the ctdb_fetch program
did not match the output that ctdb_fetch generates.
It seemed to rather come from the ctdb_bench test...
This patch adapts the parser to correctly interpret
the output of ctdb_fetch.
Michael
(This used to be ctdb commit 836b95f32724cf37e4f643f20653f78842613692)
In complex/31_nfs_tickle.sh we run sed against a file that might not
exist, causing potential garbage from stderr in the output. Check
that the file exists before running sed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit f9b71757f034732647228d4b8a8f00528028b6b0)
Add a -v so we see the output of the command that tries to get the
value of NFS_TICKLE_SHARED_DIRECTORY. That way we can tell if a value
was retrived OK or if we're using the default.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c53353c6402f378f29200313d82f1f9262d628b1)
Rather than hardcoding the location of the shared tickle directory,
attempt to use the value of NFS_TICKLE_SHARED_DIRECTORY from
/etc/sysconfig/nfs on node 0.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 878437a909ea44dfc3635f082e34741ee256e505)
This failed when node 0 had no public IPs because we would always run
"ctdb gettickles" on node. We now ask node 0 for the tickles on the
test node.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 8fcc4610de926f44e18ec85fb57ca5f7d3c28bd6)
Something, perhaps root_squash, causing permission denied on the test
file after we copy it over with scp. This sets the initial
permissions to be friendly and adds -p to the scp command to maintain
those friendly permissions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 52f21f5a92eb14df7540a2ae9e212d936e646c06)
Add a "stopped" case to log events and stop the event script from
failing with an unknown event.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 7f67f7395e2233f0bba2e9662404aad49e13f645)
The parsing of "ctdb status -Y" output to determine various node
states was implemented very strictly. Therefore, the parsing broke
due to the addition of the new "stopped" state to the output of "ctdb
status -Y". This relaxes the parsing so that it should work for
versions prior to the introduction of the "stopped" state, as well as
future versions that add new states to the end of the list of bits in
output of "ctdb status -Y".
Similarly the check for cluster unhealthy (in _cluster_is_healthy())
now just checks for a single 1 in any bit in the "ctdb status -Y"
output, rather than checking for a particular number of 0s.
New tests
tests/simple/{41_ctdb_stop.sh,42_ctdb_continue.sh,43_stop_recmaster_yield.sh}
do rudimentary testing of the stop and continue functions.
Remove tests tests/simple/41_ctdb_ban.sh and
tests/simple/42_ctdb_unban.sh. They were both unreliable.
tests/simple/21_ctdb_disablemonitor.sh now schedules a restart, since
one will be required.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 67c5bfb5f02c9d45a32d976021ede4fb2174dfe9)
The debug code should run "ctdb status" on a cluster node, not on the
test client.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 34e6f8a04b12f8879eb42d417f9741502ccccf0f)
* 2 new tests for NFS failover.
* Factor repeated code from tests into new functions
select_test_node_and_ips(), gratarp_sniff_start() and
gratarp_sniff_wait_show(). Use these new functions in existing and
new tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit de0b58e18fcc0f90075fca74077ab62ae8dab5da)
cluster_is_healthy() is now run locally in tests and internally causes
_cluster_is_healthy() to be run on node 0. When it detects that the
cluster is unhealthy and $ctdb_test_restart_scheduled is not true,
debug information is printed. This replaces the previous use of
$CTDB_TEST_CLEANING_UP.
To avoid spurious debug on expected restarts, added scheduled
restarts to several tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit ee7caae3a55a64fb50cd28fa2fd4663c5dd83b4f)
This works around potential race conditions in the init script where
the restart operation is not necessarily reliable. It just wraps the
actual restart in a loop and tries for a successful restart up to 5
times.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 1cac8a0ad429f29d1508158c7f7c42a2f1a22945)
If wait_until() does not timeout, print the time taken for the command
to succeed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit bdb856ee22816ae1f6b8d15856555f488054f489)
3 separate tests:
* Check that gratuitous ARPs are received and take effect.
* Check that ping still works after failover.
* Check, via SSH, that the hostname changes after failover.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 92011cc05bbdb517ec6a4573f5cb9f6f21c3059e)
* Removed a race from tpcdump_start(). It seems impossible to tell
when tcpdump is actually ready to capture packets. So this function
now generates some dummy ping packets and waits until it sees them
in the output file.
* tcpdump_start() sets $tcpdump_filter. This is the default filter
for tcpdump_wait() and tcpdump_show(), but other filters may be
passed to those functions.
* New functions tcptickle_sniff_start() and
tcptickle_sniff_wait_show() handle capturing TCP tickle packets.
These are used by complex/31_nfs_tickle.sh and
complex/32_cifs_tickle.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 8e2a89935a969340bfead8ed040d74703947cb81)
There are still very rare cases where IPs haven't been reallocated
before the beginning of the next test, so this adds a sleep and an
extra call to "ctdb recover" to restart_ctdb().
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c2bdb77d91761c003e2f0e6918a27c54150f6030)
Sometimes "stty size" reports 0, for example when running in a shell
under Emacs. In this case, we just change it to 80.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit e309cb3f95efcf6cff7d7c19713d7b161a138383)
* ctdb_restart_when_done() now schedules a restart by setting an
explicit variable that is respected in ctdb_test_exit(), rather than
adding a restart to $ctdb_test_exit_hook. This means that restarts
are all done in one place.
* ctdb_test_exit() turns off "set -e" to make sure that all cleanup
happens.
* ctdb_test_exit() now prints a clear message indicating where the
test ends and the cleanup begins. This message also includes the
return code of the test.
* Add debug in cluster_is_healthy to try to capture information about
unexpected unhealthiness when a test starts.
* Simplify simple/07_ctdb_process_exists.sh so that the exit code is
generated more obviously.
* Remove redundant calls to ctdb_test_exit at the end of tests, since
they're done automatically via a trap. Also remove any preceding
warnings of restarts or final hints about test success/failure.
* Allow multi-digit debug levels in simple/12_ctdb_getdebug.sh and
simple/13_ctdb_setdebug.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit b6fa044a1364cbb3008085041453ee4885f7ced1)
Glitches during restarts of the CTDB cluster have been causing some
tests to fail. This is because restarts are initiated in the body of
many tests. This adds a simple function ctdb_restart_when_done, which
schedules a restart using an existing hook in the test exit code.
This function is now used in tests that need to restart CTDB.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit d440e83bb4f0c19c085915d0f0e87cc0dabbc569)
New tests/complex/ subdirectory contains 2 new tests to ensure that
NFS and CIFS connections are tracked by CTDB and that tickle resets
are sent when a node is disabled.
Changes to ctdb_test_functions.bash to support these tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 31cc46eb157ca1301312f14879e4fb4da7d81088)
Limit the allowable difference in message counts in either direction
around the ring to 5% (up from 2%). There is something environmental
making this blow out to 3% very occasionally when there's no obvious
problem with ctdb.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit d6e6909ac629212b3028e13b958e1a17c64bee8c)
negative. ctdb_bench.c checks to ensure the timer has advanced from 0
before dividing.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 723413f246399b25166462d2018237920515655f)
simple/12_ctdb_getdebug.sh now recognises output with multi-digit node
numbers.
Sharing the ctdb directory via NFS and testing on a real cluster by
setting CTDB_TEST_REAL_CLUSTER didn't work by default. The fix is to
hack scripts/test_wrap so that it tries to find a valid bin directory
next to the directory containing it is in.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit ea2ca769e1d1068fbbad843750b19acfd87360e0)
looping forever on older versions of CTDB.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 03dfcf9f284c9926479a8dd4e5acf1f5d2b964aa)
tests/TODO. Remove tests/start_daemons.sh - no longer used.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 5cdbef46b74e6a5ba2383ef025e69fe605fa4f6e)
directory to $PATH if local daemons are being used.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a497010f67d6a8e68f4d6d7e516b88d2261b1062)
run from the tests subdirectory... and most other sensible places.
Rename $CTDB_TEST_REMOTE_SCRIPTS_DIR to $CTDB_TEST_REMOTE_DIR since it
is now intented that this directory can contain test binaries too.
daemons_start() now passes a full path to the events.d directory when
starting ctdbd. Also fixed the path to ctdbd. 41_ctdb_ban.sh and
42_ctdb_unban.sh fail immediately if they fail to select a test node.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3ecce31d3a3f72b9fa851ac99291865c119e551e)
replace them with new simple tests (52_ctdb_fetch.sh,
53_ctdb_transaction.sh, 61_ctdb_persistent_safe.sh,
62_ctdb_persistent_unsafe.sh). Remove "_simple" from some test
filenames in the simple subdirectory - that's redundant. Always run
ctdb as $CTDB to allow $VALGRIND magic to be used. Use pgrep/pkill to
detect/kill local daemons so those running under valgrind can be found
too - to support this, always run local daemons with the full path to
the executable. run_tests now supports -s option to print sumamry
when done - with more and more tests, it is getting hard to follow
progress. Sort the output of commands in 06_ctdb_getpid.sh to make
sure they compare nicely and also allow the processes' executables to
be called "memcheck" to catch those running under valgrind. Remove
redundant calls to onnode in commands run from calls
try_command_on_node in some tests. 41_ctdb_ban.sh and
42_ctdb_unban.sh avoid banning the recmaster, since this causes the
recmaster to be reassigned and all nodes to be unbanned. Minor
cleanups.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 33cdf3e4bcfadf8e20822ca352babf7acca16821)
public_addresses files - these are now created in tests/var as needed.
09_ctdb_ping.sh now recognises new "node disconnected" message.
Replace custom recovery detection code (which could not have been
working) with a call to "ctdb recover" in 32_ctdb_enable_simple.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 486ed3b5b483f1c12c2d978ec6564bd33a2c6aee)
Function try_command_on_node node passes any options other than -v to
onnode. Minor update to 14_ctdb_statistics.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit e627f8e5fe8e2e3ea423b7dbd12d74597fb4983b)
26_ctdb_config_check_error_on_unreachable_ctdb.sh to work with new
generic error message when trying to talk to disconnected node.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 33afe9bae732e62994e5957ee143a9c49571898b)
something vaguely useful. ctdb_test_exit unsets $ctdb_test_exit_hook.
Fix bug in 17_ctdb_config_delete_ip.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit f90f6e19952d58b8590fe550ecf2308bd9b065dc)
$CTDB_TEST_REAL_CLUSTER is not set. After a ctdb restart, force a
recovery to attempt to help tests that follows.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 497c40f69e06776861a780500da1952eb7ea8fc1)
setup of local daemons so that it correctly assigns no public IPs to a
single node each time. Separate out daemon_setup so that the
selection of the node with no public IPs is only done once at the
beginning of testing. Clean up all current tests, mostly with a view
to ensuring that a node selected for testing some kind of failover
actually has public addresses assigned. Reenabled 01_ctdb_version.sh
- it now passes if rpm doesn't do anything useful on the node.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 86807e8b7b179cbe87e559fb3b1f02c8b1990dc4)
sleeps from ban/unban tests. Now expect "ctdb ping" to return false
if it fails, so made relevant change to 09_ctdb_ping.sh. New
functions install_eventscript and uninstall_eventscript. New
setup/cleanup tests 00_ctdb_install_eventscript.sh and
99_ctdb_uninstall_eventscript.sh. New test 21_ctdb_disablemonitor.sh,
which is incredibly complex.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 23bffef2295772f5b795236d60b7fb6ea754b7fb)
them, randomly pick a single node that will not have any public IPs
assigned. This will make life a bit more interesting and will
simulate what happens on real clusters with a management node. Some
tests were disabling a node to implicitly trigger a ctdb restart - now
use an explicit restart of ctdb when it is required.
17_ctdb_config_delete_ip.sh now randomly chooses a public IP on any
node to disable - this works around a problem where the hardcoded node
might not have any public addresses.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3d59783c0e9478f4766c380945d6200fc654f5d9)
support checking the monitoring mode.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 4e1c079deb0aafb99d4114bb6504ff5ba1cbdeb4)
if the shell exits and ctdb_test_exit cancels this trap. This means
that a testcase executing under set -e will call ctdb_test_exit on
failure, allowing the cluster to be restarted if necessary so that
following tests can complete successfully. ctdb_test_exit now
respects $?, so a test will fail if the last thing executed before
ctdb_test_exit failed - this probably means the above trap was
triggered.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 805a426aaee5ecfc5bd1c097069fe58f8241dfe2)
$TEST_WRAP to $CTDB_TEST_WRAPPER - value now set using
$CTDB_TEST_REMOTE_SCRIPTS_DIR if that is set.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit a69545d7dec78eefb85a1598e5db4667cc210bf9)
frozen/unfrozen via ctdb statistics command.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit e040a989096cf7d5c0cdece1713ff903cb7568f8)
file soon. Simplify 06_ctdb_getpid.sh by using -v option to
try_command_on_node.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c6fc68db9061547e73ec2b811e260bd7da7f58fa)
processing to all tests. New script ctdb_test_env sets up environment
for tests, is now sourced by run_tests, and can also take a test on
the command-line, complete with options. Various cleanups and
improvements. Document tests that have been properly implemented in
ctdbd.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 826e85fe5291067b8d0b9c22918d63024aa6141c)
Move setting of $CTDB_NODES_SOCKETS to tests/scripts/run_tests and
make it only happen if $CTDB_TEST_REAL_CLUSTER is not set. Bugfix in
function ips_are_on_nodeglob. New/proper implementations of functions
stop_daemons and start_daemons, now called by function restart_ctdb.
In start_daemons.sh, add public addresses file generation/usage, use
new option --nopublicipcheck to ctdbd to avoid crazy behaviour and
kill ctdbd more carefully to avoid killing real daemons on a real
cluster - this should be able to coexist on a node of a real cluster.
start_daemons.sh is temporarily incompatible with start_daemons
function, but expecting to replace that script with function calls
very soon anyway...
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 4c54772c5c2fa7d2a25963379b5b96caf0c2521c)
add support to send ipv6 "gratious arp" aka neighbor solicitation packets from ctdb
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
(This used to be ctdb commit 0a38ea11af9237501f2951fee698a59b46f8750d)
ctdb_attach() so that we can pass TDB_NOSYNC when we attach to
a persistent database and want fast unsafe writes instead of
slow but safe tdb_transaction writes.
enhance the ctdb_persistent test suite to test both safe and unsafe writes
(This used to be ctdb commit 4948574f5a290434f3edd0c052cf13f3645deec4)
use the timelimit argument to the persistent writer to run the test for
30 seconds by default
(This used to be ctdb commit 181318fea6886c40d0aff02d0de777f28ffeddce)
so that we have n persistent writers on n nodes,
all writers writing persistently to the same record.
each writer on a node has its own "counter" in this record that is incremented by one in each iteration.
the persistent writer on node 0 also checks that all the counters in the record are increasing monotonically and if they are not, flagging it as an ERROR.
(This used to be ctdb commit 00006222ece63ae3b7f3477646232ae5bbeee6f4)
delete all persistent databases when the test starts
(the tests only uses test databases in a special test directory)
do not set up any public addresses in the tests
wait until there are no disconnected or unhealthy nodes when starting the
test daemons instead of waiting for the recovery mode to change.
we do want to wait until the system has recovered and ALL nodes are ok.
(This used to be ctdb commit dfe0c44c1e8e9dab790686c5ba798986d04bf218)
set the node initially unhealthy and let the status monitoring bring the node online.
This fixes a problem with winbindd, where it refused to start because secrets.tdb was not populated
but we could not populate ctdbd, because the net command would not run while ctdbd was still doing startup
and thus frozen
(This used to be ctdb commit 3a001b793dd76fb96addf1e2ccb74da326fbcfbc)
struct so that if we timeout a control we can print debug info such as
what opcode failed and to which node
we dont need the *status parameter to ctdb_client_control_state
create async versions of the getrecmaster control
pass a memory context to getrecmaster
(This used to be ctdb commit 558b680c82f830fba82c283c78c2de8a0b150b75)
we store in the tree and use a node destructor so that when the data is
talloc_free()d we also remove the node from the tree.
(This used to be ctdb commit b8dabd1811ebd85ee031563e95085f720a2fa04d)
the data of the tree.
this callback makes it more convenient to manage cases where one might
want to insert multiple entries into the tree with the same key
rename the tree->tree pointer to tree->root since this is supposed to
point to the root of the tree
add a small test utility
(This used to be ctdb commit f6313bed9c53e0d1c36c9e08ac707e88e2a4fcd5)
- added DatabaseHashSize tunable
- added logging of events inside recovery (for timing)
(This used to be ctdb commit 3593cdb928b91e217faf1b3c537fa28dc82cdace)
records in the test database.
the tool first creates the number of records requested, then it will
loop infinitely reading the records out again.
(This used to be ctdb commit dca2497097a9cdff268b1e5d446f6feb71bfb389)
to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes)
(This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c)
- use -n to specify node number in ctdb utility
- change 'ctdb status' to 'ctdb statistics'
- added 'ctdb status' which shows status
- added netmask to public IPs, so you don't try a takeover on a
foreign network
- cleaned up tools/ctdb_control.c a lot
- generate usage message at runtime
(This used to be ctdb commit 28de71c03ace7d32a9fd9882fabbd5d668b97656)
update the recovery test script to start all ctdb daemons with a
recovery daemon
(This used to be ctdb commit 47794e16df285cacefc30208d892d931a6e46b96)
ctdb_ctrl_ calls are timedout due to nodes arriving or leaving the
cluster it crashes the recovery daemon afterwards with a SEGV but no
useful stack backtrace
(This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c)
start the daemons with explicit socketnames and explicit ip address/port
remove all --socket= from all ctdb_control calls since they are not
needed anymore
(This used to be ctdb commit 593a959d428f5b4a913117a9b5c8fe65a3eb950e)
of traversing the full cluster.
this makes it easier to debug recovery
update the test script for recovery to reflect the newish signatures to
ctdb_control
the catdb control does still segfault however when there are missing
nodes in the cluster as there are toward the end of the recovery test
(This used to be ctdb commit 8de2a97c14a444f817ceb36461314f10c9601ecc)
this program is a client to the local ctdb daemon
every second it pulls all vnnmap and nodemaps from all nodes that are
available and checks if a recovery is required
a recovery is required if :
* all nodes do NOT have an identical vnnmap and generation
* all nodes do NOT have an identical nodemap
* there are active nodes that are NOT in the nodemap
* there are nodes in the nodemap that are NOT active
During recovery, the recovery tool will also make sure that all nodes
know about and have created all databases.
(This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26)
- allow controls to know which client invoked them
- added a client_id to clients, so they can be identified remotely
- added the ability to remove registered srvids
- in the list_keys code, register a temp srvid, then remove it afterwards
(This used to be ctdb commit 29603c51cc6d81362532cd8e50f75c8360c5f5ef)