Don't calculate this locally as _tools_dir. Add it to PATH
unconditionally - this may result in duplicate entries in PATH but the
resulting code is simpler.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
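A minimal sketch of the unconditional PATH update described in the commit above. The directory value and the name `tools_dir` are hypothetical, and prepending rather than appending is an assumption; the commit only says the directory is added to PATH.

```shell
#!/bin/sh
# Hypothetical location of the helper tools; not taken from the CTDB tree.
tools_dir="/tmp/example-tools"

# Add the directory unconditionally. No check is made for an existing
# entry, so running this twice leaves a duplicate in PATH. That is
# harmless because command lookup stops at the first match.
PATH="${tools_dir}:${PATH}"
export PATH
```

The trade-off is the one the commit names: a possibly longer PATH in exchange for not having a membership test around the assignment.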
Don't calculate this locally as _test_bin_dir. Just calculate
top_dir, source script_install_paths.sh and use
$CTDB_SCRIPT_TESTS_BINDIR.
Don't bother sanity checking if TEST_BIN_DIR is set. It will go away
soon.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
ctdb_test_init() now passes any arguments to setup_ctdb().
Update tests that have custom local daemon configuration to call
ctdb_test_init() directly. Remove the redundant, initial call to
ctdb_test_init() to avoid starting the cluster an extra time.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
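The argument forwarding described in the commit above can be sketched as follows. Both function bodies are stand-ins written for illustration, not the test suite's real code, and the option shown in the usage line is hypothetical.

```shell
#!/bin/sh
# Stub standing in for the real setup function: it only records the
# arguments it was given.
setup_ctdb ()
{
	setup_args="$*"
}

# Forward every argument unchanged, so a test with custom daemon
# configuration can call ctdb_test_init directly with its options.
ctdb_test_init ()
{
	setup_ctdb "$@"
}

# Usage (hypothetical option name):
#   ctdb_test_init --example-option value
```

Using "$@" rather than $* keeps each argument intact, including any that contain spaces.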
This no longer does anything. Integration test cases now start and
shut down the cluster.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The remainder of the scheduled restart logic is about to be removed,
so produce debugging information any time the cluster is not healthy.
While here, reindent and drop the else since there is already an early
return before it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Integration test cases now start and shut down the cluster.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Interrupting a test run currently moves on to the next test. It
should exit.
Follow the practice of exiting with 128 + signal number.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
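A small sketch of the exit-code convention mentioned in the commit above. The function and handler names are hypothetical; only the 128 + signal number rule comes from the commit message.

```shell
#!/bin/sh
# Exit status conventionally reported for termination by a signal:
# 128 plus the signal number (SIGINT is 2, SIGTERM is 15).
signal_exit_code ()
{
	echo $((128 + $1))
}

# Hypothetical handler for a test runner: stop the whole run instead
# of letting the loop carry on with the next test.
interrupted ()
{
	echo "Interrupted, exiting" >&2
	exit "$(signal_exit_code 2)"
}

trap interrupted INT
```

With this in place, interrupting the run with Ctrl-C makes the script exit with status 130.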
Running testsuite-specific code here isn't a good option.
Daemons are now shut down in ctdb_test_exit(), even when testing is
interrupted.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This makes tests self-contained. They can also now be individually
looped, if necessary.
Most tests (all but 1 complex, more than 50% of simple) restart the
daemons anyway, so this simplification is worth it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Exit on first test failure instead of setting a variable. The bizarre
logic in ctdb_test_exit() makes this worth dropping.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
These 3 tests duplicate various checks and can easily be handled as a
single test.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
The "continue" and "enable" tests are just extensions of the "stop"
and "disable" tests, so drop the latter 2.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This strengthens those tests to ensure that released IPs aren't
replaced with others.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This is only really wanted for interactive testing when logging to
stderr.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This stops ctdbd from being able to shut down eventd, since the PID it
records will be invalid. There's no need for eventd to fork.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
ctdbd logs to stderr in interactive mode, not stdout. This way stdout
is always closed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Logging the logging location to syslog can be useful on production
systems when the configuration goes unexpectedly missing. However, in
test mode this just adds noise to the logs on the test system.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
If there aren't enough addresses in the list then the shift will
silently fail and the printed address will be the unshifted value of
$1, which is incorrect/unexpected. So, sanity check the node number.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
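The sanity check described in the commit above might look like the sketch below. The function name, the calling convention and counting nodes from 0 are all assumptions made for illustration; the real helper may differ.

```shell
#!/bin/sh
# Print the address of node number $1 (counting from 0) from the
# remaining arguments.
nth_address ()
{
	_n="$1"
	shift

	# Sanity check: without this, "shift $_n" on a list that is too
	# short would fail and $1 would still be the first address.
	case "$_n" in
	''|*[!0-9]*)
		echo "Invalid node number: $_n" >&2
		return 1
		;;
	esac
	if [ "$_n" -ge $# ] ; then
		echo "Node $_n out of range ($# addresses)" >&2
		return 1
	fi

	shift "$_n"
	echo "$1"
}
```

Checking the range before shifting turns a wrong answer into an explicit error.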
The local daemons ssh stub doesn't need to do this because ctdbd and
the ctdb tool now only need CTDB_TEST_MODE and CTDB_BASE for local
daemon tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Drop the use of ctdb_set_sockname() because it complicates the memory
allocation and this is the only place it is used. Just assign to the
relevant pointer.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Just leak the memory allocated by path_socket(). This is only used in
short-lived test programs, so it isn't worth the hassle of plumbing a
talloc context through several layers to get here.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
This needs to be done before any of the code changes are made,
including updating the ctdb tool.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
However, don't use ctdb-path itself because some tests use nested
instances of onnode. The outermost instance would set CTDB_SOCKET and
any inner instance would pick up that value, regardless of CTDB_BASE.
This is a temporary measure to avoid breaking testing while use of the
path functions is added to ctdbd and the ctdb tool. When this is
complete these variables can be removed altogether because the code
will just depend on CTDB_TEST_MODE and CTDB_BASE.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Use of CTDB_SOCKET is being generally removed. However, this override
is being added to allow test code outside of ctdb/ to be able to
specify the socket, if desired.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
... instead of applying banning credits.
There have been a couple of cases where recovery repeatedly takes just
over 2 minutes to fail. Therefore, banning credits expire between
failures and a continuously problematic node is never banned,
resulting in endless recoveries. This is because it takes 2
applications of banning credits before a node is banned, which
generally involves 2 recovery failures.
The recovery helper makes up to 3 attempts to recover each database
during a single run. If a node causes 3 failures then this is really
equivalent to 3 recovery failures in the model that existed before the
recovery helper added retries. In that case the node would have been
banned after 2 failures.
So, instead of applying banning credits to the "most failing" node,
simply ban it directly from the recovery helper.
If multiple nodes are causing recovery failures then this can cause a
node to be banned more quickly than it might otherwise have been, even
pre-recovery-helper. However, 90 seconds (i.e. 3 failures) is a long
time to be in recovery, so banning earlier seems like the best
approach.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13670
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Nov 5 06:52:33 CET 2018 on sn-devel-144
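The reasoning in the commit above can be illustrated with a toy model. Everything here is an assumption made for the sketch: the function name, the parameters, and in particular the 120-second expiry used in the usage lines. The commit only states that recovery takes just over 2 minutes to fail, that credits expire between failures, and that 2 applications of credits are needed for a ban.

```shell
#!/bin/sh
# Toy model: print the failure number at which a node gets banned, or
# "never". All parameters are illustrative.
#   $1 - seconds between consecutive recovery failures
#   $2 - seconds after which previously applied credits expire
#   $3 - number of unexpired credit applications needed for a ban
#   $4 - number of failures to simulate
ban_point ()
{
	_interval="$1" ; _expiry="$2" ; _needed="$3" ; _failures="$4"

	_count=0
	_i=1
	while [ "$_i" -le "$_failures" ] ; do
		# Credits from earlier failures are forgotten if the gap
		# since the previous failure exceeds the expiry time.
		if [ "$_i" -gt 1 ] && [ "$_interval" -gt "$_expiry" ] ; then
			_count=0
		fi
		_count=$((_count + 1))
		if [ "$_count" -ge "$_needed" ] ; then
			echo "$_i"
			return 0
		fi
		_i=$((_i + 1))
	done
	echo "never"
}

# Usage, with an assumed 120 second expiry:
#   ban_point 125 120 2 10   # failures just over 2 minutes apart
#   ban_point 90 120 2 10    # failures closer together than the expiry
#   ban_point 125 120 1 10   # ban directly on the first failed run
```

In this model a node whose failures are spaced slightly wider than the expiry time never accumulates 2 applications, which is the endless-recovery case the commit describes. Requiring only 1 application corresponds to banning directly from the recovery helper.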
==25741== Syscall param write(buf) points to uninitialised byte(s)
==25741== at 0x4939291: write (write.c:27)
==25741== by 0x4868285: sys_write (sys_rw.c:68)
==25741== by 0x13915D: sock_queue_trigger (sock_io.c:316)
==25741== by 0x4DE6478: tevent_common_invoke_immediate_handler (in /usr/lib/x86_64-linux-gnu/libtevent.so.0.9.37)
==25741== by 0x4DE64A2: tevent_common_loop_immediate (in /usr/lib/x86_64-linux-gnu/libtevent.so.0.9.37)
==25741== by 0x4DEBE5A: ??? (in /usr/lib/x86_64-linux-gnu/libtevent.so.0.9.37)
==25741== by 0x4DEA2D6: ??? (in /usr/lib/x86_64-linux-gnu/libtevent.so.0.9.37)
==25741== by 0x4DE57E3: _tevent_loop_once (in /usr/lib/x86_64-linux-gnu/libtevent.so.0.9.37)
==25741== by 0x15D1BA: ctdb_event_script_args (eventscript.c:821)
==25741== by 0x13B437: ctdb_start_daemon (ctdb_daemon.c:1315)
==25741== by 0x110642: main (ctdbd.c:393)
==25741== Address 0x57888a4 is 100 bytes inside a block of size 144 alloc'd
==25741== at 0x48357BF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25741== by 0x4B9B7C0: talloc_named_const (in /usr/lib/x86_64-linux-gnu/libtalloc.so.2.1.14)
==25741== by 0x15CCC6: eventd_client_write (eventscript.c:430)
==25741== by 0x15CCC6: eventd_client_run (eventscript.c:556)
==25741== by 0x15CCC6: ctdb_event_script_run (eventscript.c:649)
==25741== by 0x15D198: ctdb_event_script_args (eventscript.c:812)
==25741== by 0x13B437: ctdb_start_daemon (ctdb_daemon.c:1315)
==25741== by 0x110642: main (ctdbd.c:393)
==25741==
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13659
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Oct 22 09:27:15 CEST 2018 on sn-devel-144
The startup_fd should not be propagated to the child processes created
from a daemon. It should only be used in the daemon code to return the
status of the startup. Another use of startup_fd is to notify the
parent if the daemon process has exited.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13659
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
ctdbd enters a broken state if eventd goes away. A clean shutdown is
not possible because that involves running events. Restarting eventd
is possible but this might mask a serious problem and it is possible
that eventd might keep on disappearing. Just exit.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13659
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Record counts are sometimes incomplete for large databases when
relevant tests are run on a real cluster.
This probably has something to do with ssh, pipes and buffering, so
move the filtering and counting to the remote end. This means that
only the count comes across the pipe, instead of all the record data.
Instead of explicitly excluding the key for persistent database
sequence numbers, just exclude any key starting with '_'. Such keys
are not used in tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Mon Oct 8 05:36:11 CEST 2018 on sn-devel-144
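A sketch of the filter-and-count step described in the commit above. The one-key-per-line input format and the function name are assumptions made for this example; the real database dump format is not shown in the commit message.

```shell
#!/bin/sh
# Count records given one key per line on stdin, skipping any key that
# starts with '_'. Prints 0 when there is no input.
count_test_records ()
{
	awk '!/^_/ { n++ } END { print n + 0 }'
}
```

In the arrangement the commit describes, a filter like this would run on the remote node, so that only the final number travels back over ssh instead of all the record data.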
This test sometimes fails, probably because the test itself is flaky.
Either the records aren't being added correctly or the counting of
records loses records. Try to debug both possibilities.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
A transaction_loop client can exit with a transaction active when its
time limit expires. This causes a recovery, which in turn breaks the
test cleanup: the cleanup detects unwanted recoveries and fails.
Set a flag when the time limit expires and exit cleanly before the
next transaction is started.
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
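The flag-based exit described in the commit above follows a common pattern, sketched here in shell regardless of how the client itself is written. All names are hypothetical, and the time limit is simulated by calling the handler directly in the middle of the third transaction rather than by a real timer.

```shell
#!/bin/sh
time_expired=0
completed=0
active=0

# The handler only sets a flag; it never exits by itself.
on_time_limit ()
{
	time_expired=1
}
trap on_time_limit ALRM

# Run "transactions" until the flag is set. The flag is checked only
# between transactions, so the loop never stops with one still active.
transaction_loop ()
{
	while [ "$time_expired" -eq 0 ] ; do
		active=1
		# Simulated expiry in the middle of the third transaction.
		if [ "$completed" -eq 2 ] ; then
			on_time_limit
		fi
		completed=$((completed + 1))
		active=0
	done
}
```

The transaction that is in progress when the limit expires still completes, and the loop stops before starting another one.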
ONNODE_SSH is really a test hook, so it doesn't need to support
arbitrary values.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>