This event is called when a node is stopped and is used by eventscripts that need to do certain cleanup and removal of configuration or ip addresses or routing ...
Note that a STOPPED node is considered "inactive" and as such will not be running the "recovered" event when the rest of the cluster has recovered.
(This used to be ctdb commit 65e9309564611bf937ded3c74a79abff895d7c59)
Also verify that a natgw master is actually available if this is configured, and make the node unhealthy if not.
(This used to be ctdb commit 7f273ee769d671d8c8be87c9187302fb77e814f3)
The debug code should run "ctdb status" on a cluster node, not on the
test client.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 34e6f8a04b12f8879eb42d417f9741502ccccf0f)
This node flag means the node is DISABLED and that all its public ip addresses
are failed over, but also that it has been removed from the VNNmap.
A STOPPED node should be in recovery mode active until restarted using the continue command.
Add two new commands: "ctdb stop" and "ctdb continue".
(This used to be ctdb commit d47dab1026deba0554f21282a59bd172209ea066)
* 2 new tests for NFS failover.
* Factor repeated code from tests into new functions
select_test_node_and_ips(), gratarp_sniff_start() and
gratarp_sniff_wait_show(). Use these new functions in existing and
new tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit de0b58e18fcc0f90075fca74077ab62ae8dab5da)
cluster_is_healthy() is now run locally in tests and internally causes
_cluster_is_healthy() to be run on node 0. When it detects that the
cluster is unhealthy and $ctdb_test_restart_scheduled is not true,
debug information is printed. This replaces the previous use of
$CTDB_TEST_CLEANING_UP.
To avoid spurious debug on expected restarts, added scheduled
restarts to several tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit b67946a6f6b185a7920bf1e560988417c8c4d87d)
This works around potential race conditions in the init script where
the restart operation is not necessarily reliable. It just wraps the
actual restart in a loop and tries for a successful restart up to 5
times.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 3f7a4afa0fcc5825beb89267973939df8cde4999)
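The retry loop described in the commit above could be sketched roughly as follows. This is only an illustration: the function name restart_with_retries is hypothetical, and the command passed to it stands in for whatever the real restart step is.

```sh
# Hypothetical helper (not the code from the commit): run a restart
# command up to 5 times and stop at the first success.
# "$@" is the command to run; it stands in for the real restart step.
restart_with_retries() {
    max_tries=5
    try=1
    while [ "$try" -le "$max_tries" ]; do
        if "$@"; then
            echo "Restart succeeded on attempt $try"
            return 0
        fi
        echo "Restart attempt $try failed" >&2
        try=$((try + 1))
    done
    # All attempts failed.
    return 1
}
```

A caller would pass the actual restart command as arguments, for example the init script invocation used by the test environment.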
If wait_until() does not timeout, print the time taken for the command
to succeed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 8d12fe61eb59a4a611dd5950506d14bd4891075d)
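As a rough illustration of the behaviour described above, a wait_until-style helper that reports the time taken might look like this. The name wait_until_sketch, the output format, and the one-second polling interval are assumptions made for the example, not the actual test code.

```sh
# Hypothetical sketch of a wait_until-style helper.
# Usage: wait_until_sketch TIMEOUT_SECONDS COMMAND [ARGS...]
# Elapsed time is approximated by counting one-second polling intervals.
wait_until_sketch() {
    timeout=$1
    shift
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if "$@"; then
            # Success before the timeout: report how long it took.
            echo "OK (took ${elapsed}s)"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "TIMEOUT after ${timeout}s" >&2
    return 1
}
```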
3 separate tests:
* Check that gratuitous ARPs are received and take effect.
* Check that ping still works after failover.
* Check, via SSH, that the hostname changes after failover.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit aa9f79e4b3e077b48a8a16903d2236c284617e49)
* Removed a race from tcpdump_start(). It seems impossible to tell
when tcpdump is actually ready to capture packets. So this function
now generates some dummy ping packets and waits until it sees them
in the output file.
* tcpdump_start() sets $tcpdump_filter. This is the default filter
for tcpdump_wait() and tcpdump_show(), but other filters may be
passed to those functions.
* New functions tcptickle_sniff_start() and
tcptickle_sniff_wait_show() handle capturing TCP tickle packets.
These are used by complex/31_nfs_tickle.sh and
complex/32_cifs_tickle.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 52e1cd7e9217cfa521850a9a9a9daddcce011f27)
There are still very rare cases where IPs haven't been reallocated
before the beginning of the next test, so this adds a sleep and an
extra call to "ctdb recover" to restart_ctdb().
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 7c27c493a6de92544754e42f2a8f227b3d663c73)
Sometimes "stty size" reports 0, for example when running in a shell
under Emacs. In this case, we just change it to 80.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit da87914ab47fe5786b620587464b58853e98dd7e)
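The fallback described above can be sketched as below. The helper name normalise_width and the variable columns are hypothetical; only the "stty size" call and the default of 80 come from the commit message.

```sh
# Hypothetical helper: normalise a reported terminal width.
# A width of 0 (or an empty value) is replaced with 80.
normalise_width() {
    width=$1
    if [ -z "$width" ] || [ "$width" -eq 0 ]; then
        width=80
    fi
    echo "$width"
}

# "stty size" prints "ROWS COLUMNS"; take the second field.
# If stty fails (e.g. stdin is not a terminal), pretend it reported 0.
columns=$( (stty size 2>/dev/null || echo "0 0") | awk '{print $2}')
columns=$(normalise_width "$columns")
```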
* ctdb_restart_when_done() now schedules a restart by setting an
explicit variable that is respected in ctdb_test_exit(), rather than
adding a restart to $ctdb_test_exit_hook. This means that restarts
are all done in one place.
* ctdb_test_exit() turns off "set -e" to make sure that all cleanup
happens.
* ctdb_test_exit() now prints a clear message indicating where the
test ends and the cleanup begins. This message also includes the
return code of the test.
* Add debug in cluster_is_healthy to try to capture information about
unexpected unhealthiness when a test starts.
* Simplify simple/07_ctdb_process_exists.sh so that the exit code is
generated more obviously.
* Remove redundant calls to ctdb_test_exit at the end of tests, since
they're done automatically via a trap. Also remove any preceding
warnings of restarts or final hints about test success/failure.
* Allow multi-digit debug levels in simple/12_ctdb_getdebug.sh and
simple/13_ctdb_setdebug.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 56ece515e047a54f33e8b07726e52ba21a1d67e1)
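The exit handling described in the first three items above might be sketched like this. The function names carry a _sketch suffix because their bodies are invented for illustration; the message text and the placeholder restart step are assumptions, and only the variable name $ctdb_test_restart_scheduled appears elsewhere in this log.

```sh
# Hypothetical sketch; bodies are invented for illustration only.
ctdb_test_restart_scheduled=false

# Schedule a restart by setting a flag instead of extending an exit hook.
ctdb_restart_when_done_sketch() {
    ctdb_test_restart_scheduled=true
}

# Exit handler: remembers the test's return code, disables "set -e" so
# that every cleanup step runs, then performs any scheduled restart.
ctdb_test_exit_sketch() {
    status=$?
    trap - EXIT   # do not re-enter this handler on the final exit
    set +e
    echo "*** TEST COMPLETE (RC=$status), CLEANING UP..."
    if $ctdb_test_restart_scheduled; then
        echo "Restarting cluster (placeholder for the real restart)"
    fi
    exit "$status"
}
```

A test script would install the handler once near the top with "trap ctdb_test_exit_sketch EXIT", which is why explicit calls at the end of each test become redundant.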
* Move building of CTDB_OPTIONS to new function build_ctdb_options()
and have it use a helper function for readability.
* New functions check_persistent_databases() and set_ctdb_variables().
* Remove valgrind-specific stop code, since the general pkill should
kill ctdbd when running under valgrind.
* Remove some bash-isms (e.g. >& /dev/null) since the script is /bin/sh.
* Make indentation consistent.
* Minor clean-ups.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Conflicts:
config/ctdb.init
(This used to be ctdb commit bebb21f18e3026cb78a306104e92ee005d1077b2)
This is to better handle Linux clients, which often default to ignoring gratuitous ARPs that arrive within 1 second of each other.
(This used to be ctdb commit 5664da36943b4901a807a9594b0f45e859aafbf3)
cluster_is_healthy() is now run locally in tests and internally causes
_cluster_is_healthy() to be run on node 0. When it detects that the
cluster is unhealthy and $ctdb_test_restart_scheduled is not true,
debug information is printed. This replaces the previous use of
$CTDB_TEST_CLEANING_UP.
To avoid spurious debug on expected restarts, added scheduled
restarts to several tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit ee7caae3a55a64fb50cd28fa2fd4663c5dd83b4f)
This works around potential race conditions in the init script where
the restart operation is not necessarily reliable. It just wraps the
actual restart in a loop and tries for a successful restart up to 5
times.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 1cac8a0ad429f29d1508158c7f7c42a2f1a22945)
If wait_until() does not timeout, print the time taken for the command
to succeed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit bdb856ee22816ae1f6b8d15856555f488054f489)
This will force a wait until the IP addresses have been reallocated after a disable/enable command and will make scripting of enable/disable more predictable.
This will cause the enable/disable command to wait until the IP reallocation that normally follows shortly after an enable/disable has finished before the command returns to the prompt.
(This used to be ctdb commit 6e1f60d8d780c1240aaabb78ecc8550d0480cd7e)
3 separate tests:
* Check that gratuitous ARPs are received and take effect.
* Check that ping still works after failover.
* Check, via SSH, that the hostname changes after failover.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 92011cc05bbdb517ec6a4573f5cb9f6f21c3059e)
* Removed a race from tcpdump_start(). It seems impossible to tell
when tcpdump is actually ready to capture packets. So this function
now generates some dummy ping packets and waits until it sees them
in the output file.
* tcpdump_start() sets $tcpdump_filter. This is the default filter
for tcpdump_wait() and tcpdump_show(), but other filters may be
passed to those functions.
* New functions tcptickle_sniff_start() and
tcptickle_sniff_wait_show() handle capturing TCP tickle packets.
These are used by complex/31_nfs_tickle.sh and
complex/32_cifs_tickle.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 8e2a89935a969340bfead8ed040d74703947cb81)
There are still very rare cases where IPs haven't been reallocated
before the beginning of the next test, so this adds a sleep and an
extra call to "ctdb recover" to restart_ctdb().
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c2bdb77d91761c003e2f0e6918a27c54150f6030)
Sometimes "stty size" reports 0, for example when running in a shell
under Emacs. In this case, we just change it to 80.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit e309cb3f95efcf6cff7d7c19713d7b161a138383)
* ctdb_restart_when_done() now schedules a restart by setting an
explicit variable that is respected in ctdb_test_exit(), rather than
adding a restart to $ctdb_test_exit_hook. This means that restarts
are all done in one place.
* ctdb_test_exit() turns off "set -e" to make sure that all cleanup
happens.
* ctdb_test_exit() now prints a clear message indicating where the
test ends and the cleanup begins. This message also includes the
return code of the test.
* Add debug in cluster_is_healthy to try to capture information about
unexpected unhealthiness when a test starts.
* Simplify simple/07_ctdb_process_exists.sh so that the exit code is
generated more obviously.
* Remove redundant calls to ctdb_test_exit at the end of tests, since
they're done automatically via a trap. Also remove any preceding
warnings of restarts or final hints about test success/failure.
* Allow multi-digit debug levels in simple/12_ctdb_getdebug.sh and
simple/13_ctdb_setdebug.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit b6fa044a1364cbb3008085041453ee4885f7ced1)
Validate the input values used and refuse to set the debug level to an unknown value.
(This used to be ctdb commit daec49cea1790bcc64599959faf2159dec2c5929)
The valgrind start case should not use daemon, since this is specific
to Red Hat.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 867f57d166395c92949e480ca725249b0ca8950b)
Use a local variable $ctdbd so that we always run ctdbd from the
same place and so that we know what to kill. This variable respects
the $CTDBD environment variable, which may be used to specify an
alternative location for the daemon.
In the important cases use "pkill -0 -f" to check if ctdbd is
running. Also, remove the special case for killing ctdbd when running
under valgrind. The regular case will handle this just fine.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 070305adfe636c2580776e6bf24bb8be06622b86)
Glitches during restarts of the CTDB cluster have been causing some
tests to fail. This is because restarts are initiated in the body of
many tests. This adds a simple function ctdb_restart_when_done, which
schedules a restart using an existing hook in the test exit code.
This function is now used in tests that need to restart CTDB.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit fc69b6a66282d5be6edeb286bf72aeafb252e6dd)
Commit a0f5148ac749758e2dfbd6099e829c5bf1d900e6 caused a subtle
regression. Due to the subtlety, this description is much longer than
the 1 line patch that fixes it! The regression, where a process that
invokes onnode is unexpectedly blocked, is only apparent if the
following conditions are met:
1. $CTDB_NODES_SOCKETS is set;
2. The command passed to onnode attempts to background a process; and
3. onnode is run in certain types of subshell (e.g. foo=$(onnode ...)).
In particular, when testing against local daemons (i.e. condition (1)
is met), tests/simple/07_ctdb_process_exists.sh would fail (because it
does both (2), (3)).
The problem is caused by the use of file descriptor 3 in the code that
allows separate filtering of stdout and stderr. A backgrounded
process will have this descriptor open and the $(...) construct
appears to wait for all file descriptors to be closed. This only
happens with local daemons because SSH is replaced by a shell and file
descriptor 3 leaks into that shell. It does not occur when SSH is
used because the file descriptor does not leak into the remote shell
where the process is backgrounded.
The fix is simply to redirect file descriptor 3 to /dev/null in the
fakessh function, which is used when $CTDB_NODES_SOCKETS is set.
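The effect of that redirection can be illustrated with a small stand-in. The function fakessh_sketch below is hypothetical and is not the onnode code; it only shows that a backgrounded process no longer inherits descriptor 3, so a surrounding $(...) does not wait for it.

```sh
# Hypothetical stand-in for a local replacement of SSH: run the command
# in a local shell, but point file descriptor 3 at /dev/null so that
# backgrounded children do not inherit whatever the caller had open on it.
fakessh_sketch() {
    sh -c "$*" 3>/dev/null
}

# Illustration: inside $(...), descriptor 3 is deliberately duplicated
# from the capture pipe. Because fakessh_sketch redirects it, the
# backgrounded sleep does not hold the pipe open and the substitution
# returns as soon as "echo started" has run.
out=$(exec 3>&1; fakessh_sketch 'sleep 5 >/dev/null 2>&1 & echo started')
```

Without the "3>/dev/null" redirection, the same command substitution would be expected to block until the backgrounded sleep exits.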
Also fixed is another minor bug when the -o option and
$CTDB_NODES_SOCKETS are used in combination. The code uses the node
name as a suffix for the output filename(s). Usually this is an IP
address. However, when $CTDB_NODES_SOCKETS is in use the node name is
the socket name, which might be a path several directories deep.
Each output file is created via a simple redirection and this would
fail if unexpected directories appear in the filename. 3 possible
fixes were considered:
1. Replace all '/'s in the node name by '_'s. Nice and simple.
2. Use the basename of the node name. However, sockets may be in
different directories but have the same basename.
3. Create all required directories before redirecting. This is a
little more complex and probably doesn't meet the user's
expectations.
Option (1) is implemented here.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 5d320099025b6835eda3a1e431708f7e0a6b0ba6)
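Option (1) above amounts to a one-line transformation of the node name. A minimal sketch follows, assuming a hypothetical helper output_file_for_node and a "." separator between the prefix and the suffix; neither is taken from the onnode source.

```sh
# Hypothetical helper: build an output filename from a prefix and a node
# name, replacing every '/' in the node name with '_' so that a socket
# path never introduces unexpected directories into the filename.
output_file_for_node() {
    prefix=$1
    node=$2
    suffix=$(printf '%s' "$node" | tr '/' '_')
    echo "${prefix}.${suffix}"
}
```

With an IP address as the node name the suffix is unchanged, while a socket path such as /tmp/ctdb/sock.1 becomes _tmp_ctdb_sock.1.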
This allows us to timeout the operation if the underlying filesystem has become temporarily unresponsive without causing a new recovery.
(This used to be ctdb commit d177b08f1dc79534491f27726b05405d47e12e20)
Commit a0f5148ac749758e2dfbd6099e829c5bf1d900e6 caused a subtle
regression. Due to the subtlety, this description is much longer than
the 1 line patch that fixes it! The regression, where a process that
invokes onnode is unexpectedly blocked, is only apparent if the
following conditions are met:
1. $CTDB_NODES_SOCKETS is set;
2. The command passed to onnode attempts to background a process; and
3. onnode is run in certain types of subshell (e.g. foo=$(onnode ...)).
In particular, when testing against local daemons (i.e. condition (1)
is met), tests/simple/07_ctdb_process_exists.sh would fail (because it
does both (2), (3)).
The problem is caused by the use of file descriptor 3 in the code that
allows separate filtering of stdout and stderr. A backgrounded
process will have this descriptor open and the $(...) construct
appears to wait for all file descriptors to be closed. This only
happens with local daemons because SSH is replaced by a shell and file
descriptor 3 leaks into that shell. It does not occur when SSH is
used because the file descriptor does not leak into the remote shell
where the process is backgrounded.
The fix is simply to redirect file descriptor 3 to /dev/null in the
fakessh function, which is used when $CTDB_NODES_SOCKETS is set.
Also fixed is another minor bug when the -o option and
$CTDB_NODES_SOCKETS are used in combination. The code uses the node
name as a suffix for the output filename(s). Usually this is an IP
address. However, when $CTDB_NODES_SOCKETS is in use the node name is
the socket name, which might be a path several directories deep.
Each output file is created via a simple redirection and this would
fail if unexpected directories appear in the filename. 3 possible
fixes were considered:
1. Replace all '/'s in the node name by '_'s. Nice and simple.
2. Use the basename of the node name. However, sockets may be in
different directories but have the same basename.
3. Create all required directories before redirecting. This is a
little more complex and probably doesn't meet the user's
expectations.
Option (1) is implemented here.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit c97d56d93d9c1007a4e85affb19ed0c2d0e11b6d)
Glitches during restarts of the CTDB cluster have been causing some
tests to fail. This is because restarts are initiated in the body of
many tests. This adds a simple function ctdb_restart_when_done, which
schedules a restart using an existing hook in the test exit code.
This function is now used in tests that need to restart CTDB.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit d440e83bb4f0c19c085915d0f0e87cc0dabbc569)
New tests/complex/ subdirectory contains 2 new tests to ensure that
NFS and CIFS connections are tracked by CTDB and that tickle resets
are sent when a node is disabled.
Changes to ctdb_test_functions.bash to support these tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 5d188af387a2a1d68d66f47edb7a9ca546ed357c)
The threshold for the difference in the number of messages sent in either
direction around the ring of nodes was set to 2%. Something
environmental is causing this difference to sometimes be as high as 3%.
We're confident it isn't a CTDB issue so we're increasing the
threshold to 5%.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit be3e23c9fcb9c716e492af102830a4f6ad8bda7b)
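A threshold check of this kind could be sketched as follows. The helper counts_within_threshold is hypothetical, and measuring the difference relative to the larger of the two counts is an assumption; the test may define the percentage differently.

```sh
# Hypothetical check: succeed if two message counts differ by at most
# max_percent percent of the larger count. Integer arithmetic only.
counts_within_threshold() {
    a=$1
    b=$2
    max_percent=$3
    if [ "$a" -ge "$b" ]; then
        larger=$a
        smaller=$b
    else
        larger=$b
        smaller=$a
    fi
    # Two zero counts trivially agree.
    [ "$larger" -gt 0 ] || return 0
    # (larger - smaller) / larger <= max_percent / 100, without division.
    [ $(( (larger - smaller) * 100 )) -le $(( larger * max_percent )) ]
}
```

Under this definition, counts of 1000 and 970 differ by 3%, which fails a 2% threshold but passes a 5% one.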
New tests/complex/ subdirectory contains 2 new tests to ensure that
NFS and CIFS connections are tracked by CTDB and that tickle resets
are sent when a node is disabled.
Changes to ctdb_test_functions.bash to support these tests.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 31cc46eb157ca1301312f14879e4fb4da7d81088)
Limit the allowable difference in message counts in either direction
around the ring to 5% (up from 2%). There is something environmental
making this blow out to 3% very occasionally when there's no obvious
problem with ctdb.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit d6e6909ac629212b3028e13b958e1a17c64bee8c)
In this case, read the nodes file directly instead of asking the local daemon for the list.
Add an option -Y to provide machine-readable output to listnodes.
(This used to be ctdb commit 4a55cacc4f5526abd2124460b669e633deeda408)
AIX doesn't have getopt.h by default.
Don't try including this file when building on AIX.
(This used to be ctdb commit 06b33a826e71e1dd2f9e02ad614be55535d42045)
* Move building of CTDB_OPTIONS to new function build_ctdb_options()
and have it use a helper function for readability.
* New functions check_persistent_databases() and set_ctdb_variables().
* Remove valgrind-specific stop code, since the general pkill should
kill ctdbd when running under valgrind.
* Remove some bash-isms (e.g. >& /dev/null) since the script is /bin/sh.
* Make indentation consistent.
* Minor clean-ups.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 951dbcb29fd53cf51a08958efe185db4954d24f3)