mirror of https://github.com/samba-team/samba.git synced 2024-12-28 07:21:54 +03:00
Commit Graph

1921 Commits

Author SHA1 Message Date
Martin Schwenke
c9ca8ccc23 When running with local daemons, provided there are more than 2 of
them, randomly pick a single node that will not have any public IPs
assigned.  This will make life a bit more interesting and will
simulate what happens on real clusters with a management node.  Some
tests were disabling a node to implicitly trigger a ctdb restart - now
use an explicit restart of ctdb when it is required.
17_ctdb_config_delete_ip.sh now randomly chooses a public IP on any
node to disable - this works around a problem where the hardcoded node
might not have any public addresses.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 3d59783c0e9478f4766c380945d6200fc654f5d9)
2008-12-08 08:15:18 +11:00
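
The node selection described in the commit above could be sketched roughly as follows. This is only an illustration, not the code from the test suite: the function name pick_no_ip_node is hypothetical, and awk's rand() is an assumed way of getting a random number.

```shell
# Hypothetical helper: print the index of the one node that will get no
# public IPs, or print nothing when there are 2 or fewer local daemons.
pick_no_ip_node ()
{
    num_nodes="$1"
    if [ "$num_nodes" -gt 2 ] ; then
        # awk's rand() returns a value in [0,1), so this yields 0..num_nodes-1.
        awk -v n="$num_nodes" 'BEGIN { srand(); print int(rand() * n) }'
    fi
}

no_ip_node=$(pick_no_ip_node 3)
echo "node without public IPs: ${no_ip_node:-none}"
```

With 2 or fewer daemons the helper prints nothing, so every node keeps its public addresses.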
Martin Schwenke
805c5bf1f3 New test for getmonmode. Overload node_has_status some more to
support checking the monitoring mode.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4e1c079deb0aafb99d4114bb6504ff5ba1cbdeb4)
2008-12-04 17:19:51 +11:00
Martin Schwenke
733fe4594c Merge commit 'origin/master' into martins
(This used to be ctdb commit 4ff5875c965f21ab76a5924efd92f1832aeb36d4)
2008-12-04 14:42:04 +11:00
Martin Schwenke
9a4f7e4f4c ctdb_test_init now contains a trap to force ctdb_test_exit to be run
if the shell exits and ctdb_test_exit cancels this trap.  This means
that a testcase executing under set -e will call ctdb_test_exit on
failure, allowing the cluster to be restarted if necessary so that
following tests can complete successfully.  ctdb_test_exit now
respects $?, so a test will fail if the last thing executed before
ctdb_test_exit failed - this probably means the above trap was
triggered.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 805a426aaee5ecfc5bd1c097069fe58f8241dfe2)
2008-12-03 18:08:21 +11:00
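
The trap arrangement described in the commit above can be illustrated with a minimal sketch. The functions test_init and test_exit below are hypothetical stand-ins for ctdb_test_init and ctdb_test_exit; the real functions do more (for example, restarting the cluster), and their actual bodies are not shown in this log.

```shell
# Hypothetical stand-in for ctdb_test_init: make sure test_exit runs
# whenever the shell exits, including an exit forced by set -e.
test_init ()
{
    trap 'test_exit' 0
}

# Hypothetical stand-in for ctdb_test_exit: respect $? and cancel the trap.
test_exit ()
{
    status=$?       # must be first: status of whatever ran just before us
    trap - 0        # cancel the trap so we are not run a second time
    # (a cluster restart would go here if the test left things unhealthy)
    if [ "$status" -eq 0 ] ; then echo "PASSED" ; else echo "FAILED" ; fi
    exit "$status"
}

# Demo: a failing test body, run in a subshell so this script keeps going.
result=$(
    set +e          # keep the demo wrapper itself from aborting
    (
        set -e
        test_init
        false       # simulated failing step; set -e exits the subshell here
        echo "not reached"
    )
    echo "exit status: $?"
)
echo "$result"
```

The point of the design is that a testcase never has to remember to clean up on failure: any exit path goes through the exit function, and the exit function reports the failure through its own exit status.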
Martin Schwenke
3cdc0cb708 $PATH only includes $CTDB_DIR/bin if we're using local sockets. Rename
$TEST_WRAP to $CTDB_TEST_WRAPPER - value now set using
$CTDB_TEST_REMOTE_SCRIPTS_DIR if that is set.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit a69545d7dec78eefb85a1598e5db4667cc210bf9)
2008-12-03 15:48:24 +11:00
Ronnie Sahlberg
539f044aa3 print the list of valid debug level literals when an invalid debug level
is specified in 'ctdb setdebug'

(This used to be ctdb commit 979e78cfd96d74686af6f55f726c395a75275803)
2008-12-02 14:08:10 +11:00
Ronnie Sahlberg
edb7241c05 redesign how reloadnodes is implemented.
modify the transport methods to allow restarting individual connections
and set up destructors properly.

only tear down/set-up tcp connections to nodes removed from the cluster
or nodes added to the cluster.
Leave tcp connections to unchanged nodes connected.

make "ctdb reloadnodes" explicitly cause a recovery of the cluster once
the files have been reloaded

(This used to be ctdb commit d1057ed6de7de9f2a64d8fa012c52647e89b515b)
2008-12-02 13:26:30 +11:00
root
7592a97d16 debuglevel is a signed int, not unsigned.
(This used to be ctdb commit e577a276900854622f4e9da9d1ccd7b484d0d1ec)
2008-11-28 11:29:43 +11:00
Ronnie Sahlberg
51cc8b4df8 make it possible to delete an ip from all nodes at once using
"ctdb delip x.x.x.x -n all"

This is not as straightforward as one might think since during the
delete process we do not want the ip to be bouncing from one node to
another as each node deletes it in turn.

Thus we first delete the ip from all connected nodes which are not
currently hosting it.

After this we delete the ip from the node which is hosting it.

(This used to be ctdb commit bbd46f341e9aa32d8dbd49f7a9a07cb3f1f92ea3)
2008-11-28 09:52:26 +11:00
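
The deletion order described in the commit above can be sketched as follows. This only illustrates the ordering; delip_order is a hypothetical helper, and the real logic lives in the ctdb tool rather than in a shell function.

```shell
# Hypothetical helper: given the node currently hosting the IP, followed by
# the list of connected nodes, print the order in which to delete the IP.
delip_order ()
{
    hosting="$1" ; shift
    # First every connected node that is not hosting the address ...
    for n in "$@" ; do
        [ "$n" = "$hosting" ] || echo "$n"
    done
    # ... then the hosting node last, so the address has nowhere to move to.
    for n in "$@" ; do
        [ "$n" != "$hosting" ] || echo "$n"
    done
}

# Node 1 hosts the address; nodes 0-3 are connected.
delip_order 1 0 1 2 3
```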
Martin Schwenke
a04094659c 4 new tests. Hacked function node_has_status to support
frozen/unfrozen via ctdb statistics command.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e040a989096cf7d5c0cdece1713ff903cb7568f8)
2008-11-27 18:11:22 +11:00
Martin Schwenke
0b6da4f7ec 4 new tests. Marked more ctdbd.sh tests as done - will remove this
file soon.  Simplify 06_ctdb_getpid.sh by using -v option to
try_command_on_node.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit c6fc68db9061547e73ec2b811e260bd7da7f58fa)
2008-11-25 17:53:28 +11:00
Ronnie Sahlberg
a782bdbacd new version 1.0.66

(This used to be ctdb commit 499a01fece2a5f24f1b2943cf3dc6e9a3a8ca3b5)
2008-11-24 19:06:02 +11:00
Martin Schwenke
5d50f5a91c New test 09_ctdb_ping.sh. Add documentation and command-line
processing to all tests.  New script ctdb_test_env sets up environment
for tests, is now sourced by run_tests, and can also take a test on
the command-line, complete with options.  Various cleanups and
improvements.  Document tests that have been properly implemented in
ctdbd.sh.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 826e85fe5291067b8d0b9c22918d63024aa6141c)
2008-11-24 17:47:09 +11:00
Martin Schwenke
0e9f8c4a6f Incorporate temporary patch from Ronnie that adds --nopublicipcheck
option to ctdbd.  Commit here because it seems to work.

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit e5af1e4d945c25cd20d6fb5ef042e6de1aeda4a9)
2008-11-21 19:12:22 +11:00
Martin Schwenke
734f3ada52 Move tests/*.c to tests/src/*.c and adjust Makefile.in accordingly.
Move setting of $CTDB_NODES_SOCKETS to tests/scripts/run_tests and
make it only happen if $CTDB_TEST_REAL_CLUSTER is not set.  Bugfix in
function ips_are_on_nodeglob.  New/proper implementations of functions
stop_daemons and start_daemons, now called by function restart_ctdb.
In start_daemons.sh, add public addresses file generation/usage, use
new option --nopublicipcheck to ctdbd to avoid crazy behaviour and
kill ctdbd more carefully to avoid killing real daemons on a real
cluster - this should be able to coexist on a node of a real cluster.
start_daemons.sh is temporarily incompatible with start_daemons
function, but expecting to replace that script with function calls
very soon anyway...

Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit 4c54772c5c2fa7d2a25963379b5b96caf0c2521c)
2008-11-21 19:01:48 +11:00
Ronnie Sahlberg
1e2831898c allow changing the recmaster even if the database is not frozen
(This used to be ctdb commit 03e2e436db5cfd29a56d13f5d2101e42389bfc94)
2008-11-21 16:24:12 +11:00
Martin Schwenke
bc3a6b20c5 Merge commit 'origin/master' into martins
(This used to be ctdb commit e088116238eb107e9831fccbfd66c1db3d837a3b)
2008-11-21 13:00:37 +11:00
Ronnie Sahlberg
69932283ac remove two variables no longer used from the example sysconfig file
(This used to be ctdb commit dab594caf0bfc23c75c8cd2aa75479c7d2e79f1c)
2008-11-21 11:30:32 +11:00
Andrew Tridgell
59b6a9a9e6 fixed problem with looping ctdb recoveries
After a node failure, GPFS can get into a state where non-blocking
fcntl() locks can take a long time. This leads to the ctdb set_recmode
test timing out, which leads to a recovery failure, and a new
recovery. The recovery loop can last a long time.

The fix is to consider a fcntl timeout as a success of this test. The
test is to see that we can't lock the shared reclock file, so a
timeout is fine for a success.

(This used to be ctdb commit 6579a6a2a7161214adedf0f67dce62f4a4ad1afe)
2008-11-21 10:24:13 +11:00
Andrew Tridgell
eeae32c8d2 Merge commit 'ronnie/master'
(This used to be ctdb commit fe6ddf7992ca3e72a26dbac6666e0f6270da611f)
2008-11-20 21:23:26 +11:00
Martin Schwenke
d741559fa6 Add some simple tests that can be run from within the tree.
Signed-off-by: Martin Schwenke <martin@meltin.net>

(This used to be ctdb commit eacb2ef82ea4809d874158756db973dd1e3fc8fc)
2008-11-20 20:40:01 +11:00
Ronnie Sahlberg
331b9bdb5f don't override/change CTDB_BASE if it is already set by the shell
(This used to be ctdb commit 0a6f9326cb99f14b5c9edd0d8854d8229df49910)
2008-11-20 16:39:56 +11:00
Ronnie Sahlberg
a2a5904f66 Keepalive packets were only sent every KeepaliveInterval if the socket
had been completely idle during that interval.
If we had been sending other packets such as Messages, Calls or Controls
there wouldn't be any need for an explicit keepalive and thus we didn't
send one.

This does make it somewhat awkward when analyzing traces since it is
non-intuitive when keepalives are sent and when they are not sent.

Change the keepalive logic to always send a keepalive regardless of
whether the link is idle or not.

(This used to be ctdb commit 7a18f33ec7512100dd067c65f0470889ff8fd591)
2008-11-20 13:35:08 +11:00
Ronnie Sahlberg
94a56ea410 rewrite the handling of flag updates across the cluster to eliminate a
race between the ctdb tool and the recovery daemon both at once
trying to push flag changes across the cluster.

(This used to be ctdb commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa)
2008-11-20 12:43:18 +11:00
Martin Schwenke
71b16e1123 Merge branch 'master' into martins
(This used to be ctdb commit f77a91c0828a79f99d0c422f7e09b17c69174907)
2008-11-19 13:21:07 +11:00
Ronnie Sahlberg
090e5fdf5e new version 1.0.65
update the example sysconfig file. the default log level is 2, not 0

(This used to be ctdb commit 1f25958dc739677a487fa496fbeffcda7a0f2204)
2008-11-13 10:55:20 +11:00
Ronnie Sahlberg
07d35c754f add a CTDB_SOCKET variable that can be used to override the default
/tmp/ctdb.socket

(This used to be ctdb commit b75e2263c565c21ecbbd98fbd2c10787e467bf5c)
2008-11-11 14:49:30 +11:00
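
The override behaviour in the commit above follows the usual "use the environment value if set, else the default" pattern. A minimal sketch in shell, assuming nothing about how the ctdb tool itself reads the variable; resolve_socket is a hypothetical name used only for illustration:

```shell
# Hypothetical helper: print the socket path to use. The default path is
# the one named in the commit message; CTDB_SOCKET overrides it when set.
resolve_socket ()
{
    echo "${CTDB_SOCKET:-/tmp/ctdb.socket}"
}

echo "using socket: $(resolve_socket)"
```

Setting CTDB_SOCKET in the environment before running a command is then enough to point it at a different daemon, for example one of several local test daemons.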
Ronnie Sahlberg
06728fdac9 we actually need a ctdb_db variable
(This used to be ctdb commit aba984f1b85f5a2d370b093061cf15843ee53758)
2008-11-03 21:54:52 +11:00
Ronnie Sahlberg
d7007793ea latency is measured in us, not ms
use an explicit ctdb_db variable instead of dereferencing state

(This used to be ctdb commit 8c6a02fb423a8cbcbfc706767e3d353cd48073c3)
2008-10-30 13:34:10 +11:00
Ronnie Sahlberg
e1b0cea427 add control and logging of very high latencies.
log the type of operation and the database name for all latencies higher
than a threshold

(This used to be ctdb commit 1d581dcd507e8e13d7ae085ff4d6a9f3e2aaeba5)
2008-10-30 12:49:53 +11:00
Ronnie Sahlberg
0e7fa751af new version 1.0.64
(This used to be ctdb commit 1a7ff4577d33f0dd470f7465c7d0e875c962f54e)
2008-10-22 11:06:18 +11:00
Ronnie Sahlberg
b9bd20ce55 add a context and a timed event so that once we have been in recovery
mode for too long we drop all public ip addresses

(This used to be ctdb commit 403c68f96e1380dd07217c688de2730464f77ea0)
2008-10-22 11:04:41 +11:00
Ronnie Sahlberg
fc6ed25cd5 new version 1.0.63
(This used to be ctdb commit 59a879626a6a55fb6a43cadf5338c1aa6afe96d1)
2008-10-20 09:47:54 +11:00
Ronnie Sahlberg
d265e62ee7 don't log "running periodic cleanup" ...
(This used to be ctdb commit e25ea88ea4f270ba65ed5fdacd693f1248f343c0)
2008-10-20 09:45:15 +11:00
Ronnie Sahlberg
beed899c4f null out the pointer before we reload the nodes file
(This used to be ctdb commit 4b0f32047e8bece0a052bdbe2209afe91b7e8ce3)
2008-10-17 21:38:42 +11:00
Ronnie Sahlberg
a924ef78b6 when we reload the nodes file, we may need to reload the nodes file
inside the recovery daemon as well.

(This used to be ctdb commit 82fd2b6b5cd8e988c38fa6b74121a048757bdeef)
2008-10-17 21:18:06 +11:00
Ronnie Sahlberg
5318ca64b6 make it possible to set the script log level in CTDB sysconfig
(This used to be ctdb commit 06097b88709ced09d1f9f869eed9a54e6d2fedbf)
2008-10-17 13:36:52 +11:00
Ronnie Sahlberg
ce66008e08 specify a "script log level" on the commandline to set the log
level at which any/all output from eventscripts will be logged

(This used to be ctdb commit cdc79d4f22f1a6aec5c34115969421f93663932a)
2008-10-17 07:56:12 +11:00
Ronnie Sahlberg
4a66281cb6 new version 1.0.62
(This used to be ctdb commit 49431e799ba7f7c78f596fdf896316a2e22c745e)
2008-10-16 17:59:55 +11:00
Ronnie Sahlberg
5808a7be96 allow multiple eventscripts using the same prefix.
this eases the pain for users that use out of tree eventscripts

(This used to be ctdb commit 8313dfb6fc5404cd2d065af6620412f8664ada11)
2008-10-16 17:57:50 +11:00
Martin Schwenke
b9137e2422 Merge commit 'origin/master' into martins
(This used to be ctdb commit 9c392c9d18e2360360122b7356874fe5cc7cca64)
2008-10-16 14:15:15 +11:00
Andrew Tridgell
371e6aa155 Merge commit 'ronnie/master'
(This used to be ctdb commit 5403ed6dcfdfc101b05b43f83002e720d81b4e38)
2008-10-16 12:58:25 +11:00
Ronnie Sahlberg
02f6731454 new version 1.0.61
(This used to be ctdb commit 0098efd4443038f2d902e3a7c3640e63f06be7d1)
2008-10-15 16:40:44 +11:00
Ronnie Sahlberg
a9977269f0 install the new multipath monitoring event script
(This used to be ctdb commit 3b8d49bf58f4145cdca08565f06cd43fd36991e1)
2008-10-15 16:29:09 +11:00
Ronnie Sahlberg
60b98f600e add an eventscript to monitor that the multipath devices are healthy
(This used to be ctdb commit f9779d3a237db59d7fdad92185ac7e42715466e6)
2008-10-15 16:27:33 +11:00
Ronnie Sahlberg
f9beb55bf5 we must also check the status returned from the get tickles control to
determine whether it was successful or not

(This used to be ctdb commit 6fb2f8a36239e5902e27cf10213f85faf216d6f1)
2008-10-15 08:33:37 +11:00
Ronnie Sahlberg
233b0e5cbb lower the loglevel for the informational message that a TCP_ADD operation
described an ip address not known to be a public address.

This could happen if someone for genuine reasons accesses a share
through a static ip address.
It can also happen if non-homogeneous public address configurations are
used and when a tcp description is pushed out to a different node that
does not serve/know the specific ip address.

(This used to be ctdb commit 9b1d089c99413f3681440f3cf33c293d118c9108)
2008-10-15 03:02:09 +11:00
Ronnie Sahlberg
3902855275 change ip route add to route add -net since this works more reliably
update the makefile and rpm to install 99.routing

(This used to be ctdb commit c0b3bd8a3fa580dca5afa97c8012fccb25231373)
2008-10-15 01:49:19 +11:00
Ronnie Sahlberg
07b9c38f57 new version 1.0.60
(This used to be ctdb commit 77ed0d71b1fb8d06d70d01a8e8f9eb04ffe7f02f)
2008-10-15 01:32:46 +11:00
Ronnie Sahlberg
6e490e8cce verify that the nodes we try to ban/unban are operational and print an
error to the user otherwise.

(This used to be ctdb commit 5747dd2d80af29d6252afb6aeb3e66328ee20de5)
2008-10-15 01:23:57 +11:00