root
6793f077a8
Add a new variable VerifyRecoveryLock which can be used to disable the test that the recovery daemon holds the lock properly when performing a recovery
...
(This used to be ctdb commit 329df9e47e6ca8ab5143985a999e68f37c6d88a5)
2009-05-01 01:17:59 +10:00
Ronnie Sahlberg
2e3542b5e5
don't unconditionally kill/restart ctdb when given "service ctdb start". Only start ctdb if it is not already running, and print an error message otherwise
...
(This used to be ctdb commit 94343309992929a592348c936e09a7b4f8b512c1)
2009-04-30 17:38:30 +10:00
Ronnie Sahlberg
3a6ace330e
we only need to have transaction nesting disabled when we start the new transaction for the recovery
...
(This used to be ctdb commit bf8dae63d10498e6b6179bbacdd72f1ff0fc60be)
2009-04-26 08:48:15 +10:00
Ronnie Sahlberg
d20bb2498d
set the TDB_NO_NESTING flag for the tdb before we start a transaction from within recovery
...
(This used to be ctdb commit 1b2029dbb055ff07367ebc1f307f5241320227b2)
2009-04-26 08:42:54 +10:00
Ronnie Sahlberg
777c634eae
add TDB_NO_NESTING. When this flag is set tdb will not allow any nested transactions and tdb_transaction_start() will implicitly _cancel() any pending transactions before starting any new ones.
...
(This used to be ctdb commit 459e4ee135bd1cd24c15e5325906eb4ecfd550ec)
2009-04-26 08:38:37 +10:00
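The behaviour described for TDB_NO_NESTING can be illustrated with a toy model. This is only a sketch in Python, not the tdb C API: the class `ToyStore` and its method names are hypothetical and exist purely to show the semantics stated in the commit message (with the flag set, starting a transaction cancels any pending one; without it, a nested start only increases a nesting depth).

```python
class ToyStore:
    """Toy key/value store modelling nested vs. non-nested transactions.

    Hypothetical illustration only; this is NOT the tdb API.
    """

    def __init__(self, no_nesting=False):
        self.no_nesting = no_nesting  # stands in for the TDB_NO_NESTING flag
        self.committed = {}           # data visible outside transactions
        self.pending = None           # working copy of the open transaction
        self.depth = 0                # nesting depth (0 = no transaction)

    def transaction_start(self):
        if self.depth > 0:
            if self.no_nesting:
                # Flag set: implicitly cancel the pending transaction first.
                self.transaction_cancel()
            else:
                # Flag not set: nest inside the transaction already open.
                self.depth += 1
                return
        self.pending = dict(self.committed)
        self.depth = 1

    def store(self, key, value):
        if self.depth == 0:
            raise RuntimeError("no transaction in progress")
        self.pending[key] = value

    def transaction_cancel(self):
        # Throw away all uncommitted changes, whatever the nesting depth.
        self.pending = None
        self.depth = 0

    def transaction_commit(self):
        if self.depth == 0:
            raise RuntimeError("no transaction in progress")
        self.depth -= 1
        if self.depth == 0:
            # Only the outermost commit makes the changes visible.
            self.committed = self.pending
            self.pending = None


def demo(no_nesting):
    db = ToyStore(no_nesting=no_nesting)
    db.transaction_start()
    db.store("a", 1)
    db.transaction_start()   # second start while the first is still pending
    db.store("b", 2)
    db.transaction_commit()
    if db.depth > 0:         # nested case needs the outer commit as well
        db.transaction_commit()
    return db.committed
```

With `no_nesting=True` the write to "a" is discarded by the implicit cancel, which is the property a recovery wants: it starts from a clean transaction rather than inheriting someone else's half-finished one.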
Ronnie Sahlberg
38ea6708dd
add a tuneable RecoveryDropAllIPs to control how long a node that has been stuck in recovery will wait before it yields all public addresses.
...
this now defaults to 60 seconds
This is useful if a split brain occurs due to network partitioning, since it makes sure that the "other half" of the cluster that does not contain the recovery master will eventually release all ips and thus avoids a duplicate ip situation for the public addresses
(This used to be ctdb commit 70f21428c9eec96bcc787be191e7478ad68956dc)
2009-04-24 18:28:08 +10:00
Ronnie Sahlberg
ce3283f7cb
increase the loglevel for the message we print when we automatically release all ips when we have been in recovery for too long
...
(This used to be ctdb commit 7af060ded5113a49832f6a08a942523a202586b3)
2009-04-24 18:11:10 +10:00
Ronnie Sahlberg
3363480da4
tweak some timeouts so that we do trigger a banning even if the control hangs/times out
...
(This used to be ctdb commit 1860a365e6ba8212e15c33016c80a2adcf8d10f4)
2009-04-24 14:45:07 +10:00
Ronnie Sahlberg
e5532b6f26
If we can not pull a database from a node during recovery, mark this node as a "culprit" so that it will eventually become banned.
...
(This used to be ctdb commit 69dc3bf60b86d8df6dc5c7c6ebf303e847fb2ba9)
2009-04-24 14:44:57 +10:00
Andrew Tridgell
37e2417c59
change shutdown level for ctdb to be 01
...
We want ctdb to shutdown first, as it manages many other
services. With the old level of 32 the NFS service would shutdown
first, and that would trigger ctdb to do a recovery. Then ctdb itself
would be shutdown a few seconds later, which causes a lot of error
messages in the other nodes' logs
(This used to be ctdb commit 2f952af1a12e81a652ec9a4794db96f9593f2676)
2009-04-23 11:35:42 +10:00
Ronnie Sahlberg
8752745173
new version 1.0.79
...
(This used to be ctdb commit 6c900aa343096c5e1e297e055c36832ffa5028dd)
2009-04-08 12:56:52 +10:00
Ronnie Sahlberg
4be3e86405
create a function "remote_ip" which can be used from scripts to remove a single ip from an interface.
...
use this function from the natgw eventscript
(This used to be ctdb commit feab5f30b2d6cebf4dd28abc5a81f93424a4c852)
2009-04-08 12:49:28 +10:00
Ronnie Sahlberg
976e76f408
set libdir to ../lib64 on x86-64 platforms
...
(This used to be ctdb commit a9f851caec2525ccbb3a6d6283eaef52b89a4eb2)
2009-04-08 10:45:00 +10:00
Ronnie Sahlberg
62afe2ff71
install ctdb.pc from the RPM
...
(This used to be ctdb commit 1b47ddc97373376b416a50939b74dc8c926fc917)
2009-04-08 09:34:20 +10:00
Ronnie Sahlberg
0f70c47008
From Mathieu Parent <math.parent@gmail.com>
...
Install the pkgconfig file
(This used to be ctdb commit 7c4389cc0baa43a0ffa9fb08944c253db7885807)
2009-04-08 09:21:11 +10:00
Mathieu Parent
6efe2b6533
(This used to be ctdb commit b0718551f55d5da9be0e6aba233f57c1ff8509be)
2009-04-08 09:14:20 +10:00
Ronnie Sahlberg
59fd3bd564
install /etc/ctdb/notify.sh as executable.
...
this addresses bug 6250
(This used to be ctdb commit b8be5b06c3359d037db336dc12d38e0018349951)
2009-04-08 08:48:55 +10:00
Ronnie Sahlberg
a87e6f56ae
we only need to switch into client mode from the eventscript child if we are running the monitor event
...
(This used to be ctdb commit 13e2c9044950f21918e4610726e73ed3d8f76920)
2009-04-06 14:03:09 +10:00
Ronnie Sahlberg
e5e2f6f8f7
increase the listen queue. Now that the eventscripts may become clients and connect back to the server we do get a lot more concurrent connection attempts (takeip/releaseip are performed in parallel)
...
(This used to be ctdb commit 018f8b0b1823ef59b46f1a671aec5309d10628f4)
2009-04-06 14:00:41 +10:00
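A minimal sketch of why the listen queue size matters, written in Python rather than ctdb's C. It uses a TCP socket on the loopback interface as a stand-in; the socket type, the backlog value and the function names are assumptions for illustration and are not taken from the ctdb source. A burst of clients connects before the server accepts anything, so the pending connections have to fit in the listen queue.

```python
import socket


def open_listener(backlog):
    """Create a listening socket whose queue holds up to `backlog` pending connections."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(backlog)
    return srv


def connect_burst(srv, count):
    """Open `count` client connections before the server accepts any of them."""
    addr = srv.getsockname()
    return [socket.create_connection(addr, timeout=2.0) for _ in range(count)]


def accept_all(srv, count):
    """Drain `count` queued connections from the listen queue."""
    srv.settimeout(2.0)
    accepted = []
    for _ in range(count):
        conn, _peer = srv.accept()
        accepted.append(conn)
    return accepted


def run_burst(backlog=16, clients=8):
    srv = open_listener(backlog)
    outgoing = connect_burst(srv, clients)
    incoming = accept_all(srv, clients)
    served = len(incoming)
    for sock in outgoing + incoming + [srv]:
        sock.close()
    return served
```

With a backlog comfortably larger than the burst, every client is queued and later accepted. If the backlog is too small for the number of simultaneous connection attempts, some clients may be delayed or refused, depending on the operating system.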
Ronnie Sahlberg
1f87ee85bc
use _exit() and not exit() when we terminate a failed eventscript child process
...
(This used to be ctdb commit 33b296cee177adc61edc911caec8c24b3efa8441)
2009-04-06 13:16:36 +10:00
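The commit does not state its reason, but a common motivation for preferring _exit() in a forked child is that exit() runs the exit handlers the child inherited from its parent (and flushes inherited stdio buffers), so parent-side cleanup can run twice. The sketch below shows the same effect in Python on a POSIX system, with `os._exit()` standing in for C's `_exit()` and `sys.exit()` for `exit()`. The script and the function name `count_cleanups` are hypothetical.

```python
import subprocess
import sys
import textwrap

# Small standalone program: register a cleanup handler, fork, and let the
# child terminate either via sys.exit() or via os._exit().
CHILD_SCRIPT = textwrap.dedent("""
    import atexit, os, sys
    atexit.register(lambda: os.write(1, b"cleanup\\n"))
    pid = os.fork()
    if pid == 0:
        # Child process: pretend something failed and terminate.
        if sys.argv[1] == "_exit":
            os._exit(1)      # skips the inherited exit handlers
        sys.exit(1)          # runs the inherited exit handlers
    os.waitpid(pid, 0)       # parent: wait, then exit normally
""")


def count_cleanups(mode):
    """Run the script and count how many times the cleanup handler fired."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT, mode],
        capture_output=True,
        text=True,
    )
    return result.stdout.count("cleanup")
```

With mode "exit" the handler fires in both the child and the parent; with mode "_exit" only the parent runs it.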
Ronnie Sahlberg
2e1208e648
We don't need to verify the nodemap on remote nodes that are banned
...
(This used to be ctdb commit 7f8f9385deee6eff2b7303147bc6412bbdc122df)
2009-04-06 12:00:22 +10:00
Ronnie Sahlberg
2393df3989
if we can't pull the remote nodemap off a node we should mark it as a culprit so it eventually becomes banned.
...
(This used to be ctdb commit 0889ae3c237bdb3bd72d45f2f64f5e5d8420870c)
2009-04-02 14:50:43 +11:00
Ronnie Sahlberg
d94917ec49
Change the (dodgy) seqnumfrequency variable to have ms resolution instead of second resolution.
...
Rename the variable to SeqnumInterval for two reasons:
1, it is an interval and not a 1/interval unit
2, so that we catch when people use the old variable and can update the sysconfig file, instead of silently changing the semantics of this variable
this is a real dodgy variable
(This used to be ctdb commit 68eac459e5d2b6b534f72821036675ffe5d7a350)
2009-04-01 17:21:38 +11:00
Ronnie Sahlberg
297ab50173
remove a prototype for a function no longer used
...
(This used to be ctdb commit 9ac9745ba9296d01e3b18148ae8c3240e51cf090)
2009-04-01 17:13:48 +11:00
Ronnie Sahlberg
71745ef97d
new release 1.0.78
...
(This used to be ctdb commit 00d2213613822b758939019361a619bd7d7f4984)
2009-03-31 20:04:45 +11:00
Ronnie Sahlberg
24d84952f8
we should also install the 11.natgw eventscript if we want to be able to use it
...
(This used to be ctdb commit 42e2797271bc1cdb4eecf1227d4c2db668587193)
2009-03-31 20:00:00 +11:00
Ronnie Sahlberg
53d6626503
install a default /etc/ctdb/notify.sh script as example on how to use
...
snmptrap/email to notify that a node has changed health status
(This used to be ctdb commit ee52c0866e2b26c396fe60946159c559d47199eb)
2009-03-31 14:38:52 +11:00
Ronnie Sahlberg
ad40ee25f9
add a mechanism where the ctdb daemon will run a usercontrolled script when the node status changes to/from UNHEALTHY state.
...
This would allow a sysadmin to set up ctdb to send an email/snmptrap/... when the status of the node changes.
(This used to be ctdb commit ce534a83a05dbd40238e4eee0669d60ff396f935)
2009-03-31 14:23:31 +11:00
Ronnie Sahlberg
df9d401d8c
new version 1.0.77
...
(This used to be ctdb commit 274a4a1fe2e016f33296ebfc5ed6337ce3141d06)
2009-03-31 11:42:10 +11:00
Ronnie Sahlberg
b9e6e15cd4
we must also try to set the routes when we release an ip since during the release/10.interfaces there can actually be a window where the kernel decides to remove all addresses (before we manually add them back in 10.interfaces) during which the kernel may also decide to delete all routes since there are no gateways reachable through this interface anymore.
...
(This used to be ctdb commit 34633223a46caaa079da233663f9c6dcc1803f87)
2009-03-31 11:33:28 +11:00
Ronnie Sahlberg
4c1acd8472
new version 1.0.76
...
(This used to be ctdb commit 56b7095994d1de95e40a223ed503b5572ea9d1b9)
2009-03-25 14:52:08 +11:00
Ronnie Sahlberg
6721546b53
change the ctdb command table to allow us to describe commands which can be run independently of the ctdb daemon.
...
create a new debugging command xpnn which discovers the pnn of the local node and which works even if the local daemon is not running
(This used to be ctdb commit cd78765f9400d7abce7929a2dd199f65226e7664)
2009-03-25 14:46:05 +11:00
Ronnie Sahlberg
3d16205096
update the documentation for NATGW to reflect that you can now use
...
multiple natgw groups in one cluster
(This used to be ctdb commit e059df6d3cd81c67e5505e8ef2d6d0ef9a287b31)
2009-03-25 13:46:41 +11:00
Ronnie Sahlberg
d7ff332896
update how the NATGW configuration works.
...
allow the cluster to be partitioned into multiple disjoint natgw subsets
(This used to be ctdb commit 1046885cd22b5001e0251de2e536b5f6793459be)
2009-03-25 13:37:57 +11:00
Ronnie Sahlberg
fb7c5809fa
web: fix typo
...
Conflicts:
web/index.html
(This used to be ctdb commit 95d22e4cf265d2119f72200ab0ec708f095853df)
2009-03-24 19:02:00 +11:00
Ronnie Sahlberg
89b78ebc6b
update the documentation with all the new commands we support in the
...
ctdb tool
(This used to be ctdb commit ae317b2013eee01c4c0a5108c03f4024bea9e313)
2009-03-24 18:59:27 +11:00
Ronnie Sahlberg
11933e030a
fix the html so that my name and obnox's name are shown
...
(This used to be ctdb commit 0840aa2bd31b2da95342dca8ff35786a3d998688)
2009-03-24 18:23:56 +11:00
Ronnie Sahlberg
689f76f0b0
Merge branch 'obnox'
...
(This used to be ctdb commit 972036a5d510fb9b399f1ee34a8861dee4221267)
2009-03-24 17:49:55 +11:00
Ronnie Sahlberg
e9e27bf264
new version 1.0.75
...
(This used to be ctdb commit 857733ae2bdfa0037af224abfabc020e2ac384c7)
2009-03-24 14:08:57 +11:00
Ronnie Sahlberg
36ec47d610
create a variant of kill_tcp_connections that only kills off the local side of a connection
...
(This used to be ctdb commit dc2f28f7c988364b5d45f3048be4db3e5ff113b3)
2009-03-24 14:05:31 +11:00
Ronnie Sahlberg
686adea3fe
set --single-public-ip when lvs is used
...
(This used to be ctdb commit 292fff6eace39141591871e12f9a64e3441237be)
2009-03-24 13:51:32 +11:00
Ronnie Sahlberg
7265c713db
we need to set the port properly in the parse_ip helper
...
(This used to be ctdb commit 43fe18d86995744ba61c7a6405b70edcb265930a)
2009-03-24 13:45:11 +11:00
Ronnie Sahlberg
4d5bb92fa9
add Michael Adam as one of the ctdb developers on the main ctdb webpage
...
(This used to be ctdb commit be50059c33845fec260ca53975d421a890303880)
2009-03-23 21:44:35 +11:00
Michael Adam
a83ed1d743
Merge commit 'ctdb-ronnie/master'
...
(This used to be ctdb commit 39a972b0d6d0d70282c25c54a124b67431467e77)
2009-03-23 10:07:44 +01:00
root
629d5ee1fa
add a new command "ctdb scriptstatus"
...
this command shows which eventscripts were executed during the last monitoring cycle and the status from each eventscript.
If an eventscript timed out or returned an error we also
show the output from the eventscript.
Example :
[root@rcn1 ctdb-git]# ./bin/ctdb scriptstatus
6 scripts were executed last monitoring cycle
00.ctdb Status:OK Duration:0.021 Mon Mar 23 19:04:32 2009
10.interface Status:OK Duration:0.048 Mon Mar 23 19:04:32 2009
20.multipathd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009
40.vsftpd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009
41.httpd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009
50.samba Status:ERROR Duration:0.057 Mon Mar 23 19:04:33 2009
OUTPUT:ERROR: Samba tcp port 445 is not responding
Add a new helper function "switch_from_server_to_client()" which can be used both by
the recovery daemon and by the child process we start for running the actual eventscripts.
Create several new controls, both for the eventscript child process to inform the master daemon of the current status of the scripts and for the ctdb tool to extract this information from the running daemon.
(This used to be ctdb commit c98f90ad61c9b1e679116fbed948ddca4111968d)
2009-03-23 19:07:45 +11:00
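The "ctdb scriptstatus" example in the commit above can be post-processed by a monitoring script. The following is a rough Python sketch that assumes the output looks exactly like that sample (one line per script with Status: and Duration: fields, optionally followed by an OUTPUT: line). The regular expression, the `ScriptResult` class and the function name are hypothetical, and the real tool's formatting may differ.

```python
import re
from dataclasses import dataclass

# Sample copied from the commit message above.
SAMPLE = """\
6 scripts were executed last monitoring cycle
00.ctdb Status:OK Duration:0.021 Mon Mar 23 19:04:32 2009
10.interface Status:OK Duration:0.048 Mon Mar 23 19:04:32 2009
20.multipathd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009
40.vsftpd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009
41.httpd Status:OK Duration:0.011 Mon Mar 23 19:04:33 2009
50.samba Status:ERROR Duration:0.057 Mon Mar 23 19:04:33 2009
OUTPUT:ERROR: Samba tcp port 445 is not responding
"""

# Assumed line layout: <script> Status:<status> Duration:<seconds> <timestamp>
LINE_RE = re.compile(
    r"^(?P<name>\S+)\s+Status:(?P<status>\S+)\s+"
    r"Duration:(?P<duration>[\d.]+)\s+(?P<when>.+)$"
)


@dataclass
class ScriptResult:
    name: str
    status: str
    duration: float
    when: str
    output: str = ""


def parse_scriptstatus(text):
    """Parse scriptstatus-style text into a list of ScriptResult records."""
    results = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("OUTPUT:") and results:
            # Captured script output belongs to the script listed just before it.
            results[-1].output = line[len("OUTPUT:"):]
            continue
        match = LINE_RE.match(line)
        if match:
            results.append(
                ScriptResult(
                    name=match["name"],
                    status=match["status"],
                    duration=float(match["duration"]),
                    when=match["when"],
                )
            )
    return results


results = parse_scriptstatus(SAMPLE)
failed = [r.name for r in results if r.status != "OK"]
```

The summary line at the top does not contain a Status: field, so the parser skips it.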
root
dc05c1b80c
create a helper function that converts a ctdb instance in daemon mode to become
...
a ctdb client instance.
use this from the recovery daemon child process to switch to client mode
and connect back to the main daemon
(This used to be ctdb commit 16f31786a031255ab5b3099a0a3c745de973347a)
2009-03-23 12:37:30 +11:00
Ronnie Sahlberg
4d2195c503
The wbinfo --sequence command has been deprecated in favor of the new
...
--online-status command
(This used to be ctdb commit b6e34503ac094a274a569a69e3d93d92ad911f4d)
2009-03-19 10:43:57 +11:00
Ronnie Sahlberg
293a3f1158
update the natgw eventscript and documentation
...
(This used to be ctdb commit 95d8ddbc2dd0b159e8df003502c3c336668d2c41)
2009-03-19 10:17:44 +11:00
root
9bf792d704
redo how the natgw is done. just use a default route with a high metric instead of fancy policy routing
...
(This used to be ctdb commit f03bd2b3d906dac9fb876dca54535d22e9cf1b9e)
2009-03-18 19:19:49 +11:00
Ronnie Sahlberg
c9d7c06b61
add documentation for the NAT-GW feature
...
(This used to be ctdb commit b6f7cddc68508e52bc65b357b0b77258ae96362a)
2009-03-18 10:05:00 +11:00