Ronnie Sahlberg
aa080f66d9
first cut at a better and more scalable socketkiller
...
that can kill multiple connections asynchronously using one listening
socket
(This used to be ctdb commit 22bb44f3d745aa354becd75d30774992f6c40b3a)
2007-07-11 17:43:51 +10:00
Andrew Tridgell
32de198fd3
update lib/replace from samba4
...
(This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)
2007-07-10 15:29:31 +10:00
Andrew Tridgell
bdf01ed7c0
- neaten up the command line for killtcp
...
- split out the event script code into a separate module
- get rid of the separate takeover directory
(This used to be ctdb commit 8ea2c923a3e2464200ff79bf2c3f1f89e6a93ad4)
2007-07-04 16:51:13 +10:00
Andrew Tridgell
14c788f3cb
move more util code to lib/util
...
(This used to be ctdb commit de5ab0584c978a6be4afeacd80c84015b206a3c6)
2007-06-07 22:30:29 +10:00
Andrew Tridgell
ae3d54094b
start splitting the code into separate client and server pieces
...
(This used to be ctdb commit 603cd77988c181525946cd5eb0f4d0d646b58059)
2007-06-07 22:06:19 +10:00
Andrew Tridgell
3d75c9a51d
later times are a lower priority, not a higher priority
...
(This used to be ctdb commit e96424e7d366df29767c4eeaccdcc0cc975cb8ae)
2007-06-07 19:21:55 +10:00
Andrew Tridgell
dbb803e6af
choose the most connected node first
...
(This used to be ctdb commit c7c17a79fa4f28509e34b6f635fa62517dc458c2)
2007-06-07 19:17:27 +10:00
Andrew Tridgell
df6439d796
formatting fixes
...
(This used to be ctdb commit ed63a2057698aed3931762605b2ea2368681af2b)
2007-06-07 18:39:37 +10:00
Andrew Tridgell
d774192737
use a priority time for the election data, not just the vnn
...
(This used to be ctdb commit a691f9c5cd77194005f0d98483da94b07a48d57d)
2007-06-07 18:37:27 +10:00
Andrew Tridgell
c42ddcda23
validate vnn on node flags change
...
(This used to be ctdb commit 5628ebbcc2aa61b63c761783c70fe4d8a0070607)
2007-06-07 18:13:14 +10:00
Andrew Tridgell
96861466b7
there are now far too many controls for the controls statistics fields to be useful
...
(This used to be ctdb commit f5e188fc7e13b55b6b4081dcc74ea9614a76f9bb)
2007-06-07 18:07:38 +10:00
Andrew Tridgell
3e4d7bef23
get all the tunables at once in recovery daemon
...
(This used to be ctdb commit 8e60be6c22aab145e68b16ede5f32f4430c2af93)
2007-06-07 18:05:25 +10:00
Andrew Tridgell
cb4c33cc68
handle CTDB_CURRENT_NODE in ban commands
...
(This used to be ctdb commit fefb53f1d22c5458a1e107f8352818aee87983de)
2007-06-07 16:48:31 +10:00
Andrew Tridgell
23bf62fe30
added admin commands to ban/unban nodes
...
(This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad)
2007-06-07 16:34:33 +10:00
Andrew Tridgell
2ed57a9ae1
implement a scheme where nodes are banned if they continuously cause the cluster
...
to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes)
(This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c)
2007-06-07 15:18:55 +10:00
Andrew Tridgell
9754d16d48
merged admin enable/disable change from ronnie
...
(This used to be ctdb commit df17b69dfd83a98f9c711994c7dd51ad2cc0ab8a)
2007-06-07 11:15:22 +10:00
Ronnie Sahlberg
9ff733c784
add a control to permanently enable/disable a node
...
(This used to be ctdb commit d66fdba16ca22f62ddac6882a17614879b08a798)
2007-06-07 09:16:17 +10:00
Andrew Tridgell
8fbca613d4
get parent's idea of recmode and recmaster when deciding if we should do a takeover run
...
(This used to be ctdb commit 0e8124acd2f1a9b34292c1ee13c7e4cd6fe49876)
2007-06-06 21:56:54 +10:00
Andrew Tridgell
4a7f116746
update flags in parent daemon too
...
(This used to be ctdb commit 8995246d95e670753ab8c61d724d284cac2b414d)
2007-06-06 21:34:36 +10:00
Andrew Tridgell
ae56096b0b
ensure all nodes display disabled nodes correctly
...
(This used to be ctdb commit 959f82cfe926994658f5826007caccb0409003e1)
2007-06-06 21:27:09 +10:00
Andrew Tridgell
81fad8636f
added timeouts in all event scripts
...
(This used to be ctdb commit d986c91a607ed7c7d4869ea786b5cdf80e7862f1)
2007-06-06 13:45:12 +10:00
Andrew Tridgell
76b7361c7e
- added monitoring of rpc ports for nfs, and of Samba ports and directories
...
- added monitoring of the ethernet link state
When monitoring detects an error, the node loses its public IP address
(This used to be ctdb commit 0af57aead8c983511d25774b4ffe09fa5ff26501)
2007-06-06 12:08:42 +10:00
Andrew Tridgell
cafddf76dc
- fixed flags display in logs
...
- added monitor handler to test event script
(This used to be ctdb commit a4c18dddee169df49e5d77d9a94ce9329f169319)
2007-06-06 11:13:24 +10:00
Andrew Tridgell
eaf701fbda
send the right sort of message on monitoring failure
...
(This used to be ctdb commit 9db537d9b11d48a36346db721ed8936ff5ecacb2)
2007-06-06 11:12:45 +10:00
Andrew Tridgell
af8834dd02
added health monitoring logic to ctdb, so a node loses its public IP address if one of the subsystem event scripts reports a problem
...
(This used to be ctdb commit c7a089256d86cec21097453bce5acbccee87413f)
2007-06-06 10:25:46 +10:00
Andrew Tridgell
be3a00bd73
clean out some more cruft
...
(This used to be ctdb commit ad16c5fe2748b48a6f6c79976359d56d9bed33f4)
2007-06-05 17:57:07 +10:00
Andrew Tridgell
ac55bc4166
first step in health monitoring of cluster nodes. When not healthy they will be marked disabled
...
(This used to be ctdb commit d3dbd9fc4db21632075b56fc52cf95435c63374a)
2007-06-05 17:43:19 +10:00
Andrew Tridgell
a3048a8942
more unused code
...
(This used to be ctdb commit b01f226949965942c1d64ff3b4ecc0b835d4fecc)
2007-06-05 15:17:53 +10:00
Andrew Tridgell
efcacd76b7
remove an unused function
...
(This used to be ctdb commit 9a36d0e0c110c66fe72dce530318b9bc0ac1ce0b)
2007-06-05 15:17:24 +10:00
Andrew Tridgell
ee546dec81
merge from ronnie
...
(This used to be ctdb commit 531d7ea7aca3116e78a4502a1c8b75a3fb764a4f)
2007-06-04 22:13:59 +10:00
Ronnie Sahlberg
4be9a44ba7
add a control that lists all public ip addresses and the node that
...
currently serves each of them
(This used to be ctdb commit db9b89dc423b31079e5502323e5fd2bbaf82e1e9)
2007-06-04 21:11:51 +10:00
Andrew Tridgell
39ced972ae
make recovery daemon values tunable
...
(This used to be ctdb commit ec29dbf2f5110428df8b97801443ba7addf61353)
2007-06-04 20:22:44 +10:00
Ronnie Sahlberg
1ee8989bd4
merge from tridge
...
(This used to be ctdb commit 3bfede5d46dba5a3654dad9205534391bc339461)
2007-06-04 20:10:53 +10:00
Ronnie Sahlberg
79b54a624e
change the takeoverip/releaseip controls to pass a structure containing
...
both the nodenumber and the id of the node that has taken over that
address in addition to the public address itself so that all nodes
can learn which node is currently hosting each of the public addresses
(This used to be ctdb commit 53e9ff790387b85a36fa9c3c44cd4c95cbdf35da)
2007-06-04 20:07:37 +10:00
Andrew Tridgell
dbb2ec43dd
added tunables settable using ctdb command line tool
...
(This used to be ctdb commit 73d440f8cb19373cfad7a2f0f0ca4f963c57ff29)
2007-06-04 19:53:19 +10:00
Andrew Tridgell
f1d81386e6
- start moving tunable variables into their own structure
...
- fixed the test scripts to use a separate dbdir
(This used to be ctdb commit 396752e8908c48373564e915e2d49cfc9ff61eba)
2007-06-04 17:46:37 +10:00
Andrew Tridgell
a57991c0eb
remove some cruft that's not needed any more
...
(This used to be ctdb commit c4308805b997740b77e058c1a14b84cb400a7c30)
2007-06-04 17:23:55 +10:00
Ronnie Sahlberg
a3e4e204dc
add the ip address to the nodemap structure we pull from a server and
...
display the physical address of a node when we do a ctdb status
(This used to be ctdb commit 660bf30db713f0680acd3f74275ad603b35a0c24)
2007-06-04 13:26:07 +10:00
Ronnie Sahlberg
8175804757
print an error message to stdout if we failed to open the logfile for
...
the daemon
(This used to be ctdb commit fca953b1a3f3d6bf18264ecda1c75c68b60e2008)
2007-06-03 18:59:27 +10:00
Andrew Tridgell
518d410075
fixed a race condition in the handling of the recovery lock
...
(This used to be ctdb commit 3b98c5ad23662259b0eed399ab0c8037cf9b2b0b)
2007-06-03 10:29:14 +10:00
Andrew Tridgell
68963d865a
first step towards fixing "make test" with the new daemon system
...
(This used to be ctdb commit f95f7e4c93dea482e6cf0614b5415229a7c9f3fb)
2007-06-02 13:16:11 +10:00
Andrew Tridgell
ebf12646cf
- make specification of a recovery lock file compulsory
...
- die if someone other than the recmaster can get the recovery lock
(This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869)
2007-06-02 11:36:42 +10:00
Andrew Tridgell
4f72a202d9
- moved cmdline options that are only relevant to ctdbd into ctdbd.c
...
- fixed a valgrind error on failing to send a control
- don't mark node dead when already disconnected
- moved node list lock code into common code
(This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b)
2007-06-02 10:03:28 +10:00
Andrew Tridgell
27b0e323e6
disable realtime scheduler in event scripts
...
(This used to be ctdb commit 56225ac6fdfe754289bc7d5e0fc8d21c81a7aa8e)
2007-06-02 08:46:49 +10:00
Andrew Tridgell
5e5701a7b8
- make calling of recovered event script async
...
- shutdown sockets before calling shutdown script
(This used to be ctdb commit c5e099feef94a014a77742b6cc1d0afe78ef9da9)
2007-06-02 08:41:19 +10:00
Andrew Tridgell
7db1d04d5c
make the running of the takeover and release event scripts async, to prevent outages due to slow scripts
...
(This used to be ctdb commit 4189be97eee7ab2a50335c860f2fcd9566667d01)
2007-06-01 19:05:41 +10:00
Andrew Tridgell
bf3b740a1b
ctdb is GPL not LGPL
...
(This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960)
2007-05-31 13:50:53 +10:00
Andrew Tridgell
1e72af9c51
close sockets when we exec scripts
...
(This used to be ctdb commit 0fac2164db4279db2d7d376a34be05b890304087)
2007-05-30 15:43:25 +10:00
Andrew Tridgell
c833b06a35
we need to listen at transport initialise stage to find our own node number
...
(This used to be ctdb commit 4a9455dfbe95e53884b46ad26dba0c33e3432ba9)
2007-05-30 14:46:14 +10:00
Andrew Tridgell
3c062bb5ae
- use a CTDB_BROADCAST_ALL for the attach message so it goes to currently disconnected nodes
...
- start node monitoring only after transport starts
- check if a node is already disconnected in the node dead function
(This used to be ctdb commit b81ab6d507797282237768380c6f0e5a4c6519a5)
2007-05-30 14:35:22 +10:00