1
0
mirror of https://github.com/samba-team/samba.git synced 2025-01-12 09:18:10 +03:00
Commit Graph

454 Commits

Author SHA1 Message Date
Andrew Tridgell
174879621e add config option for disabling bans
(This used to be ctdb commit 153b911f7f957d4c564b04f5aa878033a02da9e4)
2007-10-15 13:22:58 +10:00
Ronnie Sahlberg
bdd67bba1e add a --single-public-ip argument to ctdbd to specify the ip address
used in single public ip address mode.
when using this argument, --public-interface must also be used.

add a vnn structure to the ctdb context to describe the single public ip 
address


update the killtcp control in the daemon that if a socketpair that is to 
be killed does not match a normal public address it checks if the 
destination address maches the single public ip address and if so uses 
that vnn structure from the ctdb context


this allows killtcp to kill also connections to the single public ip 
instead of only normal public addresses

(This used to be ctdb commit 5661ba17b91f62821dec1c76056c78b99752a90b)
2007-10-10 09:42:32 +10:00
Ronnie Sahlberg
80cd82f8e4 add a control to send gratious arps from the ctdb daemon
(This used to be ctdb commit 563819dd1acb344f95aabb4bad990b36f7ea4520)
2007-10-09 11:56:09 +10:00
Ronnie Sahlberg
72379ee3eb change async.private to async.private_data since private is a reserved
work in c++

(This used to be ctdb commit 79eb28f6cd5dcc30b04966d202a050eaf98a2552)
2007-09-26 14:25:32 +10:00
Andrew Tridgell
80100c3573 run monitoring more quickly when unhealthy and at startup
(This used to be ctdb commit ff1c205928e3ef5bcc6bf4e4b2122a19fa38d8f4)
2007-09-24 10:12:18 +10:00
Andrew Tridgell
c60988325d added support for persistent databases in ctdbd
(This used to be ctdb commit 3115090a0d882beca9d70761130b74bb0821f201)
2007-09-21 12:24:02 +10:00
Andrew Tridgell
3c0f61cb92 we don't need the is_loopback logic in ctdb any more
(This used to be ctdb commit 4ecf29ade0099c7180932288191de9840c8d90a9)
2007-09-13 10:45:06 +10:00
Andrew Tridgell
f3ae1cdb02 - use struct sockaddr_in more consistently instead of string addresses
- allow for public_address lines with a defaulting interface

(This used to be ctdb commit 29cb760f76e639a0f2ce1d553645a9dc26ee09e5)
2007-09-10 14:27:29 +10:00
Ronnie Sahlberg
4ac749bfa4 change the signature to ctdb_sys_have_ip() to also return:
a bool that specifies whether the ip was held by a loopback adaptor or 
not
 the name of the interface where the ip was held

when we release an ip address from an interface, move the ip address 
over to the loopback interface

when we release an ip address  after we have move it onto loopback, 
use 60.nfs to kill off the server side (the local part) of the tcp 
connection   so that the tcp connections dont survive a 
failover/failback

61.nfstickle,   since we kill hte tcp connections when we release an ip 
address   we no longer need to restart the nfs service in 61.nfstickle

update ctdb_takeover to use the new signature for ctdb_sys_have_ip

when we add a tcp connection to kill in ctdb_killtcp_add_connection()
check if either the srouce or destination address match a known public 
address

(This used to be ctdb commit f9fd2a4719c50f6b8e01d0a1b3a74b76b52ecaf3)
2007-09-10 07:20:44 +10:00
Ronnie Sahlberg
77ec4d5248 allow different nodes in the cluster to use different public_addresses
files
so that we can partition the cluster into different subsets of nodes 
which each serve a different subset of the public addresses

(This used to be ctdb commit 889e0fe69e4c88c6166282b12843b8d9727552d6)
2007-09-04 23:15:23 +10:00
Ronnie Sahlberg
8f819c6a0e get rid of the ctdb_vnn_list structure and just use a single list of
ctdb_vnn

(This used to be ctdb commit 7b9fd06321af17043136b1420b57284450ae7ba5)
2007-09-04 18:20:29 +10:00
Ronnie Sahlberg
cf45c5096c we cant have takeover_ctx hanging off ctdb since it is freed/recreated
everytime we release an ip.
this context is used to hold all resources needed when sending out 
gratious arps and tcp tickles during ip takeover.

we hang it off the vnn structure that manages that particular ip address 
instead   so that we can have multiple ones going in parallell

this bug (or the same bug in different shape) has probably been in ctdb 
for very very long   but is likely to be hard to trigger

(This used to be ctdb commit c58db1cadaba253b2659573673b28c235ef7db76)
2007-09-04 14:36:52 +10:00
Ronnie Sahlberg
d66d9cdd22 change debug output from vnn to pnn
change ctdb_daemon_send_message to take pnn as parameter isntead of vnn

(This used to be ctdb commit e352a2bbf9bb9a0b2c4f8329e8a529cf02414097)
2007-09-04 10:45:41 +10:00
Ronnie Sahlberg
0c91261340 change ctdb_send_message to take pnn as parameter instead of vnn
(This used to be ctdb commit 93dd4fba2e0fa6a011d15406652836785a974880)
2007-09-04 10:42:20 +10:00
Ronnie Sahlberg
157be530dd change ctdb_ctrl_getvnn to ctdb_ctrl_getpnn
(This used to be ctdb commit ef47cc4cd416065c69382e4d9e76c30a0a34e42f)
2007-09-04 10:38:48 +10:00
Ronnie Sahlberg
211b497818 change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn
change ctdb_ban_info.vnn to ctdb_ban_info.pnn

(This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a)
2007-09-04 10:33:10 +10:00
Ronnie Sahlberg
6f693bbcbd change server_id.vnn to server_id.pnn
(This used to be ctdb commit 26f2ee2b754a9271454412f05111a19b3013c6eb)
2007-09-04 10:21:51 +10:00
Ronnie Sahlberg
583b6e6ba6 change ctdb_get_vnn to ctdb_get_pnn
(This used to be ctdb commit 1e19930198c2bcc7ccb755e0ee51555fb823029a)
2007-09-04 10:18:44 +10:00
Ronnie Sahlberg
fc9d39c3a6 change ctdb_validate_vnn to ctdb_validate_pnn
(This used to be ctdb commit a4a1f41b69475b9dc16d8fd7f8965c32e96c32f0)
2007-09-04 10:09:58 +10:00
Ronnie Sahlberg
eb4cf6a686 change ctdb->vnn to ctdb->pnn
(This used to be ctdb commit 8c776e5707e503ec6586aae39ac6b3ea5a2fd2bc)
2007-09-04 10:06:36 +10:00
Ronnie Sahlberg
12ebb74838 change how we do public addresses and takeover so that we can have
multiple public addresses spread across multiple interfaces on each 
node.

this is a massive patch since we have previously made the assumtion that 
we only have one public address per node.

get rid of the public_interface argument.  the public addresses file 
now explicitely lists which interface the address belongs to

(This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)
2007-09-04 09:50:07 +10:00
Ronnie Sahlberg
7f02e16143 add async versions of the freeze node control and freeze all nodes in
parallell 

(This used to be ctdb commit f34e89f54d9f4380e76eb1b5b2385a4d8500b505)
2007-08-27 10:31:22 +10:00
Ronnie Sahlberg
801bdbdc80 add a control to pull the server id list off a node
(This used to be ctdb commit 38aa759aa88a042c31b401551f6a713fb7bbe84e)
2007-08-26 10:57:02 +10:00
Ronnie Sahlberg
6681da31df add an initial implementation of a service_id structure and three
controls to  register/unregister/check a server id.

a server id consists of TYPE:VNN:ID    where type is specific to the 
application.  VNN is the node where the serverid was registered and ID 
might be a node unique identifier such as a pid or similar.


Clients can register a server id for themself at the local ctdb daemon.
When a client dissappears   or when the domain socket connection for the 
client drops  then any and all server ids registered across that domain 
socket will also be automatically removed from the store.

clients can register as many server_ids as they want at the same time    
but each TYPE:VNN:ID must be globally unique.

Clients have the option of explicitely unregister a server id by using 
the UNREGISTER control.


Registration and unregistration can only be done by clients to the local 
daemon. clients can not register their server id to a remote node.


clients can check if a server id does exist on any ctdb node in the 
network by using the check control

(This used to be ctdb commit d44798feec26147c5cc05922cb2186f0ef0307be)
2007-08-24 15:53:41 +10:00
Ronnie Sahlberg
495a6403da change the api for managing callbacks to controls so that isntead of
passing it as a parameter we set the callback function explicitely from 
the caller if the ..._send() function returned a valid state pointer.

(This used to be ctdb commit aa939570662786455f63299b62c99882cff29d42)
2007-08-24 10:42:06 +10:00
Ronnie Sahlberg
f854b5f876 try out a slightly different api for controls where you provide a
callback function which is called upon completion (or timeout) of the 
control.

modify scanning of recmaster in the monitoring_cluster code to try the 
api out

(This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c)
2007-08-23 19:27:09 +10:00
Ronnie Sahlberg
8fd3df2553 hang the ctdb_req_control structure off the ctdb_client_control_state
struct  so that if we timeout a control we can print debug info such as 
what opcode failed and to which node

we dont need the *status parameter to ctdb_client_control_state

create async versions of the getrecmaster control

pass a memory context to getrecmaster

(This used to be ctdb commit 558b680c82f830fba82c283c78c2de8a0b150b75)
2007-08-23 13:00:10 +10:00
Ronnie Sahlberg
f6e0336b23 create a define to represent the 'invalid' generation id we used in two
places.

create a new helper function to generate new generation id values that 
know about the invalid id and avoids generating it.

update the ctdb status tool to know about the invalid generation id and 
print the string INVALID instead

(This used to be ctdb commit 4fbcd189543cb8a92227fdcd3d158472e558ccda)
2007-08-22 12:38:31 +10:00
Ronnie Sahlberg
8b06fc7284 change the structure used for node flag change messages so that we can
see both the old flags as well as the new flags (so we can tell which 
flags changed)

send the CTDB_SRVID_RECONFIGURE messages to connected nodes only, not to 
every node, connected or not, in the cluster.


in the handler inside the recovery daemon which is invoked for node flag 
change messages, only do a takeover_run() and redistribute the ip addresses IF it was the 
disabled or the unhealthy flags that changed. Also send out the cluster 
reconfigured message in this case.
If any of the other flags changed we dont need to do the takeover_run(0 
here since that will be done during recovery.

(This used to be ctdb commit 5549b2058e2c148a8ca9d419123acf3247bb8829)
2007-08-21 17:25:15 +10:00
Andrew Tridgell
46639ac19e merged new event script calling code from ronnnie
(This used to be ctdb commit bbacad61b3eee4276ffe44ed2a23949aca8152cf)
2007-08-20 11:10:30 +10:00
Ronnie Sahlberg
3b9d50f3ee change the now rather small /etc/ctdb/events script into a service
specific script /etc/ctdb/events.d/00.ctdb

get rid of CTDB_EVENTS_SCRIPT and --event-script

(This used to be ctdb commit 81ccfaf838e5772d4a58eb6a70224b7b39aba9f3)
2007-08-15 15:01:31 +10:00
Ronnie Sahlberg
4023576e50 call the service specific event scripts directly from the forked child
instead for from /etc/ctdb/events so that we can get better debugging 
output in the logs when something fails in the scripts

(This used to be ctdb commit 4ed96b768aea1611e8002f7095d3c4d12ccf77a3)
2007-08-15 14:44:03 +10:00
Andrew Tridgell
f03defff70 merge changes needed for samba4
(This used to be ctdb commit a7f80f78cd62401b3516da3640bf24d6362db872)
2007-08-15 09:03:58 +10:00
Ronnie Sahlberg
fca90ce3c3 updated ctdb tickle management
there is an array for each node/public address that contains tcp tickles

we send a TCP_ADD as a broadcast to all nodes when a client is added

if tcp tickles are removed, they are only removed immediately from the 
local node.
once every 20 seconds a node will push/broadcast out the tickle list for 
all public addresses it manages.   this will remove any deleted tickles 
from the remote nodes

(This used to be ctdb commit e3c432a915222e1392d91835bc7a73a96ab61ac9)
2007-07-20 15:05:55 +10:00
Ronnie Sahlberg
7b17afdfcd change the tickle list from one global list into an array per public
ip/node

once we have started sending all tickles for a specific ip   delete the 
entire array   so that the tickles dont remain forever in the ctdb 
server

add a control to send the full list of every tickle that is registered 
for a particular public ip/node

(This used to be ctdb commit d0eee33e44d3f8e26debbec21d41e2cbdbb520e6)
2007-07-20 10:06:41 +10:00
Ronnie Sahlberg
f09566a81a add a private_data field to the killtcp structure and let the system
specific routines populate it as it see fit when creating a 
capture socket.
pass this structure to read_tcp and close capture socket as parameter

(This used to be ctdb commit 79bbfcfb2223889126fe307d5bbfd24917da07ee)
2007-07-13 17:07:10 +10:00
Andrew Tridgell
1e14ecd176 - merge from ronnie
- cleaner handling of system capture socket

(This used to be ctdb commit d194a41a71b8466d0726dcbae3970a86386fcb3c)
2007-07-13 11:31:18 +10:00
Andrew Tridgell
d2a5af7eb8 fully save/restore scheduler parameters
(This used to be ctdb commit 59408eabe7515d49a6eef3b6fb2590a1cd1df956)
2007-07-13 09:35:46 +10:00
Andrew Tridgell
fc73bc5c24 added --nosetsched option to ctdbd
(This used to be ctdb commit 4cbbb88c1735c7d112e751e22da1c1c69e09bf4a)
2007-07-13 08:47:02 +10:00
Ronnie Sahlberg
a650497680 as an optimization for when we want to send multiple tickles at a time
let the caller create the sending socket and use a single socket instead 
of one new one for each tickle.
pass a sending socket to ctdb_sys_send_tcp()

ctdb_sys_kill_tcp is not longer used so remove it

set the socketflags for close on exec and nonblocking in the helper that 
creates the sockets instead of in the caller

add a helper to create a sending socket to send tickles from

(This used to be ctdb commit 469f3fb238a0674a2b48fdf1a7e657e32428178a)
2007-07-12 09:22:06 +10:00
Ronnie Sahlberg
823b7d4a5f rename killtcp->fd to killtcp->capture_fd
we might want to have two sockets attached to the killtcp structure
one for capturing and a second one for sending  so we dont have to 
create a new socket for each tickle we want to send

(This used to be ctdb commit b3e82ec38047bbec1edfd88ade264077d4cbd2ee)
2007-07-12 08:52:24 +10:00
Ronnie Sahlberg
76ab80104a make the ctdb tool use the killtcp control in the daemon instead of
calling killtcp directly

(This used to be ctdb commit d21e3e9cf11bdcba6234302e033d6549c557dd69)
2007-07-12 08:30:04 +10:00
Ronnie Sahlberg
1ed0c3a9f7 add daemon code for the new kill_tcp control
(This used to be ctdb commit 8fe4ae62255ecb2db36bea736ff17409ba6614c5)
2007-07-11 18:24:25 +10:00
Ronnie Sahlberg
e4db03f7e6 add a ctdb_ prefix to two public functions
(This used to be ctdb commit 32adee5426aa75ddcd4d648ef326ed03d5ff5c46)
2007-07-11 18:13:03 +10:00
Ronnie Sahlberg
aa080f66d9 first cut at a better and more scalable socketkiller
that can kill multiple connections asynchronously using one listening 
socket

(This used to be ctdb commit 22bb44f3d745aa354becd75d30774992f6c40b3a)
2007-07-11 17:43:51 +10:00
Ronnie Sahlberg
0c44e0ad46 add a ctdb_kill_tcp_callback() that will perform a kill tcp using a
background process

(This used to be ctdb commit dcfcaacff56347d94c244512eb72219b05ef9c3d)
2007-07-11 12:33:14 +10:00
Andrew Tridgell
f97f2946d2 minor back-merge from samba4
(This used to be ctdb commit c591f9b2d2847f440702e7264c7da2fd6d69f4be)
2007-07-10 18:13:47 +10:00
Andrew Tridgell
32de198fd3 update lib/replace from samba4
(This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)
2007-07-10 15:29:31 +10:00
Ronnie Sahlberg
e1f774a95b use the official iana number for ctdb and not 9001
(This used to be ctdb commit f72aeb5eadb0bda97d882b5a27562bfa1bb5f5a2)
2007-07-06 15:29:03 +10:00
Andrew Tridgell
bdf01ed7c0 - neaten up the command line for killtcp
- split out the event script code into a separate module
- get rid of the separate takeover directory

(This used to be ctdb commit 8ea2c923a3e2464200ff79bf2c3f1f89e6a93ad4)
2007-07-04 16:51:13 +10:00
Ronnie Sahlberg
a52f6760f3 add a new ctdb_sys_kill_tcp() function that kills (RST) the specified
connection

(This used to be ctdb commit 11a972f37d4ca7daf052b3b502620af05699bec4)
2007-07-04 13:53:22 +10:00
Ronnie Sahlberg
8f0a00b72b change the signature for ctdb_sys_send_ack() to ctdb_sys_send_tcp()
to make it possible to provide which seq/ack numbers to use and also 
whether the RST flag should be set.

update all callers to the new signature

(This used to be ctdb commit b694d7d4a6f3865a18bea8f484ba690e4ae7546c)
2007-07-04 13:32:38 +10:00
Ronnie Sahlberg
1cd8bc0c64 add a tuneable to control how long we wait after a successful recovery
before we alow another recovery to be initiated

(This used to be ctdb commit f3b43519423b7a73e6a2dd986bdf11203b8653cf)
2007-07-04 08:36:59 +10:00
Andrew Tridgell
6399cf9542 added code to kill registered clients on a IP release
(This used to be ctdb commit ca0243b544987ce0618a99ac87b4abf598991e93)
2007-06-19 03:54:06 +10:00
Andrew Tridgell
732353de5f - merged ctdb_store test from ronnie
- added DatabaseHashSize tunable
- added logging of events inside recovery (for timing)

(This used to be ctdb commit 3593cdb928b91e217faf1b3c537fa28dc82cdace)
2007-06-17 23:31:44 +10:00
Andrew Tridgell
044a2e04c4 - send tcp info to all connected nodes, not just vnnmap nodes
- use a non-blocking freeze when banned
- release all IPs when banned

(This used to be ctdb commit 070e85e532b33b792f85c3e72eee205d906aaf85)
2007-06-10 08:46:33 +10:00
Andrew Tridgell
18ae6e56f0 propogate flag changes to all connected nodes
(This used to be ctdb commit 711d1f7e20f1e98caaf08a57df0b1825ff6e97a0)
2007-06-09 21:58:50 +10:00
Andrew Tridgell
06a71762a4 some #include cleanups
(This used to be ctdb commit 1a07d87122d51a40cd8ad5fe13533298c26857cb)
2007-06-07 22:26:27 +10:00
Andrew Tridgell
96861466b7 there are now far too many controls for the controls statistics fields to be useful
(This used to be ctdb commit f5e188fc7e13b55b6b4081dcc74ea9614a76f9bb)
2007-06-07 18:07:38 +10:00
Andrew Tridgell
3e4d7bef23 get all the tunables at once in recovery daemon
(This used to be ctdb commit 8e60be6c22aab145e68b16ede5f32f4430c2af93)
2007-06-07 18:05:25 +10:00
Andrew Tridgell
23bf62fe30 added admin commands to ban/unban nodes
(This used to be ctdb commit 4dad04172e7e4955b5bf6444a85b19901c9683ad)
2007-06-07 16:34:33 +10:00
Andrew Tridgell
2ed57a9ae1 implement a scheme where nodes are banned if they continuously caused the cluster
to start a recovery session. The node is banned from the cluster for the RecoveryBanPeriod (default of 5 minutes)

(This used to be ctdb commit 4ad43dd07f526b6002477177fbf55483246c2c0c)
2007-06-07 15:18:55 +10:00
Andrew Tridgell
9754d16d48 merged admin enable/disable change from ronnie
(This used to be ctdb commit df17b69dfd83a98f9c711994c7dd51ad2cc0ab8a)
2007-06-07 11:15:22 +10:00
Ronnie Sahlberg
9ff733c784 add a control to permanently enable/disable a node
(This used to be ctdb commit d66fdba16ca22f62ddac6882a17614879b08a798)
2007-06-07 09:16:17 +10:00
Andrew Tridgell
81fad8636f added timeouts in all event scripts
(This used to be ctdb commit d986c91a607ed7c7d4869ea786b5cdf80e7862f1)
2007-06-06 13:45:12 +10:00
Andrew Tridgell
af8834dd02 added health monitoring logic to ctdb, so a node loses its public IP address if one of the sybsystem event scripts reports a problem
(This used to be ctdb commit c7a089256d86cec21097453bce5acbccee87413f)
2007-06-06 10:25:46 +10:00
Andrew Tridgell
be3a00bd73 clean out some more cruft
(This used to be ctdb commit ad16c5fe2748b48a6f6c79976359d56d9bed33f4)
2007-06-05 17:57:07 +10:00
Andrew Tridgell
ac55bc4166 first step in health monitoring of cluster nodes. When not healthy they will be marked disabled
(This used to be ctdb commit d3dbd9fc4db21632075b56fc52cf95435c63374a)
2007-06-05 17:43:19 +10:00
Andrew Tridgell
ee546dec81 merge from ronnie
(This used to be ctdb commit 531d7ea7aca3116e78a4502a1c8b75a3fb764a4f)
2007-06-04 22:13:59 +10:00
Ronnie Sahlberg
4be9a44ba7 add a control that lists all public ip addresses and which node that
currently serves it

(This used to be ctdb commit db9b89dc423b31079e5502323e5fd2bbaf82e1e9)
2007-06-04 21:11:51 +10:00
Andrew Tridgell
39ced972ae make recovery daemon values tunable
(This used to be ctdb commit ec29dbf2f5110428df8b97801443ba7addf61353)
2007-06-04 20:22:44 +10:00
Ronnie Sahlberg
1ee8989bd4 merge from tridge
(This used to be ctdb commit 3bfede5d46dba5a3654dad9205534391bc339461)
2007-06-04 20:10:53 +10:00
Ronnie Sahlberg
79b54a624e change the takoverip/releaseip controls to pass a structure containing
both the nodenumber and the id of the node that has taken over that 
address in addition to the public address itself    so that all nodes 
can learn which node is currently hosting each of the public addresses

(This used to be ctdb commit 53e9ff790387b85a36fa9c3c44cd4c95cbdf35da)
2007-06-04 20:07:37 +10:00
Andrew Tridgell
dbb2ec43dd added tunables settable using ctdb command line tool
(This used to be ctdb commit 73d440f8cb19373cfad7a2f0f0ca4f963c57ff29)
2007-06-04 19:53:19 +10:00
Andrew Tridgell
f1d81386e6 - start moving tunable variables into their own structure
- fixed the test scripts to use a separate dbdir

(This used to be ctdb commit 396752e8908c48373564e915e2d49cfc9ff61eba)
2007-06-04 17:46:37 +10:00
Andrew Tridgell
a57991c0eb remove some cruft thats not needed any more
(This used to be ctdb commit c4308805b997740b77e058c1a14b84cb400a7c30)
2007-06-04 17:23:55 +10:00
Ronnie Sahlberg
a3e4e204dc add the ip address to the nodemap structure we pull from a server and
display the physical address of a node when we do a ctdb status

(This used to be ctdb commit 660bf30db713f0680acd3f74275ad603b35a0c24)
2007-06-04 13:26:07 +10:00
Andrew Tridgell
c5e4ce360a make test now works again
(This used to be ctdb commit 439d87bbb9840f82937e51aff4fe2b80160878c6)
2007-06-02 13:31:36 +10:00
Andrew Tridgell
ebf12646cf - make specification of a recovery lock file compulsory
- die if someone other than the recmaster can get the recovery lock

(This used to be ctdb commit a827d0d0e430ca8ad5d521367e45097185492869)
2007-06-02 11:36:42 +10:00
Andrew Tridgell
4f72a202d9 - moved cmdline options that are only relevant to ctdbd into ctdbd.c
- fixed a valgrind error on failing to send a control

- don't mark node dead when already disconnected

- moved node list lock code into common code

(This used to be ctdb commit bcc0432d0fea7ef223f82ccee81cf35c18144b1b)
2007-06-02 10:03:28 +10:00
Andrew Tridgell
27b0e323e6 disable realtime scheduler in event scripts
(This used to be ctdb commit 56225ac6fdfe754289bc7d5e0fc8d21c81a7aa8e)
2007-06-02 08:46:49 +10:00
Andrew Tridgell
5e5701a7b8 - make calling of recovered event script async
- shutdown sockets before calling shutdown script

(This used to be ctdb commit c5e099feef94a014a77742b6cc1d0afe78ef9da9)
2007-06-02 08:41:19 +10:00
Andrew Tridgell
7db1d04d5c make the running of the takeover and release event scripts async, to prevent outages due to slow scripts
(This used to be ctdb commit 4189be97eee7ab2a50335c860f2fcd9566667d01)
2007-06-01 19:05:41 +10:00
Andrew Tridgell
bf3b740a1b ctdb is GPL not LGPL
(This used to be ctdb commit 8624378010d1c2a1438e1e701339dfba7276f960)
2007-05-31 13:50:53 +10:00
Andrew Tridgell
1e72af9c51 close sockets when we exec scripts
(This used to be ctdb commit 0fac2164db4279db2d7d376a34be05b890304087)
2007-05-30 15:43:25 +10:00
Andrew Tridgell
8ed48aac51 don't start the transport connecting to the other nodes until after the startup event script has run
(This used to be ctdb commit afca3cc74211aa2e18b1f74d36b2add8dffcfdc7)
2007-05-30 13:26:50 +10:00
Andrew Tridgell
2d9e0ad56a use /etc/services for ctdb
(This used to be ctdb commit 64bf6964ff33320c5351337c7f8ed4da5bd71275)
2007-05-29 15:15:00 +10:00
Andrew Tridgell
1140d5a20a fixed more warnings on 64 bit boxes
(This used to be ctdb commit 2f6eae476203f8a8b28e083553204c01f224c8a5)
2007-05-29 13:58:41 +10:00
Andrew Tridgell
bc891232b6 fixed some debug messages
(This used to be ctdb commit 037f0149c0c0e65af0a1669b9a52586129e4b48f)
2007-05-29 13:48:30 +10:00
Andrew Tridgell
edcaa0d6a0 clean shutdown in ctdb - release all our IPs
(This used to be ctdb commit 2f196cb6a86eb85205d7de1c4cadd4e1e701c06f)
2007-05-29 13:33:59 +10:00
Andrew Tridgell
ead091449b call the event script on recovery too
(This used to be ctdb commit 8c43a91cbd6e502c93bd6cc51df1272eae426709)
2007-05-29 12:55:24 +10:00
Andrew Tridgell
dfadb60318 - moved ctdbd specific options to ctdbd.c from cmdline.c
- allow a event script to be specified that will take IPs, release
  IPs, and handle recovery in system specific ways

- redirect stderr in subcommands to the log

(This used to be ctdb commit de0fc9ba370db781f9c46406ed180c8211946c7a)
2007-05-29 12:49:25 +10:00
Andrew Tridgell
ccf4d78e04 - renamed ctdb_control utility to ctdb
- use -n to specify node number in ctdb utility

- change 'ctdb status' to 'ctdb statistics'

- added 'ctdb status' which shows status

- added netmask to public IPs, so you don't try a takeover on a
  foreign network

- cleaned up tools/ctdb_control.c a lot

- generate usage message at runtime

(This used to be ctdb commit 28de71c03ace7d32a9fd9882fabbd5d668b97656)
2007-05-29 12:16:59 +10:00
Andrew Tridgell
9cc3ce8554 automatic cleanup of tcp tickle records
(This used to be ctdb commit ede79b571bf89b89f1b8394f262ca0689f8c65f3)
2007-05-28 00:34:40 +10:00
Andrew Tridgell
d41290fbae added code to ctdb to send a tcp 'tickle' ack when we takeover an
IP. A raw tcp ack is sent for each tcp connection held by clients
before the IP takeover.

These acks have a deliberately incorrect sequence number, and should
cause the windows client to send its own ack which will in turn cause
a tcp reset and thus cause windows clients to much more quickly
reconnect to the new node.

(This used to be ctdb commit eef38bfe8461b47489d169c61895d6bb8a8f79a1)
2007-05-27 15:26:29 +10:00
Andrew Tridgell
647540253e tweak timeouts
(This used to be ctdb commit 54a90797469f56d796efd82e9294efff3c5dabcc)
2007-05-27 09:43:25 +10:00
Andrew Tridgell
cc4d8102cd moved system specific ip code to system.c
(This used to be ctdb commit 9de9e4ccda9665108baac12a8716b189d26340b1)
2007-05-26 14:01:08 +10:00
Andrew Tridgell
9e61a5bd77 send a message to clients when an IP has been released
(This used to be ctdb commit 8b7ab0b00253462593d368052c2cb10a385b4e63)
2007-05-26 00:05:30 +10:00
Andrew Tridgell
31053286c5 keep sending ARPs for 2 minutes, every 5 seconds
(This used to be ctdb commit d5223f2eed4a762b93a101c720286568578ce7ed)
2007-05-25 21:27:26 +10:00
Andrew Tridgell
7a9e40b288 consider a node dead after 6 seconds, not 15
(This used to be ctdb commit b055907f0bd2fa0e83bd84e49039fa868905b941)
2007-05-25 20:00:06 +10:00
Andrew Tridgell
56e3eed3d1 added IP takeover logic for public IPs to ctdb
(This used to be ctdb commit 374adb729472670f35cef41269b8719f49c0de0e)
2007-05-25 17:04:13 +10:00
Ronnie Sahlberg
2b6c39a0af add controls to take over and release an ip address
add sending of grat arp     both normal grat arp (request) and also
unsolicited grat arp replies

(This used to be ctdb commit 7305c00c21c30bdbafc3722a018513378bd307e6)
2007-05-25 13:05:25 +10:00
Andrew Tridgell
7596347844 make ctdbd realtime if possible
(This used to be ctdb commit 8852f6cca52b64a5239c83ab7c6a99ae4edb2597)
2007-05-24 14:52:10 +10:00
Andrew Tridgell
70912e2b0c added automatic vacuuming of empty records during recovery
(This used to be ctdb commit f9181a784ac7009df5e9c996f4e0c3e99098b59a)
2007-05-23 17:21:14 +10:00
Andrew Tridgell
74bf76ca10 merge from ronnie
(This used to be ctdb commit 267481b67152bc5885884d223085aa9ef5fe73bd)
2007-05-23 14:50:41 +10:00
Andrew Tridgell
76b2822340 - startup frozen, and do an initial recovery
- fixed a bug in traverse
- get a lock on the node list file in the recmaster recovery daemon

(This used to be ctdb commit 162a5647535ad1cb3e8e5d4042a2784365fb1913)
2007-05-23 14:35:19 +10:00
Andrew Tridgell
9f7a70657f start ctdb frozen, and let the election sort things out. This prevents a race on startup
(This used to be ctdb commit b788ed3fa64e31e517b4e602e8bd3ae7201ecddd)
2007-05-23 12:23:07 +10:00
Ronnie Sahlberg
e989a1bac8 add controls to enable/disable the monitoring of dead nodes
(This used to be ctdb commit 79d29c39bb81feb069db3fc6d3d392c1e75a4d13)
2007-05-21 09:24:34 +10:00
Andrew Tridgell
d549f1e1a3 merge from ronnie
(This used to be ctdb commit 985d718e03510398b9a5cfdf6a4d559a90738a11)
2007-05-19 17:21:58 +10:00
Ronnie Sahlberg
02a9f1b0a0 use ctdb_dead_node() instead of reimplementing the same code again
this leaves only one single function where a node is marked as dead
instead of two places

(This used to be ctdb commit aa764ea26cc26d5c1ae188105236da603576f45b)
2007-05-19 16:59:10 +10:00
Andrew Tridgell
a14fd9d29c make sure we don't increment rx_cnt for redirected packets, or for packets that have been requeued after a lockwait
(This used to be ctdb commit 92e5569407dba173a27e9645b4339ce3e2c00520)
2007-05-19 13:45:24 +10:00
Ronnie Sahlberg
9f7b9faf64 add a node->tx_cnt counter
only send keepalive packets if the count is zero

(This used to be ctdb commit 2cbd424231caccf0a531cf6501761115efe68f3e)
2007-05-19 10:20:19 +10:00
Andrew Tridgell
28f2fc669b a better way to resend calls after recovery
(This used to be ctdb commit 444f52e134fc22aaf254d05c86d8b357ded876f4)
2007-05-19 00:56:49 +10:00
Andrew Tridgell
049e1504ee timeout pending controls immediately when a node becomes disconnected
(This used to be ctdb commit 93c4b16f4efef383ba8db83953019ef4821613e0)
2007-05-18 23:48:29 +10:00
Andrew Tridgell
346dfc1bef - up rx_cnt on all packet types
- notice when a node becomes available again

(This used to be ctdb commit e05110dd6112e81f224937dfd7370d963ce9531a)
2007-05-18 23:23:36 +10:00
Ronnie Sahlberg
db4c479568 add dead node detection so that if a node does not generate any
keepalive traffic for x seconds   it is deemed dead


this triggers a recovery after a while if a ctdbd has been STOPPED    
but it doesnt recover automatically when the node reappears

(This used to be ctdb commit d6324afe0d13b5e21d06e347caca433c6b36a32a)
2007-05-18 19:19:35 +10:00
Andrew Tridgell
49fe66713f - don't try to send controls to dead nodes
- use only connected nodes in a traverse

(This used to be ctdb commit 9a676dd5d331022d946a56c52c42fc6985b93dbc)
2007-05-17 23:23:41 +10:00
Andrew Tridgell
874fd5c2f7 removed the CTDB_CTRL_FLAG_NOREQUEUE flag
(This used to be ctdb commit 366e849f6f350eda78d79cf1ea55c2637e605c86)
2007-05-17 14:10:38 +10:00
Ronnie Sahlberg
f4738f9c41 we no longer pass lmaster across during pulldb so dont print it from
catdb either

(This used to be ctdb commit b57d60f4789ea7f0dd69c93f6629d8742e182576)
2007-05-17 12:07:29 +10:00
Ronnie Sahlberg
cc760cf13a add a control to shutdown/kill a node
(This used to be ctdb commit 3802f7304fd59d56062c855987e2561753e85a69)
2007-05-17 10:45:31 +10:00
Ronnie Sahlberg
d6ed77468d merge from tridge
(This used to be ctdb commit 0c6dc471e33e80db00a2b006262c4107f39fa023)
2007-05-16 18:44:51 +10:00
Andrew Tridgell
c105f6d789 - merge from ronnie
- fixed a memory leak found by dmitry

(This used to be ctdb commit ae87bf0005666b50850161c3843d6bc7cb5c8971)
2007-05-16 18:10:26 +10:00
Ronnie Sahlberg
f4056d2e28 remove a prototype we no longer need
(This used to be ctdb commit 4a11373ec5e8196cf430f18f6171915f790f794b)
2007-05-16 14:45:43 +10:00
Ronnie Sahlberg
a4ebb6d5ef if a caller specifies a timeout when calling a control, it makes no
sense to have the daemon requeue the packets if they timeout or fail to 
deliver to the remote node

(This used to be ctdb commit 9fb753046787190970654aeb937e96685ac53184)
2007-05-16 12:34:30 +10:00
Ronnie Sahlberg
4b8ddfccad merge from tridge
(This used to be ctdb commit 8d424b41d6cf2973b28a749d1b8e6a028dad9ffe)
2007-05-16 11:12:28 +10:00
Andrew Tridgell
a5198559c9 moved the recovery daemon into the main ctdbd and enable it by default
(This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0)
2007-05-15 15:13:36 +10:00
Ronnie Sahlberg
0d71b6d1e6 merge from tridge
(This used to be ctdb commit 0697f59a044deeab126a39bff97bcd5c1101298e)
2007-05-15 10:28:41 +10:00
Andrew Tridgell
c6afe22b92 added a control to get the local vnn
(This used to be ctdb commit 0b109f574b710f290372512d0694290ea7cd4368)
2007-05-15 10:17:16 +10:00
Andrew Tridgell
cf1056df94 added a -i switch to run ctdbd without forking
(This used to be ctdb commit 327df14ecd58f405fbe8b38afa2ee54a8dd0a2e4)
2007-05-15 09:44:33 +10:00
Ronnie Sahlberg
ed466e20b6 remove the control to bump the rsn since we dont need it anymore
(This used to be ctdb commit a646b6d77bd8adf6c986259c534a05400c4bde11)
2007-05-14 08:03:48 +10:00
Ronnie Sahlberg
4f7fc688f7 merge from tridge
(This used to be ctdb commit 7bca79ad6357149fd7c6b28ce4b05de3d223a7de)
2007-05-14 06:25:15 +10:00
Andrew Tridgell
81826da2df added error messages in ctdb_control replies
(This used to be ctdb commit bd848f5b760e6b2a73ebfc67fd8adb3c31479fb5)
2007-05-12 21:25:26 +10:00
Andrew Tridgell
36ccc10389 make sure we ignore requeued ctdb_call packets of older generations except for packets from the client
(This used to be ctdb commit facab105fbd7fe50f96bdd763ae50ddc54fbdacc)
2007-05-12 18:08:50 +10:00
Andrew Tridgell
2c90d9e794 show total frozen/recoving in status
(This used to be ctdb commit 0d0eb66a63fe6912edb85bf7387ac76acb70babd)
2007-05-12 15:51:08 +10:00
Andrew Tridgell
9cf77dd23f separate out the freeze/thaw handling from recovery
(This used to be ctdb commit 0b0640bd8b8334961f240e0cf276ac112cd6e616)
2007-05-12 15:15:27 +10:00
Andrew Tridgell
74a799a83b added lockwait child code for entering recovery mode. A child processes holds lockall locks for the entire recovery process
(This used to be ctdb commit f892f30def75b0d964c35eae38c4cf675597dd28)
2007-05-12 14:34:21 +10:00
Ronnie Sahlberg
9ec3024287 add a control to bump the rsn number for all records in a database
use this control from the recovery daemon to ensure that the recmaster 
always have a higher rsn than andy other node for the records after 
recovery completes

(This used to be ctdb commit 6fb6a8b981a804bfcc460c4481c51c7c647230f6)
2007-05-11 10:36:47 +10:00
Andrew Tridgell
f8765b19bf - got rid of the complex hand marshalling in the recovery controls
- fixed the re-send of ctdb calls after a generation change

- fixed a reqid idr leak in controls

- removed the write_record test code

- use the new nonblock lockall code to prevent ctdbd from ever doing a
  blocking lock that could deadlock with smbd

- moved more of the recovery controls into ctdb_recover.c

(This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec)
2007-05-10 17:43:45 +10:00
Andrew Tridgell
15bc97cdaa better timeout handling for calls, controls and traverses
(This used to be ctdb commit 63346a6c59d4821b4c443939b5d88db8cd20f5fe)
2007-05-10 14:06:48 +10:00
Andrew Tridgell
31cd92dc7e merge from ronnie
(This used to be ctdb commit 92b7a849565730744c75a7fb776173554e9f57bf)
2007-05-10 13:15:58 +10:00
Andrew Tridgell
682df74d59 separate the wire format and internal format for the vnn_map
(This used to be ctdb commit 9a71718d87c5162f1423d85c2e86a01f6771925e)
2007-05-10 08:13:19 +10:00
Andrew Tridgell
d2a90cc5a5 merge from ronnie
(This used to be ctdb commit f67a4842e7b1efb2ad61c41e4895c7698e564bf3)
2007-05-09 11:54:37 +10:00
Ronnie Sahlberg
6929739b7f add a command line flag to ctdbd to start a recovery daemon.
update the recovery test script to start all ctdb daemons with a 
recovery daemon

(This used to be ctdb commit 47794e16df285cacefc30208d892d931a6e46b96)
2007-05-09 09:59:23 +10:00
Andrew Tridgell
fdb8144e62 fixed a problem with the number of timed events growing without bound with the new seqnum code
(This used to be ctdb commit 6109ae3dae8d93c93a2dc76cc561ea6e21458aa6)
2007-05-08 21:16:29 +10:00
Ronnie Sahlberg
39d81cffb1 recovery daemon with recovery master election
election is primitive, it elects the lowest vnn as the recovery master

two new controls, to get/set recovery master for a node



to use recovery daemon,   start one  
./bin/recoverd --socket=ctdb.socket*
for each ctdb daemon


it has been briefly tested by deleting and adding nodes to a 4 node 
cluster but needs more testing

(This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3)
2007-05-07 06:51:58 +10:00
Ronnie Sahlberg
a9657f6aa5 add new controls to get and set the recovery master node of a daemon
i.e. which node is "elected" to check for and drive recovery

(This used to be ctdb commit d577093eb4b619392c71ab5ce81e8c02565d93f0)
2007-05-07 05:02:48 +10:00
Ronnie Sahlberg
25edbc9a50 add a control to get the pid of a daemon.
this makes it possible to kill a specific daemon in the recover test 
script

(This used to be ctdb commit 2fa394b4c80988cb1a6d04b236ec64cc9d9e8a40)
2007-05-06 04:31:22 +10:00
Ronnie Sahlberg
2e64727079 merge from tridge
(This used to be ctdb commit 8648104f8d76d22427c14422b126f7e979cc2d95)
2007-05-05 16:51:34 +10:00
Andrew Tridgell
9636c97c5a show number of connected clients in status output
(This used to be ctdb commit 99765bbe327bfe9c43415f4943281458f25be51b)
2007-05-05 14:09:46 +10:00
Ronnie Sahlberg
5cb817f031 split the vnn broadcast address into two
one broadcast address for all nodes
and one broadcast address for all nodes in the current vnnmap

update all useage of the old flag to now only broadcast to the vnnmap
except for tools/ctdb_control where it makes more sense to broadcast to 
all nodes

(This used to be ctdb commit dfb65b88cf67ad9d61268c4b47a6d8ae346f47df)
2007-05-05 13:17:26 +10:00
Andrew Tridgell
410d41480a added a dumpmemory control, used to find memory leaks
(This used to be ctdb commit 44fdafaf421e3e906796d529aed2f7c5df201b94)
2007-05-05 11:03:10 +10:00
Andrew Tridgell
adc64aed0a - fixed a crash bug after client disconnect in ctdb_control
- added total memory used to ctdb_control status output

(This used to be ctdb commit a99ffe4372edc63d83d8c8ebf9a60b3413301f5a)
2007-05-05 08:33:35 +10:00
Andrew Tridgell
d8f4e6b209 - added counters for controls in ctdb_control status
(This used to be ctdb commit 858061372fc9902837a1a5b8bcfc0ada58eec193)
2007-05-05 08:11:54 +10:00
Ronnie Sahlberg
1725fcf294 merge from tridge
(This used to be ctdb commit 62574808ef4dcb76760f1dd2496fbe8e34197c23)
2007-05-05 01:22:30 +10:00
Andrew Tridgell
fccc585f5a added seqnum propogation code to ctdb
(This used to be ctdb commit be2572b1b09eaaa1ea6a726d60f16996f9407d13)
2007-05-04 22:18:00 +10:00
Ronnie Sahlberg
508cafd17e merge from tridge
(This used to be ctdb commit 6c8b90cedc67daa89d54db5268fde18bfc20abaf)
2007-05-04 17:05:28 +10:00
Andrew Tridgell
ed3e847785 added a ctdb control for enabling the tdb seqnum
(This used to be ctdb commit c66920d9fb08a4a33418e2c1dcf1fc320fba3761)
2007-05-04 15:33:28 +10:00
Ronnie Sahlberg
7dfdab1b9d recovery daemon
this program is a client to the local ctdb daemon

every second it pulls all vnnmap and nodemaps from all nodes that are 
available and checks if a recovery is required

a recovery is required if :
* all nodes do NOT have an identical vnnmap and generation
* all nodes do NOT have an identical nodemap
* there are active nodes that are NOT in the nodemap
* there are nodes in the nodemap that are NOT active

During recovery,  the recovery tool will also make sure that all nodes 
know about and have created all databases.

(This used to be ctdb commit 2f2650467bac7e8954de7c17cb34f46b0bdbcd26)
2007-05-04 15:21:40 +10:00
Andrew Tridgell
f2fd53056d nicer interface to ctdb traverse
(This used to be ctdb commit e5ce866dcc5037b5069e42bf1e168b646f007b01)
2007-05-04 12:18:39 +10:00
Andrew Tridgell
e752f3bd97 - changed the REQ_REGISTER PDU to be a control
- allow controls to know which client invoked them

- added a client_id to clients, so they can be identified remotely

- added the ability to remove registered srvids

- in the list_keys code, register a temp srvid, then remove it afterwards

(This used to be ctdb commit 29603c51cc6d81362532cd8e50f75c8360c5f5ef)
2007-05-04 11:41:29 +10:00
Ronnie Sahlberg
2b1714a521 update getvnnmap control to take a timeout parameter
dont explicitely free the vnnmap pointer in the getvnnmap control  this 
is freed by the mem_ctx instead

add code to the recoverd to detect when/if recovery is required
veiry that the number of active nodes, the nodemap and the vnn map is 
consistent across the entire cluster and if not   trigger a recovery 
(which right now just prints "we need to do recovery" to the screen.

(This used to be ctdb commit 2b0a207a3748bdb3394dc9fd0d1c344ee1bb0bb5)
2007-05-04 09:45:53 +10:00
Ronnie Sahlberg
ae73784c28 change the signature for ctdb_ctrl_getnodemap() so that a timeout
parameter is added.
change ctdb_get_connected_nodes in the same way

(This used to be ctdb commit d85f23bcf4c1230225abb2f4a053c70b68d469aa)
2007-05-04 09:01:01 +10:00
Ronnie Sahlberg
ebc478749b start working on a recovery daemon
change ctdb_control so it takes a timeval pointer as argument.
this is the timeout. if the node has not responded within hte timeout
ctdb_control will return an error instead of hanging.
if the timeval pointer is NULL then the call will block indefinitely if 
there is no response.

this is used for now in the createdb control   but all the helpers 
ctdb_ctrl_* should probably be updated to take a timeout parameter as 
well.

(This used to be ctdb commit 1fe64b04869b17dbf123851b0fe09df8d28a6211)
2007-05-04 08:30:18 +10:00
Ronnie Sahlberg
63f42d3ff8 merge from tridge
(This used to be ctdb commit fb8ac93c7dfc11e774ef1ce05b0d0df1de56a621)
2007-05-03 17:16:38 +10:00
Andrew Tridgell
60b42276eb first version of traverse is working
(This used to be ctdb commit ecac90cee389a6fa0e9b1efba521e098a24d323f)
2007-05-03 17:12:23 +10:00
Ronnie Sahlberg
14724b504b cleanup the control "write record"
(This used to be ctdb commit 4dd5c26a21a5dc2b2f76eb23cfeb4df82ba4e956)
2007-05-03 16:18:03 +10:00
Andrew Tridgell
486c6b4fce merged from ronnie
(This used to be ctdb commit 57a80110ddfd202f8de37297db76dc43a064e476)
2007-05-03 13:53:54 +10:00
Ronnie Sahlberg
d88154b24a cleanup getnodemap
(This used to be ctdb commit 3867ccf71a167fb82dbc5a3f03f968a325a0c70b)
2007-05-03 13:30:38 +10:00
Ronnie Sahlberg
633ae7f346 fixup getdbmap control so it looks a bit nicer
(This used to be ctdb commit 78a4d61cb78da20af5210488e685c91bc3023e90)
2007-05-03 13:07:34 +10:00
Andrew Tridgell
472b96d6d3 first stage of efficient non-blocking ctdb traverse
(This used to be ctdb commit 4c23e6f26bde421bb56b55de9d6cd3e319b2be40)
2007-05-03 12:16:03 +10:00
Ronnie Sahlberg
27880056db break set/get vnn map out from ctdb_control and put it in ctdb_recover.c
for the time being

remove all the [de]marshalling and just pass a structure around instead

(This used to be ctdb commit b1169555ab7015976c0135ff51121cc238f5887c)
2007-05-03 11:06:24 +10:00
Ronnie Sahlberg
768eb0f763 merge from tridge
(This used to be ctdb commit 17b73a811009588f836c3f9fd1b775d9d504d30c)
2007-05-02 22:00:48 +10:00
Ronnie Sahlberg
206fb1fd3b add a recover test change alignment for the pull/push db structures
(This used to be ctdb commit 0eb45623ca103e69765ed577ae02e7f8ca777e37)
2007-05-02 21:00:02 +10:00
Andrew Tridgell
317ad52758 added a builtin fetch function to support samba3 unlocked fetch
(This used to be ctdb commit 8c57a8355a94a7d714b9bec98533bc40a2bc4684)
2007-05-02 15:11:11 +10:00
Ronnie Sahlberg
d20990c2b6 add a control to create a database
(This used to be ctdb commit 74e489c6737cc79537c7812ea82daafb1b363ec2)
2007-05-02 12:43:35 +10:00
Ronnie Sahlberg
599fa31266 update some calls to ctdb_control() that were still using the old
signature (flags field)

update some calls to ctdb_get_config() to use the new name 
ctdb_ctrl_get_config()

change #include "talloc/talloc.h" to #include "lib/talloc/talloc.h" in 
lib/events/events.h

(This used to be ctdb commit d2cdd87037b9f0c387228d7d4743da4869929c93)
2007-05-02 11:02:04 +10:00
Ronnie Sahlberg
3a891c6676 merge with tridges tree to resolve all conflicts
(This used to be ctdb commit 0f7c6c580ef0de60af68fd22bce36c0c0b2515b0)
2007-05-02 10:53:29 +10:00
Ronnie Sahlberg
51630f9b12 add an initial recovery control to perform samba3 style recovery
this is not optimized at all and copies/merges all records between 
databases instead of only those records for which a certain node is 
lmaster.  (step 7 should later be enhanced to a, delete the database, 
push only those records for which the node is lmaster)

(This used to be ctdb commit 509d2c71169e96a8610f9db91293dc7a73c2cc10)
2007-05-02 10:20:34 +10:00
Andrew Tridgell
169f129404 merge latest versions of lib/replace, lib/talloc, lib/tdb and lib/events into ctdb bzr tree
(This used to be ctdb commit eaea8a9fa8d2f5e08f3af619fa1008a663f39053)
2007-05-02 07:32:04 +10:00
Andrew Tridgell
2dc24c7d56 added a hopcount in ctdb_call
(This used to be ctdb commit 36d838801a2a2008c50322cdbfff65a308b1cd1a)
2007-05-01 13:25:02 +10:00
Andrew Tridgell
9366120d92 changed the way set_call and attach are done so that you can safely
attach to databases after the protocol has started. The daemon
broadcasts information on new databases to the other daemons.

This also eliminates the need for the client to know about the hash
between db name and db_id.

(This used to be ctdb commit 3bad91a9d987d4c09fe3322eac23c2733660ad08)
2007-04-30 15:31:40 +02:00
Andrew Tridgell
f455d3f44b saner logfile code
testing of ctdbd

(This used to be ctdb commit 05789da5818f8b20f04779b0df5125914d9047f6)
2007-04-29 22:42:23 +02:00
Ronnie Sahlberg
eacfcaf437 add push/pull of tdb and a control to copy a tdb from one node to
another node

(This used to be ctdb commit c313daff4c1362cd08a9f682ce04cab312678038)
2007-04-30 00:58:27 +10:00
Andrew Tridgell
e21f69107f yay! finally fixed the bug that volker, ronnie and I have been chasing
for 2 days.

The main bug was in smbd, but there was a secondary (and more subtle)
bug in ctdb that the bug in smbd exposed. When we get send a dmaster
reply, we have to correctly update the dmaster in the recipient even
if the original requst has timed out, otherwise ctdbd can get into a
loop fighting over who will handle a key.

This patch also cleans up the packet allocation, and makes ctdbd
become a real daemon.

(This used to be ctdb commit 59405e59ef522b97d8e20e4b14310a217141ac7c)
2007-04-29 16:19:40 +02:00
Ronnie Sahlberg
f67a79ad8e merge from tridge
(This used to be ctdb commit a84e9b47a87fc7d4756b4a179aa2ea0bc7c54c78)
2007-04-29 23:49:27 +10:00
Ronnie Sahlberg
77ce5750b2 add a new "recovery mode" field to ctdb.
while recovery is in progress  the daemon will discard all CTDB_REQ_CALL 
and rely on clients retransmitting them

add new controls to get/set the recovery mode

(This used to be ctdb commit 41458a61577885ac49150f830e92e93e634c5411)
2007-04-29 22:51:56 +10:00
Ronnie Sahlberg
1af701291f implement a control to pull a database from a remote node
it does not yet work since ctdb_control can right now only be called 
from client context and the pull is implemented as the target ctdb node 
itself using a get_keys to pull the keys from the source node   thus 
ctdb daemon needs to ctdb_control to a remote node

(This used to be ctdb commit a55c7c64b4ff87f54b90649c9f469b1ff36dc9ea)
2007-04-29 22:14:51 +10:00
Ronnie Sahlberg
376a3ea852 control to delete all records in a database
(This used to be ctdb commit 6664e00fc02e1c60cc1a35ecd15f4893a34f23d1)
2007-04-29 18:48:46 +10:00
Ronnie Sahlberg
c0b0b4a0f5 add a new control to set all records in a database to a new dmaster
(This used to be ctdb commit fd0d2385206b0329b74d908f3bdf89d3f32095d1)
2007-04-29 18:34:11 +10:00
Ronnie Sahlberg
097037a055 add a control to read an entire tdb from a node including
key/lmaster/header and data

(This used to be ctdb commit ac00d6271ba6356c1edf804df44d0d2600791610)
2007-04-29 05:47:13 +10:00
Andrew Tridgell
10910f52eb added reset status control
(This used to be ctdb commit ec342b667a085a5c740fbeec8882070571071862)
2007-04-28 19:13:36 +02:00
Andrew Tridgell
1627a5d749 removed unnecessary variable
(This used to be ctdb commit ef0027faa631b00c7fc1a7c4538fbf3080248f0b)
2007-04-28 18:55:37 +02:00
Andrew Tridgell
6e09bfdaf9 much simpler redirect logic
(This used to be ctdb commit 95db9afa7dd039e1700e2f3078782f6ac66e9f51)
2007-04-28 18:18:33 +02:00
Andrew Tridgell
1e538be42d better name for this hack
(This used to be ctdb commit e5a98eee991a7926ddb6964ea3785b11303d175e)
2007-04-28 17:46:37 +02:00
Andrew Tridgell
c885b159f4 use ctdb_get_connected_nodes for node listing
(This used to be ctdb commit b4efdd1944207e51dccd6cd5e50f451a7dddcd91)
2007-04-28 17:42:40 +02:00
Andrew Tridgell
4b6d00974d added status all and debug all control operations
(This used to be ctdb commit 7f902f6c4270adc0543096c78415d335b88d6232)
2007-04-28 17:13:30 +02:00
Andrew Tridgell
e6d5848a20 report number of clients in ping
(This used to be ctdb commit 9deaa1892faa8288cad9f6fde20d2aa8ba8af699)
2007-04-28 15:15:21 +02:00
Ronnie Sahlberg
acb4bc095b add a few more controls that are useful for debugging a cluster
(This used to be ctdb commit 751c1365ab55a217ff33d985d52bd26581578617)
2007-04-28 20:40:26 +10:00
Ronnie Sahlberg
643bfe83d3 add a control to pull the database list from a remote node
(This used to be ctdb commit d130e02936ea4bdcd3a6f02c53be4b7771993138)
2007-04-28 20:00:50 +10:00
Andrew Tridgell
353a82f87c factor out the packet allocation code
(This used to be ctdb commit 4cbaf22cc4bede8c95dd9b87e21bbe886307e23b)
2007-04-28 10:50:32 +02:00