samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-27 03:21:53 +03:00

Author	SHA1	Message	Date
Andrew Tridgell	1c91398aef	ensure the recovery daemon is not clagged up by vacuum calls (This used to be ctdb commit ff7e80e247bf5a86adda0ef850d901478449675b)	2008-01-08 21:28:42 +11:00
Andrew Tridgell	96100fcae6	added two new ctdb commands: ctdb vacuum : vacuums all the databases, deleting any zero length ctdb records ctdb repack : repacks all the databases, resulting in a perfectly packed database with no freelist entries (This used to be ctdb commit 3532119c84ab3247051ed6ba21ba3243ae2f6bf4)	2008-01-08 17:23:27 +11:00
Andrew Tridgell	37861932ce	merge from ronnie (This used to be ctdb commit 0aa6e04438aa5ec727815689baa19544df042cf7)	2008-01-07 16:17:22 +11:00
Andrew Tridgell	748843a3c6	added paranoid transaction ids (This used to be ctdb commit afc1da53873cdbd31fcc8c6b22fae262e344cf6e)	2008-01-06 13:24:55 +11:00
Andrew Tridgell	c08f2616cd	new simpler and much faster recovery code based on tdb transactions (This used to be ctdb commit 9ef2268a1674b01f60c58fed72af8ac982fe77a3)	2008-01-06 12:38:01 +11:00
Andrew Tridgell	43aa27c9ee	this is needed with merged tdb (This used to be ctdb commit 3dc07f2bf98ab445ab960ef14173bc6924e3b658)	2008-01-05 17:42:01 +11:00
Andrew Tridgell	e4aefbc66d	a new tunable DatabaseMaxDead that enables the tdb max dead cache logic (This used to be ctdb commit 01c519c3658a8fcb9545b507b597e723658e4c4e)	2008-01-05 09:36:53 +11:00
Ronnie Sahlberg	50573c5391	add ctdb_disable/enable_monitoring() that only modifies the monitoring flag. change calling of the recovered/takeip/releaseip event scripts to use these enable/disable functions instead of stopping/starting monitoring. when we disable monitoring we want all events to still be running in particular the events to monitor for dead nodes and we only want to supress running the monitor event scripts (This used to be ctdb commit a006dcc4f75aba950dd701ad7d1a84e89df285e8)	2007-11-30 10:09:54 +11:00
Ronnie Sahlberg	0eb6c04dc1	get rid of the control to set the monitoring mode. monitoring should always be enabled (though a node may want to temporarily disable running the "monitor" event scripts but can do so internally without the need for this control) (This used to be ctdb commit e3a33618026823e6af845fd8513cddb08e6b5584)	2007-11-30 10:00:04 +11:00
Ronnie Sahlberg	9e73dc87cc	Add a --node-ip argument so that one can specify which ip address a specific instance of ctdbd should bind to. This helps when running a "virtual" cluster on a single machine where all instcances bind to different alias interfaces. If --node-ip is specified, then we will only try to bind to this ip address only. Othervise we fall back to the original method trying the ip addresses in /etc/ctdb/nodes one by one until we find one we can bind to. No variable in /etc/sysconfig/ctdb added since this parameter only makes sense in a virtual test/debug cluster. (This used to be ctdb commit d96cb02c2c24f9eabbc53d3d38e90dea49cff3e0)	2007-11-26 10:52:55 +11:00
Andrew Tridgell	bde886988b	prevent a deadly embrace between smbd and ctdbd by moving the calling of the startup event scripts after the point where recovery has started and the node is in normal operation This makes the 'startup' script just a special type of the 'monitor' script which is called first (This used to be ctdb commit 7424c30a5fd04aea0137c466b4318c3f185280d8)	2007-11-12 10:53:11 +11:00
Ronnie Sahlberg	4a97876fb7	when we are shutting down, we should first shut down the recovery daemon (This used to be ctdb commit 39ade6b329adcd3234124d6a8daaa6181abf739b)	2007-10-22 12:34:08 +10:00
Ronnie Sahlberg	d1ba047b7f	add a new transport method so that when a node is marked as dead, we shut down and restart the transport othervise, if we use the tcp transport the tcp connection might try to retransmit the queued data during the time the node is unavailable. this together with the exponential backoff for tcp means that the tcp connection quickly reaches the maximum backoff rto which is often 60 or 120 seconds. this would mean that it could take up to 60/120 seconds before the tcp layer detects that the connection is dead and it has to be reestablished. (This used to be ctdb commit 0256db470879ce556b0f00070f7ebeaf37e529ab)	2007-10-19 08:58:30 +10:00
Ronnie Sahlberg	056aac6e0c	add a new tunable : DeterministicIPs that makes the allocation of public addresses to nodes deterministic. Activate it by adding CTDB_SET_DeterministicIPs=1 in /etc/sysconfig/ctdb When this is set, the first entry in /etc/ctdb/public_addresses will always be hosted by node 0, when that node is available, the second entry by node1 and so on. This tunable allows the allocation of addresses to become very unbalanced and is only for debugging/testing use. Beware, this feature requires that /etc/ctdb/public_addresses are identical on all the nodes in the cluster. (This used to be ctdb commit f0ca221f235731542090d8a6c86f2b7cd2ce2f96)	2007-10-16 12:15:02 +10:00
Andrew Tridgell	0e855c0772	merge from ronnie (This used to be ctdb commit d18712caba11855010be52f90bac656683076676)	2007-10-15 14:17:49 +10:00
Andrew Tridgell	174879621e	add config option for disabling bans (This used to be ctdb commit 153b911f7f957d4c564b04f5aa878033a02da9e4)	2007-10-15 13:22:58 +10:00
Ronnie Sahlberg	bdd67bba1e	add a --single-public-ip argument to ctdbd to specify the ip address used in single public ip address mode. when using this argument, --public-interface must also be used. add a vnn structure to the ctdb context to describe the single public ip address update the killtcp control in the daemon that if a socketpair that is to be killed does not match a normal public address it checks if the destination address maches the single public ip address and if so uses that vnn structure from the ctdb context this allows killtcp to kill also connections to the single public ip instead of only normal public addresses (This used to be ctdb commit 5661ba17b91f62821dec1c76056c78b99752a90b)	2007-10-10 09:42:32 +10:00
Ronnie Sahlberg	80cd82f8e4	add a control to send gratious arps from the ctdb daemon (This used to be ctdb commit 563819dd1acb344f95aabb4bad990b36f7ea4520)	2007-10-09 11:56:09 +10:00
Ronnie Sahlberg	72379ee3eb	change async.private to async.private_data since private is a reserved work in c++ (This used to be ctdb commit 79eb28f6cd5dcc30b04966d202a050eaf98a2552)	2007-09-26 14:25:32 +10:00
Andrew Tridgell	80100c3573	run monitoring more quickly when unhealthy and at startup (This used to be ctdb commit ff1c205928e3ef5bcc6bf4e4b2122a19fa38d8f4)	2007-09-24 10:12:18 +10:00
Andrew Tridgell	c60988325d	added support for persistent databases in ctdbd (This used to be ctdb commit 3115090a0d882beca9d70761130b74bb0821f201)	2007-09-21 12:24:02 +10:00
Andrew Tridgell	3c0f61cb92	we don't need the is_loopback logic in ctdb any more (This used to be ctdb commit 4ecf29ade0099c7180932288191de9840c8d90a9)	2007-09-13 10:45:06 +10:00
Andrew Tridgell	f3ae1cdb02	- use struct sockaddr_in more consistently instead of string addresses - allow for public_address lines with a defaulting interface (This used to be ctdb commit 29cb760f76e639a0f2ce1d553645a9dc26ee09e5)	2007-09-10 14:27:29 +10:00
Ronnie Sahlberg	4ac749bfa4	change the signature to ctdb_sys_have_ip() to also return: a bool that specifies whether the ip was held by a loopback adaptor or not the name of the interface where the ip was held when we release an ip address from an interface, move the ip address over to the loopback interface when we release an ip address after we have move it onto loopback, use 60.nfs to kill off the server side (the local part) of the tcp connection so that the tcp connections dont survive a failover/failback 61.nfstickle, since we kill hte tcp connections when we release an ip address we no longer need to restart the nfs service in 61.nfstickle update ctdb_takeover to use the new signature for ctdb_sys_have_ip when we add a tcp connection to kill in ctdb_killtcp_add_connection() check if either the srouce or destination address match a known public address (This used to be ctdb commit f9fd2a4719c50f6b8e01d0a1b3a74b76b52ecaf3)	2007-09-10 07:20:44 +10:00
Ronnie Sahlberg	77ec4d5248	allow different nodes in the cluster to use different public_addresses files so that we can partition the cluster into different subsets of nodes which each serve a different subset of the public addresses (This used to be ctdb commit 889e0fe69e4c88c6166282b12843b8d9727552d6)	2007-09-04 23:15:23 +10:00
Ronnie Sahlberg	8f819c6a0e	get rid of the ctdb_vnn_list structure and just use a single list of ctdb_vnn (This used to be ctdb commit 7b9fd06321af17043136b1420b57284450ae7ba5)	2007-09-04 18:20:29 +10:00
Ronnie Sahlberg	cf45c5096c	we cant have takeover_ctx hanging off ctdb since it is freed/recreated everytime we release an ip. this context is used to hold all resources needed when sending out gratious arps and tcp tickles during ip takeover. we hang it off the vnn structure that manages that particular ip address instead so that we can have multiple ones going in parallell this bug (or the same bug in different shape) has probably been in ctdb for very very long but is likely to be hard to trigger (This used to be ctdb commit c58db1cadaba253b2659573673b28c235ef7db76)	2007-09-04 14:36:52 +10:00
Ronnie Sahlberg	d66d9cdd22	change debug output from vnn to pnn change ctdb_daemon_send_message to take pnn as parameter isntead of vnn (This used to be ctdb commit e352a2bbf9bb9a0b2c4f8329e8a529cf02414097)	2007-09-04 10:45:41 +10:00
Ronnie Sahlberg	0c91261340	change ctdb_send_message to take pnn as parameter instead of vnn (This used to be ctdb commit 93dd4fba2e0fa6a011d15406652836785a974880)	2007-09-04 10:42:20 +10:00
Ronnie Sahlberg	157be530dd	change ctdb_ctrl_getvnn to ctdb_ctrl_getpnn (This used to be ctdb commit ef47cc4cd416065c69382e4d9e76c30a0a34e42f)	2007-09-04 10:38:48 +10:00
Ronnie Sahlberg	211b497818	change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a)	2007-09-04 10:33:10 +10:00
Ronnie Sahlberg	6f693bbcbd	change server_id.vnn to server_id.pnn (This used to be ctdb commit 26f2ee2b754a9271454412f05111a19b3013c6eb)	2007-09-04 10:21:51 +10:00
Ronnie Sahlberg	583b6e6ba6	change ctdb_get_vnn to ctdb_get_pnn (This used to be ctdb commit 1e19930198c2bcc7ccb755e0ee51555fb823029a)	2007-09-04 10:18:44 +10:00
Ronnie Sahlberg	fc9d39c3a6	change ctdb_validate_vnn to ctdb_validate_pnn (This used to be ctdb commit a4a1f41b69475b9dc16d8fd7f8965c32e96c32f0)	2007-09-04 10:09:58 +10:00
Ronnie Sahlberg	eb4cf6a686	change ctdb->vnn to ctdb->pnn (This used to be ctdb commit 8c776e5707e503ec6586aae39ac6b3ea5a2fd2bc)	2007-09-04 10:06:36 +10:00
Ronnie Sahlberg	12ebb74838	change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumtion that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitely lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)	2007-09-04 09:50:07 +10:00
Ronnie Sahlberg	7f02e16143	add async versions of the freeze node control and freeze all nodes in parallell (This used to be ctdb commit f34e89f54d9f4380e76eb1b5b2385a4d8500b505)	2007-08-27 10:31:22 +10:00
Ronnie Sahlberg	801bdbdc80	add a control to pull the server id list off a node (This used to be ctdb commit 38aa759aa88a042c31b401551f6a713fb7bbe84e)	2007-08-26 10:57:02 +10:00
Ronnie Sahlberg	6681da31df	add an initial implementation of a service_id structure and three controls to register/unregister/check a server id. a server id consists of TYPE:VNN:ID where type is specific to the application. VNN is the node where the serverid was registered and ID might be a node unique identifier such as a pid or similar. Clients can register a server id for themself at the local ctdb daemon. When a client dissappears or when the domain socket connection for the client drops then any and all server ids registered across that domain socket will also be automatically removed from the store. clients can register as many server_ids as they want at the same time but each TYPE:VNN:ID must be globally unique. Clients have the option of explicitely unregister a server id by using the UNREGISTER control. Registration and unregistration can only be done by clients to the local daemon. clients can not register their server id to a remote node. clients can check if a server id does exist on any ctdb node in the network by using the check control (This used to be ctdb commit d44798feec26147c5cc05922cb2186f0ef0307be)	2007-08-24 15:53:41 +10:00
Ronnie Sahlberg	495a6403da	change the api for managing callbacks to controls so that isntead of passing it as a parameter we set the callback function explicitely from the caller if the ..._send() function returned a valid state pointer. (This used to be ctdb commit aa939570662786455f63299b62c99882cff29d42)	2007-08-24 10:42:06 +10:00
Ronnie Sahlberg	f854b5f876	try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c)	2007-08-23 19:27:09 +10:00
Ronnie Sahlberg	8fd3df2553	hang the ctdb_req_control structure off the ctdb_client_control_state struct so that if we timeout a control we can print debug info such as what opcode failed and to which node we dont need the *status parameter to ctdb_client_control_state create async versions of the getrecmaster control pass a memory context to getrecmaster (This used to be ctdb commit 558b680c82f830fba82c283c78c2de8a0b150b75)	2007-08-23 13:00:10 +10:00
Ronnie Sahlberg	f6e0336b23	create a define to represent the 'invalid' generation id we used in two places. create a new helper function to generate new generation id values that know about the invalid id and avoids generating it. update the ctdb status tool to know about the invalid generation id and print the string INVALID instead (This used to be ctdb commit 4fbcd189543cb8a92227fdcd3d158472e558ccda)	2007-08-22 12:38:31 +10:00
Ronnie Sahlberg	8b06fc7284	change the structure used for node flag change messages so that we can see both the old flags as well as the new flags (so we can tell which flags changed) send the CTDB_SRVID_RECONFIGURE messages to connected nodes only, not to every node, connected or not, in the cluster. in the handler inside the recovery daemon which is invoked for node flag change messages, only do a takeover_run() and redistribute the ip addresses IF it was the disabled or the unhealthy flags that changed. Also send out the cluster reconfigured message in this case. If any of the other flags changed we dont need to do the takeover_run(0 here since that will be done during recovery. (This used to be ctdb commit 5549b2058e2c148a8ca9d419123acf3247bb8829)	2007-08-21 17:25:15 +10:00
Andrew Tridgell	46639ac19e	merged new event script calling code from ronnnie (This used to be ctdb commit bbacad61b3eee4276ffe44ed2a23949aca8152cf)	2007-08-20 11:10:30 +10:00
Ronnie Sahlberg	3b9d50f3ee	change the now rather small /etc/ctdb/events script into a service specific script /etc/ctdb/events.d/00.ctdb get rid of CTDB_EVENTS_SCRIPT and --event-script (This used to be ctdb commit 81ccfaf838e5772d4a58eb6a70224b7b39aba9f3)	2007-08-15 15:01:31 +10:00
Ronnie Sahlberg	4023576e50	call the service specific event scripts directly from the forked child instead for from /etc/ctdb/events so that we can get better debugging output in the logs when something fails in the scripts (This used to be ctdb commit 4ed96b768aea1611e8002f7095d3c4d12ccf77a3)	2007-08-15 14:44:03 +10:00
Andrew Tridgell	f03defff70	merge changes needed for samba4 (This used to be ctdb commit a7f80f78cd62401b3516da3640bf24d6362db872)	2007-08-15 09:03:58 +10:00
Ronnie Sahlberg	fca90ce3c3	updated ctdb tickle management there is an array for each node/public address that contains tcp tickles we send a TCP_ADD as a broadcast to all nodes when a client is added if tcp tickles are removed, they are only removed immediately from the local node. once every 20 seconds a node will push/broadcast out the tickle list for all public addresses it manages. this will remove any deleted tickles from the remote nodes (This used to be ctdb commit e3c432a915222e1392d91835bc7a73a96ab61ac9)	2007-07-20 15:05:55 +10:00
Ronnie Sahlberg	7b17afdfcd	change the tickle list from one global list into an array per public ip/node once we have started sending all tickles for a specific ip delete the entire array so that the tickles dont remain forever in the ctdb server add a control to send the full list of every tickle that is registered for a particular public ip/node (This used to be ctdb commit d0eee33e44d3f8e26debbec21d41e2cbdbb520e6)	2007-07-20 10:06:41 +10:00

1 2 3 4 5 ...

319 Commits