samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-03-03 12:58:35 +03:00

Author	SHA1	Message	Date
Ronnie Sahlberg	eb7a15730e	add a short delay after stopping nfslock to make it less likely that "weird" things happen (This used to be ctdb commit 4934c083cbcc19714094e08a0b7da1fb6fdc8a5a)	2007-09-07 12:14:53 +10:00
Ronnie Sahlberg	68c37f9b41	merge from tridge (This used to be ctdb commit 58c918b1bfe09c31049769dee266129cbad4cb20)	2007-09-07 09:21:40 +10:00
Ronnie Sahlberg	fa872de664	60.nfs: we must always restart the lockmanager when the cluster has been reconfigured and ip addresses have changed. This is to make sure we get a clusterwide grace period for nfs locking. if we dont do this and only restart locking on the nodes that were directly affected, a different client can take out a conflicting lock from a different node before affected clients have had a chance to reclaim all the locks lost during reconfigure. grace period on rhel5 kernel has been increased to 90 seconds! statd-callout: we must restart lockmanager to ensure a clusterwide grace period for nfs. this makes locking "more correct" for nfs clients and prevents other clients/nodes from taking out a conflicting lock while a different client/node tries to reclaim lost locks. This makes it "almost consistent" for NFS clients but there is still the possibility that a cifs client can take out a conflicting lock before an nfs client has had a chance to reclaim an existing lock. This can not be solved with anything less than making the kernel nfs lock manager "samba aware" and making samba aware of the internal state of the kernel lock manager so that they can cooperate. we can not just stop/start the lockmanager back to back in rhel5 since if they are stopped/started too close to each other then when the new lockmanager upon starting up sends out statd notifications two things can happen: 1, new lockmanager sends out notification BEFORE it has registered with portmapper, leading to: lockmanager starts; lockmanager sends notification to the client; client tries to recover the lock and tries to portmap the lockmanager port on the server; server is not (yet) registered with portmapper and responds "no such program" to the client's request to discover where lockmanager is; client then just completely gives up reclaiming the lock and doesnt even reattempt the portmapper call after some timeout. ==> lock reclaim failed.
2, if they are started back to back, and a client tries to reclaim the lock, the lockmanager sometimes sends two responses back to back to the client: one with status NLM_GRANTED (==you got the lock reclaimed) and one with status NLM_DENIED (==you could not get the lock reclaimed). This confuses the client and leads to the server thinking that the client does have the lock and the client thinking it has not got the lock, and orphaned locks result. We also send out additional notification messages of different formats to allow more legacy clients to interoperate with locking. (This used to be ctdb commit 13208c1aab2942e28dff87e38e6794bf0c026033)	2007-09-07 08:52:56 +10:00
Ronnie Sahlberg	82984577f1	we dont need the rpc.statd on shared directory neither do we need PUBLIC_IP anymore (This used to be ctdb commit fd571ac87f65928e92dde6977745083bf381df1a)	2007-09-06 11:32:18 +10:00
Ronnie Sahlberg	00453a375a	improve the handling of hosts to notify with statd (This used to be ctdb commit cc87bda7e344bc777b9620a6211e62de4dce4e3b)	2007-09-06 11:30:49 +10:00
Ronnie Sahlberg	19546fb007	specify the additional ports for nfs (This used to be ctdb commit 1934163f0b393738615a05854082a7d488003e1c)	2007-09-06 10:26:44 +10:00
Ronnie Sahlberg	f7d193e9ce	the event scripts for nfs are called 60.nfs and 61.nfstickle (This used to be ctdb commit b15f1c25560320993b93aa3d943985dab4e47947)	2007-09-06 10:18:13 +10:00
Ronnie Sahlberg	0781616ef9	document NFS_TICKLE_SHARED_DIRECTORY on our web page (This used to be ctdb commit 40ec29f602897e9b01a6747806f502ab38423d54)	2007-09-06 08:21:11 +10:00
Ronnie Sahlberg	46eecfea27	we dont use 'sendip' any more so dont check for it and exit from the 61.nfstickles script if it is missing from the host (This used to be ctdb commit 8eac441e24f4ef33b55f9eaa4856b5c1e1c15213)	2007-09-05 15:39:51 +10:00
Ronnie Sahlberg	a9c8456ed6	we should always get data back from getnodemap (This used to be ctdb commit ff999a4b56f714c58c81baa454a2d39d04944136)	2007-09-05 14:59:29 +10:00
Ronnie Sahlberg	e4eeceaf3a	dont dereference vnn before we have assigned it a pointer value (This used to be ctdb commit 2a8fc69aea8527b22a3fe57427677e4caff57338)	2007-09-05 14:29:44 +10:00
Andrew Tridgell	c572d3c226	added a diagnostics tool for ctdb (This used to be ctdb commit 032a2238caf688656b00e06bf363182368e037e1)	2007-09-05 14:20:34 +10:00
Ronnie Sahlberg	77ec4d5248	allow different nodes in the cluster to use different public_addresses files so that we can partition the cluster into different subsets of nodes which each serve a different subset of the public addresses (This used to be ctdb commit 889e0fe69e4c88c6166282b12843b8d9727552d6)	2007-09-04 23:15:23 +10:00
Ronnie Sahlberg	8f819c6a0e	get rid of the ctdb_vnn_list structure and just use a single list of ctdb_vnn (This used to be ctdb commit 7b9fd06321af17043136b1420b57284450ae7ba5)	2007-09-04 18:20:29 +10:00
Ronnie Sahlberg	cf45c5096c	we cant have takeover_ctx hanging off ctdb since it is freed/recreated every time we release an ip. this context is used to hold all resources needed when sending out gratuitous arps and tcp tickles during ip takeover. we hang it off the vnn structure that manages that particular ip address instead so that we can have multiple ones going in parallel. this bug (or the same bug in different shape) has probably been in ctdb for very very long but is likely to be hard to trigger (This used to be ctdb commit c58db1cadaba253b2659573673b28c235ef7db76)	2007-09-04 14:36:52 +10:00
Ronnie Sahlberg	3e6be59f61	fix typo in debug output (This used to be ctdb commit 011a777c6e538ca79f104c7884a4f0e222997382)	2007-09-04 14:21:35 +10:00
Ronnie Sahlberg	784eac9079	dont just always return 0 from the killtcp control. return 0 or -1 so that the ctdb tool knows whether the control succeeded or not (This used to be ctdb commit cace8b40090be5529ec6b463d3839d0e22f4039d)	2007-09-04 14:19:18 +10:00
Ronnie Sahlberg	a50e83448c	change vnn to pnn in the traverse structure (This used to be ctdb commit d56ae0963b420edea6a2d5eeb408a9811af3f3f6)	2007-09-04 10:49:21 +10:00
Ronnie Sahlberg	f69321edc8	change debug output from vnn to pnn (This used to be ctdb commit 93a7cf759ae3f9af6671b9f8589e1399a669b46f)	2007-09-04 10:47:02 +10:00
Ronnie Sahlberg	d66d9cdd22	change debug output from vnn to pnn; change ctdb_daemon_send_message to take pnn as parameter instead of vnn (This used to be ctdb commit e352a2bbf9bb9a0b2c4f8329e8a529cf02414097)	2007-09-04 10:45:41 +10:00
Ronnie Sahlberg	0c91261340	change ctdb_send_message to take pnn as parameter instead of vnn (This used to be ctdb commit 93dd4fba2e0fa6a011d15406652836785a974880)	2007-09-04 10:42:20 +10:00
Ronnie Sahlberg	157be530dd	change ctdb_ctrl_getvnn to ctdb_ctrl_getpnn (This used to be ctdb commit ef47cc4cd416065c69382e4d9e76c30a0a34e42f)	2007-09-04 10:38:48 +10:00
Ronnie Sahlberg	211b497818	change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn change ctdb_ban_info.vnn to ctdb_ban_info.pnn (This used to be ctdb commit fcedd40e0493948829e1c921d4fe30e9196e398a)	2007-09-04 10:33:10 +10:00
Ronnie Sahlberg	6f693bbcbd	change server_id.vnn to server_id.pnn (This used to be ctdb commit 26f2ee2b754a9271454412f05111a19b3013c6eb)	2007-09-04 10:21:51 +10:00
Ronnie Sahlberg	583b6e6ba6	change ctdb_get_vnn to ctdb_get_pnn (This used to be ctdb commit 1e19930198c2bcc7ccb755e0ee51555fb823029a)	2007-09-04 10:18:44 +10:00
Ronnie Sahlberg	4ba9990143	change vnn to pnn in the ctdb tool (This used to be ctdb commit 822556a4d4ba23459be3a25cbd3f48d1f64ba95f)	2007-09-04 10:14:41 +10:00
Ronnie Sahlberg	fc9d39c3a6	change ctdb_validate_vnn to ctdb_validate_pnn (This used to be ctdb commit a4a1f41b69475b9dc16d8fd7f8965c32e96c32f0)	2007-09-04 10:09:58 +10:00
Ronnie Sahlberg	eb4cf6a686	change ctdb->vnn to ctdb->pnn (This used to be ctdb commit 8c776e5707e503ec6586aae39ac6b3ea5a2fd2bc)	2007-09-04 10:06:36 +10:00
Ronnie Sahlberg	12ebb74838	change how we do public addresses and takeover so that we can have multiple public addresses spread across multiple interfaces on each node. this is a massive patch since we have previously made the assumption that we only have one public address per node. get rid of the public_interface argument. the public addresses file now explicitly lists which interface the address belongs to (This used to be ctdb commit 462ebbc791e906a6b874c862defea43235597ca8)	2007-09-04 09:50:07 +10:00
Andrew Tridgell	7423bcaabe	up the release number (This used to be ctdb commit 71a6213c92a12bf794c17c30ae4987149b68fe1b)	2007-08-30 17:51:05 +10:00
Ronnie Sahlberg	4e61e05f49	when we start 60.nfs we must make sure that the shared storage nfs-state directory actually exists (by creating it) or else the lock manager will not start (This used to be ctdb commit f2d15d04df842538c8d8331796a3c6fbe23463f2)	2007-08-30 15:27:45 +10:00
Andrew Tridgell	8c94d4dc87	merge from ronnie (This used to be ctdb commit ab11fd70cf4d2165a5b55930cbad6fddf5397f54)	2007-08-27 18:04:53 +10:00
Ronnie Sahlberg	794fb10634	add an extra debug statement when we send a SIGTERM to a process (This used to be ctdb commit a9c1be9cf9efdc69bfc95657b70e9f8b8230cda8)	2007-08-27 17:33:46 +10:00
Ronnie Sahlberg	2c0c94782a	make the ctdb shutdown command use the async _send() function to send the shutdown command and return success to the caller if the _send() was successful (This used to be ctdb commit 6bacaf8c7a96044708a6eda10cc8576adb7f5f79)	2007-08-27 15:03:52 +10:00
Andrew Tridgell	7f630b67f6	fixed segv when no public interface is set (This used to be ctdb commit 55b415f87bd3cba13c73ccd2fe661720754a6af7)	2007-08-27 11:49:42 +10:00
Ronnie Sahlberg	7f02e16143	add async versions of the freeze node control and freeze all nodes in parallel (This used to be ctdb commit f34e89f54d9f4380e76eb1b5b2385a4d8500b505)	2007-08-27 10:31:22 +10:00
Ronnie Sahlberg	a9c45b2562	change the monitoring of recmode in the recovery daemon to use a fully async eventdriven api for controls (This used to be ctdb commit 8d0e43428c507967a0d96e6a4c5c821ac269c546)	2007-08-27 09:40:10 +10:00
Ronnie Sahlberg	801bdbdc80	add a control to pull the server id list off a node (This used to be ctdb commit 38aa759aa88a042c31b401551f6a713fb7bbe84e)	2007-08-26 10:57:02 +10:00
Ronnie Sahlberg	6681da31df	add an initial implementation of a service_id structure and three controls to register/unregister/check a server id. a server id consists of TYPE:VNN:ID where type is specific to the application. VNN is the node where the serverid was registered and ID might be a node unique identifier such as a pid or similar. Clients can register a server id for themselves at the local ctdb daemon. When a client disappears or when the domain socket connection for the client drops then any and all server ids registered across that domain socket will also be automatically removed from the store. clients can register as many server_ids as they want at the same time but each TYPE:VNN:ID must be globally unique. Clients have the option of explicitly unregistering a server id by using the UNREGISTER control. Registration and unregistration can only be done by clients to the local daemon. clients can not register their server id to a remote node. clients can check if a server id does exist on any ctdb node in the network by using the check control (This used to be ctdb commit d44798feec26147c5cc05922cb2186f0ef0307be)	2007-08-24 15:53:41 +10:00
Ronnie Sahlberg	de23937368	cleanup invoke_control_callback. we dont need to pass some of these parameters to _recv() since they are already set (This used to be ctdb commit 2034dbebb26d7a2d51241943f6ccbe15bb6a5169)	2007-08-24 10:54:34 +10:00
Ronnie Sahlberg	495a6403da	change the api for managing callbacks to controls so that instead of passing it as a parameter we set the callback function explicitly from the caller if the ..._send() function returned a valid state pointer. (This used to be ctdb commit aa939570662786455f63299b62c99882cff29d42)	2007-08-24 10:42:06 +10:00
Ronnie Sahlberg	1da9c03b1f	comment why we do a talloc_steal (This used to be ctdb commit aba7972728307e0ae52ccf8c0dd5808110fb92d7)	2007-08-24 09:34:04 +10:00
Ronnie Sahlberg	62a03ef9d5	get rid of the explicit global timeout used in the previous example and try this time by relying on the timeouts for the individual controls (This used to be ctdb commit 448a0eb4fd896dc545aa0b4bb2ba4628491578be)	2007-08-23 19:38:54 +10:00
Ronnie Sahlberg	f854b5f876	try out a slightly different api for controls where you provide a callback function which is called upon completion (or timeout) of the control. modify scanning of recmaster in the monitoring_cluster code to try the api out (This used to be ctdb commit c37843f1d97b169afec910e7ddb4e5ac12c3015c)	2007-08-23 19:27:09 +10:00
Ronnie Sahlberg	4c13bf0c5f	break checking that the recoverymode on all nodes are ok out into its own function (This used to be ctdb commit 813cf9a252af96da24122b80f24aabeed2911939)	2007-08-23 13:48:39 +10:00
Ronnie Sahlberg	8fd3df2553	hang the ctdb_req_control structure off the ctdb_client_control_state struct so that if we timeout a control we can print debug info such as what opcode failed and to which node; we dont need the *status parameter to ctdb_client_control_state; create async versions of the getrecmaster control; pass a memory context to getrecmaster (This used to be ctdb commit 558b680c82f830fba82c283c78c2de8a0b150b75)	2007-08-23 13:00:10 +10:00
Ronnie Sahlberg	20120c2331	in ctdb_call_recv() we must check that state is non-NULL since ctdb_call() may pass a null pointer to _recv() and this would cause a segfault. fortunately it appears there are no critical users for this codepath right now so the risk was more theoretical. IF clients start using this call it could segfault. change ctdb_control() to become fully async so we later can make recovery daemon do the expensive controls to nodes in parallel instead of in sequence (This used to be ctdb commit 379789cda6ef049f389f10136aaa1b37a4d063a9)	2007-08-23 11:58:09 +10:00
Ronnie Sahlberg	277cdbe3d1	create an enum to describe the state of a control in flight instead of using the enum that is for calls (This used to be ctdb commit f9cf7076151af983a1c4ea56fbeb6d94ea508a34)	2007-08-23 09:53:10 +10:00
Andrew Tridgell	d95476fa38	merge from ronnie (This used to be ctdb commit e0f1c1acb1188500674626d631e1a1b8726e72ad)	2007-08-22 17:31:29 +10:00
Andrew Tridgell	df9ec77b6b	merge from volker (This used to be ctdb commit a5587b3c065f7115ad5e55429c2c9d9923d3b4dc)	2007-08-22 17:18:55 +10:00

1085 Commits