samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-06 13:18:07 +03:00

Author	SHA1	Message	Date
Michael Adam	abac42ca34	server: add a new ctdb control CTDB_TRANS2_ACTIVE. This asks the daemon whether a transaction is currently active on a given DB on that node. More precisely this asks for the transaction_active flag in the ctdb_db_context that is set in the CTDB_TRANS2_COMMIT control and cleared in the CTDB_TRANS2_ERROR or CTDB_TRANS2_FINISHED controls. This will be useful for fixing race conditions in the transaction code. Michael (This used to be ctdb commit 8d430ae6968dfe566614379436fc3c56003fcd88)	2009-10-29 10:14:30 +11:00
Michael Adam	769a36c048	In ctdb_ltdb_store(), add a missing transaction_cancel when local store failed. Spotted by Volker. Michael (This used to be ctdb commit 0a4d409baabf242a87c06293789d589c896b104c)	2009-10-21 12:49:59 +11:00
Ronnie Sahlberg	9de3652380	add logging every time we create a file descriptor in the main ctdb daemon so we can spot if there are leaks. plug two file descriptor leaks related to when sending ARP fails and one leak when we can not parse the local address during tcp connection establishment (This used to be ctdb commit ddd089810a14efe4be6e1ff3eccaa604e4913c9e)	2009-10-15 11:24:54 +11:00
Michael Adam	a6cf23362f	ctdbd: refuse PERSISTENT_STORE if transaction is running. Michael (This used to be ctdb commit c07d6d90f7afd19213ad44624c3e2b9c85f4eea8)	2009-07-29 11:13:38 +10:00
Michael Adam	4cd06a330e	Fix persistent transaction commit race condition. In ctdb_client.c:ctdb_transaction_commit(), after a failed TRANS2_COMMIT control call (for instance due to the 1-second being exceeded waiting for a busy node's reply), there is a 1-second gap between the transaction_cancel() and replay_transaction() calls in which there is no lock on the persistent db. And due to the lack of global state indicating that a transaction is in progress in ctdbd, other nodes may succeed in starting transactions on the db in this gap and, even worse, work on top of the possibly already pushed changes. So the data diverges on the several nodes. This change fixes this by introducing global state for a transaction commit being active in the ctdb_db_context struct and in a db_id field in the client so that a client keeps track of _which_ tdb it has a transaction commit running on. These data are set by ctdb upon entering the trans2_commit control and they are cleared in the trans2_error or trans2_finished controls. This makes it impossible to start another transaction or migrate a record to a different node while a transaction is active on a persistent tdb, including the retry loop. This approach is deadlock-free and still allows the recovery process to be started in the retry-gap between cancel and replay. Also note that this solution does not require any change in the client side. This was debugged and developed together with Stefan Metzmacher <metze@samba.org> - thanks! Michael (This used to be ctdb commit f88103516e5ad723062fb95fcb07a128f1069d69)	2009-07-29 11:12:39 +10:00
Ronnie Sahlberg	e1b0cea427	add control and logging of very high latencies. log the type of operation and the database name for all latencies higher than a threshold (This used to be ctdb commit 1d581dcd507e8e13d7ae085ff4d6a9f3e2aaeba5)	2008-10-30 12:49:53 +11:00
Andrew Tridgell	aa1bc0abba	added a new control CTDB_CONTROL_TRANS2_COMMIT_RETRY so we can tell the difference between an initial commit attempt and a retry, which allows us to get the persistent updates counter right for retries (This used to be ctdb commit 7f29c50ccbc7789bfbc20bcb4b65758af9ebe6c5)	2008-08-08 13:11:28 +10:00
Andrew Tridgell	5a0249d34c	return a more detailed error code from a trans2 commit error (This used to be ctdb commit 6915661a460cd589b441ac7cd8695f35c4e83113)	2008-08-08 09:58:49 +10:00
Andrew Tridgell	5ee51ae84e	fixed a looping error bug with the new transactions code (This used to be ctdb commit 0592ba2a4fbd1b3b7a6bd0780eadbd6d449baaad)	2008-08-08 00:44:33 +10:00
Andrew Tridgell	bbedba23c7	cover some corner cases where the persistent database could become inconsistent (This used to be ctdb commit c76c214be401cb116265ed17ffe6c77c979ded82)	2008-08-07 13:34:18 +10:00
Andrew Tridgell	78acc59784	implemented replayable transactions in ctdb to prevent deadlock (This used to be ctdb commit b6d9a0396fb4b325778d3810dc656f719f31b9f1)	2008-08-04 14:51:51 +10:00
Andrew Tridgell	98502135e7	added new multi-record transaction commit code (This used to be ctdb commit 9ff3380099fe6f4d39de126db0826971a10ee692)	2008-07-30 19:57:00 +10:00
Ronnie Sahlberg	90ff67dc74	Only decrement the "number of persistent writes in flight" If/when it is >0 or we will break if used against an unpatched samba server (This used to be ctdb commit 52a38487f981fd5981c02a7a063ad2c598591c10)	2008-07-17 18:47:20 +10:00
Ronnie Sahlberg	6eb4e46fe1	Add two new controls to start and cancel a persistent update. This allows ctdb to automatically start a new full blown recovery if a client has started updating the local tdb for a persistent database but is kill -9ed before it has ensured the update is distributed clusterwide. (This used to be ctdb commit 1ffccb3e0b3b5bd376c5302304029af393709518)	2008-07-17 13:50:55 +10:00
Ronnie Sahlberg	334db8ccba	proper waitpid() fix. remove all waitpid() calls and use the event system to trap sigchld (This used to be ctdb commit 77458b2b6b51b2970c12b0e5b097088d3fb9d358)	2008-07-09 14:02:54 +10:00
Ronnie Sahlberg	522830dea8	Revert "waitpid() can block if it takes a long time before the child terminates" This reverts commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10. revert the waitpid changes. we need to waitpid for some children so should refactor the approach completely (This used to be ctdb commit 702ced6c2fe569c01fe96c60d0f35a7e61506a96)	2008-07-08 17:41:31 +10:00
Ronnie Sahlberg	d67de4a7d2	waitpid() can block if it takes a long time before the child terminates so we should not call it from the main daemon. 1, set SIGCHLD to SIG_DFL to make sure we ignore this signal 2, get rid of all waitpid() calls 3, change reporting of event script status code from _exit()/waitpid() to write()/read() one byte across the pipe. (This used to be ctdb commit bfba5c7249eff8a10a43b53c1b89dd44b625fd10)	2008-07-08 03:48:11 +10:00
Ronnie Sahlberg	60a3fb926d	dont bother casting to a void* private_data pointer, just pass it as 'state' structure (This used to be ctdb commit 1d7c3eb454e33cd17c74606c4ea011fd79959c80)	2008-05-28 13:40:12 +10:00
Ronnie Sahlberg	0b0f5bc5e6	remove another field we dont need in the childwrite_handle structure (This used to be ctdb commit 70085523f4c35a20786023c489325554e2a6f9c1)	2008-05-28 13:31:58 +10:00
Ronnie Sahlberg	71ec7b25b0	remove a comment that is no longer relevant. remove a field in the childwrite_handle structure we dont need (This used to be ctdb commit a53db1ec3f29f4418ff51e0f452026c12470bf93)	2008-05-28 13:30:22 +10:00
Ronnie Sahlberg	ceaf488f05	do persistent writes in a child process (This used to be ctdb commit 2da3d1f876f5d654f849af8a3e588f5a61300c3d)	2008-05-28 13:04:25 +10:00
Ronnie Sahlberg	0941019cb7	restore a timeout value to the default settings instead of the hardcoded 3 second test value (This used to be ctdb commit 437752d002a108bcbbf6dc8bfb5dbf16dc5f1c58)	2008-05-22 16:33:36 +10:00
Ronnie Sahlberg	dd6c9d5a78	fix some memory hierarchy bugs in allocation of the state structure for persistent writes. since these two controls (UPDATE_RECORD and PERSISTENT_STORE) can respond asynchronously to the control, we can not allocate the state variable as a child off ctdb_req_control. instead we must allocate state as a child off ctdb itself and steal ctdb_req_control so it becomes a child of state. otherwise both ctdb_req_control and also state will be released immediately after we have finished setting up the async reply and returned. (This used to be ctdb commit 6f6de0becd179be9eb9a6bf70562b090205ce196)	2008-05-22 16:29:46 +10:00
Ronnie Sahlberg	d895f43504	cleanup of the previous patch. With these patches, ctdbd will enforce and (by default) always use tdb_transactions when updating/writing records to a persistent database. This might come with a small performance degradation since transactions are slower than no transactions at all. If a client, such as samba, wants to use a persistent database but does NOT want to pay the performance penalty, it can specify TDB_NOSYNC as the srvid parameter in the ctdb_control() for CTDB_CONTROL_DB_ATTACH_PERSISTENT. In this case CTDBD will remember that "this database is not that important" so I can use unsafe (no transaction) tdb_stores to write the updates. It will be faster than the default (always use transaction) but less crash safe. (This used to be ctdb commit 3d85d2cf669686f89cacdc481eaa97aef1ba62c0)	2008-05-22 13:12:53 +10:00
Ronnie Sahlberg	ed2cf0291d	second try for safe transaction stores into persistent tdb databases. for stores into persistent databases, ALWAYS use a lockwait child take out the lock for the record and never the daemon itself. (This used to be ctdb commit 7fb6cf549de1b5e9ac5a3e4483c7591850ea2464)	2008-05-22 12:47:33 +10:00
Andrew Tridgell	f6e53f433b	merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)	2008-02-04 20:07:15 +11:00
Andrew Tridgell	4178cb98a1	fixed a valgrind error, and some warnings (This used to be ctdb commit c0f52dbb385fa0748680adb7c40755c92e577551)	2007-09-24 09:57:14 +10:00
Andrew Tridgell	2607c222fc	avoid using connected nodes that aren't in the vnn map yet (This used to be ctdb commit 2b5ae133f5f6fa9ad1a8896fe4b4c542d4ca462d)	2007-09-21 15:44:13 +10:00
Ronnie Sahlberg	51d912063c	in ctdb_control_persistent_store() we must talloc_steal() the pointer to c to prevent it from being immediately freed (and our persistent store state with it) if we need to wait asynchronously for other nodes before we can reply back to the client (This used to be ctdb commit fa5915280933e4d2e7d4d07199829c9c2b87a335)	2007-09-21 15:19:33 +10:00
Andrew Tridgell	c60988325d	added support for persistent databases in ctdbd (This used to be ctdb commit 3115090a0d882beca9d70761130b74bb0821f201)	2007-09-21 12:24:02 +10:00

80 Commits