samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2024-12-25 23:21:54 +03:00

Author	SHA1	Message	Date
Michael Adam	dbb520b6ad	call: becoming dmaster in VACUUM_MIGRATION, set the VACUUM_MIGRATED record flag. This temporary flag is used for the local record storage function to decide whether to delete an empty record which has never been migrated with data as part of the fast-path vacuuming process, or to store the record. (This used to be ctdb commit c11ca778ee90444c44dee0a629cd2eefa3a1f75e)	2011-03-14 13:35:45 +01:00
Michael Adam	73e6618a48	call: hand the submitted record_flags to local record storage function. (This used to be ctdb commit 4079b8bf7a57a27a45d29784a1b0a414c778e552)	2011-03-14 13:35:45 +01:00
Michael Adam	eb1b7d1c05	call: transfer the record flags in the ctdb call packets. This way, the MIGRATED_WITH_DATA information can be transported along with the records. This is important for vacuuming to function properly. The record flags are appended to the data section of the ctdb_req_dmaster and ctdb_reply_dmaster structs. Pair-Programmed-With: Stefan Metzmacher <metze@samba.org> (This used to be ctdb commit 945187d64cfc7bd30a0c3b0d548cbe582d95dde3)	2011-03-14 13:35:44 +01:00
Michael Adam	64fc05e562	server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag (This used to be ctdb commit f5fb232117886186066ab3430fdd2307cba94960)	2011-03-14 13:35:43 +01:00
Michael Adam	53b558a3bc	server: add a comment explaining the call redirect logic in ctdb_call_send_redirect(). (This used to be ctdb commit 81663b81687c0ba681500cca6aa8174bb9587ad2)	2011-02-24 10:35:26 +01:00
Ronnie Sahlberg	92f86534ac	ctdb_req_dmaster from non-master. If we find a situation where we get a stray packet with the wrong dmaster, don't suicide with ctdb_fatal() since this is too disruptive. Just drop the stray packet and force a recovery to make sure all is good again. CQ S1022004 (This used to be ctdb commit 62b7fe853db37c0a90e48a0332a3426a8dcb4ed8)	2011-02-18 11:29:44 +11:00
Ronnie Sahlberg	b57bd0f896	Remove LACOUNT and LACCESSOR and migrate the records immediately. This concept didn't work out and it is really just as expensive as a full migration anyway, without the benefit of caching the data for subsequent accesses. Now, migrate the records immediately on first access. This will be combined with a "cheap vacuum-lite" for special empty records to prevent growth of databases. Later extensions to mimic read-only behaviour of records will include proper shared read-only locking of database records, making the laccessor/lacount read-only access to the data obsolete anyway. Removing this special case and the handling of lacount/laccessor makes the codepath where shared read-only locking will be implemented simpler, and frees up space in the ctdb_ltdb header for use by vacuuming flags as well as read-only locking flags. (This used to be ctdb commit 155dd1f4885fe142c6f8bd09430f65daf8a17e51)	2011-02-18 10:08:32 +11:00
Ronnie Sahlberg	220c5371c7	Revert "server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag" This reverts commit 17e231abf5ade83d7fa624b5cf54ae876e2795aa. (This used to be ctdb commit 23f81ba39ee7cd8a7360f4602b3eb264eb221552)	2010-12-13 14:23:48 +11:00
Ronnie Sahlberg	dff88a8a6a	Revert "Add a new header flag for "migrated with data" and set this to 1" This reverts commit a8cc35191df1cd4b866897df71d317ce5f198cb5. (This used to be ctdb commit 7c37435fb517a621c45b21a21b4eb15f8bbd3c83)	2010-12-13 14:23:32 +11:00
Ronnie Sahlberg	8e53df6f41	Add a new header flag for "migrated with data" and set this to 1 when we migrate a non-empty record onto the node or a non-empty record off the node. When we migrate a record back to the lmaster and yield the dmaster role, inspect this flag; if it is still not set, we can delete the record from the local database as soon as we have migrated it back to the lmaster. (This used to be ctdb commit a8cc35191df1cd4b866897df71d317ce5f198cb5)	2010-12-07 15:33:41 +11:00
Michael Adam	6f77811cb1	server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag (This used to be ctdb commit 17e231abf5ade83d7fa624b5cf54ae876e2795aa)	2010-12-07 15:31:57 +11:00
Ronnie Sahlberg	db8cb31d8b	during shutdown there is a window after we have stopped TCP and disconnected from all other nodes but before we have stopped all processing. During this window we may still hit asynchronous events that will fail because we can not send/receive packets from other nodes. These messages are logged as ... Transport is DOWN. To help indicate that they are benign messages related to the process of shutting down. These messages spam the syslog during normal shutdown, so this patch will drop the loglevel of these messages to DEBUG, so that they will not appear in or spam the syslog. (This used to be ctdb commit 8275d265d2ae19b765e30ecf18f6b6319b6e6453)	2010-10-28 13:41:08 +11:00
Ronnie Sahlberg	39c367a68f	Create macros to update the statistics counters and use these macros everywhere instead of manipulating the counters directly. (This used to be ctdb commit 2e648df890e5713bc575965d87937827b068d0d7)	2010-09-29 12:14:24 +10:00
Rusty Russell	f93440c4b7	event: Update events to latest Samba version 0.9.8 In Samba this is now called "tevent", and while we use the backwards compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now a separate tevent_fd_set_auto_close() function. This is based on Samba version `7f29f817fa`. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)	2010-08-18 09:16:31 +09:30
Ronnie Sahlberg	7730facc62	fix a debug message (This used to be ctdb commit 856bd6de6218d9b70baed0e6443be4253ea31afe)	2010-06-09 16:22:44 +10:00
Ronnie Sahlberg	d9a3e1d0c0	idr can timeout and wrap/be reused quite quickly. If a noremote node hangs for an extended period, it is possible that we might have a DMASTER request in flight for record A to that node. Eventually we will reuse the idr, and may reuse it for a DMASTER request to a different node for a different record B. If, while the request for B is in flight, the first node un-hangs and responds back, we would receive a dmaster reply for the wrong record. This would cause a record to become perpetually locked, since inside the daemon we would tdb_chainlock(dmaster_reply->pdu->key) but once the migration would complete we would chainunlock idr->state->call->key. Adding code to verify that when we receive a dmaster reply packet, it does in fact match the exact same key as the state variable we have for the idr in flight. (This used to be ctdb commit 2f6a870d7ff02ceb61fde242f752dccbfcb4cb37)	2010-06-09 16:19:29 +10:00
Ronnie Sahlberg	75f3ef154c	add extra logging for failed ctdb_ltdb_unlock() for a few more places it is called from (This used to be ctdb commit 5c0fea90c6474a51992a9c4aeb6af7dfeb213ee0)	2010-06-09 14:37:24 +10:00
Ronnie Sahlberg	fa618aa66a	add additional logging when tdb_chainunlock() fails so we can see where it was called from when it fails (This used to be ctdb commit 0c091b3db6bdefd371787d87bc749593ea8e3c76)	2010-06-09 14:37:16 +10:00
Michael Adam	b72ccfc39a	server:ctdb_send_dmaster_reply: fix a message typo. Michael (This used to be ctdb commit aa63f728152c37e31cecf2258efcdc8cf5ac0092)	2010-02-23 21:07:54 +11:00
Ronnie Sahlberg	06fdfddf27	Reducing the log level for a debug message DEBUG(DEBUG_DEBUG,("pnn %u starting migration of %08x t\ (This used to be ctdb commit 6ce4b21b00cce1530aff022584bf695c257a5d55)	2010-02-16 11:02:01 +11:00
Ronnie Sahlberg	ce9d57bc36	Reduce the log level for two debug messages DEBUG(DEBUG_DEBUG,("pnn %u dmaster response %08x\n", ctdb->pnn, ctdb_has DEBUG(DEBUG_DEBUG,("pnn %u dmaster request on %08x for %u from %u\n", (This used to be ctdb commit a3473e7a445b14520a49585c460429dfbfe1fce0)	2010-02-16 11:01:52 +11:00
Michael Adam	ea65e80223	call: lower the debug message "refusing migration while transction" to lvl INFO. This gets just too noisy on a busy system. And it is purely informational anyways... Michael (This used to be ctdb commit 7f64a00c76203fdf6673c3f862a4bfd17fb848d7)	2009-12-09 21:56:59 +01:00
Ronnie Sahlberg	f5e90ec3b5	Revert "From Wolfgang M." This reverts commit 5b70fa8cfd5916d3c212823ad5cc1b251ae175ed. (This used to be ctdb commit 363e7e939ad46b3f75c83c30d4163d63876c2456)	2009-10-29 13:44:12 +11:00
Ronnie Sahlberg	831f9e05a6	From Wolfgang M. With the new vacuuming code, don't treat an invalid dmaster as fatal. Let it update to the new value instead. (This used to be ctdb commit 5b70fa8cfd5916d3c212823ad5cc1b251ae175ed)	2009-10-22 07:58:44 +11:00
Michael Adam	4cd06a330e	Fix persistent transaction commit race condition. In ctdb_client.c:ctdb_transaction_commit(), after a failed TRANS2_COMMIT control call (for instance due to the 1-second timeout being exceeded waiting for a busy node's reply), there is a 1-second gap between the transaction_cancel() and replay_transaction() calls in which there is no lock on the persistent db. And due to the lack of global state indicating that a transaction is in progress in ctdbd, other nodes may succeed in starting transactions on the db in this gap and, even worse, work on top of the possibly already pushed changes. So the data diverges on the several nodes. This change fixes this by introducing global state for a transaction commit being active in the ctdb_db_context struct and in a db_id field in the client so that a client keeps track of _which_ tdb it has a transaction commit running on. These data are set by ctdb upon entering the trans2_commit control and they are cleared in the trans2_error or trans2_finished controls. This makes it impossible to start another transaction or migrate a record to a different node while a transaction is active on a persistent tdb, including the retry loop. This approach is deadlock-free and still allows the recovery process to be started in the retry-gap between cancel and replay. Also note that this solution does not require any change in the client side. This was debugged and developed together with Stefan Metzmacher <metze@samba.org> - thanks! Michael (This used to be ctdb commit f88103516e5ad723062fb95fcb07a128f1069d69)	2009-07-29 11:12:39 +10:00
Ronnie Sahlberg	e6e1ff32a5	dont try sending a keepalive if the transport is down (This used to be ctdb commit 5cdc04669db8c2ddbbff5af82307a16e8d807b83)	2009-06-30 12:17:05 +10:00
Ronnie Sahlberg	6450ae533a	Dont even try allocating and sending a CALL packet if the transport is down (This used to be ctdb commit cb8dd896914d4e44ad7b8bb000176a7c78f394ae)	2009-06-30 12:16:13 +10:00
Ronnie Sahlberg	127754e192	failing a dmaster send due to the transport being down is fatal (This used to be ctdb commit c17dafc79bec25bbb796478c33f503503d382a20)	2009-06-30 12:14:58 +10:00
Ronnie Sahlberg	757ba01ddc	if we fail a dmaster migration due to the transport being down, then that is a fatal condition. (This used to be ctdb commit 75dea671f68ac6649095357c36b3697a927721e9)	2009-06-30 12:13:15 +10:00
Ronnie Sahlberg	dd1774cd85	dont try to send error packets if the transport is down (This used to be ctdb commit 65b94d280731df3245b26d69f39acfaf5bccf0d8)	2009-06-30 12:10:27 +10:00
Ronnie Sahlberg	22fb69d337	dont even try to allocate a packet if the transport is down since it will fail (This used to be ctdb commit a73f316cb9cec877dc0bc3f7baa21be1b1454273)	2009-06-30 11:55:42 +10:00
Ronnie Sahlberg	26ec64a571	fix a memory leak allocate the memory to the 'call' context and not off the 'ctdb' context (This used to be ctdb commit be89005bd5d13409e377d425db2aad1c0d5b3826)	2008-03-25 11:11:13 +11:00
Ronnie Sahlberg	d53424731f	in ctdb_call_local() we can not talloc_steal() the returned data and hang it off ctdb. This can cause a memory leak if the call is terminated before we have managed to respond to the client (the call is talloc_free()d but the data is still hanging off ctdb). Instead we must talloc_steal() the data and hang it off the call structure to avoid the memory leak. In order to do this we must also change the call structure that is passed into ctdb_call_local() to be allocated through talloc(). This structure was previously either a static variable, or an element of a larger talloc()ed structure (ctdb_call_state or ctdb_client_call_state) so we must change all creations of a ctdb_call into explicitly creating it through talloc() (This used to be ctdb commit 4becf32aea088a25686e8bc330eb47d85ae0ef8f)	2008-03-19 13:54:17 +11:00
Andrew Tridgell	f6e53f433b	merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)	2008-02-04 20:07:15 +11:00
Andrew Tridgell	9d6ac0cf55	added debug constants to allow for better mapping to syslog levels (This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502)	2008-02-04 17:44:24 +11:00
Andrew Tridgell	fc21f78231	make some specific cases of the non-dmaster bug non-fatal (This used to be ctdb commit 7b516ab06c7ba7ffe9ecf3f76720df5360176b2c)	2008-01-05 09:32:29 +11:00
Ronnie Sahlberg	f69321edc8	change debug output from vnn to pnn (This used to be ctdb commit 93a7cf759ae3f9af6671b9f8589e1399a669b46f)	2007-09-04 10:47:02 +10:00
Ronnie Sahlberg	eb4cf6a686	change ctdb->vnn to ctdb->pnn (This used to be ctdb commit 8c776e5707e503ec6586aae39ac6b3ea5a2fd2bc)	2007-09-04 10:06:36 +10:00
Ronnie Sahlberg	135a964220	pass the header to ctdb_become_dmaster instead of just the reqid this allows us to print from which node Invalid or Dropped orphan become dmaster packets came from (This used to be ctdb commit 88efd1bf4c796cd2b184156b72296587bc38bb40)	2007-07-11 09:44:52 +10:00
Andrew Tridgell	32de198fd3	update lib/replace from samba4 (This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)	2007-07-10 15:29:31 +10:00
Andrew Tridgell	a55c03b31b	log the generation numbers to give a hint about this bug (This used to be ctdb commit 12018494baa33c5f6c52e6eae94ac77a56d3e5a0)	2007-07-08 19:36:55 +10:00
Andrew Tridgell	06a71762a4	some #include cleanups (This used to be ctdb commit 1a07d87122d51a40cd8ad5fe13533298c26857cb)	2007-06-07 22:26:27 +10:00
Andrew Tridgell	ae3d54094b	start splitting the code into separate client and server pieces (This used to be ctdb commit 603cd77988c181525946cd5eb0f4d0d646b58059)	2007-06-07 22:06:19 +10:00

43 Commits