samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-06 13:18:07 +03:00

Author	SHA1	Message	Date
Amitay Isaacs	d701072c3e	ctdb-call: Delete old defer queue if recovery occurs Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:27 +02:00
Amitay Isaacs	3d11efe3c6	ctdb-daemon: Use database generation in packet headers for database requests Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:27 +02:00
Amitay Isaacs	6a212d13d0	ctdb-call: Convert pending calls list to per database list The pending calls are migration requests received from clients (over unix domain socket) which are under processing. After a recovery is finished, any requests which are under processing will be dropped since they do not belong to the current generation. All the pending call requests are resent with new generation to restart record migrations. This is in preparation for parallel database recovery. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2015-10-07 14:53:26 +02:00
Amitay Isaacs	2c57cc9597	ctdb-call: Drop all deferred requests from older generation Deferring packets has a nasty interaction with recovery. All deferred packets must be dropped when recovery happens, since those packets are tracked as pending requests and will be re-sent with new generation. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Fri Sep 5 09:30:50 CEST 2014 on sn-devel-104	2014-09-05 09:30:50 +02:00
Amitay Isaacs	ef59f2e6bb	ctdb-daemon: Defer all calls when processing dmaster packets When CTDB receives DMASTER_REQUEST or DMASTER_REPLY packet, the specified record needs to be updated as soon as possible to avoid inconsistent dmaster information between nodes. During this time, queue up all calls for that record and process them only after dmaster request/reply has been processed. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-09-05 07:05:10 +02:00
Amitay Isaacs	deb7bb89b3	ctdb-daemon: Remove duplicate code with refactored function Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-09-05 07:05:10 +02:00
Martin Schwenke	6fd3ce5391	ctdb-daemon: Fix some strict-aliasing warnings Seeing these with -Wall: ../server/ctdb_call.c:1117:3: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing] record_flags = *(uint32_t *)&c->data[c->keylen + c->datalen]; ^ memcpy() seems to be the easiest way to get fix these. The alternative would be to use unmarshalling functions. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-08-21 04:46:13 +02:00
Martin Schwenke	c1558adeaa	ctdb: Use sys_read() and sys_write() to ensure correct signal interaction ... and avoid compiler warnings in some cases. Signed-off-by: Martin Schwenke <martin@meltin.net> Reviewed-by: Amitay Isaacs <amitay@gmail.com>	2014-08-21 04:46:13 +02:00
Amitay Isaacs	c6d0e8dadc	ctdb-readonly: Do not abort if revoke of readonly record fails on a node Revoking readonly record involves first marking the record on dmaster as RO_REVOKING_READONLY. Then all the other nodes are sent update_record control to get rid of RO_DELEGATION. Once that succeeds, the record is marked RO_REVOKING_COMPLETE. Currently, revoking of readonly delegations on the nodes is tried only once. If a node goes in recovery, it can fail update_record control and revoke code will abort ctdb. Since database recovery would revoke all readonly delegations anyway, there is no reason to abort. Simply undo the start of revoke process by resetting RO_REVOKING_READONLY flag. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Martin Schwenke <martins@samba.org> Autobuild-Date(master): Wed Aug 13 11:24:09 CEST 2014 on sn-devel-104	2014-08-13 11:24:09 +02:00
Amitay Isaacs	f96f395d85	ctdb-readonly: Add an early return to simplify code This patch makes the subsequent logic change small and easier to understand. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net>	2014-08-13 08:57:11 +02:00
Amitay Isaacs	7667da6590	ctdb-readonly: Do not use hard-coded value for readonly revoke timeout In case of control timeouts, readonly revoke code currently aborts. This needs to be fixed. Meanwhile, using control_timeout instead of 5 seconds, increases the timeout to 60 seconds. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Reviewed-by: Martin Schwenke <martin@meltin.net> Autobuild-User(master): Amitay Isaacs <amitay@samba.org> Autobuild-Date(master): Mon Mar 31 07:20:48 CEST 2014 on sn-devel-104	2014-03-31 07:20:48 +02:00
Amitay Isaacs	be33efa3e4	ctdbd: Remove transaction code related to TRANS2 commits This removes data types and structure elements related to TRANS2 persistent transaction code. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 22a253b7ccf1ff854cddf0b67969dc84d7d6a654)	2013-10-04 15:20:25 +10:00
Michael Adam	18f17aaa33	server: standardize formatting of comment block for ctdb_reply_dmaster() while I'm at it.. This was the comment block I was touching and meant to adapt in commit 00d3bf092e2f72eda330978c75ec85f17e870553. My search was apparently not unique... Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 09940255011b119dc6af3304f5d3e9568e6006fd)	2013-08-26 13:24:32 +02:00
Amitay Isaacs	19444f7c3d	ctdbd: Make sure call data is freed if doing an early return This should avoid memory bloat when a request bounces between nodes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7677fb263f06a97398e2c546e32273fb96edca69)	2013-08-22 16:59:49 +10:00
Amitay Isaacs	1467b666f2	Revert "LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node" This reverts commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504. This is a premature optimization. Record can bounce between nodes very quickly if it is a contended record. There is no need to hold a record on a node unnecessarily. In case record contention becomes bad, enabling sticky records on a database is a better idea. Conflicts: include/ctdb_private.h server/ctdb_tunables.c Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ac417b0003f0116f116834ad2ac51482d25cfa0d)	2013-08-22 14:08:52 +10:00
Amitay Isaacs	59dae19f5a	ctdbd: Print a log message when a key becomes hot Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 48f40985f4592c28402303ccbb458756f4914f75)	2013-08-22 14:08:52 +10:00
Michael Adam	621bfe8b0d	server: standardize formatting of comment block for ctdb_reply_dmaster() while I'm at it.. Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 00d3bf092e2f72eda330978c75ec85f17e870553)	2013-08-19 17:12:33 +02:00
Michael Adam	922246de73	server: fix wording and punctuation in comment block for ctdb_reply_dmaster(). Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit cb3a1c5af3b796dba30cae07118670d3c9e57df7)	2013-08-19 17:12:32 +02:00
Amitay Isaacs	a98baa539e	ctdbd: When a record is made sticky, log only once Instead of logging from ctdb_request_call(), log the message from ctdb_make_record_sticky(). That way if the record is already sticky, the message is not repeated unnecessarily. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 44a64d1c388bfe3c3388b191edfaedecfb7bb831)	2013-08-09 11:07:37 +10:00
Amitay Isaacs	d42cea6efe	ctdbd: Improve high hopcount log messages when request is redirected Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 9cde47e1a5bf1b9ca3b4da8c2db94caac2b1aa5e)	2013-08-09 11:07:37 +10:00
Amitay Isaacs	0993387f4a	ctdbd: Don't consider a hot record if the hopcount is zero Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ab35773518ad15588013f4d859f7bee790437450)	2013-07-30 15:34:32 +10:00
Amitay Isaacs	054d8727ed	ctdbd: Fix updating of hot keys in database statistics Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit fde4b4db5a57f75c5efa5647c309f33e0d5a68f3)	2013-07-29 16:00:46 +10:00
Amitay Isaacs	1c21f37e57	ctdbd: Set process names for child processes This helps distinguish processes in process list in top, perf, etc. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2493f57ce268d6fe7e4c40a87852c347fd60d29e)	2013-07-10 14:33:19 +10:00
Mathieu Parent	d82b9ae410	build: Fix tdb.h path to enable building with system TDB library (This used to be ctdb commit f8bf99de3a5f56be67aaa67ed836458b1cf73e86)	2013-06-14 16:45:27 +10:00
Michael Adam	d1dd29197e	ctdbd: fix comment explaining redirection of CTDB_REQ_CALL redirection. Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit b697625b184227dad1be31a41b7a3fd9bd312e29)	2013-05-24 22:06:24 +10:00
Michael Adam	3f03a3c8a3	ctdbd: remove a nonempty blank line Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit d9e24782a90d9ce29c0e6584b75d2b186142174d)	2013-05-24 22:06:21 +10:00
Michael Adam	a0b20771fe	ctdbd: update comment describing ctdb_call_send_redirect() Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 9a21d417c51fb9cad8f2e87e00ca54d379aef860)	2013-05-24 22:06:16 +10:00
Michael Adam	eb0389b0b1	ctdb_call: use CTDB_REC_RO_FLAGS where appropriate Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-By: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit f99eb2f56d8ca27110a45ae0e1c4bff40ac7a60e)	2013-04-24 18:48:58 +10:00
Michael Adam	f1fe9ddf42	ctdb_call: don't bump the rsn in ctdb_become_dmaster() any more This is now done in ctdb_ltdb_store_server(), so this extra bump can be spared. Signed-off-by: Michael Adam <obnox@samba.org> Reviewed-By: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit cad3107b12e8392f786f9a758ee38cf3a3d58538)	2013-04-17 21:16:32 +10:00
Ronnie Sahlberg	59565c05cf	STATISTICS: Add tracking of the 10 hottest keys per database measured in hopcount and add mechanisms to dump it using the ctdb dbstatistics command (This used to be ctdb commit 8307c70ed98996b430c470e9641a09fdeeb81bd8)	2012-06-13 16:19:18 +10:00
Amitay Isaacs	7631830152	server: Replace BOOL datatype with bool, True/False with true/false Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6e5cbe8fff71985e5a2fc16b7e9f2b868011ff5d)	2012-05-28 11:22:25 +10:00
Ronnie Sahlberg	a57eba2bb4	Track all child process so we never send a signal to an unrelated process (our child died and kernel wrapped the pid-space and reused the pid for a different process Wrap all creation of child processes inside ctdb_fork() which is used to track all processes we have spawned. Capture SIGCHLD to track also which child processes have terminated. Wrap kill() inside ctdb_kill() and make sure that we never send a !0 signal to a child process pid that has already terminated (and might have been replaced with a (This used to be ctdb commit f73a4b1495830bcdd094a93732a89dd53b3c2f78)	2012-05-03 14:03:26 +10:00
Amitay Isaacs	4392591555	Remove explicit include of lib/tevent/tevent.h. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 0681014ca5ed2a9b56f63fdace7f894beccf8a9a)	2012-04-13 17:28:14 +10:00
Ronnie Sahlberg	fa3a06246a	STICKY: add prototype code to make records stick to a node to "calm" down if they are found to be very hot and accessed by a lot of clients. This can improve performance and stop clients from having to chase a rapidly migrating/bouncing record (This used to be ctdb commit d0d98f7e45e5084b81335b004d50bddc80cdc219)	2012-03-20 17:12:19 +11:00
Ronnie Sahlberg	e7e51ddb64	LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node This can improve performance slightly on certain workloads where smbds frequently read from the same record (This used to be ctdb commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504)	2012-03-20 12:26:22 +11:00
Ronnie Sahlberg	6a493a0b08	STATISTICS: add per-db hop count statistics (This used to be ctdb commit 1c976d83b1d7dac6f0ef81306774998e4c8b56a1)	2012-03-20 12:11:55 +11:00
Ronnie Sahlberg	038c946e80	add max hop count buckets to see how bad hopcounts are (This used to be ctdb commit 7d3931298e6477d92f43652c3006b0c426cb1307)	2012-03-20 11:20:53 +11:00
Ronnie Sahlberg	62daab3688	READONLY: when updating a remote node to revoke a delegation, make sure we dont create the record if it doesnt already exist (This used to be ctdb commit fb00e1290fcea3386132a46c883994019a43799a)	2012-03-02 12:57:23 +11:00
Ronnie Sahlberg	73f8be16c6	ReadOnly: add per-database statistics to view how much delegations/revokes we have (This used to be ctdb commit 751ed46197661eb841042ab6a02855a51dd0b17c)	2012-02-08 15:29:27 +11:00
Ronnie Sahlberg	1eafa68f0f	STATISTICS: add total counts for number of delegations and number of revokes Everytime we give a delegation to another node we count this as one delegation. If the same record is delegated to several nodes we count one for each node. Everytime a record has all its delegations revoked we count this as one revoke. (This used to be ctdb commit b098bcf8007be63889aaed640a951b0eeaa9d191)	2012-02-08 13:42:30 +11:00
Michael Adam	0832daf1e9	server: fix a comment typo (This used to be ctdb commit 85879edd09ffa26f87c566954cbd2c14f1e331ed)	2012-01-10 10:33:28 +01:00
Ronnie Sahlberg	f253f69063	typo (This used to be ctdb commit 8fc71ad4da746e28406c06a95928052b29803062)	2011-12-14 12:52:35 +11:00
Martin Schwenke	c1e8ea08e3	Clean up warnings: log some unchecked return codes from function calls In a few places functions are called, the return code is assigned into a variable but it is not checked. This generates a compiler warning like this: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable] Instead we remove the warning by checking the return code variable and log a warning at DEBUG level if the return code indicates an error. The justification is that there may have been a future intent to check the return code but it hasn't been important enough to follow-up. If it matters, it will be logged for easy debugging. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 1932466c76de2b184c2a257120768ab8c9d6c12a)	2011-11-09 15:20:07 +11:00
Volker Lendecke	47f20e8ae5	Fix some typos Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit f24e943eb7d8b86ce6b32ae37e3884ec4af0f7df)	2011-11-02 17:05:47 +01:00
Ronnie Sahlberg	19bd9149c3	ReadOnly: fix bug writing incorrect amount of data in delegated record Fix bug when ctdbd updates the local copy of a delegated record to write the correct amount of data to the record. (This used to be ctdb commit 8814d8bc159a5e368afaa236ac7d865165db04b2)	2011-10-28 11:44:19 +11:00
Ronnie Sahlberg	a9e77de9e4	ReadOnly: Dont update the record header from the calling client. While it is convenient since it avoids having to create a child process from the main dameon for writing the updated record it makes the cleitn more complex. Remove the code in the example client code that writes the record to the local tdb. Add code to the local ctdbd processing of replies to check if this reply contain a ro delegation and if so, spawn a child process to lock the tdb and then write the data. (This used to be ctdb commit bf1d429227dc4f5818263cc39401d0a22663cdba)	2011-10-24 13:14:26 +11:00
Ronnie Sahlberg	9729d3e339	ReadOnly: Check the readonly flag instead of whether the tdb pointer is NULL or not (This used to be ctdb commit 01314c2cb3a480917d6a632b83c39f0a48bba0e7)	2011-08-23 10:41:52 +10:00
Ronnie Sahlberg	de7c3de0a2	ReadOnly: clear out the tracking record once a revoke is completed (This used to be ctdb commit 7af255551f058d1f6bfdd38ca603e7a19d1bb7ba)	2011-08-23 10:35:56 +10:00
Ronnie Sahlberg	6fd8cc659d	ReadOnly: Add processing for ReadOnly delegation requests and revoke requests to the processing loop for CALL packets we receive from different nodes. This implements the ReadOnly and ReadWrite request processing, delegation and revoking of delegations for all requests coming in across the network from a remote node. (This used to be ctdb commit 78f2c2ea70e6270cec59db7c3f174a511bf608a9)	2011-08-23 10:32:02 +10:00
Ronnie Sahlberg	07d995ab72	ReadOnly: When releasing all deferred calls that blocked during revoke of all previous delegations, add a 1 second grace/delay for any new readonly delegation requests so that the read-write fetch-lock porcess has a chance to make progress (This used to be ctdb commit 2a4e9e69850d64dd8aef695f587ebe04393a688f)	2011-08-23 10:30:22 +10:00
Ronnie Sahlberg	dda2616cf5	ReadOnly: Add a function to start a revoke of all delegations for a record. This triggers a child process to be created to perform the actual potentially blocking calls that are required. (This used to be ctdb commit 7d575ee92c95bc4aab78a33bc1aac7ff0811ab3a)	2011-08-23 10:27:31 +10:00
Ronnie Sahlberg	1bb855bd52	ReadOnly: Add functions to register CALLs to a context used to handle deferal of processing of CALL commands. Once the contexts are freed, the deferred calls are re-issued to the input packet processing functions again. This is needed when/if a CALL can not currently be processed by the main engine due to the record being locked down for revoking of all delegations. The data is passed through several layers of callbacks, and finally a timed event callback to ensure that the processing of the packet will be restarted again at the topmost eventloop, avoinding event loop nesting. (This used to be ctdb commit cc6f78efcfa3b8caeffbd68018e6dfbf81488dce)	2011-08-23 10:25:57 +10:00
Ronnie Sahlberg	3d495c48d2	ReadOnly: Add an extra flag to ctdb_call_local to specify whether we want to write the record and header back to the tdb (for example we do when performing dmaster migrations) (This used to be ctdb commit b935e83255aeb3754b2fd37cf5611e02f7283514)	2011-08-23 10:25:05 +10:00
Rusty Russell	435dad05cb	ctdbd: fix lock held on error ("ctdb_req_dmaster from non-master.") We should release the lock on the record before returning; otherwise the recovery (which tries to freeze the database) will fail. Symptoms are as follows: ctdbd: pnn 15 dmaster request for new-dmaster 19 from non-master 1 real-dmaster=5 key f049c3c8 dbid 0x6cf2837d gen=1148812532 curgen=1148812532 c->rsn=2 header.rsn=15 reqid=2147483585 keyval=0x4f464e49 ctdbd: ctdb_req_dmaster from non-master. Force a recovery. ... ctdbd: freeze_lock-1:server/ctdb_freeze.c:55 Failed to lock database registry.tdb CQ:1022545 (This used to be ctdb commit 38b2dbe0605816742e74e2b8a811eaba99c7e12d)	2011-03-21 13:57:40 +11:00
Michael Adam	dbb520b6ad	call: becoming dmaster in VACUUM_MIGRATION, set the VACUUM_MIGRATED record flag This temporary flag is used for the local record storage function to decide whether to delete an empty record which has never been migrated with data as part of the fast-path vacuuming process or, or to store the record. (This used to be ctdb commit c11ca778ee90444c44dee0a629cd2eefa3a1f75e)	2011-03-14 13:35:45 +01:00
Michael Adam	73e6618a48	call: hand the submitted record_flags to local record storage function. (This used to be ctdb commit 4079b8bf7a57a27a45d29784a1b0a414c778e552)	2011-03-14 13:35:45 +01:00
Michael Adam	eb1b7d1c05	call: transfer the record flags in the ctdb call packets. This way, the MIGRATED_WITH_DATA information can be transported along with the records. This is important for vacuuming to function properly. The record flags are appended to the data section of the ctdb_req_dmaster and ctdb_reply_dmaster structs. Pair-Programmed-With: Stefan Metzmacher <metze@samba.org> (This used to be ctdb commit 945187d64cfc7bd30a0c3b0d548cbe582d95dde3)	2011-03-14 13:35:44 +01:00
Michael Adam	64fc05e562	server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag (This used to be ctdb commit f5fb232117886186066ab3430fdd2307cba94960)	2011-03-14 13:35:43 +01:00
Michael Adam	53b558a3bc	server: add a comment explaining the call redirect logic in ctdb_call_send_redirect(). (This used to be ctdb commit 81663b81687c0ba681500cca6aa8174bb9587ad2)	2011-02-24 10:35:26 +01:00
Ronnie Sahlberg	92f86534ac	ctdb_req_dmaster from non-master If we find a situatior where we get a stray packet with the wrong dmaster, dont suicide with ctdb_fatal() since this is too disruptive. Just drop the stray packet and force a recovery to make sure all is good again. CQ S1022004 (This used to be ctdb commit 62b7fe853db37c0a90e48a0332a3426a8dcb4ed8)	2011-02-18 11:29:44 +11:00
Ronnie Sahlberg	b57bd0f896	Remove LACOUNT and LACCESSOR and migrate the records immediately. This concept didnt work out and it is really just as expensive as a full migration anyway, without the benefit of caching the data for subsequence accesses. Now, migrate the records immediately on first access. This will be combined with a "cheap vacuum-lite" for special empty records to prevent growth of databases. Later extensions to mimic read-only behaviour of records will include proper shared read-only locking of database records, making the laccessor/lacount read-only access to the data obsolete anyway. By removing this special case and handling of lacount laccessor makes the codapath where shared read-only locking will be be implemented simpler, and frees up space in the ctdb_ltdb header for use by vacuuming flags as well as read-only locking flags. (This used to be ctdb commit 155dd1f4885fe142c6f8bd09430f65daf8a17e51)	2011-02-18 10:08:32 +11:00
Ronnie Sahlberg	220c5371c7	Revert "server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag" This reverts commit 17e231abf5ade83d7fa624b5cf54ae876e2795aa. (This used to be ctdb commit 23f81ba39ee7cd8a7360f4602b3eb264eb221552)	2010-12-13 14:23:48 +11:00
Ronnie Sahlberg	dff88a8a6a	Revert "Add a new header flag for "migrated with data" and set this to 1" This reverts commit a8cc35191df1cd4b866897df71d317ce5f198cb5. (This used to be ctdb commit 7c37435fb517a621c45b21a21b4eb15f8bbd3c83)	2010-12-13 14:23:32 +11:00
Ronnie Sahlberg	8e53df6f41	Add a new header flag for "migrated with data" and set this to 1 when we migrate a non-empty record onto the node or a non-empty record off the node When we migrate a record back to the lmaster and yield the dmaster role, inspect this flag if if it is still not set, we can delete the record from the local database as soon as we have migrated it back to the lmaster. (This used to be ctdb commit a8cc35191df1cd4b866897df71d317ce5f198cb5)	2010-12-07 15:33:41 +11:00
Michael Adam	6f77811cb1	server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag (This used to be ctdb commit 17e231abf5ade83d7fa624b5cf54ae876e2795aa)	2010-12-07 15:31:57 +11:00
Ronnie Sahlberg	db8cb31d8b	during shutdown there is a window after we have stopped TCP and disconnected from all other nodes but before we have stopped all processing. During this window we may still hit asynchronous events that will fail because we can not send/receive packets from other nodes. These messages are logged as ... Transport is DOWN. To help indicate that they are benign messages related to the process of shutting down. These messages spam the syslog during normal shutdown, so this patch will drop the loglevel of these messages to DEBUG, so that they will not appear in or spam the syslog. (This used to be ctdb commit 8275d265d2ae19b765e30ecf18f6b6319b6e6453)	2010-10-28 13:41:08 +11:00
Ronnie Sahlberg	39c367a68f	Create macros to update the statistics counters and use these macros everywhere instead of manipulating the coutenrs directly. (This used to be ctdb commit 2e648df890e5713bc575965d87937827b068d0d7)	2010-09-29 12:14:24 +10:00
Rusty Russell	f93440c4b7	event: Update events to latest Samba version 0.9.8 In Samba this is now called "tevent", and while we use the backwards compatibility wrappers they don't offer EVENT_FD_AUTOCLOSE: that is now a separate tevent_fd_set_auto_close() function. This is based on Samba version `7f29f817fa`. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 85e5e760cc91eb3157d3a88996ce474491646726)	2010-08-18 09:16:31 +09:30
Ronnie Sahlberg	7730facc62	fix a debug message (This used to be ctdb commit 856bd6de6218d9b70baed0e6443be4253ea31afe)	2010-06-09 16:22:44 +10:00
Ronnie Sahlberg	d9a3e1d0c0	idr can timeout and wrap/be reused quite quickly. If a noremote node hangs for an extended period, it is possible that we might have a DMASTER request in flight for record A to that node. Eventually we will reuse the idr, and may reuse it for a DMASTER request to a different node for a different record B. If while the request for B is in flight, the first tnode un-hangs and responds back we would receive a dmaster reply for the wrong record. This would cause a record to become perpetually locked, since inside the daemon we would tdb_chainlock(dmaster_reply->pdu->key) but once the migration would complete we would chainunlock idr->state->call->key Adding code to verify that when we receive a dmaster reply packet that it does in fact match the exact same key that the state variable we have for the idr in flight. (This used to be ctdb commit 2f6a870d7ff02ceb61fde242f752dccbfcb4cb37)	2010-06-09 16:19:29 +10:00
Ronnie Sahlberg	75f3ef154c	add extra logging for failed ctdb_ltdb_unlock() for a few more places it is called from (This used to be ctdb commit 5c0fea90c6474a51992a9c4aeb6af7dfeb213ee0)	2010-06-09 14:37:24 +10:00
Ronnie Sahlberg	fa618aa66a	add additional logging when tdb_chainunlock() fails so we can see where it was called from when it fails (This used to be ctdb commit 0c091b3db6bdefd371787d87bc749593ea8e3c76)	2010-06-09 14:37:16 +10:00
Michael Adam	b72ccfc39a	server:ctdb_send_dmaster_reply: fix a message typo. Michael (This used to be ctdb commit aa63f728152c37e31cecf2258efcdc8cf5ac0092)	2010-02-23 21:07:54 +11:00
Ronnie Sahlberg	06fdfddf27	Reducing the log level for a debug message DEBUG(DEBUG_DEBUG,("pnn %u starting migration of %08x t\ (This used to be ctdb commit 6ce4b21b00cce1530aff022584bf695c257a5d55)	2010-02-16 11:02:01 +11:00
Ronnie Sahlberg	ce9d57bc36	Reduce the log level for two debug messages DEBUG(DEBUG_DEBUG,("pnn %u dmaster response %08x\n", ctdb->pnn, ctdb_has DEBUG(DEBUG_DEBUG,("pnn %u dmaster request on %08x for %u from %u\n", (This used to be ctdb commit a3473e7a445b14520a49585c460429dfbfe1fce0)	2010-02-16 11:01:52 +11:00
Michael Adam	ea65e80223	call: lower the debug message "refusing migration while transction" to lvl INFO This gets just too noisy on a busy system. And it is purley informational anyways... Michael (This used to be ctdb commit 7f64a00c76203fdf6673c3f862a4bfd17fb848d7)	2009-12-09 21:56:59 +01:00
Ronnie Sahlberg	f5e90ec3b5	Revert "From Wolfgang M." This reverts commit 5b70fa8cfd5916d3c212823ad5cc1b251ae175ed. (This used to be ctdb commit 363e7e939ad46b3f75c83c30d4163d63876c2456)	2009-10-29 13:44:12 +11:00
Ronnie Sahlberg	831f9e05a6	From Wolfgang M. With the new vacuuming code, dont treat an invalid dmaster as fatal. Let it update to the new value insetad. (This used to be ctdb commit 5b70fa8cfd5916d3c212823ad5cc1b251ae175ed)	2009-10-22 07:58:44 +11:00
Michael Adam	4cd06a330e	Fix persistent transaction commit race condition. In ctdb_client.c:ctdb_transaction_commit(), after a failed TRANS2_COMMIT control call (for instance due to the 1-second being exceeded waiting for a busy node's reply), there is a 1-second gap between the transaction_cancel() and replay_transaction() calls in which there is no lock on the persistent db. And due to the lack of global state indicating that a transaction is in progress in ctdbd, other nodes may succeed to start transactions on the db in this gap and even worse work on top of the possibly already pushed changes. So the data diverges on the several nodes. This change fixes this by introducing global state for a transaction commit being active in the ctdb_db_context struct and in a db_id field in the client so that a client keeps track of _which_ tdb it as transaction commit running on. These data are set by ctdb upon entering the trans2_commit control and they are cleared in the trans2_error or trans2_finished controls. This makes it impossible to start a nother transaction or migrate a record to a different node while a transaction is active on a persistent tdb, including the retry loop. This approach is dead lock free and still allows recovery process to be started in the retry-gap between cancel and replay. Also note, that this solution does not require any change in the client side. This was debugged and developed together with Stefan Metzmacher <metze@samba.org> - thanks! Michael (This used to be ctdb commit f88103516e5ad723062fb95fcb07a128f1069d69)	2009-07-29 11:12:39 +10:00
Ronnie Sahlberg	e6e1ff32a5	dont try sending a keepalive if the transport is down (This used to be ctdb commit 5cdc04669db8c2ddbbff5af82307a16e8d807b83)	2009-06-30 12:17:05 +10:00
Ronnie Sahlberg	6450ae533a	Dont even try allocating and sending a CALL packet if the transport is down (This used to be ctdb commit cb8dd896914d4e44ad7b8bb000176a7c78f394ae)	2009-06-30 12:16:13 +10:00
Ronnie Sahlberg	127754e192	failing a dmaster send due to the transport being down is fatal (This used to be ctdb commit c17dafc79bec25bbb796478c33f503503d382a20)	2009-06-30 12:14:58 +10:00
Ronnie Sahlberg	757ba01ddc	if we fail a dmaster migration due to the transport being down, then that is a fatal condition. (This used to be ctdb commit 75dea671f68ac6649095357c36b3697a927721e9)	2009-06-30 12:13:15 +10:00
Ronnie Sahlberg	dd1774cd85	dont try to send error packets if the transport is down (This used to be ctdb commit 65b94d280731df3245b26d69f39acfaf5bccf0d8)	2009-06-30 12:10:27 +10:00
Ronnie Sahlberg	22fb69d337	dont even try to allocate a packet if the transport is down since it will fail (This used to be ctdb commit a73f316cb9cec877dc0bc3f7baa21be1b1454273)	2009-06-30 11:55:42 +10:00
Ronnie Sahlberg	26ec64a571	fix a memory leak allocate the memory to the 'call' context and not off the 'ctdb' context (This used to be ctdb commit be89005bd5d13409e377d425db2aad1c0d5b3826)	2008-03-25 11:11:13 +11:00
Ronnie Sahlberg	d53424731f	in ctdb_call_local() we can not talloc_steal() the returned data and hang it off ctdb. This can cause a memory leak if the call is terminated before we have managed to respond to the client. (and the call is talloc_free()d but the data is still hanging off ctdb) instead we must talloc_steal() the data and hang it off the call structure to avoid the memory leak. In order to do this we must also change the call structure that is passed into ctdb_call_local() to be allocated through talloc(). This structure was previously either a static variable, or an element of a larger talloc()ed structure (ctdb_call_state or ctdb_client_call_state) so we must change all creations of a ctdb_call into explicitely creating it through talloc() (This used to be ctdb commit 4becf32aea088a25686e8bc330eb47d85ae0ef8f)	2008-03-19 13:54:17 +11:00
Andrew Tridgell	f6e53f433b	merge from ronnie (This used to be ctdb commit e7b57d38cf7255be823a223cf15b7526285b4f1c)	2008-02-04 20:07:15 +11:00
Andrew Tridgell	9d6ac0cf55	added debug constants to allow for better mapping to syslog levels (This used to be ctdb commit 7ba8f1dde318eab03f4257e5a89fd23e7281e502)	2008-02-04 17:44:24 +11:00
Andrew Tridgell	fc21f78231	make some specific cases of the non-dmaster bug non-fatal (This used to be ctdb commit 7b516ab06c7ba7ffe9ecf3f76720df5360176b2c)	2008-01-05 09:32:29 +11:00
Ronnie Sahlberg	f69321edc8	change debug output from vnn to pnn (This used to be ctdb commit 93a7cf759ae3f9af6671b9f8589e1399a669b46f)	2007-09-04 10:47:02 +10:00
Ronnie Sahlberg	eb4cf6a686	change ctdb->vnn to ctdb->pnn (This used to be ctdb commit 8c776e5707e503ec6586aae39ac6b3ea5a2fd2bc)	2007-09-04 10:06:36 +10:00
Ronnie Sahlberg	135a964220	pass the header to ctdb_become_dmaster instead of just the reqid this allows us to print from which node Invalid or Dropped orphan become dmaster packets came from (This used to be ctdb commit 88efd1bf4c796cd2b184156b72296587bc38bb40)	2007-07-11 09:44:52 +10:00
Andrew Tridgell	32de198fd3	update lib/replace from samba4 (This used to be ctdb commit f0555484105668c01c21f56322992e752e831109)	2007-07-10 15:29:31 +10:00
Andrew Tridgell	a55c03b31b	log the generation numbers to give a hint about this bug (This used to be ctdb commit 12018494baa33c5f6c52e6eae94ac77a56d3e5a0)	2007-07-08 19:36:55 +10:00
Andrew Tridgell	06a71762a4	some #include cleanups (This used to be ctdb commit 1a07d87122d51a40cd8ad5fe13533298c26857cb)	2007-06-07 22:26:27 +10:00
Andrew Tridgell	ae3d54094b	start splitting the code into separate client and server pieces (This used to be ctdb commit 603cd77988c181525946cd5eb0f4d0d646b58059)	2007-06-07 22:06:19 +10:00

147 Commits